toy list processing problem: collect similar terms

M

Mirko

... faulty code deleted

Aaand one more fix (apply -> funcall) (This version at least produces
a close
facsimile of the desired output)

(defun reduce-tagged (function sequence &key
(key-tag #'first)
(key-datum #'rest))
"Use a binary operation, `function' to combine a sequence of tagged
elements. like-tagged elements are `reduce'd according to `function'

`sequence' is a sequence of tagged elements. reduce-m will reduce
like-tagged-elements.

If `key-tag' is supplied it is used to extract the element tag. If
`key-tag' is not supplied, the function `first' is used.

If `key-datum' is supplied, it is used to extract the element datum.
If `key-datum' is not supplied, the function `rest' is used.

"
(let ((hash (make-hash-table)))
(dolist (datum sequence)
(let ((tag (funcall key-tag datum))
(values (funcall key-datum datum)))
(multiple-value-bind (it present)
(gethash tag hash)
(declare (ignore it))
(if present
(setf (gethash tag hash)
(funcall function (gethash tag hash) values))
(setf (gethash tag hash) values)))))
(let (result)
(maphash #'(lambda(key value)
(push (list key value) result))
hash)
result)))
 
C

ccc31807

here's a interesting toy list processing problem.

I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing
the same label. So if I have the list

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
r) (5 s t))

where the first element of each sublist is the label, I need to
produce:

output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))

Here is a solution in Perl -- the verbose version. Please see my note
below.

SCRIPT:
use strict;
use warnings;
my %lists;

while (<DATA>)
{
chomp;
my ($k, @v) = split(/ /, $_);
push(@{$lists{$k}}, @v);
}

foreach my $k (sort keys %lists)
{
print "$k - @{$lists{$k}}\n";
}

exit(0);

__DATA__
0 a b
1 c d
2 e f
3 g h
1 i j
2 k l
4 m n
2 o p
4 q r
5 s t

OUTPUT:
perl lists.plx
0 - a b
1 - c d i j
2 - e f k l o p
3 - g h
4 - m n q r
5 - s t

NOTE:
I assume that you want an idiomatic solution for the language. I have
therefore converted your data into a typical record oriented
structure. Perlistas don't use parenthesis. If you want a Lispy
solution, use Lisp. Further, Perl was made for exactly this kind of
problem, which is simple data munging, taking some input, transforming
it, and printing it out -- Practical Extraction and Reporting
Language. I know that some Lispers (R.G., are you reading?) will
object to a formulation like this: @{$lists{$k}}, but all this says
(in Perl) is to spit out the value contained in the hash element
$lists{$k} as an array, and is good idiomatic Perl, even if some
Lispers aren't quite up to the task of understanding it.

CC.
 
S

Steven D'Aprano

Here is my Common Lisp (and I only care about Common Lisp answers)

Good for you. So why are you spamming other newsgroups with your CL
solution? Not once, but three times.

Replies to /dev/null.
 
S

Seebs

btw, i disagree about your remark on crossposting.

You're wrong. Crossposting is indeed a standard trolling tactic.

This does not prove that all crossposters are trolls, nor does it assert
that all crossposters are trolls, but it is absolutely factually correct
that it's a standard trolling tactic.

-s
 
S

Seebs

It was livibetter who without any motivation or reasoning posted Python
code in CLPM.

Not exactly; he posted it in a crossposted thread, which happened to include
CLPM and other groups, including comp.lang.python.

It is quite possible that he didn't know about the crossposting. However,
while I would agree that this would constitute a form of ignorance, I'd think
that, especially with how well some newsreading interfaces hide that detail,
I don't think it's really worth getting angry over.

-s
 
J

John Bokma

Seebs said:
You're wrong.

FYI: Xah Lee is well known for his abuse of cross posting. Be happy that
Google Groups limits the number of groups to crosspost to five.

Related: his spamming behaviour has already resulted in Xah Lee
having to look for another hosting provider [1]. Currently 1and1 provides
him shelter, most likely carefully selected by Xah (as Google Groups)
since 1and1 is not known for their spam fighting reputation nor clue.

But one can only hope, so if you want to do something about this Xah
thing, please report it with 1and1 and (e-mail address removed). He won't learn
respect from it, but maybe you end up being honored [2] on his
collection of drivel [3].

In short:

= don't reply to Xah Lee: at best it's a waste of time
= if his post is abusive (crossposting to 5 groups just because you can
is) report it with /and/ his hosting provider (since most of his
posts are copy paste jobs of articles on his site just to drive
traffic) and Google Groups (his Usenet provider of choice since they
hardly do a thing about spam).


[1] http://www.mail-archive.com/[email protected]/msg91631.html
[2] http://www.google.com/search?q=site:xahlee.org bokma
[3] What's sad is that some of its stuff is actually good/not bad.
But tainted: Xah Lee is a spammer and a Usenet abuser.
 
J

John Bokma

Seebs said:
Not exactly; he posted it in a crossposted thread, which happened to include
CLPM and other groups, including comp.lang.python.

It is quite possible that he didn't know about the crossposting.

Oh, he does. It has been Xah's game for years.
while I would agree that this would constitute a form of ignorance, I'd think
that, especially with how well some newsreading interfaces hide that detail,
I don't think it's really worth getting angry over.

You think wrong. And Jurgen should have known better than to reply several
times (but like too many people in cplm he posts for posting's sake, the
main reason why I don't follow that group anymore).

Xah has been doing this for many years and most of his posts are just
made to drive traffic to his site (hence they are copy paste of articles
on his site + link(s) to his site). It's Usenet abuse, on purpose.

The reason it pisses off people is that nearly weekly it causes a lot of
noise in newsgroups that are really better off without the noise
(IMNSHO).

See my other post on how to deal with this spammer. If you've missed it:
report him for spamming, since that's what he does. It already made him
have to move hosting providers once. While it's not going to stop him,
it will cost him money. See:
http://www.google.com/search?q=site:xahlee.org bokma

While I am named in that article be assured that I was not the only one
contacting dreamhost (+10 for doing this, btw). Quite some people
contacted me via email that they also talked with Dreamhost. Just keep
reporting this spammer, and maybe 1and1 will kick it out.
 
S

Seebs

Oh, he does. It has been Xah's game for years.

But did "livibetter" know about it? I wasn't defending Xah, who is indeed
at the very least clueless and disruptive. But I was sort of defending
the poster who was accused of posting Python code in CLPM, because that
poster may not have understood the game.

-s
 
J

John Bokma

fup set to poster
But did "livibetter" know about it? I wasn't defending Xah, who is indeed
at the very least clueless and disruptive.

Heh, he's not clueless, the problem is that he knows exactly what he's
doing. And like most spammers, very hard to get rid off.
But I was sort of defending
the poster who was accused of posting Python code in CLPM, because that
poster may not have understood the game.

Ah, clear. Well, the problem is somewhat also in CLPM where people
somehow have to reply to messages like this :-(. And I am sure Xah
laughes his ass off each time it happens.
 
X

Xah Lee

fup set to poster



Heh, he's not clueless, the problem is that he knows exactly what he's
doing. And like most spammers, very hard to get rid off.


Ah, clear. Well, the problem is somewhat also in CLPM where people
somehow have to reply to messages like this :-(. And I am sure Xah
laughes his ass off each time it happens.

Hi John Bokma,

can you stop this?

doesn't seems fruitful to keep on this.

if you don't like my posts, ignore them? i don't post in
comp.lang.python or comp.lang.perl.misc that often... maybe have done
so 5 times this year.

i visited your home page
http://johnbokma.com/mexit/2010/08/15/
and there are really a lot beautiful photos.

this isn't bribery or something like that. I've been annoyed by you,
of course, but it's not fruitful to keep going on this.

Best,

Xah ∑ xahlee.org ☄
 
J

John Bokma

Xah Lee said:
can you stop this?

Can you stop crossposting? And if there is really, really a need to
crosspost, can you please set the follow-up to?
doesn't seems fruitful to keep on this.

if you don't like my posts, ignore them? i don't post in
comp.lang.python or comp.lang.perl.misc that often... maybe have done
so 5 times this year.

Which is enough to disrupt those groups for days.
i visited your home page
http://johnbokma.com/mexit/2010/08/15/
and there are really a lot beautiful photos.

Thanks Xah. Like I wrote, your site /does/ have good information, it's
so sad that you somehow think it's necessary to spam Usenet to get
visitors. Or maybe you've another reason, don't know. But it /is/ Usenet
abuse.
this isn't bribery or something like that. I've been annoyed by you,
of course, but it's not fruitful to keep going on this.

Well, you annoy me, I annoy you. It's in your hands to make it stop.

My advice is:

1) remove all the excessive swearing from your site. If you have a
point, you don't need it. Your argument(s) without the swearing
should speak for themselves

2) Stop abusing Usenet. Instead focus on writing more good stuff on
your site.

1) & 2) will keep me from linking to your site, ever. And I am sure I am
not alone in this.
 
J

John Bokma

Paul Rubin said:
John, can you ALSO stop crossposting?

Since the issue is on-topic in all groups: no. I did set a follow-up
header, which you ignored and on top of that redirected the thing to
comp.lang.python. So:

Paul, can you ALSO stop acting like a complete ass?
 
W

w_a_x_man

It's too complex. Just write:

(let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
              (2 o p) (4 q r) (5 s t))))

  (mapcar (lambda (class) (reduce (function append) class :key (function rest)))
           (com.informatimago.common-lisp.list:equivalence-classes list :key (function first)))

   )

--> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))

Ruby:

[[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
[5, 's', 't']].
group_by{|x| x.first}.values.map{|x| x.map{|y| y[1..-1]}.flatten}

==>[["s", "t"], ["a", "b"], ["c", "d", "i", "j"],
["e", "f", "k", "l", "o", "p"],
["g", "h"], ["m", "n", "q", "r"]]
 
N

namekuseijin

It's too complex. Just write:
(let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
              (2 o p) (4 q r) (5 s t))))
  (mapcar (lambda (class) (reduce (function append) class :key (function rest)))
           (com.informatimago.common-lisp.list:equivalence-classes list :key (function first)))
--> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))

Ruby:

[[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
[5, 's', 't']].
group_by{|x| x.first}.values.map{|x| x.map{|y| y[1..-1]}.flatten}

    ==>[["s", "t"], ["a", "b"], ["c", "d", "i", "j"],
 ["e", "f", "k", "l", "o", "p"],
 ["g", "h"], ["m", "n", "q", "r"]]

cool, it comes with order all fucked up. This is something I was
criticized for before, though not all that important to most
functional processing. Not the case here, though.

here's a scheme version that is hopefully better than the given one:

(define (dig in)

(if (null? in) '()

(let* ((n (first-of-first in))

(all-n (filter in (lambda (x) (eq? n (first x)))))

(all-but-n (filter in (lambda (x) (not (eq? n (first
x)))))))

(pair

(fold all-n
(lambda (i o) (pair (second i) (pair (third i) o))))

(dig all-but-n)))))


; given these aliases to lisp n00bs

(define pair cons)

(define first car)

(define rest cdr)

(define first-of-first caar)

(define second cadr)

(define third caddr)


; and these well-known functions
(non-tail-recursive for benefit of n00bs)
(define (fold ls f) ; AKA reduce

(if (null? ls) '()

(f (first ls) (fold (rest ls) f))))


(define (filter ls f)

(fold ls (lambda (i o) (if (f i) (pair i o) o))))



; testing
(let ((in '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
(2 o p) (4 q r) (5 s t))))
(display (dig in))
(newline))
 
N

namekuseijin

On Sep 26, 9:24 am, (e-mail address removed) (Pascal J. Bourguignon)
wrote:
here's a interesting toy list processing problem.
I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing
the same label. So if I have the list
((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
r) (5 s t))
where the first element of each sublist is the label, I need to
produce:
output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
a Mathematica solution is here:
http://xahlee.org/UnixResource_dir/writ/notations_mma.html
R5RS Scheme lisp solution:
http://xahlee.org/UnixResource_dir/writ/Sourav_Mukherjee_sourav.work_...
by Sourav Mukherjee
also, a Common Lisp solution can be found here:
http://groups.google.com/group/comp.lang.lisp/browse_frm/thread/5d1de...
It's too complex. Just write:
(let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
              (2 o p) (4 q r) (5 s t))))
  (mapcar (lambda (class) (reduce (function append) class :key (function rest)))
           (com.informatimago.common-lisp.list:equivalence-classes list :key (function first)))
   )
--> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))

[[0, 'a', 'b'], [1, 'c', 'd'], [2, 'e', 'f'], [3, 'g', 'h'], [1,
'i', 'j'], [2, 'k', 'l'], [4, 'm', 'n'], [2, 'o', 'p'], [4, 'q', 'r'],
[5, 's', 't']].
group_by{|x| x.first}.values.map{|x| x.map{|y| y[1..-1]}.flatten}
    ==>[["s", "t"], ["a", "b"], ["c", "d", "i", "j"],
 ["e", "f", "k", "l", "o", "p"],
 ["g", "h"], ["m", "n", "q", "r"]]

cool, it comes with order all fucked up.  This is something I was
criticized for before, though not all that important to most
functional processing.  Not the case here, though.

here's a scheme version that is hopefully better than the given one:

(define (dig in)
  (if (null? in) '()
    (let* ((n         (first-of-first in))
           (all-n     (filter in (lambda (x)      (eq? n (first x)))))
           (all-but-n (filter in (lambda (x) (not (eq? n (first
x)))))))
       (pair
          (fold all-n
             (lambda (i o) (pair (second i) (pair (third i) o))))
          (dig all-but-n)))))

; given these aliases to lisp n00bs
(define pair cons)
(define first car)
(define rest cdr)
(define first-of-first caar)
(define second cadr)
(define third caddr)

; and these well-known functions (non-tail-recursive for benefit of
n00bs)
(define (fold ls f) ; AKA reduce
  (if (null? ls) '()
      (f (first ls) (fold (rest ls) f))))
(define (filter ls f)
  (fold ls (lambda (i o) (if (f i) (pair i o) o))))

; testing
(let ((in '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
            (2 o p) (4 q r) (5 s t))))
  (display (dig in))
  (newline))

;frakkin text editor...
 
S

sln

here's a interesting toy list processing problem.

I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing
the same label. So if I have the list

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
r) (5 s t))

where the first element of each sublist is the label, I need to
produce:

output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
[snip]

anyone care to give a solution in Python, Perl, javascript, or other
lang? am guessing the scheme solution can be much improved... perhaps
using some lib but that seems to show scheme is pretty weak if the lib
is non-standard.

Crossposting to Lisp, Python and Perl because the weird list of lists looks
like Lisp or something else, and you mention other languages so I'm throwing
this out for Perl.

It appears this string you have there is actually list syntax in another language.
If it is, its the job of the language to parse the data out. Why then do you
want to put it into another language form? At runtime, once the data is in variables,
dictated by the syntax, you can do whatever data manipulation you want
(combining arrays, etc..).

So, in the spirit of a preprocessor, given that the text is balanced, with proper closure,
ie: ( (data) (data) ) is ok.
( data (data) ) is not ok.

the below does simple text manipulation, joining like labeled sublists, without going into
the runtime guts of internalizing the data itself. Internally, this is too simple.

-sln
-----------------
Alternate input:
(
(
(0 a b) (1 c d) (2 e f )
)
(3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r) (5 s t)
)
------------------
use strict;
use warnings;

my $input = <<EOI;
((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r)
(5 s t))
EOI
my $output = $input;

my $regxout = qr/
( (?: \( \s* [^()]+ \s* \) (\s*) )+ )
/x;


$output =~
s{ $regxout }
{
my ( $list, $format ) = ( $1, $2 );
my ( %hseen,
@order,
$replace
);
while ($list =~ /\(\s* (\S+) \s* (.+?) \s*\)/xsg) {
if ( exists $hseen{$1} ) {
$hseen{$1} .= " $2";
next;
}
push @order, $1;
$hseen{$1} = $2;
}
for my $id (@order) {
$replace .= "($hseen{$id}) ";
}
$replace =~ s/ $//;
$replace . $format
}xeg;

print "Input -\n$input\n";
print "Output -\n$output";

__END__

Input -
((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q r)
(5 s t))

Output -
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
 
S

sln

here's a interesting toy list processing problem.

I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing
the same label. So if I have the list

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
r) (5 s t))

where the first element of each sublist is the label, I need to
produce:

output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
[snip]

anyone care to give a solution in Python, Perl, javascript, or other
lang? am guessing the scheme solution can be much improved... perhaps
using some lib but that seems to show scheme is pretty weak if the lib
is non-standard.

Crossposting to Lisp, Python and Perl because the weird list of lists looks
like Lisp or something else, and you mention other languages so I'm throwing
this out for Perl.

It appears this string you have there is actually list syntax in another language.
If it is, its the job of the language to parse the data out. Why then do you
want to put it into another language form? At runtime, once the data is in variables,
dictated by the syntax, you can do whatever data manipulation you want
(combining arrays, etc..).

So, in the spirit of a preprocessor, given that the text is balanced, with proper closure,
ie: ( (data) (data) ) is ok.
( data (data) ) is not ok.

the below does simple text manipulation, joining like labeled sublists, without going into
the runtime guts of internalizing the data itself. Internally, this is too simple.

If not preprocessor, then ...
The too simple, order independent, id independent, Perl approach.

-sln
-----------------

use strict;
use warnings;
use Data::Dump 'dump';

my @inp = ([0,'a','b'],[1,'c','d'],[2,'e','f'],[3,'g','h'],
[1,'i','j'],[2,'k','l'],[4,'m','n'],[2,'o','p'],
[4,'q','r'],[5,'s','t']);

my ($cnt, @outp, %hs) = (0);

for my $ref (@inp) {
$hs{ $$ref[0] } or $hs{ $$ref[0] } = $cnt++;
push @{$outp[ $hs{ $$ref[0] } ] }, @{$ref}[ 1 .. $#{$ref} ];
}

dump @outp;

__END__

(
["a", "b"],
["c", "d", "i", "j"],
["e", "f", "k", "l", "o", "p"],
["g", "h"],
["m", "n", "q", "r"],
["s", "t"],
)
 
X

Xah Lee

here's a interesting toy list processing problem.

I have a list of lists, where each sublist is labelled by
a number. I need to collect together the contents of all sublists
sharing
the same label. So if I have the list

((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n) (2 o p) (4 q
r) (5 s t))

where the first element of each sublist is the label, I need to
produce:

output:
((a b) (c d i j) (e f k l o p) (g h) (m n q r) (s t))
...

thanks all for many interesting solutions. I've been so busy in past
month on other computing issues and writing and never got around to
look at this thread. I think eventually i will, but for now just made
a link on my page to point to here.

now we have solutions in perl, python, ruby, common lisp, scheme lisp,
mathematica. I myself would also be interested in javascript perhps
i'll write one soon. If someone would go thru all these solution and
make a good summary with consistent format/names of each solution...
that'd be very useful i think. (and will learn a lot, which is how i
find this interesting)

PS here's a good site that does very useful comparisons for those
learning multiple langs.

* 〈Lisp: Common Lisp, Scheme, Clojure, Emacs Lisp〉
http://hyperpolyglot.wikidot.com/lisp
* 〈Scripting Languages: PHP, Perl, Python, Ruby, Smalltalk〉
http://hyperpolyglot.wikidot.com/scripting
* 〈Scripting Languages: Bash, Tcl, Lua, JavaScript, Io〉
http://hyperpolyglot.wikidot.com/small
* 〈Platform Languages: C, C++, Objective C, Java, C#〉
http://hyperpolyglot.wikidot.com/c
* 〈ML: Standard ML, OCaml, F#, Scala, Haskell〉 http://hyperpolyglot.wikidot.com/ml

Xah ∑ http://xahlee.org/ ☄
 
W

WJ

Pascal said:
It's too complex. Just write:

(let ((list '((0 a b) (1 c d) (2 e f) (3 g h) (1 i j) (2 k l) (4 m n)
(2 o p) (4 q r) (5 s t))))

(mapcar (lambda (class) (reduce (function append) class :key
(function rest)))
(com.informatimago.common-lisp.list:equivalence-classes list :key
(function first)))

)

--> ((S T) (Q R M N) (G H) (O P K L E F) (I J C D) (A B))

Clojure:

(def groups '((0 a b)(1 c d)(2 e f)(3 g h)(1 i j)(2 k l)(4 m n)
(2 o p)(4 q r) (5 s t)))

Using group-by:

(map (fn[[k v]](flatten (map rest v))) (group-by first groups))

Using reduce:

(map #(flatten(rest %)) (reduce (fn[h [k & v]]
(merge-with concat h {k v})) {} groups))
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,962
Messages
2,570,134
Members
46,690
Latest member
MacGyver

Latest Threads

Top