join not in Enumerable

Logan Capaldo

Just a few minutes ago I was playing with irb as I am wont to do, and
typed this:

('a'..'z').join(' ')

Lo and behold, it protested at me with a NoMethodError. I said to
myself, "Self, there is no reason that has to be Array-only functionality.
Why isn't it in Enumerable?" So I said:

module Enumerable
  def join(sep = '')
    inject do |a, b|
      "#{a}#{sep}#{b}"
    end
  end
end

And then I said ('a'..'z').join(' ') and got:
=> "a b c d e f g h i j k l m n o p q r s t u v w x y z"

#inject has to be the most dangerously effective method ever. But I digress:

Why is join, and perhaps even pack in Array and not in Enumerable?
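
One detail of the inject version above is worth spelling out: with no initial value, inject returns nil for an empty collection and returns a lone element unchanged, so it does not always hand back a String the way Array#join does. A minimal sketch, using a standalone helper (join_with_inject is a made-up name, not part of Ruby) so that no core class is touched:

```ruby
# Hypothetical helper mirroring the inject-based join above.
def join_with_inject(enum, sep = '')
  enum.inject { |a, b| "#{a}#{sep}#{b}" }
end

p join_with_inject('a'..'e', ' ')  # "a b c d e"
p join_with_inject([], ',')        # nil, whereas [].join(',') is ""
p join_with_inject([42], ',')      # 42 (an Integer), whereas [42].join(',') is "42"
```

Passing an initial value to inject, as later replies do, is one way around both edge cases.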
 
Ara.T.Howard

Why is join, and perhaps even pack in Array and not in Enumerable?

the only reason i can think of is that just because something is countable
(Enumerable) doesn't mean each sub-thing is singular. take a hash for
example. this is no stumbling block (pun intended) for ruby however:

harp:~ > cat a.rb
module Enumerable
  def join(sep = '', &b)
    inject(nil){|s,x| "#{ s }#{ s && sep }#{ b ? b[ x ] : x }"}
  end
end
class Array; def join(*a, &b); super; end; end

r = 'a' .. 'z'
p(r.join(' '))

h = {:k => :v, :K => :V}
p(h.join(';'){|kv| kv.join '=>'})

a = [ [0, 1], [2, 3] ]
p(a.join(','){|kv| kv.join ':'})


harp:~ > ruby a.rb
"a b c d e f g h i j k l m n o p q r s t u v w x y z"
"k=>v;K=>V"
"0:1,2:3"

this allows 'nesting' of join calls for arbitrarily deep enumerable structures.

a3 = [
  [ [:a, :b], [:c, :d] ],
  [ [:e, :f], [:g, :h] ],
]

p( a3.join('___'){|a2| a2.join('__'){|a1| a1.join '_'}} )

#=> "a_b__c_d___e_f__g_h"
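
The nesting idea above can also be tried without reopening Enumerable or Array. The sketch below is only an illustration: join_each is a hypothetical helper name, and it leans on map plus Array#join rather than the inject version shown in the post.

```ruby
# Hypothetical helper: join any Enumerable, optionally transforming each
# element with a block first (for example, by joining a nested collection).
def join_each(enum, sep = '', &blk)
  enum.map { |x| blk ? blk.call(x) : x }.join(sep)
end

h  = { :k => :v, :K => :V }
a3 = [
  [ [:a, :b], [:c, :d] ],
  [ [:e, :f], [:g, :h] ],
]

p join_each(h, ';') { |kv| kv.join('=>') }
# "k=>v;K=>V"
p join_each(a3, '___') { |a2| join_each(a2, '__') { |a1| join_each(a1, '_') } }
# "a_b__c_d___e_f__g_h"
```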

it's a nice idea you have there!

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================
 
David A. Black

Hi --

You can speed it up a lot if you do this:

module Enumerable
  def join(sep = '')
    to_a.join(sep)
  end
end

Benchmarking 10 calls to each version, for a dummy class where each
just iterates from 1 to 1000:

            user     system      total        real
inject  2.720000   0.030000   2.750000 (  2.759071)
to_a    0.300000   0.000000   0.300000 (  0.298650)

Why is join, and perhaps even pack in Array and not in Enumerable?

I guess to_a makes the conversion pretty easy, and Array tends to
serve as the "normalized" version of Enumerable in a lot of contexts.
I don't know if there's any other reason.
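
For anyone who wants to repeat the comparison described above, here is a rough sketch. The Counter class and the helper names are assumptions made for illustration (the original benchmark script was not posted), and timings will vary by machine and Ruby version.

```ruby
# Dummy Enumerable whose #each just iterates from 1 to 1000.
class Counter
  include Enumerable

  def each
    1.upto(1000) { |i| yield i }
  end
end

# inject-based version: builds a new String on every step.
def join_inject(enum, sep = '')
  enum.inject { |a, b| "#{a}#{sep}#{b}" }
end

# to_a-based version: collect once, then let Array#join do the work.
def join_to_a(enum, sep = '')
  enum.to_a.join(sep)
end

# Returns the elapsed wall-clock seconds for the given block.
def seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

counter = Counter.new
puts format('inject %.6f', seconds { 10.times { join_inject(counter, ',') } })
puts format('to_a   %.6f', seconds { 10.times { join_to_a(counter, ',') } })
```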


David
 
Jim Weirich

Hi --



You can speed it up a lot if you do this:

[... elided version using to_a ...]

The non-to_a version is slow because it creates a series of
increasingly large strings. A faster version (without resorting to to_a)
would build up a single string gradually. Here is another version:

def join(sep='')
  inject(nil) { |a, b|
    a ? (a << sep << b.to_s) : "#{b}"
  }
end

Here are the timings I got ...

                  user     system      total        real
to_a:         0.580000   0.000000    0.580000 (  0.583975)
inject slow: 10.520000   0.210000   10.730000 ( 11.998484)
inject fast:  0.590000   0.020000    0.610000 (  0.651972)
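
The difference between the two inject versions comes down to interpolation allocating a fresh String on every step versus << appending to one buffer in place. A small sketch of the in-place approach as a standalone helper (join_append is a hypothetical name); the dup is an added precaution so the first element is never mutated when it is already a String:

```ruby
# Hypothetical helper: append into a single buffer instead of building
# a new String for every element.
def join_append(enum, sep = '')
  enum.inject(nil) do |acc, item|
    # The first element seeds the buffer; dup so the caller's object is untouched.
    acc ? (acc << sep << item.to_s) : item.to_s.dup
  end
end

words = ['a', 'b', 'c']
p join_append(words, '+')   # "a+b+c"
p words                     # ["a", "b", "c"], the originals are not modified
p join_append(1..3, '-')    # "1-2-3"
```

Like the version it mirrors, this still returns nil for an empty collection.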
 
Christoph

Jim said:
def join(sep='')
  inject(nil) { |a, b|
    a ? (a << sep << b.to_s) : "#{b}"
  }
end

It's
p ([].join) # ""

so this should be

def join(sep="")
  if sep == ""
    inject('') { |a, b|
      a << b.inspect
    }
  else
    inject('') { |a, b|
      a << sep << b.inspect
    }
  end
end
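
Two things to watch in a version like this: inspect wraps String elements in quotes, and appending sep before every element leaves a leading separator. The sketch below keeps the goal stated above (an empty collection yields "") but uses to_s and places the separator only between elements. join_strict is a hypothetical helper name, and this is one possible variation rather than what was posted.

```ruby
# Hypothetical helper: always returns a String, "" for an empty collection.
def join_strict(enum, sep = '')
  enum.each_with_index.inject(+'') do |acc, (item, index)|
    acc << sep unless index.zero?   # separator goes between elements only
    acc << item.to_s
  end
end

p join_strict([], ',')               # ""
p join_strict(%w[a b c], ', ')       # "a, b, c"
p join_strict(1..3)                  # "123"
```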

/Christoph
 
nobu.nokada

Hi,

At Sun, 22 May 2005 08:56:08 +0900,
Ara.T.Howard wrote in [ruby-talk:143311]:
the only reason i can think of is that just because something is countable
(Enumerable) doesn't mean each sub-thing is singular. take a hash for
example. this is no stumbling block (pun intended) for ruby however:

Feels interesting.


Index: enum.c
===================================================================
RCS file: /cvs/ruby/src/ruby/enum.c,v
retrieving revision 1.54
diff -U2 -p -r1.54 enum.c
--- enum.c 30 Oct 2004 06:56:17 -0000 1.54
+++ enum.c 22 May 2005 09:36:21 -0000
@@ -967,4 +967,52 @@ enum_zip(argc, argv, obj)
}

+static VALUE
+enum_join_s(obj, arg, recur)
+    VALUE obj, *arg;
+    int recur;
+{
+    if (recur) {
+        static const char recursed[] = "[...]";
+        if (!NIL_P(arg[1]) && RSTRING(arg[0])->len != 0) {
+            rb_str_append(arg[0], arg[1]);
+        }
+        rb_str_cat(arg[0], recursed, sizeof(recursed) - 1);
+    }
+    else {
+        if (rb_block_given_p()) {
+            obj = rb_yield(obj);
+        }
+        if (TYPE(obj) != T_STRING) {
+            obj = rb_obj_as_string(obj);
+        }
+        if (!NIL_P(arg[1]) && RSTRING(arg[0])->len != 0) {
+            rb_str_append(arg[0], arg[1]);
+        }
+        rb_str_append(arg[0], obj);
+    }
+    return arg[0];
+}
+
+static VALUE
+enum_join_i(el, arg)
+    VALUE el, arg;
+{
+    return rb_exec_recursive(enum_join_s, el, arg);
+}
+
+static VALUE
+enum_join(argc, argv, obj)
+    int argc;
+    VALUE *argv;
+    VALUE obj;
+{
+    VALUE arg[2];
+
+    rb_scan_args(argc, argv, "01", &arg[1]);
+    arg[0] = rb_str_new(0, 0);
+    rb_iterate(rb_each, obj, enum_join_i, (VALUE)arg);
+    return arg[0];
+}
+
/*
* The <code>Enumerable</code> mixin provides collection classes with
@@ -998,4 +1046,5 @@ Init_Enumerable()
rb_define_method(rb_mEnumerable,"inject", enum_inject, -1);
rb_define_method(rb_mEnumerable,"partition", enum_partition, 0);
+ rb_define_method(rb_mEnumerable,"classify", enum_classify, 0);
rb_define_method(rb_mEnumerable,"all?", enum_all, 0);
rb_define_method(rb_mEnumerable,"any?", enum_any, 0);
@@ -1008,4 +1057,5 @@ Init_Enumerable()
rb_define_method(rb_mEnumerable,"each_with_index", enum_each_with_index, 0);
rb_define_method(rb_mEnumerable, "zip", enum_zip, -1);
+ rb_define_method(rb_mEnumerable, "join", enum_join, -1);

id_eqq = rb_intern("===");
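
For readers who do not read C, the main path of the patch above can be approximated in Ruby roughly as follows. This is a reading of the posted code, not an official port: join_like_patch is a hypothetical name, to_s stands in for rb_obj_as_string, and the recursion guard that emits "[...]" is left out.

```ruby
# Rough Ruby rendering of enum_join from the patch (recursion guard omitted).
def join_like_patch(enum, sep = nil)
  buf = +''
  enum.each do |el|
    el = yield(el) if block_given?        # optional per-element block
    # As in the C code, the separator is added only once the buffer is non-empty.
    buf << sep if sep && !buf.empty?
    buf << el.to_s
  end
  buf
end

p join_like_patch({ :k => :v, :K => :V }, ',') { |kv| kv.join('=>') }
# "k=>v,K=>V"
p join_like_patch([], ',')          # ""
p join_like_patch(['', 'a'], ',')   # "a", whereas ['', 'a'].join(',') is ",a"
```

The last line shows a consequence of testing the buffer length rather than the element position: leading empty strings do not get a separator.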
 
Kristof Bastiaensen

I guess to_a makes the conversion pretty easy, and Array tends to
serve as the "normalized" version of Enumerable in a lot of contexts.
I don't know if there's any other reason.

I believe it is because join requires an ordered collection, and enumerables
aren't guaranteed to be ordered. For example, the order of traversing a
Hash may differ for a different hash with the same elements. For this
reason the output of join for an enumerable is undefined.

Regards,
KB
 
Robert Klemme

Kristof Bastiaensen said:
I believe because join requires an ordered collection, and enumerables
aren't guaranteed to be ordered.

That would be my answer, too.

For example the order of traversing a
Hash may differ for a different hash with the same elements. For this
reason the output of join for an enumerable is undefined.

At least it is unpredictable. Even more so: order may change completely
with each insertion:
h=(0..5).inject({}){|h,i| h[i.to_s]=i;h}
=> {"0"=>0, "1"=>1, "2"=>2, "3"=>3, "4"=>4, "5"=>5}
h.to_a
=> [["0", 0], ["1", 1], ["2", 2], ["3", 3], ["4", 4], ["5", 5]]
h["6"]=6
=> 6
h.to_a
=> [["6", 6], ["0", 0], ["1", 1], ["2", 2], ["3", 3], ["4", 4], ["5", 5]]
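
If the traversal order of a collection cannot be relied on, one way to get a stable string is to sort before joining. A minimal sketch (the "key=value" format is just an example). Note that the session above reflects the Ruby of this thread; from Ruby 1.9 onward Hash preserves insertion order, so there the sort only matters when a canonical order is wanted.

```ruby
h = {}
%w[5 0 3 1 4 2].each { |k| h[k] = k.to_i }

# Hash#sort returns an Array of [key, value] pairs ordered by key,
# so the joined result no longer depends on traversal order.
stable = h.sort.map { |k, v| "#{k}=#{v}" }.join(',')
p stable   # "0=0,1=1,2=2,3=3,4=4,5=5"
```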

Kind regards

robert
 
David A. Black

Hi --

I believe because join requires an ordered collection, and enumerables
aren't guaranteed to be ordered. For example the order of traversing a
Hash may differ for a different hash with the same elements. For this
reason the output of join for an enumerable is undefined.

I don't think the unorderedness would matter; consider, for example,
Hash#to_s.


David
 
David A. Black

Hi --

That would be my answer, too.

As per my previous post, I don't think that matters for join, which is
just a "dumb" string representation facility and won't care about
order.

Another related thought: Enumerables have this underlying numerical
index, as reflected in Enumerable#each_with_index. Even hashes are,
in that sense, "ordered": their elements are "indexed" from 0 up.

I have to say, though, that I think #each_with_index should be removed
from Enumerable and pushed down to the classes that mix it in
(similarly to #each_index). But I suppose as long as they are called
"enumerable" they are in some sense associated with a numerical index.

That's probably only tangentially related to #join, though. Mainly I
think that #join is just a fancy #to_s, and orderedness isn't an
issue.


David
 
Ara.T.Howard

only you would crank that out in C nobu ;-)


looks good for enumerable:

harp:~/build/ruby > ./ruby -e' p( {:k => :v, :K => :V }.join(","){|kv| kv.join "=>"} ) '
"k=>v,K=>V"

but doesn't override Array's current behaviour:

harp:~/build/ruby > ./ruby -e'a3 = [ [ [4], [2] ], [ ["forty"], ["two"] ] ]; p a3.join("___"){|a2| a2.join("__"){|a1| a1.join "_"}}'
"4___2___forty___two"


i'm not sure how to do this in C:

module Enumerable
  def join(sep = '', &b)
    inject(nil){|s,x| "#{ s }#{ s && sep }#{ b ? b[ x ] : x }"}
  end
end
class Array
  def join(*a, &b); super; end
end

so Array's join is clobbered...
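
The behaviour above follows from method lookup: a class's own method is found before one from a module it includes, so Array#join shadows any Enumerable#join until it is redefined to call super. The sketch below shows the same mechanics with stand-ins (Joiner and Bag are made-up names playing the roles of Enumerable and Array) so that no core class is modified.

```ruby
# Stand-in for the Enumerable#join defined in this thread.
module Joiner
  def join(sep = '', &blk)
    inject(nil) { |s, x| "#{s}#{s && sep}#{blk ? blk.call(x) : x}" }
  end
end

# Stand-in for Array: it has its own join, which ignores any block.
class Bag
  include Enumerable
  include Joiner

  def initialize(*items)
    @items = items
  end

  def each(&blk)
    @items.each(&blk)
  end

  def join(sep = '')
    @items.join(sep)
  end
end

bag = Bag.new([0, 1], [2, 3])
before = bag.join(',') { |pair| pair.join(':') }
p before   # "0,1,2,3" (Bag#join wins and the block is ignored)

# Reopen the class so that join simply forwards to the module's version.
class Bag
  def join(*args, &blk)
    super
  end
end

after = bag.join(',') { |pair| pair.join(':') }
p after    # "0:1,2:3" (now Joiner#join runs and the block is used)
```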

kind regards.


-a
 
Daniel Berger

Logan said:
Why is join, and perhaps even pack in Array and not in Enumerable?

Because there is nothing explicitly iterative about join. Also, every
class except Array would have to have a custom definition of join,
since there's no reasonable default behavior for any class outside of
Array. And if every class would have to implement its own version of a
method, that method doesn't belong in a module. Modules are not
interfaces.

I can see that Ara has already found an excuse to give join a block -
lovely. The slide continues....

Regards,

Dan
 
Ara.T.Howard

Because there is nothing explicitly iterative about join.

Enumerable#join(sep): concatenate each (Enumerable#each) thing onto a string
followed by sep, unless it is the last (implying iteration) thing.

isn't this definition reasonable and iterative?

Also, every class except Array would have to have a custom definition of
join, since there's no reasonable default behavior for any class outside of
Array.

really?

set = Set::new
set.join ','

ll = LinkedList::new
ll.join '->'

dll = DoublyLinkedList::new
dll.join '<->'

v = BitVector::new
v.join '|'

path = graph.shortest_path from, to
path.join '=>'

string = String::new
string.join "<br>"

stack = Stack::new
stack.join '-'

rope = Rope::new
rope.join '_'

priority_queue = PriorityQueue::new
priority_queue.join(','){|priority_and_obj| priority_and_obj.join ':'}

come to mind ;-)
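
To make one of these concrete: a collection only has to supply each to pick up everything in Enumerable, and a join can then be layered on top without any class-specific logic. The LinkedList below is a hypothetical minimal class written for this illustration; its join delegates to Array#join through to_a, as suggested earlier in the thread.

```ruby
# Hypothetical singly linked list; only #each is needed for Enumerable.
class LinkedList
  include Enumerable

  Node = Struct.new(:value, :next_node)

  def initialize(*values)
    @head = nil
    # Build the chain back to front so the first value ends up at the head.
    values.reverse_each { |v| @head = Node.new(v, @head) }
  end

  def each
    node = @head
    while node
      yield node.value
      node = node.next_node
    end
  end

  # Generic join: nothing here is specific to linked lists.
  def join(sep = '')
    to_a.join(sep)
  end
end

ll = LinkedList.new(1, 2, 3)
p ll.join('->')   # "1->2->3"
```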

And if every class would have to implement its own version of a method, that
method doesn't belong in a module. Modules are not interfaces.

why would every class have to implement its own? with the definition we've
been throwing around we already have things like

harp:~/build/ruby > ./ruby -e'html = "line1\nline2\nline3".join "<br>"; p html'
"line1\n<br>line2\n<br>line3"

which is kinda handy and makes good sense no?
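
The example above works because, in the Ruby of this thread, String mixes in Enumerable and iterates by line. From Ruby 1.9 onward String no longer includes Enumerable, so a rough present-day equivalent goes through each_line explicitly. A small sketch:

```ruby
text = "line1\nline2\nline3"

# each_line keeps the trailing newline on each piece, matching the output above.
html = text.each_line.to_a.join("<br>")
p html      # "line1\n<br>line2\n<br>line3"

# With chomp: true the newlines are dropped before joining.
compact = text.each_line(chomp: true).to_a.join("<br>")
p compact   # "line1<br>line2<br>line3"
```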

I can see that Ara has already found an excuse to give join a block -
lovely. The slide continues....

weee. ;-)

cheers.

-a
 
Daniel Berger

Ara.T.Howard said:
really?

set = Set::new
set.join ','

ll = LinkedList::new
ll.join '->'

dll = DoublyLinkedList::new
dll.join '<->'

v = BitVector::new
v.join '|'

path = graph.shortest_path from, to
path.join '=>'

string = String::new
string.join "<br>"

stack = Stack::new
stack.join '-'

rope = Rope::new
rope.join '_'

priority_queue = PriorityQueue::new
priority_queue.join(','){|priority_and_obj| priority_and_obj.join ':'}

come to mind ;-)

Fine, replace "Array" with "most lists" and my point still stands.

why would every class have to implement its own? with the definition we've
been throwing around we already have things like

harp:~/build/ruby > ./ruby -e'html = "line1\nline2\nline3".join "<br>"; p html'
"line1\n<br>line2\n<br>line3"

which is kinda handy and makes good sense no?

No, it doesn't make sense.

Regards,

Dan
 