Question: Time efficiency of Array <<

Peter Suk

Forgive the newbie-ish question. I have been playing around with
Array, and discovered the << operator:

irb(main):001:0> array = [1, 2, 3, 4]
=> [1, 2, 3, 4]
irb(main):002:0> array2 = array << 5
=> [1, 2, 3, 4, 5]
irb(main):003:0> array
=> [1, 2, 3, 4, 5]
irb(main):004:0> array2
=> [1, 2, 3, 4, 5]
irb(main):005:0>

I am curious about the time & space complexity of n << operations to an
array. Is it O(n^2) or is it O(n)? Is there a doubling of allocated
space going on behind the scenes?

--Peter
 
Eric Hodel



I am curious about the time & space complexity of n << operations to
an array. Is it O(n^2) or is it O(n)? Is there a doubling of
allocated space going on behind the scenes?

Take a look at rb_ary_store in array.c...

It looks like an Array grows by half of its current capacity when an
index is larger than the current capacity, but by no less than
ARY_DEFAULT_SIZE (16 elements).
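The rule can be sketched in Ruby (an illustrative model of the C logic described above, not the actual implementation; only the constant name is taken from array.c):

```ruby
# Illustrative model of the growth rule: capacity grows by half of
# itself, but by no less than ARY_DEFAULT_SIZE elements.
ARY_DEFAULT_SIZE = 16

def grown_capacity(capacity)
  capacity + [capacity / 2, ARY_DEFAULT_SIZE].max
end

grown_capacity(16)    # => 32   (half is only 8, so the 16-element floor applies)
grown_capacity(1024)  # => 1536 (grows by half: 512)
```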


--
Eric Hodel - (e-mail address removed) - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

 
Ara.T.Howard



Take a look at rb_ary_store in array.c...

It looks like an Array grows by half of its current capacity when an
index is larger than the current capacity, but by no less than
ARY_DEFAULT_SIZE (16 elements).

so i guess it's O(n), since that's the worst case. note that one can avoid
the reallocation entirely via

a = Array::new(mb = 2 ** 20)
a.size.times{|i| a[i] = method i}

unless storing nil is required...

interestingly real-world time with largish values is better for array than a
container with O(log n) performance:

harp:~ > cat a.rb
require 'rbtree'

def time label
  a = Time::now.to_f
  fork{ yield } and Process::wait
  b = Time::now.to_f
  puts "#{ label } @ #{ b - a }"
end

n = 2 ** 20

time('rbtree2array') do
  rbtree = RBTree::new
  n.times{|i| rbtree[i] = rand}
  array = rbtree.values
end

time('array <<') do
  array = []
  n.times{|i| array << rand}
end

harp:~ > ruby a.rb
rbtree2array @ 8.43025994300842
array << @ 1.26344704627991

who knew. perhaps function call overhead ends up being responsible for the
difference.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| although gold dust is precious, when it gets in your eyes, it obstructs
| your vision. --hsi-tang
===============================================================================
 
Saynatkari

On 23/4/2005, Ara.T.Howard wrote:


Take a look at rb_ary_store in array.c...

It looks like an Array grows by half of its current capacity when an
index is larger than the current capacity, but by no less than
ARY_DEFAULT_SIZE (16 elements).

so i guess it's O(n), since that's the worst case. note that one can avoid
the reallocation entirely via

a = Array::new(mb = 2 ** 20)
a.size.times{|i| a[i] = method i}

unless storing nil is required...

interestingly real-world time with largish values is better for array than a
container with O(log n) performance:


I think it is a bit of a stretch to call the following test
'real-world' :) That being said, Array does perform well.

Remember that an rb-tree is O(log n) for insertion, deletion
_and_ search, and it can be treated as sparse.
harp:~ > cat a.rb
require 'rbtree'

def time label
  a = Time::now.to_f
  fork{ yield } and Process::wait
  b = Time::now.to_f
  puts "#{ label } @ #{ b - a }"
end

n = 2 ** 20

time('rbtree2array') do
  rbtree = RBTree::new
  n.times{|i| rbtree[i] = rand}
  array = rbtree.values
end

time('array <<') do
  array = []
  n.times{|i| array << rand}
end

harp:~ > ruby a.rb
rbtree2array @ 8.43025994300842
array << @ 1.26344704627991


As a sidenote, you can use Benchmark for convenient timing:

require 'benchmark'

Benchmark.bm do |bench|
  # Setup
  # ...
  bench.report { }  # test1
  bench.report { }  # test2
  # ...
end
who knew. perhaps function call overhead ends up being responsible for the
difference.

In your test, it most likely suffers from allocating
memory piecemeal rather than in chunks. Not sure how
effectively one could preallocate for a tree.

Ara.T.Howard

I think it is a bit of a stretch to call the following test 'real-world' :)
That being said, Array does perform well.

true true. by that i meant more than just big-O analysis.
Remember that an rb-tree is O(log n) for insertion, deletion _and_ search,
and it can be treated as sparse.

oh i know - i use rbtree all the time and can't imagine why it's not in the
core. on the other hand i also have bsearch code for Arrays that makes them
O(log n) for searching in most (pre-sorted) cases.
As a sidenote, you can use Benchmark for convenient timing:

require 'benchmark'

Benchmark.bm do |bench|
  # Setup
  # ...
  bench.report { }  # test1
  bench.report { }  # test2
  # ...
end

i prefer to fork and (tho i forgot in this case) kill the GC - i like
factoring out threads and eliminating any possible memory clashes.
In your test, it most likely suffers from allocating memory piecemeal rather
than in chunks. Not sure how effectively one could preallocate for a tree.

yes that's true too. i've done tests in the past comparing bdb, gperf, and
other hashing type look ups against a bsearch and found them to be quite a bit
slower. on closer analysis it turned out that, although hashing is O(1), the
call stack is so much deeper for each search as compared to a non-recursive
bsearch that it predominates - i was surprised.
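A non-recursive bsearch of the kind mentioned can be sketched like this (an illustrative version, not the code referred to above):

```ruby
# Iterative (non-recursive) binary search over a sorted Array.
# Returns the index of target, or nil if it is absent.
def bsearch_index(sorted, target)
  lo, hi = 0, sorted.length - 1
  while lo <= hi
    mid = (lo + hi) / 2
    case sorted[mid] <=> target
    when 0  then return mid
    when -1 then lo = mid + 1   # target is in the upper half
    else         hi = mid - 1   # target is in the lower half
    end
  end
  nil
end

a = (0...1000).map { |i| i * 2 }  # sorted: 0, 2, 4, ...
bsearch_index(a, 246)  # => 123
bsearch_index(a, 7)    # => nil
```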

thanks for the info.

cheers.

-a
 
Peter Suk

I think it is a bit of a stretch to call the following test
'real-world' :) That being said, Array does perform well.

Remember that an rb-tree is O(log n) for insertion, deletion
_and_ search, and it can be treated as sparse.

The key here is "real world." Array's trick of doubling its allocation
gives us an amortized time of O(2n) = O(n) for n insertions, which beats
n insertions into a structure like a binary tree, which is O(n log n).

One of the handful of truly useful things I learned from Algorithms.

A good summary for the Array trick:

http://www.eli.sdsu.edu/courses/fall96/cs660/notes/amortizedAnalysis/amortizedAnalysis.html

Some more:

http://www.cs.duke.edu/~mlittman/courses/Archive/cps130-97/lectures/lect10/node14.html


You can think of it as the "reverse Zeno's paradox."

1 + 2 + 4 + 8 + ....
2^0 + 2^1 + 2^2 + 2^3 + ... + 2^(n-1) = 2^n - 1

Or visually:

X
XX
XXXX
XXXXXXXX
XXXXXXXXXXXXXXXX

You'll always be able to stack the smaller levels into the missing
space on the next to the last level and still have 1 empty space left
over.
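The picture above can be checked directly: under doubling, growing to n elements copies fewer than 2n elements in total, so the amortized cost per append is constant (a small sketch, assuming each reallocation copies every existing element):

```ruby
# Count how many element copies a doubling strategy performs while
# growing from capacity 1 up to at least n elements.
def total_copies(n)
  capacity = 1
  copies = 0
  while capacity < n
    copies += capacity   # a reallocation copies all current elements
    capacity *= 2
  end
  copies
end

n = 2**20
total_copies(n)  # => 1048575, i.e. 2**20 - 1: less than 2n, so O(n) overall
```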

So, Ruby's Array = Smalltalk's OrderedCollection.

--Peter
 
William Morgan

Excerpts from (e-mail address removed)'s mail of 22 Apr 2005 (EDT):
yes that's true too. i've done tests in the past comparing bdb,
gperf, and other hashing type look ups against a bsearch and found
them to be quite a bit slower. on closer analysis it turned out that,
although hashing is O(1), the call stack is so much deeper for each
search as compared to a non-recursive bsearch that it predominates - i
was suprised.

Just goes to show that theoretic bounds on worst case performance are
sometimes just that....
 
Peter Suk

Excerpts from (e-mail address removed)'s mail of 22 Apr 2005 (EDT):

Just goes to show that theoretic bounds on worst case performance are
sometimes just that....

It's actually that the upper bounds are too loose in the naive
analysis. Amortized worst-case bounds on the doubling re-allocation
are O(n) for n insertions, or the same O(1) per insertion as hashing.
Then, it should become clear that the constant is bigger for hashing.

--Peter
 
William Morgan

Excerpts from Peter Suk's mail of 23 Apr 2005 (EDT):
It's actually that the upper bounds are too loose in the naive
analysis. Amortized worst-case bounds on the doubling re-allocation
are O(n) for n insertions, or the same O(1) per insertion as hashing.
Then, it should become clear that the constant is bigger for hashing.

In this case he's talking about lookups and call stack depth.
 
Paulo Sérgio

Hi all,

My question is very "simple" ... how to define an abstract class (or
method) in ruby?
 
Christoph

Paulo said:
Hi all,

My question is very "simple" ... how to define an abstract class (or
method) in ruby?
How about

a) just define a module

b)

class MyAbstractClass
  private_class_method :new
end


or c)

class MyAbstractClass
  def self.new
    raise TypeError, "I am an abstract class"
  end
end
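One caveat with (c): subclasses inherit the raising new, so they would have to redefine it themselves. A variant that only blocks the abstract class itself (an illustrative tweak, not Christoph's code):

```ruby
# Variant of option (c): raise only when instantiating the abstract
# class itself, so subclasses can still be created normally.
class MyAbstractClass
  def self.new
    raise TypeError, "I am an abstract class" if self == MyAbstractClass
    super
  end
end

# A hypothetical concrete subclass, for illustration.
class MyConcreteClass < MyAbstractClass; end

MyConcreteClass.new    # works
# MyAbstractClass.new  # raises TypeError
```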


/Christoph
 
Saynatkari

On 23/4/2005, Paulo Sérgio wrote:
Hi all,

My question is very "simple" ... how to define an abstract class (or
method) in ruby?

Usually one does not need an abstract class in ruby, if you are
thinking along the lines of C++ or Java. The reason for this is that
ruby is not statically typed, so any class with a conforming interface
can be used where that interface is expected. This is the concept
of 'duck-typing'.

Could you describe your situation in more detail?
 
Paulo Sérgio

Saynatkari said:
Usually one does not need an abstract class in ruby, if you are
thinking along the lines of C++ or Java. Reason for this is that
ruby is not strictly typed so any class with a conforming interface
can be used where that interface is expected. This is the concept
of 'duck-typing'.

Could you describe your situation in more detail?

Hey thanks, but i figured out the ruby way :)
 
Peter Suk

Hi all,

My question is very "simple" ... how to define an abstract class (or
method) in ruby?

Paulo brings up a key issue. We should have a standard
"subclass_responsibility" method to denote this for methods. If there
is a standard idiom for Ruby, then code tools will be able to identify
purely abstract classes. (Classes that have no concrete methods at
all.)

But to elaborate, nothing prevents someone from putting concrete
methods into an abstract class. Ruby is not about enforcement. If you
want to communicate this, there is an idiom in Smalltalk where people
define a class-side method isAbstract like:

---- code follows ----
class Node
  def Node.abstract?
    return self == Node
  end
end
---- end code ----

Now you can make subclasses of Node and they will automatically return
false when asked "abstract?" If one of your subclasses also turns out
to be abstract, just write a similar method. Now if you want to know
if a class is abstract, you can just ask.
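In use, the idiom answers as described (assuming a Leaf subclass purely for illustration):

```ruby
class Node
  def Node.abstract?
    return self == Node
  end
end

# A hypothetical concrete subclass, for illustration.
class Leaf < Node; end

Node.abstract?  # => true
Leaf.abstract?  # => false (the inherited method compares self against Node)
```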

--Peter
 
Paulo Sérgio

Hi all, again,
I have a simple piece of code that's returning an unexpected result (to me).

def successors(state) # state is an integer array equal to, for example, [0,0,0,0]
  successorsStates = []
  for i in 0...state.length
    break if state[i] != 0
    for j in 1..4
      stateAux = state
      stateAux[i] = j
      puts stateAux.inspect # (i've put the "puts" methods just to illustrate what is happening)
      successorsStates.push(stateAux) # i've tried "successorsStates << stateAux" and "successorsStates.concat([stateAux])" too
      puts successorsStates.inspect
    end
  end
  return successorsStates
end

=================================================================================================

I was expecting successorsStates to be:
[[1,0,0,0],[2,0,0,0],[3,0,0,0],[4,0,0,0],[0,1,0,0],[0,2,0,0],[0,3,0,0],[0,4,0,0],

[0,0,1,0],[0,0,2,0],[0,0,3,0],[0,0,4,0],[0,0,0,1],[0,0,0,2],[0,0,0,3],[0,0,0,4]]

but it returns (look at how it concatenates):

[1, 0, 0, 0]
[[1, 0, 0, 0]]
[2, 0, 0, 0]
[[2, 0, 0, 0], [2, 0, 0, 0]]
[3, 0, 0, 0]
[[3, 0, 0, 0], [3, 0, 0, 0], [3, 0, 0, 0]]
[4, 0, 0, 0]
[[4, 0, 0, 0], [4, 0, 0, 0], [4, 0, 0, 0], [4, 0, 0, 0]]
[4, 1, 0, 0] --- *why isn't the result here [0,1,0,0], since stateAux =
state and then stateAux[i] = j (state = [0,0,0,0])?*
[[4, 1, 0, 0], [4, 1, 0, 0], [4, 1, 0, 0], [4, 1, 0, 0], [4, 1, 0, 0]]
[4, 2, 0, 0]
[[4, 2, 0, 0], [4, 2, 0, 0], [4, 2, 0, 0], [4, 2, 0, 0], [4, 2, 0, 0],
[4, 2, 0, 0]]
[4, 3, 0, 0]
[[4, 3, 0, 0], [4, 3, 0, 0], [4, 3, 0, 0], [4, 3, 0, 0], [4, 3, 0, 0],
[4, 3, 0, 0], [4, 3, 0, 0]]
[4, 4, 0, 0]
[[4, 4, 0, 0], [4, 4, 0, 0], [4, 4, 0, 0], [4, 4, 0, 0], [4, 4, 0, 0],
[4, 4, 0, 0], [4, 4, 0, 0], [4, 4, 0, 0]]
[4, 4, 1, 0]
[[4, 4, 1, 0], [4, 4, 1, 0], [4, 4, 1, 0], [4, 4, 1, 0], [4, 4, 1, 0],
[4, 4, 1, 0], [4, 4, 1, 0], [4, 4, 1, 0], [4, 4, 1, 0]]
[4, 4, 2, 0]
[[4, 4, 2, 0], [4, 4, 2, 0], [4, 4, 2, 0], [4, 4, 2, 0], [4, 4, 2, 0],
[4, 4, 2, 0], [4, 4, 2, 0], [4, 4, 2, 0], [4, 4, 2, 0], [4, 4, 2, 0]]
[4, 4, 3, 0]
[[4, 4, 3, 0], [4, 4, 3, 0], [4, 4, 3, 0], [4, 4, 3, 0], [4, 4, 3, 0],
[4, 4, 3, 0], [4, 4, 3, 0], [4, 4, 3, 0], [4, 4, 3, 0], [4, 4, 3, 0],
[4, 4, 3, 0]]
[4, 4, 4, 0]
[[4, 4, 4, 0], [4, 4, 4, 0], [4, 4, 4, 0], [4, 4, 4, 0], [4, 4, 4, 0],
[4, 4, 4, 0], [4, 4, 4, 0], [4, 4, 4, 0], [4, 4, 4, 0], [4, 4, 4, 0],
[4, 4, 4, 0], [4, 4, 4, 0]]
[4, 4, 4, 1]
[[4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1],
[4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1],
[4, 4, 4, 1], [4, 4, 4, 1], [4, 4, 4, 1]]
[4, 4, 4, 2]
[[4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2],
[4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2],
[4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2], [4, 4, 4, 2]]
[4, 4, 4, 3]
[[4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3],
[4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3],
[4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3], [4, 4, 4, 3]]
[4, 4, 4, 4]
*[[4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4],
[4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4],
[4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4], [4, 4, 4, 4],
[4, 4, 4, 4]] *this is the WRONG final state of successorsStates**
 
Florian Frank

Paulo said:
def successors(state) # state is an integer array equal to, for example, [0,0,0,0]
  successorsStates = []
  for i in 0...state.length
    break if state[i] != 0
    for j in 1..4
      stateAux = state


stateAux = state.dup
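With the fix applied, a sketch of the whole method (slightly cleaned up, and without the debugging puts):

```ruby
# successors with the dup fix: each candidate state is a fresh copy,
# so mutating it never touches the original state array.
def successors(state)
  successorsStates = []
  for i in 0...state.length
    break if state[i] != 0
    for j in 1..4
      stateAux = state.dup   # copy first, then mutate the copy
      stateAux[i] = j
      successorsStates.push(stateAux)
    end
  end
  return successorsStates
end

successors([0, 0, 0, 0]).length  # => 16, from [1,0,0,0] up to [0,0,0,4]
```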
 
Paulo Sérgio

Florian said:
stateAux = state.dup

Yeah, thanks, it really worked. The dup method creates a new copy of the
object, correct?
But I can't figure out why it didn't work without dup, since the state
variable never changes.
 
Douglas Livingstone

Yeah, thanks, it really worked. The dup method creates a new copy of the
object, correct?
yep

But I can't figure out why it didn't work without dup, since the state
variable never changes.

stateAux changes, and because stateAux = state makes both variables refer
to the same object, state changes too.
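A minimal sketch of that aliasing:

```ruby
# Assignment copies the reference, not the Array: both names see the
# same object, so an in-place change through one shows through the other.
state    = [0, 0, 0, 0]
stateAux = state          # no copy made
stateAux[0] = 4
state                     # => [4, 0, 0, 0]

copy = state.dup          # dup makes a new (shallow) copy
copy[1] = 9
state                     # => [4, 0, 0, 0], unchanged
```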

Douglas
 
Bertram Scharpf

On Sunday, 24 Apr 2005, 07:06:48 +0900, Paulo Sérgio wrote:
Subject: Re: What im doing WRONG?!?!?!

You post using the "Reply" function of your Thunderbird.
This makes it and its answers appear within another thread
everywhere thread view is turned on.

Please use the "New Message" Button instead of "Reply" on a
mail your post will have nothing to do with. Thanks.

You would probably like to turn on the thread view feature of your
Thunderbird, too.

Bertram
 
