Optimizing ruby constant array data

George Ogata · Mar 11, 2006

Mike Austin said:
I decided it was time to do a little profiling, and am so glad that
it's built right into Ruby. My biggest problem seems to be array
access:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 36.02     3.35      3.35      350     9.57    15.10  String#each_byte
 10.45     4.32      0.97    27230     0.04     0.04  Array#[]
  6.73     4.95      0.63    16100     0.04     0.04  GL.Vertex

I'll eventually go to display-lists or similar in OpenGL, but I was
wondering if I could optimize this any further?

def draw_string( string )
  size = font.height
  x = 0

  GL::Enable( GL::TEXTURE_2D )
  GL::Begin( GL::QUADS )
  string.each_byte do |char|
    offset = char - 32
    GL::TexCoord2f( @tex_coords_left[offset], @tex_coords_top[offset] );
    GL::Vertex( x, 0 )
    GL::TexCoord2f( @tex_coords_left[offset], @tex_coords_bottom[offset] );
    GL::Vertex( x, size )
    GL::TexCoord2f( @tex_coords_right[offset], @tex_coords_bottom[offset] );
    GL::Vertex( x + size, size )
    GL::TexCoord2f( @tex_coords_right[offset], @tex_coords_top[offset] );
    GL::Vertex( x + size, 0 )
    x += @sizes[char-32][0]
  end
  GL::End()
  GL::Disable( GL::TEXTURE_2D )
end

Perhaps use #at instead of #[] ? Using texture coord arrays might
help too. But if that's really only 10% of the total time, I don't
think it's going to make too much difference...
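
A minimal sketch for checking the #at suggestion on your own interpreter. The COORDS table and both helper methods are hypothetical stand-ins, not code from the thread, and timings depend heavily on the Ruby version, so no expected numbers are shown:

```ruby
require 'benchmark'

# Hypothetical stand-in for one of the @tex_coords_* tables (96 printable glyphs).
COORDS = Array.new(96) { |i| i / 96.0 }

# Sum n lookups using Array#[].
def lookup_with_brackets(table, n)
  total = 0.0
  n.times { |i| total += table[i % table.size] }
  total
end

# Same loop using Array#at, which only accepts a single integer index.
def lookup_with_at(table, n)
  total = 0.0
  n.times { |i| total += table.at(i % table.size) }
  total
end

n = 200_000
Benchmark.bm(10) do |bm|
  bm.report("Array#[]") { lookup_with_brackets(COORDS, n) }
  bm.report("Array#at") { lookup_with_at(COORDS, n) }
end
```

Both methods return the same value; only the dispatch path differs, which is why any gain is likely to be small.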

Mike Austin · Mar 11, 2006

I decided it was time to do a little profiling, and am so glad that it's built
right into Ruby. My biggest problem seems to be array access:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 36.02     3.35      3.35      350     9.57    15.10  String#each_byte
 10.45     4.32      0.97    27230     0.04     0.04  Array#[]
  6.73     4.95      0.63    16100     0.04     0.04  GL.Vertex

I'll eventually go to display-lists or similar in OpenGL, but I was wondering
if I could optimize this any further?

def draw_string( string )
  size = font.height
  x = 0

  GL::Enable( GL::TEXTURE_2D )
  GL::Begin( GL::QUADS )
  string.each_byte do |char|
    offset = char - 32
    GL::TexCoord2f( @tex_coords_left[offset], @tex_coords_top[offset] );
    GL::Vertex( x, 0 )
    GL::TexCoord2f( @tex_coords_left[offset], @tex_coords_bottom[offset] );
    GL::Vertex( x, size )
    GL::TexCoord2f( @tex_coords_right[offset], @tex_coords_bottom[offset] );
    GL::Vertex( x + size, size )
    GL::TexCoord2f( @tex_coords_right[offset], @tex_coords_top[offset] );
    GL::Vertex( x + size, 0 )
    x += @sizes[char-32][0]
  end
  GL::End()
  GL::Disable( GL::TEXTURE_2D )
end

Thanks,
Mike

Mike Austin · Mar 11, 2006

Thanks for the feedback. at() didn't seem to do much, but I just remembered
the glCallLists() trick:

def draw_string( string )
  GL::Enable( GL::TEXTURE_2D )
  GL::CallLists( string )
  GL::Disable( GL::TEXTURE_2D )
end

Oh yea

Mike

George said:
Perhaps use #at instead of #[] ? Using texture coord arrays might
help too. But if that's really only 10% of the total time, I don't
think it's going to make too much difference...

Mauricio Fernandez · Mar 11, 2006

I'll eventually go to display-lists or similar in OpenGL, but I was
wondering if I could optimize this any further?

def draw_string( string )
  size = font.height
  x = 0

  GL::Enable( GL::TEXTURE_2D )
  GL::Begin( GL::QUADS )
  string.each_byte do |char|
    offset = char - 32
    GL::TexCoord2f( @tex_coords_left[offset], @tex_coords_top[offset] );
    GL::Vertex( x, 0 )
    GL::TexCoord2f( @tex_coords_left[offset], @tex_coords_bottom[offset] );
    GL::Vertex( x, size )
    GL::TexCoord2f( @tex_coords_right[offset], @tex_coords_bottom[offset] );
    GL::Vertex( x + size, size )
    GL::TexCoord2f( @tex_coords_right[offset], @tex_coords_top[offset] );
    GL::Vertex( x + size, 0 )
    x += @sizes[char-32][0]
  end
  GL::End()
  GL::Disable( GL::TEXTURE_2D )
end

This should be a bit faster:

def draw_string( string )
  size = font.height
  x = 0

  GL::Enable( GL::TEXTURE_2D )
  GL::Begin( GL::QUADS )

  # lvars are often faster than dvars ...
  char = offset = left = top = bottom = right = 0
  string.each_byte do |char|
    offset = char - 32

    # ... and also faster than ivars (array access vs. hash lookup)
    # we save an extra method call too
    top = @tex_coords_top[offset]
    bottom = @tex_coords_bottom[offset]
    left = @tex_coords_left[offset]
    right = @tex_coords_right[offset]

    GL::TexCoord2f( left, top );
    GL::Vertex( x, 0 )
    GL::TexCoord2f( left, bottom );
    GL::Vertex( x, size )
    GL::TexCoord2f( right, bottom );
    GL::Vertex( x + size, size )
    GL::TexCoord2f( right, top );
    GL::Vertex( x + size, 0 )
    x += @sizes[char-32][0]
  end
  GL::End()
  GL::Disable( GL::TEXTURE_2D )
end

You can't expect much from such micro-optimizations, but they do help a bit:

RUBY_VERSION # => "1.8.4"
require 'benchmark'

puts "ivar vs. lvar"
Benchmark.bm(10) do |bm|
  o = Class.new do
    def initialize(ivar); @iv = ivar end
    def using_ivar; 1000000.times { @iv; @iv; @iv; @iv; @iv} end # !> useless use of a variable in void context
    def using_lvar; iv = @iv; 1000000.times { iv; iv; iv; iv; iv} end # !> useless use of a variable in void context
    def using_ivar2
      1000000.times {
        @iv; @iv; @iv; @iv; @iv # !> useless use of a variable in void context
        @iv; @iv; @iv; @iv; @iv # !> useless use of a variable in void context
        @iv; @iv; @iv; @iv; @iv # !> useless use of a variable in void context
        @iv; @iv; @iv; @iv; @iv # !> useless use of a variable in void context
      }
    end
    def using_lvar2
      iv = @iv; 1000000.times {
        iv; iv; iv; iv; iv # !> useless use of a variable in void context
        iv; iv; iv; iv; iv # !> useless use of a variable in void context
        iv; iv; iv; iv; iv # !> useless use of a variable in void context
        iv; iv; iv; iv; iv # !> useless use of a variable in void context
      }
    end
  end.new(1)
  bm.report("ivar"){ o.using_ivar }
  bm.report("lvar"){ o.using_lvar }
  bm.report("ivar (x4)"){ o.using_ivar2 }
  bm.report("lvar (x4)"){ o.using_lvar2 }
end

puts "dvar vs. lvar"

Benchmark.bm(10) do |bm|
  bm.report("dvar"){ 1000000.times{|i| i = 1} }
  j = 0
  bm.report("lvar"){ 1000000.times{|j| j = 1} }
end
# >> ivar vs. lvar
# >>                 user     system      total        real
# >> ivar        0.850000   0.000000   0.850000 (  0.886111)
# >> lvar        0.530000   0.000000   0.530000 (  0.529777)
# >> ivar (x4)   2.900000   0.010000   2.910000 (  2.998297)
# >> lvar (x4)   1.580000   0.000000   1.580000 (  1.645420)
# >> dvar vs. lvar
# >>                 user     system      total        real
# >> dvar        0.380000   0.000000   0.380000 (  0.390633)
# >> lvar        0.360000   0.000000   0.360000 (  0.374979)

Access and assignment to lvars will often be faster than to dvars,
especially if:
* you're accessing a dynamic variable from an enclosing lexical scope
* there are lots of variables
    foo { a = 1; bar{ b1 = 2; ...; b100 = 100; baz{ c = 3; foobar{ a } } } }
                                                                 =====

Whereas lvars are subscripted directly from an array, dvars are looked up
linearly.

Robert Klemme · Mar 11, 2006

Mike Austin said:
I decided it was time to do a little profiling, and am so glad that
it's built right into Ruby. My biggest problem seems to be array
access:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 36.02     3.35      3.35      350     9.57    15.10  String#each_byte
 10.45     4.32      0.97    27230     0.04     0.04  Array#[]
  6.73     4.95      0.63    16100     0.04     0.04  GL.Vertex

This looks rather like the code in each_byte was the problem. Notice that
self seconds for Array#[] is just 0.97 - while String#each_byte consumes
3.35. Or did I miss something?

Kind regards

robert
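
To make the comparison above concrete, here is a small sketch that recomputes the "self ms/call" column from the "self seconds" and "calls" columns of the quoted profile. The PROFILE constant and the helper name are illustrative only; the figures are copied from the table:

```ruby
# Rows copied from the quoted profile: [name, self seconds, calls].
PROFILE = [
  ["String#each_byte", 3.35, 350],
  ["Array#[]",         0.97, 27230],
  ["GL.Vertex",        0.63, 16100],
]

# Self time per call in milliseconds.
def self_ms_per_call(self_seconds, calls)
  self_seconds * 1000.0 / calls
end

PROFILE.each do |name, self_seconds, calls|
  printf("%-18s %6.2f ms/call\n", name, self_ms_per_call(self_seconds, calls))
end
```

Each Array#[] call is cheap; it only shows up in the profile because it is called 27230 times, while String#each_byte accounts for more than three times as much self time in just 350 calls.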
