bug in require?

A

Ara.T.Howard

[ahoward@localhost ~]$ strace ruby -e' 1.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

[ahoward@localhost ~]$ strace ruby -e' 2.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

[ahoward@localhost ~]$ strace ruby -e' 3.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

[ahoward@localhost ~]$ strace ruby -e' 4.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

[ahoward@localhost ~]$ cat a.rb
4.times{ require 'digest/md5' }

[ahoward@localhost ~]$ strace ruby a.rb 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3

why the difference?

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells amoung the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================
 
H

Han Holl

------=_Part_8219_4949088.1126865657887
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

[ahoward@localhost ~]$ cat a.rb
4.times{ require 'digest/md5' }
=20
[ahoward@localhost ~]$ strace ruby a.rb 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",=20
O_RDONLY|O_LARGEFILE) =3D 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",=20
O_RDONLY|O_LARGEFILE) =3D 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) =3D 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",=20
O_RDONLY|O_LARGEFILE) =3D 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",=20
O_RDONLY|O_LARGEFILE) =3D 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",=20
O_RDONLY|O_LARGEFILE) =3D 3
=20
why the difference?


If you look at the stracefile, you'll notice that the last three times it=
=20
just open and closes the file, without reading. I'm very interested in this=
,=20
because I've noticed the same behaviour with my Rails application, and ther=
e=20
the search path is _huge_.

Han Holl

-a
--
=20
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells amoung the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
=20
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D
=20
=20

------=_Part_8219_4949088.1126865657887--
 
M

Mauricio Fernández

If you look at the stracefile, you'll notice that the last three times it
just open and closes the file, without reading. I'm very interested in this,
because I've noticed the same behaviour with my Rails application, and there
the search path is _huge_.

static int
file_load_ok(const char *file)
{
FILE *f;

if (!file) return 0;
f = fopen(file, "r");
if (f == NULL) return 0;
fclose(f);
return 1;
}

(file.c)
 
H

Han Holl

------=_Part_8319_2271832.1126871795722
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Yes, that's answering where it happens.
Remains the question why.

Han

=20
=20 =20 =20
=20
static int
file_load_ok(const char *file)
{
FILE *f;
=20
if (!file) return 0;
f =3D fopen(file, "r");
if (f =3D=3D NULL) return 0;
fclose(f);
return 1;
}
=20
(file.c)
=20

------=_Part_8319_2271832.1126871795722--
 
Y

Yukihiro Matsumoto

Hi,

In message "Re: bug in require?"

|Yes, that's answering where it happens.
|Remains the question why.

It was the easiest way to check if a file is loadable (i.e. a file
exists and has proper permissions). Maybe it's possible to replace it
with access(3).

matz.
 
M

Mauricio Fernández

[ahoward@localhost ~]$ strace ruby -e' 4.times{ require "digest/md5.so" ===
[...]
[ahoward@localhost ~]$ cat a.rb
4.times{ require 'digest/md5' } ===
why the difference?

batsman@tux-chan:/tmp$ strace ruby -e' 4.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
batsman@tux-chan:/tmp$ strace ruby -e' 4.times{ require "digest/md5" }' 2>&1 |grep open|grep md5
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3

If no extension is given, ruby verifies if any new 'feature' can be loaded:

batsman@tux-chan:/tmp$ strace ruby -rfileutils -e'require "digest/md5"; FileUtils.mkdir("digest"); File.open("digest/md5.rb","w"){|f| f.puts %{puts "==== opened & loaded md5.rb"} }; $:.unshift "."; require "digest/md5" ' 2>&1 |grep open|grep md5
execve("/home/batsman/usr/bin/ruby", ["ruby", "-rfileutils", "-erequire \"digest/md5\"; FileUtils.mkdir(\"digest\"); File.open(\"digest/md5.rb\",\"w\"){|f| f.puts %{puts \"==== opened & loaded md5.rb\"} }; $:.unshift \".\"; require \"digest/md5\" "], [/* 31 vars */]) = 0
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("digest/md5.rb", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
write(3, "puts \"==== opened & loaded md5.r"..., 35) = 35
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
read(3, "puts \"==== opened & loaded md5.r"..., 4096) = 35
write(1, "==== opened & loaded md5.rb\n", 28==== opened & loaded md5.rb
 
A

Ara.T.Howard

[ahoward@localhost ~]$ cat a.rb
4.times{ require 'digest/md5' }

[ahoward@localhost ~]$ strace ruby a.rb 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3

why the difference?


If you look at the stracefile, you'll notice that the last three times it
just open and closes the file, without reading. I'm very interested in this,
because I've noticed the same behaviour with my Rails application, and there
the search path is _huge_.

yeah exactly - i'm trying to speed up acgi running strace's and was seeing md5
get opened like 20 times. this sure could slow something like rails down
alright.

guess this is the point where it's be smart to look at the source for
rb_require...

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells amoung the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================
 
H

Han Holl

------=_Part_8576_8914796.1126880653560
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

=20
Hi,
=20
In message "Re: bug in require?"
=20
|Yes, that's answering where it happens.
|Remains the question why.
=20
It was the easiest way to check if a file is loadable (i.e. a file
exists and has proper permissions). Maybe it's possible to replace it
with access(3).
=20
matz.
=20
Ara's question was what the difference was between execution from the=20
commandline and from a script.
Only now I see that the difference is that from the commandline he requires=
=20
"digest/md5.so", and from the file
'digest/md5', without the extension.
Which means that you get a considerable perfomance penalty by omitting the=
=20
extension. In a gem-installed rails environment the seachpath can easily be=
=20
25 directories. Multiple requires withou extension can be costly there.

Cheers,

Han

------=_Part_8576_8914796.1126880653560--
 
A

Ara.T.Howard

commandline and from a script. Only now I see that the difference is that
from the commandline he requires "digest/md5.so", and from the file
'digest/md5', without the extension. Which means that you get a
considerable perfomance penalty by omitting the extension. In a
gem-installed rails environment the seachpath can easily be 25 directories.
Multiple requires withou extension can be costly there.


right. try this fix for your rails app:

harp:~ > strace ruby -e'4.times{ require "digest/md5" }' 2>&1 |egrep 'open.*md5'
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3


harp:~ > strace ruby -r ./require.rb -e'4.times{ require "digest/md5" }' 2>&1 |egrep 'open.*md5'
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3
open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3


harp:~ > cat ./require.rb
require 'rbconfig.rb'
module Kernel
alias_method '__require__', 'require'
def require lib
if((ext = lib[ %r/\.[^\.]+$/ ]))
__require__ lib
else
dlext, dlext2 = ::Config::CONFIG['DLEXT'], ::Config::CONFIG['DLEXT2']
__require__ "#{ lib }.#{ dlext }" rescue
__require__ "#{ lib }.#{ dlext2 }" rescue
__require__ "#{ lib }.rb"
end
end
end


cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells amoung the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================
 
N

nobu.nokada

Hi,

At Fri, 16 Sep 2005 23:24:16 +0900,
Han Holl wrote in [ruby-talk:156435]:
Ara's question was what the difference was between execution from the
commandline and from a script.
Only now I see that the difference is that from the commandline he requires
"digest/md5.so", and from the file
'digest/md5', without the extension.
Which means that you get a considerable perfomance penalty by omitting the
extension. In a gem-installed rails environment the seachpath can easily be
25 directories. Multiple requires withou extension can be costly there.

If md5.rb exists in loadpath, it should be loaded even if
md5.so has been loaded.
 
H

Han Holl

------=_Part_9125_14149565.1126898731686
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

=20
=20
harp:~ > cat ./require.rb
require 'rbconfig.rb'
module Kernel
alias_method '__require__', 'require'
def require lib
if((ext =3D lib[ %r/\.[^\.]+$/ ]))
__require__ lib
else
dlext, dlext2 =3D ::Config::CONFIG['DLEXT'], ::Config::CONFIG['DLEXT2']
__require__ "#{ lib }.#{ dlext }" rescue
__require__ "#{ lib }.#{ dlext2 }" rescue
__require__ "#{ lib }.rb"
end
end
end
=20
=20
=20
Yes, that's one way of dealing with it.
Another approach that I find attractive is autoload. I think in the case of=
=20
the Digest namespace, one could write:
module Digest
autoload :MD5, 'digest/md5'
end
Autoloading also has the advantage of looser coupling, lazy loading and=20
auto-determining load order in case of (almost cyclic) dependencies.
It should perform better than redundant requires as well, but I doubt this=
=20
is noticeable.

Cheers,

Han Holl

------=_Part_9125_14149565.1126898731686--
 
H

Han Holl

------=_Part_9141_26100838.1126899407385
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

=20
=20
If md5.rb exists in loadpath, it should be loaded even if
md5.so has been loaded.


I don't doubt for a minute that you are right, but in practice this will of=
=20
course hardly ever occur, and with the gem-induced explosion of the search=
=20
path it is probably worthwhile to do something about it: rewriting require,=
=20
or introducing a specialized require with a different name, or autoloading.

Intuitively, if I say: require 'some/feature' , I would expect (almost)=20
nothing to happen if this feature had been required before. About 50 failin=
g=20
lookups in directories is not almost nothing.
Maybe we could have some global flag meaning: I don't need this overly=20
correct behaviour, please do nothing if either .rb of .so (or .dll) has bee=
n=20
loaded ?

Cheers,

Han Holl

------=_Part_9141_26100838.1126899407385--
 
J

Jay Levitt

I don't doubt for a minute that you are right, but in practice this will of
course hardly ever occur, and with the gem-induced explosion of the search
path it is probably worthwhile to do something about it: rewriting require,
or introducing a specialized require with a different name, or autoloading.

I was noticing this too the other day; Windows file opens are much
slower than Linux, and even small rails apps can spend a second or two
on Windows just waiting for the file system while it goes searching for
gems at startup. Makes unit testing a good bit less fun.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,994
Messages
2,570,223
Members
46,812
Latest member
GracielaWa

Latest Threads

Top