Help me clean up this method

Vincent Foley

Hello guys,

I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?


def Dir.size(dirname)
  Dir.chdir(dirname)
  entries = Dir.entries(".").reject { |x| %w(. ..).include? x }
  entries.collect! { |filename| File.expand_path(filename) }
  size = 0
  entries.each do |filename|
    begin
      if File.file?(filename)
        size += File.size(filename) rescue 0
      else
        size += Dir.size(filename)
      end
    rescue
      next
    end
  end
  size
end

Thanks,
Vincent.
 
James Edward Gray II

Vincent said:
I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?

File.size(dirname) seems to be working on my system. Am I missing
something obvious?

James Edward Gray II
 
wuhy80

Yes, File.size(dirname) works, but it only calculates the size of the
current directory itself. It does not include the subdirectories.
 
Edward Faulkner

James said:
File.size(dirname) seems to be working on my system. Am I missing
something obvious?

File.size(dirname) only tells you how many bytes the directory's own
inode is using. It doesn't include the bytes for the directory's
files.
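To see the difference for yourself, a quick sketch along these lines may help. It assumes a reasonably recent Ruby (Dir.children needs 2.5 or later), and direct_file_bytes is a made-up helper name, not part of the standard library. The number File.size reports for the directory itself varies by platform and file system.

```ruby
require 'tmpdir'

# Hypothetical helper: add up the sizes of the regular files directly inside dir.
def direct_file_bytes(dir)
  Dir.children(dir)
     .map { |name| File.join(dir, name) }
     .select { |path| File.file?(path) }
     .sum { |path| File.size(path) }
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), 'x' * 10_000)
  # Size of the directory entry itself; the value depends on the platform.
  puts "File.size(dir):     #{File.size(dir)}"
  # Bytes held by the files inside the directory.
  puts "bytes in its files: #{direct_file_bytes(dir)}"
end
```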

How about this:

def Dir.size(dname)
  Dir.new(dname).inject(File.size(dname)) {|total,name|
    begin
      exname = File.expand_path(name,dname)
      if File.file?(exname)
        total + File.size(exname)
      elsif File.directory?(exname) and name != '.' and name != '..'
        total + Dir.size(exname)
      else
        total
      end
    rescue
      total
    end
  }
end

-Ed

 
Hristo Deshev

Vincent said:
I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?


Hi guys,

I managed to get rid of the file names discovery by using Dir's globbing
facilities. The size calculation is then a matter of a single inject call:

def Dir.size(name)
  Dir.chdir(name)
  files = Dir["**/*"]
  files.inject(0) do |total, name|
    if File.file?(name)
      total + File.size(name)
    else
      total
    end
  end
end

puts Dir.size(".")
puts Dir.size("D:/tmp/ruby")
puts Dir.size("C:/Windows")

I don't like the "if" statement inside the block that gets injected. Is
there a better, idiomatic way to express the same thing?
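One possible answer, offered as a sketch rather than the definitive idiom: filter first, then sum, so the block no longer needs a conditional. The name dir_size_glob is hypothetical, Array#sum needs Ruby 2.4 or later, and the File::FNM_DOTMATCH flag is an addition of mine so that hidden files are counted too (a plain "**/*" skips them). It also avoids Dir.chdir, so the caller's working directory is left alone.

```ruby
# Hypothetical helper: total bytes of all regular files below root.
def dir_size_glob(root)
  # FNM_DOTMATCH makes the pattern match dotfiles as well.
  Dir.glob(File.join(root, '**', '*'), File::FNM_DOTMATCH)
     .select { |path| File.file?(path) }   # keep regular files only
     .sum { |path| File.size(path) }       # an empty list sums to 0
end
```

Something like `puts dir_size_glob(".")` would then replace the calls above. A root path containing glob metacharacters such as `[` or `{` would confuse the pattern, so this is only suitable for ordinary directory names.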

Hristo Deshev

 
Hristo Deshev

The mailing list critters ate my indentation! Maybe this one will get
through:

|def Dir.size(name)
|  Dir.chdir(name)
|  files = Dir["**/*"]
|  files.inject(0) do |total, name|
|    if File.file?(name)
|      total + File.size(name)
|    else
|      total
|    end
|  end
|end

Hristo Deshev

 
Eric Hodel

Vincent said:
I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?

require 'find'

def Dir.size(dirname)
  size = 0
  Find.find dirname do |name|
    next unless File.file? name
    size += File.size name rescue 0
  end
  return size
end
 
William James

Vincent said:
Hello guys,

I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?

def Dir.size(name)
  Dir[ name + "/**/*" ].select{|x| File.file?(x)}.inject(0){|sum,f|
    sum + File.size(f) }
end
 
Brian Schröder

William said:
def Dir.size(name)
  Dir[ name + "/**/*" ].select{|x| File.file?(x)}.inject(0){|sum,f|
    sum + File.size(f) }
end

One more safety addition:
def Dir.size(name)
  Dir[ File.join(name, "**/*") ].select{ | f | File.file?(f) }.inject(0){ | sum, f |
    sum + File.size(f)
  }
end

regards,

Brian
--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/
 
Ara.T.Howard

just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered - in the best cases linked files are
counted twice or linked dirs are not checked, in the worst case infinite loops
occur. the methods using 'Dir[glob]' rather than 'Find::find' suffer from the
link issue and will also perform badly on large file systems. unfortunately
ruby's built-in 'Find::find' cannot deal with links - for that you have to rely
on motoyuki kasahara's Find2 module, which you can get off of the raa. i have
it inlined in my personal library (alib - also on the raa) with a few small bug
fixes and interface additions. to use it you would do something like:

require 'alib'

def dir_size dir
  size = 0
  totalled = {}
  ALib::Util::find2(dir, 'follow' => true) do |path, stat|
    begin
      next if totalled[stat.ino]
      next unless stat.file?
      size += stat.size
    ensure
      totalled[stat.ino] = true
    end
  end
  size
end

p dir_size('.')

this handles huge directories, duplicate files (links) in a directory, linked
directories, and potential infinite loops. i think this is about as simply as
one can write this without introducing subtle, or not so subtle, bugs.
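for readers without alib, here is a rough standard-library approximation of the same bookkeeping. it is not Find2 and not the alib API: it uses plain Find.find with File.lstat, so symlinks are skipped rather than followed, and the name dir_size_dedup is made up for this sketch. hard-linked files are counted once by remembering their (device, inode) pair, which assumes a platform that reports meaningful inode numbers.

```ruby
require 'find'
require 'set'

# Hypothetical helper: sum regular-file sizes below root, counting each
# (device, inode) pair only once so hard links are not double counted.
def dir_size_dedup(root)
  seen  = Set.new
  total = 0
  Find.find(root) do |path|
    stat = begin
      File.lstat(path)            # lstat: do not follow symlinks
    rescue SystemCallError
      next                        # unreadable or vanished entry: skip it
    end
    next unless stat.file?        # symlinks and directories are not counted
    next unless seen.add?([stat.dev, stat.ino])  # add? returns nil if already seen
    total += stat.size
  end
  total
end
```

usage would be the same as above, e.g. `p dir_size_dedup('.')`.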

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells among the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================
 
David Brady

Ara.T.Howard said:
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered

If you have links, you also have du, which IMO is the Right Tool For
This Job:

total = `du --max-depth=0|cut -f 1`.to_i
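If you go this route, a slightly more defensive sketch might look like the following. It assumes a du that accepts the POSIX -s and -k options, and du_kilobytes is a name I made up for illustration. It passes the directory as an argument instead of relying on the current directory, and it checks the exit status so a failed du does not quietly turn into 0.

```ruby
require 'open3'

# Hypothetical helper: ask du for the disk usage of dir, in kilobytes.
def du_kilobytes(dir)
  # Passing the arguments separately avoids shell quoting problems.
  out, status = Open3.capture2('du', '-sk', dir)
  raise "du failed for #{dir}" unless status.success?
  out.split.first.to_i   # output looks like "<kilobytes>\t<dir>"
end
```

Called as `du_kilobytes('.')`, this reports disk usage as du sees it, not the sum of file lengths.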

-dB
 
Simon Kröger

David said:
If you have links, you also have du, which IMO is the Right Tool For
This Job:

total = `du --max-depth=0|cut -f 1`.to_i

sorry for stating the obvious, but: this isn't very portable.

cheers

Simon
 
David Brady

Simon said:
sorry for stating the obvious, but: this isn't very portable.


Yup! :)

<soapbox>
Portability isn't a good idea here.

(wait for scandalized gasps to quiet down)

The need here was to clean up the code. The solutions so far have
bulkily danced around the fact that Find::find doesn't seem to work
satisfactorily.

If the Standard Library is defective, the proper, *portable* solution
should be to patch the .c files that are defective. Until then, if
we're going to work around the Standard Library, we should be as quick
and deadly as possible.

I recognize that it may be seen as cheating or possibly even
inappropriate to suggest "don't use Ruby for this" on a Ruby mailing
list, but I feel strongly that Ruby's very ability to take advantage of
external tools is a great strength of the language. A superior wheel
already exists; reinventing it poorly does not seem to me to leverage
Ruby's power very well.

There seems to me to be a "right" way to do this, and it is to have the
Standard Library work as desired. Until then, how much effort should be
spent standardizing on a portable kludge?
</soapbox>

And of course, if the OP is on a different platform, the if statement
that started my earlier post will become important. The else clause
probably reads "then the issues with Find::find and links don't affect you.
Use the Standard Library."

Just my $0.02. Note sig.

-dB

--
David Brady
(e-mail address removed)
C++ Guru. Ruby nuby. Apply salt as needed.
 
William James

Ara.T.Howard said:
this handles huge directories, duplicate files (links) in a directory, linked
directories, and potential infinite loops. i think this is about as simply as
one can write this without introducing subtle, or not so subtle, bugs.

This solution fails under windoze by always returning 0, apparently because
File.stat(path).ino always returns 0.

If you're stuck with windoze, use the previously posted

def Dir.size(name)
  Dir[File.join(name, "**/*")].select{|f|
    File.file?(f)}.inject(0){|sum,f|
    sum + File.size(f)
  }
end
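Why a constant inode of 0 produces a total of 0 can be shown with a small simulation. This is only a sketch: Entry is a stand-in struct invented here in place of real File::Stat objects, the first function copies the control flow of the dir_size loop quoted above, and the second function (trust an inode number only when it is non-zero) is my own assumption about a possible workaround, not something proposed in the thread.

```ruby
# Stand-in for File::Stat: inode number, size in bytes, regular file or not.
Entry = Struct.new(:ino, :size, :regular) do
  def file?
    regular
  end
end

# Same control flow as the dir_size loop, applied to in-memory entries.
def total_like_dir_size(entries)
  size = 0
  totalled = {}
  entries.each do |stat|
    begin
      next if totalled[stat.ino]
      next unless stat.file?
      size += stat.size
    ensure
      totalled[stat.ino] = true   # runs even for the directory visited first
    end
  end
  size
end

# Variant that only de-duplicates when the inode number is non-zero.
def total_ignoring_zero_inodes(entries)
  size = 0
  totalled = {}
  entries.each do |stat|
    next unless stat.file?
    if stat.ino != 0
      next if totalled[stat.ino]
      totalled[stat.ino] = true
    end
    size += stat.size
  end
  size
end

# A directory plus two files, all reporting inode 0.
zero_inodes = [Entry.new(0, 0, false), Entry.new(0, 10, true), Entry.new(0, 20, true)]
# A directory, one file seen twice through a link, and another file.
real_inodes = [Entry.new(1, 0, false), Entry.new(2, 10, true),
               Entry.new(2, 10, true), Entry.new(3, 20, true)]

p total_like_dir_size(zero_inodes)         # => 0
p total_like_dir_size(real_inodes)         # => 30
p total_ignoring_zero_inodes(zero_inodes)  # => 30
p total_ignoring_zero_inodes(real_inodes)  # => 30
```

The directory is visited first, the ensure clause records inode 0 as already totalled, and every later entry is then skipped.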
 
Ara.T.Howard

David said:
If you have links, you also have du, which IMO is the Right Tool For This
Job:

total = `du --max-depth=0|cut -f 1`.to_i

windows has links - and systems that have du may not have one that supports
max-depth. and du reports on file system blocks - not directory sums. this is
typically close, but can diverge greatly depending on file system setup and the
number of directories - since du will report usage for directories too.

harp:~ > mkdir foobar

harp:~ > du foobar
4 foobar

fyi.
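the same gap between blocks and bytes can be seen from ruby itself. a small sketch, assuming File::Stat#blocks counts 512-byte units (as it does on linux) and may be nil on platforms that do not report it. the helper name logical_and_allocated is made up here.

```ruby
require 'tmpdir'

# Hypothetical helper: returns [logical size in bytes, allocated bytes or nil].
def logical_and_allocated(path)
  stat = File.stat(path)
  # blocks is nil where the platform does not report it.
  allocated = stat.blocks ? stat.blocks * 512 : nil
  [stat.size, allocated]
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'tiny.txt')
  File.write(path, 'x')            # a one-byte file
  size, allocated = logical_and_allocated(path)
  puts "logical size: #{size}"
  puts "allocated:    #{allocated.inspect}"
end
```

summing stat.size gives what the ruby versions in this thread compute. summing allocated blocks is closer to what du prints.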


-a
 
Ara.T.Howard

William said:
This solution fails under windoze by always returning 0, apparently because
File.stat(path).ino always returns 0.

works for me :

Ara@JEN ~
$ cat a.rb
require 'alib'

def dir_size dir
  size = 0
  totalled = {}
  ALib::Util::find2(dir, 'follow' => true) do |path, stat|
    begin
      next if totalled[stat.ino]
      next unless stat.file?
      size += stat.size
    ensure
      totalled[stat.ino] = true
    end
  end
  size
end

p dir_size('.')

Ara@JEN ~
$ ruby a.rb
29432845

Ara@JEN ~
$ du -sb .
29432903 .

Ara@JEN ~
$ ruby -r rbconfig -r yaml -e'y Config::CONFIG' |egrep -i win
target: i686-pc-cygwin
ac_ct_WINDRES: windres
WINDRES: windres
archdir: /usr/lib/ruby/1.8/i386-cygwin
sitearch: i386-cygwin
arch: i386-cygwin
host_os: cygwin
build: i686-pc-cygwin
host: i686-pc-cygwin
build_os: cygwin
target_os: cygwin
sitearchdir: /usr/lib/ruby/site_ruby/1.8/i386-cygwin

so it seems like your ruby may be broken - how'd you install it?

cheers.

-a
 
William James

Ara.T.Howard said:
so it seems like your ruby may be broken - how'd you install it?

Mine:

build: i686-pc-mswin32
build_os: mswin32
host: i686-pc-mswin32
host_os: mswin32
target: i386-pc-mswin32
target_os: mswin32

Yours:

build: i686-pc-cygwin
build_os: cygwin
host: i686-pc-cygwin
host_os: cygwin
target: i686-pc-cygwin
target_os: cygwin

Try it without cygwin. On my system, stat.ino is always 0.

If anyone else is running plain windoze without cygwin, see if
File.stat(path).ino is always 0.
 
Ara.T.Howard

hmmm. you are right. this seems to be a bug in ruby - amazing that no-one
has seen it before though? it looks like this might be in rb_uint2big but i'm
kinda guessing since i can't compile on windows myself and therefore can't
look at config.h - except in cygwin, which works. anyhow - the numbers spat
out in cygwin are huge - so i'm guessing the bug is here. perhaps someone out
there with a windows compiler tool-chain could examine?

so, just to re-state the bug: File::stat(anypath).ino is always zero under the
one click installer, but not under cygwin using the default or compiling by
hand.

cheers.

-a
 
Daniel Berger

Ara.T.Howard wrote:

so, just to re-state the bug: File::stat(anypath).ino is always zero under the
one click installer, but not under cygwin using the default or compiling by
hand.

Generally speaking, File.stat on Windows is not reliable. Too many of
the Stat members are either meaningless or wrong. Revamping it is on
my TODO list for the win32-file package.

Regards,

Dan
 
