Problem using FileUtils to sort JPEG files

F

forgottenwizard

This is an odd problem, I admit. I'm currently working on a short script
to sort a set of JPEGs by size, but right now I'm just trying to move
them from point A to point B (I'm still learning Ruby, so I do this to
try and figure things out before getting into the big stuff), and so far
I have gotten so far as:

####################CODE####################

#!/usr/bin/ruby -wd

require "RMagick"
require "fileutils"

Dir.chdir(ARGV[0])

for pix in Dir.glob("*.{jpeg,jpg,png}")
filetype = Magick::Image::read(pix).first.format
workdir = Dir.pwd
If File.directory?(filetype) then
FileUtils.move("#{pix}", "#{workdir}/#{filetype}")
else
Dir.mkdir(filetype)
FileUtils.move("#{pix}", "#{workdir}/#{filetype}")
end
end

####################//CODE####################

The error I keep getting these error messages:

Exception 'Errno::ENOENT' at /usr/lib/ruby/1.8/fileutils.rb:1420 - no
such file or directory - <insert pathname to where jpg is being moved
to>

Exception 'Errno::ENOENT' at /usr/lib/ruby/1.8/fileutils.rb:1200 - no
such file or directory - <insert pathname to where jpg is being moved
to>

for every file. These errors come, while the file is STILL moved. I'm
not sure what the error is, but its a noisy one (testdir I made for
trying this out has ~1900 files in it).

Any clue as to the problem?
 
7

7stud --

forgottenwizard said:
The error I keep getting these error messages:

Exception 'Errno::ENOENT' at /usr/lib/ruby/1.8/fileutils.rb:1420 - no
such file or directory - <insert pathname to where jpg is being moved
to>

Any clue as to the problem?


1) pix is already a string, so doing this:

#{pix}"

is unnecessary.


2) The following causes a syntax error:

If File.directory?(filetype) then

To see a similar syntax error, try this:

num = 10
If num == 10
print 'yes'
end


3) Your default loop probably should be each(), unless you have a
specific reason to use another loop:

Dir.glob("*.{jpeg,jpg,png}").each do |fname|

A for-loop has a different scope than each(), i.e. it does not create a
new scope. An each() loop creates a new scope, although it is leaky(see
example below).


4) I couldn't find any documentation for the RMagick format() method,
which is pretty typical of the horrible ruby documentation, however this
works error free for me:

require 'fileutils'

Dir.chdir("../some_dir")
workdir = Dir.pwd #outside of each block

Dir.glob("*.{jpeg,jpg,png}").each do |fname|
filetype = fname.split(".")[-1]

if File.directory?(filetype)
FileUtils.move(fname, "#{workdir}/#{filetype}") #workdir is
accessible inside each()
else
Dir.mkdir(filetype)
FileUtils.move(fname, "#{workdir}/#{filetype}")
end
end
 
W

William James

This is an odd problem, I admit. I'm currently working on a short script
to sort a set of JPEGs by size, but right now I'm just trying to move
them from point A to point B (I'm still learning Ruby, so I do this to
try and figure things out before getting into the big stuff), and so far
I have gotten so far as:

####################CODE####################

#!/usr/bin/ruby -wd

require "RMagick"
require "fileutils"

Dir.chdir(ARGV[0])

for pix in Dir.glob("*.{jpeg,jpg,png}")
filetype = Magick::Image::read(pix).first.format
workdir = Dir.pwd
If File.directory?(filetype) then

Are you telling us that this is the exact code that you
ran? It can't be. Ruby is case-sensitive. It's "if",
not "If". How did "If" get into this code? Didn't you
copy and paste? Or did you retype it?
FileUtils.move("#{pix}", "#{workdir}/#{filetype}")
else
Dir.mkdir(filetype)
FileUtils.move("#{pix}", "#{workdir}/#{filetype}")
end
end

require "fileutils"

Dir.chdir( ARGV[0] )
workdir = Dir.pwd

Dir[ "*.{jpeg,jpg,png}" ].each{|pix|
filetype = File.extname( pix )[1..-1]
Dir.mkdir( filetype ) if not File.exists?( filetype )
FileUtils.move( pix, "#{workdir}/#{filetype}" )
}
 
F

forgottenwizard

1) pix is already a string, so doing this:

#{pix}"

is unnecessary.

I do that in places to make sure there isn't a problem somewhere with a
command taking a wrong arg. Unneeded it may be, but it makes me feel
better when I'm debugging.

2) The following causes a syntax error:

If File.directory?(filetype) then

To see a similar syntax error, try this:

num = 10
If num == 10
print 'yes'
end

Do you have a better suggestion? I was thinking of moving that outside
of the for loop and only doing it once, but looping through everything
twice (once to check to see if the directories exist and/or make them,
and a second to sort), but leaving it in makes a little more sense to
me.

Can you help explain the exact error, or is it just saying that the
directory is already there?

3) Your default loop probably should be each(), unless you have a
specific reason to use another loop:

Dir.glob("*.{jpeg,jpg,png}").each do |fname|

A for-loop has a different scope than each(), i.e. it does not create a
new scope. An each() loop creates a new scope, although it is leaky(see
example below).

That comes from using bash/C before. I'll look into trying it with each
soon.
4) I couldn't find any documentation for the RMagick format() method,
which is pretty typical of the horrible ruby documentation, however this
works error free for me:

require 'fileutils'

Dir.chdir("../some_dir")
workdir = Dir.pwd #outside of each block

Dir.glob("*.{jpeg,jpg,png}").each do |fname|
filetype = fname.split(".")[-1]

if File.directory?(filetype)
FileUtils.move(fname, "#{workdir}/#{filetype}") #workdir is
accessible inside each()
else
Dir.mkdir(filetype)
FileUtils.move(fname, "#{workdir}/#{filetype}")
end
end

Except if a file (we'll call it image) is in the dir, it won't get
sorted, which is why I'm usin RMagick in the first place. I never fully
trusted just relying on extensions.

And yes, I do notice the flaw in my logic here, and I'm going to correct
it within Dir.glob now :)
 
F

forgottenwizard

This is an odd problem, I admit. I'm currently working on a short script
to sort a set of JPEGs by size, but right now I'm just trying to move
them from point A to point B (I'm still learning Ruby, so I do this to
try and figure things out before getting into the big stuff), and so far
I have gotten so far as:

####################CODE####################

#!/usr/bin/ruby -wd

require "RMagick"
require "fileutils"

Dir.chdir(ARGV[0])

for pix in Dir.glob("*.{jpeg,jpg,png}")
filetype = Magick::Image::read(pix).first.format
workdir = Dir.pwd
If File.directory?(filetype) then

Are you telling us that this is the exact code that you
ran? It can't be. Ruby is case-sensitive. It's "if",
not "If". How did "If" get into this code? Didn't you
copy and paste? Or did you retype it?

Retyped it. uxterm doesn't like copy+paste.
 
F

forgottenwizard

The RMagick doc is not typical. It is the result of many hundreds of hours
of my work.

http://www.simplesystems.org/RMagick/doc/imageattrs.html#format

Since you wrote the docs for RMagick (or alot of it, I'm guessing), can
you tell me how to have RMagick handle diffrent filetypes properly? I've
tried to have it check to see if a file was of an image type, and if not
ignore it (which would be closer to ideal), but I don't know how to do
that, and I havn't seen any docs on doing such.
 
T

Tim Hunter

forgottenwizard said:
Since you wrote the docs for RMagick (or alot of it, I'm guessing), can
you tell me how to have RMagick handle diffrent filetypes properly? I've
tried to have it check to see if a file was of an image type, and if not
ignore it (which would be closer to ideal), but I don't know how to do
that, and I havn't seen any docs on doing such.

I wrote it all.

To use RMagick to determine if an arbitrary file is an image file that
it (that is, ImageMagick) can handle, try to read it. If it's not an
image file, it'll raise an exception.
 
7

7stud --

Tim said:
I wrote it all.

Then here's a suggestion. On the documentation page, display a link
that says:

all RMagick methods

When the user clicks on the link, display a page with an alphabetical
listing of all the methods in RMagick--where each listing is a link to
the documentation for that method. Easy peasy, Japaneasy.



Except if a file (we'll call it image) is in the dir, it won't get
sorted, which is why I'm usin RMagick in the first place.

Ah, I see. I would create the directories outside the loop:

require 'fileutils'

Dir.chdir("../somedir")

filetypes = ["jpeg", "jpg", "png"]

filetypes.each do |type|
if not File.directory?(type)
Dir.mkdir(type)
end
end

Then you don't have to check each file's format to see if a
corresponding directory exists. Then based on what Tim Hunter said, you
can do something like this:


lookup = filetypes.inject({}) do |hash, type|
hash[type] = true
hash
end
#lookup = {'jpeg'=>true,'jpg'=>true,'png'=>true}

glob_str = filetypes.join(",")

Dir.glob("*.{#{glob_str}}").each do |fname|
begin
filetype = Magick::Image::read(pix).first.format
#If that causes an error, then execution jumps
#down to the rescue block below.

if lookup[filetype]
FileUtils.move(fname, "./#{filetype}")
end

rescue IOError
#do nothing
end
end
 
F

forgottenwizard

forgottenwizard said:
Except if a file (we'll call it image) is in the dir, it won't get
sorted, which is why I'm usin RMagick in the first place.

Ah, I see. I would create the directories outside the loop:

require 'fileutils'

Dir.chdir("../somedir")

filetypes = ["jpeg", "jpg", "png"]

filetypes.each do |type|
if not File.directory?(type)
Dir.mkdir(type)
end
end

Then you don't have to check each file's format to see if a
corresponding directory exists. Then based on what Tim Hunter said, you
can do something like this:

I think for now I'm going to keep everything within the loop. It makes a
bit more sense to me to check to make sure the dir exist while looking
through everything then to go through and hope the first loop got it
all.

I'll keep that suggestion, though, for a possible rewrite later on.
lookup = filetypes.inject({}) do |hash, type|
hash[type] = true
hash
end
#lookup = {'jpeg'=>true,'jpg'=>true,'png'=>true}

glob_str = filetypes.join(",")

Dir.glob("*.{#{glob_str}}").each do |fname|
begin
filetype = Magick::Image::read(pix).first.format
#If that causes an error, then execution jumps
#down to the rescue block below.

if lookup[filetype]
FileUtils.move(fname, "./#{filetype}")
end

rescue IOError
#do nothing
end
end

I'm showing how much of a newbie I am here, but I added in the rescue
block the same as you have, and it still exists the loop when run over
the script itself. What is the best way to figure out what kind of
problem there is to throw into the rescue block?
 
7

7stud --

It makes a
bit more sense to me to check to make sure the dir exist
while looking through everything

Consider this: how does ruby determine if a filename is a directory?
Suppose your directory has 2,000 files in it. Next, suppose you ask
ruby if "my_file.jpg" is a directory. What if ruby first has to look at
each and every one of those 2,000 files to look for a match with
filename? Then if it found a match, ruby had to look at the stats for
the file to determine if it's a directory. That would mean that every
time you ask if a filename is a directory, ruby has to step through
every one of those 2,000 files to look for a match. Do you think that
is very efficient?

I added in the rescue
block the same as you have, and it still exists the loop when run over
the script itself.

I have no idea what that means. If you are asking how to determine what
error to catch with the rescue statement, do this: remove the rescue
statement from your code and try to read a file that RMagick doesn't
recognize, and then examine the error message. The error message will
give the name of the error. For instance, if you run this program:

puts x

the error message is:

"undefined local variable or method `x' for main:Object (NameError)"

To handle that error, you can write:

begin
puts x
rescue NameError
puts "caught the error"
end
 
F

forgottenwizard

Consider this: how does ruby determine if a filename is a directory?
Suppose your directory has 2,000 files in it. Next, suppose you ask
ruby if "my_file.jpg" is a directory. What if ruby first has to look at
each and every one of those 2,000 files to look for a match with
filename? Then if it found a match, ruby had to look at the stats for
the file to determine if it's a directory. That would mean that every
time you ask if a filename is a directory, ruby has to step through
every one of those 2,000 files to look for a match. Do you think that
is very efficient?

The way I'm currently going through all this, and the cases I'm trying
to keep from running across (such as a jpg named myfile), what it seems
to me you are suggesting is going through and check every single file to
make sure there isn't a new format, and if there is, make a new
directory, THEN go back though the same directory, find the format
AGAIN, then sort through everything.

It makes more sense to me to go through and do both at once, since
(correct me if I am wrong here) it takes longer to find the file format
as opposed to, say, seeing if there is a directory with the same name as
that format.
I have no idea what that means. If you are asking how to determine what
error to catch with the rescue statement, do this: remove the rescue
statement from your code and try to read a file that RMagick doesn't
recognize, and then examine the error message. The error message will
give the name of the error. For instance, if you run this program:

puts x

the error message is:

"undefined local variable or method `x' for main:Object (NameError)"

To handle that error, you can write:

begin
puts x
rescue NameError
puts "caught the error"
end

I'll look more into that. It seems a bad filetype causes several errors,
so I'm having to go through and take care of each of them.
 
7

7stud --

Suppose your directory has 2,000 files in it.

The way I see it, in order for your code to determine whether a file's
type is already a directory, your code has to go searching through a
directory with 2,000 files in it. And your code does that for each
filename. Also, if your directory has 10,000 files instead of 2,000
files, then the search will take five times as long.

The code I posted only has to search through the directory three times
to set things up, i.e when it creates the directories. After that, the
code loops over the file names and does a lookup in a hash to determine
if there's a directory for the file's file type, and hash lookups are
extremely fast. Which would you rather do: search a directory
containing 2,000 files 2,000 times; or search a directory containing
2,000 files 3 times, and then do extremely fast hash lookups 2,000
times? For the purposes of the comparison, you can assume hash lookups
take 0 time.
 
F

forgottenwizard

Suppose your directory has 2,000 files in it.

The way I see it, in order for your code to determine whether a file's
type is already a directory, your code has to go searching through a
directory with 2,000 files in it. And your code does that for each
filename. Also, if your directory has 10,000 files instead of 2,000
files, then the search will take five times as long.

The code I posted only has to search through the directory three times
to set things up, i.e when it creates the directories. After that, the
code loops over the file names and does a lookup in a hash to determine
if there's a directory for the file's file type, and hash lookups are
extremely fast. Which would you rather do: search a directory
containing 2,000 files 2,000 times; or search a directory containing
2,000 files 3 times, and then do extremely fast hash lookups 2,000
times? For the purposes of the comparison, you can assume hash lookups
take 0 time.

Alright, I see what you are saying. I'll look over your code again,
since I obviously missed something.

Thanks for the explinasion :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,189
Members
46,735
Latest member
HikmatRamazanov

Latest Threads

Top