Iteration through File.file? misses entries for whichFile.file?(entry) == true

K

Kyle Barbour

Hello everyone,

I'm new to Ruby, so I'm most likely making an elementary mistake.
However, searching Google didn't help me find my answer, and after
mucking around with irb I still haven't figured it out.

I wrote a method that was intended to take all of the files in a given
directory and put them into an array. Here's the code I wrote:

def getFiles(dir)
pwdFiles = Array.new

Dir.foreach(dir) do |entry|
pwdFiles.push(entry) if File.file?(entry) == true
end
end

This works fine when called on the working directory --- that is,
getFiles(Dir.pwd). However, when I call it on a subdirectory --- for
example, getFiles("doc") --- it only returns one file. I tested to
make sure that the files being ignored in the subdirectory actually
gave "true" when tested with File.file?(entry), which they do.

Anyone know what I'm doing wrong?

Kyle
 
B

Brian Candler

Kyle said:
Hello everyone,

I'm new to Ruby, so I'm most likely making an elementary mistake.
However, searching Google didn't help me find my answer, and after
mucking around with irb I still haven't figured it out.

I wrote a method that was intended to take all of the files in a given
directory and put them into an array. Here's the code I wrote:

def getFiles(dir)
pwdFiles = Array.new

Dir.foreach(dir) do |entry|
pwdFiles.push(entry) if File.file?(entry) == true
end
end

This works fine when called on the working directory --- that is,
getFiles(Dir.pwd). However, when I call it on a subdirectory --- for
example, getFiles("doc") --- it only returns one file. I tested to
make sure that the files being ignored in the subdirectory actually
gave "true" when tested with File.file?(entry), which they do.

Anyone know what I'm doing wrong?

I'm guessing you need to add this line to the end of your method:

return pwdFiles

or just

pwdFiles

since the return value of a method is the value of the last expression
evaluated.

As an alternative, Dir["etc/*"] builds an Array directly. So you could
have:

def get_files(dir)
Dir["#{dir}/*"].select { |x| File.file?(x) }
end

(Aside: getFiles works fine, but get_files is a more conventional Ruby
method name. Use CamelCase for constants/classes)
 
B

Brian Candler

Oh, and another hint: try things out in irb to find out what's
happening.

/$ irb --simple-prompt
 
J

James Harrison

=20
The symbolic constant for truth is True, not true.

In either case, because File.file? returns either True or False, you can =
drop the comparison:



If this is truly a method in an object, though, bear in mind scoping =
issues. pwdFiles is only available inside this method definition. When =
you declare it in your intialize statement, prepend with a =
scope-changing symbol. The most common in this case is @

def initialize
#will contain all file entries in the directory
@pwdFiles =3D []
end

def get_files(dir)
@pwdFiles =3D Array.new
=20
Dir.foreach(dir) do |entry|
@pwdFiles.push(entry) if File.file?(entry) =3D=3D true
end
end



And pay attention to the advice about camelCase versus snake_case for =
method names: because Ruby's duck typed, it's helped me out sometimes to =
be able to glance at my code and tell what is a variable and what isn't =
very quickly.

Best

James=
 
B

Bob Smith

pwdFiles.push(entry) if File.file?(entry) == true

While not an answer to your question, I thought I'd comment on the line
above -- you don't need to add the "== true".

pwdFiles.push(entry) if File.file?(entry)


A little more:

Now, this is just me, but I also try to avoid writing negative logic...
if I wanted only the items where File.file?(entry) was false, I would
write:

pwdFiles.push(entry) unless File.file?(entry)

instead of

pwdFiles.push(entry) if not File.file?(entry)
or
pwdFiles.push(entry) if File.file?(entry) == false



Please note: this is all just personal preference -- your code will work
just fine. I only mention it as you indicated you were new, and I
thought I'd share.
 
B

Brian Candler

James said:
The symbolic constant for truth is True, not true.

That is incorrect, as 5 seconds with irb will show you.

$ irb --simple-promptNameError: uninitialized constant True
from (irb):2

(You may have been thinking of TrueClass and FalseClass, perhaps?)
In either case, because File.file? returns either True or False, you can
drop the comparison:

It returns true or false, but yes the comparison can be removed.
If this is truly a method in an object, though, bear in mind scoping
issues. pwdFiles is only available inside this method definition. When
you declare it in your intialize statement, prepend with a
scope-changing symbol. The most common in this case is @

def initialize
#will contain all file entries in the directory
@pwdFiles = []
end

@ is not a "scope changing symbol". @pwdFiles is an instance variable of
the object, whereas pwdFiles is a local variable. They are completely
different concepts.

Since the OP wanted to return the value from the getFiles method, and
not set any instance variables in the object, then a local variable is
appropriate here.
 
J

James Harrison

If you're going to write an email and be wrong, be wrong on every point, =
that's what I was always taught and I'm sticking by it goddamit!

=20
That is incorrect, as 5 seconds with irb will show you.
=20
$ irb --simple-prompt
NameError: uninitialized constant True
from (irb):2
=20
(You may have been thinking of TrueClass and FalseClass, perhaps?)

You're 100% correct. Total brainfart.

=20
In either case, because File.file? returns either True or False, you = can=20
drop the comparison:
=20
It returns true or false, but yes the comparison can be removed.
=20
If this is truly a method in an object, though, bear in mind scoping=20=
issues. pwdFiles is only available inside this method definition. = When=20
you declare it in your intialize statement, prepend with a=20
scope-changing symbol. The most common in this case is @
=20
def initialize
#will contain all file entries in the directory
@pwdFiles =3D []
end
=20
@ is not a "scope changing symbol". @pwdFiles is an instance variable = of=20
the object, whereas pwdFiles is a local variable. They are completely=20=
different concepts.
=20
Since the OP wanted to return the value from the getFiles method, and=20=
not set any instance variables in the object, then a local variable is=20=
appropriate here.



In future, I'll have to stop myself whenever the thought "that mightn't =
be the right word" crops through my head. I figured it'd be close enough =
to get the idea across, but given that a symbol is actually a language =
construct I should've taken a few more seconds out.


In either case, you've hit something that interests me. Conceptually, =
I'd understood that the difference between an @-prepended variable and a =
non-@-prepended variable is that the former is global to the object, so =
it's not impacted by the change in scope introduced by defining a method =
inside the object. So I figured that describing the @ character as =
changing the internal scope of a variable from local to global inside =
the object wouldn't be entirely missing the mark. But it's possible I'm =
missing something.


Thanks for the clarifications and corrections, dude.=20

Best

James
 
M

Marvin Gülker

Kyle said:
def getFiles(dir)
pwdFiles = Array.new

Dir.foreach(dir) do |entry|
pwdFiles.push(entry) if File.file?(entry) == true
end
end

That can't work since Dir.foreach yields only the filenames of the files
in the directory to the block, not the directory the files are actually
in. So suppose you have:

dir/my_file
dir/my_second_file

By calling Dir.foreach("dir") you'll get "my_file" and "my_second_file".
Your File.file? statement then checks for the presence of the filenames
in the current working directory -- where they don't reside. Solution:
Append "dir" to the filename.

def get_files(dir)
pwd_files = Array.new

Dir.foreach(dir) do |entry|
path = File.join(dir, entry)
pwd_files.push(path) if File.file?(path)
end
pwd_files
end

And please use snake_case for your method and variable names.

Marvin
 
B

Brian Candler

James said:
In either case, you've hit something that interests me. Conceptually,
I'd understood that the difference between an @-prepended variable and a
non-@-prepended variable is that the former is global to the object, so
it's not impacted by the change in scope introduced by defining a method
inside the object. So I figured that describing the @ character as
changing the internal scope of a variable from local to global inside
the object wouldn't be entirely missing the mark. But it's possible I'm
missing something.

They are different in both scope and lifetime.

Instance variables (@foo) are a property of the object instance itself,
and once set, they remain for as long as the object exists, or until
reassigned. Calls to methods of the same object will see the same value
of @foo - even multiple concurrent calls in different threads.

Local variables are created in an activation record which is created
when the method is invoked, and so the values are not tied to the object
itself. If there are multiple calls to the method from multiple threads,
they will have their own activation records, and so independent local
variables. Normally the activation record is garbage-collected when the
method returns.

Now, I'm not sure I should be complicating things here, but activation
records may have extended lifetimes too, when closures are involved.

class Foo
def initialize(n)
@n = n
end
def make_adder(y)
lambda { |x| y + x }
end
end

f = Foo.new(1) # #<Foo @n=1>

a1 = f.make_adder(10)
a2 = f.make_adder(20)
a1.call(3) # 13
a1.call(4) # 14
a2.call(5) # 25

y is a local variable in make_adder, but it persists in the lambdas,
even after make_adder has returned. The two lambdas, a1 and a2, have
different values bound to y. But there is still only a single object of
class Foo.

My apologies if this has confused matters further :) But hopefully it
highlights how different these things are.

Regards,

Brian.
 
K

Kyle Barbour

Thanks for all the responses! I feel so thoroughly helped :)

By calling Dir.foreach("dir") you'll get "my_file" and "my_second_file".
Your File.file? statement then checks for the presence of the filenames
in the current working directory -- where they don't reside. Solution:
Append "dir" to the filename.

Martin - thanks for the diagnosis, that makes perfect sense. Awesome.
That works very well and solves my problem!

As an alternative, Dir["etc/*"] builds an Array directly. So you could
have:

def get_files(dir)
  Dir["#{dir}/*"].select { |x| File.file?(x) }
end

Interesting, thanks! Is there any reason to prefer either this
solution or the one Martin gave in the above post?

Also: everyone, thanks for the newbie friendliness and for the general
advice (snake_case, negative logic, etc.). Thoroughly appreciated.

Kyle
 
M

Marvin Gülker

Kyle said:
Interesting, thanks! Is there any reason to prefer either this
solution or the one Martin gave in the above post?

I'd prefer the Dir[] variant, since it's more compact. And if you want
to search for files via wildcards, it's the best way to go I think.

Oh, and one last note: My name is "Marvin", not "Martin". Note the v.
;-)

Vale,
Marvin
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,226
Members
46,815
Latest member
treekmostly22

Latest Threads

Top