Style/Golf Question -- path simplification function.

michael.dehaan

Recently at work I ran into a function that I felt like making more
Ruby-like (for fun). Not having delved into Ruby land much in the
last six months (Python was paying the bills), I'm at a bit of a loss as
to whether there is a cleaner way to do this. Any pointers on style
here?

The task is this: given a list of paths, remove any file or
subdirectory from the list if one of its parent directories is
already in the list. Using "break" to get a value out of the inner
block felt a bit sloppy, but "return" inside the block returns from the
enclosing method, so it wasn't working well. I am mainly interested in
functional solutions.

Here's my shot:

def wash(files)
  return files.reject do |name|
    tokens = name.split("/")
    rc = 2.upto(tokens.length()-1) do |idx|
      break if files.include?(tokens[0,idx].join("/"))
    end
    rc.nil?
  end
end

files = [
  "/tmp/bar",
  "/blah/zorg/2",
  "/tmp/bar/baz",
  "/foo",
  "/blah/zorg/2/3",
  "/blah/zorg",
  "/zorg/blah",
  "/blah/zorg/1",
]
expected = [
  "/tmp/bar",
  "/foo",
  "/blah/zorg",
  "/zorg/blah",
]
puts wash(files) == expected
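
For comparison, here is a minimal sketch of one way to get rid of the "break": build the list of ancestor paths first, then ask any? whether one of them is listed. The helper name ancestors_of and the use of Set are assumptions added for illustration, not part of the original post, and like the code above it assumes absolute paths.

```ruby
require "set"

# Hypothetical helper (not from the original post): every proper ancestor
# of an absolute path, e.g. "/tmp/bar/baz" -> ["/tmp", "/tmp/bar"].
def ancestors_of(path)
  tokens = path.split("/")
  (2...tokens.length).map { |n| tokens.first(n).join("/") }
end

# Same contract as the wash above: drop a path when an ancestor is listed.
def wash(files)
  listed = Set.new(files) # hash-based membership checks
  files.reject do |name|
    ancestors_of(name).any? { |dir| listed.include?(dir) }
  end
end

p wash(["/tmp/bar", "/tmp/bar/baz", "/foo"]) # prints ["/tmp/bar", "/foo"]
```

Because any? stops at the first match, it plays the role of the "break" without needing a sentinel value, and the input order is preserved.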
 
Martin DeMello

# taking advantage of the fact that a.subdir?(b) => a.prefix?(b)

files = [
  "/tmp/bar",
  "/blah/zorg/2",
  "/tmp/bar/baz",
  "/foo",
  "/blah/zorg/2/3",
  "/blah/zorg",
  "/zorg/blah",
  "/blah/zorg/1",
]

class String
  def prefix?(other)
    self == other[0...self.length]
  end
end

def wash(files)
  first, *rest = files.sort
  rest.inject([first]) {|a, e| a.last.prefix?(e) ? a : (a << e)}
end

p wash(files)

# this is O(n log n) because of the sort - if you have tens of thousands
# of files you can get O(n) time using a trie

martin
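
The trie idea mentioned above could look roughly like the sketch below. This is an illustration under assumptions, not Martin's code: the PathNode struct and its field names are made up here, and the running time is linear in the total number of path components rather than strictly O(n) in the number of files. Since it compares whole components, a listed "/foo" does not hide "/foobar", which a plain string-prefix test would do.

```ruby
# Hypothetical trie node: children keyed by path component, plus a flag
# marking that a listed path ends at this node.
PathNode = Struct.new(:children, :listed)

def wash(files)
  root = PathNode.new({}, false)

  # Pass 1: insert every path, one node per component.
  files.each do |path|
    node = root
    path.split("/").each do |part|
      node = (node.children[part] ||= PathNode.new({}, false))
    end
    node.listed = true
  end

  # Pass 2: walk each path again and reject it as soon as a proper
  # ancestor (every component except the last) turns out to be listed.
  files.reject do |path|
    node = root
    path.split("/")[0...-1].any? do |part|
      node = node.children[part]
      node.listed
    end
  end
end
```

Unlike the sort-based version, this keeps the surviving paths in their original order.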
 
Ross Bamford

Maybe not what you're after, but I would probably go with something like
this:

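def wash(files)
  kept = files.dup
  # Iterate over the untouched original: calling reject! on the array
  # being iterated can skip entries, so a parent may never get checked.
  files.each do |fn|
    kept.reject! { |cmpfn| File.fnmatch(fn + '/*', cmpfn) }
  end
  kept
end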

Or maybe this (fnmatch times a tiny bit quicker, though):

def wash(files)
  kept = files.dup
  files.each do |fn|
    # \A anchors the match so fn must be a leading parent path
    rx = /\A#{Regexp.escape(fn)}\/./
    kept.reject! { |cmpfn| rx =~ cmpfn }
  end
  kept
end
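
A small sketch of the File.fnmatch behaviour the first variant leans on, in case it helps: without File::FNM_PATHNAME the "*" wildcard also matches "/", so a single "dir/*" pattern covers children at any depth. The flip side is that a directory name containing glob characters is itself read as a pattern. The escape_glob helper is hypothetical, added here only to illustrate one possible workaround.

```ruby
# "*" matches across "/" unless File::FNM_PATHNAME is given.
deep = File.fnmatch("/tmp/bar/*", "/tmp/bar/baz/qux")                      # true
flat = File.fnmatch("/tmp/bar/*", "/tmp/bar/baz/qux", File::FNM_PATHNAME)  # false

# "[ab]" is read as a character class, so the literal directory name
# "/tmp/[ab]" does not match itself.
literal = File.fnmatch("/tmp/[ab]/*", "/tmp/[ab]/x")                       # false

# Hypothetical helper: backslash-escape glob metacharacters so a path is
# matched literally.
def escape_glob(path)
  path.gsub(/[\\*?\[\]{}]/) { |c| "\\" + c }
end

escaped = File.fnmatch(escape_glob("/tmp/[ab]") + "/*", "/tmp/[ab]/x")     # true

p [deep, flat, literal, escaped]
```

The regexp variant sidesteps the metacharacter issue because Regexp.escape already quotes the path.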
 
William James

class Array
  def wash
    inject([]){ |ary,path1|
      # each returns self (truthy) unless the break fires, which yields nil;
      # \A anchors at the start of the string rather than of any line
      self.each{ |path2|
        break if path1 =~ /\A#{Regexp.escape(path2)}\//
      } ? ary << path1 : ary
    }
  end
end

files =
  "/tmp/bar",
  "/blah/zorg/2",
  "/tmp/bar/baz",
  "/foo",
  "/blah/zorg/2/3",
  "/blah/zorg",
  "/zorg/blah",
  "/blah/zorg/1"

expected =
  "/tmp/bar",
  "/foo",
  "/blah/zorg",
  "/zorg/blah"

p files.wash == expected
 
