Robin Siebler
I have two directory trees that I want to compare and I'm trying to
figure out what the best way of doing this would be. I am using walk
to get a list of all of the files in each directory.
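For reference, gathering the file list with walk might look roughly like the sketch below. It assumes os.walk is the "walk" meant here; the helper name list_files is hypothetical, and it returns paths relative to the root so that the two trees can be compared directly.

```python
import os

def list_files(root):
    """Return a sorted list of file paths under root, relative to root.

    list_files is a hypothetical helper name, not taken from the post.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # store the path relative to root so both trees use the same keys
            found.append(os.path.relpath(full, root))
    return sorted(found)
```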
I am using this code to compare the file lists:
def compare_files(first_list, second_list, first_dir, second_dir):
    missing = in_first_only(first_list, second_list)
    for item in missing:
        index = first_list.index(item)
        print(first_list[index] + ' does not exist in ' + second_dir[index])
        # drop the unmatched file and its directory entry together
        first_list.pop(index)
        first_dir.pop(index)
    return first_list, second_list, first_dir, second_dir
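The in_first_only helper is not shown here. A minimal sketch of what such a helper could look like, assuming it returns the items of the first list that are absent from the second, in their original order:

```python
def in_first_only(first_list, second_list):
    """Return the items of first_list that do not appear in second_list.

    Assumed behaviour; the original helper is not shown in the post.
    """
    second = set(second_list)  # set membership tests are O(1) on average
    return [item for item in first_list if item not in second]
```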
However, before I actually compare the files, I want to compare the
directories, and if a directory is missing from either set, I want to
report it:
dir_list_a = ['d:\\results\\foldera\\','d:\\results\\folderb\\','d:\\results\\folderc\\']
dir_list_b = ['c:\\results\\foldera\\','c:\\results\\folderb\\']
output:
'folderc' exists in d:\results but not in c:\results
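The report described above could be sketched as a set comparison on the final folder name of each path. This is only a sketch under assumptions: it uses ntpath (the Windows flavour of os.path) so that it behaves the same on any platform, it assumes every path in a list sits directly under the same root, and the names folder_names and report_missing_dirs are hypothetical.

```python
import ntpath

def folder_names(dir_list):
    """Return the set of last path components found in dir_list."""
    # rstrip removes the trailing separator so basename is not empty
    return set(ntpath.basename(p.rstrip('\\/')) for p in dir_list)

def report_missing_dirs(dir_list_a, root_a, dir_list_b, root_b):
    """Return one message per folder that exists under only one root."""
    names_a = folder_names(dir_list_a)
    names_b = folder_names(dir_list_b)
    messages = []
    for name in sorted(names_a - names_b):
        messages.append("'%s' exists in %s but not in %s" % (name, root_a, root_b))
    for name in sorted(names_b - names_a):
        messages.append("'%s' exists in %s but not in %s" % (name, root_b, root_a))
    return messages

dir_list_a = ['d:\\results\\foldera\\', 'd:\\results\\folderb\\', 'd:\\results\\folderc\\']
dir_list_b = ['c:\\results\\foldera\\', 'c:\\results\\folderb\\']
for line in report_missing_dirs(dir_list_a, 'd:\\results', dir_list_b, 'c:\\results'):
    print(line)
```

Because the comparison is done on sets, the order in which walk returns the directories does not matter.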
I am using splitall (from the Python Cookbook) to split the paths into
their parts and appending the result to a list, but I can't figure out the
best way to compare the contents of the resulting two lists, and I think
I am starting to make things *too* complicated:
import os

def splitall(path):
    """
    Source: Python Cookbook
    Credit: Trent Mick
    Split a path into all of its parts.
    """
    allparts = []
    while 1:
        parts = os.path.split(path)
        if parts[0] == path:      # sentinel for absolute paths
            allparts.insert(0, parts[0])
            break
        elif parts[1] == path:    # sentinel for relative paths
            allparts.insert(0, parts[1])
            break
        else:
            path = parts[0]
            allparts.insert(0, parts[1])
    return allparts
After using this, I end up with this:
dir_list_a = [['d:\\', 'results', 'foldera', 'd:\\', 'results',
               'folderb', 'd:\\', 'results', 'folderc']]
dir_list_b = [['d:\\', 'results', 'foldera', 'd:\\', 'results', 'folderb']]
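In the result above, the parts of every path end up merged into one inner list, which makes the two lists hard to compare. One possible simplification, offered as a sketch rather than a definitive answer, is to skip the splitting and compare each directory's path relative to its root. ntpath.relpath and ntpath.normcase are standard library calls; the helper name relative_dirs and the sample paths are hypothetical.

```python
import ntpath

def relative_dirs(dir_list, root):
    """Return the set of directory paths in dir_list, expressed relative to root."""
    # normcase lower-cases the result, since Windows paths are case-insensitive
    return set(ntpath.normcase(ntpath.relpath(p, root)) for p in dir_list)

# hypothetical sample data, including one nested directory
rel_a = relative_dirs(['d:\\results\\foldera\\', 'd:\\results\\folderc\\sub\\'],
                      'd:\\results')
rel_b = relative_dirs(['c:\\results\\foldera\\'], 'c:\\results')

only_in_a = sorted(rel_a - rel_b)  # directories missing from the second tree
only_in_b = sorted(rel_b - rel_a)  # directories missing from the first tree
```

Unlike a comparison on the last folder name alone, this keeps nested directories distinct, so two folders called sub under different parents are not confused.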