Expanding Search to Subfolders

PipedreamerGrey · Jun 5, 2006

This is the beginning of a script that I wrote to open all the text
files in a single directory, then process the data in the text files
line by line into a single index file.

os.chdir("C:\\Python23\\programs\\filetree")
mydir = glob.glob("*.txt")

index = open("index.rtf", 'w')

for File in mydir:
    count = 1
    file = open(File)
    fileContent = file.readlines()
    for line in fileContent:
        if not line.startswith("\n"):
            if count == 1:

I'm now trying to get the program to process all the text files in
subdirectories, so that I don't have to run the script more than once.
I know that the following script will SHOW me the contents of the
subdirectories, but I can't integrate the two:

def print_tree(tree_root_dir):
    def printall(junk, dirpath, namelist):
        for name in namelist:
            print os.path.join(dirpath, name)
    os.path.walk(tree_root_dir, printall, None)

print_tree("C:\\Python23\\programs\\filetree")

I've taught myself from online tutorials, so I think that this is a
matter of a command that I haven't learned rather than a matter of logic.
Could someone tell me where to learn more about directory processes or
show me an improved version of my first script snippet?

Thanks

Lou Losee · Jun 5, 2006

How about something like:

import os, stat

class DirectoryWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                else:
                    return fullname

for file, st in DirectoryWalker("."):
    your function here

not tested

Lou

Grant Edwards · Jun 5, 2006

Just in case you really are trying to accomplish something
other than learn Python, there are far easier ways to do these
tasks:

> This is the beginning of a script that I wrote to open all the
> text files in a single directory, then process the data in the
> text files line by line into a single index file.

#!/bin/bash
cat *.txt >outputfile

> I'm now trying to get the program to process all the text files in
> subdirectories, so that I don't have to run the script more
> than once.

#!/bin/bash
cat `find . -name '*.txt'` >outputfile

BartlebyScrivener · Jun 6, 2006

Well, yes, but if he's kicking things off with:

I'm guessing he's not on Linux. Maybe you're trying to convert him?

rd

BartlebyScrivener · Jun 6, 2006

> Could someone tell me where to learn more about directory processes

Use os.walk

http://docs.python.org/lib/os-file-dir.html

It takes a little reading to get it if you are a beginner, but there
are zillions of examples if you just search this Google Group on
"os.walk"

http://tinyurl.com/kr3m6
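
For anyone reading later, here is a minimal sketch of what the os.walk approach can look like. It assumes Python 3 (the code elsewhere in this thread is Python 2), and the name find_text_files and its extension parameter are made up for illustration. Only os.walk, os.path.splitext and os.path.join come from the standard library.

```python
import os


def find_text_files(root, extension=".txt"):
    """Return the paths of all files under root whose extension matches."""
    matches = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory,
    # descending into subdirectories on its own.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() == extension:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    for path in find_text_files("."):
        print(path)
```

The extension check is case-insensitive here, which is a choice of this sketch rather than anything the thread asks for.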

Good luck

rd

"I don't have any solution, but I certainly admire the
problem."--Ashleigh Brilliant

PipedreamerGrey · Jun 6, 2006

Thanks, that was a big help. It worked fine once I removed

os.chdir("C:\\Python23\\programs\\Magazine\\SamplesE")

and changed "for file, st in DirectoryWalker("."):"
to
"for file in DirectoryWalker("."):" (removing the "st")

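A likely reason that removing "st" helps: the walker returns a single path string for each item, so "for file, st in ..." tries to unpack that string into two names. A small sketch of the effect, assuming Python 3; the helper unpack_pair is hypothetical and exists only to show the behaviour.

```python
def unpack_pair(value):
    """Try to unpack value into two names, the way `for file, st in ...` does."""
    try:
        first, second = value
    except ValueError as exc:
        return "ValueError: %s" % exc
    return (first, second)


# A path string has more than two characters, so the unpacking fails.
print(unpack_pair("./notes/todo.txt"))

# A two-item tuple, such as a (path, stat result) pair, unpacks cleanly.
print(unpack_pair(("./notes/todo.txt", None)))
```

So the loop needs either a walker that returns pairs or a single loop variable, which is the change described above.
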
PipedreamerGrey · Jun 6, 2006

Thanks everyone!

PipedreamerGrey · Jun 6, 2006

Here's the final working script. It opens all of the text files in a
directory and its subdirectories and combines them into one Rich text
file (index.rtf):

#! /usr/bin/python
import glob
import fileinput
import os
import string
import sys

index = open("index.rtf", 'w')

class DirectoryWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # get a filename, eliminate directories from list
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                else:
                    return fullname

for file in DirectoryWalker("."):
    # divide file names into path and extension
    path, ext = os.path.splitext(file)
    # choose the extension you would like to see in the list
    if ext == ".txt":
        print file

        # print the contents of each file into the index
        file = open(file)
        fileContent = file.readlines()
        for line in fileContent:
            if not line.startswith("\n"):
                index.write(line)
                index.write("\n")

index.close()
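
For comparison, the same job can probably be done without a custom walker class. The following is a sketch under assumptions, not a drop-in replacement: it targets Python 3, the function name build_index and its parameters are invented here, and it only adds a newline when a file's last line lacks one, so its output spacing may differ from the script above.

```python
import os


def build_index(root, index_path, extension=".txt"):
    """Copy the non-empty lines of every matching file under root into index_path.

    Returns the number of files that were read.
    """
    index_abs = os.path.abspath(index_path)
    count = 0
    with open(index_path, "w") as index:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit subdirectories in a predictable order
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                if os.path.splitext(name)[1] != extension:
                    continue
                if os.path.abspath(full) == index_abs:
                    continue  # never copy the index file into itself
                with open(full) as source:
                    for line in source:
                        if not line.startswith("\n"):
                            # keep exactly one newline per copied line
                            if not line.endswith("\n"):
                                line += "\n"
                            index.write(line)
                count += 1
    return count


if __name__ == "__main__":
    print(build_index(".", "index.rtf"))
```

Using with blocks closes each input file as soon as it has been read, which the original script leaves to the interpreter.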

Fredrik Lundh · Jun 6, 2006

Lou said:

> How about something like:
>
> import os, stat
>
> class DirectoryWalker:
>     # a forward iterator that traverses a directory tree, and
>     # returns the filename
>     ...
>
> not tested

speak for yourself ;-)

(the code is taken from http://effbot.org/librarybook/os-path.htm )

</F>

Lou Losee · Jun 6, 2006

> speak for yourself ;-)
>
> (the code is taken from http://effbot.org/librarybook/os-path.htm )
>
> </F>

Well, that is good to know. I know I did not get it from there, but I
had it in my snippets directory.

Lou
