simple file flow question with csv.reader

Matt · Nov 2, 2011

Hi All,

I am trying to do a really simple file operation, yet, it befuddles me...

I have a few hundred .csv files, and for each file I want to manipulate the data, then save it back to the original file. The code below opens the files and does the proper manipulations, but I can't seem to save the files afterwards.

How can I save the files? Or do I need to try something else, maybe with split, join, etc.?

    import os
    import csv
    for filename in os.listdir("/home/matthew/Desktop/pero.ngs/blast"):
        with open(filename, 'rw') as f:
            reader = csv.reader(f)
            for row in reader:
                print ">",row[0],row[4],"\n",row[1], "\n", ">", row[2], "\n", row[3]

Thanks in advance, Matt

Tim Chase · Nov 2, 2011

Your last line just prints the data to standard-out. You can
either pipe the output to a file:

    python myprog.py > output.txt

or you can write them to a single output file:

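    out = open('output.txt', 'w')
    for filename in os.listdir(...):
        # 'rw' is not a valid mode; plain 'r' is enough for reading
        with open(filename, 'r') as f:
            reader = csv.reader(f)
            for row in reader:
                # five placeholders for five fields, newline after each record
                out.write(">%s %s\n%s\n>%s\n%s\n" % (
                    row[0], row[4], row[1], row[2], row[3]))
    out.close()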

or you can write them to output files on a per-input basis:

    for filename in os.listdir(SOURCE_LOC):
        # os.listdir() returns bare names, so join them to the directories
        inname = os.path.join(SOURCE_LOC, filename)
        outname = os.path.join(DEST_LOC, filename)
        with open(inname, 'r') as f:
            with open(outname, 'w') as out:
                reader = csv.reader(f)
                for row in reader:
                    out.write(">%s %s\n%s\n>%s\n%s\n" % (
                        row[0], row[4], row[1], row[2], row[3]))

-tkc

Dennis Lee Bieber · Nov 2, 2011

Option 1: Read the file completely into memory (your example is
reading line by line); close the reader and its file; reopen the file
for "wb" (delete, create new); open CSV writer on that file; write the
memory contents.

Option 2: Open a temporary file "wb"; open a CSV writer on the file;
for each line from the reader, update the data, send to the writer; at
end of reader, close reader and file; delete original file; rename
temporary file to the original name.
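
A minimal sketch of Option 1, assuming Python 3 (the code earlier in the thread is Python 2). The names rewrite_in_memory and transform are made up for illustration; transform stands for whatever per-row manipulation is wanted.

```python
import csv


def rewrite_in_memory(path, transform):
    """Option 1: load every row into memory, then rewrite the same file."""
    # newline='' is what the Python 3 csv docs recommend for csv files
    with open(path, newline='') as f:
        rows = [transform(row) for row in csv.reader(f)]

    # Opening with 'w' truncates the original before anything is written,
    # so a crash during this step can lose data.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Something like rewrite_in_memory(path, lambda row: row[::-1]) would reverse the columns of every row in place. This only suits files that fit comfortably in memory.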

Terry Reedy · Nov 2, 2011

Saving back to the original file in place is dangerous. Better to replace
the file with a new one of the same name.

> Option 1: Read the file completely into memory (your example is
> reading line by line); close the reader and its file; reopen the
> file for "wb" (delete, create new); open CSV writer on that file;
> write the memory contents.

...and lose data if your system crashes or freezes during the write.

> Option 2: Open a temporary file "wb"; open a CSV writer on the file;
> for each line from the reader, update the data, send to the writer;
> at end of reader, close reader and file; delete original file;
> rename temporary file to the original name.

This works best if the new file is given a name related to the original
name, in case the rename fails. An alternative is to rename the original
x to x.bak, write or rename the new file, then delete the .bak file.
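
A rough sketch of the temporary-file approach with a related name, assuming Python 3.3 or later for os.replace. The function name rewrite_via_temp, the transform callback and the ".tmp" suffix are assumptions for illustration, not anything from the posts above.

```python
import csv
import os


def rewrite_via_temp(path, transform):
    """Option 2: write a sibling temp file, then swap it into place."""
    # A name derived from the original, so a leftover file is easy to
    # match to its source if the final rename fails.
    tmp_path = path + '.tmp'

    with open(path, newline='') as src, \
            open(tmp_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            writer.writerow(transform(row))

    # os.replace overwrites the destination in one call; the Python docs
    # describe the rename as atomic on POSIX when it succeeds.
    os.replace(tmp_path, path)
```

The original is untouched until the new file is completely written, so a crash mid-write leaves at worst a stray .tmp file next to it.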

Jon Clements · Nov 3, 2011

To the OP, I agree with Terry, but will add my 2p.

What is this meant to achieve?

    print ">",row[0],row[4],"\n",row[1], "\n", ">", row[2], "\n", row[3]

For a row whose fields are 0 to 4, that prints roughly:

    > 0 4
    1
    > 2
    3

Is something meant to read this afterwards?

I'd personally create a subdir called db, create a sqlite3 db in it, then
load all the required fields into it (with a column for the filename).
The load will either work or fail; if it succeeds, start overwriting
the originals. A plain "select * from some_table" will do, using
itertools.groupby on the filename column to switch the open() call
whenever the filename changes.
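
One way the sqlite3 idea might look, as a sketch only and assuming Python 3. The table name rows, the column names f0 to f4, and the functions stage_rows and write_back are invented for illustration; the output format mirrors the print line quoted above. Note that groupby only groups adjacent rows, so the query orders by filename.

```python
import csv
import itertools
import sqlite3


def stage_rows(db_path, csv_paths):
    """Load fields 0-4 of every row into one table, keyed by filename."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rows "
        "(filename TEXT, f0 TEXT, f1 TEXT, f2 TEXT, f3 TEXT, f4 TEXT)")
    with conn:  # commits if the whole load succeeds, rolls back on error
        for path in csv_paths:
            with open(path, newline='') as f:
                conn.executemany(
                    "INSERT INTO rows VALUES (?, ?, ?, ?, ?, ?)",
                    ([path] + row[:5] for row in csv.reader(f)))
    return conn


def write_back(conn):
    """Overwrite each original from the staged rows, one group per file."""
    cur = conn.execute(
        "SELECT filename, f0, f1, f2, f3, f4 FROM rows "
        "ORDER BY filename, rowid")
    for filename, group in itertools.groupby(cur, key=lambda r: r[0]):
        with open(filename, 'w') as out:
            for _, f0, f1, f2, f3, f4 in group:
                out.write(">%s %s\n%s\n>%s\n%s\n" % (f0, f4, f1, f2, f3))
```

A row with fewer than five fields makes the insert raise, which is the "either work or fail" step: nothing gets overwritten unless every file loaded cleanly.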

just my 2p mind you,

Jon.
