CSV reader and unique ids

Mike P

Hi All,

I'm trying to use the csv module to read in some data and then use a
hash-based method (as there are millions of records) to find the unique
IDs and write them out to another file.

Can anyone advise? Below is the code so far:


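import csv

fin = open(CSV_INPUT, "rb")
fout = open(CSV_OUTPUT, "wb")
reader = csv.reader(fin, delimiter=chr(254))
writer = csv.writer(fout)

headerList = reader.next()  # consume the header row
UID = {}

#For help
#print headerList
# ['Time', 'User-ID', 'IP']

try:
    for row in reader:
        # A row is a list and cannot be a dict key, so key on the
        # User-ID column (index 1, going by the header above).
        UID[row[1]] = 1
    # writerows expects a sequence of rows, so wrap each id in a list.
    writer.writerows([uid] for uid in UID.keys())
finally:
    fin.close()
    fout.close()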

Mike
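
For readers on Python 3, the de-duplication described in this post can be sketched roughly as follows. This is only an illustration, not Mike's code: the function name `write_unique_ids`, its parameters, and the choice of `latin-1` as the file encoding (so that `chr(254)` maps back to the single byte 0xFE used as the delimiter) are all assumptions. A set is used in place of the dict, since only membership matters.

```python
import csv


def write_unique_ids(in_path, out_path, id_column=1, delimiter=chr(254)):
    """Write each distinct value of id_column to out_path, in first-seen order.

    Hypothetical helper: the name, parameters and latin-1 encoding are
    assumptions made for this sketch, not something given in the thread.
    """
    seen = set()  # hash-based membership test, O(1) on average
    with open(in_path, newline="", encoding="latin-1") as fin, \
            open(out_path, "w", newline="", encoding="latin-1") as fout:
        reader = csv.reader(fin, delimiter=delimiter)
        writer = csv.writer(fout)
        next(reader, None)  # skip the header row, if there is one
        for row in reader:
            if len(row) <= id_column:
                continue  # ignore short or blank rows
            uid = row[id_column]
            if uid not in seen:
                seen.add(uid)
                writer.writerow([uid])  # one id per output row
    return len(seen)
```

Calling `write_unique_ids(CSV_INPUT, CSV_OUTPUT)` streams the input one row at a time, so memory use grows with the number of distinct ids rather than with the number of records.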
 
Tim Golden

Mike said:
I'm trying to use the CSV module to read in some data and then use a
hashable method (as there are millions of records) to find unique ids
and push these out to another file,

You could either zip with a counter or use the uuid module,
depending on just how unique you want your ids to be.

<code>
import os, sys
import csv
import itertools
import uuid

stuff = "the quick brown fox jumps over the lazy dog".split ()

f = open ("output.csv", "wb")
writer = csv.writer (f)

#
# Style 1 - numeric counter
#
writer.writerows (zip (itertools.count (), stuff))

#
# Style 2 - uuid
#
writer.writerows ((uuid.uuid1 (), s) for s in stuff)

f.close ()
os.startfile ("output.csv")  # Windows only: opens the file in its associated application

</code>

TJG
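
The two styles above are written for Python 2 on Windows. A rough Python 3 equivalent might look like the sketch below. The helper names `rows_with_counter` and `rows_with_uuid` are made up for illustration, and the rows go to an in-memory buffer rather than to "output.csv" so the example runs anywhere.

```python
import csv
import io
import itertools
import uuid


def rows_with_counter(items, start=0):
    """Style 1: pair each item with a running integer id (hypothetical helper)."""
    return zip(itertools.count(start), items)


def rows_with_uuid(items):
    """Style 2: pair each item with a time-based UUID (hypothetical helper)."""
    return ((str(uuid.uuid1()), item) for item in items)


stuff = "the quick brown fox jumps over the lazy dog".split()

# An in-memory buffer stands in for the output file in this sketch.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerows(rows_with_counter(stuff))
writer.writerows(rows_with_uuid(stuff))
```

To write to disk instead, replace the buffer with `open("output.csv", "w", newline="")`. The counter ids are unique only within a single run, while `uuid.uuid1()` values are built from the host id and a timestamp, so they remain distinct across runs and machines.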
 
python

Does anyone have any benchmarks on the difference in performance between
the 32-bit and 64-bit versions of Python for specific categories of
operation, e.g. math, file, string, etc.?

My question is OS-neutral, so feel free to share your experience with
either Windows or Linux.

Thank you,
Malcolm
 
