Howto: extract a 'column' from a list of lists into a new list?

Greg Brunet

I'm writing some routines for handling dBASE files. I've got a table
(DBF file) object & field object already defined, and after opening the
file, I can get the field info like this:
[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30,
0), ('D-ACCRTL', 'C', 9, 0), ('D-ACCCST', 'C', 9, 0)]

What I would like to do is be able to extract the field names into a
single, separate list. It should look like:

['STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST']

but I'm not sure about how to do that. I can do this:
for g in tbl.Fields(): print g[0]
....
STOCKNO
DACC
DEALERACCE
D-ACCRTL
D-ACCCST

but I expect that one of those fancy map/lambda/list comprehension
functions can turn this into a list for me, but, to be honest, they
still make my head spin trying to figure them out. Any ideas on how to
do this simply?


Even better yet... the reason I'm trying to do this is to make it easy
to refer to a field by the field name as well as the field number. I
expect to be reading all of the data into a list (1 per row/record) of
row objects. If someone wants to be able to refer to FldA in record 53,
then I'd like them to be able to use: "row[52].FldA" instead of having
to use "row[52][4]" (if it's the 5th field in the row). I was planning
on using the __getattr__ method in my row object like the following:

#----------------------------------------
def __getattr__(self, key):
    """ Return by item name """
    ukey = key.upper()
    return self._data[tbl.FldNames.index(ukey)]

....where "tbl.FldNames" is the list of fieldnames that I'm trying to
build up above (and tbl is a property in the row pointing back to the
file object, since I don't want to make a copy of the fieldnames in
every row record). Is there a better (more efficient) way to go about
this? Thanks,
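
One way to avoid the linear `index()` search on every attribute access is to build a name-to-position dictionary once on the table and let every row share it. The following is only a minimal sketch in current Python 3; the `Table` and `Row` classes, their attribute names, and the sample record are hypothetical and not taken from the poster's code:

```python
class Table:
    """Hypothetical stand-in for the poster's DBF table object."""

    def __init__(self, fields):
        self.fields = fields
        # Field names in column order (the list asked for above).
        self.fld_names = [f[0] for f in fields]
        # Built once; every row reaches it through its table reference.
        self.fld_index = {name: i for i, name in enumerate(self.fld_names)}


class Row:
    """Hypothetical row: positional data plus access by field name."""

    def __init__(self, tbl, data):
        self._tbl = tbl
        self._data = data

    def __getitem__(self, pos):
        return self._data[pos]

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        if key.startswith('_'):
            raise AttributeError(key)
        try:
            return self._data[self._tbl.fld_index[key.upper()]]
        except KeyError:
            raise AttributeError(key) from None


fields = [('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30, 0)]
tbl = Table(fields)
row = Row(tbl, ['A1001', 'X5', 'Floor mats'])  # made-up sample record
print(row.stockno, row[2])
```

Field names that are not valid identifiers, such as D-ACCRTL, cannot be written as `row.D-ACCRTL`; with this sketch they would still need positional access or `getattr(row, 'D-ACCRTL')`.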
 
Egor Bolonev

Hello, Greg!
You wrote on Mon, 30 Jun 2003 19:44:20 -0500:

??>>>> tbl.Fields()
GB> [('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30,

[Sorry, skipped]

GB> but I expect that one of those fancy map/lambda/list comprehension
GB> functions can turn this into a list for me, but, to be honest, they
GB> still make my head spin trying to figure them out. Any ideas on how to
GB> do this simply?

=============================================
a=[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30,
0), ('D-ACCRTL', 'C', 9, 0), ('D-ACCCST', 'C', 9, 0)]

b=[x[0] for x in a] # :) Python is cool!

print b
=============================================

As far as I know, map/lambda/list comprehensions run very slowly, so you
should use them 'only' in SCRIPTS.

[Sorry, skipped]

With best regards, Egor Bolonev. E-mail: (e-mail address removed)
 
Greg Brunet

Thanks Egor!

It looks like I was closer than I thought - but still would have been
unlikely to figure it out!
 
Max M

Greg said:
but I'm not sure about how to do that. I can do this:
for g in tbl.Fields(): print g[0]

...
STOCKNO
DACC
DEALERACCE
D-ACCRTL
D-ACCCST

but I expect that one of those fancy map/lambda/list comprehension
functions can turn this into a list for me, but, to be honest, they
still make my head spin trying to figure them out. Any ideas on how to
do this simply?

fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0)
]

# The "old" way to do it would be:
NAME_COLUMN = 0
results = []
for field in fields:
    results.append(field[NAME_COLUMN])
print results

# But list comprehensions are made for exactly this purpose
NAME_COLUMN = 0
results = [field[NAME_COLUMN] for field in fields]
print results


regards Max M
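
Since the original question mentioned map and lambda as well, here is a short sketch of how those spellings compare with the list comprehension. It is written for current Python 3, where `print` is a function and `map` returns an iterator that has to be wrapped in `list()`; the variable names are illustrative only:

```python
from operator import itemgetter

fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0),
]
NAME_COLUMN = 0

# List comprehension, as in the post above.
names_lc = [field[NAME_COLUMN] for field in fields]

# map with a lambda: same result, more punctuation.
names_lambda = list(map(lambda field: field[NAME_COLUMN], fields))

# map with operator.itemgetter, which builds the "take item 0" function for you.
names_getter = list(map(itemgetter(NAME_COLUMN), fields))

print(names_lc)
```

All three produce the same list; which one reads best is largely a matter of taste.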
 
Max M

Greg Brunet wrote:

Even better yet... the reason I'm trying to do this is to make it easy
to refer to a field by the field name as well as the field number. I
expect to be reading all of the data into a list (1 per row/record) of
row objects. If someone wants to be able to refer to FldA in record 53,
then I'd like them to be able to use: "row[52].FldA" instead of having
to use "row[52][4]" (if it's the 5th field in the row). I was planning
on using the __getattr__ method in my row object like the following:


If that is all you want to do, this might be the simplest approach:

fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0)
]


def byName(field, name):
    fieldNames = {
        'name': 0,
        'letter': 1,
        'val1': 2,
        'val2': 3,
    }
    return field[fieldNames[name]]


for field in fields:
    print byName(field, 'name')


regards Max M
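
In current Python 3 the same name-to-position mapping is available ready-made as `collections.namedtuple`. A minimal sketch follows; the attribute names `name`, `type`, `length` and `dec` are an assumption chosen to mirror the field layout in the thread, and `Field` is a hypothetical class name:

```python
from collections import namedtuple

# Hypothetical attribute names for the four positions of a field tuple.
Field = namedtuple('Field', ['name', 'type', 'length', 'dec'])

raw_fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0),
]
fields = [Field(*raw) for raw in raw_fields]

for field in fields:
    # Access by name and by position refer to the same value.
    print(field.name, field[0] == field.name)
```

A namedtuple is still a plain tuple underneath, so indexing, unpacking and `zip` keep working on it.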
 
Bengt Richter

Or you can take advantage of zip:

>>> fields = [
... ('STOCKNO', 'C', 8, 0),
... ('DACC', 'C', 5, 0),
... ('DEALERACCE', 'C', 30, 0),
... ('D-ACCRTL', 'C', 9, 0),
... ('D-ACCCST', 'C', 9, 0)
... ]
>>> zip(*fields)[0]
('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST')

Or a list of all the columns, of which only the first was selected above:

>>> zip(*fields)
[('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST'), ('C', 'C', 'C', 'C', 'C'), (8, 5, 30, 9, 9), (0, 0, 0, 0, 0)]

Since zip gives you a list of tuples, you'll have to convert if you really
need a list version of one of them:

>>> list(zip(*fields)[0])
['STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST']

Regards,
Bengt Richter
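
A note for readers on current Python 3: there `zip` returns a lazy iterator rather than a list, so it cannot be indexed directly. A minimal sketch of the same transpose trick under that assumption (variable names are illustrative):

```python
fields = [
    ('STOCKNO', 'C', 8, 0),
    ('DACC', 'C', 5, 0),
    ('DEALERACCE', 'C', 30, 0),
    ('D-ACCRTL', 'C', 9, 0),
    ('D-ACCCST', 'C', 9, 0),
]

# Materialise every column: a list of tuples, one tuple per column.
columns = list(zip(*fields))

# The first column as a list.
names = list(columns[0])

# If only the first column is needed, next() avoids building the others.
first_only = list(next(zip(*fields)))

print(names)
```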
 
John Hunter

That is amazingly helpful. I've often needed to transpose a 2d list,
(mainly to get a "column" from an SQL query of a list of rows) and
this is just the trick.

I recently wrote a function to "deal" out a list of MySQLdb results,
where each field was a numeric type, and wanted to fill numeric arrays
with each column of results.

With your trick, e.g. for a list of results from three numeric fields,
I just have to do:

a1, a2, a3 = map(array, zip(*results))

John Hunter
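
The post does not say which `array` constructor is meant (presumably one from a numeric library). As a rough illustration of the same "deal out the columns" idea, here is a sketch that substitutes the standard library's `array.array` with typecode `'d'`; the `results` rows are made-up sample data standing in for a query result:

```python
from array import array

# Made-up rows, as a database query with three numeric fields might return.
results = [
    (1.0, 10.0, 100.0),
    (2.0, 20.0, 200.0),
    (3.0, 30.0, 300.0),
]

# zip(*results) yields one tuple per column; each becomes an array of doubles.
a1, a2, a3 = (array('d', column) for column in zip(*results))

print(a1.tolist(), a2.tolist(), a3.tolist())
```

The unpacking only works when the number of names on the left matches the number of columns in each row.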
 
Greg Brunet

Bengt Richter said:
Or you can take advantage of zip:

>>> fields = [
... ('STOCKNO', 'C', 8, 0),
... ('DACC', 'C', 5, 0),
... ('DEALERACCE', 'C', 30, 0),
... ('D-ACCRTL', 'C', 9, 0),
... ('D-ACCCST', 'C', 9, 0)
... ]
>>> zip(*fields)[0]
('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST')

Or a list of all the columns, of which only the first was selected above:

>>> zip(*fields)
[('STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST'), ('C', 'C', 'C', 'C', 'C'), (8, 5, 30, 9, 9), (0, 0, 0, 0, 0)]

Since zip gives you a list of tuples, you'll have to convert if you really
need a list version of one of them:

>>> list(zip(*fields)[0])
['STOCKNO', 'DACC', 'DEALERACCE', 'D-ACCRTL', 'D-ACCCST']

Bengt:

This looks great - but something isn't quite working for me. If I type
in the stuff as you show, the zip function works, but if I use the
values that I get from my code, it doesn't. Here's what I get in a
sample session:

#------------------------------------
[('STOCKNO', 'C', 8, 0), ('DACC', 'C', 5, 0), ('DEALERACCE', 'C', 30, 0),
('D_ACCRTL', 'C', 9, 0), ('D_ACCCST', 'C', 9, 0), ('DEC', 'N', 10, 2)]
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: zip argument #1 must support iteration
#------------------------------------

Where the "dbf" call invokes the __init__ for the dbf class, which opens the
file & reads the header. As part of that, the fields are placed in a
class variable, and accessed using the Fields() method. At first I
wasn't sure of what the '*' did, but finally figured that out (section
5.3.4-Calls of the Language Reference for anyone else who's confused).

After puzzling it through a bit, I believe that Fields() is causing the
problem because it's not really a list of tuples as it appears. Rather
it's a list of dbfField objects which have (among others) the following
2 methods:

from types import IntType, StringType   # Python 2 type objects used below

class dbfField:
    #----------------------------------------
    def __init__(self):
        pass

    #----------------------------------------
    def create(self, fldName, fldType='C', fldLength=10, fldDec=0):
        # (lots of error-checking omitted)
        self._fld = (fldName, fldType, fldLength, fldDec)

    #----------------------------------------
    def __repr__(self):
        return repr(self._fld)

    #----------------------------------------
    def __getitem__(self, key):
        """ Return by position or item name """
        if type(key) is IntType:
            return self._fld[key]
        elif type(key) is StringType:
            ukey = key.upper()
            if ukey == "NAME": return self._fld[0]
            elif ukey == "TYPE": return self._fld[1]
            elif ukey == "LENGTH": return self._fld[2]
            elif ukey == "DEC": return self._fld[3]


What I was trying to do, was to use the _fld tuple as the main object,
but wrap it with various methods & properties to 'safeguard' it. Given
that can I still use zip to do what I want? (Egor & Max's list
comprehension solution works fine for me, but the zip function seems
especially elegant) I read in the library reference about iterator
types (sec 2.2.5 from release 2.2.2), and it looks like I could get it
to work by implementing the iterator protocol, but I couldn't find any
sample code to help in this. Any idea if there's some available, or if
this is even worth it?

Better yet, is there a way for me to accomplish the same thing in a
simpler way? It's likely that I'm 'brute-forcing' a solution that has
gotten to be a lot more complex than it needs to be. Certainly if the
field definitions were in a simple tuple (which it is internally), zip
would work, but then it seems that I would lose the encapsulation
benefits. Is there a way to achieve both?

Thanks again,
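
For what it's worth, the iterator protocol asked about above can be quite small: a `__iter__` method that hands back an iterator over the wrapped tuple is enough for `zip` to accept the object. The following is a minimal sketch in current Python 3, not the poster's actual class; `DbfField`, its constructor arguments, and the `_POSITIONS` table are simplified, hypothetical names:

```python
class DbfField:
    """Hypothetical, simplified field wrapper around a 4-tuple."""

    # Name-to-position table for access by item name.
    _POSITIONS = {'NAME': 0, 'TYPE': 1, 'LENGTH': 2, 'DEC': 3}

    def __init__(self, name, fld_type='C', length=10, dec=0):
        # (error-checking omitted, as in the post above)
        self._fld = (name, fld_type, length, dec)

    def __repr__(self):
        return repr(self._fld)

    def __len__(self):
        return len(self._fld)

    def __iter__(self):
        # Delegating to the tuple's iterator makes zip(), list(), and
        # tuple unpacking work on the wrapper.
        return iter(self._fld)

    def __getitem__(self, key):
        """Return by position or item name."""
        if isinstance(key, int):
            return self._fld[key]
        return self._fld[self._POSITIONS[key.upper()]]


fields = [DbfField('STOCKNO', 'C', 8, 0), DbfField('DEC', 'N', 10, 2)]
names = list(zip(*fields))[0]
print(names)
```

The tuple stays private to the object, so the encapsulation is kept while the object still behaves like a sequence wherever one is expected.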
 
