Ethan Furman
Let's say I have three tables:
     CatLovers                DogLovers
-------------------      -------------------
| name      | age |      | name      | age |
|-----------------|      |-----------------|
| Allen     |  42 |      | Alexis    |   7 |
| Jerod     |  29 |      | Michael   |  21 |
| Samuel    |  17 |      | Samuel    |  17 |
| Nickalaus |  55 |      | Lawrence  |  63 |
| Frederick |  34 |      | Frederick |  34 |
-------------------      -------------------
       NumberOfPets
---------------------------
| name      | cats | dogs |
---------------------------
| Allen     |    2 |    0 |
| Alexis    |    0 |    3 |
| Michael   |    0 |    1 |
| Samuel    |    1 |    2 |
| Jerod     |    3 |    0 |
| Nickalaus |    5 |    0 |
| Lawrence  |    0 |    1 |
| Frederick |    3 |    2 |
---------------------------
(I know, I know -- coming up with examples has never been my strong
point.)
    import dbf

    catlovers = dbf.Table('CatLovers')
    doglovers = dbf.Table('DogLovers')
    petcount = dbf.Table('NumberOfPets')
For the sake of this highly contrived example, let's say I'm printing a
report, in alphabetical order, of those who love both cats and dogs...
    def names(record):
        return record.name

    c_idx = catlovers.create_index(key=names)
    d_idx = doglovers.create_index(key=names)
    p_idx = petcount.create_index(key=names)
    # method 1
    for record in c_idx:
        if record in d_idx:
            print record.name, record.age, \
                  p_idx[record].cats, p_idx[record].dogs
*or*
    # method 2
    for record in c_idx:
        if d_idx.key(record) in d_idx:    # or if names(record) in d_idx:
            print record.name, record.age, \
                  p_idx[record].cats, p_idx[record].dogs
Which is better (referring to the _in_ test)? Part of the issue is
whether _any_ record from the CatLovers table can really be in the
DogLovers index. Obviously not -- so if you ask that question in code,
what you are really asking is whether a record from CatLovers has a
matching key value in DogLovers. That means either the __contains__
code applies the key function to the record (implicit, as in method 1
above) or the calling code does it (explicit, as in method 2 above).
I'm leaning towards method 1, even though the key function is then
called behind the scenes, because I think it makes the calling code cleaner.
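
To make the implicit option concrete, here is a rough sketch of what a
__contains__ along the lines of method 1 might look like. The Index class,
the Person namedtuple, and the plain lists standing in for tables are all
made up for illustration; this is not how dbf actually implements its
indexes.

```python
from bisect import bisect_left
from collections import namedtuple

# Stand-in for a dbf record (hypothetical, for illustration only).
Person = namedtuple("Person", "name age")


class Index(object):
    """Hypothetical sorted index; not the dbf implementation."""

    def __init__(self, records, key):
        self.key = key
        self._records = sorted(records, key=key)
        self._keys = [key(r) for r in self._records]

    def __iter__(self):
        # Iterate over the records in key order.
        return iter(self._records)

    def __contains__(self, record):
        # Implicit style (method 1): the index applies its own key
        # function to the record, then looks for a matching key.
        k = self.key(record)
        i = bisect_left(self._keys, k)
        return i < len(self._keys) and self._keys[i] == k


def names(record):
    return record.name


catlovers = [Person("Allen", 42), Person("Jerod", 29), Person("Samuel", 17),
             Person("Nickalaus", 55), Person("Frederick", 34)]
doglovers = [Person("Alexis", 7), Person("Michael", 21), Person("Samuel", 17),
             Person("Lawrence", 63), Person("Frederick", 34)]

c_idx = Index(catlovers, key=names)
d_idx = Index(doglovers, key=names)

# Names of those who appear, by key, in both indexes.
both = [record.name for record in c_idx if record in d_idx]
```

Note that with this particular sketch method 2 would not work as written,
since __contains__ would try to apply the key function to a bare key; an
index supporting the explicit style would have to accept keys instead of
(or as well as) records.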
Opinions?
~Ethan~