How to compare two fields in Python

  • Thread starter upendra kumar Devisetty

upendra kumar Devisetty

I have a very basic question about Python. I want to go through each line of a CSV file and check whether the first field of line 1 is the same as the first field of the next line, and so on. If there is a match, I would like to put that field in one object (object1); otherwise I would like to put it in a different object (object2). Finally, I would like to count how many fields are in object1 vs. object2. Can this be done in Python? Here is a small example.

BRM_1 679 1929
BRM_1 203 567
BRM_2 367 1308
BRM_3 435 509
As you can see, field 1 of line 1 is the same as field 1 of line 2, so the field BRM_1 should be placed in object1, while BRM_2 and BRM_3 should be placed in object2. The final counts are therefore 1 for object1 and 2 for object2.

Thanks in advance..

Upendra
 
Joel Goldstick

In reply to upendra kumar Devisetty's post of Tue, Apr 30, 2013 at 1:41 PM:

You should study the csv module.
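
A minimal sketch of what that could look like, assuming the fields are separated by single spaces as in the sample above. The name `sample` and the use of io.StringIO in place of a real file are illustrative assumptions, not part of the original question:

```python
import csv
import io

# Stand-in for the real file (assumption). For a file on disk, use
# open(path, newline="") instead of io.StringIO.
sample = io.StringIO(
    "BRM_1 679 1929\n"
    "BRM_1 203 567\n"
    "BRM_2 367 1308\n"
    "BRM_3 435 509\n"
)

# delimiter=" " is an assumption based on the sample; change it for
# comma- or tab-separated data.
reader = csv.reader(sample, delimiter=" ", skipinitialspace=True)

# Keep only the first field of every non-empty row.
first_fields = [row[0] for row in reader if row]

print(first_fields)
```

Only the first column is kept here, since the remaining values play no role in the comparison.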
 
Fábio Santos

... and collections.Counter, which is useful for (you guessed it) counting.

Maybe itertools.groupby will be helpful as well (it could be used to give
you your data grouped by the first column of data), but it could be a tad
advanced for you if you are not too familiar with iterators.
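
As a rough illustration of both suggestions, here is a sketch that works on the first fields from the sample data. The list `first_fields` and the names `repeated` and `unique` (standing in for the question's object1 and object2) are assumptions made for this example:

```python
from collections import Counter
from itertools import groupby

# First fields of the sample lines (hard-coded here for illustration).
first_fields = ["BRM_1", "BRM_1", "BRM_2", "BRM_3"]

# Counter tallies how often each key appears anywhere in the input.
counts = Counter(first_fields)
repeated = [key for key, n in counts.items() if n > 1]   # "object1"
unique = [key for key, n in counts.items() if n == 1]    # "object2"

# groupby only merges adjacent equal keys, so it relies on sorted input.
group_sizes = [(key, len(list(rows))) for key, rows in groupby(first_fields)]

print(len(repeated), len(unique))
print(group_sizes)
```

Counter does not care about the order of the lines, while groupby depends on equal keys being next to each other.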
 
Tim Chase

You underdefine the problem. What happens in the case of:

BRM_1 ...
BRM_1 ...
BRM_2 ...
BRM_1 ... <-- duplicates a (not-immediately) previous line
BRM_3 ...

Also, do the values that follow have any significance for this, or
are they just noise to be ignored?
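
To make the ambiguity concrete, here is a small sketch that applies two possible readings of the question to the five keys above. The variable names are hypothetical, and neither reading is claimed to be the intended one:

```python
from collections import Counter
from itertools import groupby

keys = ["BRM_1", "BRM_1", "BRM_2", "BRM_1", "BRM_3"]

# Reading A: only neighbouring lines are compared, so each run of
# consecutive equal keys is judged on its own.
runs = [(key, len(list(run))) for key, run in groupby(keys)]
a_object1 = [key for key, n in runs if n > 1]
a_object2 = [key for key, n in runs if n == 1]

# Reading B: a key counts as a duplicate if it occurs more than once
# anywhere in the file.
totals = Counter(keys)
b_object1 = [key for key, n in totals.items() if n > 1]
b_object2 = [key for key, n in totals.items() if n == 1]

print(a_object1, a_object2)
print(b_object1, b_object2)
```

Under reading A the stray BRM_1 ends up in both objects, while under reading B it is only ever counted as a duplicate.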

-tkc
 
upendra kumar Devisetty

The data is sorted, so duplicates are always on adjacent lines and will not appear anywhere else in the data. The values that follow the first field have no significance and can safely be ignored.

Thanks
Upendra
 
Dennis Lee Bieber

You are essentially describing a "control-break" (or "report break")
http://en.wikipedia.org/wiki/Control_break

The basic algorithm requires you to keep track of the "previous" record
and compare it with the current one. While the "control" field stays the
same, you keep doing one action. When the control field changes, you close
out the previous group and start a new one.

Pseudo-code:

    control = None
    group = []
    for record in records:          # each record is a list of fields, e.g. line.split()
        if control is not None and record[0] != control:
            output(group)           # close out previous group data
            group = []              # initialize new group data
        control = record[0]         # reset control break data
        group.append(record)        # add current record to group
    if group:
        output(group)               # handle non-empty last group
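
One way the pseudo-code could be turned into a runnable answer to the original question is sketched below. It assumes whitespace-separated fields and sorted input, as stated earlier in the thread. The function name `control_break`, the in-memory `sample`, and the variables `object1` and `object2` are illustrative choices only:

```python
import io


def control_break(records):
    """Yield (key, group) for each run of consecutive records sharing record[0]."""
    control = None
    group = []
    for record in records:
        if group and record[0] != control:
            yield control, group    # close out the previous group
            group = []              # start a new group
        control = record[0]         # remember the current control field
        group.append(record)
    if group:
        yield control, group        # do not forget the last group


# Stand-in for the real file (assumption); open(path) would work the same way.
sample = io.StringIO(
    "BRM_1 679 1929\n"
    "BRM_1 203 567\n"
    "BRM_2 367 1308\n"
    "BRM_3 435 509\n"
)

# Split each non-blank line on whitespace to get a list of fields.
records = (line.split() for line in sample if line.strip())

object1 = []    # first fields that repeat on consecutive lines
object2 = []    # first fields that appear only once
for key, group in control_break(records):
    (object1 if len(group) > 1 else object2).append(key)

print(len(object1), len(object2))   # prints "1 2" for the sample data
```

Because only the previous key and the current group are held in memory, this approach also works for files too large to load at once.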
 
