Parsing Hints

mwt

Hi -
I'm working on parsing a file that has data that looks like the sample
below. Obviously, I can't just split the string by colons. I'm pretty
new to regex, but I was thinking of something that would essentially
"split" by colons only if they are preceded by alpha characters -- thus
eliminating problems of splitting up times, etc. Still, I'm nagged by
the spectre of you gurus knowing a powerful way to approach this
problem. Am I on the right track here with this regex idea? Any hints
as to the sanest angle on parsing this would be appreciated. Thanks.

Here's a sample of the data:

Index 4: folding now
server: 171.65.199.158:8080; project: 1809
Folding: run 17, clone 19, generation 35; benchmark 669; misc: 500, 400
issue: Wed Mar 15 18:32:19 2006; begin: Wed Mar 15 18:32:25 2006
due: Fri Apr 28 19:32:25 2006 (44 days)
core URL: http://www.stanford.edu/~pande/Linux/x86/Core_82.fah
CPU: 1,0 x86; OS: 4,0 Linux
assignment info (le): Wed Mar 15 18:32:19 2006; A0F3AAD2
CS: 171.65.103.100; P limit: 5241856
user: MWT; team: 0; ID: 1A2BFB777775B7B; mach ID: 2
work/wudata_04.dat file size: 82814; WU type: Folding@Home
Average download rate 97.552 KB/s (u=4); upload rate 38.718 KB/s (u=3)
Performance fraction 0.950453 (u=3)
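
For reference, the "split on a colon only if a letter comes before it" idea can be sketched with a lookbehind. This is only an illustration of that idea using Python's re module; the name ALPHA_COLON is made up for the example. It leaves IP:port pairs and times alone, but it still breaks the URL after "http" and misses "Index 4:", so it is probably not the whole answer.

```python
import re

# Match a colon only when the character immediately before it is a letter.
# ALPHA_COLON is a hypothetical name used only for this sketch.
ALPHA_COLON = re.compile(r"(?<=[A-Za-z]):")

# The port colon follows a digit, so only the first colon splits.
server = ALPHA_COLON.split("server: 171.65.199.158:8080")

# Times are safe for the same reason: their colons follow digits.
issue = ALPHA_COLON.split("issue: Wed Mar 15 18:32:19 2006")

# Pitfall: "http" ends in a letter, so the URL is split as well.
url = ALPHA_COLON.split("core URL: http://www.stanford.edu/")

# Pitfall: the key ends in a digit, so this line is not split at all.
index = ALPHA_COLON.split("Index 4: folding now")

print(server)
print(issue)
print(url)
print(index)
```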
 
mwt

OK. I think the solution was much easier than I thought. The key is the
semicolon. I'm doing it in 3 steps:
1) Break string into 13 lines
2) Split each line by the semi-colon
3) Ummm... done already.

Time to wake up. ;)
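
The steps above could look roughly like this in Python. It is a minimal sketch, assuming the whole record is already in one string; split_fields and the shortened sample are hypothetical and only there for illustration.

```python
# A shortened copy of the sample record, used only for this sketch.
sample = """server: 171.65.199.158:8080; project: 1809
CPU: 1,0 x86; OS: 4,0 Linux
user: MWT; team: 0; ID: 1A2BFB777775B7B; mach ID: 2"""


def split_fields(text):
    """Return one list of whitespace-stripped fields per non-empty line."""
    rows = []
    for line in text.splitlines():      # step 1: break the string into lines
        if line.strip():                # ignore blank lines
            # step 2: split each line on the semicolon
            rows.append([field.strip() for field in line.split(";")])
    return rows


rows = split_fields(sample)
for row in rows:
    print(row)
```

Each field still holds its label and value together (for example "project: 1809"), so a further split is needed if the two have to be separated.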
 
Kent Johnson

mwt said:
Hi -
I'm working on parsing a file that has data that looks like the sample
below.

Here's a sample of the data:

Index 4: folding now
server: 171.65.199.158:8080; project: 1809
Folding: run 17, clone 19, generation 35; benchmark 669; misc: 500, 400
issue: Wed Mar 15 18:32:19 2006; begin: Wed Mar 15 18:32:25 2006
due: Fri Apr 28 19:32:25 2006 (44 days)
core URL: http://www.stanford.edu/~pande/Linux/x86/Core_82.fah
CPU: 1,0 x86; OS: 4,0 Linux
assignment info (le): Wed Mar 15 18:32:19 2006; A0F3AAD2
CS: 171.65.103.100; P limit: 5241856
user: MWT; team: 0; ID: 1A2BFB777775B7B; mach ID: 2
work/wudata_04.dat file size: 82814; WU type: Folding@Home
Average download rate 97.552 KB/s (u=4); upload rate 38.718 KB/s (u=3)
Performance fraction 0.950453 (u=3)

You don't say what data you are trying to extract. If it is key:value
pairs where the key is everything before the first colon, just use
line.split(':', 1) to split on just the first colon.

Kent
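
A small sketch of how split(':', 1) might be combined with splitting on semicolons to get key:value pairs. It assumes, as Kent's suggestion does, that the key is everything before the first colon in a field. The function name parse_line and the "leftover" list for fields with no colon are assumptions made for this example, not something from the thread.

```python
def parse_line(line):
    """Split one line into (pairs, leftover).

    pairs maps the text before the first colon of each field to the rest;
    leftover collects fields that contain no colon at all.
    """
    pairs = {}
    leftover = []
    for field in line.split(";"):
        field = field.strip()
        if not field:
            continue
        parts = field.split(":", 1)     # split on the first colon only
        if len(parts) == 2:
            pairs[parts[0].strip()] = parts[1].strip()
        else:
            leftover.append(field)
    return pairs, leftover


print(parse_line("server: 171.65.199.158:8080; project: 1809"))
print(parse_line("assignment info (le): Wed Mar 15 18:32:19 2006; A0F3AAD2"))
print(parse_line("core URL: http://www.stanford.edu/~pande/Linux/x86/Core_82.fah"))
```

Because only the first colon in each field is used, the port number, the times and the URL stay in one piece. Fields such as "benchmark 669" or "A0F3AAD2" have no colon and end up in leftover, so they would need their own handling.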
 
