Regexp problem, which pattern to use in split

  • Thread starter Hans Almåsbakk
Hans Almåsbakk

Hi,

I have a problem which I believe has been seen before:
finding the correct pattern to use in order to split a line correctly,
using the split function in the re module.

I'm new to regexp, and it isn't always easy to comprehend for a newbie :)

The lines I want to split are like this:
(The following is one line, even if the news client splits it up:)

"abc ",,"-",,,,,"Doe, John D.",2004,"A long text, which may contain many
characters. Dots, commas, and if I'm real unlucky: maybe even
"-characters","-",32454,,

These lines are in a CSV file exported from Excel.
The comma is obviously the separator, but as you can see, a comma may also
occur inside a quoted field ("..."), and in that case it should not be
treated as a separator.
I then considered using the " characters in the split pattern as well,
something like "?,"? (an optional " before and after the comma), which of
course also goes wrong: a " may or may not appear around a separating comma,
so the pattern also matches single commas inside quoted text (see the example).
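
To make the failure concrete, here is a minimal sketch of that naive split. The sample line is a shortened version of the one above (the long text field is left out), and the variable names are only illustrative:

```python
import re

# Shortened version of the sample line (long text field omitted).
line = '"abc ",,"-",,,,,"Doe, John D.",2004,"-",32454,,'

# Naive pattern: a comma with an optional quote on either side.
fields = re.split(r'"?,"?', line)

# The comma inside "Doe, John D." is treated as a separator too,
# so the name ends up in two separate fields.
print(fields)
```

The pattern has no notion of being inside or outside a quoted field, which is why a single regular expression passed to split is an awkward fit for this format.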

Any pointer will be greatly appreciated. Maybe I'm attacking this problem
the wrong way already from the start? (Not that I can see another way
myself :)

Regards
 
Matthias Huening

Hans Almåsbakk (14.12.2004 16:02):
Any pointer will be greatly appreciated. Maybe I'm attacking this problem
the wrong way already from the start? (Not that I can see another way
myself :)

Hans, did you try the csv module in the Python library?
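
As a rough sketch of what that could look like (assuming Python 2.3 or later, where the csv module is available): the sample line below is shortened from the original post, and the doubled "" is an assumption about how Excel writes a literal quote character inside a quoted field.

```python
import csv

# Shortened sample; the doubled "" stands for one literal quote character.
line = '"abc ",,"-",,,,,"Doe, John D.",2004,"A long text, with ""quotes"" too","-",32454,,'

# csv.reader accepts any iterable of lines, so a one-element list works;
# for a real file, pass the open file object instead.
rows = list(csv.reader([line]))
row = rows[0]

# Commas inside quoted fields are kept as part of the field.
print(row[7])
print(row[9])
```

The default dialect is the Excel one, so no extra options should be needed for a file exported from Excel with comma separators.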

Matthias
 
Hans Almåsbakk


This seems to be just the thing I need.

Now of course, another problem arose:
The csv module is new in Python 2.3.

hans:~# python -V
Python 2.1.3

Is there a relatively hassle-free way to get the csv module working with
2.1? The server is running Debian stable/woody, and it also seemed 2.2 can
coexist with 2.1, when I checked the distro packages, if that is any help.

Regards
 
Fredrik Lundh

Hans said:
Is there a relatively hassle-free way to get the csv module working with
2.1? The server is running Debian stable/woody, and it also seemed 2.2 can
coexist with 2.1, when I checked the distro packages, if that is any help.

2.3 and 2.4 can also coexist with 2.1 (use "make altinstall" to leave "python"
alone), so if you're using a pure-Python application, upgrading might be a good
idea.

alternatively, the following module (with a slightly different API) should work
under 2.1:

http://www.object-craft.com.au/projects/csv/

</F>
 