Text & Unicode processing references on the web.


anthony hornby

Hi,
I am starting my honours degree project. Part of it involves taking
ASCII-encoded XML files from a legacy database, converting them to
Unicode, and doing text processing on the data.

I am new to Python (total n00b) but am keen to use it, as the rest of
the software my application has to extend is already written in Python.
Plus I've always wanted to learn more about it, so here's my chance :)

I've written stuff in Java and Perl so I expect I'll pick up the basics
without too much trouble.

Can anyone point out some good "Python & Unicode" and "Python & Text
processing" resources on the net to get me started? Any good book
recommendations?

Thanks a lot for your help.
 

Martin v. Löwis

anthony said:
Can anyone point out some good "Python & Unicode" and "Python & Text
processing" resources on the net to get me started? Any good book
recommendations?

As a quick Unicode tutorial, I'd recommend

http://www.jorendorff.com/articles/unicode/python.html
http://www.egenix.com/files/python/Unicode-EPC2002-Talk.pdf
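
For the ASCII-to-Unicode step in the question, here is a minimal sketch, assuming Python 3 (where `bytes` and `str` are separate types). The helper names `decode_legacy` and `to_utf8`, and the `latin-1` fallback, are made up for illustration; the real fallback encoding depends on what the legacy database actually wrote.

```python
def decode_legacy(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode bytes that are supposed to be ASCII into a Unicode str.

    `fallback` is a hypothetical guess at the real encoding. It is used
    only when the data turns out to contain non-ASCII bytes.
    """
    try:
        # Strict ASCII decoding raises on any byte above 0x7F.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def to_utf8(raw: bytes) -> bytes:
    """Re-encode legacy bytes as UTF-8 for storage or output."""
    return decode_legacy(raw).encode("utf-8")
```

Trying strict ASCII first is a cheap way to find out whether the "ASCII" files really are ASCII before deciding how to handle the ones that are not.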

For text processing, you should read the "Strings" section of the
library reference:

http://www.python.org/doc/current/lib/strings.html

Notice there is also a separate section on SGML/XML:

http://www.python.org/doc/current/lib/markup.html
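
To tie the XML and string handling together, here is a rough sketch, assuming Python 3 and the standard library's `xml.etree.ElementTree` rather than PyXML. The element name `title` and the function `normalise_titles` are hypothetical; substitute whatever the legacy export really contains.

```python
import xml.etree.ElementTree as ET


def normalise_titles(xml_bytes: bytes) -> bytes:
    """Parse XML, tidy whitespace in <title> text, return UTF-8 bytes."""
    # The parser turns the input bytes into Unicode str values.
    root = ET.fromstring(xml_bytes)
    for title in root.iter("title"):  # "title" is a made-up element name
        if title.text:
            # Collapse runs of whitespace and trim both ends.
            title.text = " ".join(title.text.split())
    # Serialise to str first, then encode explicitly as UTF-8.
    return ET.tostring(root, encoding="unicode").encode("utf-8")
```

Once parsed, the element text is ordinary Unicode, so the usual string methods apply and the encoding only matters again when writing the result out.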

PyXML has its own documentation page at

http://pyxml.sourceforge.net/topics/docs.html

As for book recommendations: What language? I would recommend

Fischbeck, v. Löwis: Python 2

:) It covers all of these topics.

Regards,
Martin
 

anthony hornby

Hi Martin,
Thanks for the useful links :)

Anthony.

