translating LaTeX to XML

  • Thread starter Michael Friendly

I have a LaTeX document describing a long list of items that I want to
translate to XML so that I can treat them as a database. I've written a Perl
script to do the basic translation, and a basic DTD file,
but I am stumped at translating
LaTeX character encodings into something XML won't choke on.

I found GNU recode to solve most of this, using

cat milestone.tex | recode -d tex..xml | itemdb -s xml -o milestone.xml

where itemdb is my Perl script. That got rid of the diacritical
characters, but I'm still getting errors with &s in URLs:

XML Parsing Error: not well-formed
Location: file:///home/friendly/SCS/Gallery/milestone/Private/milestone.xml
Line Number 2397, Column 101: <commentary
url="http://historical.library.cornell.edu/cgi-bin/cul.math/docviewer?did=00620001&seq=3"
text="Text of d'Ocagne'sbook on parallel coordinates" />
----------------------------------------------------------------------------------------------------^

(This is from the Mozilla browser, trying to load the milestone.xml file.)
I'm pretty much a newbie with XML, so I don't know whether it is a
problem with my DTD, or what tools are available for this (Debian Linux).

-Michael


--
Michael Friendly Email: (e-mail address removed)
Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT M3J 1P3 CANADA
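
A note on the error above: the caret points at the bare & in the url attribute. In XML, a literal & has to be written as &amp; everywhere, including inside attribute values, so this is a well-formedness problem rather than a DTD problem. The following is a minimal sketch, in Python rather than Perl, of one way to escape only the ampersands that do not already begin an entity or character reference. The function name and the regular expression are illustrative assumptions, not part of itemdb or recode.

```python
import re

# An ampersand that is NOT already the start of an entity reference (&name;)
# or a character reference (&#233; or &#xE9;). Hypothetical helper pattern.
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z_][A-Za-z0-9._-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text: str) -> str:
    """Replace each bare '&' with '&amp;', leaving existing references alone."""
    return BARE_AMP.sub("&amp;", text)


# The attribute value from the error message above.
url = ("http://historical.library.cornell.edu/cgi-bin/cul.math/"
       "docviewer?did=00620001&seq=3")
fixed_url = escape_bare_ampersands(url)
```

The same substitution could be applied inside the Perl script at the point where attribute values are written out. One caveat: a query string whose parameter happens to look like an entity followed by a semicolon (for example &copy;) would be left untouched, so escaping the raw URL before any entity conversion is the safer ordering.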
 
