How to search XML? Are there special libs?

  • Thread starter Sullivan WxPyQtKinter
  • Start date
S

Sullivan WxPyQtKinter

I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.

PS: I do not care about the learning curve, as long as it is not a
cliff to climb.
 
D

Diez B. Roggisch

Sullivan said:
I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.

XPath is your friend. Available at least in the 4suite-xml-libraries. I'm
not aware of anything that resembles a more database-like approach, but at
least you can issue queries against the xml.

However you might reconsider your choice of a persistence model. Queries is
what SQL is for, and you very well might be better off using e.g. sqlite
and a XML-im/export if that is what you need.

Diez
 
R

Ravi Teja

Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...
 
U

uche.ogbuji

Ravi said:
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...

4Suite does not support XQuery. It does support full XPath plus EXSLT
and enough other extensions to come close to the power of XQuery.
Amara [1] makes it really easy to get XQuery-like power from right
within Python, as I've blogged before (e.g. [2][3]).

I don't know whether full-text indexing of XML is something the OP
needs as well. If so, see [3].

[1] http://uche.ogbuji.net/tech/4Suite/amara/
[2] http://copia.ogbuji.net/blog/2005-06-12/Amara_equi
[3] http://copia.ogbuji.net/blog/2005/Sep/20
[4] http://www.xml.com/pub/a/2004/12/08/py-xml.html
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,997
Messages
2,570,241
Members
46,831
Latest member
RusselWill

Latest Threads

Top