How to use XPath to select element with text match

sehiser

Hello,

I've been reading up on xpath and I am able to access elements with it.
I haven't been able to figure one thing out though.

How would I use XPath to select an element where the text equals what
I'm looking for?

If I select a book topic (in the tradition of XML examples) of
Programming and it returns 200 books, and I want to specifically select
books with an element named "topic" whose text is "C++", what would the
XPath be?

<books>
<category>
<topic>Linux</topic>
<title>Linux ABC's</title>
<desc>The ABC's of Linux</desc>
<topic>C++</topic>
<title>Beginning C++</title>
<desc>C++ programming for beginners</desc>
<topic>C++</topic>
<title>Patterns in C++</title>
<desc>Advanced C++ programming</desc>
</category>
</books>

If I wanted to only select the elements in category matching the topic
text of C++, how would I use XPath to achieve this?

I thought it would be something like //books/category[topic='C++'] but
I get a node test expected error. I'm adding XML support to a VB
application and using MSXML6, but I don't think that's relevant to
XPath syntax.

Thanks.
 
sehiser

Ah, I just hacked out that XML; just imagine it is properly formed for
its purpose.
 
Joseph Kesselman

> How would I use XPath to select an element where the text equals what
> I'm looking for?

You use a predicate.

> books with an element named "topic" text of "C++" what would the xpath
> be?

I presume you meant to have a "book" element that groups the values that
are properties of the same book:

<books>
<category>
<book>
<topic>Linux</topic>
<title>Linux ABC's</title>
<desc>The ABC's of Linux</desc>
</book>
<book>
<topic>C++</topic>
<title>Beginning C++</title>
<desc>C++ programming for beginners</desc>
</book>
<book>
<topic>C++</topic>
<title>Patterns in C++</title>
<desc>Advanced C++ programming</desc>
</book>
</category>
</books>

You could then say, for example, /books/category/book[topic="C++"]

> I thought it would be something like //books/category[topic='C++']

That should work on your sample document (see above grumble about the
design). But what it will find is the category element, not an
individual book.
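
For anyone who wants to see the difference between the two expressions without MSXML, here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports a small XPath subset that includes the [child='text'] predicate. The variable names and the trimmed-down sample documents are only for illustration; the paths are written relative to the root element because that is how ElementTree searches.

```python
import xml.etree.ElementTree as ET

# The original flat layout: topics are direct children of <category>.
FLAT = """
<books>
  <category>
    <topic>Linux</topic><title>Linux ABC's</title>
    <topic>C++</topic><title>Beginning C++</title>
    <topic>C++</topic><title>Patterns in C++</title>
  </category>
</books>
"""

# The regrouped layout: each book wraps its own topic and title.
GROUPED = """
<books>
  <category>
    <book><topic>Linux</topic><title>Linux ABC's</title></book>
    <book><topic>C++</topic><title>Beginning C++</title></book>
    <book><topic>C++</topic><title>Patterns in C++</title></book>
  </category>
</books>
"""

flat_root = ET.fromstring(FLAT)        # the <books> element
grouped_root = ET.fromstring(GROUPED)  # the <books> element

# Predicate on category: matches the one category that has a C++ topic child,
# so the result is the whole category, not the individual books.
categories = flat_root.findall("./category[topic='C++']")

# Predicate on book: matches each book whose own topic child is C++.
cpp_books = grouped_root.findall("./category/book[topic='C++']")
cpp_titles = [book.findtext("title") for book in cpp_books]

print([element.tag for element in categories])
print(cpp_titles)
```

The first query returns a single category element however many C++ topics it holds, which is why the book wrapper is needed before the predicate can pick out individual books.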
 
sehiser

Thanks, I just added another element and it worked fine.

Second, how can I use select-distinct so I only return one category
topic? I see how to do it with attributes selected but I'm not having
a lot of immediate success trying to get it to work just on elements.
 
Peter Flynn

> How would I use XPath to select an element where the text equals what
> I'm looking for?

Let's assume you have some well-formed XML:

<books>
<category class="computing">
<topic subject="Linux">
<book>
<title>Linux ABC's</title>
<desc>The ABC's of Linux</desc>
</book>
</topic>
<topic subject="C++">
<book>
<title>Beginning C++</title>
<desc>C++ programming for beginners</desc>
</book>
<book>
<title>Patterns in C++</title>
<desc>Advanced C++ programming</desc>
</book>
</topic>
</category>
</books>

(Far from ideal, but usable)

Then books about C++ can be got by several means:

//topic[@subject='C++']/book
//book[contains(title,'C++')]
//book[contains(desc,'C++')]

or some combination of them.
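
As a rough way to try these selections outside an XSLT processor, the sketch below uses Python's standard xml.etree.ElementTree. Two assumptions to note: ElementTree only implements a subset of XPath, so the attribute predicate is used as written but contains() is not available there, and a plain Python substring test stands in for it. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """
<books>
  <category class="computing">
    <topic subject="Linux">
      <book><title>Linux ABC's</title><desc>The ABC's of Linux</desc></book>
    </topic>
    <topic subject="C++">
      <book><title>Beginning C++</title><desc>C++ programming for beginners</desc></book>
      <book><title>Patterns in C++</title><desc>Advanced C++ programming</desc></book>
    </topic>
  </category>
</books>
"""

root = ET.fromstring(XML)  # the <books> element

# Equivalent of //topic[@subject='C++']/book, relative to the root element.
by_subject = root.findall(".//topic[@subject='C++']/book")
subject_titles = [book.findtext("title") for book in by_subject]

# Stand-in for //book[contains(title,'C++')]: ElementTree has no contains(),
# so the substring check is done in Python instead.
by_title = [
    book for book in root.iter("book")
    if "C++" in (book.findtext("title") or "")
]
title_matches = [book.findtext("title") for book in by_title]

print(subject_titles)
print(title_matches)
```
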

> I thought it would be something like //books/category[topic='C++'] but
> I get a node test expected error.

Hardly surprising if your XML is not well-formed. Get the data model
right first, then everything else pretty much falls into place. Get it
wrong, and your application is hosed before you start.

///Peter
 
Peter Flynn

And be careful if your data is coming from someone who doesn't
understand XML. I see a lot of data these days generated by
scripts that are careless about newlines, e.g.

<desc>
C++
</desc>

and of course /foo[desc='C++'] fails because of the whitespace.
contains() is your friend.
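
The whitespace problem is easy to reproduce. The sketch below uses Python's standard xml.etree.ElementTree, whose [child='text'] predicate compares the complete text of the child, newlines included. The item wrapper element is an assumption added so that two variants can sit in one document. ElementTree does not implement contains() or normalize-space(), so the trimming is done in Python; in a full XPath 1.0 processor, normalize-space(desc)='C++' is the standard way to get an exact match that ignores surrounding whitespace, whereas contains() also matches longer strings such as "Advanced C++ programming".

```python
import xml.etree.ElementTree as ET

# Two items with the same logical value; the first has stray newlines.
XML = "<foo><item><desc>\nC++\n</desc></item><item><desc>C++</desc></item></foo>"

root = ET.fromstring(XML)

# Exact comparison: only the item without surrounding whitespace matches.
exact = root.findall("./item[desc='C++']")

# Whitespace-tolerant comparison, done in Python by trimming the text first.
trimmed = [
    item for item in root.findall("./item")
    if (item.findtext("desc") or "").strip() == "C++"
]

print(len(exact), len(trimmed))
```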

///Peter
 
sehiser

Great info here! You're saving me a lot of trial and error.

The select-distinct would be like a SELECT DISTINCT statement in SQL.

Let's say I have 200 records to search and I only want to know all the
topic categories out there. Rather than display a result that might
show the category C++ 34 times, I want to return C++ once. Basically
my end result is every category available but only displaying it once.

C++
C++
LINUX
C++
JAVA
PHP
PHP
PERL
C++
.NET

Selecting distinct would return the list below. I've seen some XPath
functions like this, but they only work with attributes; I would like
to use the element name.

C++
LINUX
JAVA
PHP
PERL
.NET

~Scott
 
p.lepin

> Great info here! You're saving me a lot of trial and
> error.

Trial and error is an excellent way to learn, as long as
you're not coding a targeting software package for ICBMs.

> Let's say I have 200 records to search and I only want to
> know all the topic categories out there.
> Rather than display a result that might show the category
> C++ 34 times, I want to return C++ once. Basically my end
> result is every category available but only displaying it
> once.

You seem to understand pretty well what exactly you need;
you should've tried implementing it.

> I've seen some XPATH function like this, but they only
> work with attributes, I would like to use the element
> name.

What does it matter? So it's on the child:: axis instead of
the attribute:: axis. Big deal.

Assume you have your data stored in an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<data>
<book><topic>Perl</topic></book>
<book><topic>C++</topic></book>
<book><topic>Java</topic></book>
<book><topic>C++</topic></book>
<book><topic>Perl</topic></book>
<book><topic>Java</topic></book>
<book><topic>Java</topic></book>
<book><topic>C++</topic></book>
<book><topic>Java</topic></book>
<book><topic>Perl</topic></book>
</data>

(Nodes that we aren't using in this example omitted.)

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <xsl:output
    method="xml"
    version="1.0"
    encoding="UTF-8"
  />
  <xsl:template match="/">
    <result>
      <xsl:apply-templates
        select="//book"
        mode="distinct-topic">
        <xsl:sort select="topic"/>
      </xsl:apply-templates>
    </result>
  </xsl:template>
  <xsl:template match="book" mode="distinct-topic">
    <!-- only if the current node is the first node in the -->
    <!-- document with the same content of <topic> child -->
    <xsl:if
      test="
        generate-id(.)=
        generate-id(//book[topic=current()/topic][1])
      ">
      <xsl:copy-of select="topic"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
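
To check what the distinct, sorted topic list should look like for the sample data above, the same idea can be sketched in Python with the standard xml.etree.ElementTree. This is a stand-in for the stylesheet, not a translation of it: it keeps a topic only the first time it is seen in document order and then sorts the result, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

DATA = """
<data>
  <book><topic>Perl</topic></book>
  <book><topic>C++</topic></book>
  <book><topic>Java</topic></book>
  <book><topic>C++</topic></book>
  <book><topic>Perl</topic></book>
  <book><topic>Java</topic></book>
  <book><topic>Java</topic></book>
  <book><topic>C++</topic></book>
  <book><topic>Java</topic></book>
  <book><topic>Perl</topic></book>
</data>
"""

root = ET.fromstring(DATA)  # the <data> element

seen = set()
distinct_topics = []
for book in root.iter("book"):
    topic = book.findtext("topic")
    # Keep a topic only on its first occurrence in document order.
    if topic is not None and topic not in seen:
        seen.add(topic)
        distinct_topics.append(topic)

# Sort the surviving topics, as the xsl:sort instruction does.
distinct_topics.sort()

print(distinct_topics)
```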
 
