XPath speed.

Daniel Pitts · Jul 9, 2008

Hello everyone.
I've noticed that a lot of time in a few of my code-bases is spent in
XPath.evaluate or XPathExpression.evaluate. I've re-written part of it
to use direct DOM navigation where feasible, with a huge speed increase.
I was thinking there must be a faster implementation of XPath than the
default com.sun implementation. Does anyone have any experience with this?

Thanks,
Daniel.

Stanimir Stamenkov · Jul 9, 2008

Tue, 08 Jul 2008 17:55:18 -0700, /Daniel Pitts/:

I've noticed that a lot of time in a few of my code-bases is spent in
XPath.evaluate or XPathExpression.evaluate. I've re-written part of it
to use direct DOM navigation where feasible, with a huge speed increase.
I was thinking there must be a faster implementation of XPath than the
default com.sun implementation. Does anyone have any experience with this?

I haven't used the XPath classes directly, but I believe the com.sun
classes must be a fork of the Xalan [1] implementation. Did you try
plugging in the latest Xalan release (2.7.1)? You may get more help
with it on the Xalan user mailing list [2].

[1] http://xml.apache.org/xalan-j/
[2] http://mail-archives.apache.org/mod_mbox/xml-xalan-j-users/
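For anyone wanting to try that, here is a minimal sketch of asking JAXP for a specific XPath factory and falling back to the JDK default when it is not on the classpath. The factory class name is an assumption about how Xalan-J 2.7.x packages its JAXP XPath factory, and XPathFactoryChooser is a hypothetical helper name; check both against the jar you actually deploy.

```java
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

class XPathFactoryChooser {
    // Assumed class name of Xalan-J's JAXP XPath factory; verify it against your jar.
    static final String XALAN_FACTORY = "org.apache.xpath.jaxp.XPathFactoryImpl";

    /** Returns the named factory when it can be loaded, otherwise the JDK default. */
    static XPathFactory choose() {
        try {
            return XPathFactory.newInstance(
                    XPathFactory.DEFAULT_OBJECT_MODEL_URI,
                    XALAN_FACTORY,
                    XPathFactoryChooser.class.getClassLoader());
        } catch (XPathFactoryConfigurationException e) {
            // The class is missing or unusable: fall back to the built-in implementation.
            return XPathFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        System.out.println("XPath factory in use: " + choose().getClass().getName());
    }
}
```

Printing the class name is a quick way to confirm which implementation you are really measuring before comparing timings.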

Tom Anderson · Jul 9, 2008

I've noticed that a lot of time in a few of my code-bases is spent in
XPath.evaluate or XPathExpression.evaluate. I've re-written part of it
to use direct DOM navigation where feasible, with a huge speed increase.
I was thinking there must be a faster implementation of XPath than the
default com.sun implementation. Does anyone have any experience with
this?

Nope. But i'm working on an XPath-heavy app, so let us know if you
discover anything!

tom

Daniel Pitts · Jul 9, 2008

Tom said:
Nope. But i'm working on an XPath-heavy app, so let us know if you
discover anything!

tom

Apparently this is a known issue with the way the Sun implementation
hides XPathContext, which (as I understand it) caches some important
information every time it's instantiated. XPath.evaluate and
XPathExpression.evaluate both instantiate a new XPathContext every time.
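To see the two entry points side by side, here is a rough timing sketch using only the public javax.xml.xpath API. The class name, sample document and round count are made up for illustration. Compiling once avoids re-parsing the expression string, but if the per-call context setup described above is the real cost, it may not help much; treat the numbers as a hint and confirm with a profiler.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathEntryPoints {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order><id>17</id></order>");
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression compiled = xpath.compile("/order/id");
        int rounds = 10000; // arbitrary; large enough to get past warm-up noise

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            xpath.evaluate("/order/id", doc);   // parses the expression on every call
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            compiled.evaluate(doc);             // reuses the compiled expression
        }
        long t2 = System.nanoTime();

        System.out.printf("XPath.evaluate:           %d ms%n", (t1 - t0) / 1000000);
        System.out.printf("XPathExpression.evaluate: %d ms%n", (t2 - t1) / 1000000);
    }
}
```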

A co-worker of mine sent these references:

<http://blog.astradele.com/computers...nce-problem-confirmed.2006-04-01-17-14.1024px>
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6344064>

This blog shows some options: <http://www.asciiarmor.com/post/33736618/java-xpath-1-0-engine-comparison-compliance>

For my use case:
My XPath expressions (as Strings) were all being injected by Spring into
one of my beans. XPath was just one option, so I had an interface
(FieldNormalizer) and a few concrete options, XPathFieldNormalizer for
example.

The "average" XPath expression in my app was in one of these forms:
./Tag or /Tag or Tag (get child element called Tag)
.//Tag or //Tag (get descendant element called Tag)
'StringLiteral'
concat('StringLiteral', ./@attribute) (don't ask.)

All of which are "easy" to do "quickly" using normal DOM operations.
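As a rough illustration of what "normal DOM operations" can look like for the child and descendant forms, here is a small sketch. DomLookup and its method names are invented for this example, and it assumes documents without namespaces.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomLookup {
    /** Like "./Tag": the first direct child element with the given name, or null. */
    static Element firstChild(Node context, String tag) {
        for (Node n = context.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Like ".//Tag": the first descendant element with the given name, or null. */
    static Element firstDescendant(Element context, String tag) {
        NodeList hits = context.getElementsByTagName(tag); // descendants in document order
        return hits.getLength() > 0 ? (Element) hits.item(0) : null;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Element root = parse("<r><a><Tag>deep</Tag></a><Tag>shallow</Tag></r>")
                .getDocumentElement();
        System.out.println(firstChild(root, "Tag").getTextContent());      // prints "shallow"
        System.out.println(firstDescendant(root, "Tag").getTextContent()); // prints "deep"
    }
}
```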

The fix (read "hack") I took was to write handlers for those special
cases.
I wrote a ChildElementNormalizer, DescendantElementNormalizer,
StringLiteralNormalizer, and StringLiteralPlusAttributeNormalizer.
I then replaced the XPathFieldNormalizer constructor with a factory
method that checks whether the expression matches one of the above
forms and, if so, uses the appropriate specialization.
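A minimal sketch of that kind of factory, for anyone who wants a starting point. This is not the actual code: the normalize(Node) signature, the FieldNormalizers holder class and the regular expressions are all assumptions, lambdas stand in for the four named classes, and only the relative forms get a fast path (absolute forms such as /Tag and //Tag are left to the XPath fallback here). It also assumes documents without namespaces.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Assumed shape of the interface; the post names it but does not show its method.
interface FieldNormalizer {
    String normalize(Node context) throws XPathExpressionException;
}

final class FieldNormalizers {
    private static final String NAME = "[A-Za-z_][\\w.-]*";
    private static final Pattern CHILD = Pattern.compile("(?:\\./)?(" + NAME + ")");
    private static final Pattern DESCENDANT = Pattern.compile("\\.//(" + NAME + ")");
    private static final Pattern LITERAL = Pattern.compile("'([^']*)'");
    private static final Pattern LITERAL_PLUS_ATTRIBUTE =
            Pattern.compile("concat\\(\\s*'([^']*)'\\s*,\\s*(?:\\./)?@(" + NAME + ")\\s*\\)");

    /** Picks a DOM fast path when the expression has a known shape, else compiles XPath. */
    static FieldNormalizer forExpression(String expression) throws XPathExpressionException {
        String e = expression.trim();
        Matcher m = CHILD.matcher(e);
        if (m.matches()) {                       // role of ChildElementNormalizer
            String tag = m.group(1);
            return ctx -> childText(ctx, tag);
        }
        m = DESCENDANT.matcher(e);
        if (m.matches()) {                       // role of DescendantElementNormalizer
            String tag = m.group(1);
            return ctx -> descendantText(ctx, tag);
        }
        m = LITERAL.matcher(e);
        if (m.matches()) {                       // role of StringLiteralNormalizer
            String literal = m.group(1);
            return ctx -> literal;
        }
        m = LITERAL_PLUS_ATTRIBUTE.matcher(e);
        if (m.matches()) {                       // role of StringLiteralPlusAttributeNormalizer
            String literal = m.group(1);
            String attribute = m.group(2);
            return ctx -> literal
                    + (ctx instanceof Element ? ((Element) ctx).getAttribute(attribute) : "");
        }
        // Anything else: general XPath, the role of XPathFieldNormalizer.
        XPathExpression compiled = XPathFactory.newInstance().newXPath().compile(e);
        return ctx -> compiled.evaluate(ctx);
    }

    /** Text of the first direct child element named tag, or "" (as XPath would return). */
    private static String childText(Node ctx, String tag) {
        for (Node n = ctx.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
                return n.getTextContent();
            }
        }
        return "";
    }

    /** Text of the first descendant element named tag in document order, or "". */
    private static String descendantText(Node ctx, String tag) {
        NodeList hits = ctx instanceof Document ? ((Document) ctx).getElementsByTagName(tag)
                : ctx instanceof Element ? ((Element) ctx).getElementsByTagName(tag)
                : null;
        return hits != null && hits.getLength() > 0 ? hits.item(0).getTextContent() : "";
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
                new InputSource(new StringReader("<item id='42'><Tag>x</Tag></item>")));
        Element item = doc.getDocumentElement();
        System.out.println(forExpression("./Tag").normalize(item));
        System.out.println(forExpression("concat('id-', ./@id)").normalize(item));
    }
}
```

Keeping the general XPath path as the fallback means an expression that does not fit a known shape still works; it just does not get the speedup.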

This gained my particular application about a tenfold improvement in request performance.

The moral is: it is important to analyze your common cases carefully
when optimizing. A profiler is a must for this.

Hope this helps some of you out there facing similar problems.

Tom Anderson · Jul 11, 2008

Apparently this is a known issue with the way the Sun implementation
hides XPathContext, which (as I understand it) caches some important
information every time it's instantiated. XPath.evaluate and
XPathExpression.evaluate both instantiate a new XPathContext every time.

Aha. I don't think my library is reusing XPathContexts - i should look
into this.

At the moment, we don't have a performance problem, so this would be
premature. However, there are plans afoot to use the code to do something
a lot more performance-critical, and so if XPath queries prove to be a
significant part of the workload, easy speedups would be good to know
about.

For my use case:
My XPath expressions (as Strings) were all being injected by Spring into one
of my beans. XPath was just one option, so I had an interface
(FieldNormalizer) and a few concrete options, XPathFieldNormalizer for
example.

The "average" XPath expression in my app was in one of these forms:
./Tag or /Tag or Tag (get child element called Tag)
.//Tag or //Tag (get descendant element called Tag)
'StringLiteral'
concat('StringLiteral', ./@attribute) (don't ask.)

All of which are "easy" to do "quickly" using normal DOM operations.

The fix (read "hack") I took was to write handlers for those special cases.
I wrote a ChildElementNormalizer, DescendantElementNormalizer,
StringLiteralNormalizer, and StringLiteralPlusAttributeNormalizer.
I then replaced the XPathFieldNormalizer constructor with a factory method
that checks whether the expression matches one of the above forms and, if
so, uses the appropriate specialization.

This gained my particular application about a tenfold improvement in request performance.

Wow. Our case is similar - almost all of the expressions look like:

//markertag/chain/of/child/tags

Which could be implemented very quickly indeed with specialised code like
yours.
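For expressions of that shape, a hand-written version might look roughly like the sketch below: find every marker element, then follow the chain one child step at a time. MarkerChain and select are hypothetical names, and the sketch assumes documents without namespaces. If marker elements can be nested inside each other, the result order may differ from XPath's document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MarkerChain {
    /** Elements matched by //marker/chain[0]/chain[1]/... using plain DOM calls. */
    static List<Element> select(Document doc, String marker, String... chain) {
        List<Element> current = new ArrayList<>();
        NodeList markers = doc.getElementsByTagName(marker); // the "//marker" part
        for (int i = 0; i < markers.getLength(); i++) {
            current.add((Element) markers.item(i));
        }
        for (String step : chain) {                          // one "/step" at a time
            List<Element> next = new ArrayList<>();
            for (Element parent : current) {
                for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
                    if (n.getNodeType() == Node.ELEMENT_NODE && step.equals(n.getNodeName())) {
                        next.add((Element) n);
                    }
                }
            }
            current = next;
        }
        return current;
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><markertag><chain><of>one</of></chain></markertag>"
                + "<x><markertag><chain><of>two</of><other/></chain></markertag></x></root>");
        for (Element e : select(doc, "markertag", "chain", "of")) {
            System.out.println(e.getTextContent());
        }
    }
}
```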

I imagine you could implement XPath expressions by code generation, as is
done for reflection. That could potentially be as fast as custom code,
while retaining all the flexibility of XPath. It would have monstrous
overhead, but there are many situations where that wouldn't matter.

tom
