python/xpath question...

bruce · Feb 16, 2009

hi...

using libxml2dom as the xpath lib

i've got a situation where i can have:
foo=a.xpath("/html/body/table[2]/tr[45]/td")
and i get 11 as the number of returned td elements for the 45th row...

this is as it should be.

however, if i do:
foo=a.xpath("/html/body/table[2]/tr")

and then try to iterate through to the 45th "tr", and try to get the number
of "td" elements, i can't seem to work out the additional xpath that has to
be used.

i've tried a number of the following with no luck...
l1 = libxml2dom.toString(tmp_[0])
print "l1 = "+l1+"\n"

ldx = 0
for l in tmp_:
    print "ld ="+str(ldx)
    if ldx==45:
        #needs to be a better way...
        #l1 = libxml2dom.toString(tmp_[0])
        l1 = libxml2dom.toString(l)
        #print "1111 = ",l1

        q1 = libxml2dom
        b1 = q1.parseString(l1, html=1)
        #dd1 = b1.xpath("//td[not(@width)]")
        #data = b1.xpath("//td/font")
        #data = b1.xpath("//td[@valign='top'][not(@width)]")
        #data = b1.xpath("//child::td[position()>0][@valign='top'][not(@width)]")
        #data = b1.xpath("//td/parent::*/td[@valign='top'][not(@width)]")
        #data = b1.xpath("//td[position()]")
        #data = b1.xpath("//parent::tr[position()=1]/td")
        data = b1.xpath("//td[@valign='top'][not(@width)]")

it appears that i somehow need to get the "td" elements that are direct
children of the parent "tr"...
it looks like using "//td..." gets all the underlying "td" descendants, as
opposed to only the direct 1st-level children... any thoughts/pointers would
be appreciated...

thanks...

Diez B. Roggisch · Feb 16, 2009

- you don't give enough information, as you don't provide the html
- the above code is obviously not the one running, as I can't see
anything that's increasing your running variable ldx
- using l as a variable name is extremely confusing, because it's hard
to distinguish from 1 (the number). Using l1 is even worse.
- xpath counts from 1, whereas python is 0-based, as is your
code. So you most probably have an off-by-one error.
- you should read an xpath tutorial, as the purpose of "//td" is to fetch
*all* td elements below the document root, at any depth, as is clearly
stated here: http://www.w3.org/TR/xpath#path-abbrev. So it's no wonder you
get more than you expect. Direct child nodes are found by simply omitting
the axis specifier: a relative step such as "td", evaluated on the tr node,
uses the default child axis.

Diez
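
To make the last two points concrete, here is a minimal sketch of the difference between a relative child step ("td") and a descendant search (".//td"), and of the 1-based/0-based mismatch. It is an illustration under assumptions, not the original poster's code: it uses Python 3 and the standard library's xml.etree.ElementTree (which supports only a subset of XPath) instead of libxml2dom, and the sample markup and the count_cells helper are made up for the example. With libxml2dom the same idea would presumably be to call xpath("td") on each row node rather than re-parsing the row and searching with "//td".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: the second row contains a nested table,
# which is what makes a descendant search return "too many" td elements.
SAMPLE = """
<table>
  <tr><td>a</td><td>b</td></tr>
  <tr>
    <td>c</td>
    <td><table><tr><td>nested 1</td><td>nested 2</td></tr></table></td>
  </tr>
</table>
"""

table = ET.fromstring(SAMPLE)

# Relative step "tr": only the rows that are direct children of the outer table.
rows = table.findall("tr")


def count_cells(row):
    """Return (direct td children, td descendants at any depth) for a row."""
    direct = row.findall("td")        # child axis: direct children only
    descendants = row.findall(".//td")  # descendant search below this row
    return len(direct), len(descendants)


# XPath positions start at 1, Python indexes start at 0,
# so XPath's tr[2] corresponds to rows[1].
for position, row in enumerate(rows, start=1):
    direct, descendants = count_cells(row)
    print("row %d: %d direct td, %d td at any depth" % (position, direct, descendants))
```

The second row reports more td elements for the descendant search than for the child step because of the nested table. That is the same effect as running "//td" on a re-parsed row.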
