Nokogiri parsing question

Kyle X.

Hello, I have been having trouble converting some REXML code to
Nokogiri, and from my reading online I cannot find the proper way to
write it using Nokogiri. Here are the two examples I am having trouble
with:

Case 1)
<IfcWallStandardCase id="i17855">
<Representation>
<IfcProductDefinitionShape id="i17925">
<Representations id="i17928" exp:cType="list">
<IfcShapeRepresentation exp:pos="0" xsi:nil="true"
ref="i17886"/>
<IfcShapeRepresentation exp:pos="1" xsi:nil="true"
ref="i17919"/>
</Representations>
</IfcProductDefinitionShape>
</Representation>
</IfcWallStandardCase>

I am trying to get the ref of the element with exp:pos="1". I had this
working in REXML with the following:

XPath.match( $doc, "//IfcWallStandardCase//*[@pos='1']" )

With Nokogiri I can get it to read both pos 0 and pos 1, using .css and
.xpath:

$doc_noko.css("uosNS|IfcWallStandardCase uosNS|IfcShapeRepresentation",
{"uosNS" => $http})
and
$doc_noko.xpath("//uosNS:IfcWallStandardCase//uosNS:IfcShapeRepresentation",
{"uosNS" => $http})

But I cannot figure out how to get it to read only pos=1 using either
method; I keep getting an error or nil.


Case 2)
<IfcDirection id="i1574">
<DirectionRatios id="i1577" exp:cType="list">
<exp:double-wrapper pos="0">1.</exp:double-wrapper>
<exp:double-wrapper pos="1">0.</exp:double-wrapper>
<exp:double-wrapper pos="2">0.</exp:double-wrapper>
</DirectionRatios>
</IfcDirection>

The issue here is that I am reading this with Nokogiri using .xpath, and
the colon in exp:double-wrapper is giving me trouble. The XPath is
written as:

ref = "i1574"
$doc_noko.xpath("//uosNS:*[@id='#{ref}']//uosNS:exp:double-wrapper",
{"uosNS" => $http}).map {|element| element.text}

I am guessing that it would be easier to use .css here rather than
.xpath, so I have tried that but cannot seem to get it right.
Trying:

ref = "i1574"
$doc_noko.css("uosNS|#{ref} uosNS|exp:double-wrapper", {"uosNS" =>
$http}).map {|element| element.text}

I read in a previous post that you insert the ref using #{} for css,
but this returns nil for me.

Any ideas?
 
Robert Klemme

Case 1: the XPath seems to work in Nokogiri as well.

When I define a namespace mapping and remove the namespace prefix from
the wildcard it works as expected:

irb(main):042:0> doc=Nokogiri.XML(<<XML)
irb(main):043:1" <IfcDirection xmlns:exp="foo" id="i1574">
irb(main):044:1" <DirectionRatios id="i1577" exp:cType="list">
irb(main):045:1" <exp:double-wrapper pos="0">1.</exp:double-wrapper>
irb(main):046:1" <exp:double-wrapper pos="1">0.</exp:double-wrapper>
irb(main):047:1" <exp:double-wrapper pos="2">0.</exp:double-wrapper>
irb(main):048:1" </DirectionRatios>
irb(main):049:1" </IfcDirection>
irb(main):050:1" XML

Note: "exp" is mapped to "foo".

irb(main):052:0> ref
=> "i1574"

irb(main):054:0> puts doc.xpath("//*[@id='#{ref}']//ns1:double-wrapper", {"ns1" => "foo"})
<exp:double-wrapper pos="0">1.</exp:double-wrapper>
<exp:double-wrapper pos="1">0.</exp:double-wrapper>
<exp:double-wrapper pos="2">0.</exp:double-wrapper>
=> nil

With ns prefix:

irb(main):055:0> puts doc.xpath("//ns1:*[@id='#{ref}']//ns1:double-wrapper", {"ns1" => "foo"})
=> nil

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
Kyle X.

Thank you for the reply, Robert; however, I still cannot get it to work.
I left out a little bit of the XML when I wrote the first post. Both cases
already have a namespace defined for them.


Case 1) The two outer elements below (doc:iso_10303_28 and uos) wrap
both cases. The actual file reads:
<doc:iso_10303_28 xmlns:exp="urn:oid:1.0.10303.28.2.1.1"
xmlns:doc="urn:oid:1.0.10303.28.2.1.3"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:oid:1.0.10303.28.2.1.1 ex.xsd" version="2.0">
<uos id="uos_1" description="" configuration="i-ifc2x3" edo=""
xmlns="http://www.iai-tech.org/ifcXML/IFC2x3/FINAL"
xsi:schemaLocation="http://www.iai-tech.org/ifcXML/IFC2x3/FINAL
ifc2x3.xsd">
<IfcWallStandardCase id="i17855">
<Representation>
<IfcProductDefinitionShape id="i17925">
<Representations id="i17928" exp:cType="list">
<IfcShapeRepresentation exp:pos="0" xsi:nil="true"
ref="i17886"/>
<IfcShapeRepresentation exp:pos="1" xsi:nil="true"
ref="i17919"/>
</Representations>
</IfcProductDefinitionShape>
</Representation>
</IfcWallStandardCase>
</uos>
</doc:iso_10303_28>

The exact code I have been trying to run is:

$http = "http://www.iai-tech.org/ifcXML/IFC2x3/FINAL"
ref = "i17855"
$doc_noko = Nokogiri::XML(File.read(filename))

x = $doc_noko.xpath("//uosNS:*[@id='#{ref}']//uosNS:*[@pos='1']",
{"uosNS" => $http})
#=> nil

The "//uosNS:*[@id='#{ref}']" part matches, but adding
"//uosNS:*[@pos='1']" is where it stops working as intended and returns
nil. With these two extra lines in mind, any ideas for the two cases?
 
Jesús Gabriel y Galán

The problem is that the attribute pos is in a different namespace (exp).
I have tried a couple of ways but could not make it work.
 
Robert Klemme

As I said: get rid of the namespace prefix on the wildcards. Instead,
of course, you have to add the namespace for the attribute. What you
call the prefix in the XPath expression doesn't really matter as long
as the _mapping_ is identical. An XML namespace prefix is just a
convenient shorthand for the mapped URI, so you need to make sure the
XPath expression uses the same URI.

#!/bin/env ruby19

require 'nokogiri'

doc = Nokogiri.XML(DATA)

ref = "i17855"
puts doc.xpath("//*[@id='#{ref}']//*[@uosNS:pos='1']",
{"uosNS" => "urn:oid:1.0.10303.28.2.1.1"})


__END__
<doc:iso_10303_28 xmlns:exp="urn:oid:1.0.10303.28.2.1.1"
xmlns:doc="urn:oid:1.0.10303.28.2.1.3"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:oid:1.0.10303.28.2.1.1 ex.xsd"
version="2.0">
<uos id="uos_1" description="" configuration="i-ifc2x3" edo=""
xmlns="http://www.iai-tech.org/ifcXML/IFC2x3/FINAL"
xsi:schemaLocation="http://www.iai-tech.org/ifcXML/IFC2x3/FINAL ifc2x3.xsd">
<IfcWallStandardCase id="i17855">
<Representation>
<IfcProductDefinitionShape id="i17925">
<Representations id="i17928" exp:cType="list">
<IfcShapeRepresentation exp:pos="0" xsi:nil="true" ref="i17886"/>
<IfcShapeRepresentation exp:pos="1" xsi:nil="true" ref="i17919"/>
</Representations>
</IfcProductDefinitionShape>
</Representation>
</IfcWallStandardCase>
</uos>
</doc:iso_10303_28>

Cheers

robert


 
