patrik.nyman
I am marking up the text of old books and need to be
able to present the result page by page.
The problem is that a page break sometimes occurs in the
middle of a paragraph (or in some other element).
See the following example.
<p>I shall not describe it to you, for in-
<lb/>deed I cannot. To delineate the truly aw-
<lb/>ful locality of Trollhättan, would
<lb/>baffle the powers of poetic fancy, and mock
<pb n="15" urn="urn:nbn:se:kb:digark-7886"/>
<lb/>the painter's daring pencil. I ran only af-
<lb/>ford you a faint idea of its characteristic
<lb/>features, and even that will he found
<lb/>arduous. Come, and see it, and you will
<lb/>applaud my modesty.
</p>
<p>[...]
<lb/>of gold." Subscribing to the old Swedish
<lb/>proverb: When it rains down milk, the poor
<lb/>has no spoon," I silently dropped the theme,
<lb/>and would not have rementioned it now,
<pb n="16" urn="urn:nbn:se:kb:digark-7887"/>
<lb/>if I were not anxious to dis-play to you, what
<lb/>an able minister of state I might possibly
<lb/>be, if His Majesty should be pleased to
<lb/>invest me with that honor, which, you
<lb/>know, is as distant from me as the mitre
<lb/>and the slipper of the Pope of Rome.
</p>
Simply cutting out the material between the <pb/>s
yields XML that is not well-formed.
So, is it possible to write an XQuery expression that
can fix this, i.e. 'detect' that a <pb/> occurs in
the middle of another element and take the appropriate
action? The result for page 15 would have to look something like
<pb n="15" urn="urn:nbn:se:kb:digark-7886"/>
<p rend="noindent">the painter's daring pencil. I ran only af-
<lb/>ford you a faint idea of its characteristic
<lb/>features, and even that will he found
<lb/>arduous. Come, and see it, and you will
<lb/>applaud my modesty.
</p>
<p>[...]
<lb/>of gold." Subscribing to the old Swedish
<lb/>proverb: When it rains down milk, the poor
<lb/>has no spoon," I silently dropped the theme,
<lb/>and would not have rementioned it now,
</p>
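The behaviour being asked for can be pinned down with a rough sketch, written here in Python with xml.dom.minidom rather than XQuery, purely to make the splitting rule concrete. The function name extract_page, the <page> wrapper element, and the choice to put rend="noindent" on every element that continues from the previous page are assumptions for illustration, not part of the markup above.

```python
from xml.dom import minidom


def extract_page(xml_text, page_no):
    """Return one page as a well-formed <page> fragment (hypothetical wrapper)."""
    doc = minidom.parseString(xml_text)
    # "inside" is True while walking nodes that belong to the wanted page.
    state = {"inside": False, "pb": None}

    def copy(node):
        # Text, comments, etc.: keep only when they lie on the wanted page.
        if node.nodeType != node.ELEMENT_NODE:
            return node.cloneNode(False) if state["inside"] else None
        # A <pb/> switches pages; it is hoisted out rather than copied in place.
        if node.tagName == "pb":
            state["inside"] = node.getAttribute("n") == page_no
            if state["inside"]:
                state["pb"] = node.cloneNode(False)
            return None
        # Empty elements such as <lb/> are kept as-is when on the page.
        if not node.hasChildNodes():
            return node.cloneNode(False) if state["inside"] else None
        opened_on_page = state["inside"]
        kept = [c for c in map(copy, node.childNodes) if c is not None]
        if not kept:
            return None
        clone = node.cloneNode(False)  # shallow copy: tag and attributes only
        if not opened_on_page:
            # Assumption: an element continued from the previous page is marked.
            clone.setAttribute("rend", "noindent")
        for child in kept:
            clone.appendChild(child)
        return clone

    page = doc.createElement("page")
    parts = [c for c in map(copy, doc.documentElement.childNodes) if c is not None]
    if state["pb"] is not None:
        page.appendChild(state["pb"])
    for part in parts:
        page.appendChild(part)
    return page.toxml()


SAMPLE = ('<text><p>alpha<pb n="15"/>beta<lb/>gamma</p>'
          '<p>delta<pb n="16"/>epsilon</p></text>')
print(extract_page(SAMPLE, "15"))
# <page><pb n="15"/><p rend="noindent">beta<lb/>gamma</p><p>delta</p></page>
```

The walk visits nodes in document order, so an element that straddles a <pb/> is simply copied twice, once per page, each time with only the children that fall on that page.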
Thanks.