DTD elements definition question


es@d

Hello there,

I'm trying to build what is basically a screen-scraper program that
takes a URL as input and produces an XML file as output. I wanted to
introduce something like a "document definition" for the source URL,
i.e.

<document id="some_news_site_without_rss"
url="http://www.example.com/news.html">
<news repeat="true">
<article>
<title begin="somehtml" end="somehtml"/>
</article>

</news>
</document>

would produce something like

<document>
<news>
<article>
<title>Some title 1</title>
</article>
<article>
<title>Some title 2</title>
</article>
<article>
<title>Some title 3</title>
</article>
</news>
</document>

I hope you get the idea.

My problem is that I've tried to describe this "definition language"
using a DTD, but as far as I can see a DTD doesn't support specifying
something like "I want to have one fixed parent element, document; all
the other elements are user-specified (unspecified), but they have to
be closed and have the following attributes...".

I'm not that deep into XML/SGML, so maybe I'm just missing something
basic.

Thanks,

Esad Hajdarevic
 

Martin Honnen

That is correct: if you validate against a DTD, you need to declare all
elements and attributes, otherwise validation will fail.
With a W3C XML Schema you can declare some elements and let the others
be skipped during validation by using a wildcard (xs:any with
processContents="skip" or "lax"), see
http://www.w3.org/TR/xmlschema-0/
http://www.w3.org/TR/xmlschema-0/#any
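
As a rough sketch of that wildcard approach (not a complete solution), a schema could fix the document element and its id and url attributes while accepting any well-formed children. The attribute types xs:string and xs:anyURI and the choice to make both attributes required are assumptions for illustration, not something taken from the original definition format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The one fixed parent element -->
  <xs:element name="document">
    <xs:complexType>
      <xs:sequence>
        <!-- Any number of arbitrary child elements; "skip" means they
             (and their descendants) only have to be well-formed -->
        <xs:any processContents="skip"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Assumed types and "required" usage, for illustration only -->
      <xs:attribute name="id" type="xs:string" use="required"/>
      <xs:attribute name="url" type="xs:anyURI" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that a skipped element is not checked at all, so this does not by itself enforce attributes such as begin and end on the user-named elements; that part would have to be checked some other way, for example in the scraper itself. The "have to be closed" requirement is already covered by XML well-formedness, before any validation happens.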
 
