System Literal

Mike Reed

Hi,

I'm writing my own validating XML parser in C++ (it seemed like a good way to
understand the specification!) and I'm a bit stuck on the SystemLiteral. The
"rules" say it can be any valid character but the text also gives a "Definition"
saying it is a URI conforming to RFC2396 and RFC2732. So what should a
validating parser do if a SystemLiteral does not conform to these RFCs?

Specifically, what should a validating parser do if

1. A SystemLiteral is an invalid URI
2. It cannot access the resource given by the URI (which it needs to do to
validate the document).

And more generally, what should a validating parser do if a document does not
break any WFC or VC but does not agree with a "Definition"?

Mike.
 
Richard Tobin

> The "rules" say it can be any valid character but the text also gives a
> "Definition" saying it is a URI conforming to RFC2396 and RFC2732.

First, make sure that you are taking account of all the errata. There
have been some amendments to the description of system identifiers.

Basically, system identifiers are "IRIs", though there is no standard
for that yet. That is, they are strings that become legal URIs after
certain characters are escaped.
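
To make the escaping step concrete, here is a minimal C++ sketch. It assumes the system identifier is already held as a UTF-8 byte string, and the set of characters to escape follows my reading of the amended text (control characters, space, the delimiters < > ", the characters { } | \ ^ `, and everything above #x7F), so check it against the current errata before relying on it. The names needs_escape and escape_system_id are hypothetical, not taken from any library.

```cpp
#include <string>

// Hypothetical helper: true for bytes that should be written as %HH.
// Assumed set: controls, space, DEL, non-ASCII bytes, and < > " { } | \ ^ `
bool needs_escape(unsigned char c) {
    if (c <= 0x20 || c >= 0x7F) return true;
    switch (c) {
        case '<': case '>': case '"':
        case '{': case '}': case '|':
        case '\\': case '^': case '`':
            return true;
        default:
            return false;
    }
}

// Escapes a UTF-8 encoded system identifier byte by byte.
// '%' is deliberately left alone so existing %HH escapes survive.
std::string escape_system_id(const std::string& utf8) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8.size());
    for (std::string::size_type i = 0; i < utf8.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(utf8[i]);
        if (needs_escape(c)) {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

With this sketch, "a b.dtd" becomes "a%20b.dtd", and a non-ASCII character is escaped one UTF-8 byte at a time. Whether the result is a usable URI is then left to whatever retrieves it.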

The natural approach is to defer most checking to the URI retrieval
level, and treat errors as resource errors (and probably abort) rather
than validity or well-formedness errors.
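
One possible way to keep that distinction in a parser is sketched below. It assumes the parser delegates retrieval to a caller-supplied function and reports any failure as a separate "resource" error kind. ErrorKind, ParseError, Fetcher, LoadResult and load_external_entity are all hypothetical names made up for this illustration.

```cpp
#include <functional>
#include <string>

// Hypothetical error categories; Resource is kept apart from the
// two categories the XML specification itself defines.
enum class ErrorKind { WellFormedness, Validity, Resource };

struct ParseError {
    ErrorKind kind;
    std::string message;
};

// Hypothetical retrieval hook: fills 'out' and returns true on success,
// returns false if the URI is malformed or the resource is unreachable.
using Fetcher = std::function<bool(const std::string& uri, std::string& out)>;

struct LoadResult {
    bool ok;
    std::string text;   // entity text, meaningful only when ok is true
    ParseError error;   // meaningful only when ok is false
};

// The parser does no URI syntax checking of its own here; any failure
// from the retrieval layer is reported as a resource error.
LoadResult load_external_entity(const std::string& system_id,
                                const Fetcher& fetch) {
    LoadResult r{};
    r.ok = fetch(system_id, r.text);
    if (!r.ok) {
        r.error = {ErrorKind::Resource,
                   "cannot retrieve '" + system_id + "'"};
    }
    return r;
}
```

A caller could then decide to abort on ErrorKind::Resource without ever claiming that the document is invalid or not well-formed.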

-- Richard
 
