Subject: Re: Scripting XML?
From: Andy Dingley <[email protected]>
Newsgroups: comp.text.xml
Without agreement on what "powerful" means, there's no way
to answer this question.
XSLT 2 and EXSLT are also worth looking at. XSLT is based on XML,
where the contents of nodes (text nodes or attributes) are largely
opaque. This is a major limitation in practice, and so these are
efforts to improve matters.
On the whole, I'd assert that XSL was "more useful" than Java "in its
abilities to transform XML", but I wouldn't claim this was "more
powerful".
Yes, you are right. My mistake in not clarifying that.
Well, simply put,
Can I take an XML parser, an XML document, a programming language such as
Java and combine them in the following manner for example:
XML Document => parsed by parser --> in-memory representation of
document => manipulated by application --> output in different form
So, if there were commands in some "embedded language" (or variable
value substitutions, variable declarations, function definitions,
function declarations etc.) in the document (in attribute values, element
data etc), the application can walk through the in-memory representation
of the document and act as a recognizer for the embedded language. Upon
completion the final document is free of all the "embedded language"
constructs.
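
As a rough illustration of the parse / walk / substitute / output pipeline described above, here is a minimal sketch in Python (chosen only for brevity; the same shape works in Java with a DOM parser). The `${name}` marker syntax, the `expand` and `process` helpers, and the sample document are all assumptions made up for this example, not part of any existing tool.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical embedded-language syntax: ${name} marks a variable reference.
VAR = re.compile(r"\$\{(\w+)\}")

def expand(value, env):
    """Replace every ${name} in value with env[name]; plain text is unchanged."""
    if value is None:
        return None
    return VAR.sub(lambda m: env[m.group(1)], value)

def process(xml_text, env):
    """Parse, walk the in-memory tree, expand constructs, and serialize."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.text = expand(elem.text, env)
        elem.tail = expand(elem.tail, env)
        # Copy the items first so the attribute map is not iterated while updated.
        for name, value in list(elem.attrib.items()):
            elem.set(name, expand(value, env))
    return ET.tostring(root, encoding="unicode")

doc = '<job host="${host}"><step>ping ${host}</step></job>'
print(process(doc, {"host": "example.org"}))
# prints: <job host="example.org"><step>ping example.org</step></job>
```

The output no longer contains any `${...}` constructs, which is the "free of the embedded language" property mentioned above. An undefined variable raises `KeyError` in this sketch; a real recognizer would need a policy for that case.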
Note that in both cases the document itself is valid, not merely
well-formed, because the embedded language constructs have no
meaning to a pure XML parser.
I can do this with the above combination certainly!
The question is: can I do the same thing with XSLT alone? That boils
down to asking whether I can represent the same actions in XSLT, or
whether I can contrive an equivalence between, say, Java and XSLT in
their ability to do things with an XML document.
I have no idea how you draw that conclusion from the statement listed,
so I don't really know what you're getting at.
XML has a number of issues (re encoding and whitespace) which are seen
as freely interchangeable. As such, a transport protocol can change
this without consequence. Whether this is regarded as "static" depends
on the context.
There are also many cases (e.g. serving XML content to an HTML-only
browser) where XML may be transformed on being served. This isn't
really a mere "transport" though.
So XML documents transported over networks are static. But I'm sure
this wasn't what you meant, because your whole project seems to be
based on breaking this.
Well, what I meant is, if we consider the network as a black box and the
server and client on each side, then what is input into the box at the
server side is the same when it comes out at the client side of the box.
In between yes, the transport protocol may change things.
Specifically, an example of the client is "the parser on the browser"
not the rendering engine or the end-user.
Transformations such as applying a stylesheet are applied after the
client has received the document, so what I am really getting at is
"before transformations on the client side are applied" and "after
processing at the server side is complete".
Naturally, when you consider stand-alone documents used by an
application, the server really does nothing, but the client (the parser)
does a lot of work before the document is passed on to the final
application (the end-user?).
This last question is relevant because we faced this problem when
using XML to describe a sequence of actions.
I had a brief read of your document. To be frank, I found it a very
hard read - it needs an introduction to it, which it painfully doesn't
have at present. There's no distinction drawn between your monitoring
project, and your XML-MS concept.
I see. It could be that I was explaining it from our perspective and
problems in the project so things got intermixed. I will try to separate
things out and clarify them.
If your project boils down to "Add in-transit processing of scriptlets
to XML", then that's a worthy idea (although it's already out there).
Well, a general-purpose scripting language can be used for anything,
including in-transit processing (I presume you mean things like
in-network processing?), primarily because the language itself doesn't
impose any restrictions on what can be done with the document and its
content.
Yes, the primary point here is, as you said above, the "non-opaqueness"
of the content (such as attribute values or element data). Simply
put, it is the same as the C preprocessor walking through C modules and
substituting macros, except that the "pre-processor" in our problem is
more powerful than mere macro expansion.
What I want to know about it is how the processing model works (who
processes it, and how does it decide what to process), what's
available as a coding platform to use, and what's the presented
interface for the document that's being processed. I don't care about
in-line code fragments, because quite honestly if I have to learn
anything new to fulfill that particular role, this isn't a good
solution for me.
Well here is how it could work:
- XML-Document A (in Schema S) is served by a server (which may be a
  web server, or just a file reader library)
- XML-Document A is parsed by XML Parser P into an in-memory
  representation M
- Depending on the type of application (stand-alone, web browser etc.)
  + P invokes I (an interpreter for the embedded scripting language L)
    on the value of each attribute and the data of each element.
    or
  + P invokes I on only certain elements/attributes, on demand by the
    stand-alone application
    # A stand-alone application would be something that I would code for
      some specific project.
    # A general application would be something like a web browser which
      does not expose the presence of L constructs in a particular
      document to an end-user, but does allow the end-user to tell it at
      a very general level what to do when such constructs are found.
      This is similar to you or me having control over whether IE or
      Firefox executes Java applets or Javascript functions.
  + I recognizes certain constructs in the data that is passed to it.
    So, in effect, I plays two roles:
    * A pattern recognizer
    * A language interpreter
      # Note that the language is as powerful as any conventional
        programming language.
      # The patterns are relatively simple and serve only to mark out
        regions of the data that should be considered as constructs in L
    * On recognition of the pattern, I performs the tasks denoted by
      the pattern:
      @ This task may be as simple as requesting macro substitution
      @ or it may be as complex as a nested function call
  + When I finishes, the result is an XML document that is free of L in
    all respects and conforms to a schema S'.
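
To make the two roles of I a little more concrete, the outline above could be sketched roughly as follows. Everything here is an assumption for illustration: the `{{ ... }}` marker, the tiny expression grammar (a name, an integer, or a possibly nested call), the `Interpreter` class, and the `only` parameter that stands in for the "on demand" mode. A real L would be a full language; this only shows where the recognizer ends and the interpreter begins.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical marker for L constructs: {{ expression }}.
PATTERN = re.compile(r"\{\{(.*?)\}\}")
# Tokens of the toy expression language: names, integers, "(", ")", ",".
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[(),])")

class Interpreter:
    """Plays both roles of I: pattern recognizer and language interpreter."""

    def __init__(self, variables, functions):
        self.variables = variables
        self.functions = functions

    def run(self, data):
        """Recognizer: find marked regions and replace each with its value."""
        if data is None:
            return None
        return PATTERN.sub(lambda m: str(self.evaluate(m.group(1))), data)

    def evaluate(self, source):
        """Interpreter: evaluate a name, an integer, or a (nested) call."""
        value, rest = self._expr(TOKEN.findall(source))
        if rest:
            raise SyntaxError("unexpected input: %r" % rest)
        return value

    def _expr(self, tokens):
        head, rest = tokens[0], tokens[1:]
        if head.isdigit():
            return int(head), rest
        if rest and rest[0] == "(":          # function call, arguments may nest
            args, rest = [], rest[1:]
            while rest[0] != ")":
                arg, rest = self._expr(rest)
                args.append(arg)
                if rest[0] == ",":
                    rest = rest[1:]
            return self.functions[head](*args), rest[1:]
        return self.variables[head], rest    # simple macro-style substitution

def process(xml_text, interpreter, only=None):
    """Apply I to every element, or only to the tags listed in `only`."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if only is not None and elem.tag not in only:
            continue
        elem.text = interpreter.run(elem.text)
        for name, value in list(elem.attrib.items()):
            elem.set(name, interpreter.run(value))
    return ET.tostring(root, encoding="unicode")

interp = Interpreter(
    variables={"base": 10},
    functions={"add": lambda a, b: a + b, "mul": lambda a, b: a * b},
)
doc = '<plan limit="{{ base }}"><wait>{{ add(mul(base, 2), 5) }}</wait></plan>'
print(process(doc, interp))
# prints: <plan limit="10"><wait>25</wait></plan>
```

Calling `process(doc, interp, only={"wait"})` would leave the `limit` attribute untouched, which corresponds to the stand-alone application asking for I on selected elements only. The sketch does no error recovery: malformed expressions simply raise an exception.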
This model does actually allow one to do in-transit processing
because all the necessary information to do such processing is available
in the document itself and the "patterns" can be extended to signal
which patterns are to be invoked and where. (That is a very good point,
thanks for pointing it out; I will look this up).
There's also a strong sense that you're ignorant of Schematron, JSP,
Cocoon, even XSLT, and many of the other "pre-invented wheels" that
are already out there.
Definitely true. I had no idea about Schematron and Cocoon (and no
great experience with XSLT) before you pointed them out. JSP I have
heard of. So, yes, I will go and investigate them as well!
Thanks again for your comments, but I hope I can impose upon you to
comment some more?
regards,
-vijai.