Proposal: Genji -- XML Authorship System

Jeff Rubard

I'm just getting interested in XML, although I have some background; and
I'd like to run an idea by people to get a sense of how feasible it is.
I am quite taken with the DocBook format, and find as I am putting
pre-written material into XMLmind as part of a book project that good
ideas for emendation of the text rather naturally arise as a result of
the structure "imposed" by the format. But as DocBook is rather
obviously not everything, it occurred to me that this phenomenon
("felicitous compositionality") might be amenable to extensions in other
directions. So I have a proposal for something like a next-generation
Emacs, or an anti-Publicon: an electronic authorship system, rather than
a publishing system.

Although we should not be too quick to link the unfortunate cast of
Publicon's name to the misbegotten character of Mr. Wolfram's ideas
about the all-encompassing importance of finite automata (let them be
called "cellular" and you've taken the bait), it is my feeling that the
W3 vision for XML and RDF as permitting a "Semantic Web" is also pitched
at too low a level of abstraction and is liable to lead merely to even
higher levels of "datamation" than already exist in our
identity-thievin', consent-manufacturin', spirit-scornin' world.
Consequently, the proposed program would not be an XML editor proper,
but would involve parts designed to allow content to be what it needs to be:
informative, rather than merely manipulable, and this either by being
*timely* (what you need to know when you need to know it) or *singular*
(ain't that somethin').

Would SGML play a role in this? Yes, but as an idea. One of the chief
differences between XML and SGML is that XML imposes a "deterministic
content model" (also an unfortunate turn of phrase) on documents,
whereas such documents are marked in SGML as "unambiguous" in typing.
But the question here is not quite that of static/dynamic typing in
programming languages, but one of whether the kind of content human
beings produce (text, images, sounds - even the CDs in our home
collection, according to the chipper W3 RDF page) truly permits of such
unambiguous characterization. The XML answer seems to be "well, they
have to", and I suppose this is true but it's still not for all that a
very good reason; furthermore, as a writer I find that I often do not
know exactly how I want to "type" statements; and this is less of a cute
joke than it appears, as typing shows up in natural language through
categorial grammar.
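
To make "deterministic" a little more concrete: the XML 1.0 specification's
non-normative appendix on deterministic content models gives ((b, c) | (b, d))
as a model a parser cannot follow without looking ahead, and (b, (c | d)) as
the factored equivalent. Below is a minimal sketch of that one check, assuming
a content model is just a choice between plain sequences of required element
names. The function name and the tuple representation are my own illustration,
not any parser's API, and the real rule covers much more (optional and
repeated particles, nesting).

```python
def is_deterministic_choice(alternatives):
    """Simplified check: no two alternatives of a choice group may begin
    with the same element name, since the parser could not pick a branch
    on seeing that name without lookahead."""
    firsts = [alt[0] for alt in alternatives]
    return len(firsts) == len(set(firsts))


# ((b, c) | (b, d)): both branches start with "b".
ambiguous = [("b", "c"), ("b", "d")]

# (b, (c | d)): after the shared "b", the remaining choice is between c and d.
factored_tail = [("c",), ("d",)]

print(is_deterministic_choice(ambiguous))      # False
print(is_deterministic_choice(factored_tail))  # True
```

The writer's complaint is visible even here: the factored form demands that I
decide up front what the two branches have in common.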

So it seems to me that there is room for something like an "XML
preprocessor" which takes the "putative SGML" of creative output and
renders it, not as XML, but as an XML Schema -- which itself would
permit several different renderings as "camera-ready" content. Such a
program would resemble XMLmind/XMLspy/XMLalibi/XMLshooflypie in having
both simple "stylistic" content editors (although slightly less
primitive text editing - the equivalent of gvim with spellcheck would be
good enough - and greatly enhanced interfaces with image and sound
editing would be necessary) and tools for "placing" content in the XML
firmament.
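
As a very rough sketch of the "preprocessor" direction, here is one way a tool
might read a draft and report, for each element, which child elements were
actually observed, which is the raw material a schema could later be written
from. Several assumptions are mine: the draft is stood in for by well-formed
XML (Python's standard library has no SGML parser), the function name
infer_child_names is hypothetical, and the output is a plain dictionary rather
than a real XML Schema document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_child_names(xml_text):
    """Map each element name to the sorted list of child element names
    observed under it anywhere in the draft."""
    root = ET.fromstring(xml_text)
    children = defaultdict(set)
    for parent in root.iter():
        children[parent.tag]  # ensure leaf elements also get an entry
        for child in parent:
            children[parent.tag].add(child.tag)
    return {tag: sorted(names) for tag, names in children.items()}


# Hypothetical draft fragment, loosely DocBook-flavoured.
draft = (
    "<chapter><title>Genji</title>"
    "<para>One <emphasis>singular</emphasis> remark.</para>"
    "<para>Another.</para></chapter>"
)

for tag, names in infer_child_names(draft).items():
    print(tag, "->", names)
```

The point of going from content to schema, rather than the other way round, is
that the structure is read off what was written instead of being fixed first.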

But the level on which I feel the program would need to be "dynamic"
would be the RDF or "semantic" level, and although a number of exotic
formal tools are available for dynamically rendering language without
losing the thread, the one I suspect might be seriously tractable would
be Mostowski's "generalized quantifiers". I have no idea whether the
idea, which was circulating in formal semantics of natural language in
the late 70s, made it into SGML or not (thank you, ISO) but the concept,
which involves tentatively extending "rigidity" upwards and typing
"downwards" into content, seems much sounder with respect to preserving
the particular character of a piece of content than the object-property
model imposed by RDF -- and programmable to boot.
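
For the "programmable to boot" claim, a minimal sketch: on one common reading
in formal semantics, a generalized quantifier can be treated as a relation
between two sets, a restrictor and a scope. The three definitions below are
the standard textbook ones; applying them to paragraphs of a document, and the
sample sets themselves, are my own hypothetical illustration and not something
SGML, XML or RDF provides.

```python
def every(restrictor, scope):
    """'Every A is B': A is a subset of B."""
    return restrictor <= scope


def some(restrictor, scope):
    """'Some A is B': A and B overlap."""
    return bool(restrictor & scope)


def most(restrictor, scope):
    """'Most A are B': more As are B than are not."""
    return len(restrictor & scope) > len(restrictor - scope)


# Hypothetical content: paragraph ids in a chapter, and those carrying a citation.
paragraphs = {"p1", "p2", "p3", "p4"}
cited = {"p1", "p2", "p3"}

print(every(paragraphs, cited))  # False
print(some(paragraphs, cited))   # True
print(most(paragraphs, cited))   # True
```

A statement like "most paragraphs are cited" describes the content as a whole
without assigning a fixed property to any single paragraph, which is the
contrast with the object-property model I have in mind.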

Is this a necessary program? Well, whether the rigidity of XHTML
relative to HTML merits an entry in RISKS is a question I'll leave for
another time, but I'll try to sell capable parties (not me as of this
moment) on the idea by talking about the "Tale of Genji". The Tale of
Genji, written by a Japanese noblewoman in the 11th century, is not the
first novel of world history; some Greek romances have been credibly
characterized as novels. But it was the first novel in Japan, and made
an according impression; even before printing, and this is something of
the point. If you write something, put it in HTML, and upload it to a
website not listed with Google, have you engaged in "web publishing"?
Not really; obviously the levers which determine mass availability lie
elsewhere. But you have taken a pre-publication step, and really the
process of writing something is nothing other than "pre-publication
steps", one after the other.

And the question the "end of the book" (heh) might raise for us is
whether this series is not infinite, and genuine web "publicity"
dependent on the preservation of the singularity of the content. If I
can't tell two websites apart, I'm unlikely to visit either again; I'm
not maximizing my time by visiting a site without new content.
Furthermore, as the spin-off XML formats indicate, sometimes quotidian
information requires a singular format to be properly handled (that is,
disseminated). So I ask, why not go beyond building XML databases of
various kinds and build a product designed to foster "authorship",
rather than foisting more of the same on a possibly-deserving public?

Comments appreciated.
Jeff Rubard
(e-mail address removed)
 
