Florian von Savigny
Hi there,
I am having a hard time grasping exactly how TBX (TermBase eXchange
format), authored by LISA (http://www.lisa.org/tbx/), works. What I
seem to have understood is the following:
- although TBX *is* defined in terms of a DTD, the DTD does not define
the format in full; rather, it defines a general structure whose
details can be specified further. Thus, TBX is not "one" format but
rather, well, a class of closely related formats (or maybe they
could be called "dialects" - "TBX variants" in proper lingo, I
think). In the implementation of this model, elements are rather few
and their generic identifiers sound more abstract than one is used
to from, e.g., traditional SGML formats. For instance, what could
conceivably be called <definition> in an old-fashioned way ("this is
a definition") is <descrip type="definition"> in the TBX way ("this
is a descriptive element, namely a definition"). In TBX lingo,
"descrip" is a meta data category, and "definition" a data category.
- what is also needed to have a working format is the information
prescribing the details. This is the job of so-called XCS
(extensible constraint specification) files. Formally, they are XML
documents, hence may carry an .xml file extension. Any application
capable of processing TBX must be capable of checking adherence to
at least one such XCS (this may even be hardcoded into the
application, in which case the TBX variant understood by the app
would be fixed). A more flexible application, however, would read
in a given XCS file and then process the TBX data based on that
(hence it would be able to process any TBX variant).
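
To make the "generic element plus type attribute" idea concrete, here
is a minimal Python sketch that lists the (meta data category, data
category, value) triples found in a TBX fragment. The fragment itself,
its element nesting and the function name data_categories are
assumptions made up for illustration; they are not taken from LISA's
material.

```python
import xml.etree.ElementTree as ET

# Hypothetical TBX fragment; the nesting is an assumption for illustration.
TBX_FRAGMENT = """
<termEntry id="e1">
  <descrip type="subjectField">manufacturing</descrip>
  <descrip type="definition">A company that provides goods.</descrip>
  <langSet xml:lang="en">
    <tig>
      <term>supplier</term>
      <termNote type="termType">fullForm</termNote>
    </tig>
  </langSet>
</termEntry>
"""


def data_categories(xml_text):
    """Return (meta data category, data category, value) triples.

    The element name is treated as the meta data category and the value
    of its "type" attribute as the data category.
    """
    triples = []
    for elem in ET.fromstring(xml_text).iter():
        data_category = elem.get("type")
        if data_category is not None:
            value = (elem.text or "").strip()
            triples.append((elem.tag, data_category, value))
    return triples


for triple in data_categories(TBX_FRAGMENT):
    print(triple)
```

Elements without a type attribute (termEntry, langSet, tig, term) are
simply skipped, which is why only three triples come out.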
I had never heard of XCS before, and know very little about schemata
(;-) - yeah, I know, schemas), but I am wondering whether XCS might be
no more and no less than a schema (though it would be slightly
confusing if both a DTD and a schema were present - I thought these
were alternatives). Here's an illustrative example of such an XCS,
provided by LISA:
<?xml version="1.0"?>
<TBXXCS name='DXFd-supplier' version="1.0" lang='en' xmlns="x-schema:TBX-XCS-XDRschema-v-0-1.xml">
<header><title>subset DCS file for the Supplier example</title></header>
<datCatSet>
<termNoteSpec name="termType" datcatId="ISO12620A-0201">
<contents datatype="picklist" targetType="none">fullForm abbreviatedForm</contents>
</termNoteSpec>
<descripSpec name="subjectField" datcatId="ISO12620A-04">
<contents datatype="picklist" targetType="none">manufacturing finance</contents>
<levels>termEntry</levels>
</descripSpec>
<descripSpec name="definition" datcatId="ISO12620A-0501">
<contents datatype="noteText" targetType="none"/>
<levels>termEntry</levels>
</descripSpec>
</datCatSet>
</TBXXCS>
Now, is XCS simply a schema, or is it an invention peculiar to the TBX
format? I gather that the same tools used for converting ordinary
SGML/XML should be usable, but are there tools available for checking
consistency with an XCS file, analogous to the way nsgmls checks
consistency with a DTD? I hope that the syntax of the XCS file will
become more transparent to me once I grasp this.
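
To illustrate the kind of check I have in mind, here is a rough Python
sketch, not an existing tool. It reads the picklists out of an XCS
like the one above and reports TBX elements whose type or value is not
covered. The assumptions are mine: that "termNoteSpec" constrains
<termNote> and "descripSpec" constrains <descrip> (derived simply by
stripping "Spec"), that picklist values are whitespace-separated, and
that <levels> can be ignored. The names local, load_rules and check,
and the TBX sample, are hypothetical.

```python
import xml.etree.ElementTree as ET

XCS = """<TBXXCS name="DXFd-supplier" version="1.0" lang="en"
        xmlns="x-schema:TBX-XCS-XDRschema-v-0-1.xml">
<datCatSet>
<termNoteSpec name="termType" datcatId="ISO12620A-0201">
<contents datatype="picklist" targetType="none">fullForm abbreviatedForm</contents>
</termNoteSpec>
<descripSpec name="subjectField" datcatId="ISO12620A-04">
<contents datatype="picklist" targetType="none">manufacturing finance</contents>
</descripSpec>
<descripSpec name="definition" datcatId="ISO12620A-0501">
<contents datatype="noteText" targetType="none"/>
</descripSpec>
</datCatSet>
</TBXXCS>"""

# Hypothetical TBX fragment with two deliberate problems.
TBX = """<termEntry>
<descrip type="subjectField">marketing</descrip>
<descrip type="definition">Some text.</descrip>
<descrip type="context">Some sentence.</descrip>
<langSet><tig><term>supplier</term>
<termNote type="termType">fullForm</termNote></tig></langSet>
</termEntry>"""


def local(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def load_rules(xcs_text):
    """Map (element name, data category) to a set of allowed values,
    or to None when the XCS does not restrict the content to a picklist."""
    rules = {}
    for spec in ET.fromstring(xcs_text).iter():
        name = local(spec.tag)
        if not name.endswith("Spec"):
            continue
        element = name[:-len("Spec")]  # assumption: termNoteSpec -> termNote
        contents = next((c for c in spec if local(c.tag) == "contents"), None)
        allowed = None
        if contents is not None and contents.get("datatype") == "picklist":
            allowed = set((contents.text or "").split())
        rules[(element, spec.get("name"))] = allowed
    return rules


def check(tbx_text, rules):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for elem in ET.fromstring(tbx_text).iter():
        category = elem.get("type")
        if category is None:
            continue
        key = (local(elem.tag), category)
        value = (elem.text or "").strip()
        if key not in rules:
            problems.append(f"<{key[0]} type={category!r}> is not declared in the XCS")
        elif rules[key] is not None and value not in rules[key]:
            problems.append(f"<{key[0]} type={category!r}>: {value!r} is not in the picklist")
    return problems


for problem in check(TBX, load_rules(XCS)):
    print(problem)
```

With the sample data, the subjectField value "marketing" and the
undeclared data category "context" are reported, while the definition
and the termType pass. A real checker would presumably also have to
honour <levels> and the other datatypes.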
I would very much appreciate any enlightenment on this.
Regards, Florian
--
Florian v. Savigny
If you are going to reply in private, please be patient, as I only
check for mail something like once a week. - Si vous allez répondre
personellement, patientez s.v.p., car je ne lis les courriels
qu'environ une fois par semaine.