Function to return a valid element name

adurth · Feb 26, 2007

Hi!
Is there any function that converts a string containing characters
that are invalid for use in an element name to a valid one?

Thanks,
Andreas

Martin Honnen · Feb 26, 2007

Which programming language/framework are you using? The Microsoft .NET
framework has
XmlConvert.EncodeName
<http://msdn2.microsoft.com/en-us/library/system.xml.xmlconvert.encodename.aspx>

adurth · Feb 26, 2007

Ah yes, sorry, I was not precise. I am looking for an XML
function like translate() or replace().

Joe Kesselman · Feb 26, 2007

In that case, I believe the answer is... translate(), or implement your
own recursive string processing if single-character substitutions aren't
sufficient for you. There's nothing standardized for this purpose, since
it isn't something commonly done.
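
A minimal sketch of what single-character substitution in the style of translate() does, written in Python for illustration. The helper name xpath_translate is hypothetical; it only mimics the XPath 1.0 semantics (characters without a counterpart in the replacement string are dropped, and the first occurrence of a character in the source list wins).

```python
def xpath_translate(value, from_chars, to_chars):
    """Mimic XPath 1.0 translate(): map each character of from_chars to the
    character at the same position in to_chars, and delete it when to_chars
    is too short to supply a counterpart."""
    table = {}
    for i, ch in enumerate(from_chars):
        if ord(ch) in table:
            continue  # the first occurrence of a character wins
        table[ord(ch)] = to_chars[i] if i < len(to_chars) else None
    return value.translate(table)


# Spaces and "(" become underscores; ")" and "/" are removed.
cleaned = xpath_translate("max speed (km/h)", " ()/", "__")
print(cleaned)
```

Note that this kind of mapping is lossy: two different inputs can collapse to the same output, so it only suits cases where the original string never has to be recovered.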

adurth · Feb 26, 2007

Okay, thank you anyway.

Joe Kesselman · Feb 27, 2007

One more observation: There are a heck of a lot of characters that are
valid in element names (just about any alphanumeric in just about any
language, plus some punctuation), since XML's defined in terms of
Unicode. Simply checking whether all the characters in an element name
are legal is something of a pain; figuring out what to replace the
(many!) other Unicode characters with is going to be (ahem) interesting.
The simplest solution would probably be to invent some sort of escaping
syntax (and then, as usual with such things, also escape the
escape-introduction sequence so the conversion is reliably unique and
reversible).

Unless you control ALL names in the document, that does introduce the
risk that a name created by someone else will contain something that
looks like an escape sequence.
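
One way such a reversible escaping syntax could look, as a rough Python sketch. The _xHHHH_ escape form, the function names, and the deliberately conservative ASCII-only set of allowed characters are all assumptions made for illustration; the real XML 1.0 Name production admits far more Unicode characters. The underscore is always escaped, so every underscore in an encoded name belongs to an escape sequence and decoding stays unambiguous.

```python
import re

# Assumption: a conservative ASCII subset of legal name characters.
# "_" is deliberately left out so that it can serve as the escape marker.
_NAME_START = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
_NAME_CHAR = _NAME_START | set("0123456789-.")

_ESCAPE_RE = re.compile(r"_x([0-9A-F]{4,6})_")


def _escape(ch):
    # Encode the code point as at least four uppercase hex digits.
    return "_x{:04X}_".format(ord(ch))


def encode_name(text):
    """Turn an arbitrary non-empty string into a name built only from the
    conservative character set above plus escape sequences."""
    if not text:
        raise ValueError("an element name cannot be empty")
    out = []
    for i, ch in enumerate(text):
        allowed = _NAME_START if i == 0 else _NAME_CHAR
        out.append(ch if ch in allowed else _escape(ch))
    return "".join(out)


def decode_name(name):
    """Reverse encode_name() by expanding every escape sequence."""
    return _ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), name)


encoded = encode_name("total cost (EUR)")
print(encoded)
print(decode_name(encoded))
```

The round trip only holds for names produced by encode_name(); a name written by someone else that happens to contain something like _x0041_ would be decoded as well, which is exactly the risk described above.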

BUT... frankly, you really don't *WANT* element names being made up on
the fly, since they're what describes the structure of your document.
Consider putting your non-XML descriptor in _content_, eg an attribute
value, rather than an element name. Among other things, XML already has
the ability to escape characters in text content.
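
A small sketch of the attribute-value approach, using Python's standard xml.etree.ElementTree module. The element name "result" and the attribute name "name" are made up for illustration; the point is only that the serializer escapes markup characters in content on its own.

```python
import xml.etree.ElementTree as ET

# The free-form descriptor goes into an attribute value, not the element name.
descriptor = 'max speed (km/h) <= 120 & "safe"'

result = ET.Element("result", {"name": descriptor})
result.text = "118"

# Serializing escapes &, < and quotes in the attribute value automatically.
xml_text = ET.tostring(result, encoding="unicode")
print(xml_text)

# Parsing the document again yields the original descriptor unchanged.
parsed = ET.fromstring(xml_text)
print(parsed.get("name"))
```

The element name stays fixed and describes the structure, while the descriptor can contain spaces, brackets and slashes without any custom escaping.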

(You still won't be able to use every possible character, even after
escaping it, if you're working in XML 1.0. I believe XML 1.1 -- which is
rarely used -- expanded the legal character set, but you may not want to
make support for 1.1 a prerequisite. The alternative is to fall back to
inventing your own escaping mechanism, e.g. by doing a base-64 encoding
upon the UTF-8 data.)
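
A sketch of the base-64 fallback for content, again in Python with only the standard library. The helper names and the attribute name "name-b64" are hypothetical. The sample string contains control characters that XML 1.0 cannot carry even as character references, so it is stored as base-64 text instead.

```python
import base64
import xml.etree.ElementTree as ET


def to_base64(text):
    """Encode text as UTF-8 bytes, then as base-64 ASCII text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(data):
    """Reverse to_base64()."""
    return base64.b64decode(data).decode("utf-8")


# BEL (0x07) and form feed (0x0C) are not legal XML 1.0 characters.
raw = "bell\x07 and form feed\x0c"

elem = ET.Element("result", {"name-b64": to_base64(raw)})
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)

# The consumer decodes the attribute value to get the original string back.
restored = from_base64(ET.fromstring(xml_text).get("name-b64"))
```

Base-64 output can contain "+", "/" and "=", and may start with a digit, so it is suitable as attribute or text content but not as an element name.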

In other words: What problem are you really trying to solve, and is the
rather ugly kluge you proposed really necessary and/or sufficient?

adurth · Feb 27, 2007

Hi!
Thank you for your extended thoughts on this. As you might have
guessed, I'm pretty new to XML. In my case, a tool in our toolchain can
export its results as an XML file. This feature has not been used until
now, but we now want to import that file into another tool. As
you can imagine, the output is not compatible with what the second tool
can import, so I'm currently writing an XSL transformation. In the
process, some element values will become element names in the output
XML. Meanwhile I have found that the problem I was facing when I posted
this is not characters that are illegal in XML (except for some
spaces), but the fact that the second tool doesn't accept a whole
bunch of characters used in the source XML. Consequently it seems to
me that translate() is my choice. If you can advise otherwise, please
tell me!

Regards,
Andreas
