Unicode to HTML entities

Clodoaldo · May 29, 2007

I was looking for a function to transform a unicode string into
htmlentities. Not only the usual html escaping thing but all
characters.

As I didn't find one, I wrote my own:

# -*- coding: utf-8 -*-
from htmlentitydefs import codepoint2name

def unicode2htmlentities(u):

    htmlentities = list()

    for c in u:
        if ord(c) < 128:
            htmlentities.append(c)
        else:
            htmlentities.append('&%s;' % codepoint2name[ord(c)])

    return ''.join(htmlentities)

print unicode2htmlentities(u'São Paulo')
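One thing to watch in the function above: codepoint2name only covers characters that have a named HTML entity, so the lookup raises KeyError for anything else. Below is a minimal sketch of the same loop with a numeric fallback. It is written for Python 3, where htmlentitydefs is available as html.entities; the function name unicode2htmlentities_safe is made up for this example and is not from the original post.

```python
# Python 3 sketch; html.entities is the Python 3 name of htmlentitydefs.
from html.entities import codepoint2name


def unicode2htmlentities_safe(text):
    """Replace each non-ASCII character with an HTML entity.

    Uses the named entity when one exists and falls back to a
    numeric character reference (for example &#337;) otherwise.
    """
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII is passed through untouched (no escaping of & < >).
            parts.append(ch)
        elif code in codepoint2name:
            parts.append('&%s;' % codepoint2name[code])
        else:
            # No named entity known: emit a decimal character reference.
            parts.append('&#%d;' % code)
    return ''.join(parts)


print(unicode2htmlentities_safe('São Paulo'))  # S&atilde;o Paulo
```

Like the original, this sketch leaves ASCII characters such as & and < alone, so it is not a substitute for ordinary HTML escaping.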

Is there a function like that in one of Python's built-in modules? If not,
is there a better way to do it?

Regards, Clodoaldo Pinto Neto

Richard Brodie · May 29, 2007

Clodoaldo said:
I was looking for a function to transform a unicode string into
htmlentities.

'São Paulo'
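
For reference, a minimal sketch of one built-in route. It assumes the answer relies on the standard 'xmlcharrefreplace' error handler of the encode method; only the result string is visible here, so that is an assumption. The sketch is written for Python 3, and the helper name to_charrefs is made up for illustration. Note that it emits numeric character references rather than named entities.

```python
def to_charrefs(text):
    """Return text with every non-ASCII character as a numeric reference."""
    # Encoding to ASCII with 'xmlcharrefreplace' turns each character that
    # ASCII cannot represent into &#NNN;. Decoding gives back a str.
    return text.encode('ascii', 'xmlcharrefreplace').decode('ascii')


print(to_charrefs('São Paulo'))  # S&#227;o Paulo
```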

Clodoaldo · May 29, 2007

'São Paulo'

That was a fast answer. I would never have found that myself.

Thanks, Clodoaldo

Duncan Booth · May 30, 2007

Clodoaldo said:
That was a fast answer. I would never have found that myself.

You might actually want:
'São Paulo & Espírito Santo'

as you have to be sure to escape any ampersands in your unicode
string before doing the encode.
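
A sketch of the escape-then-encode order described above, written for Python 3. It assumes html.escape as the escaping step, since the post does not show which function was used, and the helper name escape_to_ascii_html is made up for this example.

```python
import html


def escape_to_ascii_html(text):
    """Escape HTML markup characters, then replace non-ASCII characters."""
    # Step 1: escape & < > while the text is still plain Unicode.
    escaped = html.escape(text, quote=False)
    # Step 2: turn the remaining non-ASCII characters into numeric
    # character references.
    return escaped.encode('ascii', 'xmlcharrefreplace').decode('ascii')


print(escape_to_ascii_html('São Paulo & Espírito Santo'))
# S&#227;o Paulo &amp; Esp&#237;rito Santo
```

The order matters: escaping after the encode step would also rewrite the ampersands of the freshly generated references, turning &#227; into &amp;#227;.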

Tommy Nordgren · May 30, 2007

In many cases, the need to use HTML/XHTML entities can be avoided by
generating UTF-8 encoded pages.

Clodoaldo · May 30, 2007

Tommy Nordgren said:
In many cases, the need to use HTML/XHTML entities can be avoided by
generating UTF-8 encoded pages.

Sure. All my pages are UTF-8 encoded. The case I'm dealing with is an
email link whose subject has non-ASCII characters, as in:

<a href=mailto:[email protected]?subject=Dúvidas>Mail to</a>

Somehow, when the user clicks on the link, the subject reaches his email
client with the non-ASCII characters garbled.
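
For a link like this, one approach that may be worth trying is to percent-encode the subject as UTF-8 in the mailto URL rather than relying on HTML entities. The sketch below is written for Python 3 and is only an illustration: the helper name mailto_link and the address user@example.com are placeholders, and whether a particular mail client decodes the subject correctly is not something this thread confirms.

```python
from html import escape
from urllib.parse import quote


def mailto_link(address, subject, label):
    """Build an <a> tag whose mailto subject is percent-encoded UTF-8."""
    # quote() encodes the subject as UTF-8 and percent-escapes every
    # byte outside the unreserved set (safe='' also escapes '/').
    # The address is inserted as given; this sketch does not encode it.
    href = 'mailto:%s?subject=%s' % (address, quote(subject, safe=''))
    # The attribute value is quoted and HTML-escaped before insertion.
    return '<a href="%s">%s</a>' % (escape(href), escape(label))


# user@example.com is a placeholder address for illustration only.
print(mailto_link('user@example.com', 'Dúvidas', 'Mail to'))
# <a href="mailto:user@example.com?subject=D%C3%BAvidas">Mail to</a>
```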

And before someone points out that I should not expose email addresses,
the email is only linked with the consent of the owner and the source
is obfuscated to make it harder for a robot to harvest it.

Regards, Clodoaldo

Clodoaldo · May 30, 2007

Duncan Booth said:
You might actually want:

'São Paulo & Espírito Santo'

as you have to be sure to escape any ampersands in your unicode
string before doing the encode.

I will do it. Thanks.

Regards, Clodoaldo.
