Fafounet
Hello,
I am parsing a web page that contains character references such as
&#xe9; (which stands for é).
I know I can get the Unicode character é with
unicode("\xe9", "iso-8859-1"),
but I don't know how to handle these character references.
I tried to implement handle_charref within HTMLParser, without success.
Furthermore, if the data is ab&#xe9;cd, handle_data gets "ab",
handle_charref gets "xe9", and then handle_data never receives the end
of the string ("cd").
Thank you for your help,
Fabien
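
One possible way to handle the situation described above is to collect the pieces from handle_data and handle_charref into one list and join them at the end. This is only a minimal sketch, and it assumes Python 3 (html.parser, chr) rather than the Python 2 setup the post implies; on Python 2 the equivalents would be the HTMLParser module and unichr. The class name TextCollector and its text() method are hypothetical names chosen for illustration.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: rebuilds text from data and reference callbacks."""

    def __init__(self):
        # convert_charrefs=False makes Python 3 call handle_charref and
        # handle_entityref instead of decoding references automatically.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        # Plain text between tags and references, e.g. "ab" and "cd".
        self.parts.append(data)

    def handle_charref(self, name):
        # name is "xe9" for a hexadecimal reference, "233" for a decimal one.
        if name[0] in "xX":
            code = int(name[1:], 16)
        else:
            code = int(name)
        self.parts.append(chr(code))

    def handle_entityref(self, name):
        # Named references such as "eacute"; unknown names are kept as-is.
        code = name2codepoint.get(name)
        if code is None:
            self.parts.append("&" + name + ";")
        else:
            self.parts.append(chr(code))

    def text(self):
        return "".join(self.parts)


parser = TextCollector()
parser.feed("ab&#xe9;cd")
parser.close()
result = parser.text()
print(result)
```

Because every callback appends to the same list, the order of the fragments is preserved and the text after the reference is not lost. Calling close() after the last feed() asks the parser to process any input it is still buffering.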