Amishera Amishera
I have an HTML file which is encoded in UTF-8. The file contains the
following text:
It&#39;s a wonderful life
Now, character code 39 is the apostrophe in UTF-8. So suppose I got
the 39 out of the text using:
s="It&#39;s a wonderful life"
s.gsub(/&#(\d+);/, '\1')
The output is
It39s a wonderful life
So firstly I am having trouble making it
It\39s a wonderful life
Secondly I manually did this in test_utf8.rb:
puts "It\39s a wonderful life"
and ran it
ruby test_utf8.rb > utf8.txt
but when I open it in OpenOffice with the encoding set to UTF-8,
the output is
It#9s a wonderful life
So how do I correctly parse and convert HTML character references
to UTF-8 encoded characters and then save the file?
Thanks.
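For reference, one possible approach is sketched below. It assumes only decimal references of the form &#NNN; need handling. The helper name decode_numeric_refs and the output file name are made up for illustration. The idea is to pass a block to gsub, so each captured number is turned into a character rather than pasted back in as digits.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of the original post): replaces decimal
# numeric character references such as &#39; with the matching character.
def decode_numeric_refs(text)
  # $1 holds the captured digits; pack('U') encodes that code point as UTF-8.
  text.gsub(/&#(\d+);/) { [$1.to_i].pack('U') }
end

s = "It&#39;s a wonderful life"
decoded = decode_numeric_refs(s)
puts decoded  # prints: It's a wonderful life

# Save the result, opening the file explicitly as UTF-8.
# The file name is a placeholder chosen for this sketch.
out_path = File.join(Dir.tmpdir, "decoded_utf8.txt")
File.open(out_path, "w:UTF-8") { |f| f.puts decoded }
```

As for "It\39s": in a double-quoted Ruby string, \3 is read as an octal escape, so the string holds the control character with code 3 followed by a literal "9", which is likely why the editor showed garbage there. Decimal 39 would be written "\47" in octal or "\x27" in hex.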