Bedo Sandor
Hi,
How can I convert utf-8 encoded strings to latin-2?
I have tried it using libuconv with little success:
require 'uconv'

class String
  def un_utf8
    Uconv.u8tou16(self).gsub(/\000/, '')
  end

  def to_utf8
    tmp = ""
    self.each_byte { |b|
      tmp += b.chr + "\000"
    }
    Uconv.u16tou8(tmp)
  end
end
This program is ugly, and it does not do exactly what I want.
u8tou16 generates a string of 16-bit characters,
for example Uconv.u8tou16("test") == "t\000e\000s\000t\000".
gsub removes the unnecessary "\000" bytes from
the string. But there are characters in Hungarian
that have a non-zero second byte in the output of
u8tou16, so they fail to convert. Anyway, this is an
ugly hack.
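To make the failure concrete, here is a minimal sketch of the byte layout described above. It does not use uconv: Ruby's built-in String#encode (Ruby 1.9 or later) stands in for Uconv.u8tou16, on the assumption that u8tou16 emits little-endian UTF-16, as the "test" example suggests. The helper name utf16le_bytes is made up for this illustration.

```ruby
# Hypothetical helper: stand-in for Uconv.u8tou16, using String#encode
# (Ruby 1.9+) and assuming little-endian UTF-16 output.
def utf16le_bytes(str)
  str.encode("UTF-16LE").bytes
end

ascii_bytes     = utf16le_bytes("t")       # U+0074
hungarian_bytes = utf16le_bytes("\u0151")  # U+0151, o with double acute

# ASCII: the second byte is zero, so stripping "\000" happens to work.
p ascii_bytes      # [116, 0]

# Hungarian letter: the second byte is 1, not 0, so stripping zero bytes
# leaves two unrelated bytes instead of one Latin-2 character.
p hungarian_bytes  # [81, 1]
```

Any character above U+00FF behaves like the second case, which is why the gsub approach only appears to work for ASCII and Latin-1 text.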
How is it done nicely?