From UTF-8 to windows-1252

  • Thread starter Noé Alejandro Castro Sánchez

Noé Alejandro Castro Sánchez

Hello.

I have some data in a file encoded in Windows-1252 (it contains "special"
characters, for example accented words). I use the encode method to store
it in a SQLite3 DB:

mydata.encode("utf-8")

Using SQLiteSpy I can see the data with the right characters.

But when I read the data back from the DB in my program, I want to process
it as Windows-1252 again. If I call encode with windows-1252, I get
an error:

mydata.encode("windows-1252")

compare_synonyms.rb:21:in `encode': "\xC3" from ASCII-8BIT to UTF-8
in conversion from ASCII-8BIT to Windows-1252
(Encoding::UndefinedConversionError) from compare_synonyms.rb:21:in
`block (2 levels) in identify_synonyms'

Now, if I use codepoints, the data is not displayed with the
right characters:

mydata.codepoints.to_a.pack("C*")

What is happening? What can I do?

Thanks in advance.

--
Posted via http://www.ruby-forum.com/.
 

Y. NOBUOKA

Hello,
|But when I read the data back from the DB in my program, I want to process
|it as Windows-1252 again. If I call encode with windows-1252, I get
|an error:
|
|  mydata.encode("windows-1252")

Although the data from the DB is encoded in UTF-8, Ruby doesn't
know that, so you must tell Ruby the encoding before
converting to Windows-1252.

# tell ruby the encoding
mydata.force_encoding( "UTF-8" )
# encode to windows-1252
mydata.encode( "windows-1252" )
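
A minimal sketch of the two steps above, assuming the driver hands back the raw UTF-8 bytes tagged as ASCII-8BIT (which is what the error message in the question suggests). The sample word and the variable names are made up for illustration; they are not from the original program.

```ruby
# Simulate what came back from the DB: UTF-8 bytes tagged as binary.
# "canción" is only a sample value (assumption).
raw = "canci\u00F3n".b

error_class = nil
begin
  raw.encode("windows-1252")
rescue Encoding::UndefinedConversionError => e
  # Bytes >= 0x80 in a binary string have no defined conversion.
  error_class = e.class
end

# force_encoding only relabels the bytes; it does not change them.
utf8 = raw.dup.force_encoding("UTF-8")

# encode returns a new, transcoded string; the receiver is left as it was.
win = utf8.encode("windows-1252")

puts utf8.bytesize  # 8: the accented letter takes two bytes in UTF-8
puts win.bytesize   # 7: it is the single byte 0xF3 in Windows-1252
```

Note that encode does not modify mydata in place, so the result has to be assigned to something (or encode! used instead).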

Regards,
--
nobuoka
 

Noé Alejandro Castro Sánchez

|Although the data from the DB is encoded in UTF-8, Ruby doesn't
|know that, so you must tell Ruby the encoding before
|converting to Windows-1252.
|
|  # tell ruby the encoding
|  mydata.force_encoding( "UTF-8" )
|  # encode to windows-1252
|  mydata.encode( "windows-1252" )
|
|Regards,

Hey, thanks a lot!! Now I can see the right characters =D

Regards.
 

Y. NOBUOKA

Hi, matz
|Although the data from the DB is encoded in UTF-8, Ruby doesn't
|know that, so you must tell Ruby the encoding before
|converting to Windows-1252.
|
|  # tell ruby the encoding
|  mydata.force_encoding( "UTF-8" )
|  # encode to windows-1252
|  mydata.encode( "windows-1252" )

For the record, you don't have to use force_encoding:

  mydata.encode("windows-1252", "UTF-8")

I had missed the +src_encoding+ and +options+ arguments.
Now I see that String#encode is a very useful method.
http://www.ruby-doc.org/core/classes/String.html#M001113
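
For reference, a small sketch of both arguments together, assuming again that the data arrives as UTF-8 bytes tagged as binary. The sample string is invented, and the choice of "?" as the replacement character is only an example.

```ruby
# Sample value (assumption): UTF-8 bytes tagged as binary.
# U+2603 (a snowman) has no mapping in Windows-1252.
raw = "se\u00F1al \u2603".b

# Destination encoding first, then the encoding the bytes are really in.
# With an explicit source encoding, no force_encoding is needed.
# The undef/replace options substitute characters the target cannot represent.
win = raw.encode("windows-1252", "UTF-8", undef: :replace, replace: "?")

puts win.bytes.inspect
```

Without the undef option, a character that Windows-1252 lacks raises Encoding::UndefinedConversionError, which may be preferable if silent data loss is not acceptable.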

thanks!

--
nobuoka
 
