On 12.10.2010 01:16, Andreas S. wrote:
> Using iconv to clean the string works:
> Iconv.conv('utf-8//IGNORE','utf-8',"#{0xFF.chr} abcde")
> => " abcde"
> However, it would be nicer if there was a way to do this with the
> built-in encoding functions of Ruby 1.9.
String#encode handles this more cleanly:
============================================
$ irb
irb(main):001:0> RUBY_DESCRIPTION
=> "ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]"
irb(main):002:0> str = "#{0xFF.chr}"
=> "\xFF"
irb(main):003:0> str.encoding
=> #<Encoding:ASCII-8BIT>
irb(main):004:0> str.encode("UTF-8")
Encoding::UndefinedConversionError: "\xFF" from ASCII-8BIT to UTF-8
from (irb):4:in `encode'
from (irb):4
from /opt/rubies/ruby-1.9.2-p0/bin/irb:12:in `<main>'
irb(main):005:0> str.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?")
=> "?"
irb(main):006:0>
============================================
To remove the invalid characters completely, pass an empty string as
:replace instead of "?".
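
As a minimal sketch of that variant, here is the same String#encode call
with an empty replacement, applied to the string from the original
question. The helper name strip_unconvertible is made up for illustration.
Note the assumption that the input is tagged ASCII-8BIT, as in the irb
session: from that encoding every byte above 0x7F has no defined
conversion, so this would also drop the bytes of valid multi-byte UTF-8
characters. It is only a safe cleanup when the wanted text is plain ASCII.

```ruby
# Hypothetical helper (not a built-in): drops every byte that cannot be
# converted to UTF-8 by replacing it with the empty string.
def strip_unconvertible(str)
  str.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "")
end

# Assumption: the input is tagged ASCII-8BIT, as in the irb session.
raw = "\xFF abcde".dup.force_encoding("ASCII-8BIT")

cleaned = strip_unconvertible(raw)
puts cleaned.inspect   # => " abcde"
puts cleaned.encoding  # => UTF-8
```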
Vale,
Marvin