James Gray
I've just finished an extensive reworking of the standard CSV library
in Ruby 1.9 (formerly FasterCSV). CSV's parser and generator are now
m17n aware. This means they should work naturally with your data in
any non-"dummy" Encoding Ruby 1.9 supports.

Everything is documented so it should be pretty easy to figure out how
to use the new system, but generally you just set the Encoding for
your IO or String objects correctly and CSV should do the rest:

  # reading example
  CSV.foreach(…, :encoding => "…") do |row|
    # row will be parsed but not transcoded here
  end

  # writing example
  CSV.open(…, "wb:…") do |csv|
    csv << data
    # data will be quoted and separated with characters
    # in the proper encoding
  end

Encodings default to Encoding.default_external if not provided.
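
As a minimal sketch of the behavior described above: the sample row, the temporary file name, and the choice of ISO-8859-1 are assumptions made for illustration and do not come from the original post. It reads the same Latin-1 data from a String and from a file, then generates a line back out, and in each case the fields should stay in the Encoding they arrived in rather than being transcoded.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample row, held as ISO-8859-1 bytes instead of UTF-8.
latin1_line = "José,São Paulo\n".encode("ISO-8859-1")

# Reading from a String: CSV parses in the String's own Encoding.
string_fields = CSV.parse_line(latin1_line)

# Reading from a file: name the Encoding of the data on disk.
file_fields = nil
Tempfile.create(["m17n_demo", ".csv"]) do |tmp|
  tmp.close
  File.binwrite(tmp.path, latin1_line)  # write the raw bytes untouched
  CSV.foreach(tmp.path, :encoding => "ISO-8859-1") do |row|
    file_fields = row  # parsed, but not transcoded
  end
end

# Writing: the generated line takes its Encoding from the fields.
generated = CSV.generate_line(string_fields)

puts string_fields.map { |f| f.encoding.name }.inspect
puts file_fields.map { |f| f.encoding.name }.inspect
puts generated.encoding.name
```

If you want UTF-8 Strings inside your program instead, calling `encode("UTF-8")` on each field after parsing is one way to get them.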

I had to change quite a bit of code to support this. I tried to test
well, but it's possible I introduced some new bugs. Please let me
know if you find any issues.

I suspect this is probably one of the first fully m17n-compatible
implementations, so I hope it can serve as a guide to others wanting
to provide similar support in their libraries. I know I learned a ton
just figuring out how to do this. Feel free to ask me questions about
multi-encoding support. I'll sure try to answer them if I can.

Finally, here's some fun news to look forward to: even with the m17n
support, CSV on Ruby 1.9 is over three times faster than FasterCSV on
Ruby 1.8 thanks to the speed of the new VM and the switch to
Oniguruma. Three cheers to the core team for giving us a much faster
Ruby!

James Edward Gray II