Bug in 1.9.0 CSV parser


Nicko Kaltner

Hi,
I've found a bug in the 1.9.0 CSV parser. I've got a script and data
that effectively breaks it, but it is 567 KB. Is that too large for this
list?

The Ruby instance takes 100% of the CPU while processing this file, and I
had to stop it after 5 minutes.

The code is:

#!/usr/local/bin/ruby1.9.0

require 'csv'

filename = 'broken.csv'

CSV.foreach(filename) do |row|
  STDERR.puts row.length  # number of fields in this row
  row.each do |entry|
    puts entry
  end
  puts "\n####################################\n"
end


I would try and debug it further, but the debugger seems broken in 1.9.0.


Regards,
Nicko
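
One way to confirm the hang without a working debugger is to put a watchdog
timeout around the parse. A minimal diagnostic sketch, not from the original
report; the 30-second limit is arbitrary:

require 'csv'
require 'timeout'

begin
  # Give up if the parse has not finished within 30 seconds.
  Timeout.timeout(30) do
    CSV.foreach('broken.csv') { |row| }
  end
rescue Timeout::Error
  warn 'CSV.foreach did not finish within 30 seconds'
end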
 

James Gray

> I've found a bug in the 1.9.0 CSV parser. I've got a script and data
> that effectively breaks it, but it is 567 KB. Is that too large for
> this list?

Probably, but you are welcome to email it to me privately. I maintain
the CSV library in Ruby 1.9.

> The Ruby instance takes 100% of the CPU while processing this file,
> and I had to stop it after 5 minutes.

I'm 95% sure this is an issue of your CSV file being invalid.
Because of the way the format works, a simple file like:

"…10 Gigs of data without another quote…

can only be detected as invalid by reading the entire thing. I've
considered putting limits in the library for controlling how far it
would read ahead, but these would break some valid data. Then the
problem becomes: do I make the limits default to on? That's the only
way they would have helped here, but that would break some otherwise
valid code. It's a tough problem to solve correctly.

James Edward Gray II
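
To see why the whole file must be read: a quoted CSV field may legally
contain embedded newlines, so the parser cannot treat an end-of-line as an
end-of-record while it is inside quotes, and a missing closing quote only
becomes detectable at end of input. A minimal sketch against the standard
csv library (the sample strings and the quote-counting heuristic are
illustrative, not from the thread):

require 'csv'

# A quoted field may span newlines; this is valid CSV:
p CSV.parse(%Q{"a\nb",c})  # => [["a\nb", "c"]]

# An unclosed quote forces the parser to consume all remaining input
# hunting for the closing quote; the error only surfaces at EOF:
begin
  CSV.parse(%Q{"never closed\nrow2a,row2b})
rescue CSV::MalformedCSVError => e
  puts e.message  # reports an unclosed quoted field
end

# Cheap caller-side sanity check: any file Ruby's strict parser accepts
# has an even number of double-quote characters, so an odd count means
# at least one unclosed quoted field.
warn 'unbalanced quotes' if File.read('broken.csv').count('"').odd?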
 
