pipe binary data

cs5b

Hi there,
I have a little program that mines the web. It consists of two Ruby
programs. The first is responsible for fetching the data from the web; the
second is responsible for mining that data. I connect the two via a
command-line pipe (the first executes stdout.puts data; the
second calls data = stdin.gets).

This works just fine when I access HTML pages. However, when I access
binary data, the pipe sometimes misbehaves. I encountered one JPG file
that crashes the consuming data mining program on Linux and causes
an infinite loop on Windows (stdin.gets just returns nil after a while
without having read the entire data).

I suppose what I have to do is encode the data before I place it on the
pipe and decode it after I consume it from the pipe. Any suggestions?

Cheers

Christian
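
For readers who do want the encode/decode approach asked about above, here is a minimal sketch, assuming one Base64 line per payload is acceptable. The helper names write_payload and read_payload are hypothetical and not part of the original programs. Base64 output is plain ASCII, so a line-oriented puts/gets pair can carry it even when the raw bytes contain newlines, NUL, or 0x1A.

```ruby
require "base64"

# Hypothetical helper (not from the original post): the producer writes
# each binary payload as a single Base64-encoded line.
def write_payload(io, bytes)
  io.puts(Base64.strict_encode64(bytes))
end

# Hypothetical helper: the consumer reads one line and decodes it back
# to raw bytes. Returns nil at end of stream.
def read_payload(io)
  line = io.gets
  line && Base64.strict_decode64(line.chomp)
end

# Round trip through an in-process pipe, using bytes that tend to cause
# trouble for line-oriented or text-mode reads (NUL, LF, CR LF, 0x1A).
reader, writer = IO.pipe
sample = "\xFF\xD8\x00\n\r\n\x1A\xFF\xD9".b
write_payload(writer, sample)
writer.close
decoded = read_payload(reader)
reader.close

puts(decoded == sample ? "round trip ok" : "round trip failed")
```

An alternative that avoids the encoding overhead is to call binmode on both ends and use write/read instead of puts/gets; which one fits better depends on whether the consumer needs line framing.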
 

MonkeeSage

Hi Christian,

I'm no expert by any means, but this sounds odd to me -- more like
something is broken with your scripts rather than the piping. The
reason I say so is because I can pipe /dev/random and /dev/dsp to a
file without any problem; you'd think that I'd hit the same problem at
some point given the (pseudo-)random nature of those pipes; but I never
have.

Regards,
Jordan
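
A way to check the same point from Ruby, rather than by piping /dev/random to a file, is to push random bytes through a child process and compare checksums. This is only a sketch under assumptions: the child here is a throwaway one-line Ruby script that copies stdin to stdout in binary mode, standing in for a real consumer.

```ruby
require "digest"
require "rbconfig"
require "securerandom"

# 64 KiB of random bytes, standing in for an arbitrary binary download.
payload = SecureRandom.random_bytes(64 * 1024)

# Assumed stand-in consumer: a child Ruby process that echoes stdin to
# stdout with both streams in binary mode.
child = [RbConfig.ruby, "-e",
         "$stdin.binmode; $stdout.binmode; $stdout.write($stdin.read)"]

# "r+b" opens the pipe for reading and writing in binary mode.
echoed = IO.popen(child, "r+b") do |io|
  io.write(payload)
  io.close_write # signal end of input so the child can finish
  io.read
end

sent_digest = Digest::SHA256.hexdigest(payload)
received_digest = Digest::SHA256.hexdigest(echoed)
puts(sent_digest == received_digest ? "pipe preserved all bytes" : "pipe altered the data")
```

If the digests match here but the real consumer still fails, the problem is more likely in how the consumer processes the data than in the pipe itself.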
 

cs5b

Jordan, some more digging revealed this was related to a parsing bug in
REXML. It somehow didn't recognize a closing tag, so it wasn't related
to the pipe after all.
thanks-
christian
 
