Sending an incomplete file?

B

burak

I apologize in advance if this is off-topic (it may not be a Perl
question; it could be an HTTP thing. I'm not sure, though, and that's
why I'm asking, so please don't flame me!).

Regarding sending a binary file through Perl: I assume this is one of
the many ways to do it:

print "Content-type: application/octet-stream\n\n";
open (FILE, '<', 'foo.bar') or die "Can't open foo.bar: $!";
binmode(FILE);
print <FILE>;
close (FILE);

BUT, if the file is incomplete when this code runs, it sends only
what's there when you open the file, and the download ends abruptly.
Err, that's probably not too descriptive, so here's an example of what
I want to do:

Let's say I have some Perl code that generates a very large binary
file, hundreds of megabytes or so. And let's say this code is pretty
slow, so it takes some time to generate the file. I'd like to start
sending the file to the user (as a download) before the file is
completely generated. If the download is faster than the file
creation, I'd like the download to sort of... wait for the file to be
completed. So let's say the file is supposed to be 200 MB, and so far
the script has only generated 100 MB. Your download runs quickly, and
you hit the 100 MB mark pretty quickly. Normally, the download would
end there and you'd have only half the file. How can I make the
download wait until the file is officially "done" (i.e., has reached
200 MB)? Is this a Perl thing or a browser thing? (I've looked a lot
into the HTTP headers for the download; while there is a
Content-Length header, it seems to be only for convenience's sake and
doesn't force the download to continue until that size has been
reached.)

I hope I've made this question sufficiently clear, and I really hope
this is the correct group to post in. If not, please direct me to the
right place. I try not to piss off the regulars :)

PS I realize that the normal solution to this problem would just be
"send the file when it's complete, duh", but that's not what I'm going
for. I want the user to start downloading the file as soon as it
starts being generated...

Thanks in advance
 
B

burak

Never mind guys, I figured out a way to make it work. It always
happens that whenever I post a question on Usenet, I figure it out
within the hour. How bizarre.
 
T

Tad McClellan

On Mar 10, 1:05 am, (e-mail address removed) wrote:

Nevermind guys, I figured out a way to make it work.


The newsgroup is all about sharing.

Care to share the way you made it work, or the root cause of the problem?

It always happens
that whenever I post a question on Usenet, I figure it out within the
hour. How bizarre.


If it always happens, then it cannot also be bizarre. :)

You are not alone, that happens to everybody.
 
B

Bart Lateur

print "Content-type: application/octet-stream\n\n";
open (FILE, "<foo.bar");
binmode(FILE);
print <FILE>;
close (FILE);
...
Let's say I have some perl code to generate a very large binary file,
hundreds of megabytes or so.

Uh oh... Do you know that your code tries to load the entire file into
memory (some people have even claimed "several times") before sending
it out to the browser?

Don't do that. Try sending it in chunks instead: read a block of bytes
of a reasonable size, and send it out, and then read the next block...
in a loop.

You can reorganize your code so it uses read or sysread, or you can use
a trick: set $/ to a reference to an integer value, and <> will then
read in blocks of that size instead of trying to read an "entire line",
whatever that may mean for binary data (the latter could, again, be the
entire file):

local $/ = \10240;   # 10 KB blocks
while (<FILE>) {
    print;
}

Of course, you still have to take care of the nitty gritty, opening the
file and using binmode (don't forget to binmode STDOUT too). But for the
rest, that's all there is to it.
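Tying those pieces together, here is a minimal sketch of the chunked approach using sysread. The sub name send_file_in_chunks, the output-handle parameter, and the 10 KB chunk size are illustrative choices, not anything from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stream a file in fixed-size chunks rather than slurping the whole
# thing with "print <FILE>". The sub name, the output-handle parameter,
# and the chunk size are illustrative.
sub send_file_in_chunks {
    my ($path, $out, $chunk_size) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";
    binmode($fh);
    binmode($out);    # don't forget the output side
    my $sent = 0;
    while (my $n = sysread($fh, my $buf, $chunk_size)) {
        print {$out} $buf;
        $sent += $n;
    }
    close($fh);
    return $sent;    # total number of bytes copied
}
```

In a CGI context, $out would be STDOUT, written to after the Content-type header has been printed.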
 
B

burak

The newsgroup is all about sharing.

Care to share the way you made it work, or the root cause of the problem?


If it always happens, then it cannot also be bizarre. :)

You are not alone, that happens to everybody.

Ok, here's what I did. Hopefully my comments will explain it
sufficiently. The problem is that when a large file is being generated
on the fly and a download is requested from the server, the server
will only send the portion of the file that currently exists -- i.e.,
if the download rate is faster than the rate of generation, the
download will "catch up" and terminate because there is no file left,
even though you haven't reached the proper end of the file. The first
technique I tried was to start an infinite loop and check the file
size as chunks are being sent: if the file size stays constant over a
large number of iterations (20,000 or so), you can safely assume that
the file is no longer being written and you can finish the download.
This has overhead, however, and is not that efficient. So the next
method I tried, which works quite well, is to append, at the very end,
a string of characters designating the end of the file (something like
"###ENDOFFILE###" or similar). We loop through, reading $CHUNK_SIZE
bytes at a time, and look for the $END_OF_FILE text. If we find it at
the end of our chunk, we omit the end-of-file marker (so as not to
corrupt the file) and terminate the script. Here's the main loop:

## open file, set up variables, etc. I omitted this part. You don't
really need to see it...

while (1) {
    # where we are currently in the file: seek to there, and read into $buffer
    $pos = tell(FILE);
    seek(FILE, $pos, 0);
    read(FILE, $buffer, $CHUNK_SIZE);

    # now, look one buffer size ahead. why? because if a buffer ends in
    # "###ENDO", we need to know if the next buffer will be "FFILE###".
    # it is very likely that a buffer will grab only part of our
    # end-of-file text
    $pos2 = tell(FILE);
    seek(FILE, $pos2, 0);
    read(FILE, $buffer2, $CHUNK_SIZE);

    # seek back to the original position so that in the next iteration
    # we're in the right place. this is NOT the best way to do this, and
    # I will revise it. re-reading a chunk is unnecessary here; I just
    # got lazy
    seek(FILE, $pos, 0);
    read(FILE, $buffer, $CHUNK_SIZE);

    # this conditional checks one buffer ahead. if the buffer ahead is
    # shorter than the end-of-file text, it may contain the tail of our
    # marker. if it does, check whether the current buffer ends with the
    # head of the marker. if so, print the current buffer, omitting the
    # marker, and quit the loop
    if (length($buffer2) < $EOF_LEN) {
        if ($buffer2 eq substr($END_OF_FILE, $EOF_LEN - length($buffer2))) {
            if (substr($buffer, length($buffer) - ($EOF_LEN - length($buffer2)))
                    eq substr($END_OF_FILE, 0, $EOF_LEN - length($buffer2))) {
                print substr($buffer, 0,
                             length($buffer) - ($EOF_LEN - length($buffer2)));
                last;
            }
        }
    }

    # assuming the buffer ahead doesn't contain part of our marker, check
    # whether the current buffer ends with the whole marker. if so, print
    # the buffer, omitting the marker, and quit the loop; if not, just
    # print the buffer
    if (substr($buffer, length($buffer) - $EOF_LEN) eq $END_OF_FILE) {
        print substr($buffer, 0, length($buffer) - $EOF_LEN);
        last;
    } else {
        print $buffer;
    }

    $count++;
}

This technique works quite well, and I've tested it many times without
issues. I hope it helps someone out there.

I also realize this may not be the best way to do it. If anyone has
other suggestions, throw 'em at me!
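In that spirit, one way to avoid the double read-and-seek above is to hold back the last length($marker)-1 bytes of each chunk, so a marker split across two reads is still caught in one index() call. This is an editor's sketch of the same marker technique, not burak's code; all names, the marker, and the 100 ms poll interval are illustrative:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stream a growing file to $out until an end-of-file marker appears.
# A carry-over buffer of length($marker)-1 bytes is kept between reads
# so a marker straddling two chunks is still detected. All names here
# are illustrative.
sub stream_until_marker {
    my ($in, $out, $marker, $chunk_size) = @_;
    my $pending = '';
    while (1) {
        my $n = read($in, my $buf, $chunk_size);
        if ($n) {
            $pending .= $buf;
            my $at = index($pending, $marker);
            if ($at >= 0) {
                # Marker found: emit everything before it and stop.
                print {$out} substr($pending, 0, $at);
                return;
            }
            # Emit all but the last length($marker)-1 bytes; keep those
            # in case the marker straddles this read and the next one.
            my $keep = length($marker) - 1;
            if (length($pending) > $keep) {
                print {$out} substr($pending, 0, length($pending) - $keep);
                $pending = substr($pending, length($pending) - $keep);
            }
        }
        else {
            # No new data yet: the generator may still be writing.
            # Clear the handle's EOF state and wait briefly.
            seek($in, 0, 1);
            select(undef, undef, undef, 0.1);    # sleep 100 ms
        }
    }
}
```

In the CGI script, $out would be STDOUT (with binmode applied to both handles first), and the generator would append the marker as its very last write.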
 
