ftplib - Did the whole file get sent?

Sean DiZazzo

Hi,

I have some scripts that send files via ftplib to a client's ftp
site. The scripts have generally worked great for a few years.
Recently, the client complained that only part of an important file
made it to their server. My boss got this complaint and brought it to
my attention.

The first thing I did was track down the specific file transfer in my
logs. My log showed a success and I told my boss that, but he wasn't
satisfied with my response. He began asking if there is a record of
the file transfer ack and the number of bytes sent for this transfer.
I'm not keeping a record of that, only success or failure (and some
output).

How can I assure him (and the client) that the transfer completed
successfully like my log shows? I'm using code similar to the
following:

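import ftplib

try:
    ftp = ftplib.FTP(host)
    # "pass" is a reserved word in Python, so use another name
    ftp.login(user, password)
    ftp.storbinary("STOR " + destfile, open(f.path, 'rb'))
    # log this as success
except ftplib.all_errors:
    pass  # log this as an error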

Is ftplib reliable enough to say that if an exception is not thrown,
that the file was transferred in full?

Python 2.4.3 (#1, Sep 17 2008, 16:07:08)
Red Hat Enterprise Linux Server release 5.3 (Tikanga)

Thanks for your thoughts.

~Sean
 
Steven D'Aprano

How can I assure him (and the client) that the transfer completed
successfully like my log shows?

"It has worked well for many years, there are no reported bugs in the ftp
code, and the logs show the file was transferred correctly. Unless you
can reproduce the failure, there's no evidence that my code is to blame
for the failure. How do you know that the original file wasn't corrupted
before the transfer? Or that the copy wasn't corrupted after the
transfer? Does it copy correctly if you try again? Can you give me a copy
of the file for testing?"



[...]
Is ftplib reliable enough to say that if an exception is not thrown,
that the file was transferred in full?

Python 2.4 is pretty old. Have you checked the bug tracker to see if
there are any reported bugs in ftplib that might be relevant? Or the
What's New for 2.5, 2.6 and 2.7? The source code for ftplib seems fairly
straightforward to me -- if there was an error, I can't see that it could
have been suppressed.

But really, unless you can reproduce the error, I'd say the error lies
elsewhere.
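
For the "record of the ack and number of bytes sent" that was asked for, here is a minimal sketch of what could be logged per transfer. It assumes Python 3 (the callback argument of storbinary is not available in the Python 2.4 used above), and the function name upload_and_record and the dictionary keys are made up for illustration. storbinary returns the server's final reply line on the control connection and raises an ftplib exception if that reply is an error, so the returned string is the closest thing FTP has to an application-level ack.

```python
import ftplib


def upload_and_record(ftp, local_path, remote_name):
    """Upload one file and return a small record suitable for logging.

    `ftp` is expected to be a logged-in ftplib.FTP instance.
    """
    sent = [0]  # mutable counter updated from the callback

    def count(block):
        # storbinary calls this once for every block it has handed to the socket
        sent[0] += len(block)

    with open(local_path, "rb") as fh:
        # The return value is the server's final reply, typically a 226 line
        reply = ftp.storbinary("STOR " + remote_name, fh, callback=count)

    return {"remote": remote_name, "bytes_sent": sent[0], "reply": reply}
```

Writing the returned record to the log next to the success message gives a byte count and the server's own completion reply to point at later. Note that bytes_sent counts what was handed to the local socket, not what the server wrote to disk.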
 
Sean DiZazzo

How can I assure him (and the client) that the transfer completed
successfully like my log shows?

"It has worked well for many years, there are no reported bugs in the ftp
code
[...]

Thanks for your advice, Steven. I agree with you, and did take that
approach to start with. Then the request for the actual FTP "ack"
caught me off guard. I had to break out Wireshark and run a few tests
just to make sure I knew exactly what I was talking about.

I think I will try to explain that asking for the "ack" is not really
a valid request. Something like this:

"Technically, these messages are used only on the lowest level of the
FTP protocol itself. Any client or library implementing FTP would be
sending these messages under the covers...in this case I think it's
done in the socket library. It is possible that there is a bug in the
Python FTP library, just like it's possible there is a bug in any
other FTP client. Considering how long this library has been around
(~15-20 years), and how often it is used, it is very unlikely that a
bug causing a partial transfer but showing a success has managed to
stick around for so long."

Does that make sense?

Python 2.4 is pretty old. Have you checked the bug tracker to see if
there are any reported bugs in ftplib that might be relevant? Or the
What's New for 2.5, 2.6 and 2.7? The source code for ftplib seems fairly
straightforward to me -- if there was an error, I can't see that it could
have been suppressed.

But really, unless you can reproduce the error, I'd say the error lies
elsewhere.

I'll check bugs and What's New before sending any response. The more I
think about this, I am beginning to think that he is just trying to
find someone to blame for a problem, and has chosen me.

Thanks again.

~Sean
 
Brendan

Your boss is both moron and wanker.
 
sjm

I follow every ftp put (STOR) with a dir command. Then if the
recipient claims that they never got it (or did not get all of it), I
have evidence that they did and that their byte count is the same as
mine.

This does not entirely guarantee that the ftp was perfect but it goes
a long way. It also provides useful information if there is a
problem.
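
A rough sketch of that check with ftplib, assuming Python 3 and a server that answers SIZE and accepts LIST with a file name (both are common but not guaranteed). The function name verify_upload is made up for illustration.

```python
import os


def verify_upload(ftp, local_path, remote_name):
    """Compare local and remote sizes and capture the server's listing line.

    Returns (sizes_match, local_size, remote_size, listing_lines).
    `ftp` is expected to be a logged-in ftplib.FTP instance.
    """
    local_size = os.path.getsize(local_path)

    # Some servers refuse SIZE in ASCII mode, so switch to binary first
    ftp.voidcmd("TYPE I")
    remote_size = ftp.size(remote_name)  # int, or None if the reply is unusable

    # Keep the raw directory listing as evidence for the log
    listing = []
    ftp.retrlines("LIST " + remote_name, listing.append)

    return remote_size == local_size, local_size, remote_size, listing
```

Logging all four values after each STOR records what the server itself reported at the time, which is the evidence described above.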

HTH,
SJM
 
John Nagle

Is ftplib reliable enough to say that if an exception is not thrown,
that the file was transferred in full?

No.

This was for years an outstanding problem with FTP under Windows.
See:

http://www.fourmilab.ch/documents/corrupted_downloads/
http://us.generation-nt.com/answer/incomplete-ftp-upload-under-windows-xp-help-139017881.html
http://winscp.net/forum/viewtopic.php?t=6458

Many FTP implementations have botched this. TCP has all the machinery
to guarantee that both ends know the transfer completed properly, but
it's often misused.

Looking at the Python source, it doesn't look good. The "ftplib"
module does sending by calling sock_sendall in "socketmodule.c".
That does an OS-level "send", and once everything has been sent,
returns.

But an OS-level socket send returns when the data is queued for
sending, not when it is delivered. Only when the socket is closed,
and the close status checked, do you know if the data was delivered.
There's a final TCP close handshake that occurs when close has
been called at both ends, and only when it completes successfully
do you know that the data has been delivered.
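
To illustrate the difference between "queued" and "delivered" at the plain socket level, here is a small self-contained sketch, assuming Python 3 and a platform that provides socket.socketpair(). It is not FTP and not what ftplib does; the function names are made up. The sender half-closes after sending and then waits for the peer to report how many bytes it actually read, which is one way to get positive confirmation instead of trusting that sendall returned.

```python
import socket
import threading


def receiver(sock, seen):
    """Read until the sender half-closes, then report the byte count back."""
    total = 0
    while True:
        chunk = sock.recv(65536)
        if not chunk:  # EOF: the sender called shutdown(SHUT_WR)
            break
        total += len(chunk)
    seen.append(total)
    sock.sendall(str(total).encode("ascii"))  # application-level acknowledgement
    sock.close()


def send_and_confirm(sock, payload):
    """Send payload, half-close, and return the byte count the peer reports."""
    sock.sendall(payload)          # returns once the data is queued locally
    sock.shutdown(socket.SHUT_WR)  # signal end of data, keep the read side open
    reply = b""
    while True:
        chunk = sock.recv(64)
        if not chunk:              # peer closed after sending its count
            break
        reply += chunk
    sock.close()
    return int(reply.decode("ascii"))


def demo(payload):
    """Run sender and receiver over a local socket pair."""
    a, b = socket.socketpair()
    seen = []
    worker = threading.Thread(target=receiver, args=(b, seen))
    worker.start()
    confirmed = send_and_confirm(a, payload)
    worker.join()
    return confirmed, seen[0]
```

In FTP the equivalent confirmation travels on the control connection as the server's final reply to STOR, so the data connection itself never carries a count like this.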

At the socket level, this is performed by "shutdown" (which
closes the connection and returns the proper network status
information), or by "close" (which forces a shutdown but doesn't
return status).

Look at sock_close in "socketmodule.c". Note that it ignores the
return status on close, always returns None, and never raises an
exception. As the Linux manual page for "close" says:
"Not checking the return value of close() is a common but nevertheless
serious programming error. It is quite possible that errors on a
previous write(2) operation are first reported at the final close(). Not
checking the return value when closing the file may lead to silent loss
of data."

"ftplib", in "storlines" and "storbinary", calls "close"
without calling "shutdown" first. So if the other end disconnects
after all data has been queued but not received, the sender will
never know. FAIL.

So there's your bug.
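
One possible workaround, as a sketch only: drive the data connection by hand through ftplib's transfercmd and call shutdown before close, then read the server's final reply. This assumes Python 3 and a plain (non-TLS) ftplib.FTP object; stor_with_shutdown is a made-up name, not part of ftplib, and whether the extra shutdown catches the failure in this thread is untested.

```python
import socket


def stor_with_shutdown(ftp, cmd, fileobj, blocksize=8192):
    """Send fileobj over a data connection, shutting down before closing.

    `ftp` is expected to be a logged-in ftplib.FTP instance and `cmd`
    a command such as "STOR name". Returns the server's final reply.
    """
    ftp.voidcmd("TYPE I")        # binary mode, as storbinary does
    conn = ftp.transfercmd(cmd)  # opens the data connection and sends cmd
    try:
        while True:
            buf = fileobj.read(blocksize)
            if not buf:
                break
            conn.sendall(buf)
        # Explicit end-of-data; an OSError here is raised instead of being lost
        conn.shutdown(socket.SHUT_WR)
    finally:
        conn.close()
    # Raises an ftplib exception unless the server answers with a 2xx reply
    return ftp.voidresp()
```

Logging the returned reply line together with a size check after the upload gives two independent signs that the server saw the whole file.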

John Nagle
 
Lawrence D'Oliveiro

Look at sock_close in "socketmodule.c". Note that it ignores the
return status on close, always returns None, and never raises an
exception. As the Linux manual page for "close" says:
"Not checking the return value of close() is a common but nevertheless
serious programming error. It is quite possible that errors on a
previous write(2) operation are first reported at the final close(). Not
checking the return value when closing the file may lead to silent loss
of data."

The close call is the wrong place to report such errors. For output, there
should be some kind of flush-output call that you can use to get the error.
Close should just unconditionally tear down the connection, which you can do
whether the prior transfers were successful or not.
 
