FTP Upload replacement


Roedy Green

FTP uploads have five problems:

1. It doesn't properly preserve timestamps.

2. If a file is being downloaded when you try to upload a replacement,
the upload fails.

3. It is pretty slow, since it has a per-file overhead that often
overshadows the time to transmit the file body.

4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

5. No compression.

What are people using?
--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Andrew Thompson

FTP uploads have five problems:
...
What are people using?

Upload as a single .war file? (I am actually
only guessing that would address most of the
issues mentioned.)

I use FileZilla for the pscode.org site. It
is not such a big/complex site, and I need an
FTP client that can do FTPES - FTP over explicit
TLS/SSL to connect. Though I have the option to
deploy as a WAR file, I do not use that option.
 

Arne Vajhøj

Roedy said:
FTP uploads have five problems:

1. It doesn't properly preserve timestamps.

2. If a file is being downloaded when you try to upload a replacement,
the upload fails.

3. It is pretty slow, since it has a per-file overhead that often
overshadows the time to transmit the file body.

4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

5. No compression.

What are people using?

1,2 and 5 can be solved by using HTTP and a special upload script.
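
As a sketch of what the client side of such a script could look like in Java
(the /upload URL, the X-Mtime header and the temp-name-then-rename behaviour on
the server are assumptions for illustration, not anything described in this thread):

import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

// Sketch only: a hypothetical server-side /upload script would gunzip the body,
// write it to a temporary name, rename it into place and apply the X-Mtime header.
public class HttpUpload {
    public static void upload(Path file, String baseUrl) throws IOException {
        URL url = new URL(baseUrl + "/upload?name=" + file.getFileName());
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("PUT");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Encoding", "gzip");      // point 5: compression
        con.setRequestProperty("X-Mtime",                        // point 1: timestamp
                Long.toString(Files.getLastModifiedTime(file).toMillis()));
        try (OutputStream out = new GZIPOutputStream(con.getOutputStream())) {
            Files.copy(file, out);                               // gzip the file body
        }
        if (con.getResponseCode() != 200) {
            throw new IOException("upload failed: " + con.getResponseCode());
        }
        con.disconnect();
    }

    public static void main(String[] args) throws IOException {
        upload(Paths.get(args[0]), args[1]);
    }
}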

I am not sure that I understand 3. I cannot imagine any
protocol that sends fewer bytes than FTP. The data connection
only sends the file content. The control connection sends 7 bytes
plus the file name. That is really low overhead.

I think it is standard to take the site down while it is being
upgraded. Otherwise, install the new version in a different location
and switch over when done (even though that can still create small
inconsistencies for pages that are fetched via multiple HTTP
requests).

Arne
 

Arne Vajhøj

Andrew said:
Upload as a single .war file? (I am actually
only guessing that would address most of the
issues mentioned.)

It is not a bad option if it is a Java web app.

(which it should be, given the group!)

It bundles all the files into one big file, and the container shuts
down the app, updates it and restarts it automatically when done.

Arne
 

Martin Gregorie

FTP uploads have five problems:

1. It doesn't properly preserve timestamps.

2. If a file is being downloaded when you try to upload a replacement,
the upload fails.

3. It is pretty slow, since it has a per-file overhead that often
overshadows the time to transmit the file body.

4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

5. No compression.

What are people using?

FTP, but using the gftp client. The Windows equivalent is WS-FTP.

gftp is a graphical client with good batching capabilities - it
will transfer a directory structure recursively and do it quickly. I'm
quite happy with the FTP timestamps reflecting local update time because
I very rarely look at them. My pages and images are small enough that the
lack of compression isn't a problem.

I can't run scripts on my web host. The standard sftp client only works
one file/directory at a time and I'm far too lazy to create, test and run
one-off scripts for each update. As a result, using gftp to handle FTP
transfers is the best solution for my situation.

My website is hierarchic with each topic and sub topic in a separate
directory. This means that the directories form a tree structure that
tends to reflect the links between pages. I can avoid momentarily broken
links during updates by working along each branch toward the root as I do
the FTP transfers. If I'm adding a new topic or subtopic there are no
conflicts because this way of doing the update means that the referencing
page is copied after the new branch has been completely transferred.
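
A rough Java sketch of that ordering, with a hypothetical Uploader interface
standing in for whatever actually transfers the files (this is one reading of
the scheme, not Martin's code): recurse to the leaves first, so a referencing
page is only sent after the branch it links to is complete.

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BranchFirstUpload {

    /** Hypothetical stand-in for whatever actually sends a file (gftp, an FTP library, ...). */
    public interface Uploader {
        void put(Path localFile) throws IOException;
    }

    /** Upload the deepest directories first, then the files that reference them,
     *  so links never point at files that have not arrived yet. */
    public static void uploadBranchFirst(Path dir, Uploader uploader) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    uploadBranchFirst(entry, uploader);   // sub-topics first
                } else {
                    files.add(entry);                     // defer this level's pages
                }
            }
        }
        for (Path file : files) {
            uploader.put(file);                           // referencing pages last
        }
    }
}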
 

Roedy Green

Upload as a single .war file? (I am actually
only guessing that would address most of the
issues mentioned.)

Does that let you serve vanilla HTTP?
--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Lew

Roedy said:
FTP uploads have five problems:

1. It doesn't properly preserve timestamps.

2. If a file is being downloaded when you try to upload a replacement,
the upload fails.

3. It is pretty slow, since it has a per-file overhead that often
overshadows the time to transmit the file body.

4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

5. No compression.

What are people using?

I frequently use scp, which has a "-p" option that "[p]reserves modification
times, access times, and modes from the original file." (from "man scp"). It
leverages SSH options such as "Compression" (also via the scp "-C" flag) and
"CompressionLevel". It seems plenty fast, and has similar syntax to good ol'
cp. I don't think it does anything about points 2 and 4, but I have to wonder
if any upload protocol would. Packaging the upload as a WAR/EAR is great for
Java EE hosts for atomicity, but I don't know of similar solutions for
straight Web servers. Any solution that would work with a local server would
work with FTP or scp; it's independent of how you transfer files.
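
A rough Java equivalent of "scp -p", as a sketch using the JSch library (not
something mentioned above); it uploads over SFTP and then sets the remote
modification time explicitly:

import java.io.File;
import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

// Sketch only; host, user, password and paths are placeholders.
public class SftpPutPreserveTime {
    public static void main(String[] args) throws Exception {
        String host = "example.com", user = "me", password = "secret";
        String local = "index.html", remote = "/var/www/index.html";

        Session session = new JSch().getSession(user, host, 22);
        session.setPassword(password);
        session.setConfig("StrictHostKeyChecking", "no"); // demo only, not for production
        session.connect();
        try {
            ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
            sftp.connect();
            sftp.put(local, remote);                                    // transfer the bytes
            int mtime = (int) (new File(local).lastModified() / 1000L); // seconds since epoch
            sftp.setMtime(remote, mtime);                               // preserve timestamp (point 1)
            sftp.disconnect();
        } finally {
            session.disconnect();
        }
    }
}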
 

Roedy Green

I am not sure that I understand 3. I cannot imagine any
protocol that sends fewer bytes than FTP. The data connection
only sends the file content. The control connection sends 7 bytes
plus the file name. That is really low overhead.

By experiment I have found that sending 100 files of 1 byte takes
orders of magnitude longer than one file of 100 bytes. The problem may
be delays in the handshaking, so that the line is idle most of the
time. Of course part of the problem is that you have the file names to
send.
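
For a rough sense of scale (illustrative numbers, not measurements): if each
STOR costs three or four round trips on a 100 ms link before any data flows,
100 tiny files pay that latency 100 times, say 30-40 seconds of idle line,
while a single 100-byte file pays it once and is done in well under a second.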

I upload the same files both as raw HTML files and Replicator zipped
archives. The Replicator uploads are extremely quick in comparison. I
think it has more to do with fewer files than the compression.

Zip format has its own timestamp problems: 2-second resolution,
vagueness about the time zone, and vagueness about whether to
consider DST for present and past files.
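
For what it's worth, the old DOS date/time field in the zip header is where the
2-second resolution and the timezone vagueness come from; in Java,
ZipEntry.setLastModifiedTime (Java 8+) also writes an extended timestamp field
in UTC, which sidesteps the timezone and DST guesswork. A small sketch:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch: add one file to a zip while keeping a usable timestamp.
public class ZipWithTimestamps {
    public static void main(String[] args) throws IOException {
        Path source = Paths.get("index.html");       // placeholder file name
        FileTime mtime = Files.getLastModifiedTime(source);

        try (ZipOutputStream zip =
                 new ZipOutputStream(Files.newOutputStream(Paths.get("site.zip")))) {
            ZipEntry entry = new ZipEntry(source.getFileName().toString());
            // entry.setTime(mtime.toMillis());       // DOS field only: 2 s resolution, local time
            entry.setLastModifiedTime(mtime);         // also writes the extended UTC timestamp
            zip.putNextEntry(entry);
            Files.copy(source, zip);
            zip.closeEntry();
        }
    }
}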

--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Lothar Kimmeringer

Roedy said:
FTP uploads have five problems:

1. It doesn't properly preserve timestamps.

You can use MFMT, where the timestamp is specified as UTC time.
An FTP client like FileZilla does that automatically in both
directions (i.e. downloaded files also keep the timestamp
they had on the server).
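
A sketch of sending MFMT from Java, assuming the Apache Commons Net FTPClient
(no particular library is named above) and a server that supports the command:

import java.io.File;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import org.apache.commons.net.ftp.FTPClient;

// Sketch only: after storing a file, push its local mtime to the server via MFMT.
public class FtpPreserveMtime {
    public static void setRemoteMtime(FTPClient ftp, File local, String remotePath)
            throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));           // MFMT timestamps are UTC
        String stamp = fmt.format(new Date(local.lastModified()));
        // e.g. "MFMT 20240101120000 /htdocs/index.html"
        int reply = ftp.sendCommand("MFMT", stamp + " " + remotePath);
        if (reply != 213) {                                     // 213 = file status reply
            System.err.println("server did not accept MFMT: " + ftp.getReplyString());
        }
    }
}
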
2. If a file is being downloaded when you try to upload a replacement,
the upload fails.

Only on Windows. On Unix systems, a file that is currently being downloaded
doesn't prevent you from uploading a replacement.
3. It is pretty slow, since it has a per-file overhead that often
overshadows the time to transmit the file body.

Really? To start an upload you send a PASV or EPSV and receive about
50 bytes back. After that you open a new data connection and transfer only
the raw bytes of the file. Compare those 50 bytes with an HTTP request
header (and multiply by two, because there is the HTTP response header
as well).
4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

Use a content management system where you can "publish" your uploaded
changes.
5. No compression.

I don't know, but maybe the TLS part of FTPS compresses, so you might
use that, gaining security as a bonus.
What are people using?

SCP/SFTP solves a lot of issues concerning firewalls. When talking
of file transfer in B2B scenarios, you have AS2 (mainly in the USA),
OFTP (mainly in Europe) and X.400 (which is being replaced by AS2 and
OFTP more and more).


Regards, Lothar
--
Lothar Kimmeringer E-Mail: (e-mail address removed)
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
questions!
 

RedGrittyBrick

Roedy said:
FTP uploads have five problems:

1. It doesn't properly preserve timestamps.

This isn't a problem with some FTP clients and servers. For example,
ncftp has a -y option [1] to try to use a "SITE UTIME" command to
preserve timestamps. I don't use this option but inconsistent timestamps
haven't caused me any noticeable problems so far.

2. If a file is being downloaded when you try to upload a replacement,
the upload fails.

I must be lucky (or the few websites I maintain using FTP are unpopular :)

3. It is pretty slow, since it has a per-file overhead that often
overshadows the time to transmit the file body.

This isn't a problem for me, perhaps because I rarely need to upload
enough small files for it to be an issue.

OTOH there are reasons why I use rsync for backups in preference to FTP.
When forced to use FTP to bulk copy large numbers of files I almost
always create a single archive file first and FTP that.

4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

This isn't a problem with some FTP clients which upload to a temporary
name and rename afterwards. For example, ncftp has -T and -S options for
this [1].

5. No compression.

See RFC 468 [2]. Some FTP clients and servers do support on-the-fly
compression [3]. It isn't a feature I use. Maybe this is less of an
issue with those high speed optical fibre WAN links you discussed before?

What are people using?

FileZilla or ncftp where I can, and elsewhere whatever FTP client or
server is actually available.



[1] http://www.ncftp.com/ncftp/doc/ncftpput.html
[2] http://www.faqs.org/rfcs/rfc468.html
[3] http://tinyurl.com/kqysq4
 

RedGrittyBrick

RedGrittyBrick said:
Roedy said:
4. During an upload you have an inconsistent website, e.g. with links
to files that don't yet exist. Ideally you want the entire upload
treated atomically -- applied as a lump, only after all the files are
present on the server.

This isn't a problem with some FTP clients which upload to a temporary
name and rename afterwards. For example, ncftp has -T and -S options for
this [1].

I misunderstood this. I deal with this issue by uploading new link
targets first. When I perform a wholesale restructuring, I usually
upload a "temporarily unavailable" homepage during the upload and don't
worry too much about deep links getting 404'd. If that became an issue
I'd look at uploading into a new document root directory and finishing
with a directory rename at root level, or reconfiguring the web server
to serve the site from the new directory (e.g. v1/, v1.1/, v2/ ...).

It may be possible to temporarily configure the server to do some
intelligent redirection of 404'd pages?
 

Arne Vajhøj

Lothar said:
Only on Windows. On Unix systems, a file that is currently being downloaded
doesn't prevent you from uploading a replacement.

All Unixes, all file systems and all FTP servers??

SCP/SFTP solves a lot of issues concerning firewalls. When talking
of file transfer in B2B scenarios, you have AS2 (mainly in the USA),
OFTP (mainly in Europe) and X.400 (which is being replaced by AS2 and
OFTP more and more).

X.400 is email ??


Arne
 

Arne Vajhøj

Roedy said:
By experiment I have found that sending 100 files of 1 byte takes
orders of magnitude longer than one file of 100 bytes. The problem may
be delays in the handshaking, so that the line is idle most of the
time. Of course part of the problem is that you have the file names to
send.

But that would be the same problem with almost any protocol.
I upload the same files both as raw HTML files and Replicator zipped
archives. The Replicator uploads are extremely quick in comparison. I
think it has more to do with fewer files than the compression.

Probably. Even though compression also helps.

Arne
 

Roedy Green

But that would be the same problem with almost any protocol.

If files were bundled into compressed archives and uploaded, you at
least get rid of the per-file handshaking.

The Replicator protocol, which does this, uploads and downloads way
faster than FTP.
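
A bare-bones Java sketch of that bundling idea (not the Replicator's actual
format, which isn't described here): walk the site directory and pack every
file into one compressed archive, so a single transfer replaces the per-file
handshakes.

import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class BundleSite {
    public static void main(String[] args) throws IOException {
        Path siteRoot = Paths.get("website");         // placeholder local site directory
        Path archive  = Paths.get("site-upload.zip");

        List<Path> files;
        try (Stream<Path> walk = Files.walk(siteRoot)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        try (OutputStream out = Files.newOutputStream(archive);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path p : files) {
                // store the path relative to the site root, with forward slashes
                String name = siteRoot.relativize(p).toString().replace('\\', '/');
                ZipEntry entry = new ZipEntry(name);
                entry.setLastModifiedTime(Files.getLastModifiedTime(p)); // keep mtimes
                zip.putNextEntry(entry);
                Files.copy(p, zip);
                zip.closeEntry();
            }
        }
        // upload site-upload.zip once; something on the server unpacks it into place
    }
}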

My ISP keeps promising me the right to run Java on my server. That has
been going on for years. If it eventually comes through I can use the
Replicator, extended slightly to upload ordinary files to my website.

--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Roedy Green

Really? To start an upload you send a PASV or EPSV and receive about
50 bytes back.

I think the problem is that there are delays where nothing is being
transmitted. I know this from observation, not from analysis of the
protocol.
--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Roedy Green

Only on Windows. On Unix systems, a file that is currently being downloaded
doesn't prevent you from uploading a replacement.


Odd. I am experiencing this with Unix BSD 4. Do you have a choice of
FTP servers?
--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Roedy Green

I must be lucky (or the few websites I maintain using FTP are unpopular :)

Exactly. Normally all is fine. But lately I have been getting 5 times
the normal volume.
--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Roedy Green

It may be possible to temporarily configure the server to do some
intelligent redirection of 404'd pages?

That would work for my most common case, which goes like this: I create a
new entry in the Java glossary. Then I put a link to it in perhaps 5
related Java glossary entries. Then I upload.

Ideally the 6 files should be uploaded, and once they are all safely
on the server, in one fell swoop the pointers should be switched to
the new replacements.
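
One way to approximate that with plain FTP, sketched here with the Apache
Commons Net client (an assumption, not anything from the thread): upload every
file under a temporary name first, then do all the renames in one quick sweep
at the end.

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.commons.net.ftp.FTPClient;

// Sketch: near-atomic update of a handful of related pages.
public class TwoPhaseFtpUpload {
    public static void upload(FTPClient ftp, Map<String, String> localToRemote)
            throws IOException {
        Map<String, String> pending = new LinkedHashMap<>();

        // Phase 1: the slow part, upload everything under temporary names.
        for (Map.Entry<String, String> e : localToRemote.entrySet()) {
            String tmp = e.getValue() + ".uploading";
            try (InputStream in = new FileInputStream(e.getKey())) {
                if (!ftp.storeFile(tmp, in)) {
                    throw new IOException("store failed: " + ftp.getReplyString());
                }
            }
            pending.put(tmp, e.getValue());
        }

        // Phase 2: a fast sweep of renames once everything is on the server.
        for (Map.Entry<String, String> e : pending.entrySet()) {
            if (!ftp.rename(e.getKey(), e.getValue())) {
                throw new IOException("rename failed: " + ftp.getReplyString());
            }
        }
    }
}

Whether RNTO will overwrite an existing file depends on the server, so the
inconsistent window only shrinks to the rename sweep; it does not disappear
entirely.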

It is not the end of the world that the new links don't work for a
while, except for the "what's new" link. The delay can become
considerable if I have rebuilt all the utilities.

I figure uploading to a website is something needed by tens of
thousands of people, and it should just work without a lot of futzing
about.

My biggest problem is that the upload simply derails if somebody is using
a file that changed. I have automated uploads, and it requires manual
fiddling to get things flowing again. If this happens while I am
asleep, the automated changes don't upload. JSP would fix this, since
the variability would not depend on uploading.



--
Roedy Green Canadian Mind Products
http://mindprod.com

"For reason that have a lot to do with US Government bureaucracy, we settled on the one issue everyone could agree on, which was weapons of mass destruction."
~ Paul Wolfowitz 2003-06, explaining how the Bush administration sold the Iraq war to a gullible public.
 

Lothar Kimmeringer

Arne said:
All Unixes, all file systems and all FTP servers??

I have never experienced problems like this except on Windows systems.
But I can't say it with absolute certainty, of course.
X.400 is email ??

It's a standard for a message transfer system that is quite a bit
older than SMTP and covers much more than the original SMTP standard.
But, yes, it's a kind of email system that is still quite common for
the transfer of EDI messages.


Regards, Lothar
--
Lothar Kimmeringer E-Mail: (e-mail address removed)
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
questions!
 
