ISO CD Image file is being sent as HTML

jstorta

I have the following link in a document

<A href="mycdimage.iso">Test Image</A>

When I click on it in a browser it opens it as though it were an HTML
file and starts putting random characters in the window.

If I right click and select Save Target As, it allows me to save it but
says the file type is HTML and tacks on an htm extension to it.

I have gone to other sites and can download iso images without any
problems so I've determined that the problem is on my web server not
the browser.

The web server uses Apache Tomcat and the document is a JavaServer
Page (JSP). This link is just straight HTML code though.


Is there something I am supposed to include in the document or anchor
tag to indicate that this is a binary file that should be saved when
clicked or does this sound like a tomcat/JSP problem?

Thanks,
 
Jim Higson

jstorta said:
I have the following link in a document

<A href="mycdimage.iso">Test Image</A>

When I click on it in a browser it opens it as though it were an HTML
file and starts putting random characters in the window.

If I right click and select Save Target As, it allows me to save it but
says the file type is HTML and tacks on an htm extension to it.

I have gone to other sites and can download iso images without any
problems so I've determined that the problem is on my web server not
the browser.

The web server uses Apache Tomcat and the document is a java server
page. This link is just straight HTML code though.


Is there something I am supposed to include in the document or anchor
tag to indicate that this is a binary file that should be saved when
clicked or does this sound like a tomcat/JSP problem?

You need to configure the server to send out the right Content-Type header
with the ISO. This is how the browser knows what type the file is.
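
To make that concrete, here is a minimal sketch of a server that labels the image correctly. It uses Python's built-in http.server rather than Tomcat, purely for illustration; the handler name, the port, the placeholder bytes and the choice of application/octet-stream are all assumptions, not details of the original setup. The Content-Disposition header is an optional extra that asks the browser to save the file rather than display it. On Tomcat itself the usual place for this is a mime-mapping entry for the iso extension in web.xml.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the real image bytes.
FAKE_ISO = b"CD001"


def download_headers(filename, size):
    """Headers that tell a browser to save the body instead of rendering it."""
    return [
        ("Content-Type", "application/octet-stream"),
        ("Content-Disposition", f'attachment; filename="{filename}"'),
        ("Content-Length", str(size)),
    ]


class IsoHandler(BaseHTTPRequestHandler):
    """Serves one fake ISO at /mycdimage.iso and 404s everything else."""

    def do_GET(self):
        if self.path != "/mycdimage.iso":
            self.send_error(404)
            return
        self.send_response(200)
        for name, value in download_headers("mycdimage.iso", len(FAKE_ISO)):
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(FAKE_ISO)


def serve(port=8000):
    """Run the demo server until interrupted (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), IsoHandler).serve_forever()
```

Calling serve() and then following a link to http://127.0.0.1:8000/mycdimage.iso should produce a save dialog instead of a window full of random characters.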
 
Jonathan N. Little

Switchy said:
Do you use Mozzila as your web browser?

Don't top post please.

A: Not according to his header

X-HTTP-UserAgent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
.NET CLR 1.1.4322),gzip(gfe),gzip(gfe)
 
Neredbojias

With neither quill nor qualm, jstorta quothed:
I have the following link in a document

<A href="mycdimage.iso">Test Image</A>

When I click on it in a browser it opens it as though it were an HTML
file and starts putting random characters in the window.

If I right click and select Save Target As, it allows me to save it but
says the file type is HTML and tacks on an htm extension to it.

I have gone to other sites and can download iso images without any
problems so I've determined that the problem is on my web server not
the browser.

The web server uses Apache Tomcat and the document is a java server
page. This link is just straight HTML code though.


Is there something I am supposed to include in the document or anchor
tag to indicate that this is a binary file that should be saved when
clicked or does this sound like a tomcat/JSP problem?

It sounds like the server is not exercising the right isometrics.
 
Switchy

The reason that I asked you if you use Mozzila is:

IE can recognize a lot of extensions,
otherwise, with mozzila you have to contact your provider
who can define the extension on server.

That happens to me with MSI (Microsoft Installer) extension.

DO not use Mozzila is just a game for kids.
 
Alan J. Flavell

Switchy said:
The reason that I asked you if you use Mozzila is:

IE can recognize a lot of extensions,

In clear text: MSIE violates a mandatory requirement of the HTTP
protocol.

Switchy said:
otherwise, with mozzila you have to contact your provider
who can define the extension on server.

What you're thinking of is true for all WWW-compatible browsers, by
definition. Not only Mozilla.

Switchy said:
That happens to me with MSI (Microsoft Installer) extension.

Browsers usually provide a way for the recipient to download content
from a link, even if the sender gets the type wrong. But the *user*
has to ask for it deliberately (e.g. shift-click). The HTTP protocol
prohibits browsers from unilaterally overriding the server-provided
content type.

Switchy said:
DO not use Mozzila is just a game for kids.

You'd have to draw your own conclusions about whether to take advice
from someone who can't even spell Mozilla, let alone caring to follow
netiquette, and can't seem to recognise the difference between a
WWW-compatible browser and an operating system component which, all
too frequently, is too clever for the security of its users.

"Switchy" doesn't even know what principles MSIE uses in violating
this mandatory requirement. Looking at the filename extension is
pretty much its *last* resort when guessing at content type - as MS's
own documentation would have told him.

cheers
 
Neredbojias

With neither quill nor qualm, Switchy quothed:
DO not use Mozzila is just a game for kids.

Oh, the Humanities!

Are you serious? Firefox is the best browser out there. Come out from
under that rock and smell the new millennium air!
 
Switchy

Yes,

where is the basic alt="text" option?


Neredbojias said:
With neither quill nor qualm, Switchy quothed:


Oh, the Humanities!

Are you serious? Firefox is the best browser out there. Come out from
under that rock and smell the new millennium air!
 
Jonathan N. Little

Stop top posting please.

Switchy wrote:
Yes,

where is the basic alt="text" option?

What, do you mean showing the 'tool tip' with an image's alt text? If
so, the reason is that it is not supposed to. The alt text is for when
the image is unavailable. If you want the tool tip, use the title
attribute. It doesn't have to be on an image either.

<h1 title="Heading tooltip!">Test Tooltip</h1>
 
Jim Higson

Switchy said:
The reason that I asked you if you use Mozzila is:

IE can recognize a lot of extensions,

Just a note, but file extensions in the URL aren't really a very good way to
decide what type the content is. When Tim Berners-Lee designed the WWW, he
decided to use an HTTP header ("Content-Type") instead.

Off the top of my head, there are a few reasons why putting too much faith
in extensions isn't a very good idea on the WWW:

1) A lot of servers like scripts to have a certain extension, but this
rarely indicates the type of content being served. For example, a URL that
ends in ".php" could serve an (X)HTML page, a CSS sheet, a PNG image, plain
text, an SVG image... anything! This isn't just an academic thing, for
example check out the page on a site I run:

http://surfcore.co.uk/node/293

All the URLs for user images except the first one end in ".php" because they
are scaled as requested.

2) HTTP has something called content negotiation (the Apache implementation
of this is called MultiViews). Using content negotiation,
http://example.com/images/me might return an SVG to very modern browsers, a
PNG to recent-ish ones and a GIF to very old ones. The browser should tell
the server what it supports when it requests the image.

3) It is in many cases a bad idea to have the file extensions in URLs since
it is an implementation detail and not of interest to most users. Serving
ISO images is a bit of an exception to this because the user *is*
interested in the type of the file.

4) Which extension goes with which file type is really only a convention,
whereas MIME types are (mostly) formally registered.

5) Only really MS Windows uses file extensions to decide what type a file
is. Unix typically looks at the contents of the file itself to decide.
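
Point 2 can be sketched in a few lines. This is a simplified take on server-driven negotiation, assuming the server holds the same image in three formats; the function names and the list of formats are made up for illustration, and real implementations such as Apache's handle more cases (specificity rules, the other Accept-* headers).

```python
def parse_accept(header):
    """Return {media_type: q} from an Accept header; a missing q means 1.0."""
    prefs = {}
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[fields[0].lower()] = q
    return prefs


def negotiate(accept_header, available):
    """Pick the available type the client rates highest, or None.

    `available` is listed in the server's own order of preference,
    which is used to break ties between equally rated types.
    """
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media_type in available:
        main = media_type.split("/")[0]
        # Exact match first, then "image/*", then "*/*".
        q = prefs.get(media_type, prefs.get(main + "/*", prefs.get("*/*", 0.0)))
        if q > best_q:
            best, best_q = media_type, q
    return best
```

With the available formats listed as image/svg+xml, image/png and image/gif, a browser sending "image/png, image/gif;q=0.5" would be given the PNG, and one sending only "image/gif" would get the GIF, all from the same URL.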

Switchy said:
otherwise, with mozzila you have to contact your provider
who can define the extension on server.

Not necessarily. This can be done in .htaccess. The server is at fault
though: it is telling the browser that the content is "text/html" and the
browser is believing it. It might be frustrating in the short term, but in
the long run I find web development less frustrating if the browser
believes what the server tells it instead of trying to second-guess it all
the time.

Hope this is of interest,
Jim
 
Alan J. Flavell

Jim Higson said:
Just a note, but file extensions in the URL aren't really a very
good way to decide what type the content is. When Tim Berners-Lee
designed the WWW, he decided to use an HTTP header ("Content-Type")
instead.

Right; but in the interests of accuracy, RFC2616 is an IETF
standards-track RFC - which means it behoves all Internet users to
observe its requirements, no matter what their opinion of TimBL and
the W3C might happen to be.

[good points snipped...]
5) Only really MS Windows uses file extensions to decide what type a
file is.

MS Windows does, it's true, but MSIE generally doesn't, as their
documentation shows:
http://msdn.microsoft.com/workshop/networking/moniker/overview/appendix_a.asp

In trying to guess what the content really might be, the filename
extension is pretty much its last resort.

Jim Higson said:
Unix typically looks at the contents of the file itself to decide.

Well, yes; but a very common way to use Apache is with the MIME
content-type determined by the filename extension *at the server*
(which might or might not appear in the associated URL, as you say).
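
As a rough sketch of that server-side lookup: the table below is a hypothetical three-entry stand-in for a real mapping file such as Apache's mime.types, and the function name and default are assumptions. It also shows how the original symptom can arise, when an extension is missing from the table and the fallback type happens to be text/html.

```python
import os

# Hypothetical mapping table; a real server ships a much longer one.
MIME_BY_EXTENSION = {
    ".html": "text/html",
    ".png": "image/png",
    ".iso": "application/octet-stream",
}


def content_type_for(path, default="application/octet-stream"):
    """Choose the Content-Type to send from the file's extension."""
    extension = os.path.splitext(path)[1].lower()
    return MIME_BY_EXTENSION.get(extension, default)


# An unmapped extension falls through to whatever the default is:
# content_type_for("setup.msi", default="text/html") gives "text/html".
```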

There isn't any HTTP content type which clearly means "the receiving
OS should guess", so the issue of how a unix-type operating system
might guess is mostly off-topic. If *and only if* (in the words of
RFC2616) the sender has omitted to provide a Content-type header, the
client agent is permitted to guess - but this is already a dubious
situation, since RFC2616 told the sender that they SHOULD provide an
appropriate Content-type header.
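
The rule in that paragraph can be written down as a small decision function. This is only an illustration of the logic described, not how any particular browser is implemented; both function names and the toy sniffer are assumptions.

```python
def effective_type(headers, body, guess):
    """Return the media type a well-behaved client should act on.

    headers: dict of response header names to values
    body:    response bytes, only inspected when no type was declared
    guess:   hypothetical sniffing function, bytes -> media type
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    declared = lowered.get("content-type", "").strip()
    if declared:
        # Drop parameters such as "; charset=..." and trust the sender.
        return declared.split(";")[0].strip().lower()
    # Only when no type was declared is the client free to guess.
    return guess(body)


def naive_guess(body):
    """Toy sniffer (assumption): HTML if it starts like HTML, else binary."""
    start = body.lstrip().lower()
    if start.startswith((b"<!doctype html", b"<html")):
        return "text/html"
    return "application/octet-stream"
```

Given a response labelled text/html whose body is really an ISO image, this function still returns text/html, which is exactly the behaviour the original poster saw.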

I hadn't seen an HTTP 200 response without a Content-type header for
many years. I *did* see one quite recently - and, guess what, the
server that sent it said that it was IIS. Yet another case of
software from Galactic HQ spitting in the face of the Internet
specifications.

The only remaining situation where it's doubtful what RFC2616 says
should happen, is application/octet-stream. Some say that this can
only be saved to file, since it can't be unambiguously associated with
any viewer or application at the client side. Others say that the
wording of RFC2616 doesn't actually disallow the client agent trying
to guess what it is. (I'm fairly agnostic on this point, but this
isn't about me.)

Jim Higson said:
Not necessarily. This can be done in htaccess. The server is at
fault though, it is telling the browser that the content is
"text/html" and the browser is believing it.

RFC2616 leaves the browser only two choices: either treat it as what
it claims to be, or reject it. In practical terms, "reject it" could
mean appealing to the user and getting their informed consent to
proceed on the basis of what the stuff appears to be, rather than what
the sender claims it to be. The majority of WWW users would,
admittedly, be in no position to take a proper decision on that
"informed" consent; but if all client agents (including the operating
system component that thinks it's a web browser) were behaving in
accordance with RFC2616, then this situation simply wouldn't arise,
since everyone providing content would see the problem as soon as they
tried to access it themselves, and would correct it forthwith.
 
