Character encoding

Eustace

Sorry to bother you again, but something has happened to my webpage

http://www.geocities.com/emfrilingos/fril

and although I specify

<meta content="text/html; charset=ISO-8859-7" http-equiv="content-type">

when it is opened the character encoding is utf-8 and you have to change
it to Greek. The W3C Validator says:

=============
Character Encoding mismatch!

The character encoding specified in the HTTP header (utf-8) is different
from the value in the <meta> element (iso-8859-7).
=============

Where in the http header is the utf-8 specified and how can I change it?

This didn't happen a week ago, before I started working again on the code...

And it's not only the fact that the visitor would have to change the
encoding, the spacing between elements is also affected.

Thanks,

Eustace
 
Neredbojias

Well bust mah britches and call me cheeky, on Fri, 15 Feb 2008 09:24:41
GMT Eustace scribed:
Where in the http header is the utf-8 specified and how can I change
it?

This didn't happen a week ago, before I started working again on the
code...

And it's not only the fact that the visitor would have to change the
encoding, the spacing between elements is also affected.

Looks like a Geocities "feature". Do a view-source; see all that crap
"code" above/before your own markup? Does it belong in a valid page?

If you had a decent host, you might get decent results.
 
JWS

Eustace said:
Where in the http header is the utf-8 specified and how can I
change it?

It is specified by the web server, in the Content-Type header of the
HTTP response. That header takes precedence over the charset declared
in the <meta> element of your own page. Does Geocities allow changing
the default character set for pages served up by it?
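
To illustrate the precedence just described, here is a minimal Python sketch. The function names are made up for this example, and the exact header string is an assumption based on the validator message quoted earlier (the thread only says the header charset is utf-8). It is not what any browser or validator literally runs.

```python
from email.message import Message


def charset_of(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(http_content_type, meta_content):
    """Charset a client would use: the HTTP header wins over the <meta> value."""
    return charset_of(http_content_type) or charset_of(meta_content)


# Assumed header value; the meta value is the one from the original post.
header = "text/html; charset=utf-8"
meta = "text/html; charset=ISO-8859-7"

print(effective_charset(header, meta))          # charset taken from the header
print(charset_of(header) != charset_of(meta))   # True means the two disagree
```

If the server sent no charset at all, the second argument (the meta value) would be used instead, which is presumably what happened before the host changed its behaviour.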

If not, it might be a good idea to convert your page to utf-8. The
ISO encodings like 8859-7 are becoming obsolete, anyway. With
utf-8, you can mix all kinds of languages (Greek, English, German,
Japanese..) in one document, and all browsers nowadays understand
utf-8.

A good conversion tool for Linux is iconv. Very likely something
similar exists on Windows as well.
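
For anyone without iconv, the same conversion can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name is hypothetical, and it assumes the page really is ISO-8859-7 and declares its charset with the text "charset=ISO-8859-7" as in the original post.

```python
import re


def convert_page(raw: bytes) -> bytes:
    """Re-encode an ISO-8859-7 HTML page as UTF-8 and update its declared charset."""
    # Decoding raises UnicodeDecodeError if a byte is not valid ISO-8859-7.
    text = raw.decode("iso-8859-7")
    # Keep the <meta> declaration in step with the new encoding.
    text = re.sub(r"charset=ISO-8859-7", "charset=utf-8", text, flags=re.IGNORECASE)
    return text.encode("utf-8")


if __name__ == "__main__":
    # Three Greek letters (alpha, beta, gamma) as ISO-8859-7 bytes.
    sample = b'<meta content="text/html; charset=ISO-8859-7" http-equiv="content-type">\xe1\xe2\xe3'
    print(convert_page(sample).decode("utf-8"))
```

To convert a real file, read it in binary mode, pass the bytes to convert_page, and write the result back in binary mode.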

Regards, Jan
 
Jukka K. Korpela

Scripsit JWS:
Does geocities allow changing the default character set for pages
served up by it?

If not, it might be a good idea to convert your page to utf-8.

In any case, it might be a good idea to move the site to a decent
server.

Conversion to utf-8 is not trivial for an average web author who has
little idea of encodings. It is certainly possible, though.

The ISO encodings like 8859-7 are becoming obsolete, anyway.

Says who? You? Not ISO for sure.

Besides, you used ISO-8859-7 for your Usenet message. Probably by
accident, but still. :)
 
JWS

Jukka said:
Scripsit JWS:
The ISO encodings like 8859-7 are becoming obsolete, anyway.

Says who? You? Not ISO for sure.

Perhaps I should have included some weasel words like "I think
that". Anyway I see more and more html pages, internet forums, and
important sites (like Wikipedia) using utf-8. Microsoft's "Dr.
International" has been promoting utf-8 in web pages for years.
For a Greek site, I suppose it would be nice to be able to include
some ancient Greek text (with accents), which would again point to
utf-8.

But of course the most important thing is that every web page
should announce correctly which encoding it uses, no matter what
that encoding is. Many sites still do not do this.

Besides, you used ISO-8859-7 for your Usenet message. Probably by
accident, but still. :)

Aarrgh.. should have checked. Hope I get it right this time..

Regards, Jan.
 
Ben C

Perhaps I should have included some weasel words like "I think
that". Anyway I see more and more html pages, internet forums, and
important sites (like Wikipedia) using utf-8. Microsoft's "Dr.
International" has been promoting utf-8 in web pages for years.
For a Greek site, I suppose it would be nice to be able to include
some ancient Greek text (with accents), which would again point to
utf-8.

ISO-8859-7 etc. aren't becoming obsolete in any official way, but you
are basically right: many people consider it good practice with little
or no downside to use UTF-8 for everything.
 
Quadibloc

Neredbojias said:
Looks like a Geocities "feature". Do a view-source; see all that crap
"code" above/before your own markup? Does it belong in a valid page?

If you had a decent host, you might get decent results.

I dunno. My .xhtml pages are still getting served as text/plain, even
with meta tags, from a shared hosting service. Of course, that's only
one step above an ISP 'personal web page', but that's a second big
step above Geocities.

John Savard
 
Eustace

It is specified by the web server. The server's charset settings
override the ones specified by your own page itself. Does geocities
allow changing the default character set for pages served up by it?

If not, it might be a good idea to convert your page to utf-8. The ISO
encodings like 8859-7 are becoming obsolete, anyway. With utf-8, you can
mix all kinds of languages (Greek, English, German,
Japanese..) in one document, and all browsers nowadays understand
utf-8.

A good conversion tool for Linux is iconv. Very likely something similar
exists on Windows as well.

Regards, Jan

Thanks. I used NVU to change the character encoding to UTF-8. When I
first got NVU and made a page in Greek, I looked at the source and was
terrified to see codes instead of Greek letters, so I asked at their
forum and was told how to change to ISO. After reading your message, I
was curious enough to visit a couple of Greek newspapers' sites, and
realized that both use UTF-8 and that their page source shows Greek
letters, not codes.

So I went back to NVU and chose UTF-8 in the New Page settings. After
searching a little, I found that in the Page Properties I could (and had
to) also change the character set to UTF-8 and set it as default, and
voilà, the source also showed Greek characters. Then I copied the code
from Yahoo to NVU, saved it, edited it in Notepad back to what it was at
Yahoo (because NVU changes a few things), and posted it back to Yahoo. I
generally use NVU in this way: it helps me make a webpage initially, and
then I work directly on the code.

I see that you guys here don't hold Yahoo in high esteem, but I still
find it satisfactory for my limited needs, and inertia also plays a
role. The balance may change, however, if Microsoft buys it.

Best regards,

Eustace
 
Neredbojias

Well bust mah britches and call me cheeky, on Sat, 16 Feb 2008 04:38:29
GMT Quadibloc scribed:
I dunno. My .xhtml pages are still getting served as text/plain, even
with meta tags, from a shared hosting service. Of course, that's only
one step above an ISP 'personal web page', but that's a second big
step above Geocities.

John Savard

Text/plain, huh? Sucko.

I've noticed that meta tags are often ignored under any circumstances.
I handle my own xhtml pages via php, to wit:

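<?php
// Tell caches that the response varies with the request's Accept header.
header("Vary: Accept");
// HTTP_ACCEPT may be missing from the request, so fall back to "".
$accept = isset($_SERVER["HTTP_ACCEPT"]) ? $_SERVER["HTTP_ACCEPT"] : "";
if (stripos($accept, "application/xhtml+xml") !== false) {
    // The client lists XHTML: serve it as XML, with an XML declaration.
    header("Content-Type: application/xhtml+xml; charset=utf-8");
    echo '<?xml version="1.0" encoding="utf-8"?>';
    echo "\r\n";
    echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">';
    echo "\r\n";
} else {
    // Otherwise serve the same markup as text/html.
    header("Content-Type: text/html; charset=utf-8");
    echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">';
    echo "\r\n";
}
?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
....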

This even works for IE and is valid in the empirical sense. Of course
the drawback is you must have a server-side language available to you.
 
Neredbojias

Well bust mah britches and call me cheeky, on Sat, 16 Feb 2008 17:52:48 GMT
Neredbojias scribed:
This even works for IE and is valid in the empirical sense. Of course
the drawback is you must have a server-side language available to you.

I forgot to add that in Gecko the results of xhtml parsing are _not_ the
same as the results of html parsing, and there are (unique) errors. It is
much preferable to use html 4.01 strict for just about any page on the
Web today.
 
Michael Fesser

..oO(Neredbojias)
I've noticed that meta tags are often ignored under any circumstances.
I handle my own xhtml pages via php, to wit:

This even works for IE and is valid in the empirical sense. Of course
the drawback is you must have a server-side language available to you.

Another drawback is that it ignores the quality parameters in the
HTTP_ACCEPT header. A browser might accept both XHTML and HTML, but
might still prefer HTML for whatever reason. Your script ignores that.
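
As a rough illustration of what honouring the quality parameters could look like, here is a minimal Python sketch (not the PHP from the script above, and not a full content-negotiation implementation). The function names are invented for this example. It assumes a missing q defaults to 1.0, that XHTML is only served when it is listed explicitly, and that a tie goes to XHTML.

```python
def parse_accept(accept):
    """Map each media type in an Accept header to its q value (default 1.0)."""
    prefs = {}
    for item in accept.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept):
    """Pick application/xhtml+xml only if the client rates it at least as high as HTML."""
    prefs = parse_accept(accept or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    # HTML may be covered by an explicit entry or by a wildcard.
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # A client that accepts XHTML but prefers HTML.
    print(choose_content_type("application/xhtml+xml;q=0.5, text/html"))
```

A plain substring test would serve XHTML to the client in the example, even though it rates HTML higher.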

Micha
 
Neredbojias

Well bust mah britches and call me cheeky, on Sun, 17 Feb 2008 07:34:54
GMT Michael Fesser scribed:
.oO(Neredbojias)
I've noticed that meta tags are often ignored under any circumstances.
I handle my own xhtml pages via php, to wit:

This even works for IE and is valid in the empirical sense. Of course
the drawback is you must have a server-side language available to you.

Another drawback is that it ignores the quality parameters in the
HTTP_ACCEPT header. A browser might accept both XHTML and HTML, but
might still prefer HTML for whatever reason. Your script ignores that.

Yes, and here is a link to an excellent explanation of that situation:

http://www.workingwith.me.uk/articles/scripting/mimetypes

In the case(s) of my own personal site, however, I wish not to provide
the preferential option and instead analyze how various browsers handle
xhtml as served by my dictates.
 
Quadibloc

I've noticed that meta tags are often ignored under any circumstances.
I handle my own xhtml pages via php, to wit:
This even works for IE and is valid in the empirical sense. Of course
the drawback is you must have a server-side language available to you.

I could wade into some degree of server-side scripting, but that is
hazardous.

But I have found out that, being served by Apache, I have .htmlaccess
available to me, and that seems to have solved the problem nicely. I
will learn more when I can test it with a second machine.

John Savard
 
Jonathan N. Little

Quadibloc said:
But I have found out that, being served by Apache, I have .htmlaccess
available to me, and that seems to have solved the problem nicely. I
will learn more when I can test it with a second machine.

Just a small point that is .htaccess not .htmlaccess
 
Quadibloc

Just a small point that is .htaccess not .htmlaccess

You're quite right, sorry about the mistake that could have confused
people reading the thread.

John Savard
 
