<Form> and Arabic input problems

shror

Hi everybody,

I need some help with my <form>.
I have a website in English, and my customer asked me to make an Arabic
copy of it. The site has an email form that posts to a PHP script, which
sends the form entries to his email address. When I tested the Arabic
version, the <form> data wasn't sent correctly when I entered Arabic
text.
What I got was nothing except Unicode letters like this:
&#1576;&#1575;&#1605;&#1576;

All the text on my pages themselves displays fine, so please tell me
what the problem is and how to solve it.

Thanks for your help in advance

shror
 
Jukka K. Korpela

Scripsit shror:
when I was testing the Arabic side the <form> wasn't sent correctly
when I entered Arabic text.
What I got was nothing except Unicode letters like this:
&#1576;&#1575;&#1605;&#1576;

As usual, revealing a URL would have helped to analyze the problem. But it
looks pretty obvious that the problem is in the character encoding of the
form data.

For example, if a page is ASCII (or ISO-8859-1) encoded, then the form data
encoding is the same by default, and if you enter an Arabic character in a
form there, the effect is _undefined_ by HTML specifications. What browsers
might do is to represent the characters that have no representation in the
encoding by character references like &#1576; (or by entity references, when
applicable). This is really odd, since the form data is just character data,
not HTML, but on the other hand, what else could a poor browser do?
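
The fallback described above can be reproduced outside a browser. This is a minimal sketch in Python, not something from the original discussion, and it assumes the browser behaves like Python's `xmlcharrefreplace` error handler when it encodes form text for a windows-1252 page:

```python
# Sketch: what a browser may do with Arabic text typed into a form on a
# windows-1252 page. The sample is the four letters from the question,
# written as escapes so the source file itself stays ASCII.
text = "\u0628\u0627\u0645\u0628"

# Characters with no windows-1252 representation become decimal character
# references; representable characters would pass through unchanged.
submitted = text.encode("cp1252", errors="xmlcharrefreplace")
print(submitted)

# With UTF-8 as the form data encoding, the same text round-trips intact.
as_utf8 = text.encode("utf-8")
print(as_utf8.decode("utf-8") == text)
```

The first `print` shows the references the form handler would receive; the second shows why switching to UTF-8 avoids the problem altogether.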

You could tweak your form handler into dealing with such references, but the
real solution is to make the page UTF-8 encoded and to make the form handler
deal with UTF-8 data.
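
Both options can be sketched in Python. The handler in this thread is a PHP script that is not shown, so the function names and the field name `name` below are hypothetical, and the form body is assumed to be ordinary urlencoded data:

```python
import html
from urllib.parse import parse_qs


def read_utf8_form(body: str) -> dict:
    """Real solution: treat the percent-escaped octets as UTF-8."""
    fields = parse_qs(body, encoding="utf-8", errors="strict")
    return {name: values[0] for name, values in fields.items()}


def decode_references(value: str) -> str:
    """Workaround: turn references such as &#1576; back into characters."""
    return html.unescape(value)


# Body as sent from a UTF-8 page (hypothetical field name 'name').
utf8_body = "name=%D8%A8%D8%A7%D9%85%D8%A8"
print(read_utf8_form(utf8_body)["name"])

# Body as sent from a windows-1252 page: the letters arrive as references.
legacy_body = "name=%26%231576%3B%26%231575%3B%26%231605%3B%26%231576%3B"
legacy = parse_qs(legacy_body, encoding="cp1252")["name"][0]
print(decode_references(legacy))
```

The workaround is ambiguous by nature: if a user literally types the seven characters of a reference into the form, the handler cannot tell that apart from a browser-generated reference. The UTF-8 route has no such ambiguity.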
 
shror

Sorry for not sending my URL; I know that was stupid of me, but here it is:
http://www.mobidp.com/request2.htm


Thanks for your help, Jukka

shror
 
Jukka K. Korpela

Scripsit shror:
...
Sorry for not sending my URL; I know that was stupid of me, but here it is:
http://www.mobidp.com/request2.htm

The situation is basically what I wrote in the quoted text, just with
windows-1252 (Windows Latin 1) as the encoding. The encoding is unable to
represent any Arabic letters.

The encoding is specified in a <meta> tag, and HTTP headers are silent about
encoding, so it would be almost trivial to change the encoding to utf-8, by
modifying the <meta> tag and by replacing all non-ASCII characters (such as
the copyright sign) by entity or character references (such as &copy;).
ASCII data constitutes utf-8 data too.
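
Why the non-ASCII characters need replacing when only the <meta> tag changes can be checked with a short Python sketch. The markup sample is made up for illustration; the point is only how the bytes decode:

```python
# Pure ASCII markup is byte-for-byte valid UTF-8, so a page that uses
# &copy; instead of a raw copyright sign reads the same either way.
ascii_markup = b"<p>&copy; Example</p>"
same = ascii_markup.decode("utf-8") == ascii_markup.decode("cp1252")
print(same)

# A raw copyright sign saved as windows-1252 is the single octet 0xA9,
# which is not a valid UTF-8 sequence on its own.
raw = "\u00a9".encode("cp1252")
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(raw, valid_utf8)
```

So a file that contains only ASCII can simply be relabelled as utf-8, while one with raw windows-1252 octets would have to be re-saved or have those characters replaced by references first.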

But there's probably much more to be done on the server side, in the form
handler (confirmation.php). It would need to be modified so that it can read
utf-8 data and process it meaningfully.

The bad news is that PHP does not support utf-8 yet, except in fairly
limited ways.

Alternative tricks:

1) Let the page be windows-1252 encoded, and just be prepared to get
stuff like &#1576;. If you pass such references into an HTML document,
_without_ encoding the "&" in any way, they will appear as the characters
they denote by HTML rules. (This is actually the way people have built,
probably by accident, a poor man's Unicode support into one of the most
popular web-based discussion forums in Finland, suomi24.fi.) There is no
guarantee that this will work, but it happens to work in most situations.

2) Make the Arabic page windows-1256 (Windows Arabic) or iso-8859-6 (ISO
Latin/Arabic) encoded. Your form handler will then get Arabic letters in the
specified 8-bit encoding. This in principle restricts input to characters
representable in the chosen encoding, but in practice you usually get
character references for any other characters.
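
Both tricks can be illustrated with a rough Python sketch. It assumes, as above, that the browser falls back to decimal character references for unrepresentable characters; the HTML fragments are made up for the example:

```python
import html

arabic = "\u0628\u0627\u0645\u0628"

# Trick 1: a windows-1252 page delivers the letters as references.
received = arabic.encode("cp1252", errors="xmlcharrefreplace").decode("cp1252")
kept = f"<p>{received}</p>"                 # "&" untouched: renders as Arabic
broken = f"<p>{html.escape(received)}</p>"  # "&" escaped: shows the raw codes
print(kept)
print(broken)

# Trick 2: a windows-1256 page delivers one octet per Arabic letter.
octets = arabic.encode("cp1256")
restored = octets.decode("cp1256")
print(len(octets), restored == arabic)

# A character outside windows-1256 still falls back to a reference.
outside = "\u4e2d".encode("cp1256", errors="xmlcharrefreplace")
print(outside)
```

With trick 1, only the "&" of the references is left alone; other markup-significant characters in user input, such as "<", still need escaping before the data goes into a page.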

P.S. Your form has a single-line input field for "Address", which is
probably for a postal address, since you also have "E-mail". Normally you
should reserve a textarea of six lines for input of a postal address, but in
this case, _if_ you include the postal address input (why?), then I think
you should have two textareas, one for the address in Latin letters and one
for the possible address in the local writing system. According to the
Universal Postal Union, a letter sent e.g. to an Arabic-speaking country
from abroad should have the recipient address in two ways, in Latin letters
and in Arabic letters.
 
shror

I really don't know how to thank you, Jukka.

I just changed the encoding to UTF-8 and tried my <form>, and I
received Arabic in my email.
It's now working fine because of your help. I didn't do anything to the
PHP script and it worked.

Thanks so much
 
Jukka K. Korpela

Scripsit shror:
I just changed the encoding to UTF-8 and tried my <form>, and I
received Arabic in my email.
It's now working fine because of your help. I didn't do anything to the
PHP script and it worked.

Sounds too good to be true... and strange. But I guess the PHP script just
passes the incoming data "as is", as a sequence of octets, and your email
program manages to interpret it as utf-8.
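
That guess can be illustrated in Python. This is only a sketch of the octet pass-through idea, since the actual PHP script is not shown; the subject line is a placeholder:

```python
from email.message import EmailMessage

text = "\u0628\u0627\u0645\u0628"
octets = text.encode("utf-8")  # what the handler receives from a UTF-8 page

# A script that never decodes the data forwards these octets untouched.
# A reader that assumes UTF-8 recovers the text; another assumption does not.
print(octets.decode("utf-8") == text)
print(octets.decode("latin-1") == text)

# Declaring the charset in the message removes the guesswork for the
# mail program.
msg = EmailMessage()
msg["Subject"] = "Form submission"
msg.set_content(octets.decode("utf-8"), charset="utf-8")
print(msg["Content-Type"])
```

If the mail leaves the script without any charset declaration, correct display depends on the receiving program's default or detection, which may explain why it worked here without changes.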
 
