English input on Arabic Page

tobes

Hi there,

I'm developing a web application that contains an online
questionnaire.

My client wants users to input answers in English, despite the fact
the questionnaire is in Arabic.

If I view the page on a UK computer and type into some user input
boxes, then the input language is English regardless of what the
language of the HTML page is.

However, if using a computer in Kuwait for example, anything typed
into the boxes will be Arabic.

Is there a way in HTML of forcing the input language on a particular
element?

I hoped this would work :)

<input type="text" xml:lang="en" lang="en" name="forename" value="" />

Any help appreciated

Tobin
 
Ben C

> Hi there,
>
> I'm developing a web application that contains an online
> questionnaire.
>
> My client wants users to input answers in English, despite the fact
> the questionnaire is in Arabic.
>
> If I view the page on a UK computer and type into some user input
> boxes, then the input language is English regardless of what the
> language of the HTML page is.
>
> However, if using a computer in Kuwait for example, anything typed
> into the boxes will be Arabic.
>
> Is there a way in HTML of forcing the input language on a particular
> element?

I don't think so; it's up to the user what language they want to input in
(provided they have the right keyboards, input method programs etc.).

> I hoped this would work :)
>
> <input type="text" xml:lang="en" lang="en" name="forename" value="" />

I think the best you can do is:

<div>Please answer in English</div>
 
Jukka K. Korpela

Scripsit tobes:
> I'm developing a web application that contains an online
> questionnaire.

Is there a URL for the latest draft page?

> My client wants users to input answers in English, despite the fact
> the questionnaire is in Arabic.

That's strange. Why would he want to do that?

Anyway, you should probably use UTF-8 for the page, and naturally design the
form handler so that it can process UTF-8 data. The reason is that people
_can_ input any Unicode character (if you just learn how to do that), and
strange things may and will happen if they enter a character that is not
representable in the encoding of the page, which is what gets used (at least
by default) for the form data when the browser sends it to the server.

(The 8-bit encodings commonly used for Arabic contain Latin letters as well,
but only the basic letters A to Z, and English texts may contain other
letters, too.)
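
To make the representability point concrete, here is a minimal Python sketch. The helper name `representable` is made up for this illustration, and ISO-8859-6 is used only as a stand-in for an 8-bit Arabic encoding; the actual page may well use a different one.

```python
def representable(text: str, encoding: str) -> bool:
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Basic Latin letters and Arabic letters both fit in ISO-8859-6.
print(representable("Tobin", "iso-8859-6"))
print(representable("\u0645\u0631\u062d\u0628\u0627", "iso-8859-6"))

# An English word with a non-basic letter (i with diaeresis) does not fit,
# whereas UTF-8 can represent any Unicode character.
print(representable("na\u00efve", "iso-8859-6"))
print(representable("na\u00efve", "utf-8"))
```

This only shows which characters an encoding can carry. What a browser actually sends when a character does not fit is a separate matter and is best tested directly.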

> If I view the page on a UK computer and type into some user input
> boxes, then the input language is English regardless of what the
> language of the HTML page is.

There's some confusion here. No wonder, since people regularly confuse
languages, character repertoires, encodings, fonts, and input methods with
each other, and many programs reinforce the confusion.

The language of the HTML page, as the actual language or as the language
declared in lang or xml:lang attributes, has nothing to do with the issue.

The keyboards and keyboard settings vary, and they do not depend on the
language _or_ the encoding of a web page. A user in the UK may install a
keyboard driver that turns the keyboard into Arabic and may well have done
that, unless he got a real Arabic keyboard from somewhere.

> However, if using a computer in Kuwait for example, anything typed
> into the boxes will be Arabic.

Are you sure? Similar considerations apply here. People can use different
keyboards and keyboard settings.

> Is there a way in HTML of forcing

Of course not.

> the input language on a particular element?

Not that one either.

> <input type="text" xml:lang="en" lang="en" name="forename" value="" />

You have just declared that "" is in English. Even if we took the attributes
as applying to the data entered into the field (even though it is _not_
element content in the HTML sense), then the attributes would in principle
just _claim_ that the text is in English and would in practice be ignored by
any browser.

If you want to restrict input to one language, or even to some limited
character repertoire, you need to add a lot of server-side code that checks
such issues.
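
As a rough idea of what the simplest such server-side check might look like, here is a Python sketch. The function names `non_latin_letters` and `looks_latin` are hypothetical, and the test is an assumption about what "limited character repertoire" could mean here: it only rejects letters from outside the Latin script, and says nothing about whether the text is actually English.

```python
import unicodedata


def non_latin_letters(text: str) -> list:
    """Return the letters in text that are not from the Latin script.

    The script is judged from each letter's Unicode character name,
    which is a crude but dependency-free heuristic.
    """
    return [
        ch
        for ch in text
        if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
    ]


def looks_latin(text: str) -> bool:
    """Accept input only if it contains no letters outside the Latin script."""
    return not non_latin_letters(text)


print(looks_latin("Tobin Harris"))                 # Latin letters only
print(looks_latin("\u0645\u0631\u062d\u0628\u0627"))  # Arabic letters
```

Digits, spaces and punctuation pass through unchecked in this sketch; a real form handler would also need to decide what to do with those.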

Technically, a <form> element could have e.g. accept-charset="us-ascii", and
it in fact may have some effect. But what it does - if it does anything - is
that the form data will be us-ascii encoded. If the actual user input
contains non-us-ascii characters, well... you wouldn't believe me if I
explained, so you'd better check yourself if you considered using this
attribute. :)
 
tobes

Hi Jukka

Thank you *very* much for the reply - that all makes good sense.

I'm not entirely sure why the client wants people to enter data in
English and read in Arabic. They've got several questionnaire
applications and they've had those developed in a similar way.

The accept-charset looks interesting, although I do believe that it
will screw things up. I bet it doesn't even convert between encodings,
and just sticks question marks in where a character code doesn't fit?

But yes, based on what you've said it does seem that the server side
checking would be the only "real" way of seeing if input is typed in
the correct character set at least. I think I can avoid this though,
and convince the client to handle this by just *asking* users to enter
details in English, we'll see!

Where do you get your HTML knowledge from? I could do with reading up
a bit more :)

Thanks

Tobin
 
Jukka K. Korpela

Scripsit tobes:
> The accept-charset looks interesting, although I do believe that it
> will screw things up. I bet it doesn't even convert between encodings,
> and just sticks question marks in where a character code doesn't fit?

It used to be ignored by browsers, and still might, but when supported, it
makes the browser convert the form data to the specified encoding - if
possible. It might be impossible because the browser does not know that
encoding, or because there are characters in the data that are not
representable in it. Browsers may then even throw in designations!

I don't think accept-charset is a useful approach here.
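
For a feel of what "not representable" can turn into, Python's built-in codec error handlers produce two plausible outcomes: question marks, as guessed in the question, or numeric character references. This sketch is not a model of any particular browser; which of these, if either, a given browser sends is something to test.

```python
# U+0634 is ARABIC LETTER SHEEN, which us-ascii cannot represent.
text = "Tobin \u0634"

# Substitute a question mark for each unrepresentable character.
as_question_marks = text.encode("ascii", errors="replace")

# Substitute a numeric character reference instead.
as_char_refs = text.encode("ascii", errors="xmlcharrefreplace")

print(as_question_marks)  # b'Tobin ?'
print(as_char_refs)       # b'Tobin &#1588;'
```

Either way the original character is lost or disguised by the time the form handler sees it, which is one more reason to keep the page and the handler in UTF-8.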

> But yes, based on what you've said it does seem that the server side
> checking would be the only "real" way of seeing if input is typed in
> the correct character set at least. I think I can avoid this though,
> and convince the client to handle this by just *asking* users to enter
> details in English, we'll see!

Let's hope that will work.

> Where do you get your HTML knowledge from, I could do with reading up
> a bit more :)

Unfortunately the book on HTML that I have co-authored (and is the best book
available on HTML) is in Finnish only. :-( But the encoding issues,
including those relating to web pages, are discussed in my book "Unicode
Explained", published by O'Reilly last year.
 
Sherm Pendley

tobes said:
> I'm not entirely sure why the client wants people to enter data in
> English and read in Arabic. They've got several questionnaire
> applications and they've had those developed in a similar way.

I can imagine such a thing would be useful as a quiz in a foreign language
class. The questions might be in the students' native language to ensure
that they understand them, but the answers expected in the foreign language
because that's what is being tested.

> But yes, based on what you've said it does seem that the server side
> checking would be the only "real" way of seeing if input is typed in
> the correct character set at least. I think I can avoid this though,
> and convince the client to handle this by just *asking* users to enter
> details in English, we'll see!

Have you considered doing an English-language spell check on the submitted
text? It won't be 100% accurate, obviously, but it doesn't need to be
perfect; even a mere 20% correct spelling would show that the text is at
least an attempt at English.
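
A minimal Python sketch of that idea follows. The word list, the function names and the tokenising regex are all assumptions made for illustration; a real check would load a proper English dictionary rather than this tiny hypothetical set.

```python
import re

# Hypothetical, deliberately tiny word list for illustration only.
ENGLISH_WORDS = {"the", "a", "is", "in", "my", "name", "i", "live", "and", "of", "to"}


def english_ratio(text: str, words=ENGLISH_WORDS) -> float:
    """Return the fraction of word tokens in text found in the word list."""
    # [^\W\d_]+ matches runs of letters in any script.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in words for token in tokens) / len(tokens)


def probably_english(text: str, threshold: float = 0.2) -> bool:
    """Apply the 20% rule of thumb: enough known words means an attempt at English."""
    return english_ratio(text) >= threshold


print(english_ratio("My name is Tobin"))  # 3 of 4 tokens are in the list
print(probably_english("\u0645\u0631\u062d\u0628\u0627"))
```

Names, addresses and other free-form answers will rarely be in any dictionary, so the threshold would need tuning per question.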

sherm--
 
