Problems displaying unicode-characters

Jay Brahms

Hi everyone.

I have a problem:
I am using Tomcat 4.x and 5.x, trying to offer our websites in
different languages, including (for example) Polish and other
languages not covered by ISO-8859-1 (Latin-1).

The JSPs and property files are written in ISO-8859-1. The Polish
translation of the property file (used for <bean:message />)
contains Polish characters. When the JSP uses ISO-8859-1, each of them
shows up as the Latin-1 character with the same byte value, not as the
Polish character.

I tried almost everything possible:
Changing the page-encoding using the <page-directive>:
<%@ page contentType="text/html; charset=iso8859-2"
pageEncoding="iso8859-1" %>

Using in jsp (especially because we have to change the encoding
dynamically):
<% response.setContentType("text/html; charset=ISO-8859-2"); %>

also including in html-header:
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-2">

But as soon as I change the 'charset' within the <page-directive> or
'content-type' of the response all polish characters are displayed as
'?' ....

I even changed my whole computer-settings to polish, system, browser,
keyboard, still all polish characters are displayed as '?'.

I tried saving the property-file in ISO-8859-2, but tomcat refused to
read it.

Can someone please help.
I need to be able to change the encoding of the page dynamically,
depending on the translation, and I cannot convert the
property file (meaning all symbols within the file) into UTF-8. That is
not an option.

Thanx.
Jay.
 
Thomas Fritsch

Partial answer concerning property files: You can use the \uXXXX escape
notation in property files. For example:
\u00E4 instead of ä (German a-umlaut)
\u0142 instead of ł (Polish l-slash)
\u0414 instead of Д (Cyrillic De)
This may make the file somewhat hard to read, but you are on the safe
side, because then you can code *any* chars (not only the ASCII ones).

I don't know enough about JSP, so I can't say whether this technique is
also applicable to JSP files.
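
The escape notation described above can be tried out with a small sketch like the following. The class name, the "greeting" key, and the sample text are made up for illustration; the property file is simulated with an in-memory string, so nothing here depends on Tomcat or Struts.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class EscapedPropertiesDemo {
    // Loads a property whose value contains a backslash-u escape
    // and returns the decoded value.
    static String loadGreeting() {
        // The doubled backslash keeps the escape as literal text, exactly as
        // it would appear in a pure-ASCII .properties file.
        String fileText = "greeting=Mi\\u0142ego dnia\n";
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        String value = loadGreeting();
        // Print the code point of the third character instead of the
        // character itself, to stay independent of the console encoding.
        System.out.println((int) value.charAt(2)); // 322, i.e. U+0142
    }
}
```

Because the file content is plain ASCII, the encoding it is saved in no longer matters to the loader.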
 
enrique

Maybe you do not have a font set on the browser capable of displaying
those polish characters. Here's a simpler test: Hard-code static HTML
containing a sample of Polish. Does it display in the browser? Maybe
the issue is not even Java-related.

epp
 
Roedy Green

I have a problem:
I am using tomcat 4.x and 5.x, trying to offer our websites in
different languages, also in (for example) polish and different
languages other than ISO-8859-1 (Latin-1).

How many browsers support UTF-8? Unicode was invented to solve the
sort of headache you are having.
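
To illustrate the point about Unicode, here is a minimal sketch of what happens when the same character is encoded with different charsets. The class and method names are hypothetical, and it assumes the JVM provides ISO-8859-2 (common, but not one of the charsets Java guarantees). A character the charset cannot represent is replaced by '?' when encoding, which is consistent with the symptom described in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetCoverageDemo {
    // Encodes the text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent come back as '?'.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String lStroke = "\u0142"; // Polish l with stroke
        // Assumption: ISO-8859-2 is available on this JVM.
        Charset latin2 = Charset.forName("ISO-8859-2");

        // Latin-1 has no slot for this character, so it is lost.
        System.out.println(roundTrip(lStroke, StandardCharsets.ISO_8859_1).equals("?"));
        // Latin-2 and UTF-8 both preserve it.
        System.out.println(roundTrip(lStroke, latin2).equals(lStroke));
        System.out.println(roundTrip(lStroke, StandardCharsets.UTF_8).equals(lStroke));
    }
}
```

UTF-8 is the only one of the three that also preserves characters from other scripts, such as Cyrillic, in the same page.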
 
Jay Brahms

Thanx for the tip, I have already checked, the browser is completely
capable of showing any character I want to ... it seems to be a
JAVA-related problem.
 
I am having the same issue

I agree with Jay, it seems to be a java issue. I have other languages also where I can view them in other Windows programs such as UltraEdit, Notepad, Wordpad. But they don't seem to come out correctly in Java. I've tried several different code pages.

A question for you, Jay: have you tried using the native2ascii utility to encode the property file? I know you said that's not an option. My question is: have you tried it and found that it just doesn't work?
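
As a rough illustration of the kind of conversion native2ascii performs, the sketch below replaces every non-ASCII character with its backslash-u escape. It is not the tool's actual implementation; the class and method names are made up, and the real utility also handles reading the input file in a given encoding.

```java
final class AsciiEscaper {
    // Returns the text with every character above 7-bit ASCII replaced by
    // a backslash-u escape with four hex digits. ASCII is copied unchanged.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input literal contains the Polish l with stroke; the output is
        // pure ASCII and therefore safe to print on any console.
        System.out.println(escapeNonAscii("Mi\u0142ego dnia"));
    }
}
```

The output is pure ASCII, so the resulting property file reads the same regardless of which ISO-8859 variant the editor or the server assumes.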

I am encoding mine on the fly using the overridden method from the ResourceBundle getString().

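ResourceBundle res = <my resource bundle>
// key is the message key being looked up. Property files are read as
// ISO-8859-1, so get the raw bytes back with that charset (not the
// platform default) before decoding them as ISO-8859-2.
String newString = new String(res.getString(key).getBytes("ISO-8859-1"), "ISO-8859-2");

newString now holds the text decoded with the specified character set.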

So I don't know if my method of encoding is good or bad, but that's my way of dynamically changing the char set.
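
For anyone who wants to check this approach in isolation, here is a self-contained sketch under a few assumptions: the property file was saved as ISO-8859-2, it is loaded through an InputStream (which Properties decodes as ISO-8859-1), and the JVM provides ISO-8859-2. The class name, the "greeting" key, and the sample text are hypothetical. Note that calling getBytes() without an argument uses the platform default charset, so the sketch names ISO-8859-1 explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class RedecodeDemo {
    // Undoes a wrong ISO-8859-1 decoding: recovers the original bytes and
    // decodes them again with the charset the file was really saved in.
    static String redecode(String misread, Charset actual) {
        return new String(misread.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    // Simulates a property file saved as ISO-8859-2 and read the usual way.
    static String loadAndRedecode() {
        Charset latin2 = Charset.forName("ISO-8859-2"); // assumed available
        byte[] fileBytes = "greeting=Mi\u0142ego dnia\n".getBytes(latin2);
        Properties props = new Properties();
        try {
            // load(InputStream) always decodes the bytes as ISO-8859-1.
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return redecode(props.getProperty("greeting"), latin2);
    }

    public static void main(String[] args) {
        System.out.println(loadAndRedecode().equals("Mi\u0142ego dnia"));
    }
}
```

This works because ISO-8859-1 maps each of the 256 byte values to exactly one character, so the original bytes can always be recovered from the misread string.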
 
