how to display/input/write Chinese Text in java

Buddha

Dear All,

I am trying to write a very simple Java program that displays Chinese
characters and saves them to a file (for now; later into DB2). However,
I seem to be making no progress at all.
I am completely lost with this one. I have googled and gone through a
lot of sites like:

http://www.mandarintools.com/javaconverter.html
http://www.chinesecomputing.com/programming/java.html
http://forum.java.sun.com/thread.jspa?threadID=442220&messageID=1995079
http://www.linuxforum.net/chinese/develop/java.html
http://java.sun.com/docs/books/tutorial/i18n/locale/services.html

I have downloaded cyberbit.ttf into /jre/lib/fonts and updated
fonts.properties as well (I don't really know whether I need to, though).
I have tried compiling with two different encoding options, Big5 and
GB2312.
I have also tried picking up a few Unicode code points from a site,
for example:

String s = "\U+0061";


but this fails to compile with an "illegal escape character" error.

All I am trying to do is take a String reference, feed it Chinese
text (which, again, I copied from a site), and then display it on the
console, or even write it to a file.
I am lost in the maze of Unicode, UTF-8, etc.


Later on :
I tried to actually start working on the web app where this change is
intended.
I copied a Chinese string from a website (pasted it into Word, and it
pasted fine).
I pasted it into the text box and saved the field. The value appears as
gibberish in DB2 (I am sure the tables aren't set up for UTF-8).
My app doesn't set any contentType or characterEncoding either.
When I retrieve this value, it comes back as exactly the same gibberish
as in the DB.
But, behold, when I change the browser encoding (View -> Encoding ->
Chinese (GB2312)), it displays the exact string that was inserted :)
Then I added these lines to my JSP:

<%@ page contentType="text/html; charset=GB2312" %>
request.setCharacterEncoding("GB2312");

..
so that Chinese (GB2312) becomes the default encoding. It does
that (the default was Western European (ISO)).
However, the field now displays question marks (????).

What options can I have ?
I would be really glad if someone could lead me out of this.

Thanks in advance.

Rgds,

This is a crosspost, obviously, because I did not receive help
elsewhere.
 
Mark Space

Buddha said:
String s = "\U+0061";

I think "\u0061" (lowercase u, no plus sign) is the correct syntax.
However, U+0061 is a Latin "a", yes? Is that what you want?

<%@ page contentType="text/html; charset=GB2312" %>
request.setCharacterEncoding("GB2312");

I am by no means an expert, but is your page really GB2312? Aren't you
using a regular text editor? Does your own system use GB2312? Because I
think that's what you are saying here.

request.setCharacterEncoding("GB2312") will set the request but... is
the request in GB2312 format? Can you show us the settings before
setting this? Did you set your browser to request that character set?
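
To see how the same stored value can look like gibberish under one encoding and fine under another, it can help to reproduce the effect outside the web app. The sketch below is only an illustration under assumptions: the class and method names are made up, it assumes the JVM ships the GB2312 charset, and it does not claim to show what the browser, the container, or DB2 actually did in this case. It shows GB2312 bytes being misread as ISO-8859-1 (recoverable) and Chinese text being forced into ISO-8859-1 (every character becomes "?").

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class EncodingMismatchDemo {
    // Assumes the running JVM provides GB2312 (most full JDK builds do).
    static final Charset GB2312 = Charset.forName("GB2312");

    // Encode as GB2312, then decode the same bytes as ISO-8859-1:
    // the bytes survive, but they display as accented Latin letters.
    static String misread(String text) {
        return new String(text.getBytes(GB2312), StandardCharsets.ISO_8859_1);
    }

    // Undo misread(): ISO-8859-1 maps every byte to exactly one char,
    // so the original GB2312 bytes can be recovered and decoded properly.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), GB2312);
    }

    // Force text through ISO-8859-1: characters it cannot represent
    // are replaced by '?', and the original is lost for good.
    static String lossy(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two CJK characters, written as escapes
        String garbled = misread(chinese);
        System.out.println(garbled.length());                // 4 chars, one per byte
        System.out.println(repair(garbled).equals(chinese)); // true
        System.out.println(lossy(chinese));                  // ??
    }
}
```

If something like this is what is happening, the practical point is that every hop (form submission, request decoding, database column, response page) has to agree on one encoding; which hop disagrees here cannot be told from the description alone.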

You might want to check out the <fmt> stuff in the JSTL. It provides
localization in JSPs. You still have to provide resource bundles.

Don't forget that the locale comes from the user and the browser, you
don't set it. You follow what the user asks for.
 
Buddha

Hi,
First, thanks for the quick reply.
yes?  Is that what you want? (or is it an "a"?)
I don't know, honestly. I just got this off a Chinese site.

I am by no means an expert, but is your page really GB2312?  Aren't you
using a regular text editor?  Does your own system use GB2312? Because I
think that's what you are saying here.
Again, I am not sure. I read about this on a site; it is supposedly
what needs to be done when dealing with text in another language
(Chinese).
You might want to check out the <fmt> stuff in the JSTL. It provides
localization in JSPs. You still have to provide resource bundles.

Don't forget that the locale comes from the user and the browser, you
don't set it. You follow what the user asks for.

My application is only in English. It doesn't really use
internationalization.
In short, all I need to do is allow my users, irrespective of which
country the site is opened in, to enter Chinese text.
I am guessing they will have Chinese keyboards and a way to
enter Chinese text.
So all I need to concentrate on is entering some Chinese words or
characters (which I copied from a site, because I don't know Chinese)
into the text box and hitting the save button.
The rest is as mentioned above:
Rest is as mentioned above :
**********************
The value appears
gibberish in db2 ( I am sure the tables arent specified for UTF-8).
My app doesnt use any contentType or characterEncoding either.
Then when I retrieve this value, it comes as exact same gibberish as
in DB.
But, behold, when I change the encoding (view -> encoding -> chinese
(GB2312)) It displays the exact String that was inserted :)
**********************
This is a very old application and there are no
<%@ page contentType="text/html; charset=GB2312" %> tags anywhere. I
just inserted them to check this functionality.

So basically, what I am looking to do is enter Chinese text and, upon
retrieval of that record, display it.
I am clueless about how to go about it.

Any help appreciated.

TIA
 
Mark Space

Buddha said:
Any help appreciated.

Well, like I said, I'm not an expert, but try going here:

http://www.javapassion.com/j2ee/#JSTL

Read the PDF files first, paying attention to the i18n stuff mostly.
Note there are two ways described that fit into J2EE architecture. One
is for the browser to send a request, the second is for the user to log
in and configure their preferences.

I think just putting tags everywhere won't work, because obviously you
have people browsing with other systems, like English.

The lab document takes you through some steps to see examples of i18n
and how they work in J2EE. It's pretty valuable I think.

Then back up on the lessons there, and check out the NetBeans IDE. It
has an HTTP monitor tool that allows you to see what is actually being
requested. (The lab document talks about this.) I think you are going
to need this to figure out what is really going on.

That's about all I can say, because I'd really need to see how the
requests are being sent. You'll likely need to learn how to set your
browser to request GB2312 so you can test it. And even better would be
an automated test suite that makes both Latin and GB2312 requests
automatically so you don't have to wear your fingers out testing. Just
something to think about.
 
Roedy Green

I am trying to write a very simple Java program that displays Chinese
characters and saves them to a file (for now; later into DB2). However,
I seem to be making no progress at all.
I am completely lost with this one. I have googled and gone through a
lot of sites like:

Have a look at the source code for fontshower and fontshowerawt.
They each display a few Chinese characters. Perhaps you could tell me
which ones I chose. I picked something that had visual appeal and
that looked "Chinese". I hope they don't have some peculiar meaning.

see http://mindprod.com/applet/fontshower.html
http://mindprod.com/jgloss/fontshowerawt.html

That will show you how to write Chinese in a program in a number of
ways. Your big problem was that you did not know how to write Unicode
literals.

See http://mindprod.com/jgloss/literal.htm

You want "\u3302\u4e02": lowercase u, four hex digits each, no plus signs.
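
A minimal sketch of the escape syntax, for illustration only. The class name and the sample code points (U+4E2D and U+6587) are picked for this example and are not from the fontshower source. The last two lines assume the console can render UTF-8, which is often not the case on an older Windows command prompt.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class showing Unicode escapes in string literals.
class UnicodeEscapeDemo {
    // Lowercase u, exactly four hex digits, no plus sign.
    static final String LATIN_A = "\u0061";
    // Two CJK characters, U+4E2D followed by U+6587.
    static final String CHINESE = "\u4e2d\u6587";

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(LATIN_A.equals("a"));                    // true
        System.out.println(CHINESE.length());                       // 2
        System.out.println(Integer.toHexString(CHINESE.charAt(0))); // 4e2d

        // Assumption: the terminal displays UTF-8. If it does not, the
        // string is still correct; only the rendering is wrong.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(CHINESE);
    }
}
```

Because the escapes are plain ASCII, the source compiles the same way regardless of which -encoding option is passed to javac.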

If you have a file in Chinese, you next need to figure out what
encoding it is. See http://mindprod.com/applet/officialencoding.html
to help you guess.

Once you know, you can write a program to read the file, decoding it.
See http://mindprod.com/jgloss/encoding.html
http://mindprod.com/applet/fileio.html
for details.
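
As a rough sketch of that read-and-decode step: the code below reads a file in a known encoding and writes it back out as UTF-8. Everything specific here is an assumption for the example: the class and method names are invented, GB2312 is taken as the source encoding, and the sample bytes are written to a temporary file so the demo is self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are illustrative only.
class TranscodeDemo {
    // Decode source using sourceCharset and re-encode it into target as UTF-8.
    static void transcode(Path source, Charset sourceCharset, Path target) {
        try (BufferedReader in = Files.newBufferedReader(source, sourceCharset);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] buffer = new char[4096];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained demo: writes two GB2312-encoded characters to a temp
    // file, transcodes it, and returns the bytes of the UTF-8 result.
    static byte[] demo() {
        try {
            Path source = Files.createTempFile("gb2312-", ".txt");
            Path target = Files.createTempFile("utf8-", ".txt");
            // GB2312 bytes for U+4E2D U+6587.
            Files.write(source, new byte[] {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4});
            // Assumes the JVM provides the GB2312 charset.
            transcode(source, Charset.forName("GB2312"), target);
            return Files.readAllBytes(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 4 bytes in GB2312 become 6 bytes in UTF-8 (3 per character).
        System.out.println(demo().length); // 6
    }
}
```

The important part is that the charset is named explicitly on both the reader and the writer, so nothing depends on the platform default encoding.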
 
Buddha

Thank you so much Roedy and Mark.
Mark, I don't really have the luxury of using JSTL. Moreover, ALL I
need to do is allow the user to ENTER Chinese in the text boxes
and display it when retrieved. The application that I am maintaining
went live in 97!!

Roedy, thanks for all the help. Those links mostly deal with AWT and
applets. I am working only in JSPs and, as said before, only need to be
able to enter and display Chinese (in the text boxes).

I shall get back to you with whatever I plan to do. In the meantime,
if you have more info, kindly keep it coming :)

thanks
Buddha
 
Lew

Buddha said:
Thank you so much Roedy and Mark.
Mark, I dont really have the luxury of using JSTL.Moreover what I
ONLY need to do is allow the user to ENTER Chinese in the text boxes,
and display it when retrieved. The application that I am maintaining
went live in 97 !!

Side note: How is it that JSTL is a "luxury" for you? Would you please
elucidate what makes it unavailable for you?

JSPs weren't available in 1997. There must be some path to upgrade the
application platform. What Java version does the application use?

Using JSTL should be just a matter of dropping the correct JARs in the
application lib path.
 
