Why does JDBC need explicit character conversion?

howachen · Jul 1, 2006

Hi,

I use Java to connect to MySQL (5.x), using the latest connector.

I have declared the character encoding on the connection:
//-----------------------------------------------------------------------------
Properties props = new Properties();
props.put("characterEncoding", "UTF-8");
props.put("useUnicode", "true");
props.put("user", "root");
props.put("password", "password");
Class.forName("com.mysql.jdbc.Driver");

String dbUrl = "jdbc:mysql://" + "127.0.0.1" + "/" + "test_db";

this.setConnection(DriverManager.getConnection(dbUrl, props));
//-----------------------------------------------------------------------------

And in the query part:
//-----------------------------------------------------------------------------

String sqlstr = "SELECT * FROM test_table WHERE id = 8";
try {
    PreparedStatement s = this.getConnection().prepareStatement(sqlstr);
    ResultSet rs = s.executeQuery();

    while (rs.next()) {

        String text1Val = "";
        String text2Val = "";

        text1Val = rs.getString("text1"); // THIS DOES NOT WORK

        try {
            // Re-encode with the platform default charset, then decode as UTF-8
            text2Val = new String(rs.getString("text1").getBytes(),
                    "UTF-8"); // THIS WORKS!
        } catch (UnsupportedEncodingException e2) {
            e2.printStackTrace();
        }
    }
.....

//----------------------------------------

The value stored in the "text1" field is UTF-8 text.

Why is this extra character conversion needed when using Java?

Thanks...

Chris Smith · Jul 1, 2006

I use java to connect to MySQL (5.x), using the latest connector.

As an unrelated side note, have you considered PostgreSQL instead? It
is free, but also cares about your data integrity, has a better design
and better standards compliance, and most importantly no one from the
PostgreSQL project has ever made any statements to me of the form: "We
think you (and everyone else who has written database-independent JDBC
code that may be usable with MySQL) may owe us money despite possibly
having never used our product, ever; but we can't give you legal advice
on interpreting our own license, so please contact an attorney if you
need advice on this matter." That last one is important to me.

On to your question...

I have declared the character setting in the connection:
//-----------------------------------------------------------------------------
Properties props = new Properties();
props.put("characterEncoding", "UTF-8");
props.put("useUnicode", "true");

The reference manual at:

http://dev.mysql.com/doc/refman/5.0/en/cj-character-sets.html

suggests that you should be specifying "utf8" instead of "UTF-8". Does
this make any difference?

....

text1Val = rs.getString("text1"); // THIS DOES NOT WORK

try {
text2Val = new String(rs.getString("text1").getBytes(),
"UTF-8"); // THIS WORK!

Hmm. That's really bad! The database appears to be sending the data
back in the system default encoding. If that data is representable in
the system default encoding, then the code above will work; but
otherwise, it will fail. Hence, your application will fail non-
deterministically based on things like the operating system, version,
locale settings, environment variables (in UNIX), etc.
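
To make that fragility concrete, here is a minimal sketch that simulates the suspected failure without a database. It assumes the driver decoded the UTF-8 bytes with the system default charset; ISO-8859-1 and US-ASCII stand in for two possible defaults. The class and method names are invented for this illustration, and StandardCharsets needs Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Simulates a driver that decodes UTF-8 bytes with the wrong charset
    // (an assumption about what is happening, not confirmed driver behaviour).
    public static String misdecode(String original, Charset wrong) {
        return new String(original.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // The workaround from the question, with the platform default made explicit:
    // re-encode with the default charset, then decode the bytes as UTF-8.
    public static String repair(String garbled, Charset platformDefault) {
        return new String(garbled.getBytes(platformDefault), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u5beb\u9053"; // two CJK characters

        // ISO-8859-1 maps every byte to a character and back, so nothing is lost.
        Charset latin1 = StandardCharsets.ISO_8859_1;
        System.out.println(repair(misdecode(original, latin1), latin1).equals(original)); // true

        // US-ASCII cannot represent bytes above 0x7F, so the data is destroyed
        // during the first decode and the workaround cannot bring it back.
        Charset ascii = StandardCharsets.US_ASCII;
        System.out.println(repair(misdecode(original, ascii), ascii).equals(original)); // false
    }
}
```

The same code succeeds or fails depending only on which charset plays the role of the default, which is why the workaround should not be relied on.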

If the above suggestion doesn't work, I'll poke around further.

Why need this kind of overhead in character conversion when using java?

This doesn't have anything to do with Java; it's a JDBC driver problem.

howachen · Jul 1, 2006

Chris Smith wrote:

The reference manual at:

http://dev.mysql.com/doc/refman/5.0/en/cj-character-sets.html

suggests that you should be specifying "utf8" instead of "UTF-8". Does
this make any difference?

This does not work.

In fact, the documentation you quoted above says we should use "UTF-8", i.e.:
When specifying character encodings on the client side, Java-style
names should be used. The following table lists Java-style names for
MySQL character sets...

...or by configuring the JDBC driver to use "UTF-8" through the
characterEncoding property.
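
As a rough sketch of what "Java-style name" means here: the helper below builds the same properties as in my first post and checks that the encoding name is one the JVM itself recognises. The class and method names are made up for illustration; only the property keys come from the code above, and nothing here contacts the database.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.Properties;

public class ConnectionProps {
    // Returns true if the JVM knows a charset under this name.
    public static boolean isJavaCharsetName(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false; // the name contains characters no charset name may have
        }
    }

    // Builds the connection properties from the original post,
    // rejecting encoding names that are not Java charset names.
    public static Properties build(String user, String password, String encoding) {
        if (!isJavaCharsetName(encoding)) {
            throw new IllegalArgumentException("Not a Java charset name: " + encoding);
        }
        Properties props = new Properties();
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", encoding);
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("root", "password", "UTF-8");
        System.out.println(props.getProperty("characterEncoding")); // UTF-8
    }
}
```

The result would then be passed to DriverManager.getConnection(dbUrl, props) exactly as before.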

Chris Smith · Jul 2, 2006

this does not work.

in fact, the documentation you quoted above said we should use "UTF-8"
:

i.e.
When specifying character encodings on the client side, Java-style
names should be used. The following table lists Java-style names for
MySQL character sets...

...or by configuring the JDBC driver to use "UTF-8" through the
characterEncoding property.

Oops. I misread the page. Sorry, I don't know why the driver is using
the wrong encoding.
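
One way to narrow it down might be to look at the code points the driver actually hands back. This is only a diagnostic sketch: the class name is invented, the sample strings are assumptions standing in for rs.getString("text1"), and codePoints() needs Java 8 or later.

```java
public class CodePointDump {
    // Formats each code point of s as U+XXXX, separated by spaces.
    public static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // In the real application the argument would be rs.getString("text1").

        // A correctly decoded CJK character is a single code point.
        System.out.println(dump("\u5beb")); // U+5BEB

        // The same character's UTF-8 bytes (E5 AF AB) decoded as ISO-8859-1
        // show up as three Latin-1 code points instead.
        System.out.println(dump("\u00e5\u00af\u00ab")); // U+00E5 U+00AF U+00AB
    }
}
```

If the output looks like the second case, the bytes arrived intact and were decoded with the wrong charset somewhere between the server and the String.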

Dražen Gemić · Jul 3, 2006

I have declared the character setting in the connection:

As a matter of fact, I never had that kind of problem with Postgres,
MS SQL, or Hypersonic SQL...

Check your JDBC driver documentation... there probably is one.

DG
