Unicode -> Python -> DBAPI -> PyPgSQL -> PostgreSQL

Rene Pijlman

I can't seem to find any way to specify the character encoding with the DB
API implementation of PyPgSQL. There is no mention of encoding and Unicode
in the DB API v2.0 spec and the PyPgSQL README.

When I have Unicode strings in Python and store them in a PostgreSQL Unicode
database, will the data automatically be encoded correctly? Or do I need
to specify the UTF-8 client encoding on the database connection somehow?

I'm using the current packages of Debian stable (woody):
Python 2.2
PyPgSQL 2.0
PostgreSQL 7.2 (database created with UNICODE / UTF-8 encoding)
 
Gerhard Häring

Rene said:
I can't seem to find any way to specify the character encoding with the DB
API implementation of PyPgSQL. There is no mention of encoding and Unicode
in the DB API v2.0 spec and the PyPgSQL README. [...]

See section 2.2.5 in the pyPgSQL README:

pyPgSQL has a few extensions that make it possible to insert Unicode strings
into PostgreSQL and fetch unicode strings instead of byte strings from the
database.

The module-level connect() function has two Unicode-related parameters:

- client_encoding
- unicode_results

*client_encoding* accepts the same parameters as the encode method
of Unicode strings. If you also want to set a policy for encoding
errors, set client_encoding to a tuple, like ("koi8-r", "replace")
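As a rough illustration of the encode-method semantics described above, here is a minimal sketch in plain Python. The helper name encode_for_client is hypothetical and is not part of pyPgSQL; it only mimics how a codec name or a (codec, errors) tuple could be handed to the encode method of a Unicode string.

```python
def encode_for_client(text, client_encoding):
    """Encode text given a codec name or a (codec, errors) tuple.

    Hypothetical helper for illustration only, not pyPgSQL code.
    """
    if isinstance(client_encoding, tuple):
        # ("koi8-r", "replace") becomes text.encode("koi8-r", "replace")
        return text.encode(*client_encoding)
    # A bare codec name uses the default "strict" error policy.
    return text.encode(client_encoding)


word = u"\xd6sterreich"  # "Österreich"

# UTF-8 can represent every character, so this always succeeds.
utf8_bytes = encode_for_client(word, "utf-8")

# KOI8-R has no "Ö"; with "replace" the character becomes "?".
replaced = encode_for_client(word, ("koi8-r", "replace"))

# Without an error policy the same encode raises UnicodeEncodeError.
try:
    encode_for_client(word, "koi8-r")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(utf8_bytes, replaced, strict_failed)
```

The tuple form is useful when losing an unrepresentable character is preferable to an exception in the middle of an insert.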

Note that you still must make sure that the PostgreSQL client is
using the same encoding as set with the client_encoding parameter.
This is typically done by issuing a "SET CLIENT_ENCODING TO ..."
SQL statement immediately after creating the connection.
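Why the two settings must agree can be sketched without a database. The example below assumes the Python side encodes with UTF-8 while the other side interprets the same bytes as Latin-1; the codec pair is chosen only for illustration.

```python
word = u"\xd6sterreich"  # "Österreich"

# Bytes as the Python side would send them with client_encoding="utf-8".
sent = word.encode("utf-8")

# If the receiving side assumes a different encoding, the text is garbled:
# the two UTF-8 bytes of "Ö" are read as two separate Latin-1 characters.
misread = sent.decode("latin-1")

# When both sides agree on UTF-8, the round trip is lossless.
round_trip = sent.decode("utf-8")

print(repr(misread), repr(round_trip))
```

No error is raised in the mismatched case, which is why such problems tend to show up only later as corrupted text.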

If you also want to fetch Unicode strings from the database, set
*unicode_results* to 1.

For example, assuming a database created with *createdb mydb -E UNICODE*
and a table *TEST(V VARCHAR(50))*:

>>> from pyPgSQL import PgSQL
>>> cx = PgSQL.connect(database="mydb", client_encoding="utf-8", unicode_results=1)
>>> cu = cx.cursor()
>>> cu.execute("set client_encoding to unicode")
>>> cu.execute("insert into test(v) values (%s)", (u'\x99sterreich',))
>>> cu.execute("select v from test")
>>> cu.fetchone()
[u'\x99sterreich']
>>>
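Since the SET CLIENT_ENCODING statement has to be issued right after connecting, one option is to wrap it in a small function. The sketch below is an assumption-laden illustration, not part of pyPgSQL: set_client_encoding and the two Recording classes are hypothetical names, and the function relies only on the cursor() and execute() methods that DB-API 2.0 defines. The stand-in connection lets the sketch run without a database.

```python
def set_client_encoding(connection, pg_encoding):
    """Issue SET CLIENT_ENCODING on a freshly opened DB-API connection.

    Hypothetical helper for illustration only, not pyPgSQL code.
    """
    # The name is interpolated into the SQL text, so accept only plain
    # identifiers such as UNICODE or LATIN1.
    if not pg_encoding.replace("_", "").replace("-", "").isalnum():
        raise ValueError("suspicious encoding name: %r" % (pg_encoding,))
    cursor = connection.cursor()
    cursor.execute("SET CLIENT_ENCODING TO '%s'" % pg_encoding)
    cursor.close()


class RecordingCursor(object):
    """Stand-in cursor that records statements instead of running them."""

    def __init__(self, log):
        self.log = log

    def execute(self, sql):
        self.log.append(sql)

    def close(self):
        pass


class RecordingConnection(object):
    """Stand-in connection so the sketch runs without a database."""

    def __init__(self):
        self.statements = []

    def cursor(self):
        return RecordingCursor(self.statements)


demo = RecordingConnection()
set_client_encoding(demo, "UNICODE")
print(demo.statements)
```

With a real connection, the function would be called once, directly after connect(), with the PostgreSQL name that corresponds to the client_encoding codec passed to connect().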


-- Gerhard
 
Rene Pijlman

Gerhard Häring:
Rene Pijlman:
I can't seem to find any way to specify the character encoding with the DB
API implementation of PyPgSQL. There is no mention of encoding and Unicode
in the DB API v2.0 spec and the PyPgSQL README. [...]

See section 2.2.5 in the pyPgSQL README:

Well, it's not in the README I found in the Debian package python2.2-pgsql
with pyPgSQL 2.0:
#ident "@(#) $Id: README,v 1.20 2001/11/05 01:18:12 ghaering Exp $"
pyPgSQL - v2.0: Python DB-API 2.0 Compliant Interface Module for
PostgreSQL.

But this tells me I probably need to upgrade to pyPgSQL 2.3 or 2.4:
"Q: I’ve heard of Unicode support for pyPgSQL. What’s the current status?
A: It’s integrated in pyPgSQL 2.3."
http://pypgsql.sourceforge.net/pypgsql-faq.pdf

Thanks a lot Gerhard, you've put me on the right track.
 
