Peculiar issue with French characters

sumitra · Jan 30, 2006

Hello All,

I need to print out French characters
(ççÇÇààÀÀèèÈÈééÉÉ) in a PDF file by running my code on
Unix. I'm using iText to create the PDF. The configurations in iText
for the fonts include BaseFont.IDENTITY_H for encoding and
BaseFont.EMBEDDED.

The PDF encoding I have given is:
/BaseFont /Courier /Encoding /WinAnsiEncoding

which generates the PDFs with the French text fine on Windows. Should I
be changing this??

The problem is that with these parameters, on Unix, all I get is
garbled text in my pdf doc.

Compiling with -encoding ISO-8859-1 does not help because these French
values are picked up at run time from a Hashtable. I have checked the
Hashtable contents and they look good.

My code uses a lot of StringWriter() and I would like to know if I need
to explicitly set the encoding here to "8859_1" and if so, how?? I've
tried the ByteArrayOutputStream approach to replace the StringWriter
and wrapped that in OutputStreamWriter with the encoding 8859_1. That
did not help.

I also tried the getBytes() method of StringWriter and tried to convert
it to another encoding, but that did not help either!!

I really am at a loss now as to how to resolve my problem.
If anyone out there has an idea do let me know please!
Thanks in advance.

--Sum

opalpa · Jan 30, 2006

What happens when you generate the pdf on unix and view it on windows?

Opalinski
http://www.geocities.com/opalpaweb/

Thomas Hawtin · Jan 30, 2006

My code uses a lot of StringWriter() and I would like to know if I need
to explicitly set the encoding here to "8859_1" and if so, how?? I've
tried the ByteArrayOutputStream approach to replace the StringWriter
and wrapped that in OutputStreamWriter with the encoding 8859_1. That
did not help.

I also tried the getBytes() method of StringWriter and tried to convert
it to another encoding, but that did not help either!!

Character encoding matters at the point where you encode characters as
bytes (or, going the other way, decode bytes into characters).

Lots of APIs confuse the matter by picking the encoding up from the
system defaults. So code may work on one setup, but not on another. To
get around a fatal bug in Adobe Acrobat Reader I had to change
encodings, meaning I could get different results depending upon which
window/tab I launched an application from.

FileWriter doesn't support character encodings, so don't use that class.
OutputStreamWriter has constructors to take character encodings, and one
which doesn't (so don't use that one). StringWriter.getBytes does not
exist. Swing has various methods which may depend upon configured
encoding, a specified encoding or just chopping the top byte off each
character (including surrogates).
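
A minimal sketch of the advice above, assuming a Java 7 or later runtime
(for StandardCharsets and try-with-resources). The class and method names
are hypothetical, chosen only for illustration. It names the charset at
the one place where characters become bytes, so the result does not
depend on the platform default:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class ExplicitEncodingDemo {

    // Encodes text to bytes with a named charset rather than the platform default.
    static byte[] encodeLatin1(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // The two-argument constructor takes the charset explicitly.
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.ISO_8859_1)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself free of encoding issues.
        String french = "\u00e7\u00c7\u00e0\u00c0\u00e8\u00c8\u00e9\u00c9";
        byte[] encoded = encodeLatin1(french);
        // Every character here fits in one ISO-8859-1 byte.
        System.out.println(encoded.length + " bytes for " + french.length() + " characters");
    }
}
```

A StringWriter, by contrast, only accumulates characters; no encoding is
involved until its contents are turned into bytes somewhere else.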

Tom Hawtin

Sum · Jan 31, 2006

When I generate the pdf on Unix and view it on Windows, I see only
garbled text.

Sum · Jan 31, 2006

My bad, I meant the String.getBytes() method and not
StringWriter.getBytes(), which as you rightly pointed out, does not
exist.

What I noticed while running my app on Unix was that the French string
being returned to my program was:

Ã§Ã§ÃÃÃ Ã ÃÃÃ¨Ã¨ÃÃÃ©Ã©ÃÃ

whereas I expected to see:

ççÇÇààÀÀèèÈÈééÉÉ

This does not happen on Windows. Also, I actually compile my code on
Windows, and put the tarball onto Unix.
What do you suppose is happening now??
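
The garbled string above has the shape one typically gets when UTF-8
bytes are decoded as ISO-8859-1 (each accented letter turns into "Ã"
followed by a second character). Whether that is what happened in this
application is an assumption; the sketch below only reproduces the
pattern. The class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes as UTF-8, then decodes the same bytes with the wrong charset.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e7\u00e9";        // c-cedilla, e-acute
        String garbled = misdecode(original);    // "\u00c3\u00a7\u00c3\u00a9"
        System.out.println(garbled.length());    // 4: each letter became two characters
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

If a string looks like this as soon as it arrives in the program, the
wrong decoding happened upstream of the PDF generation, at whatever
point the bytes were first turned into a String.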

Roedy Green · Jan 31, 2006

This does not happen on Windows. Also, I actually compile my code on
Windows, and put the tarball onto Unix.
What do you suppose is happening now??

There is an implied default encoding used to map any conversion byte
<=> String. See http://mindprod.com/jgloss/encoding.html
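
To see which default is in effect on a given machine, a small check like
the following can be run on both Windows and Unix and the output
compared. This is only a sketch; the class and method names are
hypothetical:

```java
import java.nio.charset.Charset;

public class DefaultCharsetDemo {

    // Reports the charset the JVM uses when no encoding is specified.
    static String describeDefaults() {
        return "default charset=" + Charset.defaultCharset().name()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```

On Unix the value is usually derived from the locale environment (LANG
and the LC_* variables) of the process that starts the JVM, so two
shells on the same machine can report different results. From JDK 18
onward the default is UTF-8 unless overridden.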

sumitra · Feb 6, 2006

Figured it out. The one thing that I did not do was to start the
application (in Unix) from the same session where I had set LANG to
fr_FR. I assumed that setting LANG=fr_FR would take effect across the
whole environment; however, it turned out to apply only to that telnet
session!

Thanks for the help everyone. :-D
