Thanks for your reply, austin,
Is there any possibility to output UTF-8 encoded text right now?
I need no fancy fonts or formatting, just some plain text output at
specific x-y coordinates.
best regards and thanks for the great library,
Yes -- but you have to wade through the font encoding mapping
information for PDF documents right now, and you have to be using a
Unicode-capable font. From the PDF 1.6 Reference:
Font management is primarily concerned with producing the
correct appearance of text—that is, the shape and placement of
glyphs. However, it is sometimes necessary for a PDF application
to extract the meaning of the text, represented in some standard
information encoding such as Unicode. In some cases, this
information can be deduced from the encoding used to represent
the text in the PDF file. Otherwise, the PDF producer
application should specify the mapping explicitly by including a
special object, the ToUnicode CMap.
I have not added support for the /ToUnicode CMap in PDF::Writer, but
it may be possible. However:
Certain strings contain information that is intended to be
human-readable, such as text annotations, bookmark names,
article names, document information, and so forth. Such strings
are referred to as text strings. Text strings are encoded in
either PDFDocEncoding or Unicode character encoding.
PDFDocEncoding is a superset of the ISO Latin 1 encoding and is
documented in Appendix D. Unicode is described in the Unicode
Standard by the Unicode Consortium (see the Bibliography).
For text strings encoded in Unicode, the first two bytes must be
254 followed by 255. These two bytes represent the Unicode byte
order marker, U+FEFF, indicating that the string is encoded in
the UTF-16BE (big-endian) encoding scheme specified in the
Unicode standard. (This mechanism precludes beginning a string
using PDFDocEncoding with the two characters thorn ydieresis,
which is unlikely to be a meaningful beginning of a word or
phrase). Note: Applications that process PDF files containing
Unicode text strings should be prepared to handle supplementary
characters; that is, characters requiring more than two bytes to
represent.
An escape sequence may appear anywhere in a Unicode text string
to indicate the language in which subsequent text is written,
which is useful when the language cannot be determined from the
character codes used in the text. The escape sequence consists
of the following elements, in order:
1. The Unicode value U+001B (that is, the byte sequence 0
followed by 27)
2. A 2-character ISO 639 language code—for example, en for
English or ja for Japanese
3. (Optional) A 2-character ISO 3166 country code—for example,
US for the United States or JP for Japan
4. The Unicode value U+001B
The complete list of codes defined by ISO 639 and ISO 3166 can
be obtained from the International Organization for
Standardization (see the Bibliography).
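The byte-order-mark rule quoted above can be sketched from the reading
side. This is only an illustration under stated assumptions: the method
names are made up for this example and are not part of PDF::Writer, it
assumes Ruby 2.0 or later (for String#b), and it does not attempt to
decode PDFDocEncoding or to strip language escape sequences.

```ruby
# Hypothetical helpers for illustration; not part of PDF::Writer.
# 254 followed by 255 is the UTF-16BE byte order marker (U+FEFF).
UTF16BE_BOM = [254, 255].freeze

# Report which encoding a PDF text string uses, based on its first two bytes.
def text_string_encoding(raw)
  raw.bytes.first(2) == UTF16BE_BOM ? :utf16be : :pdfdocencoding
end

# Decode a BOM-prefixed text string to UTF-8. Surrogate pairs
# (supplementary characters) are handled by Ruby's transcoder.
def decode_unicode_text_string(raw)
  unless text_string_encoding(raw) == :utf16be
    raise ArgumentError, "text string does not start with the UTF-16BE BOM"
  end
  raw.b[2..-1].force_encoding("UTF-16BE").encode("UTF-8")
end
```

A string without the leading 254, 255 pair is reported as
:pdfdocencoding and would need a separate mapping table (Appendix D of
the Reference) to decode properly.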
So you can't specify UTF-8, but for text strings you can specify
UTF-16BE if you start the string with the U+FEFF BOM (the bytes 0xFE
0xFF).
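As a rough sketch of what that means in practice, UTF-8 input can be
converted to a BOM-prefixed UTF-16BE byte string before it is handed
over as a text string. The helper name below is hypothetical and not
part of PDF::Writer, and the code assumes Ruby 2.0 or later (for
String#encode and String#b). Whether PDF::Writer writes such bytes out
unmodified is something I have not verified here, and page text drawn
at x-y coordinates still depends on the font encoding discussed at the
top.

```ruby
# Hypothetical helper for illustration; not part of PDF::Writer.
# The two bytes 0xFE 0xFF are the UTF-16BE byte order marker (U+FEFF).
UTF16BE_BOM_BYTES = [0xFE, 0xFF].pack("C*")

# Convert a UTF-8 string into the bytes of a Unicode PDF text string:
# the BOM followed by the text encoded as UTF-16BE.
def utf8_to_pdf_text_string(utf8)
  UTF16BE_BOM_BYTES + utf8.encode("UTF-16BE").b
end

bytes = utf8_to_pdf_text_string("H\u00E9llo")
puts bytes.unpack("C*").inspect
# prints [254, 255, 0, 72, 0, 233, 0, 108, 0, 108, 0, 111]
```

Characters outside the Basic Multilingual Plane come out as surrogate
pairs, which is the "more than two bytes" case the Reference warns
about.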
-austin