UTF-8 and C string

Mike

Hi there,

Here is my question:

If I pass a value to a string, like "xyz\xc2\xbfwww", then the runtime
value (VC++) of this string is "xyz¿www". Is this runtime value in
UTF-8 encoding? How can I check this?

Thanks a lot.

Mike
 
Roger Leigh

If I pass a value to a string, like "xyz\xc2\xbfwww", then the
runtime value (VC++) of this string is "xyz¿www". Is this runtime
value in UTF-8 encoding? How can I check this?

Walk the string and print it out as hex, byte by byte.

On my Linux system, GCC encodes all narrow strings as UTF-8 and all
wide strings as UCS-4. How they are displayed to the user (the output
encoding) depends on the locale, which causes them to be recoded on
the fly if required.

The following test should be portable, but it does require that your
compiler accept UTF-8 source (recode the file if required).

Regards,
Roger


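#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

int main(void)
{
    /* Use the user's locale so output is recoded to the terminal's encoding. */
    setlocale(LC_ALL, "");

    const char *narrow = "Test Unicode (narrow): ïàý ÐÐ¾Ñ けたいと願う!\n";
    fprintf(stdout, "%s\n", narrow);

    /* Dump the narrow string byte by byte to see how the compiler encoded it. */
    fprintf(stdout, "Narrow bytes:\n");
    for (size_t i = 0; i < strlen(narrow); ++i)
        fprintf(stdout, "%3zu: %02X\n", i, (unsigned int) ((const unsigned char *) narrow)[i]);

    /* A stream's orientation is fixed by its first use; stderr is still unused here. */
    if (fwide(stderr, 1) <= 0)
        fprintf(stdout, "Failed to set stderr to wide orientation\n");

    const wchar_t *wide = L"Test Unicode (wide): ïàý ÐÐ¾Ñ けたいと願う!\n";
    fwprintf(stderr, L"\n%ls\n", wide);

    /* In the wide printf family, %s converts a multibyte string to wide characters. */
    fwprintf(stderr, L"\nNarrow-to-wide: %s\n", narrow);

    /* In the narrow printf family, %ls converts a wide string to multibyte. */
    fprintf(stdout, "\nWide-to-narrow: %ls\n", wide);

    /* Dump the wide string's raw bytes (host byte order). */
    fprintf(stdout, "Wide bytes:\n");
    for (size_t i = 0; i < wcslen(wide) * sizeof(wchar_t); ++i)
        fprintf(stdout, "%3zu: %02X\n", i, (unsigned int) ((const unsigned char *) wide)[i]);

    return 0;
}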

--
Roger Leigh
Printing on GNU/Linux? http://gimp-print.sourceforge.net/
Debian GNU/Linux http://www.debian.org/
GPG Public Key: 0x25BFB848. Please sign and encrypt your mail.