Raymundo
Hello,
First of all, I'm sorry that my English is not good.
To represent a Unicode character in a string or in a regexp, I can use
"\x{hex}" notation.
my $char = "\x{AC00}";
# $char = "가" -- a Korean character, pronounced "GA"
(I'm not sure you can see this Korean character in your browser.
Please tell me if you can't)
However, it seems that this notation works only when the hex value is
hard-coded. That is, I can't use a variable for the hex value:
my $index = "AC00";
my $char = "\x{$index}"; # This doesn't work.
print length($char),"\n";
print "[$char]\n";
1 -- $char has one character, but...
[] -- that character is not "가" (GA). It isn't even a printable
character.
(In fact, $char seems to be the null character "\0". I found this by
redirecting the output to a file and viewing the file with a hex editor.)
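The same observation can probably be made without a hex editor by asking Perl for the code point directly. This is only a minimal sketch: it assumes the script runs without "use warnings", as my test script did, and it uses nothing beyond the built-in length() and ord().

```perl
#!/usr/bin/perl
use strict;

my $index = "AC00";

# The \x{...} escape is handled when the string literal is parsed, so
# "$index" is treated as (invalid) hex digits instead of being interpolated.
my $char = "\x{$index}";

# Report how many characters the string has and the code point of the first.
printf "length=%d, code point=%d\n", length($char), ord($char);
```

If the character really is "\0", this should report a length of 1 and a code point of 0, which matches what I saw in the hex editor.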
Anyway, I tried several variations using double quotes, single quotes,
the s/// operator, etc.
Finally I found the code that works:
(code)
#!/usr/bin/perl
my $index = "AC00";
my $char = eval( "\"\\x{$index}\"" );
print length($char),"\n";
print "[$char]\n";
(output)
1
Wide character in print at ./test.pl line 6.
[가]
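The "Wide character in print" warning looks like a separate issue from my question. As far as I understand, it appears because no output encoding was declared for STDOUT. A minimal sketch, assuming the terminal expects UTF-8 (that is my assumption, not something I verified on other systems):

```perl
#!/usr/bin/perl
use strict;

# Declare that text printed to STDOUT should be encoded as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

my $index = "AC00";
my $char  = eval( "\"\\x{$index}\"" );

print length($char), "\n";
print "[$char]\n";
```

With the encoding layer in place, print should no longer complain about wide characters.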
I had to make a string that consists of
double quote " (it must be quoted with backslash)
backslash \ (quoted)
x
brace {
Unicode index
brace }
double quote " (quoted)
Then I have to eval that string... I think this is too complicated.
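To make the pieces listed above easier to see, here is a small sketch that builds the string first, prints it, and only then passes it to eval. The variable name $source is just something I made up for illustration.

```perl
#!/usr/bin/perl
use strict;

my $index = "AC00";

# Build the source text piece by piece: quote, backslash, x, brace,
# the Unicode index, brace, quote.
my $source = "\"\\x{$index}\"";

# Show the text that eval will compile as Perl code.
print "$source\n";

# eval parses that text as a double-quoted string literal.
my $char = eval $source;
printf "U+%04X\n", ord($char);
```

The first print shows the literal text "\x{AC00}" including the quotes, and the code point reported after eval is U+AC00.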
I think there may be a better way to do this. I found that the
Unicode::Char module provides a u() method:
use Unicode::Char;
my $u = Unicode::Char->new();
my $char = $u->u('AC00'); # u() returns the character at Unicode index AC00
(http://search.cpan.org/~dankogai/Unicode-Char-0.02/lib/Unicode/Char.pm)
But I still wonder if there is a Perl built-in function or standard
module that does the same thing. I would like to know the most common
way.
Thanks.
G.Y.Park from South Korea.