unicode

  • Thread starter nicolas_laurent545

nicolas_laurent545

This command

perl -pi -e 's/(\p{IsAplha}+)(é)(\s)/($1$2)/g' text.txt

returns

Can't find Unicode property definition "Aplha" at -e line 1, <> line 1.

What should I do to correct the problem?

Thanks
 

Tobias Witek

I guess you could probably try to spell "Alpha" correctly.

I am not an expert in Perl, but that might fix your problem :-D
 

Dr.Ruud

nicolas_laurent545 wrote:
> perl -pi -e 's/(\p{IsAplha}+)(é)(\s)/($1$2)/g' text.txt
> returns
> Can't find Unicode property definition "Aplha"
> at -e line 1, <> line 1.
>
> What should I do to correct the problem?

See "perldoc perlre": the property is spelled IsAlpha.
 

nicolas_laurent545

Tobias Witek wrote:
> I guess you could probably try to spell "Alpha" correctly.
>
> I am not an expert in Perl, but that might fix your problem :-D

I am sorry, but I still have the problem with Unicode.

perl -pi -e 's/(\p{IsAlpha}+)(é)(\s)/($1$2)/g' text.txt

With this input, the output is not what I expect:
input: détrôné ôperaé ôpéraé tônâné tûtéé
expected output: (détrôné) (ôperaé) (ôpéraé) (tônâné) (tûtéé)
actual output: détrô(né)ô(peraé)ôpé(raé)tônâ(né)tûtéé
 

anno4000

nicolas_laurent545 wrote:
> I am sorry, but I still have the problem with Unicode.
>
> perl -pi -e 's/(\p{IsAlpha}+)(é)(\s)/($1$2)/g' text.txt
>
> input: détrôné ôperaé ôpéraé tônâné tûtéé
> expected output: (détrôné) (ôperaé) (ôpéraé) (tônâné) (tûtéé)

I wouldn't expect any blanks in the output, because the replacement
part of the substitution doesn't use $3. So I'd expect

(détrôné)(ôperaé)(ôpéraé)(tônâné)(tûtéé)

and that's what I get.

> actual output: détrô(né)ô(peraé)ôpé(raé)tônâ(né)tûtéé

What version of Perl?

Anno
 

Ben Bacarisse

nicolas_laurent545 wrote:
> I am sorry, but I still have the problem with Unicode.
>
> perl -pi -e 's/(\p{IsAlpha}+)(é)(\s)/($1$2)/g' text.txt
>
> input: détrôné ôperaé ôpéraé tônâné tûtéé
> expected output: (détrôné) (ôperaé) (ôpéraé) (tônâné) (tûtéé)
> actual output: détrô(né)ô(peraé)ôpé(raé)tônâ(né)tûtéé

There are two separate issues:
(a) the IO layer must use (and expect) UTF-8;
(b) the program source includes a UTF-8 encoded character (é).

With my version, 5.8.0, "use utf8;" is still required for (b), and I
must turn on UTF-8 encoding of IO with the -C option to achieve (a), so

perl -C -pi -e 'use utf8; s/(\p{IsAlpha}+)(é)(\s)/($1$2)/g' text.txt

works for me. To avoid the "use utf8;", just write the é as \x{E9}; you
can also avoid the -C flag by setting the PERL_UNICODE environment variable.

I found "perldoc perluniintro" very informative.
 
