text converter

Abanowicz Tomasz

Hello,
I would like to convert either a single character or a sequence of
characters into one character. Take the following example:

Thi\u105 is stAbCange te±t

The conversion map looks like:

\u105 -> s
AbC -> r
± -> x

The result after conversion should be:

This is strange text

How can I do that in Perl so that it is simple and efficient?
I need some hints, i.e. functions to use, or at most a few lines of
example code.

Many thanks for any help.
 
Josef Moellers

I usually do it like this:

my %map = (
    '\u105' => 's',
    'AbC'   => 'r',
    '±'     => 'x',
);

my $string = 'Thi\u105 is stAbCange te±t';
# Apply each mapping in turn; \Q...\E quotes any regex metacharacters in the key.
while (my ($orig, $repl) = each %map) {
    $string =~ s/\Q$orig\E/$repl/g;
}
print "$string\n";
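
One caveat with the loop, offered as a hedged aside: each pass runs over the output of the earlier passes, and `each` returns hash entries in no particular order. With the map in this thread that makes no difference, because no replacement produces text that another key matches. The sketch below uses a made-up map (a -> b, b -> c) to show what can happen otherwise. The helper name `convert_in_order` is hypothetical, and an ordered list of pairs stands in for the hash so that the order is explicit.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the [pattern, replacement] pairs one after
# another, in exactly the order given.
sub convert_in_order {
    my ($string, @pairs) = @_;
    for my $pair (@pairs) {
        my ($orig, $repl) = @$pair;
        $string =~ s/\Q$orig\E/$repl/g;
    }
    return $string;
}

# Made-up map in which one replacement ('b') is itself a key.
my $a_first = convert_in_order('a b', [ 'a', 'b' ], [ 'b', 'c' ]);
my $b_first = convert_in_order('a b', [ 'b', 'c' ], [ 'a', 'b' ]);

print "a first: $a_first\n";   # prints "a first: c c"
print "b first: $b_first\n";   # prints "b first: b c"
```

If the keys and replacements of a map can overlap like this, a fixed order (or a single pass over the string) keeps the result predictable.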
 
Mirco Wahab

Josef already posted a very nice solution,
so I could just copy/paste part of it for mine ;-)

...
my $string = q{Thi\u105 is stAbCange te±t};

$string =~ s/\Q$_->[0]\E/$_->[1]/g
    for
        [ '\u105', 's' ],
        [ 'AbC',   'r' ],
        [ '±',     'x' ];
...

Regards

M.
 
anno4000

Alternatively, one can pack all the patterns into one regex
alternation and do the conversion in a single s///g. To do
this reliably it is best to try long patterns before short
ones, because an alternation takes the first alternative that
matches at a given position. If the map contained 'AbCd' besides
'AbC', the test for 'AbCd' must come first.
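
To illustrate the ordering point, here is a small sketch with two hand-written alternations. The entry 'AbCd' => 'R' and the input string are made up for the demonstration and are not part of the original map.

```perl
use strict;
use warnings;

# 'AbCd' => 'R' is a hypothetical extra entry, added only for this demo.
my %map = ( 'AbC' => 'r', 'AbCd' => 'R' );

my $short_first = 'stAbCdange';
my $long_first  = 'stAbCdange';

# Short alternative first: 'AbC' wins, and the trailing 'd' is left behind.
$short_first =~ s/(AbC|AbCd)/$map{$1}/g;

# Long alternative first: the whole of 'AbCd' is consumed.
$long_first =~ s/(AbCd|AbC)/$map{$1}/g;

print "$short_first\n";   # prints "strdange"
print "$long_first\n";    # prints "stRange"
```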

Starting from %map as above, the regex can be built like this:

my $re = join '|',
    map quotemeta,
    sort { length($b) <=> length($a) }
    keys %map;

Then the conversion becomes

$string =~ s/($re)/$map{$1}/g;
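
Put together, the single-regex approach might look like the following minimal sketch. The sub name `convert_once` is hypothetical, and precompiling the pattern with qr// is an optional choice here, not something the posts above require.

```perl
use strict;
use warnings;

my %map = (
    '\u105' => 's',
    'AbC'   => 'r',
    '±'     => 'x',
);

# Build one alternation, longest keys first, with metacharacters quoted.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }
    keys %map;
my $re = qr/($alternation)/;

# Hypothetical wrapper: replace every mapped sequence in a single pass.
sub convert_once {
    my ($string) = @_;
    $string =~ s/$re/$map{$1}/g;
    return $string;
}

print convert_once('Thi\u105 is stAbCange te±t'), "\n";   # prints "This is strange text"
```

Because the string is scanned only once, a replacement can never be picked up again by a later pattern.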


On my machine, the one-regex solution is noticeably, but not
dramatically, faster than the looping one (a factor of 2).
Then again, it's more code and less direct, so harder to read.
Anyone's call...
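
For anyone who wants to measure this on their own machine, a rough sketch using the core Benchmark module follows. The sub names, the repeated test input, and the one-second run time are assumptions chosen for illustration; the numbers will vary with the Perl version, the input, and the size of the map.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my %map = (
    '\u105' => 's',
    'AbC'   => 'r',
    '±'     => 'x',
);

# Assumed test input: the example string repeated to give the regex some work.
my $input = 'Thi\u105 is stAbCange te±t ' x 100;

my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }
    keys %map;
my $re = qr/($alternation)/;

# One s///g pass per map entry.
sub with_loop {
    my $string = $input;
    for my $orig (keys %map) {
        $string =~ s/\Q$orig\E/$map{$orig}/g;
    }
    return $string;
}

# One s///g pass in total.
sub with_single_regex {
    my $string = $input;
    $string =~ s/$re/$map{$1}/g;
    return $string;
}

# A negative count asks Benchmark to run each sub for at least that many CPU seconds.
cmpthese(-1, {
    loop         => \&with_loop,
    single_regex => \&with_single_regex,
});
```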

Anno
 
