Replacement Help

natedubya · Nov 18, 2005

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

Unfortunately, that dies. I was looking into using an extended
expression:
s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/gx

But that also doesn't work. I was just wondering, is there a way I can
do it this way? I managed to get around it by sticking the expression
inside a while loop and doing the replacement there, but I wanted to
know if this type of thing is even possible.
Thanks.
~Nate

usenet · Nov 18, 2005

natedubya said:
What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

Yes, very easy, but you are going about it in a very difficult manner.

Why not make use of the CGI module (which is included with Perl)?

#!/usr/bin/perl
use strict; use warnings;
use CGI qw/unescape/;

my $string = "hello%20world";
print unescape ($string);
__END__

Mark Clements · Nov 18, 2005

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

Unfortunately, that dies. I was looking into using an extended
expression:
s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/gx

But that also doesn't work. I was just wondering, is there a way I can
do it this way? I managed to get around it by sticking the expression
inside a while loop and doing the replacement there, but I wanted to
know if this type of thing is even possible.
Thanks.
~Nate

assuming this isn't just a learning exercise, check out

URI::Escape

C:\>perl -MURI::Escape -le "print uri_unescape(shift)" "asdf%20adsf"
asdf adsf

from the docs:

uri_unescape($string,...)
Returns a string with each %XX sequence replaced with the actual
byte (octet).
This does the same as:

$string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

but does not modify the string in-place as this RE would. Using the
uri_unescape() function instead of the RE might make the code look
cleaner and is a few characters less to type.
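
To illustrate the in-place point from the docs, here is a minimal sketch. The helper name my_unescape is made up for this example and is not part of URI::Escape; it only mimics the "return a decoded copy" behaviour described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not the URI::Escape implementation): decodes
# %XX escapes on a copy of its argument and returns the copy.
sub my_unescape {
    my ($string) = @_;    # list assignment copies the caller's value
    $string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $string;
}

my $original = "asdf%20adsf";
my $decoded  = my_unescape($original);
print "$original\n";    # asdf%20adsf (caller's string is untouched)
print "$decoded\n";     # asdf adsf

# The bare substitution, by contrast, rewrites its target in place.
$original =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
print "$original\n";    # asdf adsf
```

Which form is preferable probably depends on whether the caller still needs the encoded string afterwards.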

Mark

Paul Lalli · Nov 18, 2005

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

What exactly made you believe you wanted a double-eval in this regexp?

Unfortunately, that dies.

That is a poor error description. You are more likely to get the best
help possible if you describe *precisely* how your scripts fail.
Include full text error messages.

Your solution was almost correct. But the double-eval was attempting
to execute the return value of chr() as Perl code, when it's just a
string.

#!/usr/bin/perl
use strict;
use warnings;

while (<DATA>) {
    s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    print;
}

__DATA__
Hello%20World
Foo%2FBar

Output:
Hello World
Foo/Bar

However, please do take heed of the other responses in this thread, and
use a solution that has already been written rather than reinventing
the wheel.

Paul Lalli

usenet · Nov 18, 2005

Purl said:
[benchmarks]

I suppose if the OP has 100000 strings to munge and s/he really, really
needs to save five seconds then s/he ought to use some cryptic code
instead of the plain and simple function that already comes with Perl.

When given a choice to make things easier for the programmer or easier
for the machine, I usually side with the programmer (since I'm a
programmer).

Brian McCauley · Nov 18, 2005

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

And, indeed a FAQ: "How do I decode or create those %-encodings on the web?"

Please refrain from posting FAQs; it wastes everyone's time.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

Why do you think you want that second /e ?

Unfortunately, that dies. I was looking into using an extended
expression:
s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/gx

But that also doesn't work.

The RHS of s/// is just a double-quotish string. Do not attempt to use
RegEx constructs in it.
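
A small sketch of what "double-quotish" means in practice. The sub name without_e is invented for this example; the point is only that, without /e, $1 is interpolated but chr() and hex() are never called.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical demo sub: the same substitution as in the question,
# but with no /e modifier at all. The replacement is then treated
# like the double-quoted string "chr(hex($1))".
sub without_e {
    my ($text) = @_;
    $text =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/g;
    return $text;
}

print without_e("hello%20world"), "\n";    # hellochr(hex(20))world
```

The same reasoning suggests that a (??{ ... }) block on the right-hand side would also end up as literal text, since that construct only has meaning inside the pattern.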

Mark Clements · Nov 18, 2005

Purl said:
Benchmark: timing 100000 iterations of Clements...
Clements: 5 wallclock secs ( 3.96 usr + 0.00 sys = 3.96 CPU) @ 25252.53/s

Benchmark: timing 100000 iterations of PurlGurl...
PurlGurl: 1 wallclock secs ( 2.25 usr + 0.00 sys = 2.25 CPU) @ 44444.44/s

Purl Gurl

Really? Who cares, quite frankly? Extracting every last ounce of
performance out of the system is secondary to writing clear,
maintainable, legible code in 99% of cases. Optimizing every single
snippet of your code is counter-productive and best left as a learning
exercise.

Mark

Nate · Nov 18, 2005

Okay, 3 responses.

1.)

Why not make use of the CGI module (which is included with Perl)?

It was a learning exercise, so I'm not interested in using a module.

2.)

What exactly made you believe you wanted a double-eval in this regexp?

Things that I had read led me to believe that using 'e' would change
the replacement from a double-quoted string to an expression, and 'ee'
would make perl eval it before it swapped it.

3.)

And, indeed a FAQ: "How do I decode or create those %-encodings on the web?"
Please refrain from posting FAQs; it wastes everyone's time.

Again, it was a learning exercise. I especially don't care about taking
the %-encodings off, it was just an example.

usenet · Nov 18, 2005

Nate said:
It was a learning exercise, so I'm not interested in using a module.

You should have said so. That way, people wouldn't have wasted their
time telling you about modules.

Nate · Nov 18, 2005

You should have said so. That way, people wouldn't have wasted their
time telling you about modules.

I'm *so* sorry that I wasted so much of your apparently valuable time
(of which you apparently have enough that you can reply to scold me
for wasting your time).

That aside, many thanks to Purl Gurl and Paul Lalli for actually
answering my question.

Mark Clements · Nov 18, 2005

Purl said:
Read my response to Filmer and learn.

Thanks: I have already learnt much about your approach to programming.

Gisle Aas makes use of virtually the same code as mine, as does
Lincoln Stein.

This is irrelevant. You can make your own toaster using readily
available component parts, or you can buy a toaster. Which do you
suggest I do next time I need toast?

It is highly illogical to use a slow clunky module in place of simple
single line code, without a very good reason.

legibility? maintainability? programmer efficiency?

To repeat, raw performance matters little. There is a wealth of
literature on this, pretty much starting with Knuth. I'm sure you're
aware of it, though you probably choose to ignore it. The trick is to
optimize when optimization is necessary, i.e. when a bottleneck has been
identified. Up until that point, the aim of the game is clarity. You
don't agree: your choice, but I'm glad I don't have to maintain your code.

Mark

usenet · Nov 18, 2005

Nate said:
I'm *so* sorry that I wasted so much of your apparently valuable time
(of which you apparently have enough that you can reply to scold me
for wasting your time).

I was trying to help you to be polite. But I see now that I'm probably
not up to the challenge.

Sherm Pendley · Nov 18, 2005

Mark Clements said:
Really? Who cares, quite frankly? Extracting every last ounce of
performance out of the system is secondary to writing clear,
maintainable, legible code in 99% of cases.

And, in the remaining 1%, the best approach is to profile your code to
find out what parts of it would yield the best overall results if you
optimize them. The "90/10 rule" applies - 90% of the total execution
time is usually spent in 10% of the code. So, to make the best use of
limited programmer time, you optimize that 10% first, and worry about
the rest as time permits.

I have yet to see a real-world non-trivial CGI where the time spent
parsing the encoded input was significant compared to the time spent
doing real work.
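
For anyone who does want to measure before optimizing, a rough sketch using the core Benchmark module follows. The two subs and the test string are assumptions made up for this example; they are not the code behind the timings quoted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical candidate 1: chr(hex()) on each captured pair.
sub decode_chr {
    my ($s) = @_;
    $s =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical candidate 2: pack the two hex digits into one byte.
sub decode_pack {
    my ($s) = @_;
    $s =~ s/%([0-9a-fA-F]{2})/pack("H2", $1)/ge;
    return $s;
}

# Made-up input; real measurements should use representative data.
my $input = "hello%20world%2Ffoo" x 10;

# Run each candidate a fixed number of times and print the timings.
timethese(20_000, {
    chr_hex => sub { decode_chr($input) },
    pack_h2 => sub { decode_pack($input) },
});
```

Numbers from a micro-benchmark like this only say something about the decoding step in isolation, which is why profiling the whole program first is usually the more informative exercise.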

Optimizing every single
snippet of your code is counter-productive

Indeed. Wasting valuable and limited time on code that has a negligible
impact on total performance is rarely a recipe for job security.

sherm--

A. Sinan Unur · Nov 18, 2005

Okay, 3 responses.

1.)
It was a learning exercise, so I'm not interested in using a module.

You can use the source code of the module to learn.
....

3.)

Again, it was a learning exercise. I especially don't care about
taking the %-encodings off, it was just an example.

Reading the FAQ list would also help you learn.

It seems you are not really interested in learning.

Sinan

Eric J. Roode · Nov 18, 2005

Benchmark: timing 100000 iterations of Clements...
Clements: 5 wallclock secs ( 3.96 usr + 0.00 sys = 3.96 CPU) @ 25252.53/s

Benchmark: timing 100000 iterations of PurlGurl...
PurlGurl: 1 wallclock secs ( 2.25 usr + 0.00 sys = 2.25 CPU) @ 44444.44/s

Why don't you write it in C, and use an XS wrapper? You'll get even better
performance.

--
Eric
`$=`;$_=\%!;($_)=/(.)/;$==++$|;($.,$/,$,,$\,$",$;,$^,$#,$~,$*,$:,@%)=(
$!=~/(.)(.).(.)(.)(.)(.)..(.)(.)(.)..(.)......(.)/,$"),$=++;$.++;$.++;
$_++;$_++;($_,$\,$,)=($~.$"."$;$/$%[$?]$_$\$,$:$%[$?]",$"&$~,$#,);$,++
;$,++;$^|=$";`$_$\$,$/$:$;$~$*$%[$?]$.$~$*${#}$%[$?]$;$\$"$^$~$*.>&$=`

Mark Clements · Nov 18, 2005

Purl said:
You have written, paraphrased,

"It is ok to use your code inside a module. It is not ok
to use your code outside a module."

Boy Howdy!

Purl Gurl

You are greatly skilled at taking statements out of context and/or
misinterpreting them. I take my hat off to you.

Mark

Tad McClellan · Nov 18, 2005

Nate said:
I'm *so* sorry that I wasted so much of your apparently valuable time

That won't be a problem in the future.

Thanks for identifying yourself.

Brian McCauley · Nov 19, 2005

Nate said:
Okay, 3 responses.

Things that I had read led me to believe that using 'e' would change
the replacement from a double-quoted string to an expression, and 'ee'
would make perl eval it before it swapped it.

Yes that is absolutely correct.

If $1 is '20' then the Perl expression chr(hex($1)) will return ' '.

If you put on the second /e then you perform eval(' ') which will return
undef.

I tend to think of s///e as the more primitive form, rather than s///
without the /e.

In other words, I find it helpful to consider that...

s/foo/bar/;

...is shorthand for...

s/foo/"bar"/e;
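
The difference between one /e and two can be seen side by side in a short sketch. The sub names decode_once and decode_twice are invented for this example, and the sample string is the one from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single /e: the replacement is evaluated once as a Perl expression
# and the result of chr(hex($1)) is substituted.
sub decode_once {
    my ($text) = @_;
    $text =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    return $text;
}

# Double /e: the result of that expression is eval'ed again as Perl
# source. For "%20" that source is a single space, so the eval gives
# undef and the escape is replaced with nothing.
sub decode_twice {
    my ($text) = @_;
    no warnings;    # hide the "uninitialized value" warning from undef
    $text =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/gee;
    return $text;
}

print decode_once("hello%20world"), "\n";     # hello world
print decode_twice("hello%20world"), "\n";    # helloworld
```

With warnings left on, the second version would also report an uninitialized value for each match, which is usually the first hint that one /e too many is in play.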

Nate · Nov 21, 2005

Okay, cool. Thanks for clearing that up.
