Replacement Help

natedubya

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

Unfortunately, that dies. I was looking into using an extended
expression:
s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/gx

But that also doesn't work. I was just wondering, is there a way I can
do it this way? I managed to get around it by sticking the expression
inside a while loop and doing the replacement there, but I wanted to
know if this type of thing is even possible.
Thanks.
~Nate
 
usenet

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

Yes, very easy, but you are going about it in a very difficult manner.

Why not make use of the CGI module (which is included with Perl)?

#!/usr/bin/perl
use strict; use warnings;
use CGI qw/unescape/;

my $string = "hello%20world";
print unescape ($string);
__END__
 
Mark Clements

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

Unfortunately, that dies. I was looking into using an extended
expression:
s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/gx

But that also doesn't work. I was just wondering, is there a way I can
do it this way? I managed to get around it by sticking the expression
inside a while loop and doing the replacement there, but I wanted to
know if this type of thing is even possible.
Thanks.
~Nate

assuming this isn't just a learning exercise, check out

URI::Escape

C:\>perl -MURI::Escape -le "print uri_unescape(shift)" "asdf%20adsf"
asdf adsf

from the docs:

uri_unescape($string,...)
Returns a string with each %XX sequence replaced with the actual
byte (octet).
This does the same as:

$string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

but does not modify the string in-place as this RE would. Using the
uri_unescape() function instead of the RE might make the code look
cleaner and is a few characters less to type.
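
For the learning-exercise case, the "does not modify the string in-place" point can be reproduced without the module. This is only a sketch: decode_percent is a made-up helper name, not part of URI::Escape, and it simply applies the regex from the docs to a copy of its argument.

```perl
use strict;
use warnings;

# Hypothetical helper (not from URI::Escape): decode a copy of the
# argument so the caller's string is left untouched.
sub decode_percent {
    my ($text) = @_;    # $text is a copy of the caller's value
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

my $original = "asdf%20adsf";
my $decoded  = decode_percent($original);

print "$original\n";    # the original still contains %20
print "$decoded\n";     # the copy has the space
```

Applying the s///eg directly to $original would change it in place instead.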


Mark
 
Paul Lalli

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

What exactly made you believe you wanted a double-eval in this regexp?

Unfortunately, that dies.

That is a poor error description. You are more likely to get the best
help possible if you describe *precisely* how your scripts fail.
Include full text error messages.

Your solution was almost correct. But the double-eval was attempting
to execute the return value of chr() as Perl code, when it's just a
string.

#!/usr/bin/perl
use strict;
use warnings;

while (<DATA>) {
    s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
    print;
}

__DATA__
Hello%20World
Foo%2FBar

Output:
Hello World
Foo/Bar


However, please do take heed of the other responses in this thread, and
use a solution that has already been written rather than reinventing
the wheel.

Paul Lalli
 
usenet

Purl said:
[benchmarks]

I suppose if the OP has 100000 strings to munge and s/he really, really
needs to save five seconds then s/he ought to use some cryptic code
instead of the plain and simple function that already comes with Perl.

When given a choice to make things easier for the programmer or easier
for the machine, I usually side with the programmer (since I'm a
programmer).
 
Brian McCauley

I was writing a quick little program up, and I ran into a bit of
difficulty with it.

What I'm trying to end up with is a simple program that takes stdin and
takes any hexed numbers (like "%20", as in web addresses) and changes
them to their ascii counterpart. So, something like "hello%20world"
gets changed to "hello world". Easy enough.

And, indeed a FAQ: "How do I decode or create those %-encodings on the web?"

Please refrain from posting FAQs; it wastes everyone's time.

What I have so far is simple:
s/%([0-9a-fA-F]{2})/chr(hex($1))/gee

Why do you think you want that second /e ?

Unfortunately, that dies. I was looking into using an extended
expression:
s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/gx

But that also doesn't work.

The RHS of s/// is just a double-quotish string. Do not attempt to use
RegEx constructs in it.
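
A small sketch of what that means in practice, using the OP's own pattern. Without /e the replacement is interpolated like a double-quoted string, so only $1 is expanded and the (??{ ... }) text is copied through as-is. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $input = "hello%20world";

# No /e: the replacement is a double-quotish string. $1 is interpolated,
# and everything else, including "(??{ ... })", is literal text.
(my $literal = $input) =~ s/%([0-9a-fA-F]{2})/(??{ chr(hex($1)) })/g;

# With /e: the replacement is a Perl expression that gets evaluated.
(my $decoded = $input) =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;

print "$literal\n";
print "$decoded\n";
```

The first line printed still has the regex construct sitting in the middle of the string; the second is the decoded text.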
 
Mark Clements

Purl said:
Benchmark: timing 100000 iterations of Clements...
Clements: 5 wallclock secs ( 3.96 usr + 0.00 sys = 3.96 CPU) @ 25252.53/s

Benchmark: timing 100000 iterations of PurlGurl...
PurlGurl: 1 wallclock secs ( 2.25 usr + 0.00 sys = 2.25 CPU) @ 44444.44/s


Purl Gurl

Really? Who cares, quite frankly? Extracting every last ounce of
performance out of the system is secondary to writing clear,
maintainable, legible code in 99% of cases. Optimizing every single
snippet of your code is counter-productive and best left as a learning
exercise.

Mark
 
Nate

Okay, 3 responses.

1.)
Why not make use of the CGI module (which is included with Perl)?
It was a learning exercise, so I'm not interested in using a module.

2.)
What exactly made you believe you wanted a double-eval in this regexp?
Things that I had read led me to believe that using 'e' would change the
replacement from a double-quoted string to an expression, and 'ee'
would make perl eval it before it swapped it.

3.)
And, indeed a FAQ: "How do I decode or create those %-encodings on the web?"
Please refrain from posting FAQs; it wastes everyone's time.

Again, it was a learning exercise. I especially don't care about taking
the %-encodings off, it was just an example.
 
usenet

Nate said:
It was a learning exercise, so I'm not interested in using a module.

You should have said so. That way, people wouldn't have wasted their
time telling you about modules.
 
Nate

You should have said so. That way, people wouldn't have wasted their
time telling you about modules.

I'm *so* sorry that I wasted so much of your apparently valuable time
(of which you apparently have enough that you can reply to scold me
for wasting your time).

That aside, many thanks to Purl Gurl and Paul Lalli for actually
answering my question.
 
Mark Clements

Purl said:
Read my response to Filmer and learn.

Thanks: I have already learnt much about your approach to programming.
Gisle Aas makes use of virtually the same code as mine, as does
Lincoln Stein.

This is irrelevant. You can make your own toaster using readily
available component parts, or you can buy a toaster. Which do you
suggest I do next time I need toast?

It is highly illogical to use a slow clunky module in place of simple
single line code, without a very good reason.

legibility? maintainability? programmer efficiency?

To repeat, raw performance matters little. There is a wealth of
literature on this, pretty much starting with Knuth. I'm sure you're
aware of it, though you probably choose to ignore it. The trick is to
optimize when optimization is necessary, i.e. when a bottleneck has been
identified. Up until that point, the aim of the game is clarity. You
don't agree: your choice, but I'm glad I don't have to maintain your code.


Mark
 
usenet

Nate said:
I'm *so* sorry that I wasted so much of your apparently valuable time
(of which you apparently have enough that you can reply to scold me
for wasting your time).

I was trying to help you to be polite. But I see now that I'm probably
not up to the challenge.
 
Sherm Pendley

Mark Clements said:
Really? Who cares, quite frankly? Extracting every last ounce of
performance out of the system is secondary to writing clear,
maintainable, legible code in 99% of cases.

And, in the remaining 1%, the best approach is to profile your code to
find out what parts of it would yield the best overall results if you
optimize them. The "90/10 rule" applies - 90% of the total execution
time is usually spent in 10% of the code. So, to make the best use of
limited programmer time, you optimize that 10% first, and worry about
the rest as time permits.
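
For anyone who does want to measure before optimizing, the numbers quoted in this thread look like output from the core Benchmark module. Here is a minimal sketch with timethese; the two candidates are hypothetical stand-ins (a function call versus the same regex inline), not the code anyone in the thread actually timed, and the iteration count is arbitrary.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Stand-in for "call a function that does the decoding".
sub decode_copy {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

my $sample = "hello%20world%2Ffoo";

# Run each candidate the same number of times and print the timings.
my $results = timethese(20_000, {
    function_call => sub { my $out = decode_copy($sample); },
    inline_regex  => sub {
        (my $out = $sample) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    },
});
```

Timings like these only say how the snippets compare in isolation; whether the difference matters depends on how much of the whole program's run time is spent there.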

I have yet to see a real-world non-trivial CGI where the time spent
parsing the encoded input was significant compared to the time spent
doing real work.

Optimizing every single
snippet of your code is counter-productive

Indeed. Wasting valuable and limited time on code that has a negligible
impact on total performance is rarely a recipe for job security.

sherm--
 
A. Sinan Unur

Okay, 3 responses.

1.)
It was a learning exercise, so I'm not interested in using a module.

You can use the source code of the module to learn.
....
3.)

Again, it was a learning exercise. I especially don't care about
taking the %-encodings off, it was just an example.

Reading the FAQ list would also help you learn.

It seems you are not really interested in learning.

Sinan
 
Eric J. Roode

Benchmark: timing 100000 iterations of Clements...
Clements: 5 wallclock secs ( 3.96 usr + 0.00 sys = 3.96 CPU) @ 25252.53/s

Benchmark: timing 100000 iterations of PurlGurl...
PurlGurl: 1 wallclock secs ( 2.25 usr + 0.00 sys = 2.25 CPU) @ 44444.44/s

Why don't you write it in C, and use an XS wrapper? You'll get even better
performance.

--
Eric
`$=`;$_=\%!;($_)=/(.)/;$==++$|;($.,$/,$,,$\,$",$;,$^,$#,$~,$*,$:,@%)=(
$!=~/(.)(.).(.)(.)(.)(.)..(.)(.)(.)..(.)......(.)/,$"),$=++;$.++;$.++;
$_++;$_++;($_,$\,$,)=($~.$"."$;$/$%[$?]$_$\$,$:$%[$?]",$"&$~,$#,);$,++
;$,++;$^|=$";`$_$\$,$/$:$;$~$*$%[$?]$.$~$*${#}$%[$?]$;$\$"$^$~$*.>&$=`
 
Mark Clements

Purl said:
You have written, paraphrased,

"It is ok to use your code inside a module. It is not ok
to use your code outside a module."

Boy Howdy!

Purl Gurl

You are greatly skilled at taking statements out of context and/or
misinterpreting them. I take my hat off to you.

Mark
 
Tad McClellan

Nate said:
I'm *so* sorry that I wasted so much of your apparently valuable time


That won't be a problem in the future.

Thanks for identifying yourself.
 
Brian McCauley

Nate said:
Okay, 3 responses.


Things that I had read led me to believe that using 'e' would change the
replacement from a double-quoted string to an expression, and 'ee'
would make perl eval it before it swapped it.

Yes that is absolutely correct.

If $1 is '20' then the Perl expression chr(hex($1)) will return ' '.

If you put on the second /e then you perform eval(' ') which will return
undef.
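
A quick way to see this for yourself, as a sketch: run the same substitution once with /e and once with /ee on the example string. The warning for the undef value is switched off locally so the output stays readable; variable names are illustrative only.

```perl
use strict;
use warnings;

my $input = "hello%20world";

# One /e: chr(hex($1)) is evaluated and its result is the replacement.
(my $once = $input) =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;

# Two /e: the result ' ' is itself eval'ed as Perl code, which yields
# undef, so the %20 is replaced with nothing.
my $twice = $input;
{
    no warnings 'uninitialized';
    $twice =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/gee;
}

print "[$once]\n";
print "[$twice]\n";
```

With /ee the space is lost rather than decoded, which matches the explanation above.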

I tend to think of the s///e as the more primitive form than the s///
without the /e.

In other words I find it helpful to consider that...

s/foo/bar/;

...is shorthand for...

s/foo/"bar"/e;
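
That equivalence is easy to check. The sketch below runs both spellings on the same input, once with a fixed replacement and once with a captured group, and prints the results side by side. The sample strings are made up for the demonstration.

```perl
use strict;
use warnings;

# Fixed replacement text: plain form versus quoted string with /e.
(my $plain = "a foo b") =~ s/foo/bar/;
(my $expr  = "a foo b") =~ s/foo/"bar"/e;

# The same holds when the replacement interpolates a capture.
(my $plain_cap = "a foo b") =~ s/(o+)/<$1>/;
(my $expr_cap  = "a foo b") =~ s/(o+)/"<$1>"/e;

print "$plain | $expr\n";
print "$plain_cap | $expr_cap\n";
```

Each pair comes out identical, which is why adding a second /e to an expression that already returns the final string goes one evaluation too far.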
 
