fastest count of instances in string?

ko

Purl said:
John W. Krahn wrote:

(snipped)




Ok, you got me on this one. I am reading Perl Cookbook,
Chapter 11 on taking references to scalars. I have not
yet found any notes on this oddity.

Would you mind helping me to understand how your final
printed sum is generated? In other words, where is your
number 135267140 originating?

I am running tests,

$stupid = \55;
print $stupid;
print "\n";
print $stupid + 0;

Naturally my numbers are different but I cannot understand
how the base number is created. My local results are:

SCALAR(0x1765654)
24532564

So, where the heck is your number and my 24532564 originating?

Thanks,

Purl Gurl

Automatic string to number conversion. The scalar's *memory* address is
converted from its hexadecimal representation to decimal:

perl> $a = \55;
perl> print "$a\n";
SCALAR(0x1ac65f0)
perl> print $a + 0, "\n";
28075504
perl> printf("%d\n", 0x1ac65f0);
28075504

HTH - keith
 
Randal L. Schwartz

ko> Automatic string to number conversion. The scalar's *memory* address
ko> is converted from its hexadecimal representation to decimal:

Well, it was never "hex" to begin with. It was a number. When you
use a reference in a numeric context, you get its address. And
numbers are printed as decimal by default.

When you use a reference in a string context, you get a debugging
value, which includes the address as a hex value.

Slight terminology nitpicking, but sloppy talk in that arena leads to
nightmares.
 
Uri Guttman

RLS> When you use a reference in a string context, you get a debugging
RLS> value, which includes the address as a hex value.

it is not only for debugging. it is a guaranteed unique string by virtue
of its address. it is useful for the key in a hash of objects, or as a
anonymous class name and other areas. damian uses that last concept in
his hook::lexwrap module. and i bet class::classless uses it too.

uri
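
A minimal sketch of the hash-of-objects use Uri mentions, assuming two made-up objects and a hypothetical side table called %label_for (neither is from the thread). The stringified reference differs for each live object, so it works as a key even when the contents are identical:

```perl
use strict;
use warnings;

# Hypothetical example objects: two distinct anonymous hashes.
my $first  = { colour => 'red' };
my $second = { colour => 'red' };

# Side table keyed by the stringified reference, e.g. "HASH(0x...)".
my %label_for;
$label_for{$first}  = 'first';
$label_for{$second} = 'second';

# Same contents, different addresses, so the keys do not collide.
print "$first is labelled $label_for{$first}\n";
print "$second is labelled $label_for{$second}\n";
print "entries: ", scalar(keys %label_for), "\n";
```

One caveat: the string is only unique among references that are alive at the same time. Once an object is destroyed, perl may reuse its address for a new one.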
 
ko

Randal said:
ko> Automatic string to number conversion. The scalar's *memory* address
ko> is converted from its hexadecimal representation to decimal:

Well, it was never "hex" to begin with. It was a number. When you
use a reference in a numeric context, you get its address. And
numbers are printed as decimal by default.

Ok, thanks for pointing out that the reference was a number. Are there
any practical uses for references in numeric context?
 
Uri Guttman

PG> I have been playing with this scalar reference notion and
PG> working on how to ask if there is usage for this reference
PG> return as a number. You have provided some good examples.

PG> Are there other uses you know of, Uri?

sure. but i am not telling you.

PG> This specific topic is new to me. When you indicate usage
PG> for a hash key, do you mean the hex value or the numeric
PG> value which had me confused?

i won't help you until you leave this group. and then i will only help by
posting here. figure that out.

PG> As an example, would you store the hex value as a key for
PG> later access by dereference?

PG> This is really interesting.

it is way over your head as it doesn't involve index or crappy html. leave
it to the professionals. you will never grok references thankfully as it
keeps you out of many of the threads here.

uri
 
Randal L. Schwartz

Purl> You posted this information simply to confuse me.

Not really. I'm trying to bring light to a confusing subject.
Such is my nature, except when I'm yelling at someone. :)

Purl> You have confused me in this way. When I create a reference to
Purl> a scalar, a typical hex format number is returned:


Purl> SCALAR(0x1765654)

But only when you do that in a place that needs a string ("string
context").


Purl> When I use John Krahn's method of:

Purl> print $scalar_reference + 0;

Purl> An expected natural number is returned.

Yes, now you're using a reference in a place that needs
a number, and the actual address is returned.

Purl> So, using Krahn's method, am I actually addressing a "numerical"
Purl> memory address, this is, a plain natural number (not hex) and it
Purl> is being returned?

Right.

Purl> Is the hex number SCALAR(some hex value), is this generated
Purl> by perl core solely?

Correct, I guess, although I'm not sure what the terms "core" and
"solely" mean in there. "as opposed to what?", my brain says, "little
green goblins?".

When you use a reference in a string context, a debugging value is
returned.

Purl> In other words, it is not the system memory
Purl> address but rather perl's "hook" into the memory address
Purl> but in hex format? No conversion from hex to dec takes place
Purl> when I use Krahn's method?

Correct. A reference in a numeric context returns its address.
A reference in a string context returns something like "FOO(0xDEADBEEF)".
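
The two contexts can be compared side by side. This is only a sketch: the variable names are made up, and refaddr comes from the core Scalar::Util module rather than from anything posted in this thread. It suggests that the string form is the numeric address formatted as hex, not the other way round:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $ref = \55;

my $as_string = "$ref";     # string context:  "SCALAR(0x...)"
my $as_number = $ref + 0;   # numeric context: the address itself

print "string context:  $as_string\n";
print "numeric context: $as_number\n";

# Formatting the number as hex reproduces the string form.
my $rebuilt = sprintf 'SCALAR(0x%x)', $as_number;
print "rebuilt string:  $rebuilt\n";

# refaddr() returns the same address without relying on numeric context.
print "refaddr agrees\n" if refaddr($ref) == $as_number;
```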
 
Roy Johnson

My results were: about 9-10 sec for your solution, 12 for s///g, and
15 for m//g with an iterator. Code is below for anybody who wants to
try/tweak it. Not as much difference as I had expected, but
noticeable.

#!perl
use strict;
use warnings;

my $string='cacababaA';
my %tr_subs ;
my $count;
for my $i (1 .. 1000000) {
    my $char='a';
    $count = str($string, $char);
}
print "There are $count of them.\n";

sub mtr {
    my ($str, $tr_chars) = @_;
    my $count = 0;
    ++$count while $str=~/a/g;
    $count;
}

sub str {
    my ($str, $tr_chars) = @_;
    $str=~s/$tr_chars//g;
}

sub tr_chars {
    my ( $str, $tr_chars ) = @_;
    $tr_subs{ $tr_chars } ||= eval 'sub { $str=~tr/$tr_chars// }' ;
    $tr_subs{ $tr_chars }->() ;
}
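
For anybody who would rather not time this with a hand-rolled loop, here is a rough sketch using the core Benchmark module. The sub names (count_tr, count_subst, count_match) and the iteration count are assumptions for illustration, not the code measured above, and the character is hard-coded to 'a':

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my $string = 'cacababaA';

# Count with tr/// (no replacement list, so the string is not changed).
sub count_tr    { my ($s) = @_; return scalar($s =~ tr/a//) }

# Count via the return value of s///g, working on a private copy.
sub count_subst { my ($s) = @_; my $n = ($s =~ s/a//g); return $n || 0 }

# Count by iterating over m//g matches.
sub count_match { my ($s) = @_; my $n = 0; ++$n while $s =~ /a/g; return $n }

# Run each candidate the same number of times and print the timings.
timethese(100_000, {
    'tr'   => sub { count_tr($string) },
    's'    => sub { count_subst($string) },
    'm//g' => sub { count_match($string) },
});
```

Timings will vary by machine and perl version, so treat the ordering as something to measure rather than assume.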
 
Uri Guttman

RJ> sub str {
RJ> my ($str, $tr_chars) = @_;
RJ> $str=~s/$tr_chars//g;
RJ> }

that still doesn't do anything useful. how many times do you have to be
told that?

RJ> sub tr_chars {
RJ> my ( $str, $tr_chars ) = @_;
RJ> $tr_subs{ $tr_chars } ||= eval 'sub { $str=~tr/$tr_chars// }' ;

and that doesn't interpolate $tr_chars either as it is in single quotes.
it may still work as it has an 'a' in '$tr_chars' but it is slower than
just having 'a' in there.


RJ> $tr_subs{ $tr_chars }->() ;
RJ> }

this is going nowhere slowly. first, learn how to generate subs on the
fly with exactly what you want. try printing out the generated code
first and then do an eval on it. also learn to factor out fixed stuff
like that out of the loop. even an ||= takes time that is reflected in
the benchmarks.

uri
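
A sketch of what that advice could look like, assuming a hypothetical helper named make_tr_counter and a deliberately narrow input check (letters and digits only, so nothing special to tr/// can sneak into the generated code). The source is printed before it is eval'ed, and the sub is built once, outside the loop:

```perl
use strict;
use warnings;

# Hypothetical helper: build a counting sub for a fixed set of characters.
sub make_tr_counter {
    my ($chars) = @_;

    # Assumption for this sketch: only plain letters and digits are allowed.
    die "unsupported characters: $chars\n" unless $chars =~ /\A[A-Za-z0-9]+\z/;

    # $chars is interpolated now; \$_[0] is left for the generated sub.
    my $source = "sub { \$_[0] =~ tr/$chars// }";
    print "generated: $source\n";

    my $counter = eval $source or die "eval failed: $@";
    return $counter;
}

my $count_a = make_tr_counter('a');    # built once, outside the loop

my @inputs = ('cacababaA', 'banana');
my $total  = 0;
$total += $count_a->($_) for @inputs;
print "total: $total\n";
```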
 
Bill

Purl Gurl said:
John W. Krahn wrote:

(snipped)



Ok, you got me on this one. I am reading Perl Cookbook,
Chapter 11 on taking references to scalars. I have not
yet found any notes on this oddity.

Well, you need to know C, then study the source code for perl 5, for
this one. And for anything other than writing perl mods it's pretty useless.

Would you mind helping me to understand how your final
printed sum is generated? In other words, where is your
number 135267140 originating?

I am running tests,

$stupid = \55;
print $stupid;
print "\n";
print $stupid + 0;

Naturally my numbers are different but I cannot understand
how the base number is created. My local results are:

SCALAR(0x1765654)
24532564

So, where the heck is your number and my 24532564 originating?

<silly>Well, grasshopper, let us show you one of the deep things of
Perl.</silly>

24532564 is the decimal representation of the hex address 0x1765654.
The perl internal reference of \55 is the builtin match variable $55.

try this (watch for word wrap, the strings are long):

#!/usr/bin/perl

$idiotic = "abcdefghijklmnopqrstuvwxyz";

$moronic = "(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)(o)(p)(q)(r)(s)(t)(u)(v)(w)(x)(y)(z)";

$dumb = $idiotic . uc $idiotic . '123456789';

$dumber = $moronic . uc $moronic . '(1)(2)(3)(4)(5)(6)(7)(8)(9)';

print "matching $dumb AGAINST $dumber\n\n";

@a = $dumb =~ m/$dumber/;

$stupid = \55;

print "address is $stupid, value is $55, decimal val plus zero is ",
$stupid + 0, "\n\n";

foreach (1 .. 61) {
    my($var, $addr, $value);
    $var = '$' . $_;
    eval ( '$addr = \$' . $_ . ';' );
    eval ( '$value = $' . $_ . ';' );
    print "For builtin var $var, address is $addr, value is $value\n";
}


--Bill
 
Dave Cross

This specific topic is new to me. When you indicate usage for a hash
key, do you mean the hex value or the numeric value which had me
confused?

The key to a hash is a string. When you use a reference as a hash key,
Perl therefore stringifies it and uses the string value as the key.

As an example, would you store the hex value as a key for later access
by dereference?

But the stringified version of a reference is no longer a reference. You
can't dereference it.

#!/usr/bin/perl

use warnings;

my $ref = \42;

my %hash;

$hash{$ref} = 'The answer';

# This prints SCALAR(0xDEADBEEF)
print keys %hash, "\n";

while (my $key = each %hash) {
    print "Ref is $key\n";
    print "Dereferenced: $$key\n";
}

# But the real reference still works;
print "Real ref is $ref\n";
print "Dereferenced: $$ref\n";

If you "use strict" then you get told that you're doing bad things.

Can't use string ("SCALAR(0x805f84c)") as a SCALAR ref while "strict refs"
in use at ./ref.pl line 16.

Dave...
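
If the referenced value is needed again later, one common workaround is to keep the real reference in the hash value next to whatever else is stored. This is a minimal sketch under that assumption; the field names ref and label are made up for illustration:

```perl
use strict;
use warnings;

my $ref = \42;

my %hash;

# The key is only a string; the value keeps the real reference alive.
$hash{$ref} = { ref => $ref, label => 'The answer' };

for my $key (keys %hash) {
    my $entry = $hash{$key};
    print "Key $key is only a string\n";
    print "Label: $entry->{label}\n";
    print "Dereferenced via stored ref: ${ $entry->{ref} }\n";
}
```

This runs cleanly under "use strict" because only the stored reference is dereferenced, never the key.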
 
Roy Johnson

Uri Guttman said:
RJ> sub str {
RJ> my ($str, $tr_chars) = @_;
RJ> $str=~s/$tr_chars//g;
RJ> }

that still doesn't do anything useful. how many times do you have to be
told that?

How many times do you imagine you have told me that? Can you explain
why the output of the program is "There are 4 of them." when the str
function is called?

RJ> sub tr_chars {
RJ> my ( $str, $tr_chars ) = @_;
RJ> $tr_subs{ $tr_chars } ||= eval 'sub { $str=~tr/$tr_chars// }' ;

and that doesn't interpolate $tr_chars either as it is in single quotes.
it may still works as it has an 'a' in '$tr_chars' but it is slower than
just having 'a' in there.

Yep. I need to protect $str from interpolation, but not $tr_chars:
$tr_subs{ $tr_chars } ||= eval "sub { \$str=~tr/$tr_chars// }" ;
The timing is not noticeably affected.

this is going nowhere slowly.

Then ignore me. I supplied code, you supplied complaints. I won't feel
guilty for not being perfect. If you don't want to deal with lesser
mortals, don't.
 
Uri Guttman

RJ> sub str {
RJ> my ($str, $tr_chars) = @_;
RJ> $str=~s/$tr_chars//g;
RJ> }
RJ> How many times do you imagine you have told me that? Can you explain
RJ> why the output of the program is "There are 4 of them." when the str
RJ> function is called?

have you rtfm'ed about this? the relevant section was posted in this
thread. tr/// DOES NOT INTERPOLATE.

the actual string in the tr/// happens to HAVE an 'a' in it. so it will
count a's as well as '$', 't' and the rest of the chars in '$tr_chars'


RJ> sub tr_chars {
RJ> my ( $str, $tr_chars ) = @_;
RJ> $tr_subs{ $tr_chars } ||= eval 'sub { $str=~tr/$tr_chars// }' ;

same here. the single quotes won't interpolate either.

RJ> Yep. I need to protect $str from interpolation, but not $tr_chars:
RJ> $tr_subs{ $tr_chars } ||= eval "sub { \$str=~tr/$tr_chars// }" ;
RJ> The timing is not noticeably affected.

but this is CORRECT now. i wasn't concerned about speed.

try the original and the str sub with data that contains any other chars
in '$tr_chars' beyond 'a'. see what the count is then.

RJ> Then ignore me. I supplied code, you supplied complaints. I won't feel
RJ> guilty for not being perfect. If you don't want to deal with lesser
RJ> mortals, don't.

i deal with all sorts provided they listen and learn. when you have been
told multiple times that tr/// doesn't interpolate and you still attempt
it then you are not listening. you did listen and learn about the
eval. now try to listen and learn about proper testing. use better data
input and try to break the code. that is one aspect of proper
testing. note that moronzilla is infamous (among other things) for
posting test cases that don't drive the code well. the data she provides
is almost always best case and will produce good results. but anything
else will fail and she fails to see that. she has done that many times
and refuses to listen or learn. don't you become like that.

uri
 
Uri Guttman

RJ> Ok, mea culpa. For string values of $stupid, they are the same.

and you can't guarantee that. they are different and "$string" is almost
always wrong.

uri
 
John W. Krahn

Roy said:
Ok, mea culpa. For string values of $stupid, they are the same.

If the value in $stupid is already a string why would you need to quote
the variable?


John
 
JS Bangs

Roy Johnson sikyal:
How many times do you imagine you have told me that? Can you explain
why the output of the program is "There are 4 of them." when the str
function is called?

Because the string you are matching against contains one of the
characters in qw/$ t r _ c h a r s/. '$tr_chars' is interpreted as a
literal string and the tr/// is matched against that. If the character you
are trying to count happens to be in '$tr_chars', you will get a
misleadingly correct value.

This has tripped me up as well.
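
A small sketch of the trap, with a made-up test string chosen so that the two counts differ. Because tr/// does not interpolate, the second count uses the literal characters $ t r _ c h a s rather than the contents of any variable:

```perl
use strict;
use warnings;

my $str = 'cats chase rats';

# What was intended: count the letter 'a'.
my $intended = ($str =~ tr/a//);

# What tr/$tr_chars// actually does: the search list is the literal
# characters '$', 't', 'r', '_', 'c', 'h', 'a', 's'. No variable is read.
my $literal = ($str =~ tr/$tr_chars//);

print "intended: $intended\n";   # 3
print "literal:  $literal\n";    # 12
```

With a test string that happens to contain none of the other literal characters, the two counts can agree by accident, which is why the bug is easy to miss.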


--
Jesse S. Bangs (e-mail address removed)
http://students.washington.edu/jaspax/
http://students.washington.edu/jaspax/blog

Jesus asked them, "Who do you say that I am?"

And they answered, "You are the eschatological manifestation of the ground
of our being, the kerygma in which we find the ultimate meaning of our
interpersonal relationship."

And Jesus said, "What?"
 
Eric J. Roode

RJ> sub str {
RJ> my ($str, $tr_chars) = @_;
RJ> $str=~s/$tr_chars//g;
RJ> }

RJ> How many times do you imagine you have told me that? Can you
RJ> explain why the output of the program is "There are 4 of them."
RJ> when the str function is called?

have you rtfm'ed about this? the relevant section was posted in this
thread. tr/// DOES NOT INTERPOLATE.

Uri..... Roy Johnson's code above does not contain a tr///.
 
Uri Guttman

RJ> sub str {
RJ> my ($str, $tr_chars) = @_;
RJ> $str=~s/$tr_chars//g;
RJ> }

RJ> How many times do you imagine you have told me that? Can you
EJR> Uri..... Roy Johnson's code above does not contain a tr///.

this was pointed out to me by someone else. then there is another
bug. if $tr_chars has more than 1 char then s/// will not delete the
individual chars. he needs a char class for that.

uri
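
A sketch of the difference, using the thread's test string and a hypothetical two-character $chars. Without a character class, s/// treats $chars as a sequence to match; wrapping it in [...] (with \Q...\E so the characters are taken literally) matches each character individually:

```perl
use strict;
use warnings;

my $str   = 'cacababaA';
my $chars = 'ab';

# Without a character class: counts occurrences of the sequence "ab".
my $as_sequence = (my $copy1 = $str) =~ s/$chars//g;

# With a character class: counts every individual 'a' or 'b'.
my $as_class = (my $copy2 = $str) =~ s/[\Q$chars\E]//g;

print "as sequence: $as_sequence\n";   # 2
print "as class:    $as_class\n";      # 6
```

Both substitutions work on copies, so $str itself is left alone.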
 
