Iterating over a hash

limitz

Hey, I have a question about iterating over a hash.

For example, if I have a hash table %hash1:

%hash1 = (
    1 => 1,
    2 => 2,
    3 => 3,
);

And given this string 123123123123.

How would I increment the values in the hash every time it recognizes
the value?
For example:

$numbers = 123123123123;

$beginning = <STDIN>;
print "Input Ending Base Number:\n";
$end = <STDIN>;
$length = $end-$beginning+1;

$one = '1';
$onecount = 0;
for my $i (0 .. $length-1) {
    my $count = substr($numbers, $i, 2);
    if ($count =~ $one) {
        $onecount++;
    }
}

Instead of copying and pasting this code three times and changing the
variable names, how would I have it read directly into the hash and
iterate over the hash?

Thanks
 
Gunnar Hjalmarsson

To me, the problem you describe rather seems to be about iterating over
a string of digits. Are you after something like this?

#!/usr/bin/perl
use strict;
use warnings;

my %hash = ( 1 => 0, 2 => 0, 3 => 0 );
my $digits = '123123123123';

foreach my $dig ( split //, $digits ) {
    $hash{$dig}++ if exists $hash{$dig};
}

print "$_: $hash{$_}\n" for keys %hash;

__END__
 
xhoster

limitz said:
Hey I have a question with iterating over a hash.

For example, if I have a hash table %hash1:

%hash1 = (
    1 => 1,
    2 => 2,
    3 => 3,
);

And given this string 123123123123.

How would I increment the values in the hash every time it recognizes
the value?
For example:
$numbers = 123123123123;

You may want to make that a string rather than a number, or at some
point you will lose precision.

$numbers = '123123123123';
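
A minimal sketch of the precision point, assuming a typical Perl build where very large numbers are stored as floating-point values. The 12-digit value in the question is still small enough to be exact; the 39-digit value below is a made-up input chosen only to show where a bare number literal stops being safe.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical 39-digit input: "123" repeated 13 times.
my $as_string = '123' x 13;
my $as_number = 123123123123123123123123123123123123123;   # too big for a native integer

print "string: $as_string\n";
print "number: $as_number\n";   # usually shown in exponent form, low digits lost

# substr works on the stringified value, so the two variables give
# different two-character windows.
my $tail_of_string = substr( $as_string, -2 );
my $tail_of_number = substr( $as_number, -2 );
print "last two characters: $tail_of_string vs $tail_of_number\n";
```

Quoting the digits keeps every character available to substr, which is what a sliding window needs.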

$beginning = <STDIN>;
print "Input Ending Base Number:\n";
$end = <STDIN>;
$length = $end-$beginning+1;

Since we don't know what you type in for STDIN, this makes a poor
example. It would be better to hard-code the values you want for testing
purposes.
$one = '1';
$onecount = 0;
for my $i (0 .. $length-1) {
    my $count = substr($numbers, $i, 2);
    if ($count =~ $one) {
        $onecount++;
    }
}

Assuming this actually does what you want, then you want something quite
strange: counting how many of the overlapping "digraphs" of a string
contain a given digit in at least one of their two positions. ("Digraph"
is not exactly the right word; I mean every combination of two adjacent
characters taken together. There may also be some weirdness at the end of
the string if $length is the same as or longer than length $numbers.) In
that case you could do it something like this:

my %hash1 = (
    1 => 0,
    2 => 0,
    3 => 0,
);

for my $i (0 .. $length-1) {
    foreach (keys %hash1) {
        $hash1{$_}++ if substr($numbers, $i, 2) =~ /$_/;
    }
}


If performance were important, you could use index rather than a regex,
or try switching the nesting of the inner and outer loops, or a variety
of other things.
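
As a rough sketch of the index suggestion with the loop nesting swapped: the value 5 for $length is a made-up test value, not something from the original post, and no claim is made here about which variant is actually faster.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $numbers = '123123123123';
my $length  = 5;    # hard-coded stand-in for the value read from STDIN
my %hash1   = ( 1 => 0, 2 => 0, 3 => 0 );

# Keys in the outer loop, positions in the inner loop (nesting swapped).
foreach my $key ( keys %hash1 ) {
    for my $i ( 0 .. $length - 1 ) {
        my $window = substr( $numbers, $i, 2 );

        # index returns -1 when $key does not occur in $window
        $hash1{$key}++ if index( $window, $key ) >= 0;
    }
}

print "$_: $hash1{$_}\n" for sort keys %hash1;
```

index does a plain substring search, so it matches the regex version only as long as the keys contain no regex metacharacters, which holds for plain digits.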

On the other hand, if the code you posted doesn't do what you want in the
first place, then it isn't clear what you want.

Xho
 
John W. Krahn

$hash1{ $_ }++ for $numbers =~ /[123]/g;



John
 
Jürgen Exner

Tad said:
^^
What is the point of adding one followed by subtracting one?

In this particular case it actually makes sense. Otherwise the variable
would have to be called "LastIndex" or "LengthMinusOne" or something like
that.
While technically you are right, from a documentation and readability point
of view, using $length and subtracting 1 when needed seems preferable.

jue
 
limitz

Actually, that is what I want it to do, something like a sliding
frame. As for the values in STDIN, they are numerical values. This
way, I can specify an arbitrary sequence for it to iterate over.
Although this example seems ridiculous, it's actually a simplification
of a first-order Markov chain transition matrix.
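
One way to picture the sliding frame, sketched under assumptions: the sequence and the names $sequence, $start and $end are hypothetical stand-ins for the real FASTA data and the STDIN values. The sketch uses each two-base frame directly as a hash key and counts only frames that lie entirely between $start and $end, which may not match the loop bounds used in the script below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the real sequence and the STDIN input.
my $sequence = 'ACACGTACGG';
my $start    = 1;    # zero-based index of the first base considered
my $end      = 8;    # zero-based index of the last base considered

my %pair_count;
for my $i ( $start .. $end - 1 ) {
    my $pair = substr( $sequence, $i, 2 );    # two-base frame starting at $i
    $pair_count{$pair}++;                     # the frame itself is the key
}

print "$_ $pair_count{$_}\n" for sort keys %pair_count;
```

With this approach, pairs that never occur simply have no entry in the hash, so a fixed list of 16 keys would still be needed if zero counts have to be reported.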

I'm actually getting an error running this. I modified my code to
fit this, essentially what you suggested but with a print function at
the end, and I get this error: "Use of uninitialized value in string
C:\Perl\bin\markovgen2.pl line 107, <STDIN> line 2."

This is my exact code:

print "\nThere are a total of 18990 bases in the entire sequence\n";
print "\nWhat is the starting base number? Keep in mind that Perl begins the tally";
print "\nwith 0. So if you wanted to start from base 30, input in 29\n";


#Here we can define where we want the sequence to begin and end
print "Input Starting Number:\n";
$beginning_of_sequence = <STDIN>;
print "Input Ending Base Number:\n";
$end_of_sequence = <STDIN>;
$length_of_sequence = $end_of_sequence-$beginning_of_sequence+1;

%dinucleotidepair = (
    AT => 0,
    AC => 0,
    AG => 0,
    AA => 0,

    TA => 0,
    TC => 0,
    TG => 0,
    TT => 0,

    CA => 0,
    CT => 0,
    CG => 0,
    CC => 0,

    GA => 0,
    GT => 0,
    GC => 0,
    GG => 0,
);

#$AC = 'AC';
#$ACcountss = 0;
#for my $i (0 .. $length_of_sequence-1) {
#    my $dinuc = substr($fastasequence, $i, 2);
#    if ($dinuc =~ $AC) {
#        $ACcountss++;
#    }
#}

for my $i (0 .. $length_of_sequence-1) {
    foreach (keys %dinucleotidepair) {
        $dinucleotidepair{$_}++ if substr($fastasequence, $i, 2) =~ /$_/;
    }
}
print "$dinucleotidepair";

Can anyone explain to me the reason the error is popping up? Thanks.

~Frank
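
A likely explanation, offered as a sketch rather than a definitive diagnosis: in Perl, %dinucleotidepair (a hash) and $dinucleotidepair (a scalar) are two separate variables, so the final print statement interpolates a scalar that was never assigned. The counts below are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %dinucleotidepair = ( AT => 2, AC => 4 );    # made-up counts

# A double-quoted "$dinucleotidepair" would refer to a separate scalar of
# the same name, which has no value; under "use strict" it would not even
# compile. A hash is printed element by element instead:
my @lines = map { "$_ $dinucleotidepair{$_}" } sort keys %dinucleotidepair;
print "$_\n" for @lines;
```

Running scripts with "use strict" and "use warnings" tends to surface this kind of mix-up at compile time.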
 
limitz

Actually, I solved my own problem, but have a fresh problem. Here is
the working script:

print "\nThere are a total of 18990 bases in the entire sequence\n";
print "\nWhat is the starting base number? Keep in mind that Perl begins the tally";
print "\nwith 0. So if you wanted to start from base 30, input in 29\n";


#Here we can define where we want the sequence to begin and end
print "Input Starting Number:\n";
$beginning_of_sequence = <STDIN>;
print "Input Ending Base Number:\n";
$end_of_sequence = <STDIN>;
$length_of_sequence = $end_of_sequence-$beginning_of_sequence+1;

%dinucleotidepair = (
    AT => 0,
    AC => 0,
    AG => 0,
    AA => 0,

    TA => 0,
    TC => 0,
    TG => 0,
    TT => 0,

    CA => 0,
    CT => 0,
    CG => 0,
    CC => 0,

    GA => 0,
    GT => 0,
    GC => 0,
    GG => 0,
);

#$AC = 'AC';
#$ACcountss = 0;
#for my $i (0 .. $length_of_sequence-1) {
#    my $dinuc = substr($fastasequence, $i, 2);
#    if ($dinuc =~ $AC) {
#        $ACcountss++;
#    }
#}

for my $i (0 .. $length_of_sequence-1) {
    foreach (keys %dinucleotidepair) {
        $dinucleotidepair{$_}++ if substr($fastasequence, $i, 2) =~ /$_/;
    }
}

while ( my($keys,$values) = each(%dinucleotidepair) ) {
    print "$keys $values\n";
}


#print "The Fasta sequence segment has $ACcountss AC's in $beginning_of_sequence to $end_of_sequence",
#printf "for a relative frequency of %f\n", $ACcountss/$length_of_sequence;


My new problem is this. I have to calculate the relative frequencies
for everything. That means that if one of the keys in my hash has an
occurrence, I have to divide its count by $length_of_sequence to
find the relative frequency. Then, that relative frequency will be
used in another Perl script.

My question is this: how do I manipulate individual elements in a hash,
given that the value for the key is not zero?
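
A minimal sketch of one way to do this, assuming the counts are already in %dinucleotidepair. The counts and the length below are made-up test values, and %relative_frequency is a hypothetical name introduced only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up counts standing in for the tallies from the script above.
my %dinucleotidepair   = ( AC => 4, TG => 0, GG => 2 );
my $length_of_sequence = 30;

my %relative_frequency;
foreach my $pair ( keys %dinucleotidepair ) {
    next if $dinucleotidepair{$pair} == 0;    # skip pairs that never occurred
    $relative_frequency{$pair} = $dinucleotidepair{$pair} / $length_of_sequence;
}

printf "%s %.3f\n", $_, $relative_frequency{$_} for sort keys %relative_frequency;
```

Keeping the frequencies in a second hash leaves the raw counts untouched, which may help if both are needed later.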

Secondly, how do I save that as a variable that can be used by another
Perl script for calculation purposes?

Thanks!

~Frank
 
limitz

My new problem is this. I have to calculate the relative frequencies
for everything. That means that if one of the keys in my hash has an
occurrence, I have to divide its count by $length_of_sequence to
find the relative frequency. Then, that relative frequency will be
used in another Perl script.

My question is this: how do I manipulate individual elements in a hash,
given that the value for the key is not zero?

For example, in $fastasequence, the first 30 nucleotide bases contain
4 occurrences of the base combination "AC" yet no occurrences of the base
combination "TG".

Thus, the relative frequency of AC is 4/30 = 0.133.

Secondly, how do I save 0.133 as a variable that can be carried over
and used by another Perl script for calculation purposes?
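
One possible approach to carrying the value over, sketched with the core Storable module: the first script writes the hash to a file and the second reads it back. The hash name %relative_frequency is hypothetical, and the temporary file is used here only so the sketch cleans up after itself; two real scripts would have to agree on a fixed file name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

my %relative_frequency = ( AC => 4 / 30 );    # value from the example above

# Temporary file for the sketch; removed automatically on exit.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
close $fh;

# What the first script would do: write the hash to disk.
store( \%relative_frequency, $file );

# What the second script would do: read it back as a hash reference.
my $loaded = retrieve($file);
printf "AC %.3f\n", $loaded->{AC};
```

A plain text file with one "key value" pair per line would also work and is easier to inspect by hand.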


Thanks!


~Frank
 
