January Weiner
Hi,
here is the problem.
I have strings (protein sequences, to be precise) that consist of letters
and gap characters ('-'):
my $s = 'Y---ERI-TTKDIV----EIKRHLDYLQAPRITNNDLE' ; # for example
my $s_nogaps = $s ;
$s_nogaps =~ s/-//g ;
Then I get two numbers:
my ($from, $length) = (0, 5) ;
These are the offset and length of a substring within $s_nogaps (and not $s).
The values above correspond to the string 'YERIT'.
Now I would like to get the fragment of $s that corresponds to this substring,
gaps included. For example, given the above values, I should get 'Y---ERI-T'.
The best that I can come up with is looping over each character separately,
and incrementing a counter only when the character is not a gap:
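sub get_subseq {
    my ($s, $f, $l) = @_ ;
    my $i = 0 ;      # number of non-gap characters seen so far
    my $ret = '' ;   # collected fragment, gaps included
    for (split //, $s) {
        last if ($i >= $f + $l) ;   # all requested residues collected
        $i++ if ($_ ne '-') ;       # gaps do not advance the counter
        next unless ($i > $f) ;     # still before the first requested residue
        $ret .= $_ ;
    }
    return $ret ;
}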
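A regex-based variant also seems possible. What follows is only a sketch and not part of my actual code: the name get_subseq_re is made up for illustration, and it assumes that '-' is the only gap character and that the offset and length are non-negative integers. The pattern skips the first $from residues (with any gaps around them) and then captures $len residues together with the gaps between them.

```perl
use strict;
use warnings;

# Hypothetical helper: return the gapped fragment of $s that corresponds
# to $len residues starting at residue offset $from in the ungapped string.
sub get_subseq_re {
    my ($s, $from, $len) = @_;
    return '' if $len < 1;    # nothing requested

    # (?:-*[^-]){N}  skips N residues, each possibly preceded by gaps
    # -*             skips gaps directly before the first wanted residue
    # ( ... )        captures one residue plus ($len - 1) more, with inner gaps
    my $re = sprintf '^(?:-*[^-]){%d}-*([^-](?:-*[^-]){%d})', $from, $len - 1;

    # Returns undef when $s holds fewer residues than requested.
    return $s =~ /$re/ ? $1 : undef;
}

my $s = 'Y---ERI-TTKDIV----EIKRHLDYLQAPRITNNDLE';
print get_subseq_re($s, 0, 5), "\n";    # prints Y---ERI-T
```

Unlike the loop, this returns undef rather than a partial fragment when the requested range runs past the end of the sequence.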
Any better ideas?
j.
--