Janus Bor
Hi all,
I'm sure this is a very common problem, but I can't find a simple and
efficient way of solving it.
Basically, I want to find out whether a string contains my search string as
a substring. However, a certain number of mismatches must be allowed.
Here's an example:
query string:
"acgt"
subject string:
"acctaggg"
If no mismatches were allowed, there would be 0 hits.
If 1 mismatch was allowed, the query string would match "acct".
If 2 mismatches were allowed, the query string would match "acct" and
"aggg".
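To pin down the behaviour I'm after, here is a naive sketch in Python. The language choice and the function name `mismatch_hits` are just for illustration. It slides a window over the subject and counts the positions that differ, so it only covers substitutions (no insertions or deletions), and it is surely too slow for my real data.

```python
def mismatch_hits(query, subject, max_mismatches):
    """Return (offset, window) pairs where the window differs from
    the query in at most max_mismatches positions."""
    hits = []
    m = len(query)
    # Try every window of the subject that has the query's length.
    for start in range(len(subject) - m + 1):
        window = subject[start:start + m]
        # Count positions where query and window disagree.
        mismatches = sum(1 for q, s in zip(query, window) if q != s)
        if mismatches <= max_mismatches:
            hits.append((start, window))
    return hits


print(mismatch_hits("acgt", "acctaggg", 0))
print(mismatch_hits("acgt", "acctaggg", 1))
print(mismatch_hits("acgt", "acctaggg", 2))
```

The three calls correspond to the three cases above: no hits, then "acct", then "acct" and "aggg".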
If possible, I would like to do this using regular expressions, as my
search string can also contain ambiguous characters. So a real search
pattern might look something like this: /[ac][c][cg][t][acgt][c][gt][t]/
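For the ambiguous case, the same brute-force idea can be sketched by treating each bracketed class as a set of allowed characters. This is only an illustration in Python under my own assumptions: the helper names `parse_classes` and `ambiguous_hits` are made up, the pattern is assumed to consist solely of simple `[...]` classes, and only substitutions count as mismatches.

```python
import re


def parse_classes(pattern):
    """Split a pattern such as '[ac][c][cg]' into a list of
    allowed-character sets, one per position."""
    return [frozenset(group) for group in re.findall(r"\[([^\]]+)\]", pattern)]


def ambiguous_hits(pattern, subject, max_mismatches):
    """Return start offsets where at most max_mismatches positions
    fall outside their character class."""
    classes = parse_classes(pattern)
    m = len(classes)
    hits = []
    for start in range(len(subject) - m + 1):
        mismatches = 0
        for offset, allowed in enumerate(classes):
            if subject[start + offset] not in allowed:
                mismatches += 1
                # Give up on this window once the budget is exceeded.
                if mismatches > max_mismatches:
                    break
        else:
            # Loop finished without break: window is within budget.
            hits.append(start)
    return hits


pattern = "[ac][c][cg][t][acgt][c][gt][t]"
print(ambiguous_hits(pattern, "xxacgtacgtxx", 0))
print(ambiguous_hits(pattern, "acgtaagt", 1))
```

In the worst case this does work proportional to the subject length times the pattern length for every search, which is exactly the cost I'm hoping to avoid.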
Unfortunately, performance is also very important, as I will have to
perform thousands of searches in strings that contain several million
characters.
I'd be very grateful for any suggestions!
Cheers,
Janus