Jack
Hi folks,
I want to be able to test any string (yes, any string, with any
combination of 0-9, +, -, [, and other non-alphanumeric characters)
against a set of target strings (anything goes in these strings as
well). Here is an example of the code, which as you can see uses
pattern matching:
while (<SOURCE2>) {
    @temparray2 = split(/$delimiter/, $_);
    for ($i = 0; $i <= $#target; $i++) {
        if ($target[$i] =~ m/$temparray2[0]/i) { print " match "; }
    }
} # end while
This works great and runs through the file, matching each $temparray2[0]
element against a target set of elements in an array, until I run into
some character that the regex engine complains about. What do I need to
do with my code to get the flexibility I am looking for: the ability to
case-insensitively match the many, many patterns I have against other
complex strings without the complaints? Do I need to preprocess
$temparray2[0] to remove non-alphanumeric characters, and if so, what is
the code to strip these out? Here is an example of data where the regex
complains about $temparray2[0]:
METHYLENETETRAHYDROFOLATE DEHYDROGENASE/METHENYLTETRAHYDROFOLATE
CYCLOHYDROLASE/FORMYLTETRAHYDROFOLATESYNTHETASE,
NADP(+)-DEPENDENTMETHYLTETRAHYDROFOLATE CYCLOHYDROLASE DEFICIENCY,
INCLUDED
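Stripping characters may not be needed if the goal is to match the text literally. A minimal sketch, assuming literal (not pattern) matching is what is wanted: Perl's built-in quotemeta, or equivalently \Q...\E inside the pattern, backslash-escapes the regex metacharacters first. The sample strings and the names $needle and $literal below are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for @target and $temparray2[0] from the post.
my @target = (
    'Methylenetetrahydrofolate dehydrogenase, NADP(+)-dependent',
    'some unrelated entry',
);
my $needle = 'NADP(+)-DEPENDENT';

# quotemeta backslash-escapes every non-word character, so "(", "+", ")"
# and "-" are matched as plain text instead of being parsed as regex syntax.
# Writing /\Q$needle\E/i directly in the match has the same effect.
my $literal = quotemeta $needle;

# Case-insensitive literal match against every target string.
my @hits = grep { /$literal/i } @target;
print "match: $_\n" for @hits;
```

In the loop from the post, the same idea would amount to writing the condition as m/\Q$temparray2[0]\E/i, assuming the trailing newline and delimiter handling are already as intended.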
Here are some example errors:
Invalid [] range "y-4" in regex; marked by <-- HERE in
m/1-naphthacenecarboxylic acid, 2-ethyl-1,2,3,4,6,11-hexahydro-2,5,7-trihydroxy-6,11-dioxo-4-[[2,3,6-trideoxy-4 <-- HERE -O-[2,6-dideoxy-4-O-((2R-trans)-tetrahydro-6-methyl-5-oxo-2H-pyran-2-yl)-.alpha.-L-lyxo-hexopyranosyl]-3-(dimethylamino)-.alpha.-L-lyxo-hexopyranosyl]oxy]-, methyl ester,(1R-(1.alpha.,2.beta.,4.beta.))-(9CI)/ at
....
Quantifier follows nothing in regex; marked by <-- HERE in
m/METHYLENETETRAHYDROFOLATE DEHYDROGENASE/METHENYLTETRAHYDROFOLATE CYCLOHYDROLASE/FORMYLTETRAHYDROFOLATESYNTHETASE, NADP(+ <-- HERE )-DEPENDENTMETHYLTETRAHYDROFOLATE CYCLOHYDROLASE DEFICIENCY, INCLUDED/ at ...
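Both errors come from the source string being compiled as a pattern, so "[" opens a character class and "+" is read as a quantifier. If only a case-insensitive substring test is needed, one option is to skip the regex engine entirely and use the built-in index and lc functions. This is a sketch under that assumption; contains_ci and the sample data are hypothetical, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data containing characters that break an unescaped m//.
my @target = ('4-[[2,3,6-trideoxy-4-O-methyl', 'NADP(+)-dependent enzyme');
my @source = ('6-TRIDEOXY-4', 'nadp(+)', 'no such text');

# contains_ci (made-up name): true if $needle occurs anywhere in $haystack,
# ignoring case. index returns -1 when the substring is not found.
sub contains_ci {
    my ($haystack, $needle) = @_;
    return index(lc($haystack), lc($needle)) >= 0;
}

for my $needle (@source) {
    # grep in scalar context yields the number of matching targets.
    my $count = grep { contains_ci($_, $needle) } @target;
    print "'$needle' matched $count target(s)\n";
}
```

Because no pattern is compiled, no character in either string can trigger a regex error; the trade-off is that real pattern features (anchors, alternation, and so on) are no longer available.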