Hai anh Le
I have a problem with a regexp. I have a document like this:
A method for detecting a post-translationally modified protein with a
glycosyl group comprising contacting the protein with a glycosyl
transferase enzyme and a labeling agent, wherein the labeling agent
comprises a chemical handle and a transferable glycosyl group.
I want to split it into several strings, following the rule that each
string starts with "a", "an", or "the", like this:
"A method for detecting "
"a post-translationally modified protein with "
"a glycosyl group comprising contacting "
"the protein with "
"a glycosyl transferase enzyme and "
"a labeling agent, wherein "
"the labeling agent comprises "
"a chemical handle and "
"a transferable glycosyl group."
I am using this code:
while element.size do
  if element =~ /([Aa]|[Aa]n|[Tt]he)( [^ ]+)(?:[Aa]|[Aa]n|[Tt]he)?/
    temp_string = $1 + $2
    temp_array << temp_string
  else
    break
  end
  element.slice!(temp_string)
end
But it does not work. The result is:
A method
a post-translationally
a glycosyl
the protein
a glycosyl
a labeling
the labeling
a chemical
a transferable
Can anyone help me with the code?
Hai Anh
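
One possible approach, offered only as a sketch: the second group in the pattern above, ( [^ ]+), captures a single space-free word, which would explain why each result stops after one word. Instead of extracting pieces in a loop, the text could be split at every position where an article begins. The names ARTICLE_BOUNDARY, split_on_articles, and claim are hypothetical, and the sketch assumes that articles are always separated from their neighbours by whitespace and that line breaks in the document can be treated as spaces.

```ruby
# Zero-width boundary: a position preceded by whitespace (or the start of
# the string) and followed by "a", "an", or "the" plus a whitespace character.
ARTICLE_BOUNDARY = /(?<!\S)(?=(?:an?|the)\s)/i

# Hypothetical helper: collapses line breaks into single spaces, then splits
# the text so that every piece begins with an article.
def split_on_articles(text)
  normalized = text.gsub(/\s+/, " ").strip
  normalized.split(ARTICLE_BOUNDARY)
end

claim = <<~TEXT
  A method for detecting a post-translationally modified protein with a
  glycosyl group comprising contacting the protein with a glycosyl
  transferase enzyme and a labeling agent, wherein the labeling agent
  comprises a chemical handle and a transferable glycosyl group.
TEXT

parts = split_on_articles(claim)
parts.each { |part| puts part.inspect }
```

Because the boundary is zero-width, nothing is consumed by the split, so each piece keeps its leading article and its trailing space. Words such as "and" or "agent" are not treated as articles, since the lookahead requires whitespace directly after "a", "an", or "the".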