Nuralanur
Hello,
I have a regexp search problem.
I have written a text-correction program in Ruby. It reads in a
text file, marks every word that is not in a dictionary array
in red in an RTF output file, and saves that file. From Ruby's
viewpoint, the output is still a plain text file, just with some
commands specific to Rich Text Format.
For instance, in a text, I have a citation "(Fox , 1970)."
Now, "(Fox" is not a correct English word, so it should
be red and bold, the comma is all right, so it stays black, and
"1970)." is not a correct English word, either, so it
should be red and bold, also.
In RTF, you can achieve this by replacing
a="(Fox , 1970)."
by
b=" \cf1\b (Fox \cf0\b0 , \cf1\b 1970). \cf0\b0 ".
Now, if you say
p b
Ruby will give the following output:
" \0061\010 (Fox \0060\0100 , \0061\010 1970). \0060\0100 "
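A minimal sketch of why the output looks like this, assuming the string is written as a double-quoted Ruby literal as above: inside double quotes, "\cf" is read as the escape for Control-F (byte 6) and "\b" as backspace (byte 8, octal 010), so the RTF commands never reach the string. The variable names are only for illustration.

```ruby
# Double-quoted literal: "\cf" becomes Control-F (byte 6) and
# "\b" becomes backspace (byte 8); only the digit "1" survives as text.
double_quoted = "\cf1\b"

# Single-quoted literal: the backslashes are kept as ordinary characters,
# so the RTF commands \cf1 and \b stay intact.
single_quoted = '\cf1\b'

double_bytes  = double_quoted.bytes.to_a  # [6, 49, 8]
single_length = single_quoted.length      # 6 characters: \ c f 1 \ b

p double_bytes
p single_length
```

If the RTF commands are meant to end up in the output file, a single-quoted literal (or doubled backslashes inside double quotes) keeps them as written.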
However, I would like to remove all the characters of the form '\' + number
from the RTF file in a next step.
Is there a character class for Regexps (like \w,\S etc.) that achieves
this?
I have learned so far that '\010' is one character, and not the same as
'backslash' + three digits.
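One possible approach, sketched under two different assumptions about what the string contains. If it holds real control characters (as the double-quoted b above does), the POSIX bracket expression [[:cntrl:]] matches them, although the digits that followed them remain. If it holds literal RTF commands (backslash, letters, optional number), a pattern for that shape removes them entirely. The pattern below only covers simple commands like \cf1 and \b0, not the full RTF syntax.

```ruby
# Case 1: the string holds real control characters (double-quoted literal).
b = " \cf1\b (Fox \cf0\b0 , \cf1\b 1970). \cf0\b0 "
# [[:cntrl:]] matches control characters such as byte 6 and byte 8.
cleaned = b.gsub(/[[:cntrl:]]/, '')
p cleaned  # the digits 1, 0, 0 ... that followed the escapes are still there

# Case 2: the string holds literal RTF commands (single-quoted literal).
rtf = ' \cf1\b (Fox \cf0\b0 , \cf1\b 1970). \cf0\b0 '
# Backslash, one or more letters, an optional number, and an optional
# single space that delimits the command from the following text.
plain = rtf.gsub(/\\[a-z]+\d* ?/, '').strip
p plain
```

In the second case the result is the original citation text again, which suggests that keeping the backslashes literal makes the later clean-up step simpler.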
Best regards,
Axel