Jared.S.Bauer
Hello,
I'm new to Python and I'm having problems with a regular expression. I
use TextMate as my editor, and when I run the regex in TextMate it
works fine, but when I run it as part of the script it freezes. Could
anyone help me figure out why this is happening and how to fix it?
Here is the script:
======================================================
# regular expression search and replace
import sys, os, re, string, csv
#Open the file and taking its data
myfile=open('Steve_query3.csv') #Steve_query_test.csv
#create an error flag to loop the script twice
#store all file's data in the string object 'text'
myfile.seek(0)
text = myfile.read()
for i in range(2):
    #def textParse(text, reRun):
    print 'how many times is this getting executed', i
    #Now to create the newfile 'test' and write our 'text'
    newfile = open('Steve_query3_out.csv', 'w')
    #open the new file and set it with 'w' for "write"
    #loop through 'text' clean them up and write them into the 'newfile'
    #sub( pattern, repl, string[, count])
    #"sub("(?i)b+", "x", "bbbb BBBB")" returns 'x x'.
    text = re.sub('(\<(/?[^\>]+)\>)', "", text) #remove the HTML
    text = re.sub('/<!--(.|\s)*?-->/', "", text) #remove comments <!--[^\-]+-->
    text = re.sub('\/\*(.|\s)*?;}', "", text) #remove css formatting
    #remove a bunch of word formatting yuck
    text = re.sub("&nbsp;", " ", text)
    text = re.sub("&lt;", "<", text)
    text = re.sub("&gt;", ">", text)
    text = re.sub("&quot;|&rquot;|“", "\'", text)
    #===================================
    #The two following lines are the ones giving me the problems
    text = re.sub("w.|\s)*?\n", "", text)
    text = re.sub("UnhideWhenUsed=(.|\s)*?\n", "", text)
    #===========================================
    text = re.sub(re.compile('^\r?\n?$', re.MULTILINE), '', text) #remove the extra whitespace
    #now write out the new file and close it
    newfile.write(text)
    newfile.close()
    #open the newfile and run the script again
    #Open the file and taking its data
    myfile=open('Steve_query3_out.csv') #Steve_query_test.csv
    #store all file's data in the string object 'text'
    myfile.seek(0)
    text = myfile.read()
Thanks for the help,
-Jared
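
A possible explanation for the freeze, offered as a sketch and not a confirmed diagnosis: in (.|\s)*? a space, tab or carriage return can be matched by either branch of the alternation, so if no "\n" follows the marker (for example, if the file uses "\r" line endings) the engine may retry a very large number of combinations before giving up. Assuming the intent is to delete everything from the marker to the end of its line, a character class avoids the ambiguity. The helper name, the compiled pattern name and the sample text below are illustrative and do not come from the script above; unlike the original pattern, this version also matches when the marker sits on the last line with no line ending.

```python
import re

# Assumption: the goal is to drop everything from the marker to the end of
# its line. [^\r\n]* can only match each character one way, so there is no
# alternation for the engine to backtrack through.
MARKER_TO_EOL = re.compile(r"UnhideWhenUsed=[^\r\n]*(?:\r\n|\r|\n)?")


def strip_marker_lines(text):
    """Remove each 'UnhideWhenUsed=...' run, including its line ending if any."""
    return MARKER_TO_EOL.sub("", text)


# Illustrative input with a bare "\r" line ending after the marker line.
sample = 'keep this\nUnhideWhenUsed="false" Name="x"\rkeep that\n'
cleaned = strip_marker_lines(sample)
```

The same idea would apply to the other (.|\s)*? patterns in the script: either a negated character class, or a plain .*? together with the re.DOTALL flag, expresses "any character" without giving the engine two ways to match whitespace.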