re.search - just skip it

rasdj · Jan 26, 2005

Input is this:

SET1_S_W CHAR(1) NOT NULL,
SET2_S_W CHAR(1) NOT NULL,
SET3_S_W CHAR(1) NOT NULL,
SET4_S_W CHAR(1) NOT NULL,
;

.py says:

import re, string, sys
s_ora = re.compile('.*S_W.*')
lines = open("y.sql").readlines()
for i in range(len(lines)):
    try:
        if s_ora.search(lines[i]): del lines[i]
    except IndexError:
        open("z.sql","w").writelines(lines)

but output is:

SET2_S_W CHAR(1) NOT NULL,
SET4_S_W CHAR(1) NOT NULL,
;

It should delete every, not every other!

thx,

RasDJ

Duncan Booth · Jan 26, 2005

rasdj wrote:

It should delete every, not every other!

No, it should delete every other line since that is what happens if you use
an index to iterate over a list while deleting items from the same list.
Whenever you delete an item the following items shuffle down and then you
increment the loop counter which skips over the next item.

The fact that you got an IndexError should have been some sort of clue that
your code was going to go wrong.
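The skipping can be reproduced without any files. The following is only a sketch: the in-memory list stands in for the contents of y.sql, and the helper name delete_while_indexing and the visited list are made up for illustration. It records which lines the loop actually looks at before the IndexError cuts it short.

```python
import re

s_ora = re.compile('S_W')

# Stand-in for the contents of y.sql (assumed, taken from the post above).
sample = [
    "SET1_S_W CHAR(1) NOT NULL,\n",
    "SET2_S_W CHAR(1) NOT NULL,\n",
    "SET3_S_W CHAR(1) NOT NULL,\n",
    "SET4_S_W CHAR(1) NOT NULL,\n",
    ";\n",
]

def delete_while_indexing(lines):
    """Repeat the original loop on a copy and record which lines get examined."""
    lines = list(lines)
    visited = []
    try:
        # range() is computed once from the starting length, so i keeps
        # counting up even though the list shrinks underneath it.
        for i in range(len(lines)):
            visited.append(lines[i])
            if s_ora.search(lines[i]):
                del lines[i]  # everything after i shifts down by one
    except IndexError:
        pass  # i ran past the end of the shortened list
    return lines, visited

remaining, visited = delete_while_indexing(sample)
print("examined:", [line.split()[0] for line in visited])
print("left over:", [line.split()[0] for line in remaining])
```

Each deletion moves the next line into the slot that was just checked, so that line is never examined.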

Try one of these:
iterate backwards
iterate over a copy of the list but delete from the original
build a new list containing only those lines you want to keep
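For what it's worth, the three options might look roughly like this. It is a sketch only: the function names are invented here, each function works on its own copy so the caller's list is left alone, and a plain substring test is used in place of the regex.

```python
def remove_backwards(lines, needle):
    """Iterate backwards, so a deletion never shifts an unvisited item."""
    lines = list(lines)
    for i in range(len(lines) - 1, -1, -1):
        if needle in lines[i]:
            del lines[i]
    return lines

def remove_via_copy(lines, needle):
    """Iterate over a copy (lines[:]) while deleting from the list itself."""
    lines = list(lines)
    for line in lines[:]:
        if needle in line:
            lines.remove(line)  # removes the first equal item
    return lines

def keep_wanted(lines, needle):
    """Build a new list containing only the lines to keep."""
    return [line for line in lines if needle not in line]

sample = [
    "SET1_S_W CHAR(1) NOT NULL,\n",
    "SET2_S_W CHAR(1) NOT NULL,\n",
    "SET3_S_W CHAR(1) NOT NULL,\n",
    "SET4_S_W CHAR(1) NOT NULL,\n",
    ";\n",
]

for func in (remove_backwards, remove_via_copy, keep_wanted):
    print(func.__name__, func(sample, "S_W"))
```

The third form is usually the shortest and does a single pass; list.remove() has to search the list again for every deletion.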

also, the regex isn't needed here, and you should always close files when
finished with them.

Something like this should work (untested):

s_ora = 'S_W'
input = open("y.sql")
try:
    # keep only the lines that do NOT contain the marker
    lines = [ line for line in input if s_ora not in line ]
finally:
    input.close()

output = open("z.sql","w")
try:
    output.write(str.join('', lines))
finally:
    output.close()

Fredrik Lundh · Jan 26, 2005

but output is:

SET2_S_W CHAR(1) NOT NULL,
SET4_S_W CHAR(1) NOT NULL,

It should delete every, not every other!

for i in range(len(lines)):
    try:
        if s_ora.search(lines[i]): del lines[i]
    except IndexError:
        ...

when you loop over a range, the loop counter is incremented also if you delete
items. but when you delete items, the item numbering changes, so you end up
skipping over an item every time the RE matches.

to get rid of all lines for which s_ora.search matches, try this

lines = [line for line in lines if not s_ora.search(line)]

for better performance, get rid of the leading and trailing ".*" parts of your
pattern, btw. not that it matters much in this case (unless the SQL
statement is really huge).
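a quick way to convince yourself that the ".*" parts change nothing about which lines are selected: since search() looks for a match anywhere in the string, both patterns should agree on every line. this is just a sketch, and the sample strings are made up for illustration.

```python
import re

with_wildcards = re.compile('.*S_W.*')
bare = re.compile('S_W')

# Illustrative sample lines (assumed, not from a real y.sql).
samples = [
    "SET1_S_W CHAR(1) NOT NULL,\n",
    "S_W",
    "prefix S_W suffix\n",
    "NO_MATCH CHAR(1) NOT NULL,\n",
    ";\n",
    "",
]

# search() scans the whole string, so the wildcards are redundant
# when all you need is a yes/no answer.
agree = all(
    bool(with_wildcards.search(s)) == bool(bare.search(s))
    for s in samples
)
print("patterns agree on all samples:", agree)
```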

</F>

Kent Johnson · Jan 26, 2005

import re, string, sys
s_ora = re.compile('.*S_W.*')
lines = open("y.sql").readlines()
for i in range(len(lines)):
    try:
        if s_ora.search(lines[i]): del lines[i]

When you delete for example lines[0], the indices of the following lines change. So the former
lines[1] is now lines[0] and will not be checked.

The simplest way to do this is with a list comprehension:
    lines = [ line for line in lines if not s_ora.search(line) ]

Even better, there is no need to make the intermediate list of all lines. You can say
    lines = [ line for line in open("y.sql") if not s_ora.search(line) ]

In Python 2.4 you don't have to make a list at all. With a generator expression you can just say
    open("z.sql","w").writelines(line for line in open("y.sql") if not s_ora.search(line))

Kent
