Horacius said:
Hi,
I need to write a program that reads an external text file. Each time
it reads the file, it needs to delete some lines, for instance from the
second line to the 55th. The file is really big, so what do you think is
the fastest method to delete specific lines in a text file?
Thanks
One way would be to "mark" the lines as being deleted by either:
1) replacing them with some known character sequence that you treat as deleted.
This assumes that the lines are long enough.
or
2) keeping a separate dictionary that maps each line number to a deleted flag.
Pickle and dump this dictionary before the program exits, and load it again
when the program starts.
import pickle

# Maps line number -> deleted flag, e.g. {1: False, 2: True, ...}
pFiles = "deletedLines.toc"

def load():
    # Read back the flags saved by a previous run.
    with open(pFiles, 'rb') as fp:
        return pickle.load(fp)

def dump(deletedFlags):
    # Save the flags so the next run can load them.
    with open(pFiles, 'wb') as fp:
        pickle.dump(deletedFlags, fp)
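To show how the dictionary would be used while reading, here is a rough sketch. The helper names mark_deleted and live_lines are made up for illustration, and it assumes line numbers start at 1 and that a missing key means "not deleted".

```python
def mark_deleted(deleted_flags, first, last):
    """Flag lines first..last (1-based, inclusive) as deleted."""
    for number in range(first, last + 1):
        deleted_flags[number] = True


def live_lines(data_path, deleted_flags):
    """Yield (line_number, line) for every line not flagged as deleted."""
    with open(data_path, "r") as fp:
        for number, line in enumerate(fp, start=1):
            # A line number missing from the dictionary counts as not deleted.
            if not deleted_flags.get(number, False):
                yield number, line
```

Deleting lines 2 to 55 then only costs mark_deleted(flags, 2, 55); the big file itself is never rewritten.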
Caveats:
1) With method 1, you must write EXACTLY the same number of bytes (padded with
spaces, etc.) on top of the deleted lines. This method doesn't work if any of
the lines are too short to hold your <DELETED> flag string.
2) With method 2, you must be very careful to keep the deletedFlags dictionary
consistent with the data file (for example by wrapping your entire process in
try/except/finally).
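For method 1, the byte-for-byte overwrite from caveat 1 might look something like the sketch below. The function name overwrite_lines is hypothetical, the file is opened in binary mode so offsets are exact, and every target line is checked before anything is written, so a too-short line leaves the file untouched.

```python
MARKER = b"<DELETED>"


def overwrite_lines(path, first, last, marker=MARKER):
    """Overwrite lines first..last (1-based, inclusive) in place with marker."""
    spans = []  # (byte offset, body length) of each line to overwrite
    with open(path, "r+b") as fp:
        offset = 0
        for number, line in enumerate(fp, start=1):
            if number > last:
                break
            if number >= first:
                body = line.rstrip(b"\r\n")
                if len(body) < len(marker):
                    raise ValueError("line %d is too short for the marker" % number)
                spans.append((offset, len(body)))
            offset += len(line)
        # Only now touch the file; line endings are left where they are.
        for start, length in spans:
            fp.seek(start)
            # Pad with spaces so the byte count matches the original line.
            fp.write(marker.ljust(length))
```

When reading, any line that starts with the marker is then treated as deleted.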
Personally I would employ method #2 and periodically "pack" the file with a
separate process. That could run unattended (e.g. at night). Or, if I did this
a lot, I would use a database instead.
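A "pack" step could be sketched roughly as follows. The function name pack and the ".packing" temporary suffix are assumptions for illustration; the idea is to copy the surviving lines to a temporary file, swap it in, and then reset the flags because the line numbers have shifted.

```python
import os
import pickle


def pack(data_path, flags_path):
    """Rewrite data_path without the lines flagged as deleted."""
    with open(flags_path, "rb") as fp:
        deleted_flags = pickle.load(fp)

    tmp_path = data_path + ".packing"  # hypothetical temporary name
    with open(data_path, "rb") as src, open(tmp_path, "wb") as dst:
        for number, line in enumerate(src, start=1):
            if not deleted_flags.get(number, False):
                dst.write(line)

    # Swap the packed copy into place.
    os.replace(tmp_path, data_path)

    # Old line numbers no longer apply, so start over with empty flags.
    with open(flags_path, "wb") as fp:
        pickle.dump({}, fp)
```

A crash between the swap and the flag reset would leave the two files out of step, so a real version needs the same care as caveat 2.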
-Larry