Line replace

DarkBlue

Hello

I need some help

I have a text file which changes dynamically and has
200-1800 lines. I need to replace one line, which
can be located via a text marker like this:

somelines
# THIS IS MY MARKER
This is the line to be replaced
somemorelines

My question is how to do this in place, without
a temporary file, so that the line after
the marker is replaced with mynewlinetext.



Thanks
Nx
 
Steven D'Aprano

Let me see if I understand your problem... you need to edit a text file
in place at the same time that another process is also changing the file
in place? That's hard. You need some way to decide who gets precedence if
you and the other process both try to change the same line
simultaneously.

I think the only way this is even half-doable will be if:

- the other process writing to the file only appends to the end of the
file, and does not try to write to the middle;

- the new line you are writing is the same length as the old line you are
replacing;

- and you are running an operating system that allows two processes to
have simultaneous write access to a file.

What problem are you trying to solve by having simultaneous writes to the
same file? Perhaps there is another way.
 
Sam Pointon

You will either have to read the whole file into memory (at 1800 lines,
this shouldn't be too bad) or read piecemeal from the input file,
write the processed output to a new file, delete the input file and
rename the new file to the original file (yes, that's using a temporary
file, but it'll be more memory friendly).

The first solution would look something like this:
# Untested
import re
path = '/your/path/here'
with open(path) as input_file:
    content = input_file.read()
# Match the marker line plus the line after it; group 1 keeps the marker.
pat = re.compile(r'^(# THIS IS MY MARKER\n).*$', re.MULTILINE)
# A function replacement avoids backslash escapes in the new text.
content = pat.sub(lambda m: m.group(1) + 'New text goes here', content)
with open(path, 'w') as output_file:
    output_file.write(content)

The second one might be cleaner to do using a shell script (assuming
you're on a *nix) - awk or sed are perfect for this type of job - but
the python solution will look like this:

# Untested
import os
path = '/your/path/goes/here'
# Temp file next to the original so the rename stays on one filesystem.
tmp_path = path + '.tmp'
marked = False
with open(path) as input_file, open(tmp_path, 'w') as output_file:
    for line in input_file:
        if marked:
            # This is the line after the marker: replace it.
            output_file.write('New line goes here\n')
            marked = False
        else:
            output_file.write(line)
            marked = line.rstrip('\n') == '# THIS IS MY MARKER'
# Swap the new file in over the old one.
os.replace(tmp_path, path)
 
DarkBlue

Steven D'Aprano wrote:

Let me see if I understand your problem... you need to edit a text file
in place at the same time that another process is also changing the file
in place? That's hard. You need some way to decide who gets precedence if
both you and the other process both try to change the same line
simultaneously.

I think the only way this is even half-doable will be if:

- the other process writing to the file only appends to the end of the
file, and does not try to write to the middle;

- the new line you are writing is the same length as the old line you are
replacing;

- and you are running an operating system that allows two processes to
have simultaneous write access to a file.

What problem are you trying to solve by having simultaneous writes to the
same file? Perhaps there is another way.

Thanks for your reply.

I would have no problem letting other processes finish
writing to the file, so that my script only gets access when
no other process is working with it.
The file in question is the hosts.allow file, which is
changed often by the blockhosts.py script whenever
ssh access is attempted. blockhosts.py works
great, but sometimes our mobile clients try to connect
from IP addresses which are completely blocked to keep out the
thousands of scripted attacks showing up in our logs.
Our authorized clients register themselves automatically with
computer name, id and IP address via a small Python script which sends this
information to a Firebird database on our server.
A server-side script scans the database every so often and changes
the hosts.allow file so that authorized clients can log on via ssh
after they have moved out of their original areas (like traveling from
China to India and logging in from a hotel room).

Most of the clients run SuSE 9.3, as does the server;
some are Windows XP machines which get their ssh access via
WinSCP or PuTTY if needed.

Every client has a marker in the hosts.allow file,
so if a change occurs, one line is replaced
by another on the fly.

I hope this describes it.



Nx
 
Paul Rubin

DarkBlue said:
Now our authorized clients register themselves automatically with
computername,id and ip address via a small python script which sends this
information to a firebird database on our server...
Every client has a marker in the hosts.allow file
so if a change occurs one line shall be replaced
by another line on the fly.

Why don't you use the database to store those markers? It should
support concurrent updates properly. That's a database's job.
 
DarkBlue

Paul Rubin said:
Why don't you use the database to store those markers? It should
support concurrent updates properly. That's a database's job.

The markers are just there to have a static way to find the line
after the marker, which is the one which might have to be changed.

Nx
 
Paul Rubin

DarkBlue said:
The markers are just there to have a static way to find the line
after the marker, which is the one which might have to be changed.

OK, why don't you store those changing lines in the database?

Can you arrange for those changeable lines to be fixed length, i.e.
by padding with spaces or something? If you can, then you could just
overwrite them in place. Use flock or fcntl (Un*x) or the comparable
Windows locking primitives to make sure other processes don't update
the file simultaneously.
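
A minimal sketch of that fixed-length idea, assuming a Unix system (fcntl is not available on Windows) and assuming the replaceable line is always padded with spaces to the same width. The names overwrite_after_marker, MARKER and WIDTH are made up for illustration, and whether trailing padding is acceptable in hosts.allow is something to verify first. The lock is advisory, so it only helps if the other writers take the same lock.

```python
import fcntl

MARKER = "# THIS IS MY MARKER"
WIDTH = 40  # assumed fixed width of the replaceable line, excluding the newline


def overwrite_after_marker(path, new_text, marker=MARKER, width=WIDTH):
    """Overwrite the space-padded line that follows `marker`, in place.

    Returns True if the marker was found and the line rewritten, else False.
    """
    payload = new_text.ljust(width).encode("ascii")
    if len(payload) != width:
        raise ValueError("new text is longer than the fixed width")
    with open(path, "r+b") as f:
        # Advisory exclusive lock; it is released when the file is closed.
        fcntl.flock(f, fcntl.LOCK_EX)
        while True:
            line = f.readline()
            if not line:
                return False  # reached end of file without seeing the marker
            if line.rstrip(b"\r\n") == marker.encode("ascii"):
                start = f.tell()  # byte offset of the line after the marker
                old = f.readline()
                if len(old.rstrip(b"\n")) != width:
                    raise ValueError("existing line is not padded to the fixed width")
                f.seek(start)
                f.write(payload)  # same length, so nothing after it moves
                return True
```

Because the new bytes are exactly as long as the old ones, the rest of the file stays where it is and no temporary file is needed.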
 
DarkBlue

Paul Rubin said:
OK, why don't you store those changing lines in the database?

Can you arrange for those changeable lines to be fixed length, i.e.
by padding with spaces or something? If you can, then you could just
overwrite them in place. Use flock or fcntl (Un*x) or the comparable
Windows locking primitives to make sure other processes don't update
the file simultaneously.

Hmm, the line is actually being read from a database
and now needs to be written into a file, replacing
the line after the marker...
The line contains only an IP address.

pseudocode is like this:

get newlinetext from database # this is ok done with kinterbas
preferably check if file not in use by other process
open file and find desired marker
go to line after marker and replace that one with newlinetext
close the file

Should be easy, but I am suffering from New Year writer's block..

Nx
 
Mike Meyer

DarkBlue said:
pseudocode is like this:

get newlinetext from database # this is ok done with kinterbas
preferably check if file not in use by other process
open file and find desired marker
go to line after marker and replace that one with newlinetext
close the file

Should be easy, but I am suffering from New Year writer's block..

This is only easy if the old and new data are exactly the same
size. In line oriented files, that's not normally the case.

The standard solution is to overwrite the entire file. Your code goes
like so:

get lock on file.
read in file.
produce new version of file.
write out file.
release lock on file.

Of course, this has the problem that if something goes wrong during
the write you're going to be up the creek without a paddle - or much
in the way of a canoe. This is why experienced people always use a
temp file, and do things like so:

get lock on file
read in file
produce new version of file in temp file
rename temp file to real file
release lock on file

I can't think of a good reason to skip using the temp file once you
have to write out the entire file.
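
The temp-file steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: a Unix system (fcntl), cooperating writers that take the same advisory lock, and a hypothetical function name replace_line_after_marker. The temp file is created in the same directory as the target so that the final rename stays on one filesystem.

```python
import fcntl
import os
import tempfile


def replace_line_after_marker(path, marker, new_line):
    """Rewrite `path` through a temp file, replacing the line after `marker`."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r") as src:
        # get lock on file (advisory; released when src is closed)
        fcntl.flock(src, fcntl.LOCK_EX)
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            # read in file, produce new version of file in temp file
            with os.fdopen(fd, "w") as dst:
                replace_next = False
                for line in src:
                    if replace_next:
                        dst.write(new_line + "\n")
                        replace_next = False
                    else:
                        dst.write(line)
                        replace_next = line.rstrip("\n") == marker
            # mkstemp creates the file with mode 0600; copy the original's mode
            os.chmod(tmp_path, os.stat(path).st_mode & 0o777)
            # rename temp file to real file
            os.replace(tmp_path, path)
        except BaseException:
            os.unlink(tmp_path)  # do not leave a stray temp file behind
            raise
```

One caveat with this sketch: the lock is held on the old file, and after the rename the path refers to a new one, so a process that was waiting on the old file will get its lock on stale contents. Locking a separate, never-renamed lock file is one way around that.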

If you can't deal with writing out the entire file, convert the file
from lines of text to something that can be updated in place. It's not
clear how much else is going to have to change to deal with this,
though.

<mike
 
Dennis Lee Bieber

DarkBlue said:
My question is how to do this in place without
a temporary file , so that the line after
the marker is being replaced with mynewlinetext.

On a stream-oriented I/O system... YOU DON'T

On a line-oriented I/O system... Maybe (Under the old Xerox CP/V OS,
you'd open the file for "UPDATE", which maintains two I/O pointers; one
for reading, one for writing, where the read pointer must be ahead of
the write pointer).

IF you can ensure that all records in your file are the same length,
you can use (f)seek to locate the beginning of records...

fseek((recnum -1) * reclen) #simplified, needs more arguments

.... and then overwrite just that record.
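
In Python that seek-and-overwrite step might look like the sketch below. It assumes every record is exactly RECLEN bytes including its trailing newline, padded with spaces; RECLEN and write_record are illustrative names, not anything from the original setup.

```python
RECLEN = 32  # assumed record length in bytes, including the trailing newline


def write_record(path, recnum, text, reclen=RECLEN):
    """Overwrite record number `recnum` (1-based) in a fixed-length-record file."""
    data = text.encode("ascii")
    if len(data) > reclen - 1:
        raise ValueError("text does not fit in one record")
    # Pad with spaces so the record keeps its exact length.
    record = data.ljust(reclen - 1) + b"\n"
    with open(path, "r+b") as f:
        f.seek((recnum - 1) * reclen)  # byte offset where the record starts
        f.write(record)
```

For example, write_record("records.txt", 2, "sshd: 10.1.2.3") would rewrite only bytes 32 to 63 of the hypothetical file records.txt and leave everything else untouched.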

I would recommend that your "marker" NOT be part of this file --
otherwise you have to do lots of I/O each time you add/remove a marker.
Removing a marker requires something on the order of:

loop until EOF
read(recnum + 1)
write(recnum)
recnum += 1

and inserting one is even worse -- you might as well use a temp file
that allows for cleanly swapping in the modified file.

If the marker is, instead, stored in a second file and contains
references to the line number of the first, it is much cleaner. Of
course, that does mean the marker file is effectively a temp file since
you'd lock it at the start of processing, and delete it at the end.
 
Dennis Lee Bieber

DarkBlue said:
Now our authorized clients register themselves automatically with
computername,id and ip address via a small python script which sends this
information to a firebird database on our server.
A serverside script scans the database every so often and changes
the hosts.allow file to enable authorized clients to log on via ssh
if they have moved out of their original areas ( like traveling from
china to india and logging in from a hotel room)

Would it not be easier to just have the database update logic
regenerate the entire hosts.allow file during such an update, rather
than trying to put some sort of identifying comment in the hosts.allow
file that you then try editing later?
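
A rough sketch of what regenerating could look like, assuming the database query has already produced (computername, ip) pairs. The function name render_hosts_allow, the header text and the one-entry-per-client "sshd: ip" layout are all assumptions for illustration; the lines that blockhosts.py maintains in the same file are not handled here and would have to be merged in separately.

```python
def render_hosts_allow(clients, header="# generated file, do not edit by hand"):
    """Build the authorized-client part of hosts.allow from (name, ip) pairs."""
    lines = [header]
    for name, ip in sorted(clients):
        lines.append("# %s" % name)    # comment naming the client
        lines.append("sshd: %s" % ip)  # allow ssh from this client's address
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical rows, standing in for the result of the database query.
    rows = [("laptop-b", "10.0.0.2"), ("laptop-a", "10.0.0.1")]
    print(render_hosts_allow(rows), end="")
```

The rendered text would then be written to a temp file and renamed over the real file, so a half-written hosts.allow is never visible.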
 
DarkBlue

Thank you for all the suggestions.

It appears the safest solution is still to use a temp file,
as was so aptly suggested further up; without it I may be
white-water rafting without a canoe.
I will also test the feasibility of regenerating
the whole file from the database.

Nx
 
