removing paragraphs from text files

alfonsobaldaserra

hello,

i have a specific paragraph in a bunch of configuration files that i
want to remove. the lines are as follows

define service{
use linux-service
host_name ninjasrv
service_description PING
check_command check_ping!100.0,20%!500.0,60%
action_url /nagios/pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
}

the 'use' and 'host_name' directives are different in each file. the
only constant string is 'PING'.

i was just wondering if it is possible to do such a thing in Perl?

thanks.
 
alfonsobaldaserra

    perl -p0777 -i -e 's/define service\{[^}]*PING[^}]*\}\s+//g' *.cf

that was amazing, all done in a single shot. could you please also
explain what exactly -p0777 is and how this substitution works:

    s/define service\{[^}]*PING[^}]*\}\s+//g

i have never seen/read such a regex.

thanks again.
 
alfonsobaldaserra

> help on what exactly is -p0777 and how did this substitution work
> 's/define service\{[^}]*PING[^}]*\}\s+//g'. i have never seen/read such
> regex.

i just found
-0777
the separator between records is 777 in octal; this is not a real
ASCII char so the whole file is slurped in as a single record;
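to convince myself of what "slurped in as a single record" means, here is a minimal sketch. it assumes that -0777 amounts to leaving the input record separator $/ undefined, which is how i read perlrun; the sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for one configuration file.
my $text = "line one\nline two\nline three\n";

# Default behaviour: $/ is "\n", so the file is read one line per record.
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

# Slurp mode: with $/ undefined (the effect of -0777), a single read
# returns the whole file as one record.
open $fh, '<', \$text or die "cannot open in-memory file: $!";
my $whole = do { local $/; <$fh> };
close $fh;

printf "line mode:  %d records\n", scalar @lines;
printf "slurp mode: 1 record of %d characters\n", length $whole;
```

so with -p0777 the s///g runs once per file against the whole content, which is why a pattern can span several lines at all.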

now my confusion is the regex match.
it goes like: search for
define service followed by a {, then any characters except }, then PING,
then any characters except }, then a }, then at least one whitespace
character, and replace it all with nothing. i am just wondering what
exactly this [^}]* is doing. i tried it with .* like

s/define service\{.*PING.*\}\s+//g

but it would not replace.

my understanding is that it should work because [^}]* (any character
but not }) is the same as .* in this case, since i know there is no }
before the PING string.

what am i missing?
 
Peter J. Holzer

> how did this substitution work
> 's/define service\{[^}]*PING[^}]*\}\s+//g'. i have never seen/read such regex.
> [...]
> now my confusion is the regex match.
> it goes like, search for
> define service followed by a { then any characters but not } then PING
> then any characters but not } then atleast one space and replace with
> nothing. i am just wondering what exactly is this [^}]* doing. i
> tried it with .* like
>
> s/define service\{.*PING.*\}\s+//g
> but it would not replace.
>
> my understanding is that it should work because [^}]* (any character
> but not }) is same as .* in this case since I know there is no }
> before PING string.

/./ is not "any character" but "any character except newline" unless you
use the /s modifier. So your substitution would only work if the whole
section was on a single line.

s/define service\{.*PING.*\}\s+//sg

OTOH would match anything from the first "define service{" to the last
"}" in the file (provided there's a PING somewhere between them) so it
would probably remove a lot more than you want. The /[^}]*/ in Tad's
regex is there to keep the match within a single brace-delimited block
(and it's a bit simple-minded: It won't work if you have a } inside a
comment, for example, but you probably don't, so that doesn't matter).
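The three cases can be compared side by side. This is only a sketch on a made-up two-block sample (the second block and the variable names are my invention, not taken from the real files), but it shows the difference between '.' without /s, '.' with /s, and the negated character class:

```perl
use strict;
use warnings;

# Hypothetical two-block sample; only the first block mentions PING.
my $conf = <<'END';
define service{
use linux-service
host_name ninjasrv
service_description PING
}

define service{
use linux-service
host_name ninjasrv
service_description SSH
}

END

# 1. No /s: '.' cannot cross a newline, so nothing matches.
(my $no_s = $conf) =~ s/define service\{.*PING.*\}\s+//g;

# 2. With /s: the greedy '.*' runs on to the last '}' in the text.
(my $greedy = $conf) =~ s/define service\{.*PING.*\}\s+//sg;

# 3. [^}]* cannot cross a '}', so the match stays inside one block.
(my $class = $conf) =~ s/define service\{[^}]*PING[^}]*\}\s+//g;

print "without /s: ", ($no_s eq $conf ? "unchanged" : "changed"), "\n";
print "with /s:    ", length($greedy), " characters left\n";
print "char class leaves:\n$class";
```

On this sample the second variant removes the SSH block along with the PING block, while the third leaves the SSH block alone.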

hp
 
sln

> [...]
> i tried it with .* like
>
> s/define service\{.*PING.*\}\s+//g
> but it would not replace.
> [...]
> what am i missing?

If you have never read such a regex, you don't know regex. This is very simple.
You should visit this group/site more often.

Assuming a slurped-in file and your test, s/define service\{.*PING.*\}\s+//g:
as Holzer said, without the /s modifier '.' matches every character except the
newline '\n', so '.*' cannot get from 'define service{' to a 'PING' on a later
line, and the pattern matches nothing.
With /s, as in 's/define service\{.*PING.*\}\s+//sg', it does match, but each
'.*' is greedy and grabs everything up to the last place where the rest of the
pattern can still match, which here means the last '}' in the file.

Also, using greedy quantifiers with '.' is a tricky prospect. They have their place
though. Most beginners just throw '.*' into the middle of their regex, when in reality
it should only be put in when the regex can already be described without it,
if at all.

The reason is that there is no guarantee of the shape of text when it is written to
a file, none! For this reason, regexes should be molded with at least a certain level
of built-in error checking (qualification). And while not 100%, 90-95% will do as a
minimal QA check.

Thus, Tad used the '[^}]*' character class to describe all characters, but one.
Specifically NOT '}' which would signify the end of a block. Which leads to the next
problem:

How do you know the syntax that the real parser uses to extract information
from that file? Even if the writer's format is simple, even custom, anomalies
may be introduced by the file system, and if the writer changes its format, then what?
Surely you would want a little robustness, some QA, built into the regex.

Tad gave you what you wanted from your simple problem statement. Indeed, it was stated
in terms so simple that they would not be acceptable in a production environment.

A lot of times (most of them) here on this group/site, that is the case.
It just amazes me sometimes that people come back with 'but it doesn't work if I
have this condition', a condition that was never stated.

Tad's regex could have been written (untested) like this:

s/define\s+service\s*\{[^}]*service_description\s+PING[^}]*\}\s*//g

and still work, while allowing some of the variability that normal parsers tolerate.
But you didn't say where the file comes from or how it is parsed:
whether 'use', 'service_description' or any other variable type is there,
in what order, which are required, etc...
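To show what that extra qualification buys, here is a rough sketch comparing the two patterns. The sample data is hypothetical (a host called PINGHOST and a space before the brace are my assumptions, nothing you described), so treat it as an illustration, not as a statement about your files:

```perl
use strict;
use warnings;

# Hypothetical sample: PING appears inside a host name in the first block
# and as the service description in the second (note the space before '{').
my $conf = <<'END';
define service{
use linux-service
host_name PINGHOST
service_description SSH
}

define service {
use linux-service
host_name ninjasrv
service_description PING
}

END

# The original pattern: any PING anywhere inside the block.
my $loose = qr/define service\{[^}]*PING[^}]*\}\s+/;

# The qualified pattern: PING must be the service_description value,
# and the whitespace around the keywords may vary.
my $strict = qr/define\s+service\s*\{[^}]*service_description\s+PING[^}]*\}\s*/;

(my $after_loose  = $conf) =~ s/$loose//g;
(my $after_strict = $conf) =~ s/$strict//g;

print "--- after loose ---\n$after_loose";
print "--- after strict ---\n$after_strict";
```

On this sample the loose pattern deletes the SSH block (because of the host name) and misses the real PING block, while the qualified one does the opposite.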

No, you stated that PING, the only constant, is in this form:
'define service{PING}'

Not a lot to go on, but don't expect this to be a real parser unless you understand
the RULES.

Good luck.

-sln
 
Eric Pozharski

*SKIP*
*skipping alfonsobaldaserra since he skipped Tad anyway*
> s/define service\{.*PING.*\}\s+//sg
>
> OTOH would match anything from the first "define service{" to the last
> "}" in the file (provided there's a PING somewhere between them) so it
> would probably remove a lot more than you want. The /[^}]*/ in Tad's
> regex is there to keep the match within a single brace-delimited block
> (and it's a bit simple-minded: It won't work if you have a } inside a
> comment, for example, but you probably don't, so that doesn't matter).

Then, for the trailing \}\s+, stricter

qr/\}\n+/

and stricter

qr/\}(?:\h*\n)+/ # needs 5.10

and stricter

qr/\}\h*\n(?:\h*\n)*/

Which leads us to

perldoc -q nesting

and to applying regexes to HTML.
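A small sketch of how the trailing patterns differ, assuming a made-up block tail with trailing spaces after the '}' and an indented line following it (both are hypothetical, chosen only to make the difference visible):

```perl
use strict;
use warnings;
use 5.010;    # \h (horizontal whitespace) needs Perl 5.10 or later

# Hypothetical tail of a block: two spaces after '}', one blank line,
# then an indented line that belongs to whatever comes next.
my $text = "}  \n\n    next_directive\n";

my @tails = (
    [ 'any whitespace' => qr/\}\s+/ ],
    [ 'newlines only'  => qr/\}\n+/ ],
    [ 'blank lines'    => qr/\}(?:\h*\n)+/ ],
);

my %matched;    # pattern name => length of the match, or undef
for my $t (@tails) {
    my ($name, $re) = @$t;
    $matched{$name} = $text =~ $re ? $+[0] - $-[0] : undef;
    if (defined $matched{$name}) {
        printf "%-15s matched %d characters\n", $name, $matched{$name};
    } else {
        printf "%-15s no match\n", $name;
    }
}
```

The \s+ version also swallows the indentation of the following line, the \n+ version gives up as soon as a space follows the brace, and the \h version removes only whole (possibly blank-padded) line ends.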
 
alfonsobaldaserra

> Not alot to go on, but don't expect this to be a real parser unless you understand
> the RULES.

that was an excellent explanation. thank you very much guys, i have
understood it now.
 
sln

> [...]
> and stricter
>
> qr/\}(?:\h*\n)+/ # needs 5.10
                           ^^^^
5.10 is great, a lot of new stuff in the engine.
New nesting, etc.

When you can write a regex without the need for the
# needs 5.10
maybe it might be useful.

Btw, I don't think anybody skipped Tad, who never skips
anybody.

-sln
 
