Regular expression question

Kapil Khosla

Hi,
I am trying to match the expression

#brace.txt
int main()
{
return main();
}

in a file but am getting stuck somewhere. This is the code I have written so far:

open I, "E:\\mycode\\perl\\brace.txt";

while($line = <I>)
{
$line =~ s/int main\(\)\s*\{return\s*main\(\)\;\s*\}/matched/ ;
print $line;
}


close I;

Where am I going wrong? Please help,
Thanks,
Kapil
 
Tore Aursand

I am trying to match the expression

#brace.txt
int main()
{
return main();
}

in a file but am getting stuck somewhere. This is the code I have written
so far:

open I, "E:\\mycode\\perl\\brace.txt";

Always (!) check whether open() succeeds, and drop the double quotes; they're
not necessary here:

open(I, '<', 'e:\mycode\perl\brace.txt') or die "$!\n";

while($line = <I>)
{
$line =~ s/int main\(\)\s*\{return\s*main\(\)\;\s*\}/matched/;
print $line;
}

You're trying to match multiple lines. The code above will try to match
only one line at a time. You should consider reading your whole file into
one string and _then_ do the matching.


--
Tore Aursand <[email protected]>
"Scientists are complaining that the new "Dinosaur" movie shows
dinosaurs with lemurs, who didn't evolve for another million years.
They're afraid the movie will give kids a mistaken impression. What
about the fact that the dinosaurs are singing and dancing?" -- Jay
Leno
 
Kapil Khosla

Great! Thanks. I modified my script to:

open I, "E:\\mycode\\perl\\brace.txt" or die"Could not open file";

while($line = <I>)
{
$line =~ s/\s*int main\(.*\)\s*{?//g;
$line =~ s/\s*\{.*//g;
$line =~ s/\s*return main\(.*\)\;//g;
$line =~ s/\}//g;
print $line;
}


close I;

This has one issue. The input file looks like:
int main()
{
// Do something
}

int main()
{
return main();
}

I only want to delete the second instance of the main block and not
the first instance. The code I wrote above deletes the main from the
first block too.
Is there a way to modify this script to delete just the second block?

Thanks,
Kapil
 
Tore Aursand

Great! Thanks.

Don't top post, and read the posting guidelines posted in this newsgroup
regularly.

I modified my script to

open I, "E:\\mycode\\perl\\brace.txt" or die"Could not open file";

You listened to something that I wrote in my previous message. That's
good. Why didn't you listen to everything I said? You don't have to
worry about escaping the backslashes if you stay with single quote. You
also want to know _what_ went wrong (which is stored in '$!').

while($line = <I>)
{
$line =~ s/\s*int main\(.*\)\s*{?//g;
$line =~ s/\s*\{.*//g;
$line =~ s/\s*return main\(.*\)\;//g;
$line =~ s/\}//g;
print $line;
}

I think you misunderstood my previous post: You need to _match_ on more
than one line in 'brace.txt'. You don't need to do the match (or the
substitution) in more than one line.

The input file looks like [...]

Why didn't you tell us this the first time you posted?

int main()
{
// Do something
}

int main()
{
return main();
}

I only want to delete the second instance of the main block and not
the first instance. The code I wrote above deletes the main from the
first block too.

First of all - you have to read the _entire_ file as a string. Then you need
to match on something _possibly_ followed by something similar.

Read the Perl documentation for matching nested structures;

perldoc -q match
 
Tad McClellan

Kapil Khosla said:
I am trying to match
               ^^^^^
               ^^^^^

Did you try executing this before posting?

perldoc -q match

open I, "E:\\mycode\\perl\\brace.txt";


You should always, yes *always*, check the return value from open().

If you use single quotes you won't have to backslash the backslashes.

If you use sane slashes you won't even have any backslashes to backslash:

open I, 'E:/mycode/perl/brace.txt' or
die "could not open 'E:/mycode/perl/brace.txt' $!";

while($line = <I>)
{
$line =~ s/int main\(\)\s*\{return\s*main\(\)\;\s*\}/matched/
                           ^^
                           ^^ need \s* here

Where am I going wrong?


By concentrating on the wrong half of the problem.

There are *two* things that might be causing a pattern match to misbehave,
the pattern and the string that you are attempting to match the
pattern against.

You have a problem with the string that you are trying to match
against (it contains only a single line).

You also have a problem with your pattern, as noted above.

Please help,


See the answer to your FAQ:

I'm having trouble matching over more than one line. What's wrong?
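
Both halves of the problem described above can be seen in a small sketch. It uses an in-memory string in place of brace.txt; the `$source` contents and the variable names are assumptions made for illustration, not code from the thread. It tries the pattern one line at a time, then tries the original pattern and a version with the missing `\s*` against the whole string.

```perl
use strict;
use warnings;

# Stand-in for the contents of brace.txt (an assumption for this sketch).
my $source = "int main()\n{\nreturn main();\n}\n";

# Original pattern: no \s* between \{ and return.
my $original = qr/int main\(\)\s*\{return\s*main\(\)\;\s*\}/;
# Same pattern with \s* added after the opening brace.
my $fixed    = qr/int main\(\)\s*\{\s*return\s*main\(\)\;\s*\}/;

# Half one: a single line never contains the whole block,
# so even the fixed pattern cannot match line by line.
my $line_hits = grep { $_ =~ $fixed } split /^/, $source;

# Half two: on the whole string, only the fixed pattern matches.
my $original_hit = $source =~ $original ? 1 : 0;
my $fixed_hit    = $source =~ $fixed    ? 1 : 0;

print "per-line matches: $line_hits\n";
print "whole string, original pattern: $original_hit\n";
print "whole string, fixed pattern: $fixed_hit\n";
```

Only the last combination, the whole string with the extra `\s*`, reports a match.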
 
Kapil Khosla

I figured this out based on your previous post.
I have attached the code below.
Thanks.

open I, "E:\\mycode\\perl\\brace.txt" or die"Could not open file";
local $/ = undef;

$line = <I>;

$line =~ s/\s*int main\(.*\)\s*\{\s*return\s*main\(.*\)\s*\;\s*\}//g;
print $line;

close I;
 
Kapil Khosla

All right,
I just read your post. This came in after I replied, so my apologies.
What do you mean by "Don't top post"?
I am sure this would be somewhere in the documentation but I have been
going through
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html#must

but could not find what it means.
Thanks,
Kapil


 
Paul Lalli

All right,
I just read your post. This came in after I replied, so my apologies.
What do you mean by "Don't top post"?
I am sure this would be somewhere in the documentation but I have been
going through
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html#must

but could not find what it means.
Thanks,
Kapil

Do you see how your replies are always at the top of the message, with the
quoted text below them? Do you see how mine right now (and most others)
are below what I'm quoting? That's the difference. What you are doing is
top-posting. It's considered rude because you can't read a 'conversation'
straight through by starting at the top and going to the bottom. If you
bottom-post, like you're supposed to, the email thread is far easier to
read by all involved.

This policy can be found in the section entitled "Use an effective
followup style", both at the URL you posted and in the Posting
Guidelines sent to this group itself.

Paul Lalli
 
Chris Mattern

Kapil said:
All right,
I just read your post. This came in after I replied, so my apologies.
What do you mean by "Don't top post"?
I am sure this would be somewhere in the documentation but I have been
going through
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html#must

but could not find what it means.
Thanks,
Kapil

Then you must not have looked too hard. Cut and pasted directly
from the website:

Intersperse your comments following each section of quoted text to which
they relate. Unappreciated followup styles are referred to as
``top-posting'', ``Jeopardy'' (because the answer comes before the
question), or ``TOFU'' (Text Over, Fullquote Under).

Familiarizing yourself with your browser's "Find" function might help;
I got this very quickly simply by searching for "top".
--
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"
 
Richard Morse

I figured this out based on your previous post.
I have attached the code below.
Thanks.

open I, "E:\\mycode\\perl\\brace.txt" or die"Could not open file";
local $/ = undef;

$line = <I>;

$line =~ s/\s*int main\(.*\)\s*\{\s*return\s*main\(.*\)\s*\;\s*\}//g;
print $line;

close I;

If I may, might I suggest the following would be "better" (for some
value of better):

#!perl
use strict;
use warnings;


my $data;
{
# if you're going to lexicalize global variables, always do
# so in a code block. Or save and restore their original
# values
open(my $in, "<", 'e:/mycode/perl/brace.txt')
or die("could not open file: $!");
local $/ = undef;
$data = <$in>;
close($in);
}

$data =~ s/\s*int main\(.*?\)\s*\{\s*return\s*main\(.*?\)\s*;\s*\}//g;
print $data;

__END__

HTH,
Ricky
 
Tad McClellan

Richard Morse said:
(e-mail address removed) (Kapil Khosla) wrote:


[snip - about multi-line matching ]

my $data;
{
# if you're going to lexicalize global variables, always do
                     ^^^^^^^^^^

Eh?

I guess you meant "if you're going to change a built-in
variable's value" instead?

I hope so, because you cannot make $/ be lexical, it is
always a package variable.

I also hope so because then the advice you gave is very good advice. :)

# so in a code block. Or save and restore their original
# values
open(my $in, "<", 'e:/mycode/perl/brace.txt')
or die("could not open file: $!");
local $/ = undef;


$/ is NOT "lexical" here, it is still "dynamic"...

$data = <$in>;
close($in);
}


If I'm going to make my own scope anyways, then I usually let
Perl handle all that file opening/closing stuff for me:

{
local $/; # enable slurp mode
local @ARGV = 'e:/mycode/perl/brace.txt';
$data = <>;
}
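
The point about `$/` being dynamic rather than lexical can be illustrated with a short sketch. The sub name `separator_state` and the variable names are hypothetical, chosen only for this example. A sub called from inside the block sees the localized value, and the original value is back once the block exits.

```perl
use strict;
use warnings;

# Report whether $/ is currently undefined (slurp mode).
sub separator_state {
    return defined $/ ? 'line mode' : 'slurp mode';
}

my $before = separator_state();

my $inside;
{
    local $/;                      # temporarily undef the package variable $/
    $inside = separator_state();   # a called sub sees the localized value
}

my $after = separator_state();     # original value restored at block exit

print "$before, $inside, $after\n";
```

This prints "line mode, slurp mode, line mode". A lexical (`my`) variable would not be visible inside `separator_state` at all, which is why "localize" is the right word here.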
 
Richard Morse

Tad McClellan said:
# if you're going to lexicalize global variables, always do
                     ^^^^^^^^^^

Eh?

I guess you meant "if you're going to change a built-in
variable's value" instead?

I'm sorry, I've been reading too much about lexical variables lately. I
meant, of course, 'localize', not 'lexicalize'.

This may still be the wrong terminology, but at least it's the right
wrong.

If I'm going to make my own scope anyways, then I usually let
Perl handle all that file opening/closing stuff for me:

{
local $/; # enable slurp mode
local @ARGV = 'e:/mycode/perl/brace.txt';
$data = <>;
}

How does it handle the 'or die' clause?

Ricky
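
On the 'or die' question: as far as I know, `<>` does not die when a file named in `@ARGV` cannot be opened. Perl emits a warning, skips the file, and the read returns undef. The sketch below assumes a deliberately nonexistent, hypothetical path and captures the warning through `$SIG{__WARN__}` so it can be inspected; the caller then has to supply its own check.

```perl
use strict;
use warnings;

my $file = 'no-such-dir/brace.txt';   # hypothetical path, assumed not to exist

my @warnings;
my $data;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };  # capture instead of printing
    local $/;                 # enable slurp mode
    local @ARGV = ($file);
    $data = <>;               # the open fails inside <>; no exception is thrown
}

# The caller has to supply its own check in place of "or die".
my $status = defined $data ? 'read ok' : "could not read '$file'";
print "$status\n";
print "perl warned: $warnings[0]" if @warnings;
```

If a hard failure is wanted, an explicit `open(...) or die` inside the block, as in the earlier version, is the more direct choice.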
 
