Regroup several lines into one by counting parentheses

Headphones · Nov 14, 2005

Hello,

I am trying to regroup several lines into one by counting parentheses.
The goal is to read a tnsnames file (Oracle Database) into an array
where one line = one instance.

tnsnames example :
Instance_name = (DESCRIPTION=

(ADDRESS=(PROTOCOL=tcp)(HOST=This_is_host_name)(PORT=1234))
(CONNECT_DATA=(SID=xxxxxx))
)

I know I can loop over the lines, count the parentheses on each line, and
join the lines once the count balances, but I would prefer a regular
expression for this. Is there a regexp master who can help me with this?

Thanks a lot

Dr.Ruud · Nov 14, 2005

First put everything on one long line and normalize the spaces. Then
replace the (...) groups, from the inside out, with {...}, except for
(DESCRIPTION=...). Then insert a line break after each remaining ), and
finally replace the {} with () again.

Untested:

s/\s+/ /sg;

# repeat until nothing changes, so nested groups are done inside-out
1 while s/[(](?!DESCRIPTION=)([^()]*)[)]/{$1}/g;

s/([)])/$1\n/sg;

s/\{/(/sg;
s/\}/)/sg;

This assumes that {} do not occur in the original (pick any other pair
if necessary) and that () do not occur inside the values.
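The steps above can be sketched as a small function. This is only a sketch, not tested code from the post: the function name entries_by_substitution and the sample entries are made up for illustration. It assumes every entry starts with the literal text (DESCRIPTION= (no space before the =) and that the input contains no curly braces.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one string per tnsnames entry.
sub entries_by_substitution {
    my ($text) = @_;

    $text =~ s/\s+/ /g;    # one long line, single spaces

    # Each pass rewrites the innermost (...) groups; repeating until
    # nothing changes converts nested groups from the inside out.
    1 while $text =~ s/[(](?!DESCRIPTION=)([^()]*)[)]/{$1}/g;

    $text =~ s/[)]/)\n/g;  # only the DESCRIPTION parentheses remain
    $text =~ tr/{}/()/;    # restore the inner parentheses

    my @entries;
    for my $line (split /\n/, $text) {
        $line =~ s/^\s+//;
        $line =~ s/\s+$//;
        push @entries, $line if length $line;
    }
    return @entries;
}

# Made-up sample data in the shape of the example from the question.
my $sample = <<'END';
ORCL1 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=host_a)(PORT=1234))
  (CONNECT_DATA=(SID=aaa))
)
ORCL2 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=host_b)(PORT=1521))
  (CONNECT_DATA=(SID=bbb))
)
END

print "$_\n" for entries_by_substitution($sample);
```

Real tnsnames files may also contain forms such as (DESCRIPTION = ...) with spaces, which this sketch does not handle.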

Now show us your Perl code.

Matt Garrish · Nov 14, 2005

Or just use Text::Balanced...

Matt

Headphones · Nov 14, 2005

You told me:

First put them all on 1 long line ... then replace ... except for (DESCRIPTION=...),

In fact, the (DESCRIPTION=...) is exactly what I am trying to retrieve.
If, on my single long line, I could select everything of the form
INSTANCE_NAME=(DESCRIPT*), up to the ')' that matches the opening
parenthesis, I would reach my goal.

I thought there were regular expressions that find the matching closing
parenthesis. To solve my problem for now, I do it "à la C":

my $parenthesis = 0;
my $uniline = "";

while (<TNSNAMES_FILE>) {
    chomp;
    next if /^\s*$/; # remove empty lines
    next if /^\s*#/; # remove comments
    s/^\s*//; #remove leading spaces
    s/\s*$//; #remove trailing spaces

    # count opening parentheses:
    # go through the line and count every "("
    my $txtline = $_;
    while ($txtline =~ /\(/g) { $parenthesis++; }

    # count closing parentheses
    $txtline = $_;
    while ($txtline =~ /\)/g) { $parenthesis--; }

    # append line to uniline
    $uniline .= $_;

    # if the parenthesis count is back to 0, the entry is complete
    if ($parenthesis == 0) {
        print "UNILINE = $uniline\n";

        # then reset uniline for the next instance
        $uniline = "";
    }
}

Headphones · Nov 14, 2005

Text::Balanced apparently doesn't work as I expected.

The prefix 'INSTANCE_NAME = ' causes problems, so the function
returns only one line with all the configs.

my $oneLine = ""; # put the file on one single line.
while (<$fh>) {
    chomp;
    next if /^\s*$/;
    next if /^\s*#/;
    s/^\s*//; #remove leading spaces
    s/\s*$//; #remove trailing spaces
    $oneLine .= $_;
}

# doesn't return what I expected :-(
my @result = extract_bracketed( $oneLine, '()' );

I think I will keep my 'à la C' code unless someone finds the
'golden' code that solves this without counting in a loop.

thanks to all.
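For readers on Perl 5.10 or later (released after this thread), a recursive pattern can find the matching closing parenthesis directly. The following is a hedged sketch, not something proposed in the thread: the names $entry_re and entries_by_recursive_regex and the sample text are invented for illustration, and it assumes instance names consist only of word characters and dots.

```perl
use strict;
use warnings;
use 5.010;    # named captures and recursive subpatterns need Perl 5.10+

my $entry_re = qr/
    ( [\w.]+ ) \s* = \s*              # instance name and the equals sign
    (?<paren>                         # a balanced group:
        \(                            #   opening parenthesis
        (?: [^()]++ | (?&paren) )*    #   plain text or a nested group
        \)                            #   matching closing parenthesis
    )
/x;

# Hypothetical helper: returns one "name = (...)" string per entry.
sub entries_by_recursive_regex {
    my ($text) = @_;
    $text =~ s/\s+/ /g;               # collapse newlines and runs of spaces
    my @entries;
    while ($text =~ /$entry_re/g) {
        push @entries, "$1 = $+{paren}";
    }
    return @entries;
}

# Made-up sample data in the shape of the example from the question.
my $tns_text = <<'END';
ORCL1 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=host_a)(PORT=1234))
  (CONNECT_DATA=(SID=aaa))
)
ORCL2 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=host_b)(PORT=1521))
  (CONNECT_DATA=(SID=bbb))
)
END

say for entries_by_recursive_regex($tns_text);
```

Unlike the line-by-line counter, this does not depend on each instance name starting on a new line, but it still does not strip comments.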

Dr.Ruud · Nov 14, 2005

Headphones:

the (DESCRIPTION=...) is exactly what I am trying to retrieve. If, on
my single long line, I could select everything of the form
INSTANCE_NAME=(DESCRIPT*), up to the ')' that matches the opening
parenthesis, I would reach my goal.

That is basically what I wrote out, by first replacing the inner (...)
by curlies.

But go and look into Text::Balanced first, as Matt Garrish suggested.
http://search.cpan.org/~dconway/Text-Balanced/lib/Text/Balanced.pm

Dr.Ruud · Nov 14, 2005

Headphones:

Always put

#!/usr/bin/perl
use strict;
use warnings;

at the start of your code.

chomp;

That is redundant, see below.

next if /^\s*$/; # remove empty lines
next if /^\s*#/; # remove comments
s/^\s*//; #remove leading spaces
s/\s*$//; #remove trailing spaces

Your last two '\s*' are meant as '\s+'.

s/^\s+//; # remove leading whitespace
s/\s+$//; # remove trailing whitespace, \n too.
next if /^#/; # skip comment lines
next if /^$/; # skip empty lines

my $txtline = $_;

No need for $txtline.

while ($txtline =~ /\(/g) {$parenthesis++;}

Alternatives:

$parenthesis++ while /\(/g;

and

$parenthesis += () = /\(/g;
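Another way to count, sketched here as an illustration rather than taken from the thread, is tr/// with an empty replacement list, which returns the number of matching characters without changing the string. The helper name paren_delta is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: net change in nesting depth caused by one line.
sub paren_delta {
    my ($line) = @_;
    my $open  = ($line =~ tr/(//);   # counts "(" and leaves $line untouched
    my $close = ($line =~ tr/)//);   # counts ")"
    return $open - $close;
}

# The running total returns to zero when an entry is complete.
my $depth = 0;
for my $line ('X = (DESCRIPTION=', '(ADDRESS=(PORT=1234))', ')') {
    $depth += paren_delta($line);
    print "depth after '$line': $depth\n";
}
```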

Dr.Ruud · Nov 14, 2005

Headphones:

Text::Balanced apparently doesn't work as I expected.

The prefix 'INSTANCE_NAME = ' causes problems, so the function
returns only one line with all the configs.

my $oneLine = ""; # put the file on one single line.
while(<$fh>){
chomp;
next if /^\s*$/;
next if /^\s*#/;
s/^\s*//; #remove leading spaces
s/\s*$//; #remove trailing spaces
$oneLine .= $_;
}

Variant:

{
local $/ = undef;
$_ = <$fh>; # slurp
}
s/\s*\n\s*//g; # remove newlines and surrounding whitespace
s/^ //; # remove any leading space
s/ $//; # remove any trailing space

my $oneLine = $_;

# doesn't return what I expected :-(
my @result = extract_bracketed( $oneLine, '()' );

extract_bracketed() can be given a 3rd parameter, check the
documentation.
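A minimal sketch of how that third parameter might be used: in list context, extract_bracketed returns the extracted group, the remainder, and the skipped prefix, so the instance name can be glued back on. The function name entries_by_text_balanced, the prefix pattern, and the sample string are assumptions made for illustration. The pattern assumes instance names contain only word characters and dots.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: returns one "name = (...)" string per entry.
sub entries_by_text_balanced {
    my ($text) = @_;
    my @entries;
    while ($text =~ /\S/) {
        # Third argument: a prefix pattern that is skipped before the
        # bracketed group; here "name =" with optional whitespace.
        my ($block, $rest, $prefix) =
            extract_bracketed($text, '()', '\s*[\w.]+\s*=\s*');

        # Stop if no balanced group could be extracted.
        last unless defined $block && length $block;

        $prefix =~ s/^\s+//;
        push @entries, $prefix . $block;
        $text = $rest;                 # continue with what is left
    }
    return @entries;
}

# Made-up sample: two entries already joined onto one line.
my $one_line = 'ORCL1 = (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=host_a)(PORT=1234))(CONNECT_DATA=(SID=aaa)))'
             . 'ORCL2 = (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=host_b)(PORT=1521))(CONNECT_DATA=(SID=bbb)))';

print "$_\n" for entries_by_text_balanced($one_line);
```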

I think I will keep my 'à la C' code unless someone finds the
'golden' code that solves this without counting in a loop.

That code only works if you can assume that each new Instance_name
starts on a new line, which you didn't explicitly state.

Headphones · Nov 15, 2005

Your last two messages are interesting. Because I'm busy this week, I'll
investigate this and respond next week.

And thanks for this code (I didn't know it was possible):
{
local $/ = undef;
$_ = <$fh>; # slurp
}

A. Sinan Unur · Nov 15, 2005

Your last two messages are interesting.

Whose last two messages? Please quote some context when you reply.

And thanks for this code (I didn't know it was possible):
{
local $/ = undef;
$_ = <$fh>; # slurp
}

Strictly speaking, the call to undef is not needed. Just

local $/;

is enough.

Sinan

Headphones · Nov 18, 2005

Your last two messages are interesting.

Whose last two messages? Please quote some context when you reply.

Oops, I forgot that Google Groups was also Usenet :-/

Dr.Ruud:

extract_bracketed() can be given a 3rd parameter, check the
documentation.

I checked the Text::Balanced documentation for extract_bracketed, but
the third parameter, if given, is matched and then skipped in the return
value. As I want to get back the instance name AND what is described in
the parentheses, extract_bracketed will not help me.

That code only works if you can assume that a new Instance_name will be
on the start of a new line, which you didn't explicitly state.

Correct! I am still looking for a way to solve that problem.

Another nice coding shortcut that I didn't know.

Here is the final code (with 2 minor improvements still to do):
if ($fh->open("< $file")) {
    while (<$fh>) {
        chomp;
        next if /^\s*$/;
        next if /^\s*#/; # remove comments
        # TODO : comments at end of line are not removed !!

        # remove leading and trailing spaces
        s/^\s+//; #remove leading spaces
        s/\s*$//; #remove trailing spaces

        # count opening parentheses: count every "("
        $parenthesis++ while /\(/g;

        # count closing parentheses
        $parenthesis-- while /\)/g;

        # append line to uniline
        $uniline .= $_;

        # if the parenthesis count is back to 0, the entry is complete
        if ($parenthesis == 0) {

            # Do some process
            # .......

            # reset uniline
            $uniline = "";
        }
    }
}

Dr.Ruud · Nov 18, 2005

Headphones:

chomp;
next if /^\s*$/;
next if /^\s*#/; # remove comments
# TODO : comments at end of line are not removed !!

# remove leading and trailing spaces
s/^\s+//; #remove leading spaces
s/\s*$//; #remove trailing spaces

<record state=broken>
The chomp removes an optional \n, which is whitespace.
Your s/\s*$// should be s/\s+$//, because you need at least 1 whitespace
character before you can remove any.
</record>

s/^\s+//; # remove leading whitespace
s/\s+$//; # remove trailing whitespace
next if /^$/; # skip empty lines
next if /^#/; # skip comment lines

Removing end-of-line comments is a bit harder. This will remove the
ones that contain no punctuation (only word characters and whitespace):
s/#[\s\w]*$//;
