Any Idea why this code doesn't remove all the blank lines?

Jack Wang · Feb 14, 2008

This is the code I've written so far.

#!/usr/bin/perl
my $result = "";
while (<>){
if (/---START---/../--END\s---/){
$result.=$_;
}
}
$text="";
$result=~m/^---START---(.*)--END\s---$/s;
$text.=$1;
$text =~ s/\n+/\n/g;
print $text;

This is the text that it should handle (shortened, ........ represents
more data).

---START---

1342A 1O B10/B11
1003 1O B45/Z46
1094 1O F39/F40
1416 1O G37/G38
1007 1O Z33/A34
..........................

.............................
.............................
.....stuff here..........
.....................

4105 4L F31/F32
.......................
......................

--END ---

I want to extract the data between ---START--- and --END ---,
removing any blank lines. However, the above program outputs
everything correctly except that it leaves a blank line at the top,
and I can't figure out why. Thanks for any help!

John W. Krahn · Feb 14, 2008

Jack said:
This is the code I've written so far.

#!/usr/bin/perl

use warnings;
use strict;

my $result = "";
while (<>) {
    if (/---START---/ .. /--END\s---/) {
        next unless /\S/;                        # skip blank lines
        next if /---START---/ || /--END\s---/;   # skip the marker lines
        $result .= $_;
    }
}
# $result now holds only the non-blank lines between the markers,
# so the later match and substitution are no longer needed.
print $result;

John

xhoster · Feb 14, 2008

$result=~m/^---START---(.*)--END\s---$/s;
$text.=$1;
$text =~ s/\n+/\n/g;
....

However, the above program outputs everything correctly except that
it leaves a blank line at the top, and I can't figure out why.
Thanks for any help!

You get a blank line either when there are two \n in a row, or when
the string has a single \n at the beginning. Your substitution handles
the first case, but not the second.

Either don't capture them in the first place:

$result=~m/^---START---\n*(.*)--END\s---$/s;

Or remove it explicitly afterwards:

$text =~ s/\n+/\n/g;
$text =~ s/^\n+//;
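
As a quick check, here is a minimal sketch that runs both fixes on a small made-up string. The sample data and the variable names $skip_first and $strip_after are assumptions for illustration only, not the real input.

```perl
use strict;
use warnings;

# Hypothetical sample: marker lines around data that contains blank lines.
my $result = "---START---\n\n1342A\n1003\n\n\n4105\n\n--END ---";

# Fix 1: consume the newlines after the start marker before capturing.
my ($skip_first) = $result =~ m/^---START---\n*(.*)--END\s---$/s;
$skip_first =~ s/\n+/\n/g;

# Fix 2: capture as before, then strip the leading newline afterwards.
my ($strip_after) = $result =~ m/^---START---(.*)--END\s---$/s;
$strip_after =~ s/\n+/\n/g;
$strip_after =~ s/^\n+//;

print $skip_first;
```

Both variables should end up holding the same three data lines with no blank line at the top.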

Xho

Uri Guttman · Feb 14, 2008

JWK> use warnings;
JWK> use strict;

JWK> next unless /\S/;
JWK> next if /---START---/ || /--END\s---/;

you can use the return value of .. to eliminate the redundancy of those
regexes:

if ( my $range_num = /---START---/ .. /--END\s---/ ) {

next if $range_num == 1 || $range_num =~ /e/i ;
}

i would even drop the block:

my $range_num = /---START---/ .. /--END\s---/ ;
next unless $range_num ;
next if $range_num == 1 || $range_num =~ /e/i ;
next unless /\S/ ;
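
To make the return value of .. concrete, here is a small sketch that records it for each line of a made-up input. The @lines data and the @seen and @kept arrays are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical input lines, for illustration only.
my @lines = ("junk\n", "---START---\n", "a\n", "\n", "b\n", "--END ---\n", "junk\n");

my @seen;   # value returned by the range operator for each line
my @kept;   # lines that survive all the filters
for (@lines) {
    my $range_num = /---START---/ .. /--END\s---/;
    push @seen, $range_num;
    next unless $range_num;                           # outside the range
    next if $range_num == 1 || $range_num =~ /E0$/;   # the two marker lines
    next unless /\S/;                                 # blank lines
    push @kept, $_;
}

print join(",", map { $_ eq "" ? "(empty)" : $_ } @seen), "\n";
print @kept;
```

Outside the range the operator returns an empty string. Inside it returns a sequence number starting at 1, and the last line of the range gets "E0" appended, which is what the /e/i test above relies on.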

but my favorite way is so much faster and shorter (untested):

use File::Slurp ;

my $text = read_file( \*STDIN ) ;
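# scalar-context //g so each pass resumes after the previous match;
# .+? is non-greedy so each START/END pair is matched on its own
while( $text =~ m/^---START---\n(.+?)^--END\s---$/msg ) {

    my $result = $1 ;

    # do newline and other cleanup here

    $result =~ tr/\n//s ;    # squeeze runs of newlines into one
    $result =~ s/^\n// ;     # drop the newline left by a leading blank line

    print $result ;
}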

can't get much simpler than that.

uri

Martijn Lievaart · Feb 14, 2008

I want to extract the data between ---START--- and --END ---,
removing any blank lines. However, the above program outputs
everything correctly except that it leaves a blank line at the top,
and I can't figure out why. Thanks for any help!

Because you ask it to?

Your problem can be shortened to:
$ perl -e '$t="\ntest\n\ntest\n"; $t=~ s/\n+/\n/g; print "t=$t\n"'

This does exactly the same thing: it leaves the first empty line. Why?
Because you replace the newline there with a newline.

Try:
$ perl -e '$t="\ntest\n\ntest\n"; $t=~ s/\n+/x/g; print "t=$t\n"'

And you'll see what I mean.

You probably want to add:
$text =~ s/^\n//;
to achieve what you want.

Some stylistic issues:

#!/usr/bin/perl

use strict;
use warnings;

my $result = "";
while (<>) {
    if (/---START---/ .. /--END\s---/) {
        $result .= $_;
    }
}

Indentation helps for readability.

$text="";
$result=~m/^---START---(.*)--END\s---$/s;
$text.=$1;

Useless use of concatenation. Change to:

$result=~m/^---START---(.*)--END\s---$/s;
my $text = $1;

$text =~ s/\n+/\n/g;
print $text;

HTH,
M4

Mario D'Alessio · Feb 15, 2008

Try this:

while (<>)
{
    #
    # Grab the lines between these two lines (exclusive)
    #
    my $sequence = /---START---/ ... /--END\s---/;
    next unless $sequence > 1;    # Excludes left-hand pattern
    next if $sequence =~ /E0$/;   # Excludes right-hand pattern

    next if /^\s*$/;              # Skip blank lines
    print;
}
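
The loop above uses three dots rather than two. With the markers in this thread both forms behave the same, because no single line matches both patterns. The difference only shows up when one line can match both, as in this small sketch. The "===" delimiter and the array names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical data where one delimiter line both opens and closes a block.
my @lines = ("===\n", "a\n", "b\n", "===\n", "c\n");

my (@two_dot, @three_dot);
for (@lines) {
    # Two dots: the right operand is tested on the same line that matched
    # the left one, so the range opens and closes on each delimiter line.
    push @two_dot, $_ if /^===$/ .. /^===$/;
}
for (@lines) {
    # Three dots: the right operand is first tested on the following line.
    push @three_dot, $_ if /^===$/ ... /^===$/;
}

print "two dots:   ", scalar(@two_dot), " lines\n";
print "three dots: ", scalar(@three_dot), " lines\n";
```

With two dots only the delimiter lines themselves are selected. With three dots the range runs from the first delimiter through the second.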
