Why is this sub removing newlines??

Rainer Weikusat · Dec 7, 2013

Janek Schleicher said:
On 06.12.2013 15:29, Rainer Weikusat wrote:

So, you also prefer to write
s/\r?\n$// instead of oversimplifying chomp; ?

Since you apparently missed this: While there's doubtlessly many a
developer who is convinced to have invented something comparable to 'the
wheel', ie, a basic design which will remain in use for a few thousand
years, as soon as he managed to tack three lines of code together doing
something other than 'crash immediately', possibly even more so on CPAN,
using this comparison is either a case of hubris bordering on serious
megalomania or just someone babbling along without spending much effort
on thinking about what he's actually saying, not the least because this
simile is actually wrong: Wheels come in many different kinds and even a
seriously reality-impaired mathead should have noticed the difference
between, say, tanks, push chairs, racing cars and pottery wheels. "But
can't you see they're all round and rotate!" isn't much of a similarity at
the technical level.

Charles DeRykus · Dec 10, 2013

Sorry, more redemption is needed.

But I'm not quite ready to declare it unredeemable:

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

[ depending on flavor of white space you want ]

Rainer Weikusat · Dec 10, 2013

Rainer Weikusat said:
Charles DeRykus said:

34:44 AM UTC-8, Ben Bacarisse wrote:

On Thursday, December 5, 2013 1:56:46 PM UTC-8, John Black wrote:

keep this in mind - I had wanted that trim function to not strip the
newlines (and not add any either if there wasn't one). Should not be

Another option: a regex that'd handle any trailing newline:
$string =~ s/ ^\s+ | \s+(?=\n|)$ //gx;

Surely this strips the newline?

Indeed. I was slipping off the end... I think, hope a redemptive tweak will do it:
$string =~ s/ ^\s+ | \s++(?=\n) //gx;

Sorry, more redemption is needed.

But I'm not quite ready to declare it unredeemable:

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

I may be missing something here, but what about

s/\s+?(?=\n)?$//

One thing I missed was a trailing newline without other whitespace in
front of it. Making this

s/\s*?(?=\n)?$//;

instead works with that as well (although this should surely be called a
questionable construct, given the number of ?s ...).

Dr.Ruud · Dec 11, 2013

I've been using \s as a shortcut for spaces or tabs.

See also perlrecharclass, look for [[:blank:]] and \h.

John Black · Dec 11, 2013

I've been using \s as a shortcut for spaces or tabs.

See also perlrecharclass, look for [[:blank:]] and \h.

Thanks. Looks like what I really wanted in most cases was \h. [[:blank:]] sounds like it
would work too, but it's just too bulky to put into regexes since it can easily be avoided
with \h.

John Black
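
For anyone comparing the two classes, here is a minimal sketch of the difference. The helper names trim_h and trim_s are made up for this illustration and do not come from the thread; \h needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing horizontal whitespace.
# \h matches spaces and tabs but not \n, so a final newline is left alone.
sub trim_h {
    my ($s) = @_;
    $s =~ s/^\h+//;    # leading spaces/tabs
    $s =~ s/\h+$//;    # trailing spaces/tabs; $ can match just before a final \n
    return $s;
}

# Same shape with \s, which also matches \n, so the newline gets eaten.
sub trim_s {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

print trim_h("  foo \t\n") eq "foo\n" ? "\\h: newline kept\n"     : "\\h: newline lost\n";
print trim_s("  foo \t\n") eq "foo"   ? "\\s: newline stripped\n" : "\\s: newline kept\n";
```

With \h the question of protecting the newline mostly disappears, because the class cannot match it in the first place.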

Rainer Weikusat · Dec 12, 2013

[remove whitespace at end of string but keep \n if it is there]

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

[this also deals with whitespace at the beginning]

s/\s*?(?=\n)?$//;

Maybe logically simpler:

s/\s*?(\n)?$/$1/;

(this will likely result in a warning when there's no newline at the
end of the line and runtime warnings are enabled).

Charles DeRykus · Dec 13, 2013

[remove whitespace at end of string but keep \n if it is there]

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

[this also deals with whitespace at the beginning]

s/\s*?(?=\n)?$//;

Maybe logically simpler:

s/\s*?(\n)?$/$1/;

(this will likely result in a warning when there's no newline at the
end of the line and runtime warnings are enabled).

Hm, as you note though, it doesn't handle initial w/s and coughs up an
'uninitialized' warning if there is no ending \n. However, you could take
a cough suppressant:

s{\s*?(\n)?$}{$1 // ''}e;

But, more significantly, doesn't handle multiple ending newlines, eg,
"foo \n\n\n"
[which of course may not be an issue for the OP]
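
A small sketch of the warning and the "cough suppressant" discussed above, assuming Perl 5.10 or later for the // operator. The helper names (count_warnings, strip_plain, strip_quiet) are invented for this illustration only.

```perl
use strict;
use warnings;

# Helper for this sketch only: run a code ref and count the warnings it raises.
sub count_warnings {
    my ($code) = @_;
    my $n = 0;
    local $SIG{__WARN__} = sub { $n++ };
    $code->();
    return $n;
}

# Capture-based version: $1 is undef when the string has no final \n.
sub strip_plain {
    my ($s) = @_;
    $s =~ s/\s*?(\n)?$/$1/;
    return $s;
}

# Same pattern, but the replacement is an expression (/e) that falls back
# to the empty string when $1 is undefined.
sub strip_quiet {
    my ($s) = @_;
    $s =~ s{\s*?(\n)?$}{$1 // ''}e;
    return $s;
}

for my $sub (\&strip_plain, \&strip_quiet) {
    my $warnings = count_warnings(sub { $sub->("foo  ") });
    print $warnings ? "warned\n" : "quiet\n";
}
```

Both versions return the same strings; the only difference is whether the no-newline case triggers an 'uninitialized' warning.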

Rainer Weikusat · Dec 13, 2013

Charles DeRykus said:
Rainer Weikusat said:

Charles DeRykus writes:

[remove whitespace at end of string but keep \n if it is there]

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

[this also deals with whitespace at the beginning]

s/\s*?(?=\n)?$//;

Maybe logically simpler:

s/\s*?(\n)?$/$1/;

(this will likely result in a warning when there's no newline at the
end of the line and runtime warnings are enabled).

Hm, as you note though, it doesn't handle initial w/s and coughs up an
'uninitialized' warning if there is no ending \n. However, you could take
a cough suppressant:

s{\s*?(\n)?$}{$1 // ''}e;

It wasn't supposed to handle initial whitespace because that's not
really related to the \n-issue (also true for the first) ...

But, more significantly, doesn't handle multiple ending newlines, eg,
"foo \n\n\n"
[which of course may not be an issue for the OP]

... and it certainly wasn't supposed to do that, either: When processing
something line-by-line which I assumed to be the case here, "foo \n\n\n"
will be the three lines

"foo \n"
"\n"
"\n"

and assuming that handling "foo \n bla\n \n" should result in
"foo\n bla\n \n", ie the purpose is to remove leading whitespace at
the beginning of a multi-line text but not leading whitespace on the
individual lines seems rather bizarre to me. Or that processing
"a \n " should remove the \n given that newlines are not supposed to
be removed. And what about " a b \n bbb\n\n c \n"?

Rainer Weikusat · Dec 13, 2013

Ben Morrow said:
Quoth Rainer Weikusat said:

Rainer Weikusat said:

Charles DeRykus writes:

[remove whitespace at end of string but keep \n if it is there]

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

[this also deals with whitespace at the beginning]

s/\s*?(?=\n)?$//;

Maybe logically simpler:

s/\s*?(\n)?$/$1/;

If you're willing to rely on \s*? finding all the whitespace (it does,
because the 'start earlier in the string' rule trumps the 'match
minimally' rule, but IMHO it's confusing), you just need

s/\s*?$//

What's confusing here is that $ matches two different things depending
on the context: Apparently, if it is preceded by \s*?, it matches
immediately before \n at the end of the line and if that is \s*, it
matches after the \n. But that's certainly good to know.
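
A minimal sketch of the two behaviours just described. The sub names strip_lazy and strip_greedy are made up for this illustration.

```perl
use strict;
use warnings;

# Lazy: the match starts at the first trailing space and grows one character
# at a time; $ first succeeds just before the final \n, so the \n survives.
sub strip_lazy {
    my ($s) = @_;
    $s =~ s/\s*?$//;
    return $s;
}

# Greedy: \s* grabs the spaces and the \n, and $ then matches at the very end.
sub strip_greedy {
    my ($s) = @_;
    $s =~ s/\s*$//;
    return $s;
}

print strip_lazy("foo  \n")   eq "foo\n" ? "lazy keeps the newline\n"     : "lazy removed the newline\n";
print strip_greedy("foo  \n") eq "foo"   ? "greedy removes the newline\n" : "greedy kept the newline\n";
```

In both cases $ is the same assertion (end of string, or just before a final newline); what changes is which of those two positions the quantifier reaches first.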

Charles DeRykus · Dec 13, 2013

Charles DeRykus said:
Charles DeRykus said:

[remove whitespace at end of string but keep \n if it is there]

$string =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;

...

It wasn't supposed to handle initial whitespace because that's not
really related to the \n-issue (also true for the first) ...

Yes, I was over-generalizing. But, imo, a one-liner handles both goals
without being rocket science[1].

But, more significantly, doesn't handle multiple ending newlines, eg,
"foo \n\n\n"
[which of course may not be an issue for the OP]

... and it certainly wasn't supposed to do that, either: When processing
something line-by-line which I assumed to be the case here, "foo \n\n\n"
will be the three lines

That wasn't really specified though. The goal was to remove whitespace
from the beginning and end of "strings" [rather than just well-behaved
lines] without clobbering a trailing newline.

"foo \n"
"\n"
"\n"

and assuming that handling "foo \n bla\n \n" should result in
"foo\n bla\n \n", ie the purpose is to remove leading whitespace at
the beginning of a multi-line text but not leading whitespace on the
individual lines seems rather bizarre to me. Or that processing
"a \n " should remove the \n given that newlines are not supposed to
be removed. And what about " a b \n bbb\n\n c \n"?

Yes, agreed. Without more certainty about original intent, it becomes
bizarre. But less bizarrely, a string might easily have multiple
newlines on the end with the reasonable goal of removing all but the
final one.
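
To make the disagreement concrete, here is a sketch that runs the two approaches from this thread on the disputed inputs. The sub names are invented for the illustration, and the capture version uses the // fallback, so Perl 5.10 or later is assumed.

```perl
use strict;
use warnings;

# The three-way alternation: leading whitespace, trailing whitespace in
# front of a final \n, or trailing whitespace that ends in a non-newline.
sub trim_alternation {
    my ($s) = @_;
    $s =~ s/ ^\s+ | \s+(?=\n)$ | \s*[^\n\S]+$ //gx;
    return $s;
}

# The capture-based version with the warning suppressed.
sub trim_capture {
    my ($s) = @_;
    $s =~ s{\s*?(\n)?$}{$1 // ''}e;
    return $s;
}

# Print a string with its newlines made visible.
sub show {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    return qq{"$s"};
}

for my $in ("foo \n\n\n", "a \n ") {
    printf "%-12s alternation: %-8s capture: %s\n",
        show($in), show(trim_alternation($in)), show(trim_capture($in));
}
```

On "foo \n\n\n" the alternation leaves "foo\n" while the capture version leaves "foo\n\n". On "a \n " the alternation returns "a", which is the newline-removing case objected to earlier in the thread.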
