split into array

Mark J Fenbers

$myline = "0,1,2,3,4,,,,";
@myarray = split /,/, $myline;
print scalar @myarray; # prints 5

$myline = "0,1,2,3,4,,,,8";
@myarray = split /,/, $myline;
print scalar @myarray; # prints 9

In database output, sometimes the last field(s) in a record is(are) null, and so
the line looks like the first $myline. Other times, the last field is not null,
and so the line looks like the second $myline. Is there a way that I can get
Perl to report that scalar @myarray is 9 even if there are trailing null fields?

Mark
 
Gunnar Hjalmarsson

Mark said:
$myline = "0,1,2,3,4,,,,";
@myarray = split /,/, $myline;
print scalar @myarray; # prints 5

$myline = "0,1,2,3,4,,,,8";
@myarray = split /,/, $myline;
print scalar @myarray; # prints 9

In database output, sometimes the last field(s) in a record is(are)
null, and so the line looks like the first $myline. Other times,
the last field is not null, and so the line looks like the second
$myline. Is there a way that I can get Perl to report that scalar
@myarray is 9 even if there are trailing null fields?

As you show above, the split() function deletes empty trailing fields.
If you want to count all the fields, whether empty or not, why not do
it separately:

my $numfields = 1 + $myline =~ tr/,//;
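
A minimal sketch of the counting approach above, in case it helps. The name `count_fields` is only illustrative, and the sketch assumes that no field contains a quoted or escaped comma:

```perl
use strict;
use warnings;

# Hypothetical helper: count comma-separated fields without splitting.
# tr/,// with an empty replacement list only counts the commas; it does
# not modify the string.
sub count_fields {
    my ($line) = @_;
    return 1 + ($line =~ tr/,//);
}

print count_fields("0,1,2,3,4,,,,"),  "\n";   # 8 commas, so 9 fields
print count_fields("0,1,2,3,4,,,,8"), "\n";   # 8 commas, so 9 fields
```

Note that an empty line counts as one (empty) field with this formula.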
 
ko

Mark said:
$myline = "0,1,2,3,4,,,,";
@myarray = split /,/, $myline;
print scalar @myarray; # prints 5

$myline = "0,1,2,3,4,,,,8";
@myarray = split /,/, $myline;
print scalar @myarray; # prints 9

In database output, sometimes the last field(s) in a record is(are) null, and so
the line looks like the first $myline. Other times, the last field is not null,
and so the line looks like the second $myline. Is there a way that I can get
Perl to report that scalar @myarray is 9 even if there are trailing null fields?

Mark

my @myarray = split /(?=,)/, $myline;

'perldoc -f split' from your shell gives you the explanation - suggest
that you read it.
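
One thing to be aware of with the lookahead version: the pattern is zero-width, so the commas are not consumed and every field after the first keeps its leading comma. A small sketch (the helper names `lookahead_split` and `strip_leading_comma` are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: split before each comma without consuming it.
sub lookahead_split {
    my ($line) = @_;
    return split /(?=,)/, $line;
}

# Hypothetical helper: remove the leading comma that the lookahead
# split leaves on every field after the first.
sub strip_leading_comma {
    return map { my $f = $_; $f =~ s/^,//; $f } @_;
}

my @raw    = lookahead_split("0,1,2,3,4,,,,");
my @fields = strip_leading_comma(@raw);

print scalar(@raw), "\n";       # 9
print join('|', @raw), "\n";    # 0|,1|,2|,3|,4|,|,|,|,
print join('|', @fields), "\n"; # 0|1|2|3|4||||
```

The count comes out as 9, but the field values need the extra clean-up step before they can be used.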

You might also want to look at the standard Text::ParseWords module. It
allows you to parse CSV files and deal with fields that contain commas,
e.g. escaped using quotes and backslashes.
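
A rough sketch of what that looks like, assuming a record whose second field holds a comma inside double quotes. The sample data and the name `parse_record` are invented for illustration; `quotewords` is exported by Text::ParseWords:

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical helper: parse one comma-delimited record.
# quotewords(DELIM, KEEP, LINE): with KEEP false, the quotes are removed
# and a comma inside quotes is not treated as a delimiter.
sub parse_record {
    my ($record) = @_;
    return quotewords(',', 0, $record);
}

my $record = q{1,"Smith, John",3};

my @naive  = split /,/, $record, -1;   # breaks the quoted field in two
my @parsed = parse_record($record);

print scalar(@naive), "\n";    # 4
print scalar(@parsed), "\n";   # 3
print "$parsed[1]\n";          # Smith, John
```

This only illustrates the embedded-comma case; how trailing empty fields come back from the module is worth checking separately against your own data.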

HTH - keith
 
Tad McClellan

Gunnar Hjalmarsson said:
Mark J Fenbers wrote:
As you show above, the split() function deletes empty trailing fields.


It deletes trailing empty fields _by default_.

If you want to count all the fields, whether empty or not,


Then don't let it default to the default. :)


@myarray = split /,/, $myline, -1;
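
A short sketch comparing the default with a LIMIT of -1 on the original poster's data (the wrapper name `split_keep_trailing` is just illustrative):

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas, keeping trailing empty fields.
sub split_keep_trailing {
    my ($line) = @_;
    return split /,/, $line, -1;
}

my @default = split /,/, "0,1,2,3,4,,,,";
my @kept    = split_keep_trailing("0,1,2,3,4,,,,");

print scalar(@default), "\n";   # 5
print scalar(@kept), "\n";      # 9
```

One caveat from the split documentation: splitting an empty string always produces zero fields, regardless of LIMIT.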
 
Gunnar Hjalmarsson

Tad said:
It deletes trailing empty fields _by default_.


Then don't let it default to the default. :)

@myarray = split /,/, $myline, -1;

Thanks, Tad.

Actually, I looked in the docs for how to have it do something other
than the default, but didn't find it. Now I realize that this sentence
might be it: "If LIMIT is negative, it is treated as if an arbitrarily
large LIMIT had been specified." The meaning of that sentence wasn't
exactly obvious to me the first time I read it, and, to be honest, it
isn't now either.

What am I missing?
 
Tad McClellan

Gunnar Hjalmarsson said:
Thanks, Tad.

Actually, I looked in the docs for how to have it do something other
than the default, but didn't find it. Now I realize that this sentence
might be it: "If LIMIT is negative, it is treated as if an arbitrarily
large LIMIT had been specified." The meaning of that sentence wasn't
exactly obvious to me the first time I read it, and, to be honest, it
isn't now either.

What am I missing?


The sentence before the one you quoted. :)

If LIMIT is unspecified or zero, trailing null fields are stripped

So, if LIMIT is specified non-zero then trailing fields are not stripped.
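
To make the cases concrete, here is a small sketch that runs the same line through split with different LIMIT values. The helper `fields_with_limit` and the sample line are made up for illustration; the brackets and `|` separators are only there to make empty fields visible:

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas with an optional LIMIT and
# return the fields joined with '|' inside brackets.
sub fields_with_limit {
    my ($line, $limit) = @_;
    my @fields = defined $limit
        ? split(/,/, $line, $limit)
        : split(/,/, $line);
    return '[' . join('|', @fields) . ']';
}

my $line = "a,b,,,";

print fields_with_limit($line),     "\n";   # [a|b]      LIMIT omitted
print fields_with_limit($line, 0),  "\n";   # [a|b]      LIMIT zero
print fields_with_limit($line, 2),  "\n";   # [a|b,,,]   at most 2 fields
print fields_with_limit($line, 10), "\n";   # [a|b|||]   positive, trailing kept
print fields_with_limit($line, -1), "\n";   # [a|b|||]   negative, trailing kept
```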
 
Gunnar Hjalmarsson

Tad said:
The sentence before the one you quoted. :)

If LIMIT is unspecified or zero, trailing null fields are
stripped

So, if LIMIT is specified non-zero then trailing fields are not
stripped.

Hmm... Is it just me who doesn't find that to be the height of clarity?
I rather find it to border on an 'undisclosed feature'. :(
 
gnari

[snip discussion about the clearness of perldoc -f split]
A better way to write the sentence in question might be:

"If LIMIT is specified and positive, splits into a maximum of that
many fields or the number of total fields including null fields,
whichever is lower."

I do not agree. I think the docs are quite clear, and your version
is more confusing. The split() function is amazingly complicated,
with lots of special cases. The docs do an admirable job of
describing each effect separately in a clear way.

gnari
 
ctcgag

Greg Klinedinst said:
Hmm, perhaps I wasn't clear myself, or at least I used the wrong word.
The clarity is fine, what is lacking is the completeness. For example
it is clear that if LIMIT is unspecified or zero, split will strip trailing
null fields. However, logically we cannot assume that if LIMIT is specified
the opposite will happen. The only two other relevant statements say
that we can set a maximum using a positive integer (though it may split
into fewer, but how many is "fewer" referring to?), and that a
negative integer acts as a VERY large positive int. Neither of them
covers when it will not strip null fields. I forget the formal logic
syntax but it goes something like this:

A -> B != !A -> !B

I think that if one wants something spelled out in exhaustive detail
equivalent to symbolic logic, one would read the source code rather than
the documentation.

I thought the documentation was pretty effective at conveying the meaning
when I first read it.

Xho
 
Brad Baxter

I do not agree. I think the docs are quite clear, and your version
is more confusing. The split() function is amazingly complicated,
with lots of special cases. The docs do an admirable job of
describing each effect separately in a clear way.

gnari

I agree with gnari's disagreement. I remember a long ago discussion in
which I tried (for my own edification) to create examples that illustrated
the various permutations of split's behavior as described in the docs, and
was asking about some cases. At a crucial point in that discussion, I
believe it was Ilya who said that split was an abomination, because of the
plethora of special cases.

While I respect that viewpoint, I think it's only in rare occasions that
the documentation fails to cover an obscure corner of use. Perhaps a
perlsplittut is in order? (No, I'm not--shudder--volunteering.)

Regards,

Brad
 
