["Followup-To:" header set to comp.lang.c.]
On 16 Jan 2004 02:50:32 -0800,
in Msg. said:
And even simpler to get it wrong, or create an inadequate solution as
we shall see.
I've never claimed that my solution was adequate in the sense of solving
the OP's problem 100%.
1) This code is complicated.
It's a mere 23 lines. You want to see complicated code?
Just two, non-nested, and only one of them non-trivial.
zero state machines
and I have difficulty telling whether this code is
correct.
Like with any third-party function you've got to either trust the
documentation or write your own.
In the middle there you've hidden a nice "a > s" comparison
... I didn't know you could compare pointers like that in a portable
way.
You can as long as they point into the same array.
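
To illustrate the point about pointer comparison, here is a minimal sketch.
It is not the code from my earlier post; trimmed_len() is a hypothetical
helper made up for this example. Relational comparison of two pointers is
well defined when both point into the same array object (or one past its
end), which is the case for a and s here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of s without its trailing whitespace. */
size_t trimmed_len(const char *s)
{
    const char *a = s + strlen(s);   /* points at the terminating '\0' */

    /* a and s point into the same array, so "a > s" is well defined. */
    while (a > s && isspace((unsigned char)a[-1]))
        a--;
    return (size_t)(a - s);
}
```

Comparing pointers into two unrelated arrays with < or > is a different
matter and is not portable.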
2) The code only shows the inner loop -- the original request asks for
filling a two dimensional array.
Again, I never claimed to have been trying to solve the OP's problem. I was
merely giving an example of how to parse a string in C.
3) It encodes the classic C worthlessness of requiring that you
specify the size of your containers (array) up front, before you know
the size of the data you require.
Easily fixed by a reallocing mechanism that I didn't bother with. The
function I gave is specifically geared to parsing csv tables where the
number of columns is usually known.
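
For the record, the kind of reallocing mechanism I mean could look roughly
like the sketch below. The struct and the function name are hypothetical,
invented for this example, and the doubling policy with a start size of 8 is
just one reasonable assumption.

```c
#include <stdlib.h>

/* Hypothetical growable array of field pointers. */
struct fieldvec {
    char **items;   /* NULL while empty */
    size_t len;     /* number of pointers stored */
    size_t cap;     /* number of pointers allocated */
};

/* Appends p, doubling the capacity whenever the array is full.
 * Returns 0 on success, -1 if realloc() fails; on failure the
 * vector is left unchanged and still valid. */
int fieldvec_push(struct fieldvec *v, char *p)
{
    if (v->len == v->cap) {
        size_t ncap = v->cap ? v->cap * 2 : 8;
        char **tmp = realloc(v->items, ncap * sizeof *tmp);

        if (tmp == NULL)
            return -1;
        v->items = tmp;
        v->cap = ncap;
    }
    v->items[v->len++] = p;
    return 0;
}
```

Start with a zero-initialized struct fieldvec and free() the items member
when done.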
The original poster also posted to
C++ -- using an STL vector to hold the result is probably the right
answer.
In C++ it would be.
With or without the '\n' on the end? And do you compensate for a
potential '\r' in there?
The "doc" (my comments) states that leading and trailing whitespace gets
chopped off all tokens.
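
A minimal sketch of what that chopping amounts to, assuming the default "C"
locale, where isspace() is true for both '\n' and '\r'. trim_token() is a
hypothetical name for illustration, not the function from my earlier post.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: trims s in place and returns a pointer to the
 * first non-whitespace character. Trailing whitespace, including a
 * '\n' or "\r\n" line ending, is overwritten with '\0'. */
char *trim_token(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))   /* skip leading whitespace */
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;                           /* back up over trailing whitespace */
    *end = '\0';
    return s;
}
```

So a token read as "  42 \r\n" comes out as "42", with or without the line
ending.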
Are you doing the typical C thing of pushing
issues and complexities upward?
No, but you're doing the typical thing of not reading the documentation.
It's also the minimum malloc size. BTW, what if you decide to pass a
negative value for maxfields?
UB, obviously. Trivially fixed with a single line. A bug.
tabs but not spaces? Tabs/spaces are chosen because of their human
readability/editability features.
Yes, but it's difficult to parse tables that contain empty cells or
elements with whitespace in them.
I mean, if you use them as
separators you use them interchangeably, in arbitrary numbers, also
with the possibility that one or the other might not be present.
A quick look at your algorithm makes it look like it will fail if one
pair of entries is only separated by spaces.
You're right: My algorithm gets tripped up when anything that's
isspace() is used as a separator.
Which makes it only barely better than strtok ...
It's re-entrant, and in the usual file-parsing situation (reading csv
data) each line is typically only used once.
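
To make the comparison with strtok concrete, here is a rough sketch of a
re-entrant, in-place splitter. It is not the function from my earlier post:
split_fields() and its parameters are hypothetical, and it assumes a single
separator character. It keeps empty cells, trims whitespace around each
token, and rejects a non-positive maxfields instead of running into UB.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical sketch. Splits line in place at every sep character and
 * stores up to maxfields token pointers in fields[]. Whitespace around
 * each token is trimmed; empty cells are kept as empty strings. All
 * state lives in the arguments, so the function is re-entrant.
 * Cells beyond maxfields are silently dropped.
 * Returns the number of fields stored, or -1 on bad arguments. */
int split_fields(char *line, char sep, char **fields, int maxfields)
{
    int n = 0;

    if (line == NULL || fields == NULL || sep == '\0' || maxfields <= 0)
        return -1;

    while (n < maxfields) {
        char *next = strchr(line, sep);   /* end of this cell, if any */
        char *end;

        if (next != NULL)
            *next = '\0';                 /* terminate the cell */

        while (isspace((unsigned char)*line))
            line++;
        end = line + strlen(line);
        while (end > line && isspace((unsigned char)end[-1]))
            end--;
        *end = '\0';

        fields[n++] = line;
        if (next == NULL)
            break;
        line = next + 1;
    }
    return n;
}
```

With '\t' as the separator, the line " a \t\t b c \r\n" yields the three
fields "a", "" and "b c". The pointers refer into the caller's buffer, which
is fine as long as each line is only used once.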
I specifically didn't want to allocate additional memory for the field's
contents in the function because it's 1) an unnecessary waste of memory
and performance most of the time, and 2) it's easily provided by 2 extra
standard function calls outside my routine.
Ok ... this is another classic case of pushing complexity upwards.
It's better to push complexity upwards than to implement it downstairs
where it, although unneeded in most cases, may lead to resource and
performance penalties. Especially when the "complexity" involves nothing
but one call to strdup() and another one to free().
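
A sketch of what the caller has to do to keep a field beyond the lifetime of
the line buffer. Since strdup() is specified by POSIX rather than by ISO
C90/C99, the hypothetical copy_field() below spells out the equivalent with
malloc() and memcpy(); where strdup() is available it does the same job.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of field, or NULL if
 * the allocation fails. The caller must free() the result. */
char *copy_field(const char *field)
{
    size_t n = strlen(field) + 1;   /* include the terminating '\0' */
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, field, n);
    return copy;
}
```

The copy stays valid after the line buffer is overwritten by the next read,
and a single free() releases it.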
Ok, I'm not sure the OP was asking that you change the policy they
have decided upon.
Hey, I have better things to do than write code for other people. All I
did was give a simple example of how to do efficient string parsing in a
few lines of C.
--Daniel