trimming leading zeros


rayw

Following on from my post yesterday about nums -> binary strings ...

I've written two routines that can trim the leading zeros from the results,
so I call either of these like this:

char * binary3(I_TYPE n)
{
...
return trimZeros1(buffer);

OR

return trimZeros2(buffer);
}

trimZeros1 works fine it seems, but I found a bug in trimZeros2.

First the fixed trimZeros2 code, and then the question.

// Method 2 - use a standard library call to help us.
//
char * trimZeros2(char s[])
{
int n;

// strspn() - Get length of substring composed of given characters.
//
// Scans s character by character, returning the number of characters
// read before the first character not included in "0" is found.

// If strspn() returns a non 0 value, it found a character at 'n'
// that wasn't a '0' ADDED BUT IT COULD BE A '\0'
//
if((n = strspn(s, "0")) != 0 && s[n] != '\0')
{
return &s[n];
}

// s was all 00000000000s.
//
else

{
return s;
}
}

Q.

Originally the crucial lines read

if((n = strspn(s, "0")) != 0)
{
return &s[n];
}

That worked fine until I passed it an all-zeros string such as "000000". Then
strspn() ran all the way to the nul terminator, and the routine returned a
pointer to it (an empty string).

So, I had to add a second condition, ... && s[n] != '\0', to the test.

That seemed a little clumsy, so I 'intuitively' tried to embed the nul into
the "0" string instead

if((n = strspn(s, "0\0")) != 0)

Looks a bit dumb, I know, as the original string ("0") already HAS a nul
terminator at its end, and so I'm assuming, even if this is the correct
syntax, that strspn treats my embedded nul as the end of its 'string of
important chars'?

However, it made me think/wonder whether there IS some way to embed a '\0'
into a quoted string?
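
A quick way to check the assumption about strspn: it takes its second argument as an ordinary string, so it should stop reading the accept set at the first nul and see only "0" either way. A minimal sketch, where both function names are made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: length of the leading run of '0' characters, with
   the accept set written with an explicit embedded nul. strspn() reads
   the accept set as a string, so it stops at the first '\0'. */
size_t span_with_embedded_nul(const char *s)
{
    return strspn(s, "0\0");
}

/* Hypothetical helper: the same thing with the plain "0" accept set. */
size_t span_plain(const char *s)
{
    return strspn(s, "0");
}
```

Both should return the same value for any input, which is why the "0\0" version still needs the s[n] != '\0' test.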


*For completeness, here's trimZeros1*

// Method 1 - DIY.
//
char * trimZeros1(char s[])
{
char * p = NULL;

int n;

// If s starts with a '1', there's nothing to do, as there are no
// leading 0s to trim.
//
if(s[0] != '1')
{
// For each character in s ...
//
for(n = 0; n < strlen(s); ++n)
{
// If we find a '1', AND, as we know that s didn't begin with
// one, we're basically done.
//
if(s[n] == '1')
{
p = &s[n];

break;
}
}

// Check to see that s wasn't all 0000000000s (no '1' found).
//
if(p)
{
return p;
}
}

return s;
}
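
As far as I can tell, the loop in trimZeros1 is doing what strchr() already does: find the first '1', and fall back to s if there isn't one. A sketch under that reading (trimZeros1b is a made-up name, not from the original post):

```c
#include <string.h>

/* Sketch: same result as trimZeros1, using strchr() to find the first '1'.
   If s already starts with '1', strchr() returns s itself.
   If there is no '1' at all (all zeros, or empty), return s unchanged. */
char *trimZeros1b(char s[])
{
    char *p = strchr(s, '1');

    return p != NULL ? p : s;
}
```

This also avoids calling strlen(s) on every pass through the loop.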
 

Mark McIntyre

However, it made me think/wonder whether there IS some way to embed a '\0'
into a quoted string?

By definition, a string terminates at a null, so you can't embed nulls
in a string.

By the way, did you check what value strspn returned when your string
was all zeros? Can you compare that value to something else you know
about the string? That said, your method is probably quicker since
you've already walked the string once with strspn.
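
One way to read the hint: for an all-zeros string, strspn(s, "0") comes back equal to strlen(s). A minimal sketch of that comparison, with trimZerosByLength as a made-up name:

```c
#include <string.h>

/* Sketch of the hint above: compare the span of leading '0's with the
   length of the whole string. */
char *trimZerosByLength(char s[])
{
    size_t n = strspn(s, "0");

    /* n == strlen(s) means every character was '0' (or s is empty),
       so leave the string alone; otherwise skip the leading zeros. */
    return n == strlen(s) ? s : s + n;
}
```

This costs a second walk over the string for strlen(), which is the trade-off mentioned above; testing s[n] != '\0' gets the same answer without it.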
 

pete

rayw wrote:
// Method 2 - use a standard library call to help us.
//
char * trimZeros2(char s[])
{
int n;

// strspn() - Get length of substring composed of given characters.
//
// Scans s character by character, returning the number of characters
// read before the first character not included in "0" is found.

// If strspn() returns a non 0 value, it found a character at 'n'
// that wasn't a '0' ADDED BUT IT COULD BE A '\0'
//
if((n = strspn(s, "0")) != 0 && s[n] != '\0')
{
return &s[n];
}

// s was all 00000000000s.
//
else

{
return s;
}
}

You can write that, this way:

char * trimZeros2(char s[])
{
size_t n = strspn(s, "0");

return s[n] != '\0' ? s + n : s;
}
 

pemo

pete said:
rayw said:
// Method 2 - use a standard library call to help us.
//
char * trimZeros2(char s[])
{
int n;

// strspn() - Get length of substring composed of given characters.
//
// Scans s character by character, returning the number of characters
// read before the first character not included in "0" is found.

// If strspn() returns a non 0 value, it found a character at 'n'
// that wasn't a '0' ADDED BUT IT COULD BE A '\0'
//
if((n = strspn(s, "0")) != 0 && s[n] != '\0')
{
return &s[n];
}

// s was all 00000000000s.
//
else

{
return s;
}
}

You can write that, this way:

char * trimZeros2(char s[])
{
size_t n = strspn(s, "0");

return s[n] != '\0' ? s + n : s;
}

Or even ...

return s[strspn(s, "0")] != '\0' ? s + n : s;

But then we get into readability - or, what was it someone called it,
'maintenance drone' issues?

On another matter ... excessive use of whitespace [above/thread], or
reasonable?

A long time ago, I was dragged screaming into the use of K&R bracing -
another 'row' on the monitor was worth it. However, these days, I liberally
sprinkle WS around the place, and have gone back to what I see as the more
readable bracing style of ...

if(?)
{
..
}

Note that I don't write
if (?)

But I do do

if(x > y) instead of

if(x>y) etc.

My verdict [ excessive use of ..] reasonable and clear.
 

Keith Thompson

Mark McIntyre said:
By definition, a string terminates at a null, so you can't embed nulls
in a string.

No, but you can embed them in a string literal, such as "foo\0bar".

C99 6.4.5, footnote 66:

A character string literal need not be a string (see 7.1.1),
because a null character may be embedded in it by a \0 escape
sequence.

But anything that treats it as a string will ignore everything past
the first '\0' character. For example:

#include <stdio.h>
#include <string.h>
int main()
{
const char s[] = "foo\0bar";
printf("s = \"%s\"\n", s);
printf("strlen(s) = %d\n", (int)strlen(s));
printf("sizeof s = %d\n", (int)sizeof s);
return 0;
}

The output is:

s = "foo"
strlen(s) = 3
sizeof s = 8
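
To make the point concrete: the bytes after the first '\0' are still in the array, they are just invisible to the str* functions. A minimal sketch, where count_nuls is a made-up helper (not a library function) that walks a buffer by length rather than by terminator:

```c
#include <stddef.h>

/* Hypothetical helper: count the null characters in the first len bytes
   of buf. It never treats '\0' as an end marker; len decides where to stop. */
size_t count_nuls(const char *buf, size_t len)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < len; ++i) {
        if (buf[i] == '\0') {
            ++count;
        }
    }

    return count;
}
```

With the s from the example above, count_nuls(s, sizeof s) should come out as 2: the embedded one plus the terminator the compiler appends.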
 

Ed Prochak

Keith said:
Mark McIntyre said:
By definition, a string terminates at a null, so you can't embed nulls
in a string.

No, but you can embed them in a string literal, such as "foo\0bar".

C99 6.4.5, footnote 66:

A character string literal need not be a string (see 7.1.1),
because a null character may be embedded in it by a \0 escape
sequence.

But anything that treats it as a string will ignore everything past
the first '\0' character. For example:

#include <stdio.h>
#include <string.h>
int main()
{
const char s[] = "foo\0bar";
printf("s = \"%s\"\n", s);
printf("strlen(s) = %d\n", (int)strlen(s));
printf("sizeof s = %d\n", (int)sizeof s);
return 0;
}

The output is:

s = "foo"
strlen(s) = 3
sizeof s = 8

And it is important to realize this fact. For example the ORACLE C
interface uses counted strings. One bug I had a heck of a time finding
was when one C program inserted data with an embedded nul without
changing the string count. After the insert, using MS ACCESS to query
the data showed the right data but we could not select it. Using ORACLE
showed the entire string (foobar) but also could not select it. It
looked like the database was corrupt!

Point is nul is a valid character even though C gives it special
treatment in character operations.

Program carefully out there!
Ed
 

Chuck F.

Keith said:
Mark McIntyre said:
By definition, a string terminates at a null, so you can't embed
nulls in a string.

No, but you can embed them in a string literal, such as "foo\0bar".

C99 6.4.5, footnote 66:

A character string literal need not be a string (see 7.1.1),
because a null character may be embedded in it by a \0 escape
sequence.

But anything that treats it as a string will ignore everything
past the first '\0' character. For example:

#include <stdio.h>
#include <string.h>
int main()
{
const char s[] = "foo\0bar";
printf("s = \"%s\"\n", s);
printf("strlen(s) = %d\n", (int)strlen(s));
printf("sizeof s = %d\n", (int)sizeof s);
return 0;
}

The output is:

s = "foo"
strlen(s) = 3
sizeof s = 8

Another probably not so amusing (to the maintainer) game would be
to define multiple lines in one constant:

char *s = "Now is\0the time\0for all silly fools\0";
/* I assume an extra \0 will be automatically appended */

void dumpstuff(char *s) /* needs \0\0 termination */
{
while (*s) {
while (*s) putchar(*s++);
putchar('\n');
s++;
}
}

This could be used to dump the MSDOS environment.
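
The same walk can be written with strlen() doing the inner loop. A sketch, assuming the list ends with an empty entry (two nuls in a row) as in dumpstuff; count_entries is a made-up name:

```c
#include <stddef.h>
#include <string.h>

/* Sketch: count the entries in a list of nul-separated strings that is
   terminated by an empty string (i.e. "\0\0" at the end). */
size_t count_entries(const char *list)
{
    size_t count = 0;

    while (*list != '\0') {
        /* Skip this entry and its terminating nul. */
        list += strlen(list) + 1;
        ++count;
    }

    return count;
}
```

For the literal above this should give 3. Note it relies on the explicit trailing \0 in the literal plus the one the compiler appends; without the explicit one, the walk would run off the end of the array.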
 

pete

pemo said:
char * trimZeros2(char s[])
{
size_t n = strspn(s, "0");

return s[n] != '\0' ? s + n : s;
}

Or even ...

return s[strspn(s, "0")] != '\0' ? s + n : s;

Why would you want to call strspn twice
with the same arguments,
when you already have the return value stored in n?
 

pemo

pete said:
pemo said:
char * trimZeros2(char s[])
{
size_t n = strspn(s, "0");

return s[n] != '\0' ? s + n : s;
}

Or even ...

return s[strspn(s, "0")] != '\0' ? s + n : s;

Why would you want to call strspn twice
with the same arguments,
when you already have the return value stored in n?

'fraid it's 'brain fade' on my part - duh.
 
