Partial string loss with sprintf/strcat

N

Neal Barney

I have a C program which runs on a device using a Zilog Z180
microprocessor. While it can address 1MB of RAM, it can only address
64KB at any given time. And of that only 16KB can be used for "stack
and heap space". So I'm running in a very memory constricted
environment.

The program "speaks" a proprietary protocol which sends ASCII strings
back and forth from the device to the server. Within the past couple of
months we've been noticing errors in our server logs and people have
been complaing of failures. Upon viewing server logs we've noticed the
string the device is sending is missing it's header.

The string gets built like so:

/* Global variable */
char Message[512];

/* Local variable */
char tmpMsg[512];

/* var1 - var4 (local variables) get populated with values... */

/* Build the header */
sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s",
var1, var2, var3, var4 );

/* X's are alphas and zeros are numbers */

if ( conditionTrue )
strcat( tmpMsg, ",S" );
else
strcat( tmpMsg, ",H" );

/* Continue building the rest of the string... */

/* Put msg length at beginning of string*/
len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );

return;


Now when the device transmits the string to the server, the server logs
something similar to:

0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
or sometimes something like
0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)

The entire header appears to be getting lost. The calculated message
length matches the size of the string without the header, which may
suggest:

a) The sprintf that places the header in tmpMsg is failing (but why?).
b) A bug in strcat for the C library I'm using is wiping out the
string/starting at the beginning of the string. (It doesn't do it later
in the same function or elsewhere in the program).
c) "Something else" is walking on that part of memory.
d) ...?

This is not 100% reproduceable (in fact, in the past 9 months I have
only personally seen it happen once on my test device, and I can't
reproduce it there). But we are seeing this happen in production
multiple times a day from different devices. Unplugging the device and
plugging it back in will often "fix" the problem (though it may occur
again in the future).

What are the most likely causes for this problem? Solutions?
 
A

Andrew Poelstra

The program "speaks" a proprietary protocol which sends ASCII strings
back and forth from the device to the server. Within the past couple of
months we've been noticing errors in our server logs and people have
been complaing of failures. Upon viewing server logs we've noticed the
string the device is sending is missing it's header.

The string gets built like so:

/* Global variable */
char Message[512];

/* Local variable */
char tmpMsg[512];

/* var1 - var4 (local variables) get populated with values... */

/* Build the header */
sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s",
var1, var2, var3, var4 );

Are you sure that var1 - var4 take up no more than 465 bytes?
/* X's are alphas and zeros are numbers */

if ( conditionTrue )
strcat( tmpMsg, ",S" );
else
strcat( tmpMsg, ",H" );

Looks like I meant 463 bytes...
/* Continue building the rest of the string... */

....or less. Are you sure that a 512-byte buffer is enough? If not, you
need to have some way of dealing with running out of space. Perhaps you
could eliminate 4-digit dates (they aren't too important in human-read
logs), or detailed timestamp information. It really depends on what you
need from the log files, and what your design is.

If you overflowed your buffer in the above code, you've left the realm
of C, and have entered Honah Lee.
Now when the device transmits the string to the server, the server logs
something similar to:

0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
or sometimes something like
0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)

The entire header appears to be getting lost.

Which part is the header? What exactly should be displayed that isn't?
This is not 100% reproduceable (in fact, in the past 9 months I have
only personally seen it happen once on my test device, and I can't
reproduce it there). But we are seeing this happen in production
multiple times a day from different devices. Unplugging the device and
plugging it back in will often "fix" the problem (though it may occur
again in the future).

What are the most likely causes for this problem? Solutions?

Sure sounds like a buffer overflow, or some other undefined memory issue.
 
E

Eric Sosman

Neal said:
I have a C program which runs on a device using a Zilog Z180
microprocessor. While it can address 1MB of RAM, it can only address
64KB at any given time. And of that only 16KB can be used for "stack
and heap space". So I'm running in a very memory constricted
environment.

The program "speaks" a proprietary protocol which sends ASCII strings
back and forth from the device to the server. Within the past couple of
months we've been noticing errors in our server logs and people have
been complaing of failures. Upon viewing server logs we've noticed the
string the device is sending is missing it's header.

The string gets built like so:

/* Global variable */
char Message[512];

/* Local variable */
char tmpMsg[512];

/* var1 - var4 (local variables) get populated with values... */

/* Build the header */
sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s",
var1, var2, var3, var4 );

/* X's are alphas and zeros are numbers */

if ( conditionTrue )
strcat( tmpMsg, ",S" );
else
strcat( tmpMsg, ",H" );

/* Continue building the rest of the string... */

/* Put msg length at beginning of string*/
len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );

return;


Now when the device transmits the string to the server, the server logs
something similar to:

0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
or sometimes something like
0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)

Like Andrew Poelstra, I was at first suspicious of a buffer
overflow. But the lengths shown in these fragments don't support
that hypothesis, not directly at any rate.

I think you need to show us actual code, not paraphrases.
The bug might lie in "Continue building the rest of the string"
or even in the way Message and tmpMsg are declared (what you've
shown is obviously not the full story; it doesn't, for example,
show how one of them gets to be "global" and the other "local").

If you don't want to show your (proprietary) code and would
prefer to debug on your own, I'd suggest adding a few printf()
calls to output interesting values. For example, the value
returned by sprintf() might be interesting, especially if it
is markedly different from the value returned by strlen(). I'm
thinking of something along the lines of

tmpLen = sprintf(tmpMsg, ...);
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

if (conditionTrue)
strcat(tmpMsg, ",S");
else
strcat(tmpMsg, ",H");
tmpLen += 2;
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

/* Continue building the rest of the string... */

len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );
if (len + 4 != strlen(Message))
print_everything_you_can_and_die();
 
J

jacob navia

Eric Sosman a écrit :
If you don't want to show your (proprietary) code and would
prefer to debug on your own, I'd suggest adding a few printf()
calls to output interesting values. For example, the value
returned by sprintf() might be interesting, especially if it
is markedly different from the value returned by strlen(). I'm
thinking of something along the lines of

tmpLen = sprintf(tmpMsg, ...);
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

This is a very good proposition. It means to try to verify
what each part of the software is doing. The result of the
sprintf call is a good starting point to start adding redundant
verifications, (also called sanity checks!!!) into the software.

When something like this happens, MANY things can be the reason
for the observed symptoms: run time data corruption, buffer
overflows, etc.

Starting with the result of the sprintf call
you can verify also that it is less than the length of the
buffer!!!
 
N

Neal Barney

Eric said:
Like Andrew Poelstra, I was at first suspicious of a buffer
overflow. But the lengths shown in these fragments don't support
that hypothesis, not directly at any rate.

I think you need to show us actual code, not paraphrases.
The bug might lie in "Continue building the rest of the string"
or even in the way Message and tmpMsg are declared (what you've
shown is obviously not the full story; it doesn't, for example,
show how one of them gets to be "global" and the other "local").

If you don't want to show your (proprietary) code and would
prefer to debug on your own, I'd suggest adding a few printf()
calls to output interesting values. For example, the value
returned by sprintf() might be interesting, especially if it
is markedly different from the value returned by strlen(). I'm
thinking of something along the lines of

tmpLen = sprintf(tmpMsg, ...);
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

if (conditionTrue)
strcat(tmpMsg, ",S");
else
strcat(tmpMsg, ",H");
tmpLen += 2;
if (tmplen != strlen(tmpMsg))
print_everything_you_can_and_then_die();

/* Continue building the rest of the string... */

len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );
if (len + 4 != strlen(Message))
print_everything_you_can_and_die();

My first suspicion was also a buffer overflow. However, the header
should always be between 78 and 85 characters. The entire message
rarely exceeds 200 bytes so a 512 byte buffer -should- be safe, but an
overflow is still possible. After trolling through the code for nearly
a week, I have yet to find where it is overflowing if that is what is
happening.

I understand that it's hard to know what advice to give with the
mocked-up code I presented. I would post the actual code, but it's not
-my- code, it's that of my employer and I'm sure they wouldn't
appreciate it. :)

I do like your suggestion of testing the string length at every step of
the code and dying if it doesn't meet the lengths that I expect. The
only issue that still leaves for me is that I am unable to reproduce it
myself on my development box. This code works about 95-98% of the time
for those using it production (though some people seem to get it more
than their fair share). I guess if it did die on them in production,
I'd have a better idea of where the problem was.

Thank you for everyone who has responded so far. If anyone else has any
ideas or suggestions, I'm still open to them.
 
F

Flash Gordon

Neal Barney wrote:

I do like your suggestion of testing the string length at every step of
the code and dying if it doesn't meet the lengths that I expect. The
only issue that still leaves for me is that I am unable to reproduce it
myself on my development box. This code works about 95-98% of the time
for those using it production (though some people seem to get it more
than their fair share). I guess if it did die on them in production,
I'd have a better idea of where the problem was.

Thank you for everyone who has responded so far. If anyone else has any
ideas or suggestions, I'm still open to them.

Could you get it to log data that would overflow the buffer somewhere
you (or a customer) could retrieve the log then take some appropriate
recovery action?
 
E

Eric Sosman

Neal said:
[...]
I do like your suggestion of testing the string length at every step of
the code and dying if it doesn't meet the lengths that I expect. The
only issue that still leaves for me is that I am unable to reproduce it
myself on my development box. This code works about 95-98% of the time
for those using it production (though some people seem to get it more
than their fair share). I guess if it did die on them in production,
I'd have a better idea of where the problem was.

The important thing isn't really that the program die,
but that it die noisily, having logged everything that might
conceivably be of value in diagnosing the problem. When a
bug is hard to reproduce, it is particularly important to
capture as much information as possible.

One way to do this, of course, is to output everything
you can every time you go through the suspect code. That has
an unfortunate tendency to drown you with too much data, so a
technique that's fairly frequently employed is to log everything
to a circular buffer, going merrily 'round and 'round as long
as everything looks all right. Then when something screwy is
detected, you dump the buffer containing (by now) the information
collected from the last dozen or hundred or whatever log points.

Having captured the bad trace, it isn't even essential that
the program actually die -- but from your description, it seems
that the bogus transmission may have disrupted communication
with the other end of the wire, and the program might have some
difficulty continuing "normal operations" anyhow.
 
I

Ivanna Pee

Neal said:
I have a C program which runs on a device using a Zilog Z180
microprocessor. While it can address 1MB of RAM, it can only address
64KB at any given time. And of that only 16KB can be used for "stack
and heap space". So I'm running in a very memory constricted
environment.

The program "speaks" a proprietary protocol which sends ASCII strings
back and forth from the device to the server. Within the past couple of
months we've been noticing errors in our server logs and people have
been complaing of failures. Upon viewing server logs we've noticed the
string the device is sending is missing it's header.

The string gets built like so:

/* Global variable */
char Message[512];

/* Local variable */
char tmpMsg[512];

/* var1 - var4 (local variables) get populated with values... */

/* Build the header */
sprintf( tmpMsg, "|XX|%s|0|00.00|XXXX:XXXX|XXXX:%s|XXXX:%s|XXXX:%s",
var1, var2, var3, var4 );

/* X's are alphas and zeros are numbers */

if ( conditionTrue )
strcat( tmpMsg, ",S" );
else
strcat( tmpMsg, ",H" );

/* Continue building the rest of the string... */

/* Put msg length at beginning of string*/
len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );

return;


Now when the device transmits the string to the server, the server logs
something similar to:

0089,S|XXXX:XXXX|XXXX:XXXX|... (and so on)
or sometimes something like
0093 ,S|XXXX:XXXX|XXXX:XXXX|... (and so on)

The entire header appears to be getting lost. The calculated message
length matches the size of the string without the header, which may
suggest:

a) The sprintf that places the header in tmpMsg is failing (but why?).
b) A bug in strcat for the C library I'm using is wiping out the
string/starting at the beginning of the string. (It doesn't do it later
in the same function or elsewhere in the program).
c) "Something else" is walking on that part of memory.
d) ...?

This is not 100% reproduceable (in fact, in the past 9 months I have
only personally seen it happen once on my test device, and I can't
reproduce it there). But we are seeing this happen in production
multiple times a day from different devices. Unplugging the device and
plugging it back in will often "fix" the problem (though it may occur
again in the future).

What are the most likely causes for this problem? Solutions?

Does the transmitting program crash when this occurs? It doesn't sound
like it if the problem data packet is recieved at the other end.

I would guess that during "Continue building the rest of the string"
you are stomping on tmpMsg[0], perhaps overrunning the local data on
the stack just below it.

How often does this thing transmit? Speed up the xmission interval on
the test setup to hopefully increase the frequency at which it fails,
and take a look at the stack when it does.
 
D

Dave Thompson

I have a C program which runs on a device using a Zilog Z180
microprocessor. While it can address 1MB of RAM, it can only address
64KB at any given time. And of that only 16KB can be used for "stack
and heap space". So I'm running in a very memory constricted
environment.

The program "speaks" a proprietary protocol which sends ASCII strings
back and forth from the device to the server. Within the past couple of
months we've been noticing errors in our server logs and people have
been complaing of failures. Upon viewing server logs we've noticed the
string the device is sending is missing it's header.

The string gets built like so:

/* Global variable */
char Message[512];

/* Local variable */
char tmpMsg[512];
/* Build the header */
sprintf( tmpMsg, <snip>
/* Continue building the rest of the string... */

/* Put msg length at beginning of string*/
len = strlen( tmpMsg );
sprintf( Message, "%04d%s", len, tmpMsg );

return;


Now when the device transmits the string to the server, <snip>
The entire header appears to be getting lost. The calculated message
length matches the size of the string without the header, which may
suggest:

a) The sprintf that places the header in tmpMsg is failing (but why?).
b) A bug in strcat for the C library I'm using is wiping out the
string/starting at the beginning of the string. (It doesn't do it later
in the same function or elsewhere in the program).
c) "Something else" is walking on that part of memory.
d) ...?

This is not [very] reproduceable <snip>

Like other responders I don't see any likely culprits in the partial
code you posted. As already suggested collecting information about the
state and sequence(s) leading up to a fault is likely to be helpful in
narrowing down the suspects, and/or helping you construct cases that
will reproduce it in test. I realize general advice like this is not
much more helpful than telling you that candy tastes good.

It may be obvious, but if you haven't done so already I would inspect
the generated (and I assume shipped) assembler/machine code for your
routine and if reasonably possible (and not prohibited) for the
library routines your code calls (sprintf, strcat, etc.) just to make
sure the compiler/implementor isn't doing something dumb somewhere.
(But that part would be offtopic for c.l.c, except maybe for questions
about exactly what standard-library routines are required to do,
especially in corner cases, which I don't see you approaching.)

I do however see something I would do differently in a memory
constrained environment (and maybe even a normal one) which might or
might not be of help to you. Instead of building the message body in
one buffer (tmpMsg) and then copying it plus a small fixed-length
prefix into another (Message), do something like:
size_t used = sprintf (&Message[4], "format", headerargs);
strcpy (&Message[4+used], nextitem); used += nextitemlen;
etc., bumping used in the process as long as not too difficult
char tmpLen [4+1];
sprintf (tmpLen, "%04d", used /* or strlen (Message+4) */ );
memcpy (&Message[0], tmpLen, 4);
/* you can simplify &Message[n] to Message+n and particularly
&Message[0] to Message if your compiler is so badly lame it doesn't */

If the code generated, by your compiler for your target*, to access
(within) the global/static buffer is less efficient, use a pointer
into it instead:
/* register? */ char * putMsg = &Message[4];
putMsg += sprintf (putMsg, "format", headerargs); etc.
(* For 8080 and IIRC Z80, linker-resolved globals or statics are
actually better, but I don't know if Z180 is different here.)

Also, if you have snprintf (or _snprintf or somesuch) available in
your library, you might use that instead. (It is standard in C99 for
hosted environments, but I'd guess the implementation you're using
isn't even trying to conform at that level.) For the code and data you
posted it doesn't look likely that you're overflowing, but it could be
arbitrarily bad if you do, so if you can be certain at reasonable cost
you don't it's probably worth it just to rule that out.

- David.Thompson1 at worldnet.att.net
 
N

Neal Barney

Like other responders I don't see any likely culprits in the partial
code you posted. As already suggested collecting information about the
state and sequence(s) leading up to a fault is likely to be helpful in
narrowing down the suspects, and/or helping you construct cases that
will reproduce it in test. I realize general advice like this is not
much more helpful than telling you that candy tastes good.

Well, I suppose it is a small comfort to me that I'm not crazy and didn't
overlook something patently obvious. After further research I do believe
that I have a better idea of what is going on. The stack and heap size
must be pre-declared and it appears as if the declared stack size is too
small. The stack space is being exceeded (or so it appears). I have
given the stack an additional 1000 bytes and will do further testing to
see if this resolves the issue. If it doesn't I'm in trouble. :) With
that change to the stack size that puts my memory usage at 16302 bytes out
of 16384. Talk about pushing the limits... :-O
It may be obvious, but if you haven't done so already I would inspect
the generated (and I assume shipped) assembler/machine code for your
routine and if reasonably possible (and not prohibited) for the library
routines your code calls (sprintf, strcat, etc.) just to make sure the
compiler/implementor isn't doing something dumb somewhere. (But that
part would be offtopic for c.l.c, except maybe for questions about
exactly what standard-library routines are required to do, especially in
corner cases, which I don't see you approaching.)

A good suggestion but it's been years since I inspected assembler level
code. And this machine uses a modified variant of the "Amsterdam
compiler" that compiles to EM byte-code. The machine runs an interpreter
that translates the EM code to native machine calls. I suppose there may
be enough information out there to learn more about the EM byte code and
figure things out, but that will probably be a last resort.
I do however see something I would do differently in a memory
constrained environment (and maybe even a normal one) which might or
might not be of help to you. Instead of building the message body in one
buffer (tmpMsg) and then copying it plus a small fixed-length prefix
into another (Message), do something like:
size_t used = sprintf (&Message[4], "format", headerargs); strcpy
(&Message[4+used], nextitem); used += nextitemlen; etc., bumping used
in the process as long as not too difficult char tmpLen [4+1]; sprintf
(tmpLen, "%04d", used /* or strlen (Message+4) */ ); memcpy
(&Message[0], tmpLen, 4);
/* you can simplify &Message[n] to Message+n and particularly
&Message[0] to Message if your compiler is so badly lame it doesn't */

I already rewrote the function to do exactly what you are suggesting
(though implemented a little differently). It should now be more
efficient as well as likely being easier to see which call is causing me
the issue. We'll see if this makes a difference.
If the code generated, by your compiler for your target*, to access
(within) the global/static buffer is less efficient, use a pointer into
it instead:
/* register? */ char * putMsg = &Message[4]; putMsg += sprintf
(putMsg, "format", headerargs); etc.
(* For 8080 and IIRC Z80, linker-resolved globals or statics are
actually better, but I don't know if Z180 is different here.)

Also, if you have snprintf (or _snprintf or somesuch) available in your
library, you might use that instead. (It is standard in C99 for hosted
environments, but I'd guess the implementation you're using isn't even
trying to conform at that level.) For the code and data you posted it
doesn't look likely that you're overflowing, but it could be arbitrarily
bad if you do, so if you can be certain at reasonable cost you don't
it's probably worth it just to rule that out.

I do not have a snprintf of any sort available to me. The compiler I'm
using is ANSI C89 conforming, but that's as far as it goes (The compiler
was written in 1993). I thought about using a third party library (such
as trio), but in the end I decided that it wasn't worth linking against
any additional code. Size is too important when the program gets
downloaded at 2400/1200baud.

I appreciate all of the replies I have received. I apologize for asking
the question and then more-or-less disappearing. I was out of town on
training and didn't have Usenet access. But all of the suggestions have
been much appreciated.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,818
Latest member
Brigette36

Latest Threads

Top