"Bus Error" related to compiler option

Chul Min Kim

Hi,

I got a bus error from one of my company's programs.
Let me briefly describe our environment.

Machine : Sun E3500 (4 x UltraSPARC II 400 MHz CPUs)
OS : Solaris 7
Compiler : Sun WorkShop 5.0 cc compiler

I get "Bus Error" only when I run the binary built with the "-g"
compiler option.

The binary built with the "-O" option is fine: no bus error.

I found the FAQ entry related to "Bus Error" and have already read the
articles on this issue. "Bus Error" is related to misalignment.

My question is why the bus error does not occur with the binary built
with the "-O" compiler option.

Here is the source code.

/* SAMPLE CODE - BEGIN */
typedef struct _SVC10336_ {
    int a;
    char b;
    char c;
} svc10336_t;

main()
{
    char reqbuf[9454];

    memset((char *)&reqbuf, 0x00, sizeof(reqbuf));

    {
        int a = 100;

        memcpy(reqbuf, &a, 4);
        reqbuf[4] = '3';
        reqbuf[5] = '4';
    }

    svc_start(reqbuf);
}

svc_start(char *reqbuf)
{
    tr_svc10336(reqbuf);
}

tr_svc10336(char *reqbuf)
{
    svc10336_t *svc10336;
    int a;
    char b;
    char c;

    svc10336 = (svc10336_t *)reqbuf;

    a = svc10336->a;
    b = svc10336->b;
    c = svc10336->c;

    printf("a[%d] b[%c] c[%c]\n", a,b,c);
}
/* SAMPLE CODE - END */


Here are the commands I use to build the binary.

(1) $ cc -g -o sample sample.c

This binary fails with "Bus Error".

(2) $ cc -O -o sample sample.c

This binary runs OK.
 
Christian Kandeler

Chul said:
I found the FAQ entry related to "Bus Error" and have already read the
articles on this issue. "Bus Error" is related to misalignment.

Yes, that seems plausible. See below.
int a = 100;
memcpy(reqbuf, &a, 4);

This is not related to the error, but the last argument to memcpy() really
should be sizeof a. Do it, it doesn't cost you anything.
/* SAMPLE CODE - BEGIN */
typedef struct _SVC10336_ {
int a;
char b;
char c;
} svc10336_t;

[ ... ]
tr_svc10336(char *reqbuf)
{
svc10336_t *svc10336;
int a;
char b;
char c;

svc10336 = (svc10336_t *)reqbuf;

This cast is not legal. A struct may have a certain alignment requirement
(in this case: probably an address that's a multiple of four bytes) while
the char array-turned-pointer does not. The bus error has probably not
happened yet, but the bomb is ticking...
a = svc10336->a;

... and now it's going off, as you are trying to read an int from an address
where an int may not be stored.
(1) $ cc -g -o sample sample.c
    this binary goes to "Bus Error".

(2) $ cc -O -o sample sample.c
   this binary OK.

Coincidence. Perhaps the char array happens to be at a suitable address in
the latter case. Do not count on that! You _must_ get rid of the cast,
since it invokes undefined behavior.
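
To see whether the array merely happens to land on a suitable address in one build, the address can be tested against the struct's alignment. The following is a minimal sketch, not code from the original post. The names align_probe, SVC10336_ALIGN and is_aligned_for_svc10336 are made up for illustration, and it assumes a C99 compiler with <stdint.h>, that converting a pointer to uintptr_t yields its numeric address, and that the compiler inserts only the minimum padding in the probe struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int a;
    char b;
    char c;
} svc10336_t;

/* Hypothetical probe struct: on common implementations the offset of `s`
   equals the alignment required for svc10336_t (a pre-C11 stand-in for
   _Alignof). This is an assumption, not a guarantee of the standard. */
struct align_probe {
    char pad;
    svc10336_t s;
};

#define SVC10336_ALIGN offsetof(struct align_probe, s)

/* Returns 1 if the address p is a multiple of SVC10336_ALIGN, else 0.
   Assumes the uintptr_t conversion gives the numeric address. */
int is_aligned_for_svc10336(const void *p)
{
    assert(p != NULL);
    return ((uintptr_t)p % SVC10336_ALIGN) == 0;
}
```

Calling is_aligned_for_svc10336(reqbuf) in both the -g and the -O build would show whether the two builds place the array differently. Even if it returns 1 in one of them, the cast remains something you cannot rely on.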


Christian
 
Michael Mair

Chul said:
Hi,

I got a bus error from one of my company's programs.
Let me briefly describe our environment.

Machine : Sun E3500 (4 x UltraSPARC II 400 MHz CPUs)
OS : Solaris 7
Compiler : Sun WorkShop 5.0 cc compiler

I get "Bus Error" only when I run the binary built with the "-g"
compiler option.

The binary built with the "-O" option is fine: no bus error.

I found the FAQ entry related to "Bus Error" and have already read the
articles on this issue. "Bus Error" is related to misalignment.

My question is why the bus error does not occur with the binary built
with the "-O" compiler option.

Here is the source code.

Provide a minimal example that gives no excess warnings.
/* SAMPLE CODE - BEGIN */
typedef struct _SVC10336_ {
int a;
char b;
char c;
} svc10336_t;

main()

int main ()
or
int main (void)
{
char reqbuf[9454];

This buffer may be arbitrarily aligned.
memset((char *)&reqbuf, 0x00, sizeof(reqbuf));

No prototype for memset so far. #include <string.h>
{
int a = 100;

memcpy(reqbuf, &a, 4);
Error: You assume that sizeof a is 4.
You want: memcpy(reqbuf, &a, sizeof a);
Note: Make reqbuf an array of unsigned char rather than char
to be guaranteed that this works as intended.
Once again: #include <string.h>
reqbuf[4] = '3';

You mean: reqbuf[offsetof(struct _SVC10336_, b)]
(at least if you want to build the same structure).
Use memcpy() for everything to be on the safe side.
reqbuf[5] = '4';

Ditto.
}

svc_start(reqbuf);
No prototype for this function.

return 0;
}

svc_start(char *reqbuf)
{
tr_svc10336(reqbuf);
No prototype for this function.
}

tr_svc10336(char *reqbuf)
{
svc10336_t *svc10336;
int a;
char b;
char c;

svc10336 = (svc10336_t *)reqbuf;

a = svc10336->a;

Stupid error: svc10336_t may require X-byte alignment with X > 1,
while reqbuf is only guaranteed to be 1-aligned.
Rule: If you memcpy() it into your buffer, also memcpy() it out
of it.
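
The memcpy-in, memcpy-out rule can be sketched as follows. This is an illustration only, not code from the thread: svc10336_pack and svc10336_unpack are hypothetical helper names, the length checks are my own addition, and the buffer is unsigned char as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int a;
    char b;
    char c;
} svc10336_t;

/* Write the three fields into buf at the offsets they have inside
   svc10336_t. buf may have any alignment. */
void svc10336_pack(unsigned char *buf, size_t len, int a, char b, char c)
{
    assert(buf != NULL && len >= sizeof(svc10336_t));
    memcpy(buf + offsetof(svc10336_t, a), &a, sizeof a);
    memcpy(buf + offsetof(svc10336_t, b), &b, sizeof b);
    memcpy(buf + offsetof(svc10336_t, c), &c, sizeof c);
}

/* Copy the bytes into a real svc10336_t object, which the compiler has
   aligned correctly, instead of casting the buffer pointer. */
void svc10336_unpack(svc10336_t *rec, const unsigned char *buf, size_t len)
{
    assert(rec != NULL && buf != NULL && len >= sizeof *rec);
    memcpy(rec, buf, sizeof *rec);
}
```

After svc10336_unpack() the fields are read from the local struct, so no int is ever loaded directly from the byte buffer, whatever address it has.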
b = svc10336->b;
c = svc10336->c;

printf("a[%d] b[%c] c[%c]\n", a,b,c);

No prototype so far: #include <stdio.h>
}
/* SAMPLE CODE - END */


Here are the commands I use to build the binary.

(1) $ cc -g -o sample sample.c

This binary fails with "Bus Error".

(2) $ cc -O -o sample sample.c

This binary runs OK.

If your compiler does not provide any warnings for the above code,
turn up the warning level. If this is not possible, use *lint
(I recommend splint).


Cheers
Michael
 
Giorgos Keramidas

Hi,

I got a bus error from one of my company's programs.
Let me briefly describe our environment.

Machine : Sun E3500 (4 x UltraSPARC II 400 MHz CPUs)
OS : Solaris 7
Compiler : Sun WorkShop 5.0 cc compiler
[...]

Check the warnings printed by Sun's cc and make sure you fix all of
them first. You really should make it a habit to compile programs
using at least the -v -Xa options of Sun's C compiler. I've copied
the warnings between the (numbered) lines of the original source.
The lines that start with "warning:" are output from Sun's cc:

1 typedef struct _SVC10336_ {
2 int a;
3 char b;
4 char c;
5 } svc10336_t;
6
7 main()

In portable C, main() should be defined in one of these two forms:

int main(void)
int main(int argc, char **argv)

What you have here relies on implicit int, which C99 no longer allows.

8 {
9 char reqbuf[9454];
10

Magic constants like 9454 are bound to bite you one day. Use a proper
#define for this one.

11 memset((char *)&reqbuf, 0x00, sizeof(reqbuf));

warning: implicitly declaring function to return int: memset()

12
13 {
14 int a = 100;
15

Nested blocks that 'fake' the declaration of locally-scoped variables in
short functions like your current main() look like extremely bad
style. What's wrong with declaring `a' a bit higher, where reqbuf[]
is declared?

16 memcpy(reqbuf, &a, 4);

warning: implicitly declaring function to return int: memcpy()

Additionally, there is no guarantee that sizeof(int) == 4.

17 reqbuf[4] = '3';
18 reqbuf[5] = '4';
19 }
20
21 svc_start(reqbuf);

warning: implicitly declaring function to return int: svc_start()

22 }

warning: Function has no return statement : main

23
24 svc_start(char *reqbuf)
25 {
26 tr_svc10336(reqbuf);

warning: implicitly declaring function to return int: tr_svc10336()

27 }

warning: Function has no return statement : svc_start

28
29 tr_svc10336(char *reqbuf)
30 {
31 svc10336_t *svc10336;
32 int a;
33 char b;
34 char c;
35
36 svc10336 = (svc10336_t *)reqbuf;

The alignment of arrays holding char values is not necessarily fit for
objects of the (svc10336_t) type. This part is quite probably the
cause of your bus errors.

37
38 a = svc10336->a;
39 b = svc10336->b;
40 c = svc10336->c;
41
42 printf("a[%d] b[%c] c[%c]\n", a,b,c);

warning: implicitly declaring function to return int: printf()

43 }

warning: Function has no return statement : tr_svc10336
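
One way to act on both the alignment remark at line 36 and the magic-constant remark at line 9 is to wrap the buffer in a union with the struct, so that the compiler aligns the bytes for svc10336_t. This is a minimal sketch under that assumption, not a drop-in fix: REQBUF_SIZE, reqbuf_t, reqbuf_fill and reqbuf_record are names invented here, and reading the record back through the union relies on C99 (as amended by TC3) and later permitting this kind of access.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REQBUF_SIZE 9454   /* named constant instead of a bare 9454 */

typedef struct {
    int a;
    char b;
    char c;
} svc10336_t;

/* The union is aligned for its strictest member, so `bytes` starts at
   an address that is also valid for svc10336_t. */
typedef union {
    svc10336_t rec;
    char bytes[REQBUF_SIZE];
} reqbuf_t;

/* Fill the raw bytes the way the original main() does, but with
   sizeof and offsetof instead of hard-coded 4, 4 and 5. */
void reqbuf_fill(reqbuf_t *req, int a, char b, char c)
{
    assert(req != NULL);
    memset(req->bytes, 0, sizeof req->bytes);
    memcpy(req->bytes + offsetof(svc10336_t, a), &a, sizeof a);
    req->bytes[offsetof(svc10336_t, b)] = b;
    req->bytes[offsetof(svc10336_t, c)] = c;
}

/* View the start of the buffer as a record; no pointer cast needed. */
const svc10336_t *reqbuf_record(const reqbuf_t *req)
{
    assert(req != NULL);
    return &req->rec;
}
```

With this layout, tr_svc10336() could take a const reqbuf_t * and read reqbuf_record(req)->a directly. If the buffer really arrives from elsewhere as a plain char *, copying it into a local svc10336_t with memcpy() is the safer route.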
 
Martin Ambuhl

Chul said:
Hi, [...]
Here is source code.

I doubt it, unless you purposely omit required headers and omit prototypes.

How does the following, somewhat better formed code work for you?
#include <string.h>
#include <stdio.h>

void svc_start(char *reqbuf);
void tr_svc10336(char *reqbuf);

typedef struct
{
    int a;
    char b;
    char c;
} svc10336_t;

int main(void)
{
    char reqbuf[9454];

    memset(reqbuf, 0x00, sizeof(reqbuf));
    {
        int a = 100;
        memcpy(reqbuf, &a, 4);
        reqbuf[4] = '3';
        reqbuf[5] = '4';
    }
    svc_start(reqbuf);
    return 0;
}

void svc_start(char *reqbuf)
{
    tr_svc10336(reqbuf);
}

void tr_svc10336(char *reqbuf)
{
    svc10336_t *svc10336;
    int a;
    char b;
    char c;
    svc10336 = (svc10336_t *) reqbuf;

    a = svc10336->a;
    b = svc10336->b;
    c = svc10336->c;
    printf("a[%d] b[%c] c[%c]\n", a, b, c);
}
 
Randy Howard

Yes, that seems plausible. See below.


It can also happen in some bizarre, unexpected places. For example,
I saw this just yesterday. ssh into a 2.6 Linux system. CD into a
build tree, start a make. Hit ctrl-c during the make, get your
prompt back, with "Bus Error".

Reproducible, but intermittent. Not quite sure how to explain that
one away easily as alignment.
 
Chris Croughton

It can also happen in some bizarre, unexpected places. For example,
I saw this just yesterday. ssh into a 2.6 Linux system. CD into a
build tree, start a make. Hit ctrl-c during the make, get your
prompt back, with "Bus Error".

Really? I just tried upwards of 40 times (I lost count in the middle)
with my Gentoo box and didn't get that at all. I tried breaking at a
load of different times, the 'worst' I got was:

make: *** Deleting file `mparith.o'
make: *** wait: No child processes. Stop.
make: *** Waiting for unfinished jobs....
make: *** wait: No child processes. Stop.

% uname -a
Linux gentoo 2.6.10-gentoo-r6 #1 Fri Jan 28 18:11:39 GMT 2005 i686
Intel(R) Celeron(R) CPU 2.40GHz GenuineIntel GNU/Linux

% gcc --version
gcc (GCC) 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

% make --version
GNU Make 3.80
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Reproducible, but intermittent. Not quite sure how to explain that
one away easily as alignment.

Some problem with your system, with the version of make (or gcc or
something else being interrupted)?

Chris C
 
Randy Howard

Really? I just tried upwards of 40 times (I lost count in the middle)
with my Gentoo box and didn't get that at all.

I didn't mean to imply it would happen anywhere. I mean, it was a
really strange message, which made no sense at all. I have no clue
what the cause is, but don't have time to track it down now.
Some problem with your system, with the version of make (or gcc or
something else being interrupted)?

Entirely possible. It might be something to do with the ssh client
being used. *shrug*
 
Chris Croughton

I didn't mean to imply it would happen anywhere. I mean, it was a
really strange message, which made no sense at all. I have no clue
what the cause is, but don't have time to track it down now.

Ah, sorry, I read your "ssh into a 2.6 Linux system." etc. as an
instruction on how to reproduce it.
Entirely possible. It might be something to do with the ssh client
being used. *shrug*

Could be that as well.

Chris C
 
