Problems with enums across a dll interface?

SpaceCowboy

I recently got into a discussion with a co-worker about using enums across a
dll interface. He wanted to use chars instead, arguing that depending on
compiler settings the size of an enum could change and lead to memory
corruption. I didn't see how this was possible. He claims that if a dll
built for a program is built with different compiler settings than the
launcher program, the enum size could change.

The only way I could figure out to break this would be to change the byte
alignment to 1 or 2. Then read in the data structure from the dll a byte at
a time assuming all the while that you knew how the memory was laid out.
Then the alignment could change where the enum was located and you could
read bad memory.

The only symptom of this problem is that when the launcher application is
exited, the machine reboots. I believe it is a 98 system running VC 6.0.
To me this sounds more like memory being trashed by improper pointer
management. Does any of this sound remotely possible? Is it somehow unsafe
to use enums in any way? I hope not or my world will fall apart.

SpaceCowboy
 
Ekkehard Morgenstern

Hi SpaceCowboy,

SpaceCowboy said:
I recently got into a discussion with a co-worker about using enums across a
dll interface. He wanted to use chars instead, arguing that depending on
compiler settings the size of an enum could change and lead to memory
corruption. I didn't see how this was possible. He claims that if a dll
built for a program is built with different compiler settings than the
launcher program, the enum size could change.

The sizes of C and C++ data types (other than the char types) are
compiler-dependent, and can even depend on compiler options.

So, to maintain consistency across DLLs, you should agree on a specific
type. If your application is to be run only on Windows, you can use
predefined fixed-length types like BYTE, WORD or DWORD. Don't use INT or
int, since its size can change depending on whether you're writing a
16-, 32- or 64-bit application.

The size of enum types isn't specified. The compiler can either make them
the smallest size possible, int, or something else.

The size of the other types is defined such that sizeof(char) <=
sizeof(short) <= sizeof(int) <= sizeof(long). If you use these types, you
make yourself dependent on the compiler. The usual workaround is to have an
extra include file for defining types like "int8_t", "int16_t", "int32_t"
and "int64_t", or similar. This way you have to change only the include file
in case the compiler changes.
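
A minimal sketch of such an include file, assuming a compiler that provides <cstdint> and static_assert (C++11 or later); on older compilers the typedefs would have to be written by hand for each compiler. The file name iface_types.h and the struct MessageHeader with its members are hypothetical, made up only for illustration.

```cpp
// iface_types.h (hypothetical name): one place that pins the widths of
// every type that crosses the DLL boundary.
#include <climits>
#include <cstdint>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical interface struct: only fixed-width members, no plain int
// and no enum-typed member.
struct MessageHeader {
    std::uint8_t  kind;      // carries an enumerator value in a known width
    std::uint8_t  flags;
    std::uint16_t length;
    std::uint32_t sequence;
};

// Fails to compile if a build setting changes the expected layout.
static_assert(sizeof(std::uint32_t) == 4, "uint32_t must be 4 bytes");
static_assert(sizeof(MessageHeader) == 8, "unexpected padding in MessageHeader");
```

The members are ordered so that each one is naturally aligned, which is why no padding is expected regardless of the struct member alignment setting.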

I hope that helps.

Regards,
Ekkehard Morgenstern.
 
Ben Hutchings

SpaceCowboy said:
I recently got into a discussion with a co-worker about using enums across
a dll interface. He wanted to use chars instead, arguing that depending
on compiler settings the size of an enum could change and lead to memory
corruption. I didn't see how this was possible. He claims that if a dll
built for a program is built with different compiler settings than the
launcher program, the enum size could change.

This is correct. However, this is not the only thing that could change.
Be very, very wary of compiling different source files with different
compiler settings (though it's normally safe to have different
optimisation settings).

The only symptom of this problem is that when the launcher application is
exited, the machine reboots. I believe it is a 98 system running VC 6.0.

So far as I know, there are no settings that would change the size of
enum objects in Visual C++, though there are in some other compilers.

Here's a list of settings for Visual C++ 6.0 that can cause
incompatibility if they differ between the various translation units and
executable modules. It might not be complete.

- preprocessor definitions and header paths, if they cause different
definitions to be included
- pointer-to-member representation
- exception handling enabled/disabled
- RTTI enabled/disabled
- construction displacements enabled/disabled
- run-time library variant
- calling convention
- struct member alignment
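
To illustrate the last item, here is a small sketch of how struct member alignment changes a layout. It uses #pragma pack(push, 1), which Visual C++, GCC and Clang accept, in place of a command-line setting; the struct names are hypothetical and static_assert assumes a C++11 compiler.

```cpp
#include <cstddef>

// Layout with the compiler's default member alignment.
struct DefaultRecord {
    char tag;
    int  value;   // usually preceded by padding so that it is aligned
};

// Same members, but packed to 1-byte alignment.
#pragma pack(push, 1)
struct PackedRecord {
    char tag;
    int  value;   // follows tag directly, with no padding
};
#pragma pack(pop)

// With 1-byte packing there is no padding between the members.
static_assert(sizeof(PackedRecord) == sizeof(char) + sizeof(int),
              "packed struct should have no padding");
static_assert(offsetof(PackedRecord, value) == 1,
              "value should follow tag directly");
```

If a DLL is built with one of these layouts and its caller with the other, both sides read `value` from different offsets of the same memory.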

Also note that different C++ compilers don't generally produce
compatible code, though the Windows C++ compiler vendors generally try
to produce code that's compatible with some version of Visual C++.

SpaceCowboy said:
To me this sounds more like memory being trashed by improper pointer
management. Does any of this sound remotely possible?

Maybe. One never knows, with Windows 98. Try it out in Windows NT
(note, 2000 and XP are later versions of NT) as it is more likely to
trap such bugs.

SpaceCowboy said:
Is it somehow unsafe to use enums in any way? I hope not or my
world will fall apart.

You need to be careful when adding enumerators. Strictly speaking,
after any change to shared definitions you should rebuild every file
that uses them (the One Definition Rule). In practice it's generally
safe to add enumerators as long as (1) you don't change the value of
any of the existing ones, and (2) any code that checks an enumerated
value is written to be able to cope with unrecognised values. For
example, if you have:

enum blah
{
AARDVARK,
ABACUS,
ADVERB
};

and you change it to be:

enum blah
{
AARDVARK,
ABACUS,
ACTOR,
ADVERB
};

then the numeric value of ADVERB changes from 2 to 3. This will
break compatibility with any code compiled with the old definition.
However, if you add ACTOR at the end of the enumeration then the
existing enumerators will keep their old values, and the change
should be safe.

Note that the underlying integer type of an enumerated type
depends on the values of its members, so you must avoid adding an
enumerator that would require a change of type. So long as your
enumerators are all small enough to be represented by a char this
shouldn't be an issue.
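
A sketch of both precautions, assuming the enum lives in a header shared by the DLL and its callers. The explicit values and the describe() helper are illustrative additions, not part of any existing interface.

```cpp
// Version 2 of a hypothetical shared header. ACTOR is appended, and every
// enumerator has an explicit value so that reordering cannot renumber it.
enum blah {
    AARDVARK = 0,
    ABACUS   = 1,
    ADVERB   = 2,
    ACTOR    = 3   // new in version 2
};

// Caller-side code that copes with values it does not recognise, for
// example when talking to a DLL built against a newer header.
inline const char* describe(int raw) {
    switch (raw) {
        case AARDVARK: return "aardvark";
        case ABACUS:   return "abacus";
        case ADVERB:   return "adverb";
        case ACTOR:    return "actor";
        default:       return "unknown";
    }
}
```

The function takes an int rather than a blah so that an out-of-range value coming from another module can still be inspected safely.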
 
Steve Clamage

In C++, enums of different ranges are allowed to be of different sizes.
For example, if the range fits in one byte, the compiler is allowed to
make the enum type one byte in size (or two bytes, or sizeof(int)).

For compatibility with C, which requires enums to be the size of ints,
most C++ compilers make enums the same size by default. Some compilers
have an option to affect the size of enums.

Of course, if the enum range won't fit in an int, some larger size must
be used for the enum. The C++ standard encourages implementors to keep
enums no larger than an int when possible.

If you use a compiler option to change the size of enums, you are
changing the ABI (the application binary interface). You cannot link
together binaries using different ABIs and expect a good result. You
certainly should document anything unusual you do in building a program
that affects clients of the program.

So, yes, it is possible for different parts of a program to have
different sizes for enum types, although in my view using non-default
enum sizes is asking for trouble. I certainly would not do it in a
library that anybody else might use. And if I were contemplating using
non-default enum sizes, I'd look very carefully at anything my program
depended on.
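
One way to document that assumption in code is a compile-time check in the shared header. This is only a sketch: it assumes a C++11 compiler for static_assert and a platform where the default enum size equals sizeof(int), and the enum Color is hypothetical.

```cpp
// Hypothetical header shared by a library and its clients.
enum Color { RED, GREEN, BLUE };

// Document and enforce the ABI assumption: Color has the default size.
// A translation unit built with an option that shrinks enums (such as
// GCC's -fshort-enums) then fails to compile instead of misbehaving at
// run time.
static_assert(sizeof(Color) == sizeof(int),
              "Color must be int-sized; check enum-size compiler options");
```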
 
kanze

SpaceCowboy said:
I recently got into a discussion with a co-worker about using enums
across a dll interface. He wanted to use chars instead, arguing that
depending on compiler settings the size of an enum could change and
lead to memory corruption. I didn't see how this was possible. He
claims that if a dll built for a program is built with different
compiler settings than the launcher program, the enum size could
change.

It really depends on the compiler. All of the compilers I know are in
fact several compilers; which one you get depends on the invocation
arguments. All of them, for example, implement several different
languages (Posix C, standard C, standard C++, backwards compatible
C++...). Most of them also implement several different targets: with
signed char's, with unsigned char's, with different size enums, with
different alignment requirements, with different size long's and
pointers...

Generally speaking, compatibility between shared objects, libraries,
etc. is only guaranteed if all translation units have been compiled for
the same target. Depending on the linker and the differences, you may
get an error, you may simply get strange crashes, or everything may
work as expected: if I specify 32 bit mode for some modules, and 64 bit
mode for others, I get a linker error on my system.
On the other hand, if I specify different alignment requirements, rather
than using the default, the modules will link, but I will likely get a
core dump at execution. With VC++, if I specify that plain char is
signed in one module, and unsigned in another, I can not only link, but
everything will work.

As a general rule, you should know what options you use, and stick to
one set of options for anything which affects the target. Or possibly
generate several versions of the library -- it is quite common, for
example, to generate both a multi-threaded version and a single threaded
version, for both 32 bits and 64 bits, on my platform (Sun Sparc under
Solaris). (Add the fact that we support two different compilers, and
both static and dynamic linking, as well as debug and optimized
versions, in all configurations, and a new library release can take
quite some time to build -- 16 different versions in all.) If the
libraries are only to be used for one or two in house applications,
however, it is probably simpler to simply standardize on one target.

SpaceCowboy said:
The only way I could figure out to break this would be to change the byte
alignment to 1 or 2. Then read in the data structure from the dll a
byte at a time assuming all the while that you knew how the memory was
laid out. Then the alignment could change where the enum was located
and you could read bad memory.

There are many different options which can affect compatibility. With
Sun CC 5.1, I can generate two different class layouts (with different
name mangling, so they probably won't link), two different hardware
models (32 bits and 64 bits -- this information is placed in the object
file, and the linkers won't allow mixing), and with or without proper
alignment (never tried, but I suspect that mixing would be a source of
core dumps). With VC++, there are options concerning calling
conventions, packing, the signed-ness of plain character, and probably a
few other things I don't know about. As far as I know (but I don't use
the platform often enough to be sure), the linker verifies none of
these, although the calling convention may affect name mangling, and
thus trigger linker errors.

As a user, it is ALWAYS your responsibility to ensure that everything
you are linking is compatible.

SpaceCowboy said:
The only symptom of this problem is that when the launcher application
is exited, the machine reboots. I believe it is a 98 system running
VC 6.0. To me this sounds more like memory being trashed by improper
pointer management. Does any of this sound remotely possible? Is it
somehow unsafe to use enums in any way? I hope not or my world will
fall apart.

This could be any number of things. I don't know about the enum
question in particular, but in general, it is not safe to link modules
if you don't know whether they were compiled with compatible options or
not. It's not particular to enum's, and it's not particular to
Microsoft (although they seem to have more unverified options than
most).
 
James Kuyper

SpaceCowboy said:
I recently got into a discussion with a co-worker about using enums across a
dll interface. He wanted to use chars instead, arguing that depending on
compiler settings the size of an enum could change and lead to memory
corruption. I didn't see how this was possible. He claims that if a dll
built for a program is built with different compiler settings than the
launcher program, the enum size could change.

In principle, using a compiler with different command line options
makes it a different implementation of C++. There's no guarantee that
different implementations of C++ are compatible with each other; any
feature that can be different for different implementations might be
changed by a compiler, and that includes the size of an enum. In
practice, a vendor would have to be crazy to deliberately build in
unnecessary incompatibilities between object files compiled using
different compiler options.

However, some compiler options necessarily produce incompatibilities.
For example, the compiler I use most frequently has a -64 option to
compile in 64 bit mode rather than the default of 32 bits. The whole
point of the option is to change the size of pointers and long
integers, which inherently renders the object files incompatible.
You'll need to check your compiler's documentation of the meanings of
the particular options you're worrying about.
 
Allan W

Steve Clamage said:
In C++, enums of different ranges are allowed to be of different sizes.
For example, if the range fits in one byte, the compiler is allowed to
make the enum type one byte in size (or two bytes, or sizeof(int)).

enum COLOR {
black=30001, brown, red, orange, yellow,
green, blue, violet, grey, white,
gold
};
int x = red; // Must be the value 30003

Note that there are only 11 valid colors and their values are
contiguous (gold - black + 1 = 11).

Since I used a value which does not fit into a char, is the compiler
required to use at least 2 bytes for a COLOR?

Or, is the compiler allowed to stuff these values into a byte? It could
use the value 0 for black, and then implement the values I created by
adding 30001 whenever it is translated back to an integral type.

COLOR color = grey;
cout << *(reinterpret_cast<short*>(&color));

Assuming that short is at least 2 bytes, is this guaranteed to
display 30009?
 
Steve Clamage

Allan said:
enum COLOR {
black=30001, brown, red, orange, yellow,
green, blue, violet, grey, white,
gold
};
int x = red; // Must be the value 30003

Note that there are only 11 valid colors and their values are
contiguous (gold - black + 1 = 11).

Since I used a value which does not fit into a char, is the compiler
required to use at least 2 bytes for a COLOR?

Or, is the compiler allowed to stuff these values into a byte? It could
use the value 0 for black, and then implement the values I created by
adding 30001 whenever it is translated back to an integral type.

COLOR color = grey;
cout << *(reinterpret_cast<short*>(&color));

Assuming that short is at least 2 bytes, is this guaranteed to
display 30009?

The enumerators are required to be represented as integer values, and an
enum type must have an underlying integer type that can represent all
the enumerator values. In this case, the underlying type must be able to
represent the values 30001 through 30011. Since a short can represent
all those values, a short or int could be used to represent the type. (A
short is required to be able to represent values in the range
-32767 through +32767.) If bytes were 16 bits on the architecture, an
implementation could use a char type for this enum.

Because of the rule about underlying type, you can sometimes create
values of an enum type that are not the value of any enumerator. The
requirement is that the value be representable in the minimum number of
bits needed to represent all of the enumerator values.

For example, if the smallest enumerator is 3 and the largest is 11, you
can create values of the enum type in the range 0 through 15, since 4
bits are required to represent the enumerators.

C++ added the requirement to allow a contiguous range of enum values to
support the common idiom of using enumerators as bit flags that can be
OR'd together.
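
A short sketch of that idiom. The Permission type and the two helper functions are hypothetical names chosen for illustration.

```cpp
// Hypothetical flag set: the enumerators are single bits.
enum Permission {
    READ    = 1,
    WRITE   = 2,
    EXECUTE = 8
};

// 1 | 2 == 3 is not the value of any enumerator, but it fits in the four
// bits needed to represent EXECUTE, so converting it back to Permission
// yields a valid value of the enum type.
inline Permission operator|(Permission a, Permission b) {
    return static_cast<Permission>(static_cast<int>(a) | static_cast<int>(b));
}

// Tests whether a particular flag is present in a combined value.
inline bool has(Permission set, Permission flag) {
    return (static_cast<int>(set) & static_cast<int>(flag)) != 0;
}
```

For example, `Permission rw = READ | WRITE;` holds the value 3, and `has(rw, WRITE)` is true while `has(rw, EXECUTE)` is false.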

Reference: C++ standard section 7.2, "Enumeration Declaration".
 
Edward Diener

SpaceCowboy said:
I recently got into a discussion with a co-worker about using enums
across a dll interface. He wanted to use chars instead, arguing
that depending on compiler settings the size of an enum could change
and lead to memory corruption. I didn't see how this was possible.
He claims that if a dll built for a program is built with different
compiler settings than the launcher program, the enum size could
change.

Some compilers have different options to determine an enum size if the enum
values all fit within a particular integer size. With those compilers there
is often a #pragma which can be placed in a header file prior to an enum
declaration which will set the enum size to the same value when source is
compiled and when a user has to interface with the enum in a class or
namespace. This #pragma ensures that the enum size is the same when a shared
library is built and when it is used.

If one doesn't use the same enum size in the cases described just above,
using the enum would almost certainly lead to disaster if the enum size is
changed between the time when the shared library is built and when it is
used. In other words, the #pragma is necessary to ensure the same size even
when a compiler option allows the enum size to change, since the #pragma
takes precedence over the compiler option. So while your co-worker is
theoretically right, if the shared library has been created correctly, with
the necessary #pragma in the header file, passing enums between shared
modules and executables shouldn't be a problem. Of course there may be
developers of 3rd party libraries who do not take into account the necessary
settings for enum size when they build their libraries, and this may lead to
disaster.
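
The #pragma spelling differs from compiler to compiler, so it is not shown here. Where a C++11 compiler is available, a fixed underlying type written in the shared header plays a similar role, because the size then no longer depends on any enum-size option. This is a sketch of that alternative, not of the #pragma mechanism described above, and the enum Mode is hypothetical.

```cpp
#include <cstdint>

// Shared header: the underlying type is part of the declaration, so the
// library and its users see the same size whatever their compiler options.
enum Mode : std::uint8_t {
    MODE_IDLE,
    MODE_RUNNING,
    MODE_STOPPED
};

static_assert(sizeof(Mode) == 1, "Mode is always one byte");
```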
 
Dave Thompson

Steve Clamage said:
In C++, enums of different ranges are allowed to be of different sizes.
For example, if the range fits in one byte, the compiler is allowed to
make the enum type one byte in size (or two bytes, or sizeof(int)).

For compatibility with C, which requires enums to be the size of ints,
most C++ compilers make enums the same size by default. Some compilers
have an option to affect the size of enums.

No, the C standard also allows each enumerated type to be "compatible
with an integer type; the choice of type is implementation-defined"
and C99 adds explicitly what was obviously intended "but shall be
capable of representing the values of all the members of the
enumeration." plus a footnote saying explicitly "An implementation may
delay the choice of which integer type until all enumeration constants
have been seen." which clearly allows it to be minimal for that range.
(E.g. can't do enum foo { quux = sizeof (enum foo), bar = 999 }; )

Some C compilers may choose to implement all enums as int, and this
may well be a good choice, since after all int is supposed to be the
"natural [and presumably efficient] size" for the platform. A C++
compiler built on, packaged with, or otherwise compatible with such a
C compiler probably should and does do the same where it can.

And to someone else who said they didn't know of a C compiler option
to do this, here is one, and an important one: gcc -fshort-enums.

Steve Clamage said:
Of course, if the enum range won't fit in an int, some larger size must
be used for the enum. The C++ standard encourages implementors to keep
enums no larger than an int when possible.

This is the difference. C does not allow enum *constants* to be larger
than int, thus no C enum type ever *needs* to be larger than int; and
it treats all enum *constants* as type int -- but then any object of
an enum type or its value which is actually narrower than int will in
nearly all cases be taken to int by the integer promotions anyway.

Well, also, in C it's easy(ier?) to do it yourself if the compiler
doesn't; you can just declare an object of an integer type that you
have determined is large enough for the range of an enum -- or,
perhaps, of the values you wish to use from an enum -- and blithely
use it wherever you need that enum; C++ requires a cast to enum.

- David.Thompson1 at worldnet.att.net
 
James Kuyper

Allan W said:
enum COLOR {
black=30001, brown, red, orange, yellow,
green, blue, violet, grey, white,
gold
};
int x = red; // Must be the value 30003

Note that there are only 11 valid colors and their values are
contiguous (gold - black + 1 = 11).

Since I used a value which does not fit into a char, is the compiler
required to use at least 2 bytes for a COLOR?

No. If UCHAR_MAX<=30011, it can use unsigned char. If
SCHAR_MAX<=30011, it can also use signed char.

Allan W said:
Or, is the compiler allowed to stuff these values into a byte? It could
use the value 0 for black, and then implement the values I created by
adding 30001 whenever it is translated back to an integral type.

No, it must use an underlying integral type with a range that's
sufficient to represent all of the actual enumerator values, not just
a code for those values.

Allan W said:
COLOR color = grey;
cout << *(reinterpret_cast<short*>(&color));

Assuming that short is at least 2 bytes, is this guaranteed to
display 30009?

No. Even if short is at least 2 bytes, and UCHAR_MAX<30011, there's no
guarantee that the underlying type is actually 'short'. If it isn't,
there's no guarantee that &color even has the right alignment to be
converted to a short*. There's no guarantee that the first
sizeof(short) bytes of color contain the low order bits of the type
that's actually used, much less that those bits are stored in the same
order as the corresponding bits of a 'short'.
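
By contrast, converting the value rather than reinterpreting the object does give a guaranteed result. A minimal sketch; color_value is a hypothetical helper name.

```cpp
enum COLOR {
    black = 30001, brown, red, orange, yellow,
    green, blue, violet, grey, white,
    gold
};

// The conversion goes through the value, not the object representation,
// so it does not depend on the underlying type, its alignment, or the
// byte order of the machine.
inline int color_value(COLOR c) {
    return static_cast<int>(c);
}
```

Here `color_value(grey)` is 30009 on every conforming implementation, since int can always represent values up to 32767.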
 
James Kuyper

James Kuyper said:
No. If UCHAR_MAX<=30011, it can use unsigned char. If
SCHAR_MAX<=30011, it can also use signed char.

Correction: <= should have been >=, in both cases.
 
Martin Slater

Ben Hutchings said:
Note that the underlying integer type of an enumerated type
depends on the values of its members, so you must avoid adding an
enumerator that would require a change of type. So long as your
enumerators are all small enough to be represented by a char this
shouldn't be an issue.


In the past we've had an additional member with a value of 0x7fffffff
to force it to require a 32-bit int. Are there any problems with doing this?
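
For reference, the trick looks roughly like the sketch below. The names are hypothetical, and the static_assert assumes a C++11 compiler. Note that the extra enumerator only sets a lower bound on the size; it does not by itself stop a compiler from choosing something wider.

```cpp
#include <climits>

enum Shape {
    SHAPE_CIRCLE,
    SHAPE_SQUARE,
    SHAPE_FORCE_32BIT = 0x7fffffff   // never used as a real value
};

// The underlying type must be able to represent 0x7fffffff, so on
// platforms with 8-bit bytes it cannot be narrower than 32 bits.
static_assert(sizeof(Shape) * CHAR_BIT >= 32,
              "Shape should be at least 32 bits wide");
```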

Martin
 
