templates + RTTI + shared library = impossible?

Dan Caugherty

Hey all --

I think I have a legitimate reason for using RTTI, particularly
dynamic_cast<>, and I'd like to use it to query instances created in a
shared library. No matter how I tweak GCC's visibility criteria with
#pragmas or linker flags, dynamic_cast always fails.

Here's a very brief idea about what I'm trying to do:

----------
// any_value.h
class any_value {
public:
any_value(); //defined in any_value.cpp
virtual ~any_value(); //also defined in any_value.cpp
};

----------
// value.h
template <typename T>
class value {
T t_;
public:
value(); //defined in value_def.h (t_ = T())
value(const T & t); //defined in value_def.h
~value(); //defined in value_def.h
const T & item() const; //defined in value_def.h (never virtual)
};
----------- Only .cpp file used for shared lib:
// value.cpp
#include <typeinfo>
#include "value.h"
#include "value_def.h" //only include for value_def.h *anywhere*

template class value<int>;

-------------- Only .cpp file for test executable:
// my_exec.cpp
// .. linked with shared library libvalue.dylib (yes, this is on Mac OS X)

#include <typeinfo>
#include <iostream>
#include "value.h"

int main(int argc, char ** argv)
{
any_value * pa = new value<int>(42);
std::cout << "Dyn cast returns "
<< dynamic_cast< value<int> * >(pa) << std::endl;
return 0;
}
-------------------- END

The test executable always prints "Dyn cast returns 0".

Any ideas as to what I'm doing wrong? I make it a point to export all
symbols in the shared library. I think this has something to do with
the type_info objects created by GCC (multiple versions perhaps that
cause dynamic_cast to fail), but I don't know how to verify that. (In
any case, I'm not sure how I'd fix it!)

Thanks in advance,
-- Dan C.
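
One way to check the suspicion about multiple type_info objects is to look at the address and name of the type_info seen on each side of the library boundary. The helpers below are only a sketch: `type_report` and `is_duplicated` are hypothetical names, not part of the original code. The assumption is that one call is made from a function exported by the shared library and another from the executable, and the two results are compared.

```cpp
#include <cstring>
#include <sstream>
#include <string>
#include <typeinfo>

// Hypothetical helper (not from the thread): describes a type_info object by
// its address and its implementation-defined name.
inline std::string type_report(const std::type_info& ti)
{
    std::ostringstream out;
    out << static_cast<const void*>(&ti) << " " << ti.name();
    return out.str();
}

// True when two type_info objects carry the same name but live at different
// addresses, which is the situation suspected in the question.
inline bool is_duplicated(const std::type_info& a, const std::type_info& b)
{
    return &a != &b && std::strcmp(a.name(), b.name()) == 0;
}
```

If the report for `typeid(value<int>)` shows the same name but a different address in the library and in the executable, the two images are probably not sharing one type_info object.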
 
Dan Caugherty

Noticed a typo in my code: the value<T> class definition should begin
with the following line instead:

class value : public any_value {


Sorry about that...
 
Richard

[Please do not mail me a copy of your followup]

Dan Caugherty spake the secret code thusly:

> I think I have a legitimate reason for using RTTI, particularly
> dynamic_cast<>, and I'd like to use it to query instances created in a
> shared library. No matter how I tweak GCC's visibility criteria with
> #pragmas or linker flags, dynamic_cast always fails.

A coworker of mine ran into this problem. The shared library defined
the derived class, but he wanted to dynamic_cast to the derived class
from his own code (an executable or another shared library). He solved
it by adding a static function to the shared library that defines the
derived class and having that function do the dynamic_cast. From the
other libraries or the executable, you call this function instead of
doing the dynamic_cast yourself.
 
Dan Caugherty

Hmm. So you're basically saying, try adding the following in my shared
lib:

--- value.h

template <typename T>
value<T> * is_a(any_value *);

--- value_def.h
template <typename T>
value<T> * is_a(any_value * pA)
{
return dynamic_cast< value<T> * >(pA);
}

--- value.cpp

// not clear on the syntax here, all hints appreciated
// if this is wrong...
template value<int> * is_a<int>(any_value *);
 
Dan Caugherty

> Explicit instantiation tells the compiler to generate the code for that
> particular instantiation. Since it's a function, the name (mangled, of
> course) gets added to the list of external symbols and the function
> becomes callable from outside, so the linker can resolve it.

This much I understand. My original question was that the type_info
instances for the value<> types were, for whatever reason, invisible
to the executable. This seems to be keeping dynamic_cast<> from
working from outside the library.

The only real solution seems to be instantiating static functions
explicitly within the shared lib to do the work of dynamic_cast<>.
This is annoying, but not terribly inconvenient. (And I haven't tried
it yet, so for all I know, it may not work.)

I guess I'm also wondering if this can be avoided somehow.
 
BGB / cr88192

>> Explicit instantiation tells the compiler to generate the code for that
>> particular instantiation. Since it's a function, the name (mangled, of
>> course) gets added to the list of external symbols and the function
>> becomes callable from outside, so the linker can resolve it.
>
> This much I understand. My original question was that the type_info
> instances for the value<> types were, for whatever reason, invisible
> to the executable. This seems to be keeping dynamic_cast<> from
> working from outside the library.
>
> The only real solution seems to be instantiating static functions
> explicitly within the shared lib to do the work of dynamic_cast<>.
> This is annoying, but not terribly inconvenient. (And I haven't tried
> it yet, so for all I know, it may not work.)
>
> I guess I'm also wondering if this can be avoided somehow.

take note that RTTI often works by having each class's vtable contain a
pointer to a class-info structure, which in turn usually points to the info
for its superclasses, ...

dynamic_cast, then, would involve walking this graph, checking superclasses,
and seeing if the target class is in the list (usually by
pointer comparison). if it is not found, the dynamic_cast fails.

now, what happens when one links with shared libraries?... consider this:
what if the linker (during linking) only sees what is in its own set of
objects being linked?
within that set, different class-info structures for the same class will
be merged, ...

now, with the same class in different libraries:
each library is linked the same way, so each ends up with its own version
of whatever class-info structs are used.

so, the same class in different libs will have different info structures,
which will not compare equal via a simple pointer-based check, and hence
the cast does not work.
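
The effect described above can be modeled with a toy class-info record. This is only a sketch: the `class_info` layout and the function names are made up for illustration, and real ABIs store this information differently. It shows why two copies of the "same" record defeat a pointer-based walk, while a name-based walk would still match.

```cpp
#include <cstring>

// Toy stand-in for a compiler's per-class RTTI record (hypothetical layout;
// real ABIs differ).
struct class_info {
    const char*       name;  // class name
    const class_info* base;  // single base class, or nullptr
};

// Walks the base chain comparing record addresses only.
inline bool derives_by_pointer(const class_info* from, const class_info* target)
{
    for (; from != nullptr; from = from->base)
        if (from == target) return true;
    return false;
}

// Same walk, but compares names, so duplicate records still match.
inline bool derives_by_name(const class_info* from, const class_info* target)
{
    for (; from != nullptr; from = from->base)
        if (std::strcmp(from->name, target->name) == 0) return true;
    return false;
}

// One record for the base, plus two records for the same derived class, as
// if the library and the executable had each emitted their own copy.
static const class_info any_value_info   = { "any_value", nullptr };
static const class_info value_int_in_lib = { "value<int>", &any_value_info };
static const class_info value_int_in_exe = { "value<int>", &any_value_info };
```

An object built in the library would carry `value_int_in_lib`, while a cast compiled into the executable would look for `value_int_in_exe`; the pointer-based walk never finds it.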


similar issues can also manifest in other ways as well:
malloc/free not working between DLLs (I have run into this before with
MSVC, where using malloc in one DLL and free in another may actually cause
the app to subsequently crash);
(this may or may not also apply to new/delete, but I have not tested);
....

so, alas, DLL or shared-library / shared-object issues may need to be
treated with care, as things are not exactly the same as with static
linking.
 
Richard

[Please do not mail me a copy of your followup]

Dan Caugherty spake the secret code thusly:

> Hmm. So you're basically saying, try adding the following in my shared
> lib:
>
> --- value.h
>
> template <typename T>
> value<T> * is_a(any_value *);
>
> --- value_def.h
> template <typename T>
> value<T> * is_a(any_value * pA)
> {
> return dynamic_cast< value<T> * >(pA);
> }
>
> --- value.cpp
>
> // not clear on the syntax here, all hints appreciated
> // if this is wrong...
> template value<int> * is_a<int>(any_value *);

As someone else pointed out, you need to explicitly instantiate the
template for the specific type in the shared lib.

The problem with templates and shared libraries is that the templates
aren't shared from the library; they're instantiated in the calling
code, so it still wouldn't work that way.

In my coworker's situation he didn't use templates. Try just getting
it to work with an explicit function from the shared lib:

value<int> *is_an_int(any_value *val)
{
return dynamic_cast<value<int> *>(val);
}
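
Putting the pieces of this thread together, a single-file sketch might look like the following. It assumes the inheritance fix from the follow-up post, collapses the headers and value.cpp into one translation unit for brevity (the comments mark where each part would live), and defines the any_value constructor inline, which the original does not. Whether this cures the failing cast on a given platform is not something the sketch can show; it only illustrates the explicit-instantiation syntax and the non-template entry point.

```cpp
// --- any_value.h / any_value.cpp
class any_value {
public:
    any_value() {}
    virtual ~any_value();
};
any_value::~any_value() {}   // defined out of line, as in the original post

// --- value.h (with the inheritance fix from the follow-up post)
template <typename T>
class value : public any_value {
    T t_;
public:
    value();
    value(const T& t);
    ~value();
    const T& item() const;
};

// --- value_def.h
template <typename T> value<T>::value() : t_() {}
template <typename T> value<T>::value(const T& t) : t_(t) {}
template <typename T> value<T>::~value() {}
template <typename T> const T& value<T>::item() const { return t_; }

template <typename T>
value<T>* is_a(any_value* pA)
{
    return dynamic_cast<value<T>*>(pA);
}

// --- value.cpp (compiled into the shared library)
template class value<int>;                    // explicit class instantiation
template value<int>* is_a<int>(any_value*);   // explicit function instantiation

// Non-template entry point, so the cast itself runs inside the library.
value<int>* is_an_int(any_value* val)
{
    return dynamic_cast<value<int>*>(val);
}
```

The executable would then call `is_an_int(pa)` rather than performing the dynamic_cast itself.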
 
James Kanze

> take note that RTTI often works by having a class contain a
> vtable-pointer to a class-info structure, which usually points
> to the superclass info, ...
>
> dynamic_cast then, would involve walking the graph, checking
> superclasses, and seeing if the target class is in the list
> (usually by pointer-comparison). if not found, then the
> dynamic_cast fails.
>
> now, what happens when one links with shared libraries?

What happens when one links with shared libraries depends very
much on the compiler and the system.
> ... consider this:
> what if the linker only sees stuff (during linking) which is
> in its own set of stuff being linked. so, different
> class-info structures for the same class will merge, ...
>
> now, with the same class in different libraries:
> it does the same things, each library then ending up with their own versions
> of whatever class-info structs are used.
>
> so, the same class in different libs will have different info structures,
> and hence not compare equal via a simple pointer-based check, hence, not
> working.

Yes and no. First, of course, the information concerning any
one class is a sort of a package -- it's either all there, or
none of it is. Second, of course, access to this information is
through the vptr, so the compiler only needs to know about it in
constructors and destructors (when it sets or resets the vptr).
And of course, the most derived class must know about all of its
base classes (even if they are in a different DLL).

All of this means that if all of the constructors and
destructors of each class are in the same DLL (no inlining!),
then there's absolutely no reason for more than one instance of
the RTTI information.

In practice, the situation is a bit more varied (and people do
inline constructors or destructors). In the case of VC++, for
example, I've noticed that every DLL does get its copy of the
RTTI information -- which one you get depends not on where the
code for the constructor was generated, but where the
constructor was called. (Even in the case of non-inlined
constructors. I can't figure out why---it seems like they're
going to a lot of extra work just to make things more difficult
to use.) In the case of Unix, with the compilers I've seen, two
different solutions are used: either the compiler generates the
RTTI information in the object file which contains the first
non-inlined virtual function (exporting the symbol, of course),
or it generates the information in every module where it
generates code for a constructor or the destructor, merging them
during the link phase. (The first solution is probably a
hang-over from earlier days, when linkers didn't know how to
merge.) And if two DLL's contain the same (exported) symbol,
only the first one loaded is actually used, so if the RTTI
information is exported, all instances will share the same
address. About the only time I think you might get different
addresses is if the shared object is loaded with the RTLD_LOCAL
flag (thus, explicitly), and your class has inline constructors
or destructors.

(As a general rule, inline functions and templates don't work
well with dynamic linked libraries, and should be avoided when
dynamic linking is being used. And don't forget that the
compiler generated defaults are inline---if you're using dynamic
linking, be sure to declare and define things like the copy
constructor, even if the definition corresponds to what the
compiler would generate.)
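
As a rough illustration of that advice, the sketch below declares every special member in the header and defines all of them in one source file, so none of them is generated inline in client code. The macro name `MYLIB_API` is hypothetical; the visibility attribute is the GCC one, and other compilers would need their own export annotation.

```cpp
// Hypothetical export macro; the name MYLIB_API is not from the thread.
#if defined(__GNUC__)
#  define MYLIB_API __attribute__((visibility("default")))
#else
#  define MYLIB_API
#endif

// --- header shipped with the library: declarations only, nothing inline
class MYLIB_API any_value {
public:
    any_value();
    any_value(const any_value& other);
    any_value& operator=(const any_value& other);
    virtual ~any_value();
};

// --- source file compiled into the library: every special member is defined
// here, even where the body matches what the compiler would have generated
any_value::any_value() {}
any_value::any_value(const any_value&) {}
any_value& any_value::operator=(const any_value&) { return *this; }
any_value::~any_value() {}
```

With this layout the constructors and the destructor, and with them the code that sets the vptr, exist only in the library.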

> similar issues can also manifest in other ways as well:
> malloc/free not working between DLL's (I have run into this
> before with MSVC, where using malloc in one DLL and free in
> another may actually cause the app to subsequently crash);
> (this may or may not also apply to new/delete, but I have not
> tested);

This is a purely Windows problem, present probably because
Windows doesn't bundle the CRT DLL's with the OS (I think). If
you link everything (the main and all of the DLL's) with the CRT
DLL's (option /MD or /MDd to cl, IIRC), then there's no problem.
(On the other hand, if you do this, you may have to deliver the
CRT DLL when you deploy your program.) The problem doesn't occur
under Unix, first because this is the standard way of
proceeding: since the CRT and the system API are in the same
shared object (libc.so), they're automatically bundled with the
OS. And second because the CRT will be implicitly loaded
with RTLD_GLOBAL, so all of its symbols will be exported, and
all of the shared libraries will use the first instance loaded.

> so, alas, DLL or shared-library / shared-object issues may
> need to be treated with care, as it is not exactly the same as
> is the case with static linking.

In general, I find that people use DLL's far too much. It's
rarely justified to use a DLL within an application: DLL's are
for interfacing with external software which is separately
installed (the system, a data base, etc.) or for plug-ins.
 
Richard

[Please do not mail me a copy of your followup]

James Kanze spake the secret code thusly:

>> similar issues can also manifest in other ways as well:
>> malloc/free not working between DLL's [...]
>
> This is a purely Windows problem, present probably because
> Windows doesn't bundle the CRT DLL's with the OS (I think). [...]

The issue is that each module (DLL or EXE) has its own Win32 heap.
Memory allocated by a module will be allocated from that module's heap
and must be freed by code in that module to be freed from the correct
heap.
 
BGB / cr88192

Richard said:

> [Please do not mail me a copy of your followup]
>
> James Kanze spake the secret code thusly:
>
>>> similar issues can also manifest in other ways as well:
>>> malloc/free not working between DLL's [...]
>>
>> This is a purely Windows problem, present probably because
>> Windows doesn't bundle the CRT DLL's with the OS (I think). [...]
>
> The issue is that each module (DLL or EXE) has its own Win32 heap.
> Memory allocated by a module will be allocated from that module's heap
> and must be freed by code in that module to be freed from the correct
> heap.

yeah.
partial solution: don't pass memory ownership across DLL boundaries...
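
A common way to follow that rule is to have the library export both the allocation and the release function, and to make the client hand every object back through the release function. The names below (`widget`, `widget_create`, `widget_destroy`, `make_widget`) are hypothetical, and the sketch puts both sides in one file for brevity.

```cpp
#include <memory>

// --- exported by the library (hypothetical names, not from the thread)
struct widget {
    int payload;
};

widget* widget_create(int payload)
{
    widget* w = new widget;     // allocated by the library's operator new
    w->payload = payload;
    return w;
}

void widget_destroy(widget* w)
{
    delete w;                   // released by the matching operator delete
}

// --- client side: the smart pointer calls back into the library to free
typedef std::unique_ptr<widget, void (*)(widget*)> widget_ptr;

inline widget_ptr make_widget(int payload)
{
    return widget_ptr(widget_create(payload), &widget_destroy);
}
```

Because the deleter is the library's own function, the memory is freed by the same runtime instance that allocated it, whichever module holds the pointer.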


but, yeah, DLL's do have at least the major good point of avoiding lots of
10 or 20 MB EXE's (instead, one gets maybe 10-20 1 MB DLL's...), and a bunch
of smaller EXE's...

another good point, is that one need only remember to link in those they
need, and any others "come along for the ride", which improves on the
static-lib case, of having to remember to link in every static lib
referenced both directly and indirectly, which can itself become a
non-trivial problem.

as well as simplifying dynamically-loadable components, ...

....

so, in general, they are a good tradeoff for their costs (sometimes
questionable linkage semantics, having to annotate declarations, having to
worry about moving around a small army of DLL's and keeping them visible
from the current directory, ...).

or such...
 
James Kanze

> James Kanze spake the secret code thusly:
>
>>> similar issues can also manifest in other ways as well:
>>> malloc/free not working between DLL's [...]
>>
>> This is a purely Windows problem, present probably because
>> Windows doesn't bundle the CRT DLL's with the OS (I think). [...]
>
> The issue is that each module (DLL or EXE) has its own Win32
> heap. Memory allocated by a module will be allocated from
> that module's heap and must be freed by code in that module to
> be freed from the correct heap.

That's not really an issue for C++ programs, at least those that
are using new, and not the Windows API directly, to allocate
memory. C++ programs don't allocate memory from Windows, they
call the operator new function, which in turn calls malloc.
And malloc is in the CRT library. If the CRT library is in a
distinct DLL, all Windows allocations will be from that DLL, and
will use that DLL's heap. If you statically link a separate
instance of the CRT with each DLL, then each DLL will have a
separate instance of the operator new function and malloc, and
will use its own heap.

The arrangement of using separate system heaps for each DLL
seems like a very poor design decision to me (supposing you're
right; the Windows documentation says that "Each *process* has
a default heap"), but it doesn't really matter here. Even
with one common system heap, each instance of malloc/free will
use a different set of static variables to manage this heap, and
memory returned by a call to HeapAlloc in one instance of malloc
will not be known in any other instance.

As I said, the motivation for statically linking the CRT is that
it isn't bundled with the OS. If you link with it dynamically,
you either have to require that all systems on which you run
have it installed, in addition to the standard system stuff, or
that you bundle it into your deployment package. (The Microsoft
site has extensive documentation about these issues. I'd
suggest that anyone deploying code written for Windows which
uses DLL's wade through it. I'd also suggest avoiding DLL's in
your own application, if possible, since they do make deployment
more complicated.)
 
James Kanze

> Richard said:
>
>> James Kanze spake the secret code thusly:
>>
>>>> similar issues can also manifest in other ways as well:
>>>> malloc/free not working between DLL's [...]
>>>
>>> This is a purely Windows problem, present probably because
>>> Windows doesn't bundle the CRT DLL's with the OS (I think). [...]
>>
>> The issue is that each module (DLL or EXE) has its own Win32
>> heap. Memory allocated by a module will be allocated from
>> that module's heap and must be freed by code in that module
>> to be freed from the correct heap.

No.

> partial solution: don't pass memory ownership across DLL boundaries...

It's not a problem if you link your modules correctly.

> but, yeah, DLL's do have at least the major good point of
> avoiding lots of 10 or 20 MB EXE's (instead, one gets maybe
> 10-20 1 MB DLL's...), and a bunch of smaller EXE's...

So how is this a good point? You still end up with 10 or 20 MB
used on the disk, and you've made deployment significantly more
complicated, and introduced yet another way things can go wrong:
mixing versions of the DLL's. And of course, DLL's and
templates don't mix very well.

> another good point, is that one need only remember to link in
> those they need, and any others "come along for the ride",
> which improves on the static-lib case, of having to remember
> to link in every static lib referenced both directly and
> indirectly, which can itself become a non-trivial problem.

Sorry, I don't understand this one. Are you using static libs,
or dynamically linked objects? What you (generally) want to
avoid is using a static library for something that will be used
from several different DLL's---that's what causes problems.

> as well as simplifying dynamically-loadable components, ...
>
> so, in general, they are a good tradeoff for their costs
> (sometimes questionable linkage semantics, having to annotate
> declarations, having to worry about moving around a small army
> of DLL's and keeping them visible from the current directory,
> ...).

DLL's are an added complication. You use them when you need
them, but you should avoid them otherwise.
 
