dynamic_cast is ugly!

Robert Martin

back to visitor pattern! Which seems to be nearly as "bad"
as a dynamic cast.

The visitor pattern *is* dynamic cast.

Exercise for the reader: create two functions:
Square* dynamic_cast_square(Shape* s);
Circle* dynamic_cast_circle(Shape* s);

These functions behave just like dynamic_cast, returning 0 if the input
argument is the wrong type, otherwise returning a properly downcast
pointer.

Use the Visitor pattern.
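
One possible solution to the exercise, as a minimal sketch. Only the two function names and the Shape/Square/Circle class names come from the post; ShapeVisitor, accept(), visitSquare(), visitCircle() and the two "catcher" visitors are hypothetical names chosen for illustration.

```cpp
class Square;
class Circle;

// Visitor interface with one hook per concrete shape (hypothetical names).
// The default hooks do nothing, so a visitor only overrides what it cares about.
struct ShapeVisitor {
    virtual ~ShapeVisitor() = default;
    virtual void visitSquare(Square&) {}
    virtual void visitCircle(Circle&) {}
};

class Shape {
public:
    virtual ~Shape() = default;
    virtual void accept(ShapeVisitor& v) = 0;
};

class Square : public Shape {
public:
    void accept(ShapeVisitor& v) override { v.visitSquare(*this); }
};

class Circle : public Shape {
public:
    void accept(ShapeVisitor& v) override { v.visitCircle(*this); }
};

// Remembers the visited object only if it turned out to be a Square.
struct SquareCatcher : ShapeVisitor {
    Square* result = nullptr;
    void visitSquare(Square& s) override { result = &s; }
};

// Remembers the visited object only if it turned out to be a Circle.
struct CircleCatcher : ShapeVisitor {
    Circle* result = nullptr;
    void visitCircle(Circle& c) override { result = &c; }
};

// Returns a properly downcast pointer, or a null pointer for the wrong type.
Square* dynamic_cast_square(Shape* s) {
    if (s == nullptr) return nullptr;
    SquareCatcher catcher;
    s->accept(catcher);
    return catcher.result;
}

Circle* dynamic_cast_circle(Shape* s) {
    if (s == nullptr) return nullptr;
    CircleCatcher catcher;
    s->accept(catcher);
    return catcher.result;
}
```

The double dispatch through accept() recovers the concrete type without a cast, which is the sense in which the two mechanisms are equivalent. The cost is that ShapeVisitor has to name every concrete shape up front.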
 
Christopher Dearlove

You don't. Whatever functionality that we're talking about here
belongs either in Square or some other interface (not Shape) that it
implements. The problem here is that Shape, in this example, is too
generic to be very useful.

So let's suppose that Square for example also inherits from Cornered
(this isn't a great example, but without changing from Shape/Square to
something else, I'm stuck with it).

So you'd be happy to dynamic_cast from Shape * to Cornered *,
but not from Shape * to Square *?

Or in other words if you need to dynamic_cast, prefer a cross-cast
to another abstract base class to a downcast to a concrete derived class?

That makes reasonable sense to me. I think of the two dynamic_casts
I can recall using, one was this, and one was imposed by a form of
reality not unlike the ordering problem, though not actually equal to it.
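
To make the cross-cast concrete, here is a small sketch. Shape, Square and Cornered are the names used in the discussion; the corners() member, the Circle class and the corner_count() helper are assumptions added for illustration.

```cpp
// Shape must be polymorphic for dynamic_cast to work on it.
class Shape {
public:
    virtual ~Shape() = default;
};

// A second abstract base class, unrelated to Shape.
class Cornered {
public:
    virtual ~Cornered() = default;
    virtual int corners() const = 0;
};

class Square : public Shape, public Cornered {
public:
    int corners() const override { return 4; }
};

class Circle : public Shape {};

// Cross-cast: Shape* to Cornered*, two bases that only meet in the
// most-derived object. The caller never names Square.
// Returns 0 for shapes that do not implement Cornered.
int corner_count(Shape* s) {
    if (const Cornered* c = dynamic_cast<const Cornered*>(s)) {
        return c->corners();
    }
    return 0;
}
```

The client depends only on the two abstract interfaces, so a later Triangle that also derives from Cornered would work without touching corner_count().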
 
dave_mikesell

  And you lose the ordering relation the squares had with respect to the
other objects.

  Sometimes type-specific operations need to be done in the exact order
in which the objects appear in the container.

Can you give an example or two of these type-specific operations?
 
Dmitry A. Kazakov

Dmitry A. Kazakov wrote:

And you lose the ordering relation the squares had with respect to the
other objects.

Who implements the relation the container or the elements?
Sometimes type-specific operations need to be done in the exact order
in which the objects appear in the container.

The operations are of squares, so the order (which is defined by an
operation) can be of squares only.
Of course if I can do something to the square only, I do that already.
That's not the problem here.

Yes, you have a design problem.
 
dave_mikesell

So let's suppose that Square for example also inherits from Cornered
(this isn't a great example, but without changing from Shape/Square to
something else, I'm stuck with it).

So you'd be happy to dynamic_cast from Shape * to Cornered *,
but not from Shape * to Square *?

No, I would avoid heterogeneous containers in the first place. If my
container contains Shape *, I expect only to use Shape operations on
its elements.
 
Christopher Dearlove

No, I would avoid heterogeneous containers in the first place. If my
container contains Shape *, I expect only to use Shape operations on
its elements.

There's a split here between the purists, who will do anything to avoid
this, and the realists, who agree that you should rarely do it, but
recognise that occasionally it's necessary (meaning that the alternatives
are worse).

Addressing my comments to the realists, do you think a dynamic cross-cast
to an alternative abstract base class is generally to be preferred to a
dynamic down-cast to a non-abstract derived class in such cases?

(I've deleted comp.object, as I expect the realists to be in comp.lang.c++.)
 
Jeff Schwab

James said:
That's not strictly true. Goto is more a language issue than
anything else. The necessary flow control structures for clean
programming are known (and have been for a long time), and the
advantages of using them have been proven. Most modern
languages provide all of the necessary structures (and more), so
there is no reason to use goto.

Although the vast majority of code requires no goto's, they are mostly
just out of fashion. I still see and use them frequently in embedded
code and device drivers.

Gotos are extremely useful for implementing finite state machines, since
there is generally no need to care how a particular state was reached,
and no need for a return address. You don't have any choice in a
pre-stack environment; you can't call a function until you have
initialized the memory controller, so you do a lot of (very orderly)
direct branching.
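
As a toy illustration of the state-machine style (ordinary hosted C++, not pre-stack firmware), the sketch below uses labels as states. The count_words() function and its two states are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Counts space-separated words. Each label is a state; each goto is a
// transition. No state needs to know how it was reached.
std::size_t count_words(const std::string& text) {
    std::size_t i = 0;      // declared before every label, so no jump
    std::size_t words = 0;  // skips over an initialization

between_words:
    if (i == text.size()) goto done;
    if (text[i++] == ' ') goto between_words;
    ++words;                // first character of a new word
    goto in_word;

in_word:
    if (i == text.size()) goto done;
    if (text[i++] == ' ') goto between_words;
    goto in_word;

done:
    return words;
}
```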

In device drivers, goto is used locally in (e.g.) almost all ioctl
handlers. There are typically two labels near the end of such a
function, corresponding to successful return and error-condition return.
You could, in principle, use try/catch instead, but the performance
would be prohibitively poor, the code would be less clear, and woe
betide you if ever someone forgot a catch block. (By contrast,
forgetting a goto's label produces a compile-time error.)
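
The two-label layout might look roughly like this. It is a user-space sketch, not real driver code: handle_request(), the Request codes and the scratch buffer are hypothetical, and only the errno constants come from the standard <cerrno> header.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical request codes, loosely modelled on an ioctl entry point.
enum Request { REQ_NOP = 0, REQ_RESET = 1, REQ_FILL = 2 };

int handle_request(int request, unsigned char* out, std::size_t out_len) {
    // Everything is declared up front so that no goto jumps past an
    // initialization, which C++ does not allow.
    int status = 0;
    unsigned char* scratch = nullptr;

    if (out == nullptr || out_len == 0) { status = -EINVAL; goto fail; }

    scratch = static_cast<unsigned char*>(std::malloc(out_len));
    if (scratch == nullptr) { status = -ENOMEM; goto fail; }

    switch (request) {
    case REQ_NOP:
        goto success;                        // nothing to copy
    case REQ_RESET:
        std::memset(scratch, 0x00, out_len);
        break;
    case REQ_FILL:
        std::memset(scratch, 0xFF, out_len);
        break;
    default:
        status = -ENOTTY;                    // unknown request
        goto fail;
    }
    std::memcpy(out, scratch, out_len);

success:
    std::free(scratch);                      // free(nullptr) is a no-op
    return 0;

fail:
    std::free(scratch);
    return status;
}
```

Every exit path funnels through one of the two labels, so the cleanup is written once per outcome rather than once per early return.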
 
Jeff Schwab

James said:
The problem is that dynamic_cast is really only justified in
fairly large applications, where layering becomes an important
issue. As long as you can manage the entire application as a
single layer, it is possible to avoid "losing" type (or at
least, I've not seen an exception). It's when you start having
to define separate layers, frameworks and such, that it becomes
important.

The suggestion that dynamic_cast is necessary for passing
application-layer objects through lower level transport layers is only
marginally true. It is true that casts are used for this purpose, but
IME they are almost always of the reinterpret_cast variety. They have
to be: A general-purpose transport layer does not know about the higher
layer's object types. The transport mechanism is just sending bits over
a wire, or at most doing some kind of byte-order or line-ending
conversion. For dynamic_cast to be relevant, the transport mechanism
has to know about some base type used by the higher level. The only
case in which I could see this happening is if the transport layer
provides a base type with virtual methods, maybe "serializable_object"
or something, from which higher-layer objects are supposed to derive.
IMO, that kind of design is completely upside-down and unnecessarily
invasive.
 
H. S. Lahman

Yes, but _which_ object? According to you (and I agree) the ordering
should be the responsibility of the collection.

The problem is that the collection classes for the individual
associations can only capture said:
No. The _ordering relation_ is not exposed by the dynamic_cast. The
client has no reason to know whether it's "less" or "greater", or
something more complicated calculated by an oracle from multiple
properties of the object, or even some abstraction not present _in_ the
object at all, such as its position in an external database. All the
client needs to know is that the objects will be presented in the
correct order.

I see your concern, but I don't see it as a significant problem. (There
is also a way to address the concern directly that I'll get to later.)

The client has relationships with two different sorts of objects and
that, quite naturally, are abstracted as two separate binary
associations among peer classes in the OOA. That's because the real
collaboration is the processing that [Client] does with the [ClassA] or
[ClassB] object in hand. Each of those associations is necessarily
ordered in order to solve the problem in hand but that ordering only
applies to the simple binary association.

The fact that both sets of objects also need to be somehow ordered
*together* is a separate problem constraint from the ordering of the
individual collections. That joint constraint spans the associations so
it can (at most) only be partially implemented within the collections
(e.g., as an alternative ordering as suggested in my example). It is
that coordination _between collections_ that appears in the client
because the client logically owns any coordination among its
participation in its associations.

Now one can get around that by providing yet another class to act as a
collection of collections to encapsulate that synchronization:

[Client]
| 1
|
| R1 <<ordered>>
|
| accesses
| 1 1
[HCollection] ------------------------+
| 1 |
| |
| R2 <<ordered>> | R3 <<ordered>>
| |
| coordinates with | coordinates with
| * | *
[ClassA] [ClassB]

While we have conveniently encapsulated the overall ordering constraint,
there are several problems with this. We have introduced [HCollection]
as a peer object (i.e., at a higher level of abstraction than the R2 and
R3 collections) that has no counterpart in the customer domain; it is a
pure OOP implementation entity from the computing domain. So it has no
business being in the OOA solution where functional requirements are
resolved.

The second problem is that for the OP's situation we have obscured the
access to just [ClassA] or [ClassB] objects. That is especially
troublesome when the ordering for those situations is different from the
overall ordering. We can solve that by providing additional direct
associations between [Client] and [ClassA] and [ClassB], but that
complicates the design and introduces more collections to be managed at
the OOP level.

The third problem is that [Client] now receives a stream of both
[ClassA] and [ClassB] objects from [HCollection] but the client needs to
process objects from each class differently. The [Client] could do that
in a typesafe manner if it were navigating R2 and R3 directly, but it
has no way of knowing which type the next object from [HCollection] is.

Since C++ has no builtin facility for managing heterogeneous collections
in a typesafe manner, the convenient way to do this is via dynamic_cast.
But to do that [Client] must understand that [HCollection] manages the
[ClassA] and [ClassB] collections; it must know what types [HCollection]
manages just to properly use dynamic_cast. IOW, [Client] must know who
[HCollection] collaborates with (i.e., [Client] must understand the R2
and R3 associations) and that knowledge is hard-wired into [Client]'s
implementation.

The point is that as soon as one introduces [HCollection] as a peer
problem space entity, one potentially opens a can of worms that creates
implementation dependencies in [Client]'s implementation on software
structure that is removed from its immediate context (i.e., it depends
on [HCollection]'s associations rather than its own associations). Using
dynamic_cast just manifests that more fundamental OOA/D problem.

To put it a different way, the overall ordering constraint needs to be
implemented somewhere. It can't be fully implemented within the binary
associations to [ClassA] and [ClassB] that are defined in the problem
domain. IOW, one must encapsulate the coordination somewhere and the
problem space entity with both associations in common seems like a good
choice.

One could argue that the deficiency of C++ is the root problem. If one
had a language that has builtin, typesafe support of heterogeneous
collections (e.g., Ada), one can encapsulate the synchronization in a
collection. But lacking that, having the client coordinate its binary
associations explicitly is the better choice.
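
A rough C++ sketch of the client doing that coordination itself. [Client], [ClassA] and [ClassB] are the names from the diagram; the integer "size" ordering key, processAll() and the log string are assumptions made only so the example is small and observable.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct ClassA { int size; };
struct ClassB { int size; };

class Client {
public:
    std::string log;  // records the processing order, for illustration only

    // Walks both ordered collections in step, always taking the smaller
    // head next. Each object arrives with its static type intact, so no
    // cast is needed to choose the right processing.
    void processAll(const std::vector<ClassA>& as,
                    const std::vector<ClassB>& bs) {
        std::size_t i = 0;
        std::size_t j = 0;
        while (i < as.size() || j < bs.size()) {
            const bool takeA =
                j == bs.size() ||
                (i < as.size() && as[i].size <= bs[j].size);
            if (takeA) {
                process(as[i++]);
            } else {
                process(bs[j++]);
            }
        }
    }

private:
    void process(const ClassA& a) { log += "A" + std::to_string(a.size) + " "; }
    void process(const ClassB& b) { log += "B" + std::to_string(b.size) + " "; }
};
```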

Having said all this, there is a way to encapsulate the overall ordering
in [HCollection] to satisfy your concerns and not use dynamic_cast. If
we had additional, conditional relationships:

0..1 current for R4 0..1
[Client] ----------------------------- [ClassA]

0..1 current for R5 0..1
[Client] ----------------------------- [ClassB]

to capture the notion of the current object for [Client] to process,
then only one relationship would be "live" (instantiated) at a time.
Then HCollection::getNext() doesn't return an object. Instead it removes
any existing instantiation of R4 and R5 and instantiates the appropriate
relationship based on comparing the sizes of the next object in each
collection. When that action returns the [Client] object then looks to
see which conditional association is instantiated.

This removes any responsibility from [Client] for knowing which object
to process is next. It also serializes the concerns of a heterogeneous
object stream into managing relationship instantiation for the "current"
object.
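
In C++ terms this could be sketched as follows. The conditional associations are modelled as two pointers of which at most one is non-null. The pointer members currentA and currentB, the integer "size" key and the bool return value of getNext() (used here only to signal exhaustion) are assumptions and not part of the model above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct ClassA { int size; };
struct ClassB { int size; };

class Client {
public:
    // R4 and R5: conditional "current" associations. At most one is live.
    const ClassA* currentA = nullptr;
    const ClassB* currentB = nullptr;
};

class HCollection {
public:
    HCollection(std::vector<ClassA> as, std::vector<ClassB> bs)
        : as_(std::move(as)), bs_(std::move(bs)) {}

    // Drops any live R4/R5 link, then links the client to the next object
    // in the overall ordering. Returns false once both collections are
    // exhausted (an assumption; the model only says no object is returned).
    bool getNext(Client& client) {
        client.currentA = nullptr;
        client.currentB = nullptr;

        const bool haveA = i_ < as_.size();
        const bool haveB = j_ < bs_.size();
        if (!haveA && !haveB) return false;

        if (haveA && (!haveB || as_[i_].size <= bs_[j_].size)) {
            client.currentA = &as_[i_++];
        } else {
            client.currentB = &bs_[j_++];
        }
        return true;
    }

private:
    std::vector<ClassA> as_;  // R2 collection, already ordered
    std::vector<ClassB> bs_;  // R3 collection, already ordered
    std::size_t i_ = 0;
    std::size_t j_ = 0;
};
```

After each call the client checks which of the two pointers is set and processes that object with its full static type.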

I might use this solution if I could identify an entity in the problem
space that naturally has the responsibility for managing disparate
entities. For example, the notion of a Queue Manager might be a relevant
concept for a domain expert for doing something like managing messages
in different formats from different sources in a FIFO manner. Then one
rationalizes the R2 and R3 collections as particular source queues and
one renames [HCollection] to be [QueueManager].

However, another good OOA/D practice is to try to minimize conditional
associations because they require additional executable code to manage
and they tend to be more fragile during maintenance. So for something as
simple as the OP's example, I would still probably let [Client] do the
interleaving.

<aside>
Note that this would be even better if one decided to break up [Client]
into different objects for the processing around [ClassA] and [ClassB]
objects. Continuing the example above, that might be the case if the
message processor is subclassed by format, which just happens to map
directly to message source.

Now [QueueManager] instantiates the association to the right subclass
client. This changes the flow of control design because now
[QueueManager] would be the one to trigger the processing by also
sending a message to announce a message was ready to the right format
processor that, in turn, would navigate the relationship. That is, when
it is time to process another message, a <OO> message is sent to
[QueueManager] who selects the right message from among the queues,
instantiates the relationship, and sends a <OO> message to the right
client to do its thing. That's actually a much more OO-like way to
connect the dots of flow of control than telling the client to do its
thing and having the client, in turn, tell [QueueManager] to get the
next widget.
Making flow of control decisions based on problem space properties
will always be more robust during maintenance than making such
decisions on 3GL implementation properties. The goal is to make the
application more maintainable and avoid foot-shooting; not elegance,
reduced keystrokes, minimizing static structure, or even being
convenient for the developer.
Are you really suggesting that there's no correlation between "elegance,
[...] convenient for the developer" and "more maintainable"?

Not quite. I am suggesting that violating good OOA/D practice to provide
elegance, reduced keystrokes, or developer convenience tends to reduce
long-term maintainability. IOW, when the choice is between misusing
dynamic_cast in order to have the convenience of a heterogeneous
collection vs. maintainability, maintainability wins.



--
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
(e-mail address removed)
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
(e-mail address removed) for your copy.
Pathfinder is hiring:
http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH
 
Ian Collins

Kai-Uwe Bux said:
Somehow, I feel that this discussion is suffering from the lack of a
concrete non-trivial case. We would need to have some more or less
real-life OO component (say a library) before us which either (a) uses
dynamic_cast in its implementation or (b) might require client code to use
dynamic_cast. Then, the challenge would be to present a functionally
equivalent (i.e., at least equally powerful) replacement that doesn't do
so.
How about the W3C DOM (http://www.w3.org/TR/DOM-Level-3-Core/core.html)
model I cited earlier?

The DOM provides a family of objects derived from a Node. Each Node
object contains two containers (both of Nodes): a NodeList of child
nodes and a NamedNodeMap of attribute nodes. When parsing or
manipulating a document, most children are Element and Text objects and
the attribute nodes are Attr objects.

One could add the extension member functions of the known child objects
as virtual methods, but there are other classes of objects that derive
from DOM objects in other standards, XHTML for instance.

Like all things XML, DOM is designed to be extensible, so there is no
way of knowing up front which virtual methods would be required. In my
opinion, giving the base class knowledge of its children through virtual
methods is poor design that breaks the fundamental principle of
extensibility.
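
A heavily simplified sketch of what client code tends to look like under such a model. These Node, Element and Text classes are hypothetical stand-ins, not the interface of any real DOM library, and childTagNames() is an invented helper.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, minimal stand-ins for the DOM interfaces.
class Node {
public:
    virtual ~Node() = default;
    std::vector<std::unique_ptr<Node>> children;  // plays the role of NodeList
};

class Element : public Node {
public:
    explicit Element(std::string tag) : tagName(std::move(tag)) {}
    std::string tagName;
};

class Text : public Node {
public:
    explicit Text(std::string d) : data(std::move(d)) {}
    std::string data;
};

// Collects the tag names of the direct child elements. Node knows nothing
// about tag names, so the client asks each child whether it is an Element.
std::vector<std::string> childTagNames(const Node& parent) {
    std::vector<std::string> names;
    for (const auto& child : parent.children) {
        if (const auto* element = dynamic_cast<const Element*>(child.get())) {
            names.push_back(element->tagName);
        }
    }
    return names;
}
```

A class added later by another standard can derive from Element or Node without any change to the Node base class, which is the extensibility argument made above.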
 
Andy Champ

Jeff Schwab wrote:
You don't have any choice in a
pre-stack environment; you can't call a function until you have
initialized the memory controller, so you do a lot of (very orderly)
direct branching.

Not completely true. You have a couple of registers - SP and BP on an
x86 machine spring to mind - which aren't a lot of use without memory.

Except that they can be a handy place to put a return address!

Been there, done that, got the EPROM.

Andy
 
Jeff Schwab

Andy said:
Jeff Schwab wrote:


Not completely true.

What's not true?
You have a couple of registers - SP and BP on an
x86 machine spring to mind - which aren't a lot of use without memory.

Except that they can be a handy place to put a return address!

Right, so you do indirect branching as well as direct. :)
Been there, done that, got the EPROM.

How do you store and use the address in the register? You can't use
call/return. You have to branch explicitly. You can indeed store
addresses in registers, and frequently do; you can even have one level
of "call depth" by using only the lower halves of the
general-purpose registers for local data, left-shifting before a "call,"
and right-shifting after the "return." But none of this gets you around
the branch, or jump, or -- if you're writing C++ code -- the goto.
 
Daniel T.

Juha Nieminen said:
He didn't mention any concrete, implementable design, just some
abstract theoretical stuff, with not even a single line of code.

This is unreal. Of course he didn't post code, the person presenting the
problem didn't post code either, but I don't hear you saying a word
about that.
That's exactly what I asked how to do...

You don't know how to make a virtual function in a base class?
Please show me some concrete C++ code, not just some abstract
theoretical concepts.

Please post a concrete problem.
 
Daniel T.

Juha Nieminen said:
In my case this solution doesn't work for two reasons:

1) The order of the objects may change in the main container (and this
ordering is very relevant). I would have to maintain the same order in
the type-specific containers as well, which in some cases can become
exceedingly difficult. (Certainly more difficult than having to use
dynamic cast in a few places, so trying to do it becomes
counter-productive.)

2) In some cases I have to traverse *all* the objects in the order in
which they are in the main container and perform *type-specific*
operations to them (iow. operations which cannot be specified as virtual
functions in the base class). It would not be enough to traverse the
type-specific containers one after another (because it would mean that
the objects are traversed out-of-order with respect to the main container).

Post the problem statement that requires the above, post some code doing
it and maybe we can provide a better solution.
 
Daniel T.

Robert Martin said:
Sometimes you *know* that the object you are holding is a square. And
yet the static type of that object is Shape*.

Shape* s = someplace.getShape();

You haven't lost the type information because you *know* that s points
to a square. You know it because the runtime logic is set up to make
it impossible for any other outcome. You may have lost the static
type, but you have held it in the dynamic state.

So using a dynamic cast is simply an assertion of what you know to
already be true. You are converting a fact held by dynamic state back
into a static type.

Square* sq = dynamic_cast<Square*>(s);

To suggest that this is bad design is to suggest that it is bad design
to move any static knowledge into dynamic state.

I don't suggest that the above is bad design. In fact I mentioned it as
a valid use of dynamic_cast.

However, that doesn't speak to the point of the issue. In the example
given you have a container of Shapes and you *don't* know which of them,
if any, are Squares.

To do a dynamic_cast in that case is fishing.
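
The two situations can be put side by side in a short sketch. Shape and Square come from the thread; Circle, the side member and the as_square() and count_squares() helpers are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
};

class Square : public Shape {
public:
    double side = 1.0;
};

class Circle : public Shape {};

// Assertion: the caller already knows s is a Square. The cast only turns
// that dynamic fact back into a static type, and a null result would be a
// logic error rather than a normal outcome.
Square& as_square(Shape& s) {
    Square* sq = dynamic_cast<Square*>(&s);
    assert(sq != nullptr && "logic error: expected a Square");
    return *sq;
}

// Fishing: nothing is known about the elements, so every one is probed
// and a failed cast is an expected, routine result.
std::size_t count_squares(const std::vector<Shape*>& shapes) {
    std::size_t n = 0;
    for (Shape* s : shapes) {
        if (dynamic_cast<Square*>(s) != nullptr) ++n;
    }
    return n;
}
```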
 
coal

The suggestion that dynamic_cast is necessary for passing
application-layer objects through lower level transport layers is only
marginally true.  It is true that casts are used for this purpose, but
IME they are almost always of the reinterpret_cast variety.  They have
to be:  A general-purpose transport layer does not know about the higher
layer's object types.  The transport mechanism is just sending bits over
a wire, or at most doing some kind of byte-order or line-ending
conversion.  For dynamic_cast to be relevant, the transport mechanism
has to know about some base type used by the higher level.

I think the compiler should be involved in generating the
code used by the transport layer.

http://preview.tinyurl.com/38femh
 The only
case in which I could see this happening is if the transport layer
provides a base type with virtual methods, maybe "serializable_object"
or something, from which higher-layer objects are supposed to derive.
IMO, that kind of design is completely upside-down and unnecessarily
invasive.

Yes, that would be a mess.
Going back to what I outlined in that prior thread... I don't
use any dynamic_casts so far in terms of the code that is written
programmatically. If a vector<Shape*> is being transmitted, a
constant integer precedes the data for each object. The "compiler"
(my software isn't a compiler, but I think compilers should do what
my stuff does.) outputs a constant integer for each distinct type it
encounters. If the classes are:

Shape
|
Square

the constants output would be:

unsigned int const Shape_Num = 4201;
unsigned int const Square_Num = 4202;

And the Send functions would embed whichever one of those two values
is appropriate into the output stream. The receiving end
uses that value to interpret the subsequent data in the stream.
So far I don't find a need for dynamic_cast in this context. In
general I'm not completely opposed to dynamic_cast, but I like to
avoid it, and a search of what I'm working on turns up no uses of
it.
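
A hand-written sketch of what such tagged Send and receive code might look like. Only the two constants and the name "Send" come from the description above; the Stream alias, the id and side fields and the Receive() function are assumptions for illustration, and this is not the actual generated output.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <vector>

unsigned int const Shape_Num = 4201;
unsigned int const Square_Num = 4202;

// Stand-in for the wire: a flat sequence of 32-bit words (an assumption).
using Stream = std::vector<std::uint32_t>;

class Shape {
public:
    explicit Shape(std::uint32_t id) : id_(id) {}
    virtual ~Shape() = default;
    // Writes the type number first, then this type's data.
    virtual void Send(Stream& out) const {
        out.push_back(Shape_Num);
        out.push_back(id_);
    }
    std::uint32_t id() const { return id_; }
private:
    std::uint32_t id_;
};

class Square : public Shape {
public:
    Square(std::uint32_t id, std::uint32_t side) : Shape(id), side_(side) {}
    void Send(Stream& out) const override {
        out.push_back(Square_Num);
        out.push_back(id());
        out.push_back(side_);
    }
private:
    std::uint32_t side_;
};

// Receiving end: the leading constant says how to interpret what follows,
// so the right concrete type is constructed directly and no cast is needed.
std::vector<std::unique_ptr<Shape>> Receive(const Stream& in) {
    std::vector<std::unique_ptr<Shape>> shapes;
    std::size_t pos = 0;
    while (pos < in.size()) {
        switch (in[pos++]) {
        case Shape_Num:
            shapes.push_back(std::make_unique<Shape>(in.at(pos)));
            pos += 1;
            break;
        case Square_Num:
            shapes.push_back(std::make_unique<Square>(in.at(pos), in.at(pos + 1)));
            pos += 2;
            break;
        default:
            throw std::runtime_error("unknown type number in stream");
        }
    }
    return shapes;
}
```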

Brian Wood
Ebenezer Enterprises
www.webebenezer.net
 
James Kanze

The suggestion that dynamic_cast is necessary for passing
application-layer objects through lower level transport layers
is only marginally true. It is true that casts are used for
this purpose, but IME they are almost always of the
reinterpret_cast variety.

That very much depends on the transport layer, I would think.
They have to be: A general-purpose transport layer does not
know about the higher layer's object types.

And a specialized transport layer only knows a little. It
doesn't take much specialization to ensure that all of the
objects transported have a common base (i.e.
TransportableObject?), and are polymorphic.
The transport mechanism is just sending bits over a wire,

Maybe, maybe not. There's not necessarily a wire; it's sending
objects between two components. There are many different types
of transport layer. (And even if there is a wire, many
protocols will have all transportable objects derive from some
common base type, or a small set of common base types.)
or at most doing some kind of byte-order or line-ending
conversion.

I think you're confusing the transport layer of OSI network
protocols (TCP, etc.) with the transport layer of OO design.
For dynamic_cast to be relevant, the transport mechanism has
to know about some base type used by the higher level. The
only case in which I could see this happening is if the
transport layer provides a base type with virtual methods,
maybe "serializable_object" or something, from which
higher-layer objects are supposed to derive. IMO, that kind
of design is completely upside-down and unnecessarily
invasive.

Why? If you want an object to be transportable, you say so. It
seems to me to be part of the basic principles of static type
checking. You're transporting message objects between higher
level components which have connected to the transport layer.
You don't (and can't) transport just anything.
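
An in-process sketch of that arrangement. TransportableObject is the name suggested above; Channel, LoginRequest, Heartbeat and describe() are hypothetical names used only for illustration.

```cpp
#include <memory>
#include <queue>
#include <string>
#include <utility>

// The only type the transport layer knows about.
class TransportableObject {
public:
    virtual ~TransportableObject() = default;
};

// Transport layer: moves objects between components. It cannot accept
// anything that does not derive from TransportableObject, so "you say so"
// is enforced by static type checking.
class Channel {
public:
    void send(std::unique_ptr<TransportableObject> msg) {
        queue_.push(std::move(msg));
    }
    // Returns a null pointer when nothing is waiting.
    std::unique_ptr<TransportableObject> receive() {
        if (queue_.empty()) return nullptr;
        std::unique_ptr<TransportableObject> msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }
private:
    std::queue<std::unique_ptr<TransportableObject>> queue_;
};

// Higher-level message types, unknown to Channel.
struct LoginRequest : TransportableObject {
    explicit LoginRequest(std::string u) : user(std::move(u)) {}
    std::string user;
};

struct Heartbeat : TransportableObject {};

// The receiving component recovers the concrete type that was lost while
// the object passed through the lower layer.
std::string describe(const TransportableObject& msg) {
    if (const auto* login = dynamic_cast<const LoginRequest*>(&msg)) {
        return "login:" + login->user;
    }
    if (dynamic_cast<const Heartbeat*>(&msg) != nullptr) {
        return "heartbeat";
    }
    return "unknown";
}
```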
 
Jerry Coffin

[ ... ]
There are any number of ways to do it, Dave Mikes mentioned one,
another would be to have a virtual function in Shape that lets Shapes
know that a "change rectangles to yellow" request has been made.

I would posit that (at least as stated) this would constitute quite a
poor design. Rather than "change rectangles to yellow", the request
should be dealt with much more abstractly -- something more like
"highlight interfaces", using a relatively abstract description of the
desired action instead of directly describing its physical
manifestation.

Although I can't say I've seen a use for this specifically in UML, let's
assume for the moment that it really was something you wanted. In that
case, I think stepping through the collection of all the objects and
setting the interfaces to highlighted is only a minor improvement over
stepping through them and setting rectangles to yellow.

Instead, if we want to be able to highlight all the interfaces (or
whatever) we'd share the "highlighted" vs. "normal" state (or perhaps
the current color) among all objects of that type:

// surface_t, color_t and RGB() are assumed to come from whatever
// drawing library is in use.
struct UML_object {
    int x_, y_;
    UML_object(int x, int y) : x_(x), y_(y) {}
    virtual ~UML_object() {}
    virtual void draw(surface_t &) = 0;
};

class UML_interface : public UML_object {
    static color_t color;   // shared by every UML_interface object
public:
    static void highlight(bool state=true) {
        static color_t colors[] = {
            RGB(0,0,0),     // normal
            RGB(255, 0,0)   // highlighted
        };

        color = colors[state];
        // code to force re-draw goes here.
    }

    UML_interface(int x, int y) : UML_object(x, y) {}

    virtual void draw(surface_t &s) {
        // draw "interface" on specified surface, using the shared color
    }
};

class UML_class : public UML_object {
public:
    UML_class(int x, int y) : UML_object(x, y) {}

    virtual void draw(surface_t &s) {
        // draw "class" on specified surface
    }
};

color_t UML_interface::color;

This way we don't need separate containers OR a dynamic_cast to
highlight all your UML_interface objects -- instead, you call:
UML_interface::highlight();
and they all highlight. To change them all back to normal, you call:
UML_interface::highlight(false);

IMO, if you want shared state, it's better to create real shared state
than to force all objects of that type to the same state, independently
of each other.
 
Andy Champ

Jeff said:
What's not true?


Right, so you do indirect branching as well as direct. :)


How do you store and use the address in the register? You can't use
call/return. You have to branch explicitly. You can indeed store
addresses in registers, and frequently do; you can even have one level
of "call depth" by using only the lower halves of the
general-purpose registers for local data, left-shifting before a "call,"
and right-shifting after the "return." But none of this gets you around
the branch, or jump, or -- if you're writing C++ code -- the goto.

Well for one thing I was working in Assembler, not C++. No high level
language that I'm aware of will give you that much control.

The return?

mov ax,esp ; or bp
jmp [ax]

I think it goes. But my assembler books are at work, and I haven't
written much lately.

And no, you can't write assembler without some sort of goto, even when
you are using it to emulate a cleaner structure. OTOH, you don't have a
dynamic_cast either, which is where I started this thread!

Andy
 
