"We don't need no steenking headers!"

Luke Meyers

So, just a little while ago I had this flash of insight. It occurred
to me that, while of course in general there are very good reasons for
the conventional two-file header/implementation separation for each C++
class, there are cases in which this paradigm contributes nothing and
simply introduces extra boilerplate overhead.

The particular case I have in mind is CppUnit tests. Each test header
is only ever included by the corresponding implementation file, never
by anything else. The implementation file registers itself with the
test suite, and that's all there is to it. So, I can't think of any
reason at all for there to be two files.

As soon as I thought of this, I asked myself two questions. First, am
I missing anything? Is there some negative consequence that hasn't
occurred to me? Second, assuming this isn't a phenomenon unique to
unit test cases, that there is some general category of classes for
which the same reasoning applies, what are the properties which
determine membership in that category?

I'm interested in hearing others' thoughts on this. My initial
estimation is that such classes will basically be leaf classes with no
additional public interface. Furthermore, there are restrictions on
the circumstances under which one can construct such a class, since no
code outside Foo.cpp will ever even *see* the symbol Foo. The whole
thing could be in an anonymous namespace, even! There is a loophole,
though -- generic code. For example, CppUnit's AutoRegisterSuite is a
template which takes the test class as its type parameter, and
instantiates it. CRTP-based designs would function similarly.
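
A minimal sketch of that loophole, assuming a hand-rolled registry rather than CppUnit itself: TestRegistry, AutoRegister, and FooTest are hypothetical names invented for illustration, not CppUnit's API. The test class sits in an anonymous namespace, and the only code that ever names it is the template instantiation in the same file.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a framework registry (hypothetical; not CppUnit's real API).
class TestRegistry {
public:
    using TestFn = std::function<bool()>;

    static TestRegistry& instance() {
        static TestRegistry registry;  // function-local static sidesteps init-order issues
        return registry;
    }

    void add(std::string name, TestFn fn) {
        tests_.emplace_back(std::move(name), std::move(fn));
    }

    // Runs every registered test and returns the number of failures.
    int runAll() const {
        int failures = 0;
        for (const auto& test : tests_) {
            if (!test.second()) ++failures;
        }
        return failures;
    }

    std::size_t size() const { return tests_.size(); }

private:
    std::vector<std::pair<std::string, TestFn>> tests_;
};

// Generic registrar: takes the suite type as a template parameter and
// instantiates it, so no other file needs to see the suite's name.
template <typename Suite>
struct AutoRegister {
    explicit AutoRegister(const std::string& name) {
        TestRegistry::instance().add(name, [] { return Suite().run(); });
    }
};

// ---- everything below would be the whole of FooTest.cpp; no header ----
namespace {

class FooTest {
public:
    bool run() { return 2 + 2 == 4; }
};

AutoRegister<FooTest> registerFooTest("FooTest");

}  // namespace
```

A runner in some other file would only call TestRegistry::instance().runAll(); it never includes anything that mentions FooTest.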

So... I don't have any dazzling conclusions. I think having half as
many source files to juggle seems like a worthwhile thing in terms of
comprehensibility/maintenance, even if it only takes place in
restricted domains. I don't really see wanting to change my design to
accommodate such a nicety, but it's something I'll be mulling over.

Anyway, as I said... thoughts?

Luke
 
Ian Collins

Luke said:
So, just a little while ago I had this flash of insight. It occurred
to me that, while of course in general there are very good reasons for
the conventional two-file header/implementation separation for each C++
class, there are cases in which this paradigm contributes nothing and
simply introduces extra boilerplate overhead.

The particular case I have in mind is CppUnit tests. Each test header
is only ever included by the corresponding implementation file, never
by anything else. The implementation file registers itself with the
test suite, and that's all there is to it. So, I can't think of any
reason at all for there to be two files.
That depends on where you build your test suites or runners. My CppUnit
headers get included in at least two source files.
As soon as I thought of this, I asked myself two questions. First, am
I missing anything? Is there some negative consequence that hasn't
occurred to me? Second, assuming this isn't a phenomenon unique to
unit test cases, that there is some general category of classes for
which the same reasoning applies, what are the properties which
determine membership in that category?
Implementation classes when one is using the PIMPL idiom?
I'm interested in hearing others' thoughts on this. My initial
estimation is that such classes will basically be leaf classes with no
additional public interface. Furthermore, there are restrictions on
the circumstances under which one can construct such a class, since no
code outside Foo.cpp will ever even *see* the symbol Foo. The whole
thing could be in an anonymous namespace, even! There is a loophole,
though -- generic code. For example, CppUnit's AutoRegisterSuite is a
template which takes the test class as its type parameter, and
instantiates it. CRTP-based designs would function similarly.
Biggest problem I can see is that you are buggered if one of these
classes stops being a leaf. Also if you are using CppUnit, you may want
to include the private header in the test file.
So... I don't have any dazzling conclusions. I think having half as
many source files to juggle seems like a worthwhile thing in terms of
comprehensibility/maintenance, even if it only takes place in
restricted domains. I don't really see wanting to change my design to
accommodate such a nicety, but it's something I'll be mulling over.
Well I guess you could put all the code in the headers (like some
compilers require you to do with templates) and include them all in one
source file for compilation. Not pretty!
 
Luke Meyers

Ian said:
That depends on where you build your test suites or runners. My CppUnit
headers get included in at least two source files.

Hmm, really? I'm curious what that compilation structure looks like,
and whether you feel that it's advantageous. I have a single
TestMain.cpp per suite which builds the runner and uses test registry
magic (read: hidden globals) to discover all the tests, which have
registered themselves privately within their own .cpp files. So there is
no need for anybody to include FooTest.h.
Implementation classes when one is using the PIMPL idiom?

Hmm, certainly a well-known example of a class with no header. I think
this is a different sort of animal, as are for example little functor
structs and other helpers. They're clearly subordinate, little
different from inner classes. The dominant class in such cases still
generally has its own header.
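
For comparison, a minimal sketch of the PIMPL arrangement, with hypothetical names (Widget, Impl) and both "files" shown in one block. The implementation class has no header of its own, but the dominant class still does.

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- widget.h: what clients include ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();  // declared here, defined where Impl is a complete type
    std::string describe() const;

private:
    class Impl;  // declared only; the definition lives in widget.cpp
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp: the only place Impl is defined ----
class Widget::Impl {
public:
    explicit Impl(std::string name) : name_(std::move(name)) {}
    std::string describe() const { return "Widget(" + name_ + ")"; }

private:
    std::string name_;
};

Widget::Widget(std::string name) : impl_(new Impl(std::move(name))) {}
Widget::~Widget() = default;
std::string Widget::describe() const { return impl_->describe(); }
```

Changing Impl's members only forces widget.cpp to recompile, which is the usual motivation for the idiom.
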
Biggest problem I can see is that you are buggered if one of these
classes stops being a leaf.

I think that's a pretty mild buggering -- all you'd wind up doing is
splitting one file into two, no real headache involved. Leaf classes
which become base classes pretty much always require modification,
anyway -- introduction of protected members, virtual member functions,
and such. Otherwise why inherit?
Also if you are using CppUnit, you may want
to include the private header in the test file.

Could you explain what you mean?
Well I guess you could put all the code in the headers (like some
compilers require you to do with templates) and include them all in one
source file for compilation. Not pretty!

Yes, but this carries some pretty well-known drawbacks -- namely,
having to recompile downstream classes every time your implementation
changes, as opposed to only when the interface changes. Also, people
and tools generally expect not to compile header files directly, so
that could lead to headaches. Anyway, this doesn't really have much in
common with the .cpp-only approach I described.

Oh, and there are some other solutions to the template instantiation
problem, by the way.

I recall a while ago someone posted on here a link to an essay about
"Java-style classes in C++" which proposed yet another unconventional
compilation structure. I don't recall being particularly convinced,
but it was food for thought. Anyone else got any interesting (ab?)uses
of the C++ compilation model?

Luke
 
Ian Collins

Luke said:
Hmm, really? I'm curious what that compilation structure looks like,
and whether you feel that it's advantageous. I have a single
TestMain.cpp per suite which builds the runner and uses test registry
magic (read: hidden globals) to discover all the tests, which have
registered themselves privately within their own .cpp files. So there is
no need for anybody to include FooTest.h.
I haven't tried that, maybe I should...
Could you explain what you mean?
How to you unit test as leaf class?
Oh, and there are some other solutions to the template instantiation
problem, by the way.
Use a compiler that doesn't suffer from it :)
I recall a while ago someone posted on here a link to an essay about
"Java-style classes in C++" which proposed yet another unconventional
compilation structure. I don't recall being particularly convinced,
but it was food for thought. Anyone else got any interesting (ab?)uses
of the C++ compilation model?
Well it's not an abuse, but if you are looking to simplify things -
seeing as some compilers can find template definitions in source files,
why can't C++ compilers use search rules to find header files when they
encounter a class? If you're not sure what I'm on about, google for
"php autoload".
 
Phlip

Luke said:
The particular case I have in mind is CppUnit tests. Each test header
is only ever included by the corresponding implementation file, never
by anything else.

That might be an architectural flaw of CppUnit. Rigs like CppUnitLite, using
a TEST_() macro, don't even need a header.
The implementation file registers itself with the
test suite, and that's all there is to it. So, I can't think of any
reason at all for there to be two files.

If it's not an architectural flaw, then why is the header there? It may just
be a flaw of the sample code. Take it out if you don't need it.

Sometimes test suites inherit test suites, particularly to follow the
Abstract Test Pattern. Those cases need headers - if the derived suites are
indeed in separate files. The only reason I can think of not to put them in
the same file is file length.
As soon as I thought of this, I asked myself two questions. First, am
I missing anything? Is there some negative consequence that hasn't
occurred to me? Second, assuming this isn't a phenomenon unique to
unit test cases, that there is some general category of classes for
which the same reasoning applies, what are the properties which
determine membership in that category?

Huh? If you don't share a class between modules, don't put it in a header.
(Someone aware of CppUnit ought to recognize the Refactoring
implications...)
I'm interested in hearing others' thoughts on this.

Try this. Two teams start with two hot-head C++ gurus for team leads. One
decrees:

A. put all method bodies inside their classes
unless profiling reveals you should take
them out

B. put all method bodies outside their classes
unless profiling reveals you should put
them in

One is default inline and the other default out-of-line.
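
A small sketch of the two house styles, using hypothetical classes CounterA and CounterB. Both behave identically; they differ only in where the member function bodies live.

```cpp
// Style A: bodies inside the class, so every member function is implicitly
// inline and the whole class can sit in one file, much like Java.
class CounterA {
public:
    void increment() { ++count_; }
    int value() const { return count_; }

private:
    int count_ = 0;
};

// Style B: only declarations inside the class...
class CounterB {
public:
    void increment();
    int value() const;

private:
    int count_ = 0;
};

// ...with the bodies out of line, normally in a separate .cpp file.
void CounterB::increment() { ++count_; }
int CounterB::value() const { return count_; }
```

Under style B, editing a body touches only the .cpp file; under style A, every file that includes the class definition recompiles.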

Neither boss says to put all classes in .h files (or to put only one class
in each .h file, or anything retarded like that).

Team A's code resembles Java, yet as profiling of runtime and of compile
time reveals bottlenecks, some methods migrate out of their classes, and
into .cpp files.

The reason most teams pick option B is because all their legacy code goes
like that. I suspect that either system is sustainable. System B has much
less paperwork. And programs that abuse templates tend to attract system A.
 
Phlip

Ian said:
That depends on where you build your test suites or runners. My CppUnit
headers get included in at least two source files.

Why? If it's for registering tests, you can write a TEST_() macro in raw
CppUnit, and never register again...
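
A rough sketch of what such a macro could look like, written against a hypothetical mini-rig rather than CppUnit or CppUnitLite (whose real macros differ in detail). TestCase, TestRegistrar, allTests, runAllTests, and MINI_CHECK are all invented for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical mini-rig: one record per registered test.
struct TestCase {
    std::string name;
    void (*body)();
};

// Function-local static so the list exists before any registrar runs.
inline std::vector<TestCase>& allTests() {
    static std::vector<TestCase> tests;
    return tests;
}

// Constructing one of these at namespace scope registers a test.
struct TestRegistrar {
    TestRegistrar(const char* name, void (*body)()) {
        allTests().push_back(TestCase{name, body});
    }
};

// Declares the test function, registers it, then opens its definition,
// so writing a test needs no header and no separate registration line.
#define TEST_(name)                                            \
    static void name();                                        \
    static const TestRegistrar registrar_##name(#name, &name); \
    static void name()

// Minimal assertion for the sketch: a failed check throws.
#define MINI_CHECK(cond)                                   \
    do {                                                   \
        if (!(cond)) throw std::runtime_error(#cond);      \
    } while (false)

TEST_(additionWorks) {
    MINI_CHECK(1 + 1 == 2);
}

TEST_(stringsConcatenate) {
    MINI_CHECK(std::string("a") + "b" == "ab");
}

// Runs every registered test and returns how many of them threw.
inline int runAllTests() {
    int failures = 0;
    for (const TestCase& test : allTests()) {
        try {
            test.body();
        } catch (const std::exception&) {
            ++failures;
        }
    }
    return failures;
}
```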

Luke said:
...I have a single
TestMain.cpp per suite which builds the runner and uses test registry
magic (read: hidden globals)

If it's hidden (a good thing), then it ain't a global (a bad thing!).
Implementation classes when one is using the PIMPL idiom?

Why would one Pimpl test suites that are already discrete and discreet?
 
Luke Meyers

Ian said:
How to you unit test as leaf class?

Assuming this is a typo for "how do you unit test a leaf class," I
think you've hit upon the major flaw/limitation in this approach.
Every class which one is interested in unit-testing (which should be
basically every class) has at least one class which is interested in
including the header for that class so as to use it. Unless one does
something funky like put the test code and production code in the same
compilation unit. Either by putting them both in the same literal
file, or by #including one implementation file from the other (possibly
with an #if TESTING guard around the #include, going in one direction).
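
A sketch of the "same literal file" variant, under stated assumptions: Greeter, greet, and runGreeterTests are hypothetical names, and TESTING would normally come from the build (for example -DTESTING=1) rather than being defaulted in the source as it is here.

```cpp
#include <string>
#include <utility>

#ifndef TESTING
#define TESTING 1  // assumption: defaulted only so this sketch runs standalone
#endif

namespace {

// Production leaf class. The anonymous namespace keeps it invisible to
// every other translation unit, so it needs no header.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::string greet() const { return "Hello, " + name_ + "!"; }

private:
    std::string name_;
};

}  // namespace

// The one entry point other files may call (declared wherever they need it).
std::string greet(const std::string& name) { return Greeter(name).greet(); }

#if TESTING
// The tests share the translation unit, so they can name Greeter directly.
// Returns the number of failed checks.
int runGreeterTests() {
    int failures = 0;
    if (Greeter("Luke").greet() != "Hello, Luke!") ++failures;
    if (Greeter("").greet() != "Hello, !") ++failures;
    return failures;
}
#endif
```

The #include variant is the same idea with the test block moved to its own file that pulls in the implementation file, at the cost of compiling the production code twice.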

Any other ideas to get around this limitation?
Use a compiler that doesn't suffer from it :)

That's one. Individual programmers on a team/within a company are
frequently not entirely at liberty to simply choose whatever compiler
they like for production code, though. But the FAQ mentions some other
options, which I'm sure you've read, and which I've used (with minor
variants of my own devising) to good effect. The practice of
#including the implementation files for templates has some interesting
consequences, like enabling easy control of the size and number of
one's compilation units.
Well it's not an abuse, but if you are looking to simplify things -
seeing as some compilers can find template definitions in source files,

You are referring to the "export" keyword, right?
why can't C++ compilers use search rules to find header files when they
encounter a class? If you're not sure what I'm on about, google for
"php autoload".

Well, I don't think this has very much to do with template exporting,
but it could be readily accomplished (modulo a reasonable approach to
disambiguation) with an added preprocessor step. Use the scripting
language of your choice. I had a peek at php autoload -- doesn't look
like a model that would be doable with the C++ preprocessor as-is.

Luke
 
Phlip

Luke said:
Every class which one is interested in unit-testing (which should be
basically every class) has at least one class which is interested in
including the header for that class so as to use it. Unless one does
something funky like put the test code and production code in the same
compilation unit. Either by putting them both in the same literal
file, or by #including one implementation file from the other (possibly
with an #if TESTING guard around the #include, going in one direction).

Any other ideas to get around this limitation?

Under pure Test Driven Development, you write the client interface you need,
and write unstructured behavior behind that interface, to pass the test.
Then you refactor, frequently testing, until the behavior is structured. So
the behavior could migrate into private classes at file scope, or less, and
it's all still perfectly tested.

Refactoring shouldn't change behavior. You should only do tiny refactors
that you know won't change behavior (including incidental behavior, such as
which order to call a sequence of functions that don't influence each
other). And your test cases will preserve the behavior that you drew up to
the interface.

So the result should be well-tested and well-encapsulated behavior.
 
Luke Meyers

Phlip said:
If it's hidden (a good thing), then it ain't a global (a bad thing!).

I appreciate when people interpret my words a little less rigidly than
this. The global-ness is hidden behind macros and statics and such,
depending on which interface one uses. That's just the way CppUnit
test registries work. All I do is include in each test source file a
line like:

CppUnit::AutoRegisterSuite<TestFoo> suite("moduleName");

Usually I put it in an anonymous namespace, too, because I am that kind
of bear.

Luke
 
Luke Meyers

Phlip said:
That might be an architectural flaw of CppUnit. Rigs like CppUnitLite, using
a TEST_() macro, don't even need a header.

CppUnit doesn't need a header either -- that's my point. My
realization was that I was using the header separation model simply out
of habit, rather than expedience.
Huh? If you don't share a class between modules, don't put it in a header.

What about classes within the same module which use each other? How do
you provide them access to each other's definitions without headers,
unless putting them all in the same source file? Is this a confusion
over the word "module?"
(Someone aware of CppUnit ought to recognize the Refactoring
implications...)

I'm aware that, depending on how the notion of "module" works in my
particular build structure, I may have the option to narrow my public
interface by only making a carefully-selected subset of my header files
visible to other modules.
Try this. Two teams start with two hot-head C++ gurus for team leads. One
decrees:

A. put all method bodies inside their classes
unless profiling reveals you should take
them out

B. put all method bodies outside their classes
unless profiling reveals you should put
them in

One is default inline and the other default out-of-line.

Neither boss says to put all classes in .h files (or to put only one class
in each .h file, or anything retarded like that).

Team A's code resembles Java, yet as profiling of runtime and of compile
time reveals bottlenecks, some methods migrate out of their classes, and
into .cpp files.

The reason most teams pick option B is because all their legacy code goes
like that. I suspect that either system is sustainable. System B has much
less paperwork. And programs that abuse templates tend to attract system A.

It's preposterous to levy runtime performance as the only
consideration. I see no reason to suspect a strong correlation one
way or the other between either strategy and runtime performance. The
chief reason to separate class definitions from implementations, as I
see it, is to create a recompilation firewall. Also, while translation
unit size isn't as big of a concern as it once was, putting
implementation in headers does mean bloating each unit with all those
function definitions. No reason to make the compiler chew on all that
again and again.

Luke
 
Phlip

Luke said:
What about classes within the same module which use each other? How do
you provide them access to each other's definitions without headers,
unless putting them all in the same source file? Is this a confusion
over the word "module?"

Yes. Don't interpret it so rigidly. The C++ Standard has no definition of
module. (And in the other post I thought _you_ were disparaging globals. No
biggie.)

In this case, it means a translation unit. And sometimes modules are
clusters of translation units.
It's preposterous to levy runtime performance as the only
consideration.

I also mentioned recompile times.

However, I was quoting team leads like James Kanze, who tell their minions
to make _everything_ out-of-line unless profiling reveals it should go
inline. Naturally that errs on the side of compile time, but that's not why
he does it.

As a thought experiment, one could grow a project using the opposite rule -
put everything inside a class unless profiling (rebuild times or run times)
reveals it should go out-of-line.

http://c2.com/cgi/wiki?CppHeresy hence
http://c2.com/cgi/wiki?InlineAllMethodsWhereverPossible

(I had no idea the second page was there. Feel free to ignore the first one;
it's obviously just PeterMerel...)

So James Kanze orders his minions to out-of-line everything as a baseline
for profiling.
I see no reason to suspect a strong correlation one
way or the other between either strategy and runtime performance. The
chief reason to separate class definitions from implementations, as I
see it, is to create a recompilation firewall.

C++ allows logical encapsulation to parallel physical encapsulation. So if
the most-frequently-depended-on things are mostly abstract base classes,
then no matter how you abuse the concrete classes, recompiles don't cascade
when you change behavior.
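
A minimal sketch of that arrangement, with hypothetical names (Shape, makeRectangle, Rectangle) and both "files" shown in one block. Clients depend only on the abstract base class and a factory function; the concrete class never appears in a header.

```cpp
#include <memory>

// ---- shape.h: the only thing clients include ----
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

std::unique_ptr<Shape> makeRectangle(double width, double height);

// ---- rectangle.cpp: free to change without recompiling any client ----
namespace {

class Rectangle : public Shape {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {}
    double area() const override { return width_ * height_; }

private:
    double width_;
    double height_;
};

}  // namespace

std::unique_ptr<Shape> makeRectangle(double width, double height) {
    return std::unique_ptr<Shape>(new Rectangle(width, height));
}
```

Rectangle could even have all its bodies inline, as in system A, because nothing outside rectangle.cpp sees them.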

Tragic recompile situations frequently occur in systems that ramble on and
on. Suppose a team of 20 started coding in pure C++, maybe with no tests,
and added lines for a few years. Now the system is huge, a lead programmer
inserted Pimpls as an emergency defense, and the link time is outrageous.
Just as bad as an InlineAllMethodsWhereverPossible project.
Also, while translation
unit size isn't as big of a concern as it once was, putting
implementation in headers does mean bloating each unit with all those
function definitions. No reason to make the compiler chew on all that
again and again.

In this case, the system should have followed "encapsulation is
hierarchical". That means the longer the logical distance between two
elements, the narrower the physical channel between them. The CppHeresy page
recommended breaking things up into modules by using Python as the glue
between them. So now link times are healthy because your soft layers
(Python, an ORB, whatever) defer the high-level linking to run-time.

No steenking headers.
 
