C++ fluency

Jerry Coffin

[ ... ]
Testing, of either kind, can never "prove" anything.

Unless you're postulating some kind of indeterminacy, a test can prove
the outputs that are produced for a given set of inputs. In a lot of
cases, you're not really out to prove anything new or different, but to
prove equivalence (e.g. that a fast computation produces a result that
agrees with a slower version to a specified degree of precision). If you
want to call that something other than a "proof", you're welcome to do
so, but I don't think you accomplish much by doing so.
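
A minimal sketch of the kind of equivalence check described above: a "fast" routine compared against a slower reference to a stated precision. The names `fast_exp`, `max_relative_error` and `agrees_with_reference` are hypothetical, as are the sample range and tolerance; none of them come from the thread. The check only establishes agreement at the sampled inputs, which is the narrow sense of "prove" being argued about.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical "fast" routine: approximates exp(x) as (1 + x/1024)^1024,
// computed with ten squarings instead of a library call.
double fast_exp(double x) {
    double y = 1.0 + x / 1024.0;
    for (int i = 0; i < 10; ++i) y *= y;  // y^(2^10) == y^1024
    return y;
}

// Largest relative error of fast_exp against std::exp over `samples`
// evenly spaced points in [lo, hi] (both end points included).
double max_relative_error(double lo, double hi, int samples) {
    double worst = 0.0;
    for (int i = 0; i < samples; ++i) {
        double x = lo + (hi - lo) * i / (samples - 1);
        double ref = std::exp(x);
        double err = std::fabs(fast_exp(x) - ref) / ref;
        if (err > worst) worst = err;
    }
    return worst;
}

// The equivalence claim under test: agreement to within `tolerance`
// on the sampled range [-1, 1] (range chosen only for illustration).
bool agrees_with_reference(double tolerance) {
    return max_relative_error(-1.0, 1.0, 2001) <= tolerance;
}
```

Under these assumptions, `agrees_with_reference(1e-3)` holds while `agrees_with_reference(1e-6)` does not, so the test pins down the precision actually delivered rather than just "close enough".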
Right. And as soon as you empirically detect it's wrong, the TDD tests are there
to resist bugs as you change the code to make it right.

Yes, but the effect is pretty minimal when threading becomes an
important part of the program -- the tests only deal with the easiest
parts to get right.

[ ... ]
It is beyond obvious that one dumb window timer is not a realtime application
accessing raw hardware. Yet premature concurrency is indeed like premature
optimization - the root of all evil!

Concurrency is an optimization, so premature concurrency is simply one
form of premature optimization.
 
Jerry Coffin

[ ... ]
To be sure that they do fail if the code is broken. It's
standard procedure for regression tests---when you get an error
report from the field, you don't correct the code until you have
a regression test (a unit test) which detects the error (fails).

With a regression test, I can understand it -- but at least as I recall
things, this was about developing new code, meaning:
This doesn't really apply to your initial code, however; the
fact that the test fails if there is no code doesn't prove
anything about whether it tests what it is actually supposed to
test. For the initial code, you need code review, which
includes reviewing the tests for completeness.

Right -- that was pretty much my point.

[ ... ]
If the test is supposed to test something
very specific and non-trivial, it might be worth creating a
special version of the code to be tested, with the exact error
the test is meant to reveal.

Oh yes, that certainly arises -- but I'd rarely depend on simply not
having written the code yet to ascertain that the test was checking for
exactly what I wanted.
This is very important with parts
of the test framework: when I wrote my memory checker, for
example, I intentionally wrote code which leaked memory, to
ensure that the leak was detected. For most of the traditional
unit tests, writing such special code is probably not worth the
effort, but when you have an actual failure, which doesn't show
up in your tests, it's certainly worthwhile making sure that the
test you add detects it before correcting the error.
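
A minimal sketch of that idea, testing the checker itself with code that is known to leak. `AllocationTracker` is a hypothetical, much simplified stand-in for a real memory checker (it does not hook `operator new`), and `leaky_routine`, `clean_routine` and `leaks_reported_for` are likewise invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <set>

// Hypothetical stand-in for a memory checker: records every block it
// hands out and forgets the block again when it is released.
class AllocationTracker {
public:
    ~AllocationTracker() {
        for (void* p : live_) std::free(p);  // reclaim anything still outstanding
    }
    void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        if (p) live_.insert(p);
        return p;
    }
    void release(void* p) {
        if (live_.erase(p) == 1) std::free(p);  // ignore pointers we never issued
    }
    std::size_t outstanding() const { return live_.size(); }
private:
    std::set<void*> live_;
};

// Deliberately broken code: allocates twice, releases once.
void leaky_routine(AllocationTracker& t) {
    void* a = t.allocate(16);
    void* b = t.allocate(32);
    t.release(a);
    (void)b;  // intentionally never released
}

// Correct counterpart: every allocation is released.
void clean_routine(AllocationTracker& t) {
    void* a = t.allocate(16);
    t.release(a);
}

// Runs a routine against a fresh tracker and reports the leak count.
std::size_t leaks_reported_for(void (*routine)(AllocationTracker&)) {
    AllocationTracker t;
    routine(t);
    return t.outstanding();
}
```

The check on `leaky_routine` is the one that validates the tool: if the tracker reported zero for it, a zero for `clean_routine` would mean nothing.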

Right -- as it happens, I've dealt with things somewhat on this order
more than I like to remember. In my day job, I spend quite a bit of time
evaluating documentation, especially to look at things like whether it's
sufficient for somebody to be able to implement a specification. I've
repeatedly had to deal with specifications that said what the code
should do, but failed to specify how to handle errors. Sometimes they've
failed to specify error handling sufficiently, other times they've failed
to mention it at all.
 
Bart van Ingen Schenau

Phlip said:
TDD testing is not QA testing.

Am I to understand you have two different sets of testcases for TDD
testing and QA testing at the unit level?

Could you enlighten me as to what exactly you mean by TDD testing and QA
testing? It seems we might be using slightly different definitions
(which is not all that uncommon when discussing agile methodologies).
If you know the next line of code to write, you must perforce know how
to make a new test fail because it's not there yet.

One test or multiple tests?

And no. I don't always know the test to write. For example, if the next
line only consists of a brace.
Furthermore, it is not possible to test C or C++ code on the level of
individual lines. The smallest granularity you can get is function
level, so that is the level at which I write my tests.
When writing a new function, I would start with several testcases for
the stated purpose, the boundaries and several invalid conditions. Each
testcase will also involve a number of environmental conditions needed
to verify its test-purpose.

When fixing a bug, I would first extend the existing testcases with one
that fails for the same reasons (and under the same conditions) as the
actual problem. Then I change as many lines as needed to make the entire
suite of testcases pass again.
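
A minimal sketch of function-level test cases organised that way: stated purpose, boundaries, invalid conditions, plus one case added for a bug report. The function `parse_percentage` and every test name are hypothetical, and the "field report" behind the regression case is invented purely for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical function under test: parses "0".."100" into *out.
// Returns false for empty input, non-digits, or values above 100.
bool parse_percentage(const std::string& text, int* out) {
    if (text.empty() || text.size() > 3) return false;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    if (value > 100) return false;
    *out = value;
    return true;
}

// Stated purpose.
bool test_typical_value()  { int v = -1; return parse_percentage("42", &v) && v == 42; }
// Boundaries.
bool test_lower_boundary() { int v = -1; return parse_percentage("0", &v) && v == 0; }
bool test_upper_boundary() { int v = -1; return parse_percentage("100", &v) && v == 100; }
bool test_just_above_max() { int v = -1; return !parse_percentage("101", &v) && v == -1; }
// Invalid conditions.
bool test_rejects_empty()    { int v = -1; return !parse_percentage("", &v); }
bool test_rejects_nondigit() { int v = -1; return !parse_percentage("4x", &v); }
bool test_rejects_negative() { int v = -1; return !parse_percentage("-1", &v); }
// Regression case for a hypothetical field report about leading zeros;
// it would be added, and seen to fail, before the fix is made.
bool test_regression_leading_zeros() { int v = -1; return parse_percentage("007", &v) && v == 7; }

// Whole suite: passes only if every case passes.
bool run_all_tests() {
    return test_typical_value() && test_lower_boundary() &&
           test_upper_boundary() && test_just_above_max() &&
           test_rejects_empty() && test_rejects_nondigit() &&
           test_rejects_negative() && test_regression_leading_zeros();
}
```

This example needs no environmental setup at all, which is exactly what the anecdotes that follow say is the expensive part in practice.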
The test should be easy to write. If it's not, you already did
something wrong, so spend your half hour or more fixing that.

The test itself is usually straightforward. The hard part, as I have
experienced, is in establishing the correct environmental conditions.
Some anecdotal evidence:
- In a recent project I worked on a system for playing music files from
a USB stick. After the acceptance tests, we got a failure report that
for a very specific configuration of the filesystem on the stick, we
would select the files for playback in the wrong order. Creating the
testcase itself was trivial (copy-paste of the existing testcase for
that feature), but setting up the correct (simulated) filesystem in the
stubs was a lot more work. Definitely more than half an hour.

- Longer ago, I worked as a test engineer writing black-box
conformance testcases. 90% of the code in each testscript was dedicated
to getting the DUT in a state where we could verify our requirement.
(And no, common code did not help, as the conditions were just different
enough each time.)
Yes. You also run the test before you have your foo structure in
place. If you need a new method, you run the test before the method
exists, and you predict (out loud, to your pair) what the result of
the test run shall be.

And you don't read a "log" - the test failure should appear in your
editor's transcript, along with a clear, visible indicator of passing
or failing.

Just to be clear, I consider the output in a window within your editor
also to be a log.
And I also want a very visible PASS/FAIL indication, which a failure to
build does not give. At best, I consider a failure to build equivalent
to an ERROR verdict (testcase has detected an internal problem).
Why? If the tests are easy to run, then just hit the damn button in
between and keep going.

Because, with my way of writing code, the code would not compile before
I finished writing the function.
I write blocks in the order
- opening brace
- block contents
- closing brace.

I guess, with your insistence on frequent testing, you hit that button
after each character you typed in the editor.
Answered under "fake it till you make it".

"Fake it till you make it" does not tell you which conditions you forgot
to fake and did not test.

Bart v Ingen Schenau
 
James Kanze

Then why don't you run the test and watch it fail?

I do run the test. And watch it fail or succeed, depending on
what I've written. What I don't do is run the test before I've
written it. And I don't write it until I know exactly what the
function should do, so I know what to test.
It seems to me that correctly predicting the failure you get
would validate your mental model of the code's state, and
would improve the value of writing the code. That's the
test-"first".

If the function hasn't been written, the code which tests it
won't compile or link. I fail to see how correctly predicting
that tells me anything about the quality of the test.
I'm not smart enough to do any design work first.

Then you're better off finding a different profession, because
without some design, you'll never produce anything of working
quality.
 
James Kanze

Phlip wrote:

[...]
Are you smart enough to think of all the (boundary-)conditions
you need to test? To me that requires a smarter person than
someone who can write a design.

More to the point, until someone has done some design, you don't
know what functions are needed, nor what they should do, so you
don't know what to test.

Maybe Phlip is just a "coder", developing code to other people's
specifications.
 
Ian Collins

Bart said:
Am I to understand you have two different sets of testcases for TDD
testing and QA testing at the unit level?

No. QA tests aren't unit tests, they are black box acceptance tests.
One test or multiple tests?

And no. I don't always know the test to write. For example, if the next
line only consists of a brace.
Furthermore, it is not possible to test C or C++ code on the level of
individual lines. The smallest granularity you can get is function
level, so that is the level at which I write my tests.

So all your functions do exactly one thing with a single datum?
When writing a new function, I would start with several testcases for
the stated purpose, the boundaries and several invalid conditions. Each
testcase will also involve a number of environmental conditions needed
to verify its test-purpose.

We (those who use TDD) would use a smaller increment. One test, one
stated purpose. One test, one invalid condition and so on.
When fixing a bug, I would first extend the existing testcases with one
that fails for the same reasons (and under the same conditions) as the
actual problem. Then I change as many lines as needed to make the entire
suite of testcases pass again.

That's one practice everyone should be following.
 
James Kanze

Any theory that answers everything, answers nothing.
:)

I don't know. I don't know of anyone who's come up with an
answer either.

And yet, there are multithreaded programs which work correctly.
(Admittedly, not very many.) And programs using floating point
which work correctly.

In the case of floating point, there is a large body of theory
and knowledge, which permits formal proofs in some cases (but
the validity of the proofs still has to be verified). In both
cases (and in every other case I know, for that matter), if you
want quality software, you use a combination of methods:
testing, code review, etc., etc. No one tool is perfect, and
each more or less verifies the others.
Not sure what your question is. You seem to have answered it
yourself: user requirements.

But my user requirements are in terms of e.g. Implement the
protocol defined in RFC xxx. The resulting program must be able
to handle n requests per minute on machine x. The following
logging will be supported... From that, I need to design, to
determine what classes are needed, etc. It's not until I've
specified the requirements for each class in detail that I can
start thinking about unit tests.
 
James Kanze

[...]
Isn't the definition of "silver bullet" "an order of
magnitude boost in productivity in under a decade"? 10% in 10
years..?

That might be considered a silver bullet. But nothing has come
close. And none of the companies I know with high productivity
are using TDD. (FWIW: none of the companies I've seen which have
adopted some form of something called "agile programming" even
know how to measure productivity. So it's hard to say how much
the new techniques achieved. But none of them come anywhere
close to companies at SEI level 3 or better in terms of
productivity, so there are obviously better techniques
available.)
 
James Kanze

James Kanze wrote:
James, you might find the following paper interesting

I'll read it in detail later, but in the very first paragraph,
they explain that they are comparing TDD to something that
they've invented in order to make it look good, so I'm not
positively inspired. It's certain that any methodology will
beat no methodology, and all of the "success stories" I've heard
involving any form of "agile programming" start from no
methodology.
 
Bart van Ingen Schenau

Phlip said:
James is bragging his tests take too long to run. Your fix is the same
as his - to run 0 after every few edits.

Brilliant!

You are equally bragging that you can write any testcase (and supporting
code) in just a few minutes and have them all execute in the blink of an
eye. (albeit a rather slow blink at 20 seconds)

Don't blame me for not believing you. My experience tells me otherwise.

Bart v Ingen Schenau
 
James Kanze

[...]
No - the 10% metric goes from Waterfall or Code-n-Fix to TDD.
Not from excessive unit testing to TDD.

In other words, a "methodology" which has never existed in
practice, and the absence of a methodology (which has been known
bad practice for at least 20 years). The measures would be more
interesting if there was a comparison with currently used good
methodologies. How much would a company at SEI level 3 gain,
for example? (Or would it lose? The fact that companies at
high SEI levels aren't adopting TDD suggests that it actually
reduced productivity.)
I think these "well run companies" (ie companies that people
like Ian, Noah, or me are not smart enough to run) would still
achieve a much higher velocity.

The fact remains that they don't. And a well run company does
evaluate and try new technologies, measuring their impact, and
adopting those that improve something.
And I also question the definition of "bug". If it's
"divergence from the specification", then you might be
underrepresenting the other definition - "features the users
don't like". That rate will go down as your velocity goes up
and you deploy more often, in smaller feature increments.

First, I don't usually speak of bugs, I speak of defects. And a
defect can also be something which makes the code harder to
maintain or to understand.
results immediately anyway.
Yes you do.

Before you test it?
A TDD project can integrate any code change, and could deploy
& release after each 1-line edit.

That's the stupidest claim I've heard to date. If I modify an
interface (for whatever reasons), I can't deploy until the
client code has modified its use of my code.
Please don't tell me this is impossible because I have done
it.

So you've never worked on anything more than a one man toy. Or
you're just lying. Or who knows what. At any rate, you're not
making sense, and you're making impossible claims.
 
James Kanze

I think that's the key - teams using TDD and the other XP
practices can achieve impressive error rates in a shorter time
than teams employing after the fact unit tests and extensive
code review.

Then why aren't they doing it? Why aren't the companies
actually producing high quality code efficiently using TDD?
It may be possible with sufficient vigour to get the same test
coverage with after the fact tests, but they will take a lot
longer to write and may require code coverage tools for
verification. One of the big benefits of TDD is that it makes
testing an enjoyable part of the process rather than a chore.

I won't deny that there may be psychological benefits in some
cases. Although I find a lot of satisfaction in participating
in teams where we get zero errors from the field, having
delivered on time and in budget.
 
James Kanze

James said:
I think a better way of characterizing the problem is that the
various latencies are part of the "input". The problem isn't
that the code behaves differently for the same input; the
problem is that the set of input is almost infinite, and that
you usually have no way of controlling it for test purposes.
That's why you apply Karl Popper's principle of "Science as
Falsification" (search that and the article will be the top
link). Your tests then become attempts to prove your code is
broken. You think of as many ways as you can to cause a
failure in your code. There's little reason to try them ALL,
just come up with a scenario that could break your code in the
various general ways that are possible. For instance, if
you're working with threads you can attempt to cause a
deadlock or race condition. There's surely a few different
paths you can check that would cover most of the failures that
you can generate.

Generally not, when threading is involved.
Even this won't turn up everything though. It simply is a
fact that we can never test our code to 100% certainty. To
argue that we shouldn't do TDD because it's impossible to test
EVERYTHING is really a red herring. If we were to buy it as a
valid argument we'd apply it to testing in general and say
THAT'S a waste of time. Instead, what we do is come up with
methods to intelligently test so that we're wrong as little as
possible. Yes, much *THINKING* has to go into what we want to
test.

Certainly. Nothing is 100% certain, and I'm certainly not
saying that you should forego testing of threaded code, just
because there are certain things the tests can't be guaranteed
to pick up. I was just responding to the statement that you
never write a line of code except as a result of a test that
failed. There are requirements that can't be tested, and you
write code to handle them, even if you can't create a test which
is guaranteed to fail. (More precisely, you write code which
you hope handles them, and you use other methods---which aren't
100% either---to verify it.)

And of course, there are major issues which testing doesn't
address at all: readability or maintainability. I presume that
you use other techniques (code review, pair programming, etc.)
for these.
Of course, what you're doing is taking my statement that I
"know" my code is good from that point and considering it too
literally.

What I'm trying to do is simply make you realize that TDD isn't
a miracle solution. Testing is certainly necessary---we both
agree with that. And at many levels. There are different
solutions to organizing it, and in the end, the important thing
is that it is done, not the details of the solution which
ensures that it is done. If TDD (as you have explained it, not
as some of its other proponents seem to be explaining it) works
for you, to ensure
that adequate testing takes place, fine. I don't say you
shouldn't use it. Just that it is not the only possible
solution, and that it (like all of the other solutions) requires
additional measures, e.g. to ensure that it is actually used,
and used correctly (tests sufficiently complete, etc.), and that
other non-testable requirements (readability, etc.) are met.
 
James Kanze

I don't know about Windows, but most if not all Unix variants
have a real-time scheduling class.

Which does what? (And where do you get it? None of the Unix
variants I know have any classes---their interface is pure C.)
 
Ian Collins

Release to whom? Any code should still be run through its acceptance
tests before it is released into the wild.
That's the stupidest claim I've heard to date. If I modify an
interface (for whatever reasons), I can't deploy until the
client code has modified its use of my code.

True, the same would apply to a TDD team: they can't integrate code that
doesn't compile and pass its tests.

But they have a very high degree of confidence that the change hasn't
broken anything. Which would also be true of any team whose process
mandates comprehensive unit tests.

It should be a goal of any project to maintain a releasable main line.
TDD teams tend to integrate more often than most (because they don't
have to wait until unit tests are written). I encourage pairs to
integrate after each new test passes.
 
Ian Collins

James said:
Which does what? (And where do you get it? None of the Unix
variants I know have any classes---their interface is pure C.)

What do scheduling classes have to do with C?

At least on Solaris a real time thread will not be time-sliced or
preempted by anything other than a higher priority real time thread. So
they are ideal for simulating RTOS behaviour.
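
For readers unsure what is meant here: POSIX itself speaks of scheduling "policies" rather than classes, and `SCHED_FIFO` is the policy that matches the behaviour described (a thread runs until it blocks, yields, or is preempted by a higher-priority thread). The sketch below only queries the priority range the system offers for that policy, through plain C calls usable from C++. The helper name `fifo_priority_range` is hypothetical, and the sketch makes no claim about how Solaris implements its real-time class.

```cpp
#include <cassert>
#include <sched.h>

// Queries the priority range the system offers for SCHED_FIFO.
// Returns false if either query fails or the range is empty.
bool fifo_priority_range(int* lo, int* hi) {
    *lo = sched_get_priority_min(SCHED_FIFO);
    *hi = sched_get_priority_max(SCHED_FIFO);
    return *lo != -1 && *hi != -1 && *lo <= *hi;
}
```

Actually moving a process or thread into the policy is a separate step that normally requires elevated privileges, so it is left out of the sketch.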
 
James Kanze

[ ... ]
I think a better way of characterizing the problem is that
the various latencies are part of the "input". The problem
isn't that the code behaves differently for the same input;
the problem is that the set of input is almost infinite, and
that you usually have no way of controlling it for test
purposes.
I have no problem with that characterization. In fact, I'd
consider it useful in at least one respect -- it tends to
emphasize the similarity between this type of problem, and
some others that prevent testing due to inputs that are too
large to test exhaustively, or even sample meaningfully.

That is, in fact, the reason I used this way of characterizing
it.
In both cases, you can exercise some control. That's rarely of
much use though.
As you said (or at least implied) elsethread, to be useful,
testing has to progress from the complexity of the code itself
toward something that's ultimately so simple we trust it
without testing.

That's really the key for just about everything.
In the case of attempting to demonstrate subtle thread
synchronization problems, the progression is in the opposite
direction -- even when a problem is quite obvious, getting it
to manifest itself within a reasonable period of time is often
quite difficult, and the code to do so substantially more
complex than the original code, or a corrected version
thereof.

Yes. I know I found a threading bug in the g++ implementation
of std::string (bug number 21334). I found it by code review.
I know it is there. But I don't know how to reliably reproduce
it; in fact, it requires such a set of coincidences in the timing
that I don't think it's ever actually been triggered. But it's
an error, none the less.
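
The idea that timing is part of the "input" can be sketched with a small deterministic model instead of real threads. Two modelled threads each increment a shared counter in two steps (load, then store), and every possible interleaving is enumerated. All names here (`run_schedule`, `explore_all_schedules`, `InterleavingResult`) are hypothetical, and the model is far simpler than the std::string case mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Outcome counts over all interleavings of the two modelled threads.
struct InterleavingResult {
    int total;         // number of distinct schedules explored
    int lost_updates;  // schedules where the counter ends below 2
};

// Each modelled thread performs two steps on a shared counter:
//   step 0: tmp = counter       (load)
//   step 1: counter = tmp + 1   (store)
// A schedule such as "ABAB" says which thread runs its next step.
int run_schedule(const std::string& schedule) {
    int counter = 0;
    int tmp[2] = {0, 0};
    int next_step[2] = {0, 0};
    for (char who : schedule) {
        int t = (who == 'A') ? 0 : 1;
        if (next_step[t] == 0) tmp[t] = counter;
        else counter = tmp[t] + 1;
        ++next_step[t];
    }
    return counter;
}

// Visits every distinct ordering of two A-steps and two B-steps.
InterleavingResult explore_all_schedules() {
    InterleavingResult r = {0, 0};
    std::string schedule = "AABB";  // sorted, so next_permutation covers all orderings
    do {
        ++r.total;
        if (run_schedule(schedule) != 2) ++r.lost_updates;
    } while (std::next_permutation(schedule.begin(), schedule.end()));
    return r;
}
```

In this model there are six schedules and four of them lose an update. A real test does not get to pick the schedule, and with more steps per thread the number of schedules grows combinatorially, which is why such a defect can exist without a test ever triggering it.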
 
Ian Collins

James said:
Then why aren't they doing it? Why aren't the companies
actually producing high quality code efficiently using TDD?

Mine are.
I won't deny that there may be psychological benefits in some
cases. Although I find a lot of satisfaction in participating
in teams where we get zero errors from the field, having
delivered on time and in budget.

So do I and I bet I had more fun writing it!
 
James Kanze

What do scheduling classes have to do with C?

I understood "class" as the C++ keyword. What do you mean by
it? (I can't find any use of the word in the Posix
documentation.)
At least on Solaris a real time thread will not be time-sliced
or preempted by anything other than a higher priority real time
thread. So they are ideal for simulating RTOS behaviour.

Different Unix variants can (and often will) support different scheduling
policies. I don't quite see how this affects testability; the
question is what happens when thread A interrupts thread B at
any given instant (say because thread A has a higher priority
than thread B).

For a simple example of the sort of thing I'm talking about, see
g++ bug 21334 (which I found by code review, not by testing).
 
James Kanze

Of course it's not the only design tool I'm using.
In order to understand the use of *Test Driven Development* in
the terms of the only area of research I've seen it used, you
need to learn about Agile development. Of course, many things
are meant by Agile development but there's plenty out there to
read about it and there's certainly many general themes that
are shared by most Agile methods.

Yes. There is, in fact, too much "agile" out there. If it's
not "agile" and doesn't use OO, it won't sell.

Regretfully, unlike the case of OO, I didn't learn agile before
it got to that state, so it's hard for me to talk about it.
What I actually know about it is all marketing blather, which
doesn't mean much. (I don't like the fact that it uses a
loaded, but meaningless word, to describe itself. Good
programmers were "agile" long before it became an in
methodology.) And there's really too much "agile" now to learn
it---how do you separate the wheat from the chaff?

My impression of TDD started out much like that, and my
impression is that some of the other posters are still
presenting it like that. But you have been clear that you are
talking about a specific tool, to solve a specific problem, so
we can speak rationally about it. And maybe I'll learn
something.
Architecture is, of course, one of those areas where people
are still trying to figure out what to do. I know a web
search for Agile and Architecture turns up a video, because I
watched it.
What I do is write prototypes. Prototypes are not tested and
aren't expected to fully function. I still write unit tests
though and still generally use TDD but in the prototyping
stage I'm prepared to throw much out and sometimes break out
of better habits. It's a tool for thinking about the problem
(remember, I am NOT the person that said unit tests help you
stop thinking).

Yes. In a way, a prototype is a test of the architecture.
(I've also seen prototypes used as tests of the specifications.
This is particularly useful for graphic applications, where the
user often really doesn't know what he wants until he sees it.)

Another type of test. And we agree that tests are a good thing.
And in a real way, I guess, using a prototype to fix the
specifications could be called test driven design, or at least
test first design. You don't commit to the final specifications
until you've "tested" them with the user. Of course, I first
learned about this technique for developing user interfaces back
in the '70s, when it was called simply "prototyping".
To describe the process let's say I'm writing a simple drawing
program. It's a GUI so I already know certain things:
Document/View or MVC or something generally like one of those;
Command; I probably have a few state machines, especially for
the canvas controlling aspect; etc... It's not like I enter
each problem entirely blind, I have past experience helping me
solve similar problems.

Yes. The "in" word for that is "design patterns". In this
case, IMHO, the word is associated with something new (or at
least it was for me): the patterns have a name, and are more or
less formalized, whereas before I had heard of design
patterns, it was all very ad hoc (but I did draw on my
experience).
I scratch out a UML diagram that I think will generally work.
Now I have a few class names so I create a project solution
(VS), a unit test project within it, and files with those
names. I have a general idea how many of the above mentioned
patterns look in the generic sense so I start there. When
specifics come up I address them by writing a unit test for
whatever feature I need from that unit. I test the view by
making sure "View" is an abstraction that doesn't require GUI
and mocking it with some sort of streamer.
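
A minimal sketch of what testing through an abstract "View" mocked by a streamer might look like. The interface, the `StreamView` test double, and the `draw_rectangle` function are all hypothetical; the post does not show its actual classes.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical abstraction: drawing logic only ever talks to this
// interface, so no GUI toolkit is needed to exercise it.
class View {
public:
    virtual ~View() = default;
    virtual void draw_line(int x0, int y0, int x1, int y1) = 0;
};

// Test double: instead of painting, it streams a textual record of
// each call, which a test can compare against the expected text.
class StreamView : public View {
public:
    explicit StreamView(std::ostream& out) : out_(out) {}
    void draw_line(int x0, int y0, int x1, int y1) override {
        out_ << "line " << x0 << ',' << y0 << ' ' << x1 << ',' << y1 << '\n';
    }
private:
    std::ostream& out_;
};

// Code under test: draws an axis-aligned rectangle as four lines.
void draw_rectangle(View& view, int left, int top, int right, int bottom) {
    view.draw_line(left, top, right, top);
    view.draw_line(right, top, right, bottom);
    view.draw_line(right, bottom, left, bottom);
    view.draw_line(left, bottom, left, top);
}

// Test helper: renders into a string through the stream-backed view.
std::string render_rectangle(int left, int top, int right, int bottom) {
    std::ostringstream out;
    StreamView view(out);
    draw_rectangle(view, left, top, right, bottom);
    return out.str();
}
```

A production view would implement the same interface against the real canvas; the unit test only needs the string returned by `render_rectangle`.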
At any rate, we don't just take a blank slate and instantly
start writing production code based on nothing. The company I
am working for just started a new product in fact and it's
taken us more than a month to come up with an architecture
we're confident will adapt well. Believe it or not, it's
simpler than any of the various others we tried and leaves
much to decide later. Trying to answer too much too early can
lead to monsters. As they say in the video I mentioned,
"Answer only what makes the rest of the answers easy."
To conclude, I don't know where people got the idea that unit
tests are the *only* design tool that people using TDD use.
If I said I use UML to create my designs would you come to the
same assumption??

Without more information, perhaps. UML is also an acronym (and
while a very useful tool, oversold in some milieux---and overly
condemned in others).
I have a stack of books 5 feet high on design, patterns,
etc... I have UML, prototyping, unit tests, and my
experience. I have MANY tools. Unit tests are a single tool
in my arsenal, but a very important one. Designs are
emergent, they come from the problem you're trying to solve,
and unit tests help explore and expose that design.

It sounds like our basic strategies aren't all that different.
 
