Requesting critique of a C unit test environment

Ben Pfaff

Ben Bacarisse said:
But this was a "by hand" proof in 1977. A machine assisted proof of
the actual code could be expected to inspire a little more confidence.

YMMV of course, but if I could get Donald Knuth to prove my
programs correct "by hand", I'd feel no need for additional
confidence.
 
Phlip

Flash said:
You have missed out internal formal testing which in many environments
is far more complete than acceptance testing. For example, I've worked
on projects where a formal test literally takes a week to complete but
the customer acceptance testing takes only a few hours.

What did y'all do if the "formal" test failed?

What I look for is this: Replicate the failure as a short unit test. Not a
proof - just a stupid test that fails because the code change needed to fix
that formal test isn't there.

The point is to make the fast tests higher value as you go...
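
For example, a made-up sketch in C - the parser and its defect are invented,
but the shape of the test is the point:

    #include <assert.h>
    #include <stdio.h>

    /* Hypothetical unit under test: returns the number of fields parsed,
       or -1 on malformed input.  Imagine the week-long formal test found
       a crash on an empty buffer before the length check was added. */
    static int parse_record(const char *buf, int len)
    {
        if (buf == NULL || len <= 0)   /* the fix the regression test pins down */
            return -1;
        /* ... real parsing elided ... */
        return 1;
    }

    /* The short, "stupid" test: it fails until the fix is present,
       then stays in the fast suite forever. */
    static void test_parse_record_rejects_empty_input(void)
    {
        assert(parse_record("", 0) == -1);
    }

    int main(void)
    {
        test_parse_record_rejects_empty_input();
        puts("OK");
        return 0;
    }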
 
Ben Bacarisse

Ben Pfaff said:
YMMV of course, but if I could get Donald Knuth to prove my
programs correct "by hand", I'd feel no need for additional
confidence.

No, MMIS[1]. I did not intend to disparage Prof. Knuth's "hand
proofs" (what a thought!) but rather to say that the problem he is
referring to is as likely to be that one proves something other than
the program one has written (or later writes) as it is to be that one's
proof is (internally) flawed.

I suspect that he is not entirely happy with the way that quip is used
so often to suggest the pointlessness of proofs[2] (after all, what
did he choose to do with his "Notes on van Emde Boas construction of
priority deques" -- a proof rather than a test implementation!).

[1] "My mileage is similar".
[2] This is not one of those times -- RH was just countering the much
stronger assertion that proof => no need to test.
 
Ian Collins

Flash said:
Ian Collins wrote, On 27/08/07 21:51:

You have missed out internal formal testing which in many environments
is far more complete than acceptance testing. For example, I've worked
on projects where a formal test literally takes a week to complete but
the customer acceptance testing takes only a few hours.
If performed, internal formal testing is still a step away from
developer testing.
Informal testing can be run in any environment the developer has
available. Formal testing, which is the only sort of testing that you
can guarantee will be available and working for those maintaining later,
is another matter.
How so? A unit test suite doesn't just vanish when the code is
released, it is an essential part of the code base.
Acceptance testing has very little to do with proving whether the system
works, it is just to give the customer some confidence.

That depends on your definition of Acceptance tests. In our case, they
are the automated suite of tests that have to pass before the product is
released to customers.
The real worthwhile
formal testing has to be completed *before* doing customer
acceptance testing and done with the correct compiler. At least, this is
the case in many environments, including all the projects where I have
been involved in QA, and on the safety critical project I was involved in.
Again, that depends on your process.
If your customer acceptance testing is sufficient to prove the SW is
sufficiently correct then your customer has either very little trust in
your company or a lot of time to waste. If your customer acceptance
testing is the only testing done with the correct compiler and it is not
sufficient to prove your SW is sufficiently correct then your SW is not
tested properly. At least, not according to any standard of testing I
have come across.

Why? Our acceptance tests are very comprehensive, written by
professional testers working with a product manager (the customer).

It sounds like you don't have fully automated acceptance tests. Wherever
possible, all tests should be fully automated.
 
Richard Heathfield

Ben Bacarisse said:

I suspect that [DEK] is not entirely happy with the way that quip
is used so often to suggest the pointlessness of proofs[2]

[2] This is not one of those times -- RH was just countering the much
stronger assertion that proof => no need to test.

Right. One problem is that they don't always prove what you asked them
to prove. What you actually want to know is "does this program properly
do what I need it to do?", but what a prover actually tells you is
whether program X conforms to a particular expression of specification
Y. It makes no comment whatsoever on whether specification Y
corresponds to wishlist Z. And, very often, such correspondence is far
from perfect.
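
A toy C illustration (everything here is invented): the function below
provably meets specification Y, yet if wishlist Z actually wanted the
smaller value, the proof says nothing about that mismatch:

    #include <assert.h>

    /* Specification Y: the result is >= both arguments and equal to one
       of them.  A prover can confirm that max_of() satisfies this. */
    static int max_of(int a, int b)
    {
        return a > b ? a : b;
    }

    int main(void)
    {
        int r = max_of(3, 7);

        /* Specification Y holds... */
        assert(r >= 3 && r >= 7 && (r == 3 || r == 7));

        /* ...but if wishlist Z really wanted the earlier (smaller)
           deadline, no amount of proving against Y will reveal it. */
        return 0;
    }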
 
Jonathan Kirwan


Perhaps because this is an example of a medical product where that is
felt to be required. There is a list of various kinds of "structural
coverages," those selected commensurate with the level of risk posed
by the software. (Saying something has "coverage," I think, always
implies 100% coverage, too. Not partial. So you either have coverage
or you don't.)

Borrowing from one of the US CDRH PDFs I have lying about:

· Statement Coverage – This criteria requires sufficient test cases
for each program statement to be executed at least once; however,
its achievement is insufficient to provide confidence in a
software product's behavior.

· Decision (Branch) Coverage – This criteria requires sufficient test
cases for each program decision or branch to be executed so that
each possible outcome occurs at least once. It is considered to be
a minimum level of coverage for most software products, but
decision coverage alone is insufficient for high-integrity
applications.

· Condition Coverage – This criteria requires sufficient test cases
for each condition in a program decision to take on all possible
outcomes at least once. It differs from branch coverage only when
multiple conditions must be evaluated to reach a decision.

· Multi-Condition Coverage – This criteria requires sufficient test
cases to exercise all possible combinations of conditions in a
program decision.

· Loop Coverage – This criteria requires sufficient test cases for
all program loops to be executed for zero, one, two, and many
iterations covering initialization, typical running and termination
(boundary) conditions.

· Path Coverage – This criteria requires sufficient test cases for
each feasible path, basis path, etc., from start to exit of a
defined program segment, to be executed at least once. Because of
the very large number of possible paths through a software program,
path coverage is generally not achievable. The amount of path
coverage is normally established based on the risk or criticality
of the software under test.

· Data Flow Coverage – This criteria requires sufficient test cases
for each feasible data flow to be executed at least once. A number
of data flow testing strategies are available.

For potentially high risk software, you may not be permitted simply to use
a different compiler or a different operating system environment, or even
to change the optimization options. As the OP mentioned, it's probably
going to be hard enough just justifying an instruction simulator.

I can easily see a desire for an automated way of demonstrating that
structural testing has achieved one or more of these cases. If I read
the OP right about this, anyway.
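
As a small, invented C illustration of the decision/condition distinction
in the list above: the first two calls below give decision (branch)
coverage of the if, but "y > 0" never takes the value false until the
third call is added, so condition coverage needs more cases (and
multi-condition coverage needs all four).

    #include <stdio.h>

    /* Hypothetical guard with two conditions in a single decision. */
    static int in_quadrant_one(int x, int y)
    {
        if (x > 0 && y > 0)
            return 1;
        return 0;
    }

    int main(void)
    {
        /* Decision (branch) coverage: the decision takes both outcomes. */
        printf("%d\n", in_quadrant_one(1, 1));   /* decision true  */
        printf("%d\n", in_quadrant_one(0, 1));   /* decision false */

        /* Condition coverage needs more: so far "y > 0" has never been
           false (with && it was not even evaluated in the second call). */
        printf("%d\n", in_quadrant_one(1, 0));

        /* Multi-condition coverage would require all four combinations. */
        printf("%d\n", in_quadrant_one(0, 0));
        return 0;
    }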

Jon
 
Ian Collins

Jonathan said:
Perhaps because this is an example of a medical product where that is
felt to be required. There is a list of various kinds of "structural
coverages," those selected commensurate with the level of risk posed
by the software. (Saying something has "coverage," I think, always
implies 100% coverage, too. Not partial. So you either have coverage
or you don't.)
The why was prompted by the posting subject "critique of a C unit test
environment". To my way of thinking (TDD), unit tests are a developer
tool, not formal product tests.


<interesting stuff snipped>
 
Colin Paul Gloster

|--------------------------------------------------------------------------|
|"[..] |
| |
|YMMV of course, but if I could get Donald Knuth to prove my |
|programs correct "by hand", I'd feel no need for additional |
|confidence." |
|--------------------------------------------------------------------------|

Such as the way Donald E. Knuth told Leslie Lamport that TeX would
hardly change at all? From
HTTP://research.Microsoft.com/users/Lamport/pubs/lamport-latex-interview.pdf
:"[..]
[..] When Don
was writing TEX80, he announced that it would be a
reimplementation of TEX78, but he was not going to
add new features. I took him seriously and asked for
almost no changes to TEX itself. [..] However, there were many other
im-
provements that I could have suggested but didn't. In
the end, Don wound up making very big changes to
TEX78. But they were all incremental, and there was
never a point where he admitted that he was willing
to make major changes. Had I known at the begin-
ning how many changes he would be making, I would
have tried to participate in the redesign. [..]
[..]"

Regards,
Colin Paul Gloster
 
Colin Paul Gloster

|------------------------------------------------------------------------|
|"Ben Bacarisse said: |
| |
|> |
|>> Erik Wikström said: |
|>> |
|>> <snip> |
|>> |
|>>> Testing is used to find errors, while formal methods are used to |
|>>> prove that there are no errors, at least that's the goal. So if you |
|>>> can prove that there are no errors why test for them? |
|>> |
|>> "Beware of bugs in the above code; I have only proved it correct, not|
|>> tried it." - Donald E Knuth. |
|> |
|> But this was a "by hand" proof in 1977. A machine assisted proof of |
|> the actual code could be expected to inspire a little more confidence.|
| |
|Why? Presumably the machine that is doing the assisting is itself a |
|computer program. What makes you think the assistance program is |
|correct?" |
|------------------------------------------------------------------------|

Full points to Mister Heathfield.
 
Phlip

Colin said:
Such as the way Donald E. Knuth told Leslie Lamport that TeX would
hardly change at all? From
HTTP://research.Microsoft.com/users/Lamport/pubs/lamport-latex-interview.pdf
:"[..]
[..] When Don
was writing TEX80, he announced that it would be a
reimplementation of TEX78, but he was not going to
add new features. I took him seriously and asked for
almost no changes to TEX itself. [..] However, there were many other
im-
provements that I could have suggested but didn't. In
the end, Don wound up making very big changes to
TEX78. But they were all incremental, and there was
never a point where he admitted that he was willing
to make major changes. Had I known at the begin-
ning how many changes he would be making, I would
have tried to participate in the redesign. [..]

"Principle 4

" * Level out the workload (heijunka). (Work like the tortoise, not the
hare).

"This helps achieve the goal of minimizing waste (muda), not overburdening
people or the equipment (muri), and not creating uneven production levels
(mura)."

http://en.wikipedia.org/wiki/The_Toyota_Way
 
Walter Banks

I will second that, well put.

The Achilles heel of either testing or formal methods is
contaminating the evaluation process with information
from the implementation.

I have seen unit tests contaminated just from knowing
the application area the code was going to be used in.

w..
 
Flash Gordon

Ian Collins wrote, On 28/08/07 04:46:
If performed, internal formal testing is still a step away from
developer testing.

Yes. However, above you said that it should not matter for unit testing
whether you use the same compiler or not. Since unit testing can be, and
often *is*, formal, such a statement is at least misleading. Had you said
that it did not matter for informal testing, and had the OP been asking
about informal testing, you might have a point, but it was never stated
that the unit testing was informal.
How so? A unit test suite doesn't just vanish when the code is
released, it is an essential part of the code base.

Simple. If it is not formal then you (the next developer) have no
guarantee that it is in a usable state. So you, the next developer, have
to fully validate any tests you will rely on during your development.
That depends on your definition of Acceptance tests. In our case, they
are the automated suite of tests that have to pass before the product is
released to customers.

Yes, this could be a matter of definition. To me, an acceptance test is
the customer coming in and witnessing some pre-agreed tests; if they pass,
the customer accepts the SW and/or HW (and pays for it). It has nothing to
do with whether the company is prepared to give the SW to the customer.
Again, that depends on your process.

I've not worked for a company where they would be prepared to try and
get a customer to accept SW before having a decent level of confidence
that it is correct *and* acceptable to the customer.
Why? Our acceptance tests are very comprehensive, written by
professional testers working with a product manager (the customer).

It sounds like you don't have fully automated acceptance tests. Wherever
possible, all tests should be fully automated.

It is not possible for a reasonable cost to fully automate all testing.
On a number of projects I have worked on the formal testing included
deliberately connecting up the system incorrectly (and changing the
physical wiring whilst the SW is running), inducing faults in the HW
that the SW was intended to test, responding either correctly or
incorrectly to operator prompts, putting a plate in front of a camera so
that it could not see the correct image whilst the SW is looking at it,
swapping a card in the system for a card from a system with a different
specification etc. It would literally require a robot to automate some
of this testing, and some of the rest of it would require considerable
investment to automate. Compared to the cost of the odd few man-weeks to
manually run through the formal testing with a competent witness, the
cost of automation would be stupid.

BTW, on the SW I am mainly thinking of there were so few bug reports
that on one occasion when the customer representative came to us for
acceptance testing, a few years after the previous version, both the
customer representative and I could remember all of the fault reports
and discuss why I knew none of them were present in the new version. The
customer representative was *not* a user (he worked for a "Procurement
Executive" and not for the organisation that used the kit), so he would
not have seen it for several years.

If you doubt the quality of the manual testing, then look at how many
50000 line pieces of SW have as few as 10 fault reports from customers
over a 15 year period. Most of those fault reports were in the early
years, and *none* were after the last few deliveries I was involved in.

BTW, if they are still using the SW at the start of 2028 we have a
problem, but that is documented and could easily be worked around.
 
Ian Collins

Flash said:
Ian Collins wrote, On 28/08/07 04:46:

Yes. However, above you said that it should not matter for unit testing
whether you use the same compiler or not. Since unit testing can be, and
often *is*, formal, such a statement is at least misleading. Had you said
that it did not matter for informal testing, and had the OP been asking
about informal testing, you might have a point, but it was never stated
that the unit testing was informal.
We all work from our own point of reference; in mine, unit tests are a
developer tool, so that's why I answered as I did.
Simple. If it is not formal then you (the next developer) have no
guarantee that it is in a usable state. So you, the next developer, have
to fully validate any tests you will rely on during your development.
Again, as one who uses TDD, the tests are always up to date as they
document the workings of the code. All down to process.
Yes, this could be a matter of definition. To me, an acceptance test is
the customer coming in and witnessing some pre-agreed tests; if they pass,
the customer accepts the SW and/or HW (and pays for it). It has nothing to
do with whether the company is prepared to give the SW to the customer.
Ah, that explains a lot!
I've not worked for a company where they would be prepared to try and
get a customer to accept SW before having a decent level of confidence
that it is correct *and* acceptable to the customer.
Neither have I.
It is not possible for a reasonable cost to fully automate all testing.

True, but with care you can automate the majority of them. The beauty
of automated tests is they cost next to nothing to run, so they can be
continuously run against your code repository.
On a number of projects I have worked on the formal testing included
deliberately connecting up the system incorrectly (and changing the
physical wiring whilst the SW is running), inducing faults in the HW
that the SW was intended to test, responding either correctly or
incorrectly to operator prompts, putting a plate in front of a camera so
that it could not see the correct image whilst the SW is looking at it,
swapping a card in the system for a card from a system with a different
specification etc. It would literally require a robot to automate some
of this testing, and some of the rest of it would require considerable
investment to automate. Compared to the cost of the odd few man-weeks to
manually run through the formal testing with a competent witness, the
cost of automation would be stupid.
There you have what I'd call integration testing, something we also do
with any software that interacts with other equipment.
If you doubt the quality of the manual testing, then look at how many
50000 line pieces of SW have as few as 10 fault reports from customers
over a 15 year period. Most of those fault reports were in the early
years, and *none* were after the last few deliveries I was involved in.
I don't doubt it, I just prefer to send my resources elsewhere. We go
through the full manual integration tests for major software releases
(adding acceptance and unit tests to reproduce any bugs found). This
process of feeding back tests into the automated suites makes them
progressively more thorough, to the extent that minor updates can be
released without manual testing, and the testing of major releases finds
few, if any, bugs. Most of the bugs found by the manual testing are due to
differing interpretations of the specification.
 
Flash Gordon

Ian Collins wrote, On 28/08/07 22:03:
We all work from our own point of reference; in mine, unit tests are a
developer tool, so that's why I answered as I did.

You should try to avoid assuming everyone works the same way. In the
defence industry at least it is very common for there to be a lot of
formal unit tests.
Again, as one who uses TDD, the tests are always up to date as they
document the workings of the code. All down to process.

If the process is enforced then the testing is formal and, I would
expect, the results are recorded somewhere the 10th developer after you
will be able to find them.
Ah, that explains a lot!

Acceptance tests are used to accept, simple :)
Neither have I.

So you do your acceptance tests before the customer sees the kit?
True, but with care you can automate the majority of them. The beauty
of automated tests is they cost next to nothing to run, so they can be
continuously run against your code repository.

I fully understand the use of them. However, it is not always either
practical or cost effective. In this case there was no automated test
system available, so if we wanted one we would have had to design,
implement and test it, then write all the test harnesses...

Almost forgot, we would have had to generate and validate a *lot* of
test data instead of just using real kit either with or without faults.

At the end of the day we would also have had to do thorough integration
testing as well. So I still believe doing automated testing would have
been more expensive overall, and certainly would have been a significant
up-front cost.

Note that this SW does a *lot* of HW interaction, since it is actually
the main SW of a piece of 2nd line test equipment.
There you have what I'd call integration testing,

Yes and no. Each set of tests was focused on exercising a specific unit;
it was just using the rest of the SW as a test harness.
something we also do
with any software that interacts with other equipment.

Obviously. We just killed multiple birds with the same high-tech
missile^W^W^Wstone.
I don't doubt it, I just prefer to send my resources elsewhere.

I still don't believe it cost more time overall.
We go
through the full manual integration tests for major software releases
(adding acceptance and unit tests to reproduce any bugs found).

We also added tests to trap the few bugs that were found.
This
process of feeding back tests into the automated suites makes them
progressively more thorough, to the extent that minor updates can be
released without manual testing, and the testing of major releases finds
few, if any, bugs.

We started off by making the tests thorough which is why the testing
takes so long. Due to this and the low bug count almost all releases
whilst I worked at the company were major releases (adding support for
major variants of the kit it tested, testing major new features in new
versions of the kit it tested etc) with only a small number of bug-fix
releases.
Most of the bugs found by the manual testing are due to
differing interpretations of the specification.

Not on this SW. Reviews of requirements caught most of them and reviews
of design most of the remainder. I can only think of one interpretation
issue that was not caught before coding started on this SW.
 
Ian Collins

Flash said:
Ian Collins wrote, On 28/08/07 22:03:

If the process is enforced then the testing is formal and, I would
expect, the results are recorded somewhere the 10th developer after you
will be able to find them.
The results are recorded every time the tests run - either "OK" or
failure messages :)
So you do your acceptance tests before the customer sees the kit?
They are run as soon as the feature they test is complete.
I fully understand the use of them. However, it is not always either
practical or cost effective. In this case there was no automated test
system available, so if we wanted one we would have had to design,
implement and test it, then write all the test harnesses...
If the project is long running, or a family of products is to be
maintained, it can be worth the effort. I preferred to have my test
engineers developing innovative ways to build automatic tests rather than
have them running manual tests. Provided they can produce the tests at
least as fast as the developers code the features, everyone is happy.
Almost forgot, we would have had to generate and validate a *lot* of
test data instead of just using real kit either with or without faults.
I like to capture all of the data generated during manual tests and feed
it back through as part of the automated tests.
Note that this SW does a *lot* of HW interaction, since it is actually
the main SW of a piece of 2nd line test equipment.
The examples I'm referring to were power system controllers.
I still don't believe it cost more time overall.
This project has been running (the product has to continuously evolve to
meet the changing market) for 5 years, so the up-front cost has paid for
itself many times over.
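
On the capture-and-replay point above, a rough sketch in C (the unit, the
file name and the record format are all invented here): data logged during
a manual run is archived and fed back through the unit by an automated test.

    #include <stdio.h>

    /* Hypothetical unit under test: converts a raw sensor reading
       into engineering units. */
    static double scale_reading(int raw)
    {
        return raw * 0.125;
    }

    /* Replay a log captured during a manual test run.  Each line holds
       the raw input and the output accepted by the witness at the time:
       "<raw> <expected>". */
    static int replay_captured_data(const char *path)
    {
        FILE *f = fopen(path, "r");
        int raw, failures = 0;
        double expected;

        if (f == NULL) {
            perror(path);
            return -1;
        }
        while (fscanf(f, "%d %lf", &raw, &expected) == 2) {
            double got = scale_reading(raw);
            if (got < expected - 1e-9 || got > expected + 1e-9) {
                printf("raw %d: expected %g, got %g\n", raw, expected, got);
                failures++;
            }
        }
        fclose(f);
        return failures;
    }

    int main(void)
    {
        /* In the automated suite this would point at the archived capture. */
        return replay_captured_data("manual_run.log") != 0;
    }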
 
Phlip

Flash said:
You should try to avoid assuming everyone works the same way. In the
defence industry at least it is very common for there to be a lot of
formal unit tests.

I think Ian refers to "developer tests". Giving them different definitions
helps. They have overlapping effects but distinct motivations.

The failure of a unit test implicates only one unit in the system, so the
search for a bug should be very easy. The failure of a developer test
implicates the last edit - not that it inserted a bug, but only that it
failed the test suite! Finding and reverting that edit is easier than
debugging.
I fully understand the use of them. However, it is not always either
practical or cost effective.

They are more cost effective than endless debugging!!!
 
Flash Gordon

Ian Collins wrote, On 30/08/07 02:42:
The results are recorded every time the tests run - either "OK" or
failure messages :)

They are only recorded if they are put somewhere that someone can see
them after you have left the company. Otherwise they are only reported.
They are run as soon as the feature they test is complete.

And all re-run after the final line of code is cut, I trust.
If the project is long running,

I started on it in the late 80s, and the last I heard a contract had been
signed giving an option of support until 2020; is that long enough for you?
or a family of products is to be
maintained, it can be worth the effort.

Only half a dozen or so variants, all using over 90% common code.
I preferred to have my test
engineers developing innovative ways to build automatic tests rather than
have them running manual tests.

Ah, but we did not spend vast amounts of time running the tests, not
compared to the time/effort involved in generating the required test
data, automating the tests, and then writing the integration tests
needed to prove it works as an entire system.
Provided they can produce the tests at
least as fast as the developers code the features, everyone is happy.

We did not have the luxury of dedicated test developers. Those
developing the tests were those analysing the requirements, designing
the SW and implementing it.
I like to capture all of the data generated during manual tests and feed
it back through as part of the automated tests.

That would require writing a lot of SW to capture the data. All of which
would have to be tested.
The examples I'm referring to were power system controllers.

I'm talking about 2nd line test equipment for *very* high end camera and
image processing systems. 2nd line is where the customer puts the kit
when it has come back from operation broken.
This project has been running (the product has to continuously evolve to
meet the changing market) for 5 years, so the up-front cost has paid for
itself many times over.

Ah well, the SW I'm referring to changes only every few years due to new
customers or existing customers wanting enhancements to the kit it is to
test. The last set of updates I'm aware of will have started probably in
2001 (maybe 2000) but I had left the company by then. I know we had won
the contract. So definitely over twice as long a period. Requirements
changes also had minimal code impact because we had designed the system
to allow for changes.
 
Ian Collins

Flash said:
Ian Collins wrote, On 30/08/07 02:42:

They are only recorded if they are put somewhere that someone can see
them after you have left the company. Otherwise they are only reported.
The tests are part of the project, in the same source control. Without
the tests, the project cannot build. Building and running the tests is
an integral part of the build process.

The last sentence is important, so I'll repeat it - the unit tests are
built and run each time the module is compiled.
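
A sketch of one common way to wire that up (not necessarily Ian's setup):
the test program's exit status gates the build, so a module cannot be
built into the product while its tests fail.

    #include <stdio.h>

    /* Hypothetical module under test. */
    static int clamp(int v, int lo, int hi)
    {
        if (v < lo) return lo;
        if (v > hi) return hi;
        return v;
    }

    static int failures;

    #define CHECK(expr) \
        do { \
            if (!(expr)) { \
                printf("FAILED: %s (%s:%d)\n", #expr, __FILE__, __LINE__); \
                failures++; \
            } \
        } while (0)

    int main(void)
    {
        CHECK(clamp(5, 0, 10) == 5);
        CHECK(clamp(-1, 0, 10) == 0);
        CHECK(clamp(99, 0, 10) == 10);

        if (failures == 0)
            puts("OK");
        /* Non-zero exit status makes the build step fail. */
        return failures != 0;
    }

The build rule then compiles this runner and executes it right after
compiling the module, aborting the build on a non-zero exit.
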
And all re-run after the final line of code is cut, I trust.
Rerun every build, dozens of times a day for each developer or pair.
 
