Re: "Strong typing vs. strong testing"

TheFlyingDutchman

You can't have it both ways.  Either I am calling it incorrectly, in
which case I should get a compiler error, or I am calling it correctly,
and I should get the right answer.  That I got neither does in fact
falsify the claim.  The only way out of this is to say that
maximum(8589934592, 1) returning 1 is in fact "correct", in which case
we'll just have to agree to disagree.

1) long trying_to_break_maximum = 8589934592;
2) /* compiler adds */
   int created_to_allow_maximum_call = (int) trying_to_break_maximum;
3) maximum(created_to_allow_maximum_call, 1);

I think we have to agree to disagree, because I don't see the lack of
a compiler error at step 2 as a problem with the maximum() function.
 
Lie Ryan

"in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer. "

Dynamic typed languages like Python fail in this case on "Never blows
up".

How do you define "Never blows up"?

Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
up, and of the worst kind since it passes silently.
 
Pascal Costanza

"in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer. "

Dynamic typed languages like Python fail in this case on "Never blows
up".

They don't "blow up". They may throw an exception, on which you can act.
You make it sound like a core dump, which it isn't.


Pascal
 
Ian Collins

int maximum(int a, int b);

int foo() {
    int (*barf)() = maximum;
    return barf(3);
}

This compiles fine for me. Where is the cast? Where is the error message?
Are you saying barf(3) doesn't call maximum?

Try a language with stricter type checking:

CC /tmp/u.c
"/tmp/u.c", line 7: Error: Cannot use int(*)(int,int) to initialize
int(*)().
"/tmp/u.c", line 8: Error: Too many arguments in call to "int(*)()".
 
TheFlyingDutchman

How do you define "Never blows up"?

Never has execution halt.

I think a key reason for the big rise in the popularity of interpreted
languages is that when execution halts, they normally give a call
stack and usually a good reason why things couldn't continue.
Compiled languages, by contrast, present you with a blank screen
and force you to fire up a debugger or, much much worse, look at a
core dump to try to discern all the information the interpreter
presents to you immediately.

Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
up, and of the worst kind since it passes silently.

If I had to choose between "blow up" or "invalid answer" I would pick
"invalid answer".

In this example RG is passing a long literal greater than INT_MAX to a
function that takes an int and the compiler apparently didn't give a
warning about the change in value as it created the cast to an int,
even with the option -Wall (all warnings). I think it's legitimate to
consider that an option for a warning/error on this condition should
be available. As for the compiler generating code that checks for a
change in value at runtime when a number is cast to a smaller data
type, I think that's also a legitimate request for a C compiler option
(in addition to other runtime check options like array subscript out
of bounds).
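
Absent such a compiler option, the runtime check can be written by hand. What follows is only a sketch: narrow_to_int() and maximum_checked() are hypothetical helpers invented for illustration, and it assumes a platform where int is narrower than long long (e.g. 32-bit int), so that 8589934592 does not fit. GCC and Clang also have a -Wconversion option that warns about implicit narrowing at compile time.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* The function under discussion, unchanged. */
int maximum(int a, int b) { return a > b ? a : b; }

/* Hypothetical helper: store v in *out only if it fits in an int. */
bool narrow_to_int(long long v, int *out)
{
    assert(out != NULL);
    if (v < INT_MIN || v > INT_MAX)
        return false;          /* value would change: refuse */
    *out = (int)v;             /* safe: v is within int range */
    return true;
}

/*
 * Hypothetical wrapper: reports failure instead of silently converting
 * an out-of-range argument. *result is written only on success.
 */
bool maximum_checked(long long a, long long b, int *result)
{
    int ia, ib;
    assert(result != NULL);
    if (!narrow_to_int(a, &ia) || !narrow_to_int(b, &ib))
        return false;
    *result = maximum(ia, ib);
    return true;
}
```

With this wrapper, maximum_checked(8589934592LL, 1, &r) returns false rather than handing back a wrong answer, so the caller has to decide what to do about it.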
 
Nick Keighley

I would even go further.

Types are only part of the story.  You may distinguish between integers
and floating points, fine.  But what about distinguishing between
floating points representing lengths and floating points representing
volumes?  Worse, what about distinguishing and converting floating
points representing lengths expressed in feet and floating points
representing lengths expressed in meters?

fair points

If you start with the mindset of static type checking, you will consider
that your types are checked and if the types at the interface of two
modules match you'll think that everything's ok.  And six months later
your Mars mission will crash.

do you have any evidence that this is actually so? That people who
program in statically typed languages actually are prone to this "well
it compiles so it must be right" attitude?

On the other hand, with the dynamic typing mindset, you might even wrap
your values (of whatever numerical type) in a symbolic expression
mentioning the unit and perhaps other metadata, so that when the other
module receives it, it may notice (dynamically) that two values are not
of the same unit, but if compatible, it could (dynamically) convert into
the expected unit.  Mission saved!

they *may* do this but do they *actually* do it? My (limited)
experience of dynamically typed languages is that every now and again you
attempt to apply an operator to the wrong type of operand and kerblam!
If your testing is inadequate then it's inadequate whatever the
typiness of your language.
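
For what it's worth, the "value wrapped with its unit" scheme is not tied to dynamic typing; a rough C sketch is possible too. Everything here (quantity_t, unit_t, convert(), add_lengths_m()) is made up for illustration and covers only feet, meters and one volume unit. The unit check happens at run time, as in the scheme quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical unit tags; a real system would need many more. */
typedef enum { UNIT_METERS, UNIT_FEET, UNIT_CUBIC_METERS } unit_t;

/* A number that carries its unit along with it. */
typedef struct {
    double value;
    unit_t unit;
} quantity_t;

/* Convert q to `target`; returns false if the units are incompatible. */
bool convert(quantity_t q, unit_t target, quantity_t *out)
{
    const double meters_per_foot = 0.3048;  /* international foot */
    assert(out != NULL);
    if (q.unit == target) {
        *out = q;
        return true;
    }
    if (q.unit == UNIT_FEET && target == UNIT_METERS) {
        out->value = q.value * meters_per_foot;
        out->unit = target;
        return true;
    }
    if (q.unit == UNIT_METERS && target == UNIT_FEET) {
        out->value = q.value / meters_per_foot;
        out->unit = target;
        return true;
    }
    return false;  /* e.g. a length cannot become a volume */
}

/* Add two lengths, converting both to meters first. */
bool add_lengths_m(quantity_t a, quantity_t b, quantity_t *sum)
{
    quantity_t am, bm;
    assert(sum != NULL);
    if (!convert(a, UNIT_METERS, &am) || !convert(b, UNIT_METERS, &bm))
        return false;
    sum->value = am.value + bm.value;
    sum->unit = UNIT_METERS;
    return true;
}
```

Adding 10 feet to 1 meter succeeds after conversion, while adding a length to a volume is refused. Whether anyone actually writes code this way is, of course, the question raised above.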
 
Nick Keighley

Never has execution halt.

I think a key reason for the big rise in the popularity of interpreted
languages is that when execution halts, they normally give a call
stack and usually a good reason why things couldn't continue.
Compiled languages, by contrast, present you with a blank screen
and force you to fire up a debugger or, much much worse, look at a
core dump to try to discern all the information the interpreter
presents to you immediately.




If I had to choose between "blow up" or "invalid answer" I would pick
"invalid answer".

there are some application domains where neither option would be
viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
chemicals, nuclear power generation. Hell you'd expect better than
this from your phone!
 
Pascal Bourguignon

TheFlyingDutchman said:
In this example RG is passing a long literal greater than INT_MAX to a
function that takes an int and the compiler apparently didn't give a
warning about the change in value as it created the cast to an int,
even with the option -Wall (all warnings). I think it's legitimate to
consider that an option for a warning/error on this condition should
be available. As for the compiler generating code that checks for a
change in value at runtime when a number is cast to a smaller data
type, I think that's also a legitimate request for a C compiler option
(in addition to other runtime check options like array subscript out
of bounds).

I think that it's a legitimate request, in this day and age, for a C
programmer to require that silently accepting this and similar cases
NOT be an option for a C compiler.

(And we should just kill all the programs that don't pass this check,
which I'm afraid would be a big number, which, I understand, is the
reason why C compilers don't change.)
 
Pascal Bourguignon

Nick Keighley said:
do you have any evidence that this is actually so? That people who
program in statically typed languages actually are prone to this "well
it compiles so it must be right" attitude?

Yes, I can witness that it's in the mind set.

Well, the problem being always the same, the time pressures coming from
the sales people (who can sell products of which the first line of
specifications has not been written yet, much less of code), it's always
a battle to explain that once the code is written, there is still a lot
of time needed to run tests and debug it. I've even had technical managers,
who should know better, expecting that we write bug-free code in the
first place (when we didn't even have a specification to begin with!).

they *may* do this but do they *actually* do it? My (limited)
experience of dynamically typed languages is that every now and again you
attempt to apply an operator to the wrong type of operand and kerblam!
If your testing is inadequate then it's inadequate whatever the
typiness of your language.

Unfortunately, a lot of programmers in dynamic programming languages
were trained on static programming languages and bring their old
mindset with them. Moreover, when the syntax of the newer dynamic programming
languages is explicitly designed to resemble an older static programming
language, in order to attract these programmers toward the better
technologies, this does not help change the mindset either.

Unfortunately, you can write FORTRAN code in any programming language.

But my point is that at least with dynamic programming languages,
there's an alternative mindset and it is easier to implement such
a scheme than with static programming languages.

In Lisp, which stresses the symbolic computing part (S-expr are Symbolic
Expressions), it is almost trivial to implement.
 
TheFlyingDutchman

there are some application domains where neither option would be
viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
chemicals, nuclear power generation. Hell you'd expect better than
this from your phone!

I wasn't speaking generally, just in the case of which of only two
choices RG's code should be referred to - "blowing up" or "giving an
invalid answer".

I think error handling in personal computer and website software has
improved over the years but there is still some room for improvement,
as you will still get error messages that tell you nothing you can
relay to tech support beyond the fact that an error occurred or that
some operation can't be performed.

But I worked with programmers doing in-house software who were
incredibly turned off by exception handling in C++. I thought that
meant that they preferred to return and check error codes from
functions as they had done in C, and for some of them it did seem to
mean that. But for others it seemed that they didn't want to
anticipate errors at all ("that file is always gonna be there!"). I
read a Java book by Deitel and Deitel and they pointed out what might
have led to that attitude - the homework and test solutions in
college usually didn't require much if any error handling - the
student could assume files were present, data was all there and in the
format expected, user input was valid and complete, etc.
 
Nick Keighley

I wasn't speaking generally, just in the case of which of only two
choices RG's code should be referred to - "blowing up" or "giving an
invalid answer".

I think I'd prefer termination if those were my only choices. What's
the rest of the program going to do with the wrong result? When the
program finally gives up the cause is lost in the mists of time, and
those are hard to debug!

I think error handling in personal computer and website software has
improved over the years but there is still some room for improvement,
as you will still get error messages that tell you nothing you can
relay to tech support beyond the fact that an error occurred or that
some operation can't be performed.

But I worked with programmers doing in-house software who were
incredibly turned off by exception handling in C++. I thought that
meant that they preferred to return and check error codes from
functions as they had done in C, and for some of them it did seem to
mean that. But for others it seemed that they didn't want to
anticipate errors at all ("that file is always gonna be there!").

that was one of the reasons I liked exceptions. If my library threw an
exception then the caller *had* to do something about it. Even to
ignore it he had to write some code.

I read a Java book by Deitel and Deitel and they pointed out what might
have led to that attitude - the homework and test solutions in
college usually didn't require much if any error handling - the
student could assume files were present, data was all there and in the
format expected, user input was valid and complete, etc.

plausible. Going from beginner to <whatever> I probably steadily
increased the pessimism of my code. The file might not be there. That
other team might send us syntactically invalid commands. Even if it
can't go wrong it will go wrong. Fortunately my college stuff included
some OS kernel stuff. There anything that can go wrong will go wrong.
 
Tim Bradshaw

there are some application domains where neither option would be
viewed as a satisfactory error handling strategy. Fly-by-wire, petro-
chemicals, nuclear power generation. Hell you'd expect better than
this from your phone!

People always give these kinds of scenarios, but actually there are far
more mundane ones. In my day job I'm a sysadmin and I spend a bunch of
time writing code (typically what would nowadays be called "scripts"
rather than programs, but there's no real difference) which does things
of the form

for every machine in <several hundred systems>
do <something>

where <something> is fairly often "modify critical system configuration file".

Programs like that have some absolute, non-negotiable requirements:
- they must never fail silently;
- they must check everything they do, however unlikely it seems that it
  would fail, because they will come across systems which have almost
  arbitrary misconfiguration;
- they should be idempotent if possible;
- if they come across something odd they either need to handle it,
  or put things back the way they were and back out;
- if they absolutely cannot put things back, they need to report this
  very clearly and carefully preserve any detritus in such a way that
  a human can pick up the bits;
- whatever they do they need to report in a completely parsable way
  what happened (success, failure, already done, backed out, not backed
  out, and so on).

These are quite mundane everyday things, but the consequences of
getting them wrong can be quite nasty (the worst ones being "the
machines will still run, but won't boot").
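
To make a few of those requirements concrete, here is a small sketch in C (the language under discussion in this thread, not necessarily what such scripts are written in). It works on an in-memory copy of a config file and shows only three of the points: idempotence, never modifying anything when the change cannot be completed, and a parsable status. All names (status_t, ensure_line(), format_report()) are hypothetical, and it assumes the config text is either empty or ends with a newline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Outcome categories, loosely following the list above (names made up). */
typedef enum { ST_SUCCESS, ST_ALREADY_DONE, ST_FAILED_UNCHANGED } status_t;

const char *status_name(status_t s)
{
    switch (s) {
    case ST_SUCCESS:          return "success";
    case ST_ALREADY_DONE:     return "already-done";
    case ST_FAILED_UNCHANGED: return "failed-unchanged";
    }
    return "unknown";
}

/* Return 1 if `text` contains `line` as a complete line, else 0. */
static int has_line(const char *text, const char *line)
{
    size_t n = strlen(line);
    const char *p = text;
    while (*p != '\0') {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);
        if (len == n && memcmp(p, line, n) == 0)
            return 1;
        if (eol == NULL)
            break;
        p = eol + 1;
    }
    return 0;
}

/*
 * Make sure `line` is present in the NUL-terminated config text in `buf`
 * (capacity `cap`). Running it twice has the same effect as running it once.
 */
status_t ensure_line(char *buf, size_t cap, const char *line)
{
    size_t len, add;
    assert(buf != NULL && line != NULL);
    if (has_line(buf, line))
        return ST_ALREADY_DONE;        /* idempotent: nothing to do */
    len = strlen(buf);
    add = strlen(line);
    if (len + add + 2 > cap)           /* line + '\n' + NUL must fit */
        return ST_FAILED_UNCHANGED;    /* checked before touching buf */
    memcpy(buf + len, line, add);
    buf[len + add] = '\n';
    buf[len + add + 1] = '\0';
    return ST_SUCCESS;
}

/* One machine-parsable report line per host. */
int format_report(char *out, size_t cap, const char *host, status_t s)
{
    return snprintf(out, cap, "host=%s status=%s", host, status_name(s));
}
```

A real tool would also have to handle the file I/O, the backup and restore, and the "could not put things back" case, which is where most of the work is.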
 
Keith Thompson

RG said:
You can't have it both ways. Either I am calling it incorrectly, in
which case I should get a compiler error, or I am calling it correctly,
and I should get the right answer. That I got neither does in fact
falsify the claim. The only way out of this is to say that
maximum(8589934592, 1) returning 1 is in fact "correct", in which case
we'll just have to agree to disagree.

You are calling maximum() incorrectly, but you are doing so in a way
that the compiler is not required to diagnose.

If you want to say that the fact that the compiler is not required
to diagnose the error is a flaw in the C language, I won't
argue with you. It's just not a flaw in the maximum() function.

If I write:

const double pi = 22.0/7.0;
printf("pi = %f\n", pi);

then I suppose I'm calling printf() incorrectly, but I wouldn't
expect my compiler to warn me about it.

If you're arguing that

int maximum(int a, int b) { return a > b ? a : b; }

is flawed because it's too easy to call it incorrectly, you're
effectively arguing that it's not possible to write correct
code in C at all.
 
Seebs

How do you define "Never blows up"?

I would say "blow up" would be "raise an exception".

Personally, I'd consider maximum(8589934592, 1) returning 1 as a blow
up, and of the worst kind since it passes silently.

So run your compiler with a decent set of warning levels, and watch as
you are magically warned that you're passing an object of the wrong type.

On any given system, one or the other is true:

1. The constant 8589934592 is of type int, and the function will
"work" -- will give that result.
2. The constant is not of type int, and the compiler will warn you about
this if you ask.

-s
 
Seebs

even with the option -Wall (all warnings).

For various historical reasons, "-Wall" has the semantics you might
expect from an option named "-Wsome-common-warnings-but-not-others".

-s
 
Seebs

We lost some important context somewhere along the line:
Please take note of the second sentence.

I did. That is entirely correct.

One way or another, this claim is plainly false. The point I was trying
to make is not so much that the claim is false (someone else was already
doing that), but that it can be demonstrated to be false without having
to rely on any run-time input.

It is not at all obvious to me that it is, in fact, false. So far as
I can tell, *if* the function is successfully called, then it will take
two integers, compare them, and return the larger one. It will never
return something which is not an integer. It will never raise an exception.
It will never return a value which, if you try to treat it as an integer,
raises an exception.

Now, if you pass the wrong values to it, you will get wrong answers -- but
that's your problem for passing it wrong values.

I would understand an "invalid" answer to be one of the wrong category. For
instance, if I have a function in Python that I expect to return a string,
and it returns None, I have gotten an answer that is "invalid" -- it's not
a string.

-s
 
Seebs

int maximum(int a, int b);

int foo() {
    int (*barf)() = maximum;
    return barf(3);
}

This compiles fine for me. Where is the cast?

On the first line of code inside foo().

Where is the error message?

You chose to use a form that suppresses the error message.

Are you saying barf(3) doesn't call maximum?

I would say that it is undefined whether or not it calls maximum, because
you called a function through a function pointer of a different sort,
which invoked undefined behavior.

There exist real compilers on which code much like this will coredump
without ever once trying to jump to the address of the maximum function,
because the compiler caught your error.
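
For comparison, a sketch of the form that keeps the check: if the pointer type spells out the parameter list, a call with the wrong number of arguments has to be diagnosed by the compiler. call_through_pointer() is just an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

int maximum(int a, int b) { return a > b ? a : b; }

/* Illustrative helper: calls maximum() through a fully prototyped pointer. */
int call_through_pointer(int x, int y)
{
    /* The pointer type carries the full prototype, (int, int). */
    int (*fp)(int, int) = maximum;
    assert(fp != NULL);
    /* Writing fp(3) here would be rejected: too few arguments. */
    return fp(x, y);
}
```

Declaring the pointer as int (*)() instead is what throws the parameter information away.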

-s
 
Seebs

You can't have it both ways. Either I am calling it incorrectly, in
which case I should get a compiler error,

You get a warning if you ask for it. If you choose to run without all
the type checking on, that's your problem.

-s
 
Seebs

Yes, I can witness that it's in the mind set.

Huh.

So here I am, programming in statically typed languages, and I have never
in my life thought that things which compiled were necessarily right. Not
even when I was an arrogant teenager.

I guess I don't exist. *sob*

Well, the problem being always the same, the time pressures coming from
the sales people (who can sell products of which the first line of
specifications has not been written yet, much less of code), it's always
a battle to explain that once the code is written, there is still a lot
of time needed to run tests and debug it.

At $dayjob, they give us months between feature complete and shipping,
because they expect us to spend a lot of time testing, debugging, and
cleaning up. But during that time we are explicitly not adding features...

But my point is that at least with dynamic programming languages,
there's an alternative mindset and it is easier to implement such
a scheme than with static programming languages.

I think this grossly oversimplifies things.

-s
 
RG

Keith Thompson said:
You are calling maximum() incorrectly, but you are doing so in a way
that the compiler is not required to diagnose.

Yes. I know. That was my whole point. There are ways to call a
function incorrectly (more broadly, there are errors in code) that a C
compiler is not required to diagnose.

If you want to say that the fact that the compiler is not required
to diagnose the error is a flaw in the C language, I won't
argue with you.

I'm not even saying it's a flaw in the language. All I'm saying is that
the original claim -- that any error in a C program will be caught by
the compiler -- is false, and more specifically, that it can be
demonstrated to be false without appeal to unknown run-time input.

As an aside, this particular error *could* be caught (and in fact would
be caught by other tools like lint), but there are errors that cannot
be caught by any static analysis, and therefore one should not be
lulled into a false sense of security by the fact that one's code is
written in a statically typed language and compiled without errors or
warnings. That's all.

If I write:

const double pi = 22.0/7.0;
printf("pi = %f\n", pi);

then I suppose I'm calling printf() incorrectly, but I wouldn't
expect my compiler to warn me about it.

If you're arguing that

int maximum(int a, int b) { return a > b ? a : b; }

is flawed because it's too easy to call it incorrectly, you're
effectively arguing that it's not possible to write correct
code in C at all.

I would say that it is very, very hard to write correct code in C for
any non-vacuous definition of "correct". That is the reason that core
dumps and buffer overflows are so ubiquitous. I prefer Lisp or Python,
where core dumps and buffer overflows are virtually nonexistent. One
does get the occasional run-time error that might have been caught at
compile time, but I much prefer that to a core dump or a security hole.

One might hypothesize that the best of both worlds would be a dynamic
language with a static analyzer layered on top. Such a thing does not
exist. It makes an instructive exercise to try to figure out why. (For
the record, I don't know the answer, but I've learned a lot through the
process of pondering this conundrum.)

rg
 
