Re: "Strong typing vs. strong testing"

namekuseijin · Sep 27, 2010

in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer. If someone
tries to call it incorrectly it is a compile error.
In a dynamic typed language maximum(a, b) can be called with incorrect
datatypes. Even if I make it so it can handle many types as you did
above, it could still be inadvertantly called with a file handle for a
parameter or some other type not provided for. So does Eckel and
others, when they are writing their dynamically typed code advocate
just letting the function blow up or give a bogus answer, or do they
check for valid types passed? If they are checking for valid types it
would seem that any benefits gained by not specifying type are lost by
checking for type. And if they don't check for type it would seem that
their code's error handling is poor.

that is a lie.

Compilation only makes sure that values provided at compilation-time
are of the right datatype.

What happens though is that in the real world, pretty much all
computation depends on user provided values at runtime. See where are
we heading?

this works at compilation time without warnings:
int m=numbermax( 2, 6 );

this too:
int a, b, m;
scanf( "%d", &a );
scanf( "%d", &b );
m=numbermax( a, b );

no compiler issues, but will not work just as much as in python if
user provides "foo" and "bar" for a and b... fail.

What you do if you're feeling insecure and paranoid? Just what
dynamically typed languages do: add runtime checks. Unit tests are
great to assert those.

Fact is: almost all user data from the external words comes into
programs as strings. No typesystem or compiler handles this fact all
that graceful...

Pascal J. Bourguignon · Sep 27, 2010

namekuseijin said:
that is a lie.

Compilation only makes sure that values provided at compilation-time
are of the right datatype.

What happens though is that in the real world, pretty much all
computation depends on user provided values at runtime. See where are
we heading?

this works at compilation time without warnings:
int m=numbermax( 2, 6 );

this too:
int a, b, m;
scanf( "%d", &a );
scanf( "%d", &b );
m=numbermax( a, b );

no compiler issues, but will not work just as much as in python if
user provides "foo" and "bar" for a and b... fail.

What you do if you're feeling insecure and paranoid? Just what
dynamically typed languages do: add runtime checks. Unit tests are
great to assert those.

Fact is: almost all user data from the external words comes into
programs as strings. No typesystem or compiler handles this fact all
that graceful...

I would even go further.

Types are only part of the story. You may distinguish between integers
and floating points, fine. But what about distinguishing between
floating points representing lengths and floating points representing
volumes? Worse, what about distinguishing and converting floating
points representing lengths expressed in feets and floating points
representing lengths expressed in meters.

If you start with the mindset of static type checking, you will consider
that your types are checked and if the types at the interface of two
modules matches you'll think that everything's ok. And six months later
you Mars mission will crash.

On the other hand, with the dynamic typing mindset, you might even wrap
your values (of whatever numerical type) in a symbolic expression
mentionning the unit and perhaps other meta data, so that when the other
module receives it, it may notice (dynamically) that two values are not
of the same unit, but if compatible, it could (dynamically) convert into
the expected unit. Mission saved!

Scott L. Burson · Sep 27, 2010

Pascal said:
On the other hand, with the dynamic typing mindset, you might even wrap
your values (of whatever numerical type) in a symbolic expression
mentionning the unit and perhaps other meta data, so that when the other
module receives it, it may notice (dynamically) that two values are not
of the same unit, but if compatible, it could (dynamically) convert into
the expected unit. Mission saved!

In fairness, you could do this statically too, and without the consing
required by the dynamic approach.

-- Scott

Pascal J. Bourguignon · Sep 27, 2010

Scott L. Burson said:
In fairness, you could do this statically too, and without the consing
required by the dynamic approach.

I don't deny it. My point is that it's a question of mindset.

John Nagle · Sep 28, 2010

The trouble with that essay is that he's comparing with C++.
C++ stands alone as offering hiding without memory safety.
No language did that before C++, and no language has done it
since.

The basic problem with C++ is that it take's C's rather lame
concept of "array=pointer" and wallpapers over it with
objects. This never quite works. Raw pointers keep seeping
out. The mold always comes through the wallpaper.

There have been better strongly typed languages. Modula III
was quite good, but it was from DEC's R&D operation, which was
closed down when Compaq bought DEC.

John Nagle

Malcolm McLean · Sep 28, 2010

Fact is: almost all user data from the external words comes into
programs as strings. No typesystem or compiler handles this fact all
that graceful...- Hide quoted text -

You're right. C should have a much better library than it does for
parsing user-supplied string input.

The scanf() family of functions is fine for everyday use, but not
robust enough for potentially hostile inputs. atoi() had to be
replaced by strtol(), but there's a need for a higher-leve function
built on strtol().

I wrote a generic commandline parser once, however it's almost
impossible to achieve something that is both useable and 100%
bulletproof.

Malcolm McLean · Sep 28, 2010

On the other hand, with the dynamic typing mindset, you might even wrap
your values (of whatever numerical type) in a symbolic expression
mentionning the unit and perhaps other meta data, so that when the other
module receives it, it may notice (dynamically) that two values are not
of the same unit, but if compatible, it could (dynamically) convert into
the expected unit. Mission saved!

I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimeionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.

BartC · Sep 28, 2010

Malcolm McLean said:
I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimeionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.

As you suggested in 'Varaibles with units' comp.programming Feb 16 2008?
[Yes with that spelling...]

I have a feeling that would quickly make programming impossible (if you
consider how many combinations of dimensions/units, and operators there
might be).

One approach I've used is to specify a dimension (ie. unit) only for
constant values, which are then immediately converted (at compile time) to a
standard unit:

a:=sin(60°) # becomes sin(1.047... radians)
d:=6 ins # becomes 152.4 mm

Here the standard units are radians, and mm. Every other calculation uses
implied units.

Tim Bradshaw · Sep 28, 2010

I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimeionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.

There are several existing systems which do this. The HP48 (and
descendants I expect) support "units" which are essentially dimensions.
I don't remember if it signals errors for incoherent dimensions.
Mathematica also has some units support, and it definitely does not
indicate an error: "1 Inch + 1 Second" is fine. There are probably
lots of other systems which do similar things.

Albert van der Horst · Sep 28, 2010

I would even go further.

Types are only part of the story. You may distinguish between integers
and floating points, fine. But what about distinguishing between
floating points representing lengths and floating points representing
volumes? Worse, what about distinguishing and converting floating
points representing lengths expressed in feets and floating points
representing lengths expressed in meters.

When I was at Shell (late eighties) there were people claiming
to have done exactly that, statically, in ADA.
I would say the dimensional checking is underrated. It must be
complemented with a hard and fast rule about only using standard
(SI) units internally.

Oil output internal : m^3/sec
Oil output printed: kbarrels/day

If you start with the mindset of static type checking, you will consider
that your types are checked and if the types at the interface of two
modules matches you'll think that everything's ok. And six months later
you Mars mission will crash.

A mission failure is a failure of management. The Ariadne crash was.
Management must take care that engineering mistakes don't lead to
mission failure. In the Ariadne case it was not so much engineering
mistakes, but management actually standing in the way of good
engineering practice.

__Pascal Bourguignon__ http://www.informatimago.com/

Groetjes Albert

Malcolm McLean · Sep 28, 2010

There are several existing systems which do this. The HP48 (and
descendants I expect) support "units" which are essentially dimensions.
I don't remember if it signals errors for incoherent dimensions.
Mathematica also has some units support, and it definitely does not
indicate an error: "1 Inch + 1 Second" is fine. There are probably
lots of other systems which do similar things.

The problem is that if you allow expressions rather than terms then
the experssions can get arbitrarily complex. sqrt(1 inch + 1 Second),
for instance.

On the other hand sqrt(4 inches^2) is quite well defined. The question
is whether to allow sqrt(1 inch). It means using rationals rather than
integers for unit superscripts.

(You can argue that you can get things like km^9s^-9g^3 even in a
simple units system. The difference is that these won't occur very
often in real programs, just when people are messing sbout with the
system, and we don't need to make messing about efficient or easy to
use).

Tim Bradshaw · Sep 28, 2010

he problem is that if you allow expressions rather than terms then
the experssions can get arbitrarily complex. sqrt(1 inch + 1 Second),
for instance.

I can't imagine a context where 1 inch + 1 second would not be an
error, so this is a slightly odd example. Indeed I think that in
dimensional analysis summing (or comparing) things with different
dimensions is always an error.

On the other hand sqrt(4 inches^2) is quite well defined. The question
is whether to allow sqrt(1 inch). It means using rationals rather than
integers for unit superscripts.

There's a large existing body of knowledge on dimensional analysis
(it's a very important tool for physics, for instance), and obviously
the answer is to do whatever it does. Raising to any power is fine, I
think (but transcendental functions, for instance, are never fine,
because they are equivalent to summing things with different
dimensions, which is obvious if you think about the Taylor expansion of
a transcendental function).

--tim

Dennis Lee Bieber · Sep 28, 2010

I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimeionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.

That language is called: Ada

Though you'll have to define the packages for the unit types and all
valid combinations of operations with those units.

Niklas Holsti · Sep 28, 2010

Albert said:
When I was at Shell (late eighties) there were people claiming
to have done exactly that, statically, in ADA.

Click to expand...

It is cumbersome to do it statically, in the current Ada standard. Doing
it by run-time checks in overloaded operators is easier, but of course
has some run-time overhead. There are proposals to extend Ada a bit to
make a static check of physical units ("dimensions") simpler. See
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/acs/ac-00184.txt?rev=1.3&raw=Y
and inparticular the part where Edmond Schonberg explains a suggestion
for the GNAT Ada compiler.

A mission failure is a failure of management. The Ariadne crash was.

Click to expand...

Just a nit, the launcher is named "Ariane".

Thomas A. Russ · Sep 28, 2010

Malcolm McLean said:
I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimeionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.

Done in 1992.

See
<http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/syntax/measures/0.html>
citation at <http://portal.acm.org/citation.cfm?id=150168>

and my extension to it as part of the Loom system:
<http://www.isi.edu/isd/LOOM/documentation/loom4.0-release-notes.html#Units>

George Neuner · Sep 28, 2010

I would say the dimensional checking is underrated. It must be
complemented with a hard and fast rule about only using standard
(SI) units internally.

Oil output internal : m^3/sec
Oil output printed: kbarrels/day

"barrel" is not an SI unit. And when speaking about oil there isn't
even a simple conversion.

42 US gallons ? 34.9723 imp gal ? 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilda sign meaning "approximately equal".]

George

MRAB · Sep 28, 2010

I would say the dimensional checking is underrated. It must be
complemented with a hard and fast rule about only using standard
(SI) units internally.

Oil output internal : m^3/sec
Oil output printed: kbarrels/day

Click to expand...

"barrel" is not an SI unit. And when speaking about oil there isn't
even a simple conversion.

42 US gallons ? 34.9723 imp gal ? 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilda sign meaning "approximately equal".]

Do you mean:

42 US gallons â‰ˆ 34.9723 imp gal â‰ˆ 158.9873 l

The post as I received it was encoded as 7-bit us-ascii, so definitely
no double-tilde. This post was encoded as utf-8.

Nick · Sep 28, 2010

Done in 1992.

See
<http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/lisp/code/syntax/measures/0.html>
citation at <http://portal.acm.org/citation.cfm?id=150168>

and my extension to it as part of the Loom system:
<http://www.isi.edu/isd/LOOM/documentation/loom4.0-release-notes.html#Units>

I didn't go as far as that, but:

$ cat test.can
database input 'canal.sqlite'

for i=link 'Braunston Turn' to '.*'
print 'It is ';i.distance into 'distance:%M';' miles (which is '+i.distance into 'distance:%K'+' km) to ';i.place2 into 'name

lace'
end for i
$ canal test.can
It is 0.10 miles (which is 0.16 km) to London Road Bridge No 90
It is 0.08 miles (which is 0.13 km) to Bridge No 95
It is 0.19 miles (which is 0.30 km) to Braunston A45 Road Bridge No 91

Keith Thompson · Sep 28, 2010

George Neuner said:
"barrel" is not an SI unit.

He didn't say it was. Internal calculations are done in SI units (in
this case, m^3/sec); on output, the internal units can be converted to
whatever is convenient.

And when speaking about oil there isn't
even a simple conversion.

42 US gallons ? 34.9723 imp gal ? 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilda sign meaning "approximately equal".]

There are multiple different kinds of "barrels", but "barrels of oil"
are (consistently, as far as I know) defined as 42 US liquid gallons.
A US liquid gallon is, by definition, 231 cubic inches; an inch
is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
0.158987294928 m^3, and 1 m^3/sec is exactly 13.7365022817792
kbarrels/day. (Please feel free to check my math.) That's
admittedly a lot of digits, but there's no need for approximations
(unless they're imposed by the numeric representation you're using).

Keith Thompson · Sep 28, 2010

Erik Max Francis said:
299792458.0 m/s

Actually, the speed of light is exactly 299792458.0 m/s by
definition. (The meter and the second are defined in terms of the
same wavelength of light; this was changed relatively recently.)

python philosophical question - strong vs duck typing	2	Jan 3, 2012
Strong encapsulation, static typing and unit testing	3	Dec 29, 2006
STRONG dynamic typing favors tools..?	5	Jan 7, 2008
Regex ^ beginning not strong?	2	Jul 26, 2010
Error Testing	37	Oct 19, 2013
Parameter guidelines for the Strong AI Singularity	0	Feb 24, 2012
How to Name Gems - A STRONG Recommendation	3	Nov 15, 2010
Strong Typed generic Composite class	0	Jul 17, 2008

Re: "Strong typing vs. strong testing"

namekuseijin

Pascal J. Bourguignon

Scott L. Burson

Pascal J. Bourguignon

John Nagle

Malcolm McLean

Malcolm McLean

BartC

Tim Bradshaw

Albert van der Horst

Malcolm McLean

Tim Bradshaw

Dennis Lee Bieber

Niklas Holsti

Thomas A. Russ

George Neuner

MRAB

Nick

Keith Thompson

Keith Thompson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads