Re: "Strong typing vs. strong testing"

P

Pascal J. Bourguignon

BartC said:
But if you had to implement a VM directly in hardware, which one (of
the several varieties) would you choose?

And having chosen one, how would that impact the performance of a
language with an incompatible VM?

Indeed. C running on LispMachine, wasn't so fast. All this bit
twiddling and pointer computing... But if that could be construed as a
reason why to use dynamic languages (they run faster!) than C, it'd be
all for the best!


Otherwise we need to further go down the road of VM (cf. the hardware
virtualization stream), down to the microcode. Again, it's because of
the cheapness of microprocessors founders that we forgot for a long time
the notion of microcode, which was found more often on big irons.
Nowadays the biggest microprocessors are back on the track of microcode;
this should be open, and virtual machines should be more routinely
implemented in microcode.

Perhaps processors executing native code as it is now, aren't such a
bad idea.

Perhaps if they had a more controlled execution model it would be a
better idea. Remember that processors are like they are because C (and
unix) is like it is!
 
P

Pascal J. Bourguignon

Seebs said:
Unless, of course, the "wrong type" happens to be compatible enough to
pass. In which case, it's unclear whether it is the "wrong type" or not.


I have no clue what exact scenario you're talking about here. I've never
seen a bug that could plausibly be described as "compiler passes wrong
type" which wasn't picked up quickly by running with more warnings enabled.

This is the scenario discussed in this thread, a long is passed to
maximum without a compiler warning.

And on the other end of things, it is not always obvious or straightforward
to figure out *why* the dynamic language has ended up with something of the
wrong type, or what's wrong with that type.

It is on the contrary rather obvious or easily discoverable, looking at
the backtrace, and inspecting the data structures referenced from it.
 
J

John Nagle

I don't know if you are intentionally trying to be deceitful or if you honestly didn't spent much
time thinking about this issue. To be brief I will only point out the following topics:


a) no language is inherently more or less efficient than any other language. The efficiency
aspect is only related to how those languages are implemented (i.e., the investments made in
optimizing the compilers/interpreters)
b) Just because someone invested enough effort to optimize a specific implementation of language X
to run, under a specific scenario, a benchmark faster than some other implementation of language Y
it doesn't mean that language X's implementation outperforms or even matches every implementation
of language Y under every conceivable scenario.

If the implementation has an omniscient compiler, yes. If not,
some languages have serious speed penalties.

Python has some inherent performance problems stemming from
its "gratuitous hidden dynamism". That is, there are
bits of dynamism which are very unlikely, but not visible at
compile time, and thus very difficult to optimize out.
For example, it's possible to add an attribute to an object
from outside the object's class. It's thus very difficult to
detect at compile time which classes could use static "slots",
and which could not. Python allows replacing a function at
run time, even while another thread is executing in it.
(Even PHP disallows that.) This breaks a wide range of
optimizations. And then, of course, there's the "global
interpreter lock" problem.

Dynamism is not the problem. It's hidden dynamism, which
forces compiling for the worst case all the time, that's
the problem.

Most of these problems come from CPython,
which is a naive interpreter. The feature set of Python
is well matched to a naive interpreter.

The organizational situation is a lot like the one with
Pascal and Wirth. There'a a very straightforward way to write
a recursive-descent Pascal compiler that compiles to a stack
notation. Wirth wrote one of those, and insisted that the
language should be very close to what a compiler of that form
can do. This eventually killed Pascal.

John Nagle
 
S

Seebs

This is the scenario discussed in this thread, a long is passed to
maximum without a compiler warning.

The compiler didn't pass the wrong type, the user did.
It is on the contrary rather obvious or easily discoverable, looking at
the backtrace, and inspecting the data structures referenced from it.

This is a fascinating assertion, but it is slightly marred by not actually
being generally true. It's often the case that it's pretty obvious, but
sometimes it's not so obvious -- it can take quite a bit of doing to figure
out how or where some hunk of a data structure got set up wrong. It's very
easy to find the call where a nil was passed to something expecting some kind
of integer; it's sometimes not so easy to find the completely unrelated hunk
of code which modified the data structure half an hour earlier.

-s
 
T

Terry Reedy

Why do you consider the term "compile error" a "mandate to stop
compiling"? What do you say to refer to the situation when you have a
statement in your program that the compiler finds is an error? And is
it really material whether the compiler flagged an error and stopped,
or flagged an error and looked for additional errors???

For the purpose of the argument, I do not think it matters whether the
compiler looks for additional errors before it stops compiling. More to
to point is whether or not it eventually produces an object file that
can be linked into an executable file. 'Compile error' (as opposed to
'compile warning') should mean that it does not.
 
P

Pascal J. Bourguignon

Seebs said:
The compiler didn't pass the wrong type, the user did.

And the compiler passed it on without saying anything.
This is a fascinating assertion, but it is slightly marred by not actually
being generally true. It's often the case that it's pretty obvious, but
sometimes it's not so obvious -- it can take quite a bit of doing to figure
out how or where some hunk of a data structure got set up wrong. It's very
easy to find the call where a nil was passed to something expecting some kind
of integer; it's sometimes not so easy to find the completely unrelated hunk
of code which modified the data structure half an hour earlier.

When debugging C program, you are perfectly right.
But I don't observe the same when debugging lisp programs.
 
I

Ian Collins

This is the scenario discussed in this thread, a long is passed to
maximum without a compiler warning.

Which will cause the test for the bit of code doing the call to fail.
So it fails at run-time with a failed test, just as it would in a
dynamic language.
 
R

RG

Seebs said:
No, it isn't. It's a correctly identified type mismatch.

You keep moving the goal posts from the actual standard of a false positive
(the compiler warns that something is of the wrong type when it's not of
the wrong type) to a made-up standard (the compiler warns that something is
of the wrong type when it is indeed of the wrong type, but could be safely
converted to the right type).

It doesn't matter whether, in a given case, you *could* safely perform
the conversion. If you don't perform the conversion, and the compiler points
this out, that's not a false positive.




Moving away fast enough that their color has visibly changed.


Not really. If you use the most obvious and natural meanings *for a
statically typed language*, it is obvious that it is true.

And indeed, significantly so. In the real world, programs written in
scripting languages with runtime typing are fairly likely to throw occasional
exceptions because something is of the wrong type. In a statically typed
language, the of-the-wrong-type is something which can, by definition, be
caught at compile time.

The fundamental thing you seem to be getting stuck on is that you're assuming
that if a conversion could be made, that it should be and it should be
automatic and silent. That, however, is at odds with discussion of a
statically typed language. There's a reason we have the option of converting
things from one type to another.

No, this is not what I'm getting stuck on. I understand the technical
theory behind statically typed languages. What I'm getting "stuck" on
is this:
In a statically typed language, the of-the-wrong-type is something which
can, by definition, be caught at compile time.

Any time something is true "by definition" that is an indication that
it's not a particularly useful fact. The whole concept of "type" is a
red herring. It's like this: there are some properties of programs that
can be determined statically, and others that can't. Some of the
properties that can't be determined statically matter in practice. But
all of the properties that can be determined statically can also be
determined dynamically. The *only* advantage that static analysis has
is that *sometimes* it can determine *some* properties of a program
faster or with less effort than a dynamic approach would.

What I take issue with is the implication made by advocates of static
analysis that static analysis is somehow *inherently* superior to
dynamic analysis, that static analysis can provide some sort of
"guarantee" of reliability that actually has some sort of practical
meaning in the real world. It doesn't. The net effect of static
analysis in the real world is to make programmers complacent about
properties of programs that can only be determined at run time, to make
them think that compiling without errors means something, and that if a
program compiles without errors then there is no need for run-time
checking of any sort. *You* may not believe these things, but the vast
majority of programmers who use statically typed languages do believe
these things, even if only tacitly. The result is a world where
software by and large is a horrific mess of stack overflows, null
pointer exceptions, core dumps, and security holes.

I'm not saying that static analysis is not useful. It is. What I'm
saying is that static analysis is nowhere near as useful as its
advocates like to imply that it is. And it's better to forego static
analysis and *know* that you're flying without a net at run-time than to
use it and think that you're safe when you're really not.

rg
 
K

Keith Thompson

Seebs said:
Because that's what people normally mean -- compilation failed.

At least for C, I'd say it refers to a syntax error or constraint
violation, i.e., something that requires a diagnostic.

[...]
 
K

Keith Thompson

TheFlyingDutchman said:
You're missing a lot of context.  I'm not trying to criticize C, just to
refute a false claim that was made about it.
Can you cite the article that made this false claim, and exactly what
the false claim was?

        Message-ID:
        <0497e39d-6bd1-429d-a86f-f4c89babe...@u31g2000pru.googlegroups.com>
        From: TheFlyingDutchman <[email protected]>
        Newsgroups: comp.lang.lisp
        [...]
        in C I can have a function maximum(int a, int b) that will always
        work. Never blow up, and never give an invalid answer. If someone
        tries to call it incorrectly it is a compile error.
        [...]

That was slightly overstated. In fact, you can have such a
function that will always work when called correctly, *unless*
something else has caused the program's behavior to be undefined,
in which case all bets are off.

[...]
Thanks from me as well, Don. I was worried that people would start to
believe that the original statement was what you said it was:

"I'm not even saying it's a flaw in the language. All I'm saying is
that
the original claim -- that any error in a C program will be caught by
the compiler -- is false, and more specifically, that it can be
demonstrated to be false without appeal to unknown run-time input."

Yes, that would have been an absurd claim if anyone had actually
made it.
 
K

Keith Thompson

RG said:
Seebs said:
That gives (what I would consider to be) false positives, e.g.:
[ron@mighty:~]$ cat foo.c
void foo(long x) {}
int main() { foo(1); }
[ron@mighty:~]$ gcc -Wconversion foo.c
foo.c: In function ???main???:
foo.c:4: warning: passing argument 1 of ???foo??? with different width due
to prototype

But it's *not* a false positive. The compiler is not claiming that the
conversion couldn't be done -- it's just telling you that there is a
change of type going on.

Again, you can't have it both ways. Either a warning is a "compiler
error" according to the claim at issue (see below) or it is not. If it
is, then this is a false positive. If it is not, then my previous
example refutes the claim.
[...]

In this case, the conversion (from int to long) cannot fail (barring
undefined behavior elsewhere in the program), because long is
guaranteed to be able to hold any value within the range of int.

It would be reasonable, at least as an option, to warn only about
conversions that can fail, or only about conversions that can lose
information, perhaps distinguishing between implicit conversions
and explicit conversions (casts).

Even the warning about foo(1) is not entirely unreasonable; some
programmers might prefer to write foo(1L) rather than foo(1),
to make it clear that the argument being passed is of type long
rather than int.

I don't know whether gcc provides this level of control over which
conversions it warns about, but again, that's a gcc issue, not a
C issue.
 
K

Keith Thompson

And the compiler passed it on without saying anything.

Because the user didn't use the option (-Wconversion) that would have
caused the compiler to warn about it.

[...]
 
B

BartC

Nothing extraordinary here. Common Lisp is more efficient than C.
http://www.lrde.epita.fr/~didier/research/verna.06.ecoop.pdf
http://portal.acm.org/citation.cfm?id=1144168

Actually, it's hard to find a language that has no compiler generating
faster code than C...

I had a quick look at Lisp to see if your claims had any basis. I tried this
program:

(defun fib (n)
(if (< n 2)
n
(+ n (fib (- n 1)) (fib (- n 2)) )
))

But it gave the wrong results and it took ages to figure out why. Even after
downloading a working version for comparison, it's was difficult to spot the
extraneous 'n' (I think I was concentrating on getting the round brackets
all in the right places).

I thought you were saying that Lisp (and dynamic typing in general) was
better for pointing out errors? The above error in C would have been picked
up more easily I think (due to less parentheses clutter, and more obvious
separators between terms).

As for speed, executing fib(38) took about 60 seconds on my machine (gnu
clisp with no switches set). (Compared with 3.5 seconds, for my own
interpreted, dynamic language, and 0.6 seconds for C.)
 
P

Pascal J. Bourguignon

BartC said:
I had a quick look at Lisp to see if your claims had any basis. I
tried this program:

(defun fib (n)
(if (< n 2)
n
(+ n (fib (- n 1)) (fib (- n 2)) )
))

But it gave the wrong results and it took ages to figure out why. Even
after downloading a working version for comparison, it's was difficult
to spot the extraneous 'n' (I think I was concentrating on getting the
round brackets all in the right places).

I thought you were saying that Lisp (and dynamic typing in general)
was better for pointing out errors? The above error in C would have
been picked up more easily I think (due to less parentheses clutter,
and more obvious separators between terms).

There are more parentheses in C (you must take into account {} and []
As for speed, executing fib(38) took about 60 seconds on my machine
(gnu clisp with no switches set).

Yes, you choosed the ONLY CL implementation having an interpreter by
default, and which when compiling, compiles to a virtual machine. This
happens also to be my favorite CL implementation, which tells you how
much I care about those silly time benchmarks.

If you want to try a CL implementation generating faster code, try SBCL.
 
B

BartC

Pascal J. Bourguignon said:
(defun fib (n)
(if (< n 2)
n
(+ n (fib (- n 1)) (fib (- n 2)) )
))

But it gave the wrong results and it took ages to figure out why. Even
I thought you were saying that Lisp (and dynamic typing in general)
was better for pointing out errors? The above error in C would have
been picked up more easily I think (due to less parentheses clutter,
and more obvious separators between terms).

There are more parentheses in C (you must take into account {} and []
too, and <> in C++), and the additionnal separators in C are clutter.

I mentioned 'C' because it's one of the cross-posted languages; I normally
use my own language. Nevertheless, both of these (as well as Python and
probably many others) require no more than 4 parentheses in the line in
question; the Lisp line above uses 12 (including the 2 I moved to the next
line for clarity).

And needing an extra "+" when you want to add 3 things instead of 2, sort of
makes it obvious that you do have 3 things instead of 2... In fact the error
would never have got so far. Sometimes redundancy in syntax is useful.
Yes, you choosed the ONLY CL implementation having an interpreter by
default, and which when compiling, compiles to a virtual machine. This
happens also to be my favorite CL implementation, which tells you how
much I care about those silly time benchmarks.

Well, 60 seconds is pretty slow even for an interpreter; even plain CPython
took only 30 seconds.
If you want to try a CL implementation generating faster code, try SBCL.

OK.
 
R

Raffael Cavallaro

I had a quick look at Lisp to see if your claims had any basis. I tried
this program:

(defun fib (n)
(if (< n 2)
n
(+ n (fib (- n 1)) (fib (- n 2)) )
))

But it gave the wrong results and it took ages to figure out why. Even
after downloading a working version for comparison, it's was difficult
to spot the extraneous 'n' (I think I was concentrating on getting the
round brackets all in the right places).

I saw it immediately, but I'm familiar with lisp and you are not -
those two parentheses (what you call "round brackets") on a line by
themselves are a dead giveaway.
I thought you were saying that Lisp (and dynamic typing in general) was
better for pointing out errors? The above error in C would have been
picked up more easily I think (due to less parentheses clutter, and
more obvious separators between terms).

As for speed, executing fib(38) took about 60 seconds on my machine
(gnu clisp with no switches set). (Compared with 3.5 seconds, for my
own interpreted, dynamic language, and 0.6 seconds for C.)

Compiled lisp is fast, but you need to actually compile it:

CL-USER 96 > (defun fib (n)
(declare (optimize speed (debug 0)))
(if (< n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
FIB

CL-USER 97 > (compile 'fib)
FIB
NIL
NIL

CL-USER 98 > (time (fib 38))
Timing the evaluation of (FIB 38)

User time = 0.888
System time = 0.002
Elapsed time = 0.877
Allocation = 142568 bytes
0 Page faults
39088169

which is in the same range as your .6 seconds for your equivalent c program.

What happens when you want fib(1000) or fib(1000000)? Then the naive
recursive version isn't very useful (it's just too slow), and the
result is going to overflow c integer types.

in lisp:

CL-USER 118 > (defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))
FACT

CL-USER 119 > (defun fib (n)
(/ (loop for k from 1 to n by 2
sum (/ (* (expt 5 (/ (1- k) 2)) (fact n))
(fact k) (fact (- n k))))
(expt 2 (1- n))))
FIB

CL-USER 120 > (compile 'fact)
FACT
NIL
NIL

CL-USER 121 > (compile 'fib)
FIB
NIL
NIL

CL-USER 122 > (time (fib 1000))
Timing the evaluation of (FIB 1000)

User time = 0.760
System time = 0.007
Elapsed time = 0.748
Allocation = 474522008 bytes
147 Page faults
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

warmest

regards,

Ralph
 
P

Paul Rubin

Raffael Cavallaro said:
CL-USER 119 > (defun fib (n)
(/ (loop for k from 1 to n by 2
sum (/ (* (expt 5 (/ (1- k) 2)) (fact n))
(fact k) (fact (- n k))))
(expt 2 (1- n))))
CL-USER 122 > (time (fib 1000))
Timing the evaluation of (FIB 1000)
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

I have no idea what that fancy algorithm is doing, but the number it
prints appears to be the 2000th Fibonacci number rather than the 1000th.
And it took 0.76 seconds even compiled. This Haskell code took no
perceptible time even in the ghci interpreter:

main = print (fib 0 1 !! 2000)
where
fib a b = a:(a+b):fib b (a+b)
 
P

Pascal J. Bourguignon

BartC said:
Pascal J. Bourguignon said:
(defun fib (n)
(if (< n 2)
n
(+ n (fib (- n 1)) (fib (- n 2)) )
))

But it gave the wrong results and it took ages to figure out why. Even
I thought you were saying that Lisp (and dynamic typing in general)
was better for pointing out errors? The above error in C would have
been picked up more easily I think (due to less parentheses clutter,
and more obvious separators between terms).

There are more parentheses in C (you must take into account {} and []
too, and <> in C++), and the additionnal separators in C are clutter.

I mentioned 'C' because it's one of the cross-posted languages; I normally
use my own language. Nevertheless, both of these (as well as Python and
probably many others) require no more than 4 parentheses in the line in
question; the Lisp line above uses 12
So what?
(including the 2 I moved to the next
line for clarity).

Actually for clarity, properly formated lisp code would be like:

(defun fib (n)
(if (< n 2)
n
(+ n
(fib (- n 1))
(fib (- n 2)))))

And needing an extra "+" when you want to add 3 things instead of 2, sort of
makes it obvious that you do have 3 things instead of 2... In fact the error
would never have got so far. Sometimes redundancy in syntax is useful.

On the other hand, baroque syntaxes requires more wetware cycles so
you're left with less to think about important matters, such as knowing
what fib is...
 
R

Raffael Cavallaro

I have no idea what that fancy algorithm is doing, but the number it
prints appears to be the 2000th Fibonacci number rather than the 1000th.

I think you're mistaken. fib(0) = 0, fib(1) = 1, fib(2) = 1, fib(3) = 2
.... fib(11)= 89 ...
fib(1000) =
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

If

you like we can do it iteratively instead, which, as in haskel, takes
no perceptible time:

CL-USER 133 > (defun fib (x)
(if (< x 1) 0
(loop
repeat x
for previous = 0 then result
and result = 1 then (+ result previous)
finally (return result))))
FIB

CL-USER 134 > (compile 'fib)
FIB
NIL
NIL

CL-USER 135 > (time (fib 1000))
Timing the evaluation of (FIB 1000)

User time = 0.000
System time = 0.000
Elapsed time = 0.000
Allocation = 60088 bytes
0 Page faults
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875

The

relevant points are:

1. common lisp implementations produce fast code
2. the result of fib(1000) is going to overflow c integer types

warmest regards,

Ralph
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,740
Latest member
JudsonFrie

Latest Threads

Top