What is Expressiveness in a Computer Language

David Hopwood

Joe said:
The #' is just a namespace operator. The function accepts a single
argument and returns the function itself. You need a type that is a
fixed point (in addition to the universally quantified argument).

Yes, see the correction in my follow-up.
Presumably the remainder of the program does something interesting!

The point is that there exist (by construction) programs that can
never be statically checked.

I don't think you've shown that. I would like to see a more explicit
construction of a dynamically typed [*] program with a given specification,
where it is not possible to write a statically typed program that meets
the same specification using the same algorithms.

AFAIK, it is *always* possible to construct such a statically typed
program in a type system that supports typerec or gradual typing. The
informal construction you give above doesn't contradict that, because
it depends on reflecting on a [static] type system, and in a dynamically
typed program there is no type system to reflect on.


[*] Let's put aside disagreements about terminology for the time being,
although I still don't like the term "dynamically typed".
The next question is whether such
programs are `interesting'. Certainly a program that is simply
designed to contradict the type checker is of limited entertainment
value,

I don't know, it would probably have more entertainment value than most
executive toys :)
but there should be programs that are independently non-checkable
against a sufficiently powerful type system.

Why?

Note that I'm not claiming that you can check any desirable property of
a program (that would contradict Rice's Theorem), only that you can
express any dynamically typed program in a statically typed language --
with static checks where possible and dynamic checks where necessary.
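This claim can be made concrete with the standard "universal type" embedding. Below is a minimal OCaml sketch (all names are hypothetical): every dynamically typed value inhabits one sum type, and the dynamic checks become projections that may fail at runtime.

```ocaml
(* A minimal sketch (hypothetical names) of embedding a dynamically
   typed program into a statically typed language: all values inhabit
   one universal sum type, and dynamic checks are runtime projections. *)
type dyn =
  | Int of int
  | Fun of (dyn -> dyn)

exception Type_error of string

(* Dynamic checks where necessary: projections that may fail. *)
let as_int = function Int n -> n | _ -> raise (Type_error "expected int")
let as_fun = function Fun f -> f | _ -> raise (Type_error "expected function")

(* The "dynamically typed" program ((lambda (x) (+ x 1)) 41),
   expressed in the embedding: *)
let succ_dyn = Fun (fun x -> Int (as_int x + 1))
let result = as_int (as_fun succ_dyn (Int 41))   (* evaluates to 42 *)
```

Static checks still apply everywhere the embedding is not used, which is the "static checks where possible, dynamic checks where necessary" division of labour described above.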
 
Paul Rubin

David Hopwood said:
Note that I'm not claiming that you can check any desirable property of
a program (that would contradict Rice's Theorem), only that you can
express any dynamically typed program in a statically typed language --
with static checks where possible and dynamic checks where necessary.

It starts to look like sufficiently powerful static type systems are
confusing enough, that programming with them is at least as bug-prone
as imperative programming in dynamically typed languages. The static
type checker can spot type mismatches at compile time, but the
types themselves are easier and easier to get wrong.
 
Joachim Durchholz

Paul said:
It starts to look like sufficiently powerful static type systems are
confusing enough, that programming with them is at least as bug-prone
as imperative programming in dynamically typed languages. The static
type checker can spot type mismatches at compile time, but the
types themselves are easier and easier to get wrong.

That's where type inference comes into play. Usually you don't write the
types, the compiler infers them for you, so you don't get them wrong.

Occasionally, you still write down the types, if only for documentation
purposes (the general advice is: do the type annotations for external
interfaces, but not internally).

BTW if you get a type wrong, you'll be told by the compiler, so this is
still less evil than bugs in the code that pop up during testing (and
*far* less evil than bugs that pop up after roll-out).
And the general consensus among FPL programmers is that you get the hang
of it fairly quickly (one poster mentioned "two to three months" - this
doesn't seem to be slower than learning to interpret syntax error
messages, so it's OK considering it's an entirely new kind of diagnostics).

Regards,
Jo
 
Andreas Rossberg

I believe this example requires recursive types. It can also be expressed
in a gradual typing system, but possibly only using an unknown ('?') type.

ISTR that O'Caml at one point (before version 1.06) supported general
recursive types, although I don't know whether it would have supported
this particular example.

No problem at all. It still is possible today if you really want:

~/> ocaml -rectypes
Objective Caml version 3.08.3

# let rec blackhole x = blackhole;;
val blackhole : 'b -> 'a as 'a = <fun>

The problem is, though, that almost everything can be typed once you
have unrestricted recursive types (e.g. missing arguments etc), and
consequently many actual errors remain unflagged (which clearly shows
that typing is not only about potential value class mismatches).
Moreover, there are very few practical uses of such a feature, and they
can always be coded easily with recursive datatypes.
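The remark that such examples "can always be coded easily with recursive datatypes" can be illustrated without -rectypes. In the sketch below (hypothetical names), the recursion in the type is routed through a datatype constructor:

```ocaml
(* Sketch: the blackhole function without -rectypes, with the recursive
   type hidden behind a datatype constructor (hypothetical names). *)
type 'a blackhole = Fn of ('a -> 'a blackhole)

(* Value recursion through a constructor is accepted by OCaml. *)
let rec bh = Fn (fun _ -> bh)

(* Unwrapping and applying just yields the blackhole again. *)
let apply (Fn f) x = f x
let bh' = apply bh 42
```

The indirection through `Fn` is exactly the explicit fold/unfold step that unrestricted recursive types would perform silently.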

It is a pragmatic decision born from experience that you simply do *not
want* to have this, even though you easily could. E.g. for OCaml,
unrestricted recursive typing was removed as default because of frequent
user complaints.

Which is why this actually is a very bad example to choose for dynamic
typing advocacy... ;-)

- Andreas
 
David Hopwood

Paul said:
It starts to look like sufficiently powerful static type systems are
confusing enough, that programming with them is at least as bug-prone
as imperative programming in dynamically typed languages. The static
type checker can spot type mismatches at compile time, but the
types themselves are easier and easier to get wrong.

My assertion above does not depend on having a "sufficiently powerful static
type system" to express a given dynamically typed program. It is true for
static type systems that are not particularly complicated, and suffice to
express all dynamically typed programs.

I disagree with your implication that in general, static type systems for
practical languages are getting too confusing, but that's a separate issue.
(Obviously one can find type systems in research proposals that are too
confusing as they stand.)
 
Pascal Costanza

David said:
Staged compilation is perfectly compatible with static typing.
Static typing only requires that it be possible to prove absence
of some category of type errors when the types are known; it
does not require that all types are known before the first-stage
program is run.

Can you change the types of the program that is already running, or are
the levels strictly separated?
There are, however, staged compilation systems that guarantee that
the generated program will be typeable if the first-stage program
is.

...and I guess that this reduces again the kinds of things you can express.
(It's clear that to compare the expressiveness of statically and
dynamically typed languages, the languages must be comparable in
other respects than their type system. Staged compilation is the
equivalent feature to 'eval'.)

If it is equivalent to eval (i.e., executed at runtime), and the static
type checker that is part of eval yields a type error, then you still
get a type error at runtime!


Pascal
 
David Hopwood

Pascal said:
Can you change the types of the program that is already running, or are
the levels strictly separated?

In most staged compilation systems this is intentionally not permitted.
But this is not a type system issue. You can't change the types in a
running program because you can't change the program, period. And you
can't do that because most designers of these systems consider directly
self-modifying code to be a bad idea (I agree with them).

Note that prohibiting directly self-modifying code does not prevent a
program from specifying another program to *replace* it.
...and I guess that this reduces again the kinds of things you can express.

Yes. If you don't want that, use a system that does not impose this
restriction/guarantee.
If it is equivalent to eval (i.e., executed at runtime), and the static
type checker that is part of eval yields a type error, then you still
get a type error at runtime!

You can choose to use a system in which this is impossible because only
typeable programs can be generated, or one in which non-typeable programs
can be generated. In the latter case, type errors are detected at the
earliest possible opportunity, as soon as the program code is known and
before any of that code has been executed. (In the case of eval, OTOH,
the erroneous code may cause visible side effects before any run-time
error occurs.)

I don't know what else you could ask for.
 
Andreas Rossberg

David said:
(In the case of eval, OTOH,
the erroneous code may cause visible side effects before any run-time
error occurs.)

Not necessarily. You can replace the primitive eval by compile, which
delivers a function encapsulating the program, so you can check the type
of the function before actually running it. Eval itself can easily be
expressed on top of this as a polymorphic function, which does not run
the program if it does not have the desired type:

eval ['a] s = typecase compile s of
    f : (() -> 'a) -> f ()
    _              -> raise TypeError

- Andreas
 
Pascal Costanza

David said:
In most staged compilation systems this is intentionally not permitted.
But this is not a type system issue. You can't change the types in a
running program because you can't change the program, period. And you
can't do that because most designers of these systems consider directly
self-modifying code to be a bad idea (I agree with them).

Whether you consider something you cannot do with statically typed
languages a bad idea or not is irrelevant. You were asking for things
that you cannot do with statically typed languages.

There are at least systems that a lot of people rely on (the JVM, .NET)
that achieve runtime efficiency primarily by executing what is
essentially self-modifying code. These runtime optimization systems have
been researched primarily for the language Self, and also implemented in
Strongtalk, CLOS, etc., to various degrees.

Beyond that, I am convinced that the ability to update a running system
without the need to shut it down can be an important asset.
Note that prohibiting directly self-modifying code does not prevent a
program from specifying another program to *replace* it.

...and this creates problems with moving data from one version of a
program to the next.


Pascal
 
Matthias Blume

Pascal Costanza said:
Whether you consider something you cannot do with statically typed
languages a bad idea or not is irrelevant. You were asking for things
that you cannot do with statically typed languages.

The whole point of static type systems is to make sure that there are
things that one cannot do. So the fact that there are such things are
not an argument per se against static types.

[ ... ]
Beyond that, I am convinced that the ability to update a running
system without the need to shut it down can be an important asset.

And I am convinced that updating a running system in the style of,
e.g., Erlang, can be statically typed.
...and this creates problems with moving data from one version of a
program to the next.

How does this "create" such a problem? The problem is there in either
approach. In fact, I believe that the best chance we have of
addressing the problem is by adopting the "replace the code" model
along with a "translate the data where necessary at the time of
replacement". Translating the data, i.e., re-establishing the
invariants expected by the updated/replaced code, seems much harder
(to me) in the case of self-modifying code. Erlang got this one
right.
 
Pascal Costanza

Matthias said:
The whole point of static type systems is to make sure that there are
things that one cannot do. So the fact that there are such things are
not an argument per se against static types.

I am not arguing against static type systems. I am just responding to
the question what the things are that you cannot do in statically typed
languages.
[ ... ]
Beyond that, I am convinced that the ability to update a running
system without the need to shut it down can be an important asset.

And I am convinced that updating a running system in the style of,
e.g., Erlang, can be statically typed.

Maybe. The interesting question then is whether you can express the
kinds of dynamic updates that are relevant in practice. Because a static
type system always restricts what kinds of runtime behavior you can
express in your language. I am still skeptical, because changing the
types at runtime is basically changing the assumptions that the static
type checker has used to check the program's types in the first place.

For example, all the approaches that I have seen in statically typed
languages deal with adding to a running program (say, class fields and
methods, etc.), but not with changing or removing parts of it.
How does this "create" such a problem? The problem is there in either
approach. In fact, I believe that the best chance we have of
addressing the problem is by adopting the "replace the code" model
along with a "translate the data where necessary at the time of
replacement". Translating the data, i.e., re-establishing the
invariants expected by the updated/replaced code, seems much harder
(to me) in the case of self-modifying code. Erlang got this one
right.

...and the "translate the data where necessary" approach is essentially
triggered by a dynamic type test (if value x is of an old version of
type T, update it to reflect the new version of type T [1]). QED.


Pascal

[1] BTW, that's also the approach taken in CLOS.
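The dynamic version test Pascal describes can also be sketched in a statically typed setting, by reifying the known versions as a sum type (all names below are hypothetical):

```ocaml
(* Hypothetical sketch: "translate the data where necessary", with the
   dynamic version test encoded as a match over the known versions. *)
type person_v1 = { name1 : string }
type person_v2 = { name2 : string; age : int option }

type person = V1 of person_v1 | V2 of person_v2

(* If the value is of the old version of the type, update it. *)
let upgrade = function
  | V1 p -> { name2 = p.name1; age = None }
  | V2 p -> p

let old_data = V1 { name1 = "pascal" }
let new_data = upgrade old_data   (* { name2 = "pascal"; age = None } *)
```

Whether this counts as a "dynamic type test" or merely a dispatch on a tagged union is, of course, part of the disagreement in this thread.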
 
David Hopwood

Andreas said:
David said:
(In the case of eval, OTOH,
the erroneous code may cause visible side effects before any run-time
error occurs.)

Not necessarily. You can replace the primitive eval by compile, which
delivers a function encapsulating the program, so you can check the type
of the function before actually running it. Eval itself can easily be
expressed on top of this as a polymorphic function, which does not run
the program if it does not have the desired type:

eval ['a] s = typecase compile s of
    f : (() -> 'a) -> f ()
    _              -> raise TypeError

What I meant was, in the case of eval in an untyped ("dynamically typed")
language.

The approach you've just outlined is an implementation of staged compilation
in a typed language.
 
Marshall

Matthias said:
How does this "create" such a problem? The problem is there in either
approach. In fact, I believe that the best chance we have of
addressing the problem is by adopting the "replace the code" model
along with a "translate the data where necessary at the time of
replacement". Translating the data, i.e., re-establishing the
invariants expected by the updated/replaced code, seems much harder
(to me) in the case of self-modifying code. Erlang got this one
right.

Pardon my ignorance, but what is the Erlang solution to this problem?


Marshall
 
Matthias Blume

Pascal Costanza said:
Maybe. The interesting question then is whether you can express the
kinds of dynamic updates that are relevant in practice. Because a
static type system always restricts what kinds of runtime behavior you
can express in your language. I am still skeptical, because changing
the types at runtime is basically changing the assumptions that the
static type checker has used to check the program's types in the first
place.

That's why I find the Erlang model to be more promising.

I am extremely skeptical of code mutation at runtime which would
"change types", because to me types are approximations of program
invariants. So if you do a modification that changes the types, it is
rather likely that you did something that also changed the invariants,
and existing code relying on those invariants will now break.
For example, all the approaches that I have seen in statically typed
languages deal with adding to a running program (say, class fields and
methods, etc.), but not with changing or removing parts of it.

Of course, there are good reasons for that: removing fields or
changing their invariants requires that all /deployed/ code which
relied on their existence or their invariants must be made aware of
this change. This is a semantic problem which happens to reveal
itself as a typing problem. By making types dynamic, the problem does
not go away but is merely swept under the rug.
How does this "create" such a problem? The problem is there in
either
approach. In fact, I believe that the best chance we have of
addressing the problem is by adopting the "replace the code" model
along with a "translate the data where necessary at the time of
replacement". Translating the data, i.e., re-establishing the
invariants expected by the updated/replaced code, seems much harder
(to me) in the case of self-modifying code. Erlang got this one
right.

...and the "translate the data where necessary" approach is
essentially triggered by a dynamic type test (if value x is of an old
version of type T, update it to reflect the new version of type T
[1]). QED.

But this test would have to already exist in code that was deployed
/before/ the change! How did this code know what to test for, and how
did it know how to translate the data? Plus, how do you detect that
some piece of data is "of an old version of type T"? If v has type T
and T "changes" (whatever that might mean), how can you tell that v's
type is "the old T" rather than "the new T"! Are there two different
Ts around now? If so, how can you say that T has changed?

The bottom line is that even the concept of "changing types at
runtime" makes little sense. Until someone shows me a /careful/
formalization that actually works, I just can't take it very
seriously.
 
Joe Marshall

David said:
Joe said:
The point is that there exist (by construction) programs that can
never be statically checked.

I don't think you've shown that. I would like to see a more explicit
construction of a dynamically typed [*] program with a given specification,
where it is not possible to write a statically typed program that meets
the same specification using the same algorithms.

I think we're at cross-purposes here. It is trivial to create a static
type system that has a `universal type' and defers all the required
type narrowing to runtime. In effect, we've `turned off' the static
typing (because everything trivially type checks at compile time).

However, a sufficiently powerful type system must be incomplete or
inconsistent (by Gödel). A type system proves theorems about a
program. If the type system is rich enough, then there are unprovable,
yet true theorems. That is, programs that are type-safe but do not
type check. A type system that could reflect on itself is easily
powerful enough to qualify.
AFAIK, it is *always* possible to construct such a statically typed
program in a type system that supports typerec or gradual typing. The
informal construction you give above doesn't contradict that, because
it depends on reflecting on a [static] type system, and in a dynamically
typed program there is no type system to reflect on.

A static type system is usually used for universal proofs: For all
runtime values of such and such... Dynamic checks usually provide
existential refutations: This particular value isn't a string. It may
be the case that there is no universal proof, yet no existential
refutation.
Note that I'm not claiming that you can check any desirable property of
a program (that would contradict Rice's Theorem), only that you can
express any dynamically typed program in a statically typed language --
with static checks where possible and dynamic checks where necessary.

Sure. And I'm not saying that you cannot express these programs in a
static language, but rather that there exist programs that have
desirable properties (that run without error and produce the right
answer) but that the desirable properties cannot be statically checked.

The real question is how common these programs are, and what sort of
desirable properties are we trying to check.
 
Matthias Blume

Marshall said:
Pardon my ignorance, but what is the Erlang solution to this problem?

Erlang relies on a combination of purity, concurrency, and message
passing, where messages can carry higher-order values.

Data structures are immutable, and each computational agent is a
thread. Most threads consist of a loop that explicitly passes state
around. It dispatches on some input event, applies a state
transformer (which is a pure function), produces some output event (if
necessary), and goes back to the beginning of the loop (by
tail-calling itself) with the new state.

When a code update is necessary, a special event is sent to the
thread, passing along the new code to be executed. The old code, upon
receiving such an event, ceases to go back to its own loop entry point
(by simply not performing the self-tail-call). Instead it tail-calls
the new code with the current state.

If the new code can live with the same data structures that the old
code had, then it can directly proceed to perform real work. If new
invariants are to be established, it can first run a translation
function on the current state before resuming operation.
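The loop described above can itself be sketched in a statically typed language. In the OCaml toy below (hypothetical names; a list of events stands in for the process mailbox), the old code stops tail-calling itself and hands the current state to the replacement code:

```ocaml
(* Toy sketch of an Erlang-style upgradable server loop (hypothetical
   names; a list of events stands in for the process mailbox). *)
type state = int

type event =
  | Add of int                                (* ordinary input event *)
  | Upgrade of (state -> event list -> state) (* event carrying new code *)

(* The old code: dispatch on an event, apply a pure state transformer,
   and go back to the loop entry point by self-tail-call. *)
let rec loop state events =
  match events with
  | [] -> state
  | Add n :: rest -> loop (state + n) rest
  | Upgrade new_code :: rest ->
      (* Cease self-tail-calling; hand the current state to the new code. *)
      new_code state rest

(* The new code: first translates the state (doubling it, standing in
   for re-establishing new invariants), then runs its own loop. *)
let v2_loop state events =
  let rec go s = function
    | [] -> s
    | Add n :: rest -> go (s * n) rest
    | Upgrade f :: rest -> f s rest
  in
  go (state * 2) events

let final = loop 0 [ Add 1; Add 2; Upgrade v2_loop; Add 3 ]
(* (0+1+2) = 3, translated to 6, then 6*3 = 18 *)
```

Nothing here requires changing any type at runtime: the new code is just a value of a function type that the old code was prepared to receive.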

From what I hear from the Erlang folks that I have talked to, this
approach works extremely well in practice.
 
Pascal Costanza

Matthias said:
That's why I find the Erlang model to be more promising.

I am extremely skeptical of code mutation at runtime which would
"change types", because to me types are approximations of program
invariants. So if you do a modification that changes the types, it is
rather likely that you did something that also changed the invariants,
and existing code relying on those invariants will now break.

...no, it will throw exceptions that you can catch and handle, maybe
interactively.
Of course, there are good reasons for that: removing fields or
changing their invariants requires that all /deployed/ code which
relied on their existence or their invariants must be made aware of
this change. This is a semantic problem which happens to reveal
itself as a typing problem. By making types dynamic, the problem does
not go away but is merely swept under the rug.

...and yet this requirement sometimes comes up.
Note that prohibiting directly self-modifying code does not prevent a
program from specifying another program to *replace* it.
...and this creates problems with moving data from one version of a
program to the next.
How does this "create" such a problem? The problem is there in
either
approach. In fact, I believe that the best chance we have of
addressing the problem is by adopting the "replace the code" model
along with a "translate the data where necessary at the time of
replacement". Translating the data, i.e., re-establishing the
invariants expected by the updated/replaced code, seems much harder
(to me) in the case of self-modifying code. Erlang got this one
right.
...and the "translate the data where necessary" approach is
essentially triggered by a dynamic type test (if value x is of an old
version of type T, update it to reflect the new version of type T
[1]). QED.

But this test would have to already exist in code that was deployed
/before/ the change! How did this code know what to test for, and how
did it know how to translate the data?

In CLOS, the test is indeed already there from the beginning. It's part
of the runtime semantics. The translation of the data is handled by
'update-instance-for-redefined-class, which is a generic function for
which you can define your own methods. So this is user-extensible and
can, for example, be provided as part of the program update.

Furthermore, slot accesses typically go through generic functions (aka
setters and getters in other languages) for which you can provide
methods that can perform useful behavior in case the corresponding slots
have gone. These methods can also be provided as part of the program update.

Note that I am not claiming that CLOS provides the perfect solution, but
it seems to work well enough that people use it in practice.
Plus, how do you detect that
some piece of data is "of an old version of type T"? If v has type T
and T "changes" (whatever that might mean), how can you tell that v's
type is "the old T" rather than "the new T"! Are there two different
Ts around now? If so, how can you say that T has changed?

Presumably, objects have references to their class metaobjects which
contain version information and references to more current versions of
themselves. This is not rocket science.
The bottom line is that even the concept of "changing types at
runtime" makes little sense. Until someone shows me a /careful/
formalization that actually works, I just can't take it very
seriously.

Maybe this feature just isn't made for you.


Pascal
 
Joe Marshall

Andreas said:
~/> ocaml -rectypes
Objective Caml version 3.08.3

# let rec blackhole x = blackhole;;
val blackhole : 'b -> 'a as 'a = <fun>

The problem is, though, that almost everything can be typed once you
have unrestricted recursive types (e.g. missing arguments etc), and
consequently many actual errors remain unflagged (which clearly shows
that typing is not only about potential value class mismatches).
Moreover, there are very few practical uses of such a feature, and they
can always be coded easily with recursive datatypes.

It is a pragmatic decision born from experience that you simply do *not
want* to have this, even though you easily could. E.g. for OCaml,
unrestricted recursive typing was removed as default because of frequent
user complaints.

Which is why this actually is a very bad example to choose for dynamic
typing advocacy... ;-)

Actually, this seems a *good* example. The problem seems to be that
you end up throwing the baby out with the bathwater: your static type
system stops catching the errors you want once you make it powerful
enough to express certain programs.

So now it seems to come down to a matter of taste: use a restricted
language and catch certain errors early or use an unrestricted language
and miss certain errors. It is interesting that the PLT people have
made this tradeoff as well. In the DrScheme system, there are
different `language levels' that provide a restricted subset of Scheme.
At the beginner level, first-class procedures are not allowed. This
is obviously restrictive, but it has the advantage that extraneous
parentheses (a common beginner mistake) cannot be misinterpreted as the
intent to invoke a first-class procedure.
 
David Hopwood

Anton said:
There's only one issue mentioned there that needs to be negotiated
w.r.t. latent types: the requirement that they are "performed entirely
on the program before execution". More below.


I'm not trying to call programmer reasoning in general a type system.
I'd certainly agree that a programmer muddling his way through the
development of a program, perhaps using a debugger to find all his
problems empirically, may not be reasoning about anything that's
sufficiently close to types to call them that. But "latent" means what
it implies: someone who is more aware of types can do better at
developing the latent types.

So these "latent types" may in some cases not exist (in any sense), unless
and until someone other than the original programmer decides how the program
could have been written in a typed language?

I don't get it.
As a starting point, let's assume we're dealing with a disciplined
programmer who (following the instructions found in books such as the
one at htdp.org) reasons about types, writes them in his comments, and
perhaps includes assertions (or contracts in the sense of "Contracts for
Higher Order Functions" [1]), to help check his types at runtime (and I'm
talking about real type-checking, not just tag checking).

When such a programmer writes a type signature as a comment associated
with a function definition, in many cases he's making a claim that's
provable in principle about the inputs and outputs of that function.

This is talking about a development practice, not a property of the
language. It may (for the sake of argument) make sense to assign a term
such as latent typing to this development practice, but not to call the
language "latently-typed".
 
rossberg

Joe said:
Actually, this seems a *good* example. The problem seems to be that
you end up throwing the baby out with the bathwater: your static type
system stops catching the errors you want once you make it powerful
enough to express certain programs.

That is not the case: you can still express these programs, you just
need to do an indirection through a datatype.

- Andreas
 
