What is Expressiveness in a Computer Language

Anton van Straaten

Marshall said:
That is starting to get a bit too mystical for my tastes.

It's informal, but hardly mystical. The formal semantic definitions of
dynamically-checked languages I'm referring to are the typical "untyped"
definitions, in which the language has a single static type.

This means that when you as a programmer see a function that you "know"
e.g. takes a number and returns a number, that knowledge is not
something that's captured by the language's formal semantics or static
type system. All that can be expressed in the language's static type
system is that the function takes a value and returns a value.
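
To make that concrete, here is a sketch of my own (in Haskell; the names
are invented) of how such an "untyped" semantics views values: everything
inhabits a single type, and dynamic tags are all that distinguish a
number from a function:

data Value
  = VNum Double
  | VStr String
  | VFun (Value -> Value)

-- The programmer "knows" this doubles a number, but the only type the
-- formal semantics can assign it is Value -> Value.
double :: Value -> Value
double (VNum n) = VNum (n * 2)
double _        = error "tag check failed: expected a number"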

When you're programming in your favorite dynamically-typed language,
I'll bet that you're not thinking of functions, or other expressions, in
such a restricted way. Your experience of programming in the language
thus relies heavily on features that are not a "property of the
language", if "the language" is taken to mean its formal semantic
definition.

So in referring to a language as latently-typed, the word "language" can
be taken to mean what a programmer thinks of as the language, rather
than a formal definition that only captures a subset of what the
programmer routinely deals with.

If we don't acknowledge this distinction between the part of the
language that we can capture formally and the part that we haven't
(yet), then we'll be forever stuck without accepted terminology for the
informal bits, because such terminology can always be objected to on the
grounds that it isn't supported by formal semantics.

Anton
 
Chris F Clark

Chris Smith said:
I thought about this in the context of reading Anton's latest post to
me, but I'm just throwing out an idea.
I think there is some sense of convergence here.

Apologies for following-up to my own post, but I forgot to describe
the convergence.

The convergence is that there is something more general than what type
theorists talk about when discussing types. Type theory is an
abstraction and systemization of reasoning about types. This more
general reasoning doesn't need to be sound; it just needs to be based
around "types" and computations.

My own reasoning uses such an unsound system. It has clear roots in
sound reasoning, and I wish I could make it sound, but....

In any case, dynamic tagging helps support my unsound reasoning by
providing evidence when I have made an error.

I hope this makes it less mystical for Marshall.

-Chris
 
Gabriel Dos Reis

| David Hopwood wrote:
| >
| > A type system that required an annotation on all subprograms that do not
| > provably terminate, OTOH, would not impact expressiveness at all, and would
| > be very useful.
|
| Interesting. I have always imagined doing this by allowing an
| annotation on all subprograms that *do* provably terminate. If
| you go the other way, you have to annotate every function that
| uses general recursion (or iteration if you swing that way) and that
| seems like it might be burdensome. Further, it imposes the
annotation requirement even where the programmer might not
| care about it, which the reverse does not do.

Simple things should stay simple. Recursions that provably terminate
are among the simplest ones. Annotations in those cases could be
allowed, but not required. Otherwise the system might become very
irritating to program with.

-- Gaby
 
Chris Smith

Chris F Clark said:
Apologies for following-up to my own post, but I forgot to describe
the convergence.

The convergence is that there is something more general than what type
theorists talk about when discussing types. Type theory is an
abstraction and systemization of reasoning about types. This more
general reasoning doesn't need to be sound; it just needs to be based
around "types" and computations.

Unfortunately, I have to again reject this idea. There is no such
restriction on type theory. Rather, the word type is defined by type
theorists to mean the things that they talk about. Something becomes a
type the instant someone conceives of a type system that uses it. We
could try to define a dynamic type system, as you seem to say, as a
system that checks against only what can be known about some ideal type
system that would be able to prove the program valid... but I
immediately begin to wonder if it would ever be possible to prove that
something is a dynamic type system for a general-purpose (Turing-
complete) language.

If you could define types for a dynamic type system in some stand-alone
way, then this difficulty could be largely pushed out of the way,
because we could say that anything that checks types is a type system,
and then worry about verifying that it's a sound type system without
worrying about whether it's a subset of the perfect type system. So
back to that problem again.

You said:
I consider any case where a program gives a function an input outside
of its domain a "type error", because an ideal (but unachievable) type
system would have prevented it. That does not make all errors type
errors, because if I give a program an input within its domain and it
mis-computes the result, that is not a type error.

So what is the domain of a function? (Heck, what is a function? I'll
neglect that by assuming that a function is just an expression;
otherwise, this will get more complicated.) I see three possible ways
to look at a domain.

(I need a word here that implies something like a set, but I don't care
to verify the axioms of set theory. I'm just going to use set. Hope
that's okay.)

1. The domain is the set of inputs to that expression which are going to
produce a correct result.

2. The domain is the set of inputs that I expected this expression to
work with when I wrote it.

3. The domain is the set of inputs for which the expression has a
defined result within the language semantics.

So the problem, then, more clearly stated, is that we need something
stronger than #3 and weaker than #1, but #2 includes some psychological
nature that is problematic to me (though, admittedly, FAR less
problematic than the broader uses of psychology to date.)
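
To make the three options concrete, here is an illustration of my own
(in Haskell, though the language doesn't matter):

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- #3 (defined within the semantics): every [Double], including [],
--     for which the result is the perfectly well-defined NaN (0/0).
-- #2 (what the author expected when writing it): non-empty lists, say.
-- #1 (inputs producing a correct result): non-empty lists whose sum
--     neither overflows nor loses too much precision, a smaller and
--     much harder set to pin down than #2.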

Is that a fair description of where we are in defining types, then?
 
David Hopwood

Marshall said:
Interesting. I have always imagined doing this by allowing an
annotation on all subprograms that *do* provably terminate. If
you go the other way, you have to annotate every function that
uses general recursion (or iteration if you swing that way) and that
seems like it might be burdensome.

Not at all. Almost all subprograms provably terminate (with a fairly
easy proof), even if they use general recursion or iteration.

If it were not the case that almost all functions provably terminate,
then the whole idea would be hopeless. If a subprogram F calls G, then
in order to prove that F terminates, we probably have to prove that G
terminates. Consider a program where only half of all subprograms are
annotated as provably terminating. In that case, we would be faced with
very many cases where the proof cannot be discharged, because an
annotated subprogram calls an unannotated one.

If, on the other hand, more than, say, 95% of subprograms provably
terminate, then it is much more likely that we (or the inference
mechanism) can easily discharge any particular proof involving more
than one subprogram. So provably terminating should be the default,
and other cases should be annotated.
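
As a sketch of what this might look like (the annotation syntax below is
hypothetical; Idris's "total"/"partial" annotation pair is an existing
approximation of the idea):

-- Provably terminating (structural recursion on a finite list);
-- this common case would need no annotation.
sumList :: [Int] -> Int
sumList []     = 0
sumList (x:xs) = x + sumList xs

-- {-# NOT_PROVABLY_TERMINATING #-}  -- hypothetical annotation
-- Termination here depends on the open Collatz conjecture, so the
-- proof cannot be discharged and the subprogram must be marked.
collatz :: Integer -> Integer
collatz 1 = 0
collatz n
  | even n    = 1 + collatz (n `div` 2)
  | otherwise = 1 + collatz (3 * n + 1)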

In some languages, annotations may never or almost never be needed,
because they are implied by other characteristics. For example, the
concurrency model used in the language E (www.erights.org) is such
that there are implicit top-level event loops which do not terminate
as long as the associated "vat" exists, but handling a message is
always required to terminate.
Further, it imposes the
annotation requirement even where the programmer might not
care about it, which the reverse does not do.

If the annotation marks not-provably-terminating subprograms, then it
calls attention to those subprograms. This is what we want, since it is
less safe/correct to use a nonterminating subprogram than a terminating
one (in some contexts).

There could perhaps be a case for distinguishing annotations for
"intended not to terminate", and "intended to terminate, but we
couldn't prove it".

I do not know how well such a type system would work in practice; it may
be that typical programs would involve too many non-trivial proofs. This
is something that would have to be tried out in a research language.
 
David Hopwood

Pascal said:
Vesa said:
I think that we're finally getting to the bottom of things. While
reading your responses something became very clear to me: latent-typing and
latent-types are not a property of languages. Latent-typing, also known as
informal reasoning, is something that all programmers do as a normal part
of programming. To say that a language is latently-typed is to make a
category mistake, because latent-typing is not a property of languages.

I disagree with you and agree with Anton. Here, it is helpful to
understand the history of Scheme a bit: parts of its design are a
reaction to what Schemers perceived as having failed in Common Lisp (and
other previous Lisp dialects).

One particularly illuminating example is the treatment of nil in Common
Lisp. That value is a very strange beast in Common Lisp because it
stands for several concepts at the same time: most importantly the empty
list and the boolean false value. Its type is also "interesting": it is
both a list and a symbol at the same time. It is also "interesting" that
its quoted value is equivalent to the value nil itself. This means that
the following two forms are equivalent:

(if nil 42 4711)
(if 'nil 42 4711)

Both forms evaluate to 4711.

It's also the case that taking the car or cdr (first or rest) of nil
doesn't give you an error, but simply returns nil as well.

The advantage of this design is that it allows you to express a lot of
code in a very compact way. See
http://www.apl.jhu.edu/~hall/lisp/Scheme-Ballad.text for a nice
illustration.

The disadvantage is that it is mostly impossible to have a typed view of
nil, at least one that clearly disambiguates all the cases. There are
also other examples where Common Lisp conflates different types, and
sometimes only for special cases. [1]

Now compare this with the Scheme specification, especially this section:
http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-6.html#%_sec_3.2

This clearly deviates strongly from Common Lisp (and other Lisp
dialects). The emphasis here is on a clear separation of all the types
specified in the Scheme standard, without any exception. This is exactly
what makes it straightforward in Scheme to have a latently typed view of
programs, in the sense that Anton describes. So latent typing is a
property that can at least be enabled / supported by a programming
language, so it is reasonable to talk about this as a property of some
dynamically typed languages.

If anything, I think that this example supports my and Vesa's point.
The example demonstrates that languages
*that are not distinguished in whether they are called latently typed*
support informal reasoning about types to varying degrees.
 
David Hopwood

Anton said:
No, I'm not saying it is, although I am saying that the former supports
the latter.

But since the relevant feature that the languages in question possess is
dynamic tagging, it is more precise and accurate to use that term to
describe them.

Also, dynamic tagging is only a minor help in this respect, as evidenced
by the fact that explicit tag tests are quite rarely used by most programs,
if I'm not mistaken. IMHO, the support does not go far enough for it to be
considered a defining characteristic of these languages.

When tag tests are used implicitly by other language features such as
pattern matching and dynamic dispatch, they are used for purposes that are
equally applicable to statically typed and non-(statically-typed) languages.
Right. I see at least two issues here: one is that as a matter of
shorthand, compressing "language which supports latent typing" to
"latently-typed language" ought to be fine, as long as the term's
meaning is understood.

If, for the sake of argument, "language which supports latent typing" is
to be compressed to "latently-typed language", then statically typed
languages must be considered also latently typed.

After all, statically typed languages support expression and
verification of the "types in the programmer's head" at least as well
as non-(statically-typed) languages do. In particular, most recent
statically typed OO languages use dynamic tagging and are memory safe.
And they support comments ;-)

This is not, quite obviously, what most people mean when they say
that a particular *language* is "latently typed". They almost always
mean that the language is dynamically tagged, *not* statically typed,
and memory safe. That is how this term is used in R5RS, for example.
But beyond that, there's an issue here about the definition of "the
language". When programming in a latently-typed language, a lot of
action goes on outside the language - reasoning about static properties
of programs that are not captured by the semantics of the language.

This is true of programming in any language.
This means that there's a sense in which the language that the
programmer programs in is not the same language that has a formal
semantic definition. As I mentioned in another post, programmers are
essentially mentally programming in a richer language - a language which
has informal (static) types - but the code they write down elides this
type information, or else puts it in comments.

If you consider stuff that might be in the programmer's head as part
of the program, where do you stop? When I maintain a program written
by someone I've never met, I have no idea what was in that programmer's
head. All I have is comments, which may be (and frequently are,
unfortunately) inaccurate.

(Actually, it's worse than that -- if I come back to a program 5 years
later, I probably have little idea what was in my head at the time I
wrote it.)
We have to accept, then, that the formal semantic definitions of
dynamically-checked languages are incomplete in some important ways.
Referring to those semantic definitions as "the language", as though
that's all there is to the language in a broader sense, is misleading.

Bah, humbug. The language is just the language.
In this context, the term "latently-typed language" refers to the
language that a programmer experiences, not to the subset of that
language which is all that we're typically able to formally define.

I'm with Marshall -- this is way too mystical for me.
 
David Hopwood

David said:
I would call it an informal type annotation. But the very fact that
it has to be expressed as a comment, and is not checked,

What I meant to say here is "and is not used in any way by the language
implementation,"
 
David Hopwood

The problem with this is that from that point on, what you're running
is neither the old nor the new program, since its state may have been
influenced by the bug before you corrected it.

I find it far simpler to just restart the program after correcting
anything. If this is too difficult, I change the design to make it
less difficult.
Wow, interesting.

(I say the following only to point out differing strategies of
development, not to say one or the other is right or bad or
whatever.)

I occasionally find myself doing the above, and when I do,
I take it as a sign that I've screwed up. I find that process
excruciating, and I do everything I can to avoid it. Over the
years, I've gotten more and more adept at trying to turn
as many bugs as possible into type errors, so I don't have
to run the debugger.

Now, there might be an argument to be made that if I had
been using a dynamic language, the process wouldn't be
so bad, and I wouldn't dislike it so much. But maybe:

As a strawman: suppose there are two broad categories
of mental strategies for thinking about bugs, and people
fall naturally into one way or the other, the way there
are morning people and night people. One group is
drawn to the static analysis, one group hates it.
One group hates working in the debugger, one group
is able to use that tool very effectively and elegantly.

Anyway, it's a thought.

I don't buy this -- or at least, I am not in either group.

A good debugger is invaluable regardless of your attitude to type
systems. Recently I was debugging two programs written to do similar
things in the same statically typed language (C), but where a debugger
could not be used for one of the programs. It took maybe 5 times
longer to find and fix each bug without the debugger, and I found it
a much more frustrating experience.
 
Chris F Clark

Chris Smith said:
Unfortunately, I have to again reject this idea. There is no such
restriction on type theory. Rather, the word type is defined by type
theorists to mean the things that they talk about.

Do you reject that there could be something more general than what a
type theorist discusses? Or do you reject calling such things a type?

Later you write:
because we could say that anything that checks types is a type system,
and then worry about verifying that it's a sound type system without
worrying about whether it's a subset of the perfect type system.

I'm particularly interested in whether something unsound (and perhaps
ambiguous) could be called a type system. I definitely consider such
things type systems. However, I like my definitions very general and
vague. Your writing suggests the opposite preference. To me, if
something works in a way analogous to a known type system, I tend
to consider it a "type system". That probably isn't going to be at
all satisfactory to someone wanting a more rigorous definition. Of
course, to my mind, the rigorous definitions are just an attempt to
capture something that is already known informally and put it on a
more rational foundation.
So what is the domain of a function? (Heck, what is a function? ....
(I need a word here that implies something like a set, but I don't care
to verify the axioms of set theory. I'm just going to use set. Hope
that's okay.)

Yes, I meant the typical HS algebra definition of domain, which is a
set, same thing for function. More rigorous definitions can be
substituted as necessary and appropriate.
1. The domain is the set of inputs to that expression which are going to
produce a correct result.

2. The domain is the set of inputs that I expected this expression to
work with when I wrote it.

3. The domain is the set of inputs for which the expression has a
defined result within the language semantics.

So the problem, then, more clearly stated, is that we need something
stronger than #3 and weaker than #1, but #2 includes some psychological
nature that is problematic to me (though, admittedly, FAR less
problematic than the broader uses of psychology to date.)

Actually, I like 2 quite well. There is some set in my mind when I'm
writing a particular expression. It is likely an ill-defined set, but
I don't notice that. That set is exactly the "type". I approximate
that set and its relationships to other sets in my program with the
typing machinery of the language I am using. That is the static (and
well-founded) type. It is however, only an approximation to the ideal
"real" type. If I could make that imaginary mental set concrete (and
well-defined), that would be exactly my concept of type.

Now, that overplays the conceptual power in my mind. My mental image
is influenced by my knowledge of the semantics of the language and
also any implementations I may have used. Thus, I will weaken 2 to be
closer to 3, because I don't expect a perfectly ideal type system,
just an approximation. To the extent that I am aware of gotchas in
the language or a relevant implementation, I may write extra code to
try to limit things to model 1. Note that in my fallible mind, 1 and
2 are identical. In my hubris, I expect that the extra checks I have
made have restricted my inputs to where model 3 and 1 coincide.

Expanding that a little. I expect the language to catch type errors
where I violate model 3. To the extent model 3 differs from model 1
and my code hasn't added extra checks to catch it, I have bugs
resulting from undetected type errors or perhaps modelling errors. To
me they are type errors, because I expect that there is a typing
system that captures 1 for the program I am writing, even though I
know that that is not generally true.

As I reflect on this more, I really do like 2 in the sense that I
believe there is some abstract Platonic set that is an ideal type
(like 1), and that is what the type system is approximating. To the
extent that languages approximate either 1 or 2, those facilities are
type facilities of a language. That puts me very close to your
(rejected) definition of type as the well-defined semantics of the
language. Except I think of types as the sets of values that the
language provides, so it isn't the entire semantics of the language,
just that part which divides values into discrete sets which are
appropriate to different operations (and detects inappropriate values).

-Chris
 
Marshall

Chris said:
I'm particularly interested in whether something unsound (and perhaps
ambiguous) could be called a type system. I definitely consider such
things type systems.

I don't understand. You are saying you prefer to investigate the
unsound over the sound?

However, I like my definitions very general and
vague. Your writing suggests the opposite preference.

Again, I cannot understand this. In a technical realm, vagueness
is the opposite of understanding. To me, it sounds like you are
saying that you prefer not to understand the field you work in.

To me, if
something works in a way analogous to a known type system, I tend
to consider it a "type system". That probably isn't going to be at
all satisfactory to someone wanting a more rigorous definition.

Analogies are one thing; definitions are another.

Of
course, to my mind, the rigorous definitions are just an attempt to
capture something that is already known informally and put it on a
more rational foundation.

If something is informal and non-rational, it cannot be said to
be "known." At best, it could be called "suspected." Even if you
believe something that turns out to be true, we could not say
that you "knew" it unless your reasons for your belief were
valid.

I flipped a coin to see who would win the election; it came
up "Bush". Therefore I *knew* who was going to win the
election before it happened. See the problem?


Marshall
 
Matthias Blume

David Hopwood said:
I don't think Vesa was talking about trying to solve the halting problem.

A type system that required termination would indeed significantly restrict
language expressiveness -- mainly because many interactive processes are
*intended* not to terminate.

Most interactive processes are written in such a way that they
(effectively) consist of an infinitely repeated application of some
function f that maps the current state and the input to the new state
and the output.

f : state * input -> state * output

This function f itself has to terminate, i.e., it has to be
guaranteed that after any given input, there will eventually be an
output. In most interactive systems the requirements are in fact much
stricter: the output should come "soon" after the input has been
received.

I am pretty confident that the f for most (if not all) existing
interactive systems could be coded in a language that enforces
termination. Only the loop that repeatedly applies f would have to be
coded in a less restrictive language.
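
For concreteness, a small sketch (in Haskell; the types and names are
only illustrative):

type Input  = String
type Output = String
type State  = Int

-- f : state * input -> state * output.  This terminates on every
-- call, so it could live in a termination-enforcing language.
step :: State -> Input -> (State, Output)
step n msg = (n + 1, "reply #" ++ show (n + 1) ++ " to " ++ msg)

-- Only this driver loop is intentionally non-terminating.
run :: State -> IO ()
run s = do
  i <- getLine
  let (s', o) = step s i
  putStrLn o
  run s'

main :: IO ()
main = run 0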
 
Chris Smith

Chris F Clark said:
Do you reject that there could be something more general than what a
type theorist discusses? Or do you reject calling such things a type?

I think the correspondence runs partly in the wrong direction when we
describe it that way. If someone were to stand up and say "every
possible bit of reasoning about program correctness or behavior is type
reasoning", then I'd be happy to say that formal type systems are a
subset of that kind of reasoning. However, that's quite the statement
to ask of anyone. So far, everyone has wanted to say that there are
some kinds of reasoning that are type reasoning and some that are not.
If that is the case, then a formal type system is in that sense more
general than this intuitive notion of type, since formal type systems
are not limited to verifying any specific category of statements about
program behavior (except, perhaps that they are intended to verify
guarantees, not possibilities; I don't believe it would fit all
definitions of formal types to try to verify that it is possible for a
program to terminate, for example).

What I can't agree to is that what you propose is actually more general.
It is more general in some ways, and more limited in others. As such,
the best you can say is that it is analogous to formal types in some ways,
but certainly not that it's a generalization of them.
Later you write:

I'm particularly interested in whether something unsound (and perhaps
ambiguous) could be called a type system.

Yes, although this is sort of a pedantic question in the first place, I
believe that an unsound type system would still generally be called a
type system in formal type theory. However, I have to qualify that with
a clarification of what is meant by unsound. We mean that it DOES
assign types to expressions, but that those types are either not
sufficient to prove what we want about program behavior, or they are not
consistent, so that a program that was well typed may reduce to a
poorly typed program during its execution. Obviously, such type systems don't
prove anything at all, but they probably still count as type systems.
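
As a concrete instance (my example, not part of any definition): adding
an unrestricted coercion rule to an otherwise sound system produces
exactly this situation:

import Unsafe.Coerce (unsafeCoerce)

-- The unsound rule assigns this the type Int, but the claim is false:
-- evaluating n reinterprets a string's representation as a number,
-- producing garbage or a crash rather than a detected type error.
n :: Int
n = unsafeCoerce "not an int"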

(I'll point out briefly that a typical, but not required, approach to
type theory is to define program semantics so that the evaluator lacks
even the rules to continue evaluating programs that are poorly typed or
that don't have the property of interest. Hence, you'll see statements
like one earlier in this thread that part of type-soundness is a
guarantee that the evaluation or semantics doesn't get stuck. I
actually think this is an unfortunate confusion of the model with the
thing being modeled, and the actual requirement of interest is that the
typing relation is defined so that all well-typed terms have the
interesting property which we are trying to prove. It is irrelevant
whether the definition of the language semantics happens to contain
rules that generalize to ill-typed programs or not.)

I suspect, though, that you mean something else by unsound. I don't
know what you mean, or whether it could be considered formally a type
system.
Actually, I like 2 quite well. There is some set in my mind when I'm
writing a particular expression. It is likely an ill-defined set, but
I don't notice that. That set is exactly the "type".

I don't actually mind #2 when we're talking about types that exist in
the programmer's mind. I suspect that it may get some complaint along
the lines that in the presence of type polymorphism, programmers don't
need to be (perhaps rarely are) thinking of any specific set of values
when they write code. I would agree with that statement, but point out
that I can define at least my informal kind of set by saying, for
instance, "the set of all inputs on which X operations make sense".

I believe there is more of a concrete difference between #1 and #2 than
you may realize. If we restrict types to considering the domain of
functions without regard to the context in which they are called, then
there are plenty of inputs on which this function does its own job quite
well, but a global analysis tool may be able to conclude that if that
input has been passed in, then this program is already beyond the
possibility of producing a correct result. #1 would, therefore, be a
more global form of analysis than #2, rather than being merely a
question of whether you are fallible or not.
Expanding that a little. I expect the language to catch type errors
where I violate model 3.

So would I. (Actually, that's another pesky question; if the language
"catches" it, did the error even occur, or did the expression just gain
a defined result such that domain #3 is indistinguishable from the set
of all possible inputs? I tend to agree with Chris Uppal, I suppose,
that the latter is really the case, so that dynamic type systems in
languages, if they are sound, prevent concept #3 from ever being in
question. But, of course, you don't think about it that way, which
distinguishes 2 from 3.)
 
George Neuner

No, the following compiles perfectly fine (using GNU Pascal):

program bla;
type
  apple = integer;
  orange = integer;
var
  a : apple;
  o : orange;
begin
  a := o
end.

You are both correct.

The original Pascal specification failed to mention whether user
defined types should be compatible by name or by structure. Though
Wirth's own 1974 reference implementation used name compatibility,
implementations were free to use structure compatibility instead and
many did. There was a time when typing differences made Pascal code
completely non-portable[1].

When Pascal was finally standardized in 1983, the committees followed
C's (dubious) example and chose to use structure compatibility for
simple types and name compatibility for records.
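
For comparison (my illustration, in Haskell): a type synonym behaves
like structure compatibility, while a newtype behaves like name
compatibility:

type Apple = Integer             -- structural: just another name for Integer
newtype Orange = Orange Integer  -- nominal: a distinct type

a :: Apple
a = 5            -- accepted: Apple and Integer are the same type

-- b :: Apple
-- b = Orange 5  -- rejected: Orange is only name-compatible with itself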


[1] Wirth also failed to specify whether boolean expression evaluation
should be short-circuit or complete. Again, implementations went in
both directions. Some allowed either method by switch, but the type
compatibility issue continued to plague Pascal until standard
conforming compilers emerged in the mid 80's.

George
 
Anton van Straaten

David said:
But since the relevant feature that the languages in question possess is
dynamic tagging, it is more precise and accurate to use that term to
describe them.

So you're proposing to call them dynamically-tagged languages?
Also, dynamic tagging is only a minor help in this respect, as evidenced
by the fact that explicit tag tests are quite rarely used by most programs,
if I'm not mistaken.

It sounds as though you're not considering the language implementations
themselves, where tag tests occur all the time - potentially on every
operation. That's how "type errors" get detected. This is what I'm
referring to when I say that dynamic tags support latent types.

Tags are absolutely crucial for that purpose: without them, you have a
language similar to untyped lambda calculus, where "latent type errors"
can result in very difficult to debug errors, since execution can
continue past errors and produce completely uninterpretable results.
IMHO, the support does not go far enough for it to be
considered a defining characteristic of these languages.

Since tag checking is an implicit feature of language implementations
and the language semantics, it certainly qualifies as a defining
characteristic.
When tag tests are used implicitly by other language features such as
pattern matching and dynamic dispatch, they are used for purposes that are
equally applicable to statically typed and non-(statically-typed) languages.

A fully statically-typed language doesn't have to do tag checks to
detect static type errors.

Latently-typed languages do tag checks to detect latent type errors.

You can take the preceding two sentences as a summary definition for
"latently-typed language", which will come in handy below.
If, for the sake of argument, "language which supports latent typing" is
to be compressed to "latently-typed language", then statically typed
languages must be considered also latently typed.

See definition above. The phrase "language which supports latent
typing" wasn't intended to be a complete definition.
After all, statically typed languages support expression and
verification of the "types in the programmer's head" at least as well
as non-(statically-typed) languages do. In particular, most recent
statically typed OO languages use dynamic tagging and are memory safe.
And they support comments ;-)

But they don't use tag checks to validate their static types.

When statically-typed languages *do* use tags, in cases where the static
type system isn't sufficient to avoid them, then indeed, those parts of
the program use latent types, in the exact same sense as more fully
latently-typed languages do. There's no conflict here, it's simply the
case that most statically-typed languages aren't fully statically typed.
This is not, quite obviously, what most people mean when they say
that a particular *language* is "latently typed". They almost always
mean that the language is dynamically tagged, *not* statically typed,
and memory safe. That is how this term is used in R5RS, for example.

The R5RS definition is compatible with what I've just described, because
the parts of a statically-typed language that would be considered
latently-typed are precisely those which rely on dynamic tags.
This is true of programming in any language.

Right, but when you compare a statically-typed language to an untyped
language at the formal level, a great deal more static reasoning goes on
outside the language in the untyped case.

What I'm saying is that it makes no sense, in most realistic contexts,
to think of untyped languages as being just that: languages in which the
type of every term is simply a tagged value, as though no static
knowledge about that value exists. The formal model requires that you
do this, but programmers can't function if that's all the static
information they have. This isn't true in the case of a fully
statically-typed language.
If you consider stuff that might be in the programmer's head as part
of the program, where do you stop? When I maintain a program written
by someone I've never met, I have no idea what was in that programmer's
head. All I have is comments, which may be (and frequently are,
unfortunately) inaccurate.

You have to make the same kind of inferences that the original
programmer made. E.g. when you see a function that takes some values,
manipulates them using numeric operators, and returns a value, you have
to either figure out or trust the comments that the function accepts
numbers (or integers, floats etc.) and returns a number.

This is not some mystical psychological quality: it's something that
absolutely must be done in order to be able to reason about a program at
all.
(Actually, it's worse than that -- if I come back to a program 5 years
later, I probably have little idea what was in my head at the time I
wrote it.)

But that's simply the reality - you have to reconstruct: recover the
latent types, IOW. There's no way around that!

Just to be concrete, I'll resurrect my little Javascript example:

function timestwo(x) { return x*2 }

Assume I wrote this and distributed it in source form as a library, and
now you have to maintain it. How do you know what values to call it
with, or what kind of values it returns? What stops you from calling it
with a string?

Note that according to the formal semantics of an untyped language, the
type of that function is "value -> value".

The phrase "in the programmer's head" was supposed to help communicate
the concept. Don't get hung up on it.
Bah, humbug. The language is just the language.

If you want to have this discussion precisely, you have to do better
than that. There is an enormous difference between the formal semantics
of an untyped language, and the (same) language which programmers work
with and think of as "dynamically typed".

Use whatever terms you like to characterize this distinction, but you
can't deny that the distinction exists, and that it's quite a big,
important one.
I'm with Marshall -- this is way too mystical for me.

I said more in my reply to Marshall. Perhaps I'm not communicating
well. At worst, what I'm talking about is informal. It's easy to prove
that you can't reason about a program if you don't know what types of
values a function accepts or returns - which is what an untyped formal
model gives you.

Anton
 
Anton van Straaten

Marshall said:
I don't understand. You are saying you prefer to investigate the
unsound over the sound?

The problem is that there are no useful sound definitions for the type
systems (in the static sense) of dynamically-typed languages. Yet, we
work with type-like static properties in those languages all the time,
as I've been describing.

If you want to talk about those properties as though they were types,
one of the options is what Chris Clark described when he wrote "I reason
about my program using types which I can (at least partially) formalize,
but for which there is no sound axiomatic system."
Again, I cannot understand this. In a technical realm, vagueness
is the opposite of understanding. To me, it sounds like you are
saying that you prefer not to understand the field you work in.

The issue as I see it is simply that if we're going to work with
dynamically-typed programs at all, our choices are limited at present,
when it comes to formal models that capture our informal static
reasoning about those programs. In statically-typed languages, this
reasoning is mostly captured by the type system, but it has no formal
description for dynamically-typed languages.
Analogies are one thing; definitions are another.

A formal definition is problematic, precisely because we're dealing with
something that to a large extent is deliberately unformalized. But as
Chris Clark pointed out, "these types are locally sound, i.e. I can
prove properties that hold for regions of my programs." We don't have
to rely entirely on analogies, and this isn't something that's entirely
fuzzy. There are ways to relate it to formal type theory.
If something is informal and non-rational, it cannot be said to
be "known."

As much as I love the view down into that abyss, we're nowhere near
being in such bad shape.

We know that we can easily take dynamically-typed program fragments and
assign them type schemes, and we can make sure that the type schemes
that we use in all our program fragments use the same type system.

We know that we can assign type schemes to entire dynamically-typed
programs, and we can even automate this process, although not without
some practical disadvantages.

So we actually have quite a bit of evidence about the presence of static
types in dynamically-typed programs.
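
For example (using Haskell's type inference as a stand-in for a
soft-typing tool applied to my earlier Javascript fragment):

timestwo x = x * 2
-- Inferred without any annotation:
--   timestwo :: Num a => a -> a
-- which is a formal counterpart of the latent "number -> number".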

Besides, we should be careful not to forget that our formal methods are
incredibly weak compared to the power of our intelligence. If that were
not the case, there would be no advantage to possessing intelligence, or
implementing AIs.

We are capable of reasoning outside of fully formal frameworks. The
only way in which we are able to develop formal systems is by starting
with the informal.

We're currently discussing something that so far has only been captured
fairly informally. If we restrict ourselves to only what we can say
about it formally, then the conversation was over before it began.

Anton
 
rossberg

Marshall said:
This means that there's a sense in which the language that the
programmer programs in is not the same language that has a formal
semantic definition. As I mentioned in another post, programmers are
essentially mentally programming in a richer language - a language which
has informal (static) types - but the code they write down elides this
type information, or else puts it in comments.

[...]

In this context, the term "latently-typed language" refers to the
language that a programmer experiences, not to the subset of that
language which is all that we're typically able to formally define.

That language is not a subset; if anything, it's the other way round,
but I'd say they are rather incomparable. That is, they are different
languages.
That is starting to get a bit too mystical for my tastes.

I have to agree.

\sarcasm One step further, and somebody starts calling C a "latently
memory-safe language", because a real programmer "knows" that his code
is in a safe subset... And where he is wrong, dynamic memory page
protection checks will guide him.

- Andreas
 
Joachim Durchholz

George said:
The point is really that the checks that prevent these things must be
performed at runtime and can't be prevented by any practical type
analysis performed at compile time. I'm not a type theorist but my
opinion is that a static type system that could, a priori, prevent the
problem is impossible.

No type theory is needed.
Assume that the wide index type goes into a function and the result is
assigned to a variable of the narrow type, and it's instantly clear that
the problem is undecidable.
Undecidability can always be avoided by adding annotations, but of
course that would be gross overkill in the case of index type widening.
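
A sketch of the runtime check that takes over where the static analysis
gives up (the names and bounds are mine, in Haskell):

newtype Narrow = Narrow Int
  deriving Show

-- Narrowing from the wide type is checked at runtime, at the point of
-- assignment, since no practical static analysis can decide it.
narrow :: Int -> Narrow
narrow i
  | 0 <= i && i < 10 = Narrow i
  | otherwise        = error ("out of narrow range: " ++ show i)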

Regards,
Jo
 
Joachim Durchholz

Dimitri said:
That is the basic argument in favour of compile time error checking,
extended to runtime errors. I don't really care if it's the compiler
or runtime that tells the luser "your code is broken", as long as it
makes it clear it's *his* code that's broken, not mine.

You can make runtime errors point to the appropriate code. Just apply
"programming by contract": explicitly state what preconditions a
function is requiring of the caller, have it check the preconditions on
entry (and, ideally, throw the exception in a way that the error is
reported in the caller's code - not a complicated thing but would
require changes in the exception machinery of most languages).

Any errors past that point are either a too liberal precondition (i.e. a
bug in the precondition - but documenting wrong preconditions is still a
massive bug, even if it's "just" a documentation bug), or a bug in the
function's code.
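
As an existing approximation of this (my sketch, in Haskell): GHC's
HasCallStack mechanism makes a failed precondition report the caller's
location:

import GHC.Stack (HasCallStack)

-- The precondition is checked on entry; because of HasCallStack, the
-- resulting error message carries a call stack that points at the
-- *caller's* code, which is where the blame belongs.
requirePositive :: HasCallStack => Int -> Int
requirePositive x
  | x > 0     = x
  | otherwise = error ("precondition violated: expected positive, got "
                       ++ show x)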

Regards,
Jo
 
Joachim Durchholz

OK, now we are simply back full circle to Chris Smith's original
complaint that started this whole subthread, namely (roughly) that
long-established terms like "type" or "typing" should *not* be
stretched in ways like this, because that is technically inaccurate and
prone to misinterpretation.

Sorry, I have to insist that it's not me who's stretching terms here.

All textbook definitions that I have seen define a type as the
set/operations/axioms triple I mentioned above.
No mention of immutability, at least not in the definitions.

Plus, I don't see any necessity on insisting on immutability for the
definition of "type". Otherwise, you'd have to declare that Smalltalk
objects truly don't have a type (not even an implied one), and that
would simply make no sense: they do in fact have a type, even though it
may occasionally change.

Regards,
Jo
 
