Chris said:
I don't recall who said what at this
point, but earlier today someone else posted -- in this same thread --
the idea that static type "advocates" want to classify some languages as
untyped in order to put them in the same category as assembly language
programming. That's something I've never seen, and I think it's far
from the goal of pretty much anyone; but clearly, *someone* was
concerned about it. I don't know if much can be done to clarify this
rhetorical problem, but it does exist.
For the record, I'm not concerned about that problem as such. However,
I do think that characterizations of dynamically typed languages from a
static type perspective tend to oversimplify, usually because they
ignore the informal aspects which static type systems don't capture.
Terminology is a big part of this, so there are valid reasons to be
careful about how static type terminology and concepts are applied to
languages which lack formally defined static type systems.
The *other* bit that's been brought up in this thread is that the word
"type" is just familiar and comfortable for programmers working in
dynamically typed languages, and that they don't want to change their
vocabulary.
What I'm suggesting is actually a kind of bridge between the two
positions. The dynamically typed programmer tends to think in terms of
values having types, rather than variables. What I'm pointing out is
that even those programmers reason about something much more like static
types than they might realize; and that there's a connection between
that reasoning and static types, and also a connection to the tags
associated with values.
If you wanted to take the word "type" and have it mean something
reasonably consistent between the static and dynamic camps, what I'm
suggesting at least points in that direction. Obviously, nothing in the
dynamic camp is perfectly isomorphic to a real static type, which is why
I'm qualifying the term as "latent type", and attempting to connect it
to both static types and to tags.
The *third* thing that's brought up is that there is a (to me, somewhat
vague) conception going around that the two really ARE varieties of the
same thing. I'd like to pin this down more, and I hope we get there,
but for the time being I believe that this impression is incorrect. At
the very least, I haven't seen a good way to state any kind of common
definition that withstands scrutiny. There is always an intuitive word
involved somewhere which serves as an escape hatch for the author to
retain some ability to make a judgement call, and that of course
sabotages the definition. So far, that word has varied through all of
"type", "type error", "verify", and perhaps others... but I've never
seen anything that allows someone to identify some universal concept of
typing (or even the phrase "dynamic typing" in the first place) in a way
that doesn't appeal to intuition.
It's obviously going to be difficult to formally pin down something that
is fundamentally informal. It's fundamentally informal because if
reasoning about the static properties of terms in DT languages were
formalized, it would essentially be something like a formal type system.
However, there are some pretty concrete things we can look at. One of
them, which as I've mentioned elsewhere is part of what led me to my
position, is to look at what a soft type inferencer does. It takes a
program in a dynamically typed language, and infers a static type scheme
for it (obviously, it also defines an appropriate type system for the
language.) This has been done in both Scheme and Erlang, for example.
How do you account for such a feat in the absence of something like
latent types? If there's no static type-like information already
present in the program, how is it possible to assign a static type
scheme to a program without dramatically modifying its source?
I think it's reasonable to look at a situation like that and conclude
that even DT programs contain information that corresponds to types.
Sure, it's informal, and sure, it's usually messy compared to an
explicitly defined equivalent. But the point is that there is
"something" there that looks so much like static types that it can be
identified and formalized.
Undoubtedly, some programmers sometimes perform reasoning about their
programs which could also be performed by a static type system.
I think that's a severe understatement. Programmers always reason about
things like the types of arguments, the types of variables, the return
types of functions, and the types of expressions. They may not do
whole-program inference and proof in the way that a static type system
does, but they do it locally, all the time, every time.
BTW, I notice you didn't answer any of the significant questions which I
posed to Vesa. So let me pose one directly to you: how should I rewrite
the first sentence in the preceding paragraph to avoid appealing to an
admittedly informal notion of type? Note, also, that I'm using the word
in the sense of static properties, not in the sense of tagged values.
Let me pipe up, then, as saying that I can't see those reasons; or at
least, if I am indeed seeing the same reasons that everyone else is,
then I am unconvinced by them that there's any kind of rigorous
connection at all.
For now, I'll stand on what I've written above. When I see if or how
that doesn't convince you, I can go further.
It is, nevertheless, quite appropriate to call the language "untyped" if
I am considering static type systems.
I agree that in the narrow context of considering to what extent a
dynamically typed language has a formal static type system, you can call
it untyped. However, what that essentially says is that formal type
theory doesn't have the tools to deal with that language, and you can't
go much further than that. As long as that's what you mean by untyped,
I'm OK with it.
I seriously doubt that this usage
in any way misleads anyone into assuming the absence of any mental
processes on the part of the programmer. I hope you agree.
I didn't suggest otherwise (or didn't mean to). However, the term
"untyped" does tend to lead to confusion, to a lack of recognition of
the significance of all the static information in a DT program that is
outside the bounds of a formal type system, and the way that runtime tag
checks relate to that static information.
One misconception that occurs is the assumption that all or most of the
static type information in a statically-typed program is essentially
nonexistent in a dynamically-typed program, or at least is no longer
statically present. That can easily be demonstrated to be false, of
course, and I'm not arguing that experts usually make this mistake.
If not,
then I think you significantly underestimate a large category of people.
If you think there's no issue here, I think you significantly
overestimate a large category of people. Let's declare that line of
argument a draw.
I couldn't disagree more. Rather, when you're talking about static
types (or just "types" in most research literature that I've seen), then
the realm of discussion is specifically defined to be the very set of
errors that are automatically caught and flagged by the language
translator. I suppose that it is possible to have an unimplemented type
system, but it would be unimplemented only because someone hasn't felt
the need nor gotten around to it. Being implementABLE is a crucial part
of the definition of a static type system.
I agree with the latter sentence. However, it's nevertheless the case
that it's common to confuse "type system" with "compile-time checking".
This doesn't help reasoning in debates like this, where the existence
of type systems in languages that don't have automated static checking
is being examined.
I am beginning to suspect that you're make the converse of the error I
made earlier in the thread. That is, you may be saying things regarding
the psychological processes of programmers and such that make sense when
discussing dynamic types, and in any case I haven't seen any kind of
definition of dynamic types that is more acceptable yet; but it's
completely irrelevant to static types. Static types are not fuzzy -- if
they were fuzzy, they would cease to be static types -- and they are not
a phenomenon of psychology. To try to redefine static types in this way
not only ignores the very widely accepted basis of entire field of
existing literature, but also leads to false ideas such as that there is
some specific definable set of problems that type systems are meant to
solve.
I'm not trying to redefine static types. I'm observing that there's a
connection between the static properties of untyped programs, and static
types; and attempting to characterize that connection.
You need to be careful about being overly formalist, considering that in
real programming languages, the type system does have a purpose which
has a connection to informal, fuzzy things in the real world. If you
were a pure mathematician, you might get away with claiming that type
systems are just a self-contained symbolic game which doesn't need any
connections beyond its formal ruleset.
Think of it like this: the more ambitious a language's type system is,
the fewer uncaptured static properties remain in the code of programs in
that language. However, there are plenty of languages with rather weak
static type systems. In those cases, code has more static properties
that aren't captured by the type system. I'm pointing out that in many
of these cases, those properties resemble types, to the point that it
can make sense to think of them and reason about them as such, applying
the same sort of reasoning that an automated type inferencer applies.
If you disagree, then I'd be interested to hear your answers to the two
questions I posed to Vesa, and the related one I posed to you above,
about what else to call these things.
I don't think that has been done, in the case of dynamic types.
I was thinking of the type systems designed for soft type inferencers;
as well as those cases where e.g. a statically-typed subset of an
untyped language is defined, as in the case of PreScheme.
But in such cases, you end up where a program in these systems, while in
some sense statically typed, is also a valid untyped program. There's
also nothing to stop someone familiar with such things programming in a
type-aware style - in fact, books like Felleisen et al's "How to Design
Programs" encourage that, recommending that functions be annotated with
comments expressing their type. Examples:
;; product : (listof number) -> number
;; copy : N X -> (listof X)
You also see something similar in e.g. many Erlang programs. In these
cases, reasoning about types is done explicitly by the programmer, and
documented.
What would you call the descriptions in those comments? Once you tell
me what I should call them other than "type" (or some qualified variant
such as "latent type"), then we can compare terminology and decide which
is more appropriate.
It has
been done for static types, but much of what you're saying here is in
contradiction to the definition of a type system in that sense of the
word.
Which is why I'm using a qualified version of the term.
I see it as quite reasonable when there's an effort by several
participants in this thread to either imply or say outright that static
type systems and dynamic type systems are variations of something
generally called a "type system", and given that static type systems are
quite formally defined, that we'd want to see a formal definition for a
dynamic type system before accepting the proposition that they are of a
kind with each other.
A complete formal definition of what I'm talking about may be impossible
in principle, because if you could completely formally define it, you'd
have a static type system.
If that makes you throw up your hands, then all you're saying is that
you're unwilling to deal with a very real phenomenon that has obvious
connections to type theory, examples of which I've given above. That's
your choice, but at the same time, you have to give up exclusive claim
to any variation of the word "type".
Terms are used in a context, and it's perfectly reasonable to call
something a "latent type" or even a "dynamic type" in a certain context
and point out connections between those terms and their cousins (or if
you insist, their completely unrelated namesakes) static types.
So far, all the attempts I've seen to define a
dynamic type system seem to reduce to just saying that there is a well-
defined semantics for the language.
That's a pretty strong claim, considering you have so far ducked the
most important questions I raised in the post you replied to.
I believe that's unacceptable for several reasons, but the most
significant of them is this. It's not reasonable to ask anyone to
accept that static type systems gain their essential "type system-ness"
from the idea of having well-defined semantics.
The definition of static type system is not in question. However,
realistically, as I pointed out above, you have to acknowledge that type
systems exist in, and are inextricably connected to, a larger, less
formal context. (At least, you have to acknowledge that if you're
interested in programs that do anything related to the real world.) And
outside the formally defined borders of static type systems, there are
static properties that bear a pretty clear relationship to types.
Closing your eyes to this and refusing to acknowledge any connection
doesn't achieve anything. In the absence of some other account of the
phenomena in question, (an informal version of) types turn out to be a
pretty convenient way to deal with the situation.
From the perspective of
a statically typed language, this looks like a group of people getting
together and deciding that the real "essence" of what it means to be a
type system is...
There's a sense in which one can say that yes, the informal types I'm
referring to have an interesting degree of overlap with static types;
and also that static types do, loosely speaking, have the "purpose" of
formalizing the informal properties I'm talking about.
But I hardly see why such an informal characterization should bother
you. It doesn't affect the definition of static type. It's not being
held up as a foundation for type theory. It's simply a way of dealing
with the realities of programming in a dynamically-checked language.
There are some interesting philophical issues there, to be sure
(although only if you're willing to stray outside the formal), but you
don't have to worry about those unless you want to.
and then naming something that's so completely non-
essential that we don't generally even mention it in lists of the
benefits of static types, because we have already assumed that it's true
of all languages except C, C++, and assembly language.
This is based on the assumption that all we're talking about is
"well-defined semantics". However, there's much more to it than that.
I need to hear your characterization of the properties I've described
before I can respond.
Anton