What is Expressiveness in a Computer Language


Chris Smith

Chris Smith said:
> I see it as quite reasonable when there's an effort by several
> participants in this thread to either imply or say outright that static
> type systems and dynamic type systems are variations of something
> generally called a "type system" [...]

I didn't say that right. Obviously, no one is making all that great an
EFFORT to say anything. Typing is not too difficult, after all. What I
meant is that there's an argument being made to that effect.
 

Anton van Straaten

Chris said:
> I don't recall who said what at this
> point, but earlier today someone else posted -- in this same thread --
> the idea that static type "advocates" want to classify some languages as
> untyped in order to put them in the same category as assembly language
> programming. That's something I've never seen, and I think it's far
> from the goal of pretty much anyone; but clearly, *someone* was
> concerned about it. I don't know if much can be done to clarify this
> rhetorical problem, but it does exist.

For the record, I'm not concerned about that problem as such. However,
I do think that characterizations of dynamically typed languages from a
static type perspective tend to oversimplify, usually because they
ignore the informal aspects which static type systems don't capture.

Terminology is a big part of this, so there are valid reasons to be
careful about how static type terminology and concepts are applied to
languages which lack formally defined static type systems.

> The *other* bit that's been brought up in this thread is that the word
> "type" is just familiar and comfortable for programmers working in
> dynamically typed languages, and that they don't want to change their
> vocabulary.

What I'm suggesting is actually a kind of bridge between the two
positions. The dynamically typed programmer tends to think in terms of
values having types, rather than variables. What I'm pointing out is
that even those programmers reason about something much more like static
types than they might realize; and that there's a connection between
that reasoning and static types, and also a connection to the tags
associated with values.

If you wanted to take the word "type" and have it mean something
reasonably consistent between the static and dynamic camps, what I'm
suggesting at least points in that direction. Obviously, nothing in the
dynamic camp is perfectly isomorphic to a real static type, which is why
I'm qualifying the term as "latent type", and attempting to connect it
to both static types and to tags.

> The *third* thing that's brought up is that there is a (to me, somewhat
> vague) conception going around that the two really ARE varieties of the
> same thing. I'd like to pin this down more, and I hope we get there,
> but for the time being I believe that this impression is incorrect. At
> the very least, I haven't seen a good way to state any kind of common
> definition that withstands scrutiny. There is always an intuitive word
> involved somewhere which serves as an escape hatch for the author to
> retain some ability to make a judgement call, and that of course
> sabotages the definition. So far, that word has varied through all of
> "type", "type error", "verify", and perhaps others... but I've never
> seen anything that allows someone to identify some universal concept of
> typing (or even the phrase "dynamic typing" in the first place) in a way
> that doesn't appeal to intuition.

It's obviously going to be difficult to formally pin down something that
is fundamentally informal. It's fundamentally informal because if
reasoning about the static properties of terms in DT languages were
formalized, it would essentially be something like a formal type system.

However, there are some pretty concrete things we can look at. One of
them, which as I've mentioned elsewhere is part of what led me to my
position, is to look at what a soft type inferencer does. It takes a
program in a dynamically typed language, and infers a static type scheme
for it (obviously, it also defines an appropriate type system for the
language.) This has been done in both Scheme and Erlang, for example.

How do you account for such a feat in the absence of something like
latent types? If there's no static type-like information already
present in the program, how is it possible to assign a static type
scheme to a program without dramatically modifying its source?

I think it's reasonable to look at a situation like that and conclude
that even DT programs contain information that corresponds to types.
Sure, it's informal, and sure, it's usually messy compared to an
explicitly defined equivalent. But the point is that there is
"something" there that looks so much like static types that it can be
identified and formalized.
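
To make that concrete, here's a tiny sketch (mine, in Python; the
function and its type comment are invented for illustration) of the kind
of static information that's already latent in untouched dynamic code:

# magnitude : (listof number) -> number  -- recoverable by a soft typer
def magnitude(xs):
    total = 0
    for x in xs:
        total = total + x * x      # elements are multiplied and summed,
    return total ** 0.5            # so xs must hold numbers; result is a number

print(magnitude([3, 4]))           # 5.0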

> Undoubtedly, some programmers sometimes perform reasoning about their
> programs which could also be performed by a static type system.

I think that's a severe understatement. Programmers always reason about
things like the types of arguments, the types of variables, the return
types of functions, and the types of expressions. They may not do
whole-program inference and proof in the way that a static type system
does, but they do it locally, all the time, every time.

BTW, I notice you didn't answer any of the significant questions which I
posed to Vesa. So let me pose one directly to you: how should I rewrite
the first sentence in the preceding paragraph to avoid appealing to an
admittedly informal notion of type? Note, also, that I'm using the word
in the sense of static properties, not in the sense of tagged values.

> Let me pipe up, then, as saying that I can't see those reasons; or at
> least, if I am indeed seeing the same reasons that everyone else is,
> then I am unconvinced by them that there's any kind of rigorous
> connection at all.

For now, I'll stand on what I've written above. When I see if or how
that doesn't convince you, I can go further.

> It is, nevertheless, quite appropriate to call the language "untyped" if
> I am considering static type systems.

I agree that in the narrow context of considering to what extent a
dynamically typed language has a formal static type system, you can call
it untyped. However, what that essentially says is that formal type
theory doesn't have the tools to deal with that language, and you can't
go much further than that. As long as that's what you mean by untyped,
I'm OK with it.

> I seriously doubt that this usage
> in any way misleads anyone into assuming the absence of any mental
> processes on the part of the programmer. I hope you agree.

I didn't suggest otherwise (or didn't mean to). However, the term
"untyped" does tend to lead to confusion, to a lack of recognition of
the significance of all the static information in a DT program that is
outside the bounds of a formal type system, and the way that runtime tag
checks relate to that static information.

One misconception that occurs is the assumption that all or most of the
static type information in a statically-typed program is essentially
nonexistent in a dynamically-typed program, or at least is no longer
statically present. That can easily be demonstrated to be false, of
course, and I'm not arguing that experts usually make this mistake.

> If not, then I think you significantly underestimate a large category
> of people.

If you think there's no issue here, I think you significantly
overestimate a large category of people. Let's declare that line of
argument a draw.

> I couldn't disagree more. Rather, when you're talking about static
> types (or just "types" in most research literature that I've seen), then
> the realm of discussion is specifically defined to be the very set of
> errors that are automatically caught and flagged by the language
> translator. I suppose that it is possible to have an unimplemented type
> system, but it would be unimplemented only because someone hasn't felt
> the need nor gotten around to it. Being implementABLE is a crucial part
> of the definition of a static type system.

I agree with the latter sentence. However, it's nevertheless the case
that it's common to confuse "type system" with "compile-time checking".
This doesn't help reasoning in debates like this, where the existence
of type systems in languages that don't have automated static checking
is being examined.

> I am beginning to suspect that you're making the converse of the error I
> made earlier in the thread. That is, you may be saying things regarding
> the psychological processes of programmers and such that make sense when
> discussing dynamic types, and in any case I haven't seen any kind of
> definition of dynamic types that is more acceptable yet; but it's
> completely irrelevant to static types. Static types are not fuzzy -- if
> they were fuzzy, they would cease to be static types -- and they are not
> a phenomenon of psychology. To try to redefine static types in this way
> not only ignores the very widely accepted basis of an entire field of
> existing literature, but also leads to false ideas such as that there is
> some specific definable set of problems that type systems are meant to
> solve.

I'm not trying to redefine static types. I'm observing that there's a
connection between the static properties of untyped programs, and static
types; and attempting to characterize that connection.

You need to be careful about being overly formalist, considering that in
real programming languages, the type system does have a purpose which
has a connection to informal, fuzzy things in the real world. If you
were a pure mathematician, you might get away with claiming that type
systems are just a self-contained symbolic game which doesn't need any
connections beyond its formal ruleset.

Think of it like this: the more ambitious a language's type system is,
the fewer uncaptured static properties remain in the code of programs in
that language. However, there are plenty of languages with rather weak
static type systems. In those cases, code has more static properties
that aren't captured by the type system. I'm pointing out that in many
of these cases, those properties resemble types, to the point that it
can make sense to think of them and reason about them as such, applying
the same sort of reasoning that an automated type inferencer applies.

If you disagree, then I'd be interested to hear your answers to the two
questions I posed to Vesa, and the related one I posed to you above,
about what else to call these things.

> I don't think that has been done, in the case of dynamic types.

I was thinking of the type systems designed for soft type inferencers;
as well as those cases where e.g. a statically-typed subset of an
untyped language is defined, as in the case of PreScheme.

But in such cases, you end up with a program that, while in some sense
statically typed, is also a valid untyped program. There's
also nothing to stop someone familiar with such things programming in a
type-aware style - in fact, books like Felleisen et al's "How to Design
Programs" encourage that, recommending that functions be annotated with
comments expressing their type. Examples:

;; product : (listof number) -> number

;; copy : N X -> (listof X)

You also see something similar in e.g. many Erlang programs. In these
cases, reasoning about types is done explicitly by the programmer, and
documented.

What would you call the descriptions in those comments? Once you tell
me what I should call them other than "type" (or some qualified variant
such as "latent type"), then we can compare terminology and decide which
is more appropriate.

> It has been done for static types, but much of what you're saying here
> is in contradiction to the definition of a type system in that sense of
> the word.

Which is why I'm using a qualified version of the term.

> I see it as quite reasonable when there's an effort by several
> participants in this thread to either imply or say outright that static
> type systems and dynamic type systems are variations of something
> generally called a "type system", and given that static type systems are
> quite formally defined, that we'd want to see a formal definition for a
> dynamic type system before accepting the proposition that they are of a
> kind with each other.

A complete formal definition of what I'm talking about may be impossible
in principle, because if you could completely formally define it, you'd
have a static type system.

If that makes you throw up your hands, then all you're saying is that
you're unwilling to deal with a very real phenomenon that has obvious
connections to type theory, examples of which I've given above. That's
your choice, but at the same time, you have to give up exclusive claim
to any variation of the word "type".

Terms are used in a context, and it's perfectly reasonable to call
something a "latent type" or even a "dynamic type" in a certain context
and point out connections between those terms and their cousins (or, if
you insist, their completely unrelated namesakes), static types.

> So far, all the attempts I've seen to define a
> dynamic type system seem to reduce to just saying that there is a well-
> defined semantics for the language.

That's a pretty strong claim, considering you have so far ducked the
most important questions I raised in the post you replied to.

> I believe that's unacceptable for several reasons, but the most
> significant of them is this. It's not reasonable to ask anyone to
> accept that static type systems gain their essential "type system-ness"
> from the idea of having well-defined semantics.

The definition of static type system is not in question. However,
realistically, as I pointed out above, you have to acknowledge that type
systems exist in, and are inextricably connected to, a larger, less
formal context. (At least, you have to acknowledge that if you're
interested in programs that do anything related to the real world.) And
outside the formally defined borders of static type systems, there are
static properties that bear a pretty clear relationship to types.

Closing your eyes to this and refusing to acknowledge any connection
doesn't achieve anything. In the absence of some other account of the
phenomena in question, (an informal version of) types turn out to be a
pretty convenient way to deal with the situation.

> From the perspective of
> a statically typed language, this looks like a group of people getting
> together and deciding that the real "essence" of what it means to be a
> type system is...

There's a sense in which one can say that yes, the informal types I'm
referring to have an interesting degree of overlap with static types;
and also that static types do, loosely speaking, have the "purpose" of
formalizing the informal properties I'm talking about.

But I hardly see why such an informal characterization should bother
you. It doesn't affect the definition of static type. It's not being
held up as a foundation for type theory. It's simply a way of dealing
with the realities of programming in a dynamically-checked language.

There are some interesting philosophical issues there, to be sure
(although only if you're willing to stray outside the formal), but you
don't have to worry about those unless you want to.

> and then naming something that's so completely non-
> essential that we don't generally even mention it in lists of the
> benefits of static types, because we have already assumed that it's true
> of all languages except C, C++, and assembly language.

This is based on the assumption that all we're talking about is
"well-defined semantics". However, there's much more to it than that.
I need to hear your characterization of the properties I've described
before I can respond.

Anton
 

Anton van Straaten

Marshall said:
> In the database theory world, we speak of three levels: conceptual,
> logical, physical. In a dbms, these might roughly be compared to
> business entities described in a requirements doc (or in the
> business owners' heads), the (logical) schema encoded in the
> dbms, and the b-tree indices the dbms uses for performance.
>
> So when you say "latent types", I think "conceptual types."

That sounds plausible, but of course at some point we need to pick a
term and attempt to define it. What I'm attempting to do with "latent
types" is to point out and emphasize their relationship to static types,
which do have a very clear, formal definition. Despite some people's
skepticism, that definition gives us a lot of useful stuff that can be
applied to what I'm calling latent types.

> The thing about this is, both the Lisp and the Haskell programmers
> are using conceptual types (as best I can tell.)

Well, a big difference is that the Haskell programmers have language
implementations that are clever enough to tell them, statically, a very
precise type for every term in their program. Lisp programmers usually
don't have that luxury. And in the Haskell case, the conceptual type
and the static type match very closely.

There can be differences, e.g. a programmer might have knowledge about
the input to the program or some programmed constraint that Haskell
isn't capable of inferring, that allows them to figure out a more
precise type than Haskell can. (Whether that type can be expressed in
Haskell's type system is a separate question.)

In that case, you could say that the conceptual type is different than
the inferred static type. But most of the time, the human is reasoning
about pretty much the same types as the static types that Haskell
infers. Things would get a bit confusing otherwise.

> And also, the
> conceptual/latent types are not actually a property of the
> program;

That's not exactly true in the Haskell case (or other languages with
static type inference), assuming you accept the static/conceptual
equivalence I've drawn above.

Although static types are often not explicitly written in the program in
such languages, they are unambiguously defined and automatically
inferrable. They are a static property of the program, even if in many
cases those properties are only implicit with respect to the source
code. You can ask the language to tell you what the type of any given
term is.

> they are a property of the programmer's mental
> model of the program.

That's more accurate. In languages with type inference, the programmer
still has to figure out what the implicit types are (or ask the language
to tell her).

You won't get any argument from me that this figuring out of implicit
types in a Haskell program is quite similar to what a Lisp, Scheme, or
Python programmer does. That's one of the places the connection between
static types and what I call latent types is the strongest.

(However, I think I just heard a sound as though a million type
theorists howled in unison.[*])

[*] most obscure inadvertent pun ever.

> It seems we have languages:
> with or without static analysis
> with or without runtime type information (RTTI or "tags")
> with or without (runtime) safety
> with or without explicit type annotations
> with or without type inference

Wow. And I don't think that's a complete list, either.

> Yup.
>
> I would be happy to abandon "strong/weak" as terminology
> because I can't pin those terms down. (It's not clear what
> they would add anyway.)

I wasn't following the discussion earlier, but I agree that strong/weak
don't have strong and unambiguous definitions.

> Uh, oh, a new term, "manifest." Should I worry about that?

Well, people have used the term "manifest type" to refer to a type
that's explicitly apparent in the source, but I wouldn't worry about it.
I just used the term to imply that at some point, the idea of "latent
type" has to be converted to something less latent. Once you explicitly
identify a type, it's no longer latent to the entity doing the identifying.

> Well, darn. It strikes me that that's just a decision the language
> designers made, *not* to record complete RTTI.

No, there's more to it. There's no way for a dynamically-typed language
to figure out that something like the timestwo function I gave has the
type "number -> number" without doing type inference, by examining the
source of the function, at which point it pretty much crosses the line
into being statically typed.
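
A quick illustrative sketch (in Python; my example, not from the
original post) of why the runtime tags alone can't yield
"number -> number":

def timestwo(x):
    return x * 2

print(type(timestwo))  # <class 'function'> -- the tag says only "function";
                       # the arrow type is recoverable solely by inspecting
                       # the body, i.e. by doing static type inference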

> (Is it going to be claimed that
> there is an *advantage* to having only incomplete RTTI? It is a
> serious question.)

More than an advantage, it's difficult to do it any other way. Tags are
associated with values. Types in the type theory sense are associated
with terms in a program. All sorts of values can flow through a given
term, which means that types can get complicated (even in a nice clean
statically typed language). The only way to reason about types in that
sense is statically - associating tags with values at runtime doesn't
get you there.

This is the sense in which the static type folk object to the term
"dynamic type" - because the tags/RTTI are not "types" in the type
theory sense.

Latent types as I'm describing them are intended to more closely
correspond to static types, in terms of the way in which they apply to
terms in a program.

> Hmmm. Another place where the static type isn't the same thing as
> the runtime type occurs in languages with subtyping.

Yes.

> Question: if a language *does* record complete RTTI, and the
> language does *not* have subtyping, could we then say that
> the runtime type information *is* the same as the static type?

You'd still have a problem with function types, as I mentioned above,
as well as expressions like the conditional one I gave:

(flag ? 5 : "foo")

The problem is that as the program is running, all it can ever normally
do is tag a value with the tags obtained from the path the program
followed on that run. So RTTI isn't, by itself, going to determine
anything similar to a static type for such a term, such as "string |
number".

One way to think about "runtime" types is as being equivalent to certain
kinds of leaves on a statically-typed program syntax tree. In an
expression like "x = 3", an inferring compiler might determine that 3 is
of type "integer". The tagged values which RTTI uses operate at this
level: the level of individual values.

However, as soon as you start dealing with other kinds of nodes in the
syntax tree -- nodes that don't represent literal values, or compound
nodes (that have children) -- the possibility arises that the type of
the overall expression will be more complex than that of a single value.
At that point, RTTI can't do what static types do.

Even in the simple "x = 3" case, a hypothetical inferring compiler might
notice that elsewhere in the same function, x is treated as a floating
point number, perhaps via an expression like "x = x / 2.0". According
to the rules of its type system, our language might determine that x has
type "float", as opposed to "number" or "integer". It might then either
treat the original "3" as a float, or supply a conversion when assigning
it to "x". (Of course, some languages might just give an error and
force you to be more explicit, but bear with me for the example - it's
after 3:30am again.)

Compare this to the dynamically-typed language: it sees "3" and,
depending on the language, might decide it's a "number" or perhaps an
"integer", and tag it as such. Either way, x ends up referring to that
tagged value. So what type is x? All you can really say, in the RTTI
case, is that x is a number or an integer, depending on the tag the
value has. There's no way it can figure out that x should be a float at
that point.

Of course, further down when it runs into the code "x = x / 2.0", x
might end up holding a value that's tagged as a float. But that only
tells you the value of x at some point in time, it doesn't help you with
the static type of x, i.e. a type that either encompasses all possible
values x could have during its lifetime, or alternatively determines how
values assigned to x should be treated (e.g. cast to float).
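
A small sketch (Python again, my own) of the difference between
observing per-value tags and assigning one static type for x's lifetime:

x = 3
print(type(x))   # <class 'int'>   -- the tag of the value x holds right now

x = x / 2.0
print(type(x))   # <class 'float'> -- the tag changed along with the value
# Each observation describes one value at one moment; a static inferencer
# would instead assign x a single type (e.g. float) covering both uses,
# or reject the program.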

BTW, it's worth noting at this point, since it's implied by the last
paragraph, that a static type is only an approximation to the values
that a term can have during the execution of a program. Static types
can (often!) be too conservative, describing possible values that a
particular term couldn't actually ever have. This gives another hint as
to why RTTI can't be equivalent to static types. It's only ever dealing
with the concrete values right in front of it, it can't see the bigger
static picture.

Anton
 

Rob Thorpe

David said:
> You seem to be very keen to attribute motives to people that are not
> apparent from what they have said.

The term "dynamically typed" is well used and understood. The term
untyped is generally associated with languages that as you put it "have
no memory safety", it is a pejorative term. "Latently typed" is not
well used unfortunately, but more descriptive.

Most of the arguments above describe a static type system, then follow
by saying that this is what "type system" should mean, finishing by
saying everything else should be considered untyped. This seems to me
to be an effort to associate dynamically typed languages with this
pejorative term.

> A common term for languages which have defined behaviour at run-time is
> "memory safe". For example, "Smalltalk is untyped and memory safe."
> That's not too objectionable, is it?

Memory safety isn't the whole point, it's only half of it. Typing
itself is the point. Regardless of memory safety, if you do a
calculation in a latently typed language, you can find the type of the
resulting object.
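
For example (a Python illustration of the same point):

print(type(2 + 3))    # <class 'int'>   -- the result carries its type with it
print(type(2.0 + 3))  # <class 'float'>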
 

Andreas Rossberg

Marshall said:
> What we generally (in programming) call variables are locals
> and globals. If the language supports an update operation
> on those variables, then calling them variables makes sense.
> But "variable" has become such a catch-all term that we call
>
>     public static final int FOO = 7;
>
> a variable, even though it can never, ever vary.
>
> That doesn't make any sense.

It does, because it is only a degenerate case. In general, you can have
something like

void f(int x)
{
    const int foo = x+1;
    //...
}

Now, foo is still immutable, and it is a local, but across calls it clearly
also varies.

- Andreas
 

Vesa Karvonen

In comp.lang.functional Anton van Straaten said:
> I reject this comparison. There's much more to it than that. The point
> is that the reasoning which programmers perform when working with a
> program in a latently-typed language bears many close similarities to
> the purpose and behavior of type systems.
> This isn't an attempt to jump on any bandwagons, it's an attempt to
> characterize what is actually happening in real programs and with real
> programmers. I'm relating that activity to type systems because that is
> what it most closely relates to.
> [...]

I think that we're finally getting to the bottom of things. While reading
your responses something became very clear to me: latent-typing and latent-
types are not a property of languages. Latent-typing, also known as
informal reasoning, is something that all programmers do as a normal part
of programming. To say that a language is latently-typed is to make a
category mistake, because latent-typing is not a property of languages.

A programmer, working in any language, whether typed or not, performs
informal reasoning. I think it is fair to say that there is a
correspondence between type theory and such informal reasoning. The
correspondence is like the correspondence between informal and formal
math. *But*, informal reasoning (latent-typing) is not a property of
languages.

An example of a form of informal reasoning that (practically) every
programmer does daily is termination analysis. There are type systems
that guarantee termination, but I think it is fair to say that it is not
yet understood how to make a practical general purpose language whose
type system would guarantee termination (or at least I'm not aware of such
a language). It should also be clear that termination analysis need not
be done informally. Given a program, it may be possible to formally prove
that it terminates.

I'm now more convinced than ever that "(latently|dynamically)-typed
language" is an oxymoron. The terminology really needs to be fixed.

-Vesa Karvonen
 

rossberg

Anton said:
> Languages with latent type systems typically don't include type
> declarations in the source code of programs. The "static" type scheme
> of a given program in such a language is thus latent, in the English
> dictionary sense of the word, of something that is present but
> undeveloped. Terms in the program may be considered as having static
> types, and it is possible to infer those types, but it isn't necessarily
> easy to do so automatically, and there are usually many possible static
> type schemes that can be assigned to a given program.
>
> Programmers infer and reason about these latent types while they're
> writing or reading programs. Latent types become manifest when a
> programmer reasons about them, or documents them e.g. in comments.

I very much agree with the observation that every programmer performs
"latent typing" in his head (although Pascal Costanza seems to have
the opposite opinion).

But I also think that "latently typed language" is not a meaningful
characterisation. And for the very same reason! Since any programming
activity involves latent typing - naturally, even in assembler! - it
cannot be attributed to any language in particular, and is hence
useless to distinguish between them. (Even untyped lambda calculus
would not be a counter-example. If you really were to program in it,
you certainly would think along lines like "this function takes two
Church numerals and produces a third one".)
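
To make that concrete, here's a rough sketch (in Python, purely for
illustration; the names and type comments are mine) of how those latent
arrow types live only in the programmer's head and comments:

# Church numerals in an "untyped" setting; the language never sees these types.
# one, two : (a -> a) -> (a -> a)
one = lambda f: lambda x: f(x)
two = lambda f: lambda x: f(f(x))

# plus : church -> church -> church
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

print(plus(two)(one)(lambda n: n + 1)(0))  # prints 3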

I hear you when you define latently typed languages as those that
support the programmer's latently typed thinking by providing dynamic
tag checks. But in the very same post (and others) you also explain at
length why these tags are far from being actual types. This seems a bit
contradictory to me.

As Chris Smith points out, these dynamic checks are basically a
necessity for a well-defined operational semantics. You need them
whenever you have different syntactic classes of values, but lack a
type system to preclude interference. They are just an encoding for
differentiating these syntactic classes. Their connection to types is
rather coincidental.

- Andreas
 

Andreas Rossberg

David said:
> Set theory has no difficulty with this. It's common, for example, to see
> "the set of strings representing propositions" used in treatments of
> formal systems.

Oh, I was not saying that this particular aspect cannot be described in
set theory (that was a different argument, about different issues). Just
that you cannot naively equate types with a set of underlying values,
which is what is usually meant by the types-are-sets metaphor - to
capture something like type abstraction you need to do more. (Even then
it might be arguable if it really describes the same thing.)

- Andreas
 

Chris Uppal

Andreas said:
> Chris said:
> > Andreas said:
> > > Chris said:
> > > > It's worth noting, too, that (in some sense) the type of an object
> > > > can change over time[*].
> > >
> > > No. Since a type expresses invariants, this is precisely what may
> > > *not* happen. If certain properties of an object may change then the
> > > type of the object has to reflect that possibility. Otherwise you
> > > cannot legitimately call it a type.
> >
> > Well, it seems to me that you are /assuming/ a notion of what kinds of
> > logic can be called type (theories), and I don't share your
> > assumptions. No offence intended.
>
> OK, but can you point me to any literature on type theory that makes a
> different assumption?

'Fraid not. (I'm not a type theorist -- for all I know there may be lots, but
my suspicion is that they are rare at best.)

But perhaps I shouldn't have used the word theory at all. What I mean is that
there is one or more logic of type (informal or not -- probably informal) with
respect to which the object in question has changed its categorisation. If no
existing type /theory/ (as devised by type theorists) can handle that case,
then that is a deficiency in the set of existing theories -- we need newer and
better ones.

But, as a sort of half-way, semi-formal, example: consider the type environment
in a Java runtime. The JVM does formal type-checking of classfiles as it loads
them. In most ways that checking is static -- it's treating the bytecode as
program text and doing a static analysis on it before allowing it to run (and
rejecting what it can't prove to be acceptable by its criteria). However, it
isn't /entirely/ static because the collection of classes varies at runtime in
a (potentially) highly dynamic way. So it can't really examine the "whole"
text of the program -- indeed there is no such thing. So it ends up with a
hybrid static/dynamic type system -- it records any assumptions it had to make
in order to find a proof of the acceptability of the new code, and if (sometime
in the future) another class is proposed which violates those assumptions, then
that second class is rejected.

> Neither do I. But what is wrong with a mutable reference-to-union type,
> as I suggested? It expresses this perfectly well.

Maybe I misunderstood what you meant by union type. I took it to mean that the
type analysis didn't "know" which of the two types was applicable, and so would
reject both (or maybe accept both?). E.g. if at instant A some object, obj,
was in a state where it responds to #aMessage, but not #anotherMessage; and
at instant B it is in a state where it responds to #anotherMessage but not
#aMessage. In my (internal and informal) type logic, I make the following
judgements:

In code which will be executed at instant A
obj aMessage. "type correct"
obj anotherMessage. "type incorrect"

In code which will be executed at instant B
obj aMessage. "type incorrect"
obj anotherMessage. "type correct"

I don't see how a logic with no temporal element can arrive at all four of
those judgements, whatever it means by a union type.

-- chris
 

Patricia Shanahan

Vesa Karvonen wrote:
> [...]
> An example of a form of informal reasoning that (practically) every
> programmer does daily is termination analysis. There are type systems
> that guarantee termination, but I think it is fair to say that it is not
> yet understood how to make a practical general purpose language whose
> type system would guarantee termination (or at least I'm not aware of such
> a language). It should also be clear that termination analysis need not
> be done informally. Given a program, it may be possible to formally prove
> that it terminates.

To make the halting problem decidable one would have to do one of two
things: Depend on memory size limits, or have a language that really is
less expressive, at a very deep level, than any of the languages
mentioned in the newsgroups header for this message.

A language for which the halting problem is decidable must also be a
language in which it is impossible to simulate an arbitrary Turing
machine (TM). Otherwise, one could decide the notoriously undecidable TM
halting problem by generating a language X program that simulates the TM
and deciding whether the language X program halts.

One way out might be to depend on the boundedness of physical memory. A
language with a fixed maximum memory size cannot simulate an arbitrary
TM. However, the number of states for a program is so great that a
method that depends on its finiteness, but would not work for an
infinite memory model, is unlikely to be practical.
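
In code form, the reduction looks roughly like this (a Python sketch;
both helpers are hypothetical placeholders, which is exactly the point):

def compile_tm_to_x(tm, tape):
    # Emit a language-X program simulating `tm` on `tape`; possible
    # whenever X can simulate an arbitrary Turing machine.
    raise NotImplementedError

def decide_halts(x_program):
    # The supposed halting decider for language X.
    raise NotImplementedError

def tm_halts(tm, tape):
    # If decide_halts existed, this function would decide the
    # undecidable TM halting problem -- a contradiction.
    return decide_halts(compile_tm_to_x(tm, tape))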

Patricia
 

Pascal Costanza

Matthias said:
> Yes, the phrase "runtime type error" is actually a misnomer. What one
> usually means by that is a situation where the operational semantics
> is "stuck", i.e., where the program, while not yet arrived at what's
> considered a "result", cannot make any progress because the current
> configuration does not match any of the rules of the dynamic
> semantics.
>
> The reason why we call this a "type error" is that such situations are
> precisely the ones we want to statically rule out using sound static
> type systems. So it is a "type error" in the sense that the static
> semantics was not strong enough to rule it out.


> I disagree with this. If the program keeps running in a defined way,
> then it is not what I would call a type error. It definitely is not
> an error in the sense I described above.

If your view of a running program is that it is a "closed" system, then
you're right. However, maybe there are several layers involved, so what
appears to be a well-defined behavior from the outside may still be
regarded as a type error internally.

A very obvious example of this is when you run a program in a debugger.
There are two levels involved here: the program signals a type error,
but that doesn't mean that the system as a whole is stuck. Instead, the
debugger takes over and offers ways to deal with the type error. The
very same program run without debugging support would indeed simply be
stuck in the same situation.

So to rephrase: It depends on whether you use the "message not
understood" machinery as a way to provide well-defined behavior for the
base level, or rather as a means to deal with an otherwise unanticipated
situation. In the former case it extends the language to remove certain
type errors, in the latter case it provides a kind of debugging facility
(and it indeed may be a good idea to deploy programs with debugging
facilities, and not only use debugging tools at development time).

This is actually related to the notion of reflection, as coined by Brian
C. Smith. In a reflective architecture, you distinguish between various
interpreters, each of which interprets the program at the next level. A
debugger is a program that runs at a different level than a base program
that it debugs. However, the reflective system as a whole is "just" a
single program seen from the outside (with one interpreter that runs the
whole reflective tower). This distinction between the internal and the
external view of a reflective system was already made by Brian Smith.


Pascal
 

Piet van Oostrum

M> While we're on the topic of terminology, here's a pet peeve of
M> mine: "immutable variable."
M> immutable = can't change
M> vary-able = can change
M> Clearly a contradiction in terms.

I would say that immutable = 'can't *be* changed' rather than 'can't
change'. But I am not a native English speaker.

Compare with this: the distance of the Earth to Mars is variable (it
varies), but it is also immutable (we can't change it).
 

Pascal Costanza

Chris said:
> Hmm. I'm afraid I'm going to be picky here. I think you need to
> clarify what is meant by "appropriate".

No, I cannot be a lot clearer here. What operations are appropriate for
what values largely depends on the intentions of a programmer. Adding a
number to a string is inappropriate, no matter how a program behaves
when this actually occurs (whether it continues to execute the operation
blindly, throws a continuable exception, or just gets stuck).
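
For example (in Python, my illustration), the same inappropriate
operation surfaces as a catchable runtime error:

try:
    1 + "a"
except TypeError as e:
    print(e)  # unsupported operand type(s) for +: 'int' and 'str'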

> If you mean "the operation will
> not complete successfully" as I suspect you do, then we're closer...

No, we're not. You're giving a purely technical definition here, that
may or may not relate to the programmer's (or "designer's")
understanding of the domain.


Pascal
 

Pascal Costanza

Marshall said:
> I am sceptical of the idea that when programming in a dynamically
> typed language one doesn't have to think about both models as well.
> I don't have a good model of the mental process of working
> in a dynamically typed language, but how could that be the case?
> (I'm not asking rhetorically.) Do you then run your program over
> and over, mechanically correcting the code each time you discover
> a type error? In other words, if you're not thinking of the type model,
> are you using the runtime behavior of the program as an assistant,
> the way I use the static analysis of the program as an assistant?
Yes.

> I don't accept the idea about pairing the appropriateness of each
> system according to whether one is doing exploratory programming.
> I do exploratory programming all the time, and I use the static type
> system as an aide in doing so. Rather I think this is just another
> manifestation of the differences in the mental processes between
> static typed programmers and dynamic type programmers, which
> we are beginning to glimpse but which is still mostly unknown.
Probably.

> Oh, and I also want to say that of all the cross-posted mega threads
> on static vs. dynamic typing, this is the best one ever. Most info;
> least flames. Yay us!

Yay! :)


Pascal
 

Pascal Costanza

Marshall said:
> I still don't really see the difference.
>
> I would not expect that the dynamic programmer will be
> thinking that this code will have two numbers most of the
> time but sometimes not, and fail. I would expect that in both
> static and dynamic, the thought is that that code is adding
> two numbers, with the difference being the static context
> gives one a proof that this is so.

There is a third option: it may be that at the point where I am writing
this code, I simply don't care yet whether a and b will always be
numbers. In case something other than numbers pops up, I can then
decide how to proceed from there.

> In this simple example,
> the static case is better, but this is not free, and the cost
> of the static case is evident elsewhere, but maybe not
> illuminated by this example.

Yes, maybe the example is not the best one. This kind of example,
however, occurs quite often when programming in an object-oriented
style, where you don't know yet what objects will and will not respond
to a message / generic function. Even in the example above, it could be
that you can give an appropriate definition for + for objects other than
numbers.
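
For instance, a sketch (mine, in Python) of how "+" can later be given a
meaning for non-numbers without touching the original code:

class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # a later decision: make vectors respond to +
        return Vector(self.x + other.x, self.y + other.y)

def sum_pair(a, b):
    return a + b  # written with numbers in mind, but not committed to them

print(sum_pair(1, 2))                          # 3
print(sum_pair(Vector(1, 0), Vector(0, 2)).y)  # 2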


Pascal
 

Anton van Straaten

Vesa said:
> I think that we're finally getting to the bottom of things. While reading
> your responses something became very clear to me: latent-typing and latent-
> types are not a property of languages. Latent-typing, also known as
> informal reasoning, is something that all programmers do as a normal part
> of programming. To say that a language is latently-typed is to make a
> category mistake, because latent-typing is not a property of languages.
>
> A programmer, working in any language, whether typed or not, performs
> informal reasoning. I think it is fair to say that there is a
> correspondence between type theory and such informal reasoning. The
> correspondence is like the correspondence between informal and formal
> math. *But*, informal reasoning (latent-typing) is not a property of
> languages.

Well, it's obviously the case that latent types as I've described them
are not part of the usual semantics of dynamically typed languages. In
other messages, I've mentioned types like "number -> number" which have
no meaning in a dynamically typed language. You can only write them in
comments (unless you implement some kind of type handling system), and
language implementations aren't aware of such types.

OTOH, a programmer reasoning explicitly about such types, writing them
in comments, and perhaps using assertions to check them has, in a sense,
defined a language. Having done that, and reasoned about the types
in his program, he manually erases them, "leaving" code written in the
original dynamically-typed language. You can think of it as though it
were generated code, complete with comments describing types, injected
during the erasure process.
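
As a rough sketch of what that looks like in practice (Python for
concreteness; the comment notation follows the HtDP style quoted
earlier):

from numbers import Number

# double : number -> number   -- the latent type, recorded only here
def double(x):
    # a tag check used to enforce the latent type's assumption
    assert isinstance(x, Number), "latent type violated: expected a number"
    return x * 2

print(double(21))  # 42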

So, to address your category error objection, I would say that latent
typing is a property of latently-typed languages, which are typically
informally-defined supersets of what we know as dynamically-typed languages.

I bet that doesn't make you happy, though. :D

Still, if that sounds a bit far-fetched, let me start again at ground
level, with a Haskell vs. Scheme example:

let double x = x * 2

vs.:

(define (double x) (* x 2))

Programmers in both languages do informal reasoning to figure out the
type of 'double'. I'm assuming that the average Haskell programmer
doesn't write out a proof whenever he wants to know the type of a term,
and doesn't have a compiler handy.

But the Haskell programmer's informal reasoning takes place in the
context of a well-defined formal type system. He knows what the "type
of double" means: the language defines that for him.

The type-aware Scheme programmer doesn't have that luxury: before he can
talk about types, he has to invent a type system, something to give
meaning to an expression such as "number -> number". Performing that
invention gives him types -- albeit informal types, a.k.a. latent types.

In the Haskell case, the types are a property of the language. If
you're willing to acknowledge the existence of something like latent
types, what are they a property of? Just the amorphous informal cloud
which surrounds dynamically-typed languages? Is that a satisfactory
explanation of these two quite similar examples?

I want to mention two other senses in which latent types become
connected to real languages. That doesn't make them properties of the
formal semantics of the language, but the connection is a real one at a
different level.

The first is that in a language without a rich formal type system,
informal reasoning outside of the formal type system becomes much more
important. Conversely, in Haskell, even if you accept the existence of
latent types, they're close enough to static types that it's hardly
necessary to consider them. This is why latent types are primarily
associated with languages without rich formal type systems.

The second connection is via tags: these are part of the definition of a
dynamically-typed language, and if the programmer is reasoning
explicitly about latent types, tags are a useful tool to help ensure
that assumptions about types aren't violated. So this is a connection
between a feature in the semantics of the language, and these
extra-linguistic latent types.

> An example of a form of informal reasoning that (practically) every
> programmer does daily is termination analysis. There are type systems
> that guarantee termination, but I think it is fair to say that it is not
> yet understood how to make a practical general purpose language whose
> type system would guarantee termination (or at least I'm not aware of such
> a language). It should also be clear that termination analysis need not
> be done informally. Given a program, it may be possible to formally prove
> that it terminates.

Right. And this is partly why talking about latent types, as opposed to
the more general "informal reasoning", makes sense: because latent types
are addressing the same kinds of things that static types can capture.
Type-like things.

> I'm now more convinced than ever that "(latently|dynamically)-typed
> language" is an oxymoron. The terminology really needs to be fixed.

I agree that fixing is needed. The challenge is to do it in a way that
accounts for, rather than simply ignores, the many informal correlations
to formal type concepts that exist in dynamically-typed languages.
Otherwise, the situation won't improve.

Anton
 

Pascal Costanza

Marshall said:
> I want to make sure I understand. I can think of several things
> you might mean by this. It could be:
> 1) I want to run my program, even though I know parts of it
> are broken, because I think there are parts that are not broken
> and I want to try them out.
> 2) I want to run my program, even though it is broken, and I
> want to run right up to a broken part and trap there, so I can
> use the runtime facilities of the language to inspect what's
> going on.
>
> This statement is interesting, because the conventional wisdom (at
> least as I'm used to hearing it) is that it is best to catch bugs
> at the *first* possible moment. But I think maybe we're talking
> about different continua here. The last last last possible moment
> is after the software has shipped to the customer, and I'm pretty
> sure that's not what you mean. I think maybe you mean something
> more like 2) above.

Nowadays, we have more options wrt what it means to "ship" code. It
could be that your program simply runs as a (web) service to which you
have access even after the customer has started to use the program. See
http://www.paulgraham.com/road.html for a good essay on this idea.


Pascal
 

Chris Uppal

Eliot Miranda wrote:

> [me:]
> > Taking Smalltalk /specifically/, there is a definite sense in which it
> > is typeless -- or trivially typed -- in that, in that language, there
> > are no[*] operations which are forbidden[**],
>
> Come on, Chris U. One has to distinguish an attempt to invoke an
> operation from its being carried out. There is nothing in Smalltalk to
> stop one attempting to invoke any "operation" on any object. But one
> can only actually carry out operations on objects that implement them.
> So, apart from the argument about inadvertent operation name overloading
> (which is important, but avoidable), Smalltalk is in fact
> strongly-typed, but not statically strongly-typed.

What are you doing /here/, Eliot? This is Javaland! Smalltalk is thatta
way ->

;-)


But this discussion has been all about /whether/ it is ok to apply the notion
of (strong) typing to what runtime-checked languages do. We all agree that
the checks happen, but the question is whether it is
reasonable/helpful/legitimate to extend the language of static checking to the
dynamic case. (I'm on the side which says yes, but good points have been made
against it).

The paragraph you quoted doesn't represent most of what I have been saying --
it was just a side-note looking at one small aspect of the issue from a
different angle.

-- chris
 

Chris Uppal

Anton said:
> In that case, you could say that the conceptual type is different than
> the inferred static type. But most of the time, the human is reasoning
> about pretty much the same types as the static types that Haskell
> infers. Things would get a bit confusing otherwise.

Or any mechanised or formalised type system, for any language. If a system
doesn't match pretty closely with at least part of the latent type systems (in
your sense) used by the programmers, then that type system is useless.

(I gather that it took, or maybe is still taking, theorists a while to get to
grips with the informal type logics which were obvious to working OO
programmers.)

-- chris
 

Chris Uppal

David said:
> A common term for languages which have defined behaviour at run-time is
> "memory safe". For example, "Smalltalk is untyped and memory safe."
> That's not too objectionable, is it?

I find it too weak, as if to say: "well, ok, it can't actually corrupt memory
as such, but the program logic is still apt to go all over the shop"...

-- chris
 
