What is Expressiveness in a Computer Language


Eliot Miranda

Chris said:
Taking Smalltalk /specifically/, there is a definite sense in which it is
typeless -- or trivially typed -- in that, in that language, there are no[*]
operations which are forbidden[**],

Come on, Chris U. One has to distinguish an attempt to invoke an
operation from its being carried out. There is nothing in Smalltalk to
stop one attempting to invoke any "operation" on any object. But one
can only actually carry out operations on objects that implement them.
So, apart from the argument about inadvertent operation name overloading
(which is important, but avoidable), Smalltalk is in fact
strongly typed, but not statically strongly typed.
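Eliot's distinction -- any send may be *attempted*, but only implemented operations are carried out -- can be sketched in Python, used here as a hypothetical stand-in for Smalltalk's dynamic dispatch (the Account class is invented purely for illustration):

```python
class Account:
    """A toy object that implements some operations and not others."""
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

acct = Account(100)
acct.deposit(50)       # the object implements this operation: carried out

try:
    acct.fly()         # nothing stops us *attempting* the send...
except AttributeError:
    # ...but the runtime refuses to carry it out: no undefined behaviour.
    print("operation not understood")
```

The failure is detected and signalled at run time rather than compile time -- strongly typed, but not statically so.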
 

Claudio Grondi

Xah said:
in March, i posted an essay “What is Expressiveness in a Computer
Language”, archived at:
http://xahlee.org/perl-python/what_is_expresiveness.html

I was informed then that there is an academic paper written on this
subject.

On the Expressive Power of Programming Languages, by Matthias
Felleisen, 1990.
http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf

Has anyone read this paper? And, would anyone be interested in giving a
summary?

thanks.

Xah
(e-mail address removed)
∑ http://xahlee.org/
Looking at this thread growing, it appears to me that at least one thing
becomes evident here:

Xah's unwillingness to learn from bad past experience contaminates
others (who are still posting to his trolling threads).

Here is another try to rescue those who are still innocent enough not to
know what I am speaking about:

Citation from http://www.xahlee.org/Netiquette_dir/_/art_trolling.html :
"""
What I want this document to focus on is how to create entertaining
trolls. I have drawn on the expertise of the writers of some of
Usenet's finest and best remembered trolls. Trolls are for fun. The
object of recreational trolling is to sit back and laugh at all those
gullible idiots that will believe *anything*.
[...]
Section 5 Know Your Audience
Remember that you have two audiences. The people who are going to get
the maximum enjoyment out of your post are other trollers. You need to
keep in contact with them through both your troll itself and the way you
direct its effect. It is trollers that you are trying to entertain so be
creative - trollers don't just want a laugh from you they want to see
good trolls so that they can also learn how to improve their own in the
never ending search for the perfect troll.
[...]
Section 6 Following-Up
Try not to follow-up to your own troll. The troll itself quickly becomes
forgotten in the chaos and if you just sit back you can avoid being
blamed for causing it. Remember, if you do follow up you are talking to
an idiot. Treat them with the ill-respect they deserve.
"""

Claudio Grondi (a past 'gullible idiot' who learned to enjoy the fun of
being the audience)
 

Chris Smith

Claudio Grondi said:
Looking at this thread growing, it appears to me that at least one thing
becomes evident here:

Xah's unwillingness to learn from bad past experience contaminates
others (who are still posting to his trolling threads).

Here is another try to rescue those who are still innocent enough not to
know what I am speaking about:

I am enjoying the discussion. I think several other people are, too.
(At least, I hope so!) It matters not one whit to me who started the
thread, or the merits of the originating post. Why does it matter to
you?
 

Timo Stamm

Claudio said:
Looking at this thread growing, it appears to me that at least one thing
becomes evident here:

Xah's unwillingness to learn from bad past experience contaminates
others (who are still posting to his trolling threads).

This is actually one of the most interesting threads I have read in a
long time. If you ignore the evangelism, there is a lot of high-quality
information and first-hand experience you couldn't find in a dozen books.
 

Darren New

Eliot said:
can only actually carry out operations on objects that implement them.

Except that every operation is implemented by every object in Smalltalk.
Unless you specify otherwise, the implementation of every method is to
call the receiver with doesNotUnderstand. (I don't recall whether the
class of nil has a special rule for this or whether it implements
doesNotUnderstand and invokes the appropriate "don't send messages to
nil" method.)

There are a number of Smalltalk extensions, such as
multiple-inheritance, that rely on implementing doesNotUnderstand.
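Python offers a close analogue of Smalltalk's doesNotUnderstand hook in `__getattr__`, which is invoked only when normal attribute lookup fails. A minimal delegation proxy -- the same kind of trick the Smalltalk extensions Darren mentions are built on -- might look like this (the Proxy class is an invented example, not from the thread):

```python
class Proxy:
    """Forwards any message it doesn't itself understand to a wrapped
    target object -- analogous to overriding doesNotUnderstand:."""
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Reached only when ordinary lookup fails on the Proxy itself,
        # i.e. the Python counterpart of a doesNotUnderstand: send.
        print(f"forwarding {name!r}")
        return getattr(self._target, name)

p = Proxy([1, 2, 3])
p.append(4)        # Proxy has no 'append'; the send is forwarded
print(p._target)   # [1, 2, 3, 4]
```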
 

David Hopwood

Rob said:
It's easy to create a reasonable framework. My earlier posts show simple
ways of looking at it that could be further refined; I'm sure there are
others who have already done this.

The real objection to this was that latently/dynamically typed
languages have a place in it.

You seem very keen to attribute motives to people that are not apparent
from what they have said.
But some of the advocates of statically
typed languages wish to lump these languages together with assembly
language as "untyped" in an attempt to label them as unsafe.

A common term for languages which have defined behaviour at run-time is
"memory safe". For example, "Smalltalk is untyped and memory safe."
That's not too objectionable, is it?

(It is actually more common for statically typed languages to fail to be
memory safe; consider C and C++, for example.)
 

Benjamin Franksen

George said:
The point is really that the checks that prevent these things

[such as 'array index out of bounds']
must be
performed at runtime and can't be prevented by any practical type
analysis performed at compile time. I'm not a type theorist but my
opinion is that a static type system that could, a priori, prevent the
problem is impossible.

It is not impossible. Dependent type systems can -- in principle -- do all
such things. Take a look at Epigram (http://www.e-pig.org/).

Systems based on dependent types such as Epigram are not yet mature enough
to be used for serious programming. However, they clearly show that these
things /are/ possible.

Cheers
Ben
 

Anton van Straaten

Marshall said:
Can you be more explicit about what "latent types" means?
I'm sorry to say it's not at all natural or intuitive to me.
Are you referring to the types in the programmers head,
or the ones at runtime, or what?

Sorry, that was a huge omission. (What I get for posting at 3:30am.)

The short answer is that I'm most directly referring to "the types in
the programmer's head". A more complete informal summary is as follows:

Languages with latent type systems typically don't include type
declarations in the source code of programs. The "static" type scheme
of a given program in such a language is thus latent, in the English
dictionary sense of the word, of something that is present but
undeveloped. Terms in the program may be considered as having static
types, and it is possible to infer those types, but it isn't necessarily
easy to do so automatically, and there are usually many possible static
type schemes that can be assigned to a given program.

Programmers infer and reason about these latent types while they're
writing or reading programs. Latent types become manifest when a
programmer reasons about them, or documents them e.g. in comments.

(As has already been noted, this definition may seem at odds with the
definition given in the Scheme report, R5RS, but I'll address that in a
separate post.)

There's a close connection between latent types in the sense I've
described, and the "tagged values" present at runtime. However, as type
theorists will tell you, the tags used to tag values at runtime, as e.g.
a number or a string or a FooBar object, are not the same thing as the
sort of types which statically-typed languages have.

A simple example of the distinction can be seen in the type of a
function. Using Javascript as a lingua franca:

function timestwo(x) { return x * 2 }

In a statically-typed language, the type of a function like this might
be something like "number -> number", which tells us three things: that
timestwo is a function; that it accepts a number argument; and that it
returns a number result.

But if we ask Javascript what it thinks the type of timestwo is, by
evaluating "typeof timestwo", it returns "function". That's because the
value bound to timestwo has a tag associated with it which says, in
effect, "this value is a function".

But "function" is not a useful type. Why not? Because if all you know
is that timestwo is a function, then you have no idea what an expression
like "timestwo(foo)" means. You couldn't write working programs, or
read them, if all you knew about functions was that they were functions.
As a type, "function" is incomplete.

By my definition, though, the latent type of timestwo is "number ->
number". Any programmer looking at the function can figure out that
this is its type, and programmers do exactly that when reasoning about a
program.

(Aside: technically, you can pass timestwo something other than a
number, but then you get NaN back, which is usually not much use except
to generate errors. I'll ignore that here; latent typing requires being
less rigorous about some of these issues.)

So, where do tagged values fit into this? Tags help to check types at
runtime, but that doesn't mean that there's a 1:1 correspondence between
tags and types. For example, when an expression such as "timestwo(5) *
3" is evaluated, three checks occur that are relevant to the type of
timestwo:

1. Before the function call takes place, a check ensures that timestwo
is a function.

2. Before the multiplication in "x * 2", a check ensures that x is a number.

3. When timestwo returns, before the subsequent multiplication by 3, a
check ensures that the return type of timestwo is a number.

These three checks correspond to the three pieces of information
contained in the function type signature "number -> number".

However, these dynamic checks still don't actually tell us the type of a
function. All they do is check that in a particular case, the values
involved are compatible with the type of the function. In many cases,
the checks may infer a signature that's either more or less specific
than the function's type, or they may infer an incomplete signature --
e.g., the return type doesn't need to be checked when evaluating "arr
= timestwo(5)".
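The three checks listed above can be made visible in Python (a hypothetical stand-in for the Javascript example); each is a runtime tag check, not a consultation of any "number -> number" type:

```python
def timestwo(x):
    return x * 2

# 1. Before the call: is timestwo tagged as something callable?
assert callable(timestwo)

# 2. Inside the body, "x * 2" proceeds only if x's tag supports it:
try:
    timestwo(object())          # this value's tag has no multiplication
except TypeError:
    print("check 2 failed, as expected")

# 3. The return value's tag is checked by the *next* operation:
result = timestwo(5) * 3        # multiplying by 3 checks the returned tag
print(result)
```

Note that check 3 simply never happens in a bare assignment such as `arr = timestwo(5)` -- the checks are driven by use, not by the function's type.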

I used a function just as an example. There are many other cases where
a value's tag doesn't match the static (or latent) type of the terms
through which it flows. A simple example is an expression such as:

(flag ? 5 : "foo")

Here, the latent type of this expression could be described as "number |
string". There won't be a runtime tag anywhere which represents that
type, though, since the language implementation never deals with the
actual type of expressions, except in those degenerate cases where the
type is so simple that it happens to be a 1:1 match to the corresponding
tag. It's only the programmer that "knows" that this expression has
that type.
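A Python rendering of the same point: the latent type of the expression is "number | string", but at run time each evaluation carries only a single value tag -- there is never a tag representing the union itself (the `pick` function is invented for illustration):

```python
def pick(flag):
    # Latent type of the result: number | string.
    # No runtime tag anywhere represents that union.
    return 5 if flag else "foo"

print(type(pick(True)).__name__)    # tag of one branch: int
print(type(pick(False)).__name__)   # tag of the other branch: str
# The union type exists only in the programmer's head (or a comment).
```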

Anton
 

David Hopwood

Pascal said:
Moreover, a good proportion of the program and a good number of
algorithms don't even need to know the type of the objects they
manipulate.

For example, sort doesn't need to know what type the objects it sorts
are. It only needs to be given a function that is able to compare the
objects.

But this is true also in a statically typed language with parametric
polymorphism.
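Pascal's sort example works the same way in Python: `sorted` never inspects the element type, only the supplied comparison function, so one comparison serves numbers and strings alike:

```python
from functools import cmp_to_key

def compare(a, b):
    # The only operation sort needs; the element type is irrelevant to it.
    return (a > b) - (a < b)

print(sorted([3, 1, 2], key=cmp_to_key(compare)))
print(sorted(["pear", "apple"], key=cmp_to_key(compare)))
```

In a statically typed language with parametric polymorphism, the same code would simply carry a type like `(a -> a -> Ordering) -> [a] -> [a]`.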

[...]
Why should adding a few functions or methods, and providing input
values of a new type be rejected from a statically checked point of
view by a compiled program that would be mostly bit-for-bit the same
with or without this new type?

It usually wouldn't be -- adding methods in a typical statically typed
OO language is unlikely to cause type errors (unless there is a naming
conflict, in some cases). Nor would adding new types or new functions.

(*Using* new methods without declaring them would cause an error, yes.)
 

David Hopwood

Pascal said:
But it's always possible at run-time that new functions and new
function calls be generated such as:

(let ((x "two"))
  (eval `(defmethod g ((self ,(type-of x))) t))
  (eval `(defmethod h ((x ,(type-of x)) (y string))
           (,(intern (format nil "DO-SOMETHING-WITH-A-~A" (type-of x))) x)
           (do-something-with-a-string y)))
  (funcall (compile nil `(let ((x ,x)) (lambda () (f x "Hi!"))))))

Will you execute the whole type-inference on the whole program "black
box" every time you define a new function? Will you recompile all the
"black box" functions to take into account the new type the arguments
can be now?

Yes, why not?
This wouldn't be too efficient.

It's rare, so it doesn't need to be efficient. 'eval' is inherently
inefficient, anyway.
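Pascal's Lisp example -- generating a new function at run time, specialised to the runtime type of a value -- has a direct Python counterpart using `exec`; as noted above, this is rare enough that its inefficiency hardly matters (the generated function name is an invented analogue of the DO-SOMETHING-WITH-A-~A pattern):

```python
x = "two"
tname = type(x).__name__        # 'str', discovered at run time

# Generate source for a brand-new function, named after x's runtime type.
src = f"""
def do_something_with_a_{tname}(v):
    return f"handling a {tname}: {{v!r}}"
"""
namespace = {}
exec(src, namespace)            # compile and define it at run time

handler = namespace[f"do_something_with_a_{tname}"]
print(handler(x))
```

A static type checker would indeed have to re-analyse (or conservatively over-approximate) any code reachable from such dynamically generated definitions.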
 

David Hopwood

Andreas said:
"Values" refers to the concrete values existent in the semantics of a
programming language. This set is usually infinite, but basically fixed.
To describe the set of "values" of an abstract type you would need
"fresh" values that did not exist before (otherwise the abstract type
would be equivalent to some already existent type). So you'd need at
least a theory for name generation or something similar to describe
abstract types in a types-as-sets metaphor.

Set theory has no difficulty with this. It's common, for example, to see
"the set of strings representing propositions" used in treatments of
formal systems.
 

Anton van Straaten

Vesa said:
[...]

This static vs dynamic type thing reminds me of one article written by
Bjarne Stroustrup where he notes that "Object-Oriented" has become a
synonym for "good". More precisely, it seems to me that both camps
(static & dynamic) think that "typed" is a synonym for having
"well-defined semantics" or being "safe" and therefore feel the need
to be able to speak of their language as "typed" whether or not it
makes sense.

I reject this comparison. There's much more to it than that. The point
is that the reasoning which programmers perform when working with a
program in a latently-typed language bears many close similarities to
the purpose and behavior of type systems.

This isn't an attempt to jump on any bandwagons, it's an attempt to
characterize what is actually happening in real programs and with real
programmers. I'm relating that activity to type systems because that is
what it most closely relates to.

Usually, formal type theory ignores such things, because of course
what's in the programmer's head is outside the domain of the formal
definition of an untyped language. But all that means is that formal
type theory can't account for the entirety of what's happening in the
case of programs in untyped languages.

Unless you can provide some alternate theory of the subject that's
better than what I'm offering, it's not sufficient to complain "but that
goes beyond (static/syntactic) type theory". Yes, it goes beyond
traditional type theory, because it's addressing something which
traditional type theory doesn't address. There are reasons to connect
it to type theory, and if you can't see those reasons, you need to be
more explicit about why.
I agree. I think that instead of "statically typed" we should say
"typed" and instead of "(dynamically|latently) typed" we should say
"untyped".

The problem with "untyped" is that there are obvious differences in
typing behavior between the untyped lambda calculus and, say, a language
like Scheme (and many others). Like all latently-typed languages,
Scheme includes, in the language, mechanisms to tag values in a way that
supports checks which help the programmer to ensure that the program's
behavior matches the latent type scheme that the programmer has in mind.
See my other recent reply to Marshall for a more detailed explanation
of what I mean.

I'm suggesting that if a language classifies and tags values in a way
that supports the programmer in static reasoning about the behavior of
terms, then calling it "untyped" does not capture the entire picture,
even if it's technically accurate in a restricted sense (i.e. in the
sense that terms don't have static types that are known within the
language).

Let me come at this from another direction: what do you call the
classifications into number, string, vector etc. that a language like
Scheme does? And when someone writes a program which includes the
following lines, how would you characterize the contents of the comment:

; third : integer -> integer
(define (third n) (quotient n 3))

In my experience, answering these questions without using the word
"type" results in a lot of silliness. And if you do use "type", then
you're going to have to adjust the rest of your position significantly.
That is not unreasonable. You see, you can't have types unless you
have a type system. Types without a type system are like answers
without questions - it just doesn't make any sense.

The first point I was making is that *automated* checking has very
little to do with anything, and conflating static types with automated
checking tends to lead to a lot of confusion on both sides of the
static/dynamic fence.

But as to your point, latently typed languages have informal type
systems. Show me a latently typed language or program, and I can tell
you a lot about its type system or type scheme.

Soft type inferencers demonstrate this by actually defining a type
system and inferring type schemes for programs. That's a challenging
thing for an automated tool to do, but programmers routinely perform the
same sort of activity on an informal basis.
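A toy illustration of the kind of thing a soft type inferencer does -- assigning static-style types to code with no declarations. This is a deliberately minimal sketch over a two-type universe, not any real inferencer's algorithm:

```python
import ast

def infer(expr_src, env=None):
    """Infer a crude type ('number' or 'string') for a Python expression,
    given an environment mapping variable names to types."""
    env = env or {}
    node = ast.parse(expr_src, mode="eval").body

    def walk(n):
        if isinstance(n, ast.Constant):
            if isinstance(n.value, (int, float)):
                return "number"
            if isinstance(n.value, str):
                return "string"
        if isinstance(n, ast.Name):
            return env[n.id]                  # look up the latent type
        if isinstance(n, ast.BinOp) and isinstance(n.op, ast.Mult):
            left, right = walk(n.left), walk(n.right)
            if left == right == "number":
                return "number"
            raise TypeError(f"cannot multiply {left} and {right}")
        raise TypeError("unsupported expression")

    return walk(node)

# Recovering the latent type of timestwo's body, "x * 2":
print(infer("x * 2", {"x": "number"}))    # the body forces x : number
```

Given that the body has type "number" whenever x does, the inferencer has in effect recovered "number -> number" -- the latent type a programmer would write in a comment.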
There is a huge hole in your argument above. Types really do not make
sense without a type system. To claim that a program has a type
scheme, you must first specify the type system. Otherwise it just
doesn't make any sense.

Again, the type system is informal. What you're essentially saying is
that only things that are formally defined make sense. But you can't
wish dynamically-checked languages out of existence. So again, how
would you characterize these issues in dynamically-checked languages?

Saying that it's just a matter of well-defined semantics doesn't do
anything to address the details of what's going on. I'm asking for a
more specific account than that.
Mathematicians operated for thousands of years without automated
checking of proofs, so you can't argue that because a
dynamically-checked program hasn't had its type scheme proved correct,
that it somehow doesn't have types. That would be a bit like arguing
that we didn't have Math until automated theorem provers came along.


No - not at all. First of all, mathematics has matured quite a bit
since the early days. I'm sure you've heard of the axiomatic method.
However, what you are missing is that to prove that your program has
types, you first need to specify a type system. Similarly, to prove
something in math you start by specifying [fill in the rest].

I agree, to make the comparison perfect, you'd need to define a type
system. But that's been done in various cases. So is your complaint
simply that most programmers are working with informal type systems?
I've already stipulated that.

However, I think that you want to suggest that those programmers are not
working with type systems at all.

This reminds me of a comedy skit which parodied the transparency of
Superman's secret identity: Clark Kent is standing in front of Lois Lane
and removes his glasses for some reason. Lois looks confused and says
"where did Clark go?" Clark quickly puts his glasses back on, and Lois
breathes a sigh of relief, "Oh, there you are, Clark".

The problem we're dealing with in this case is that anything that's not
formally defined is essentially claimed to not exist. It's lucky that
this isn't really the case, otherwise much of the world around us would
vanish in a puff of formalist skepticism.

We're discussing systems that operate on an informal basis: in this
case, the reasoning about the classification of values which flow
through terms in a dynamically-checked language. If you can produce a
useful formal model of those systems that doesn't omit significant
information, that's great, and I'm all ears.

However, claiming that e.g. using a universal type is a complete model of
what's happening misses the point: it doesn't account at all for the
reasoning process I've just described.
Untyped is not misleading. "Typed" is not a synonym for "safe" or
"having well-defined semantics".

Again, your two suggested replacements don't come close to capturing
what I'm talking about. Without better alternatives, "type" is the
closest appropriate term. I'm qualifying it with the term "latent",
precisely to indicate that I'm not talking about formally-defined types.

I'm open to alternative terminology or ways of characterizing this, but
they need to address issues that exist outside the boundaries of formal
type systems, so simply applying terms from formal type theory is not
usually sufficient.
I won't (use "latently typed"). At least not without further
qualification.

This and my other recent post give a fair amount of qualification, so
let me know if you need anything else to be convinced. :)

But to be fair, I'll start using "untyped" if you can come up with a
satisfactory answer to the two questions I asked above, just before I
used the word "silliness".

Anton
 

Anton van Straaten

Rob said:
I agree with most of what you say except regarding "untyped".

In machine language or most assembly the type of a variable is
something held only in the mind of the programmer writing it, and
nowhere else. In latently typed languages though the programmer can
ask what the type of a particular value is. There is a vast
difference to writing code in the latter kind of language to writing
code in assembly.

The distinction you describe is pretty much exactly what I was getting
at. I may have put it poorly.
I would suggest that at least assembly should be referred to as
"untyped".

Yes, I agree.

While we're talking about "untyped", I want to expand on something I
wrote in a recent reply to Vesa Karvonen: I accept that "untyped" has a
technical definition which means that a language doesn't statically
assign types to terms. But this doesn't capture the distinction between
"truly" untyped languages, and languages which tag their values to
support the programmer's ability to think in terms of latent types.

The point of avoiding the "untyped" label is not because it's not
technically accurate (in a limited sense) -- it's because in the absence
of other information, it misses out on important aspects of the nature
of latently typed languages. My reply to Vesa goes into more detail
about this.

Anton
 

Anton van Straaten

Andreas said:
Maybe this is the original source of the myth that static typing is all
about assigning types to variables...

With all my respect to the Scheme people, I'm afraid this paragraph is
pretty off, no matter where you stand. Besides the issue just mentioned
it equates "manifest" with static types. I understand "manifest" to mean
"explicit in code", which of course is nonsense - static typing does not
require explicit types. Also, I never heard "weakly typed" used in the
way they suggest - in my book, C is a weakly typed language (= typed,
but grossly unsound).

That text goes back at least 20 years, to R3RS in 1986, and possibly
earlier, so part of what it represents is simply changing use of
terminology, combined with an attempt to put Scheme in context relative
to multiple languages and terminology usages. The fact that we're still
discussing this now, and haven't settled on terminology acceptable to
all sides, illustrates the problem.

The Scheme report doesn't actually say anything about how latent types
relate to static types, in the way that I've recently characterized the
relationship here. However, I thought this was a good place to explain
how I got to where I am on this subject.

There's been a lot of work done on soft type inference in Scheme, and
other kinds of type analysis. The Scheme report talks about latent
types as meaning "types are associated with values". While this might
cause some teeth-grinding amongst type theorists, it's meaningful enough
when you look at simple cases: cases where the type of a term is an
exact match for the type tags of the values that flow through that term.

When you get to more complex cases, though, most type inferencers for
Scheme assign traditional static-style types to terms. If you think
about this in conjunction with the term "latent types", it's an obvious
connection to make that what the inferencer is doing is recovering types
that are latent in the source.

Once that connection is made, it's obvious that the tags associated with
values are not the whole story: that the conformance of one or more
values to a "latent type" may be checked by a series of tag checks, in
different parts of a program (i.e. before, during and after the
expression in question is evaluated). I gave a more detailed
description of how latent types relate to tags in an earlier reply to
Marshall (Spight, not Joe).

Anton
 

Marshall

Joe said:
That's the important point: I want to run broken code.

I want to make sure I understand. I can think of several things
you might mean by this. It could be:
1) I want to run my program, even though I know parts of it
are broken, because I think there are parts that are not broken
and I want to try them out.
2) I want to run my program, even though it is broken, and I
want to run right up to a broken part and trap there, so I can
use the runtime facilities of the language to inspect what's
going on.

I want to run
as much of the working fragments as I can, and I want a `safety net' to
prevent me from performing undefined operations, but I want the safety
net to catch me at the *last* possible moment.

This statement is interesting, because the conventional wisdom (at
least as I'm used to hearing it) is that it is best to catch bugs
at the *first* possible moment. But I think maybe we're talking
about different continua here. The last last last possible moment
is after the software has shipped to the customer, and I'm pretty
sure that's not what you mean. I think maybe you mean something
more like 2) above.


Marshall
 

Marshall

Joachim said:
Not in mathematics.

So stipulated.

However, it *is* a contradiction in English. Now, we often redefine
or narrow terms for technical fields. However, when we find
ourselves in a situation where we've inverted the English
meaning with the technical meaning, or ended up with a
contradiction, it ought to give us pause.

The understanding there is that a "variable" varies - not over time, but
according to the whim of the usage. (E.g. if a function is displayed in
a graph, the parameter varies along the X axis. If it's used within a
term, the parameter varies depending on how it's used. Etc.)

Similarly for computer programs.
Of course, if values are immutable, the value associated with a
parameter name cannot vary within the same invocation - but it can still
vary from one invocation to the next.

Again, stipulated. In fact I conceded this was a good point when it
was first mentioned. However, we generally refer to parameters
to functions as "parameters"-- you did it yourself above.

What we generally (in programming) call variables are locals
and globals. If the languages supports an update operation
on those variables, then calling them variables makes sense.
But "variable" has become such a catch-all term that we call

public static final int FOO = 7;

a variable, even though it can never, ever vary.

That doesn't make any sense.

If we care about precision in our terminology, we ought to
distinguish among parameters, named values that support
destructive update, named values that don't support
destructive update, and possibly other things.


Marshall
 

Marshall

Timo said:
This is actually one of the most interesting threads I have read in a
long time. If you ignore the evangelism, there is a lot of high-quality
information and first-hand experience you couldn't find in a dozen books.

Hear hear! This is an *excellent* thread. The evangelism is at
rock-bottom
and the open exploration of other people's way of thinking is at what
looks to me like an all-time high.


Marshall
 

Chris Smith

Anton said:
I reject this comparison. There's much more to it than that.

I agree that there's more to it than that. I also agree, however, that
Vesa's observation is true, and is a big part of the reason why it's
difficult to discuss this topic. I don't recall who said what at this
point, but earlier today someone else posted -- in this same thread --
the idea that static type "advocates" want to classify some languages as
untyped in order to put them in the same category as assembly language
programming. That's something I've never seen, and I think it's far
from the goal of pretty much anyone; but clearly, *someone* was
concerned about it. I don't know if much can be done to clarify this
rhetorical problem, but it does exist.

The *other* bit that's been brought up in this thread is that the word
"type" is just familiar and comfortable for programmers working in
dynamically typed languages, and that they don't want to change their
vocabulary.

The *third* thing that's brought up is that there is a (to me, somewhat
vague) conception going around that the two really ARE varieties of the
same thing. I'd like to pin this down more, and I hope we get there,
but for the time being I believe that this impression is incorrect. At
the very least, I haven't seen a good way to state any kind of common
definition that withstands scrutiny. There is always an intuitive word
involved somewhere which serves as an escape hatch for the author to
retain some ability to make a judgement call, and that of course
sabotages the definition. So far, that word has varied through all of
"type", "type error", "verify", and perhaps others... but I've never
seen anything that allows someone to identify some universal concept of
typing (or even the phrase "dynamic typing" in the first place) in a way
that doesn't appeal to intuition.
The point
is that the reasoning which programmers perform when working with a
program in a latently-typed language bears many close similarities to
the purpose and behavior of type systems.

Undoubtedly, some programmers sometimes perform reasoning about their
programs which could also be performed by a static type system. This is
fairly natural, since static type systems specifically perform tractable
analyses of programs (Pierce even uses the word "tractable" in the
definition of a type system), and human beings are often (though not
always) best-served by trying to solve tractable problems as well.
There are reasons to connect
it to type theory, and if you can't see those reasons, you need to be
more explicit about why.

Let me pipe up, then, as saying that I can't see those reasons; or at
least, if I am indeed seeing the same reasons that everyone else is,
then I am unconvinced by them that there's any kind of rigorous
connection at all.
I'm suggesting that if a language classifies and tags values in a way
that supports the programmer in static reasoning about the behavior of
terms, then calling it "untyped" does not capture the entire picture,
even if it's technically accurate in a restricted sense (i.e. in the
sense that terms don't have static types that are known within the
language).

It is, nevertheless, quite appropriate to call the language "untyped" if
I am considering static type systems. I seriously doubt that this usage
in any way misleads anyone into assuming the absence of any mental
processes on the part of the programmer. I hope you agree. If not,
then I think you significantly underestimate a large category of people.
The first point I was making is that *automated* checking has very
little to do with anything, and conflating static types with automated
checking tends to lead to a lot of confusion on both sides of the
static/dynamic fence.

I couldn't disagree more. Rather, when you're talking about static
types (or just "types" in most research literature that I've seen), then
the realm of discussion is specifically defined to be the very set of
errors that are automatically caught and flagged by the language
translator. I suppose that it is possible to have an unimplemented type
system, but it would be unimplemented only because someone hasn't felt
the need nor gotten around to it. Being implementABLE is a crucial part
of the definition of a static type system.

I am beginning to suspect that you're making the converse of the error I
made earlier in the thread. That is, you may be saying things regarding
the psychological processes of programmers and such that make sense when
discussing dynamic types, and in any case I haven't seen any kind of
definition of dynamic types that is more acceptable yet; but it's
completely irrelevant to static types. Static types are not fuzzy -- if
they were fuzzy, they would cease to be static types -- and they are not
a phenomenon of psychology. To try to redefine static types in this way
not only ignores the very widely accepted basis of entire field of
existing literature, but also leads to false ideas such as that there is
some specific definable set of problems that type systems are meant to
solve.

Skipping ahead a bit...
I agree, to make the comparison perfect, you'd need to define a type
system. But that's been done in various cases.

I don't think that has been done, in the case of dynamic types. It has
been done for static types, but much of what you're saying here is in
contradiction to the definition of a type system in that sense of the
word.
The problem we're dealing with in this case is that anything that's not
formally defined is essentially claimed to not exist.

I see it as quite reasonable when there's an effort by several
participants in this thread to either imply or say outright that static
type systems and dynamic type systems are variations of something
generally called a "type system", and given that static type systems are
quite formally defined, that we'd want to see a formal definition for a
dynamic type system before accepting the proposition that they are of a
kind with each other. So far, all the attempts I've seen to define a
dynamic type system seem to reduce to just saying that there is a well-
defined semantics for the language.

I believe that's unacceptable for several reasons, but the most
significant of them is this. It's not reasonable to ask anyone to
accept that static type systems gain their essential "type system-ness"
from the idea of having well-defined semantics. From the perspective of
a statically typed language, this looks like a group of people getting
together and deciding that the real "essence" of what it means to be a
type system is... and then naming something that's so completely non-
essential that we don't generally even mention it in lists of the
benefits of static types, because we have already assumed that it's true
of all languages except C, C++, and assembly language.
 

Marshall

Anton said:
Sorry, that was a huge omission. (What I get for posting at 3:30am.)

Thanks for the excellent followup.

The short answer is that I'm most directly referring to "the types in
the programmer's head".

In the database theory world, we speak of three levels: conceptual,
logical, physical. In a dbms, these might roughly be compared to
business entities described in a requirements doc (or in the
business owners' heads), the (logical) schema encoded in the
dbms, and the b-tree indices the dbms uses for performance.

So when you say "latent types", I think "conceptual types."

The thing about this is, both the Lisp and the Haskell programmers
are using conceptual types (as best I can tell.) And also, the
conceptual/latent types are not actually a property of the
program; they are a property of the programmer's mental
model of the program.

It seems we have languages:
with or without static analysis
with or without runtime type information (RTTI or "tags")
with or without (runtime) safety
with or without explicit type annotations
with or without type inference

Wow. And I don't think that's a complete list, either.

I would be happy to abandon "strong/weak" as terminology
because I can't pin those terms down. (It's not clear what
they would add anyway.)

A more complete informal summary is as follows:

Languages with latent type systems typically don't include type
declarations in the source code of programs. The "static" type scheme
of a given program in such a language is thus latent, in the English
dictionary sense of the word, of something that is present but
undeveloped. Terms in the program may be considered as having static
types, and it is possible to infer those types, but it isn't necessarily
easy to do so automatically, and there are usually many possible static
type schemes that can be assigned to a given program.
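(To sketch the "many possible static type schemes" point in Javascript --
the identity function here is an illustrative example, not one taken from
the thread:)

```javascript
// The identity function admits many static type schemes:
//   number -> number, string -> string, or most generally a -> a.
// Nothing in the program text singles one of them out.
function identity(x) { return x }

identity(5)     // usable at number -> number
identity("hi")  // equally usable at string -> string
```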

Programmers infer and reason about these latent types while they're
writing or reading programs. Latent types become manifest when a
programmer reasons about them, or documents them e.g. in comments.
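(For instance, a comment in an ML-style arrow notation -- a hypothetical
convention, not one the thread prescribes -- makes a latent type manifest:)

```javascript
// celsius : number -> number   (the latent type, written down by hand)
function celsius(f) { return (f - 32) * 5 / 9 }

celsius(212)  // 100
```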

Uh, oh, a new term, "manifest." Should I worry about that?

(As has already been noted, this definition may seem at odds with the
definition given in the Scheme report, R5RS, but I'll address that in a
separate post.)

There's a close connection between latent types in the sense I've
described, and the "tagged values" present at runtime. However, as type
theorists will tell you, the tags used to tag values at runtime, as e.g.
a number or a string or a FooBar object, are not the same thing as the
sort of types which statically-typed languages have.

A simple example of the distinction can be seen in the type of a
function. Using Javascript as a lingua franca:

function timestwo(x) { return x * 2 }

In a statically-typed language, the type of a function like this might
be something like "number -> number", which tells us three things: that
timestwo is a function; that it accepts a number argument; and that it
returns a number result.

But if we ask Javascript what it thinks the type of timestwo is, by
evaluating "typeof timestwo", it returns "function". That's because the
value bound to timestwo has a tag associated with it which says, in
effect, "this value is a function".

Well, darn. It strikes me that that's just a decision the language
designers
made, *not* to record complete RTTI. (Is it going to be claimed that
there is an *advantage* to having only incomplete RTTI? It is a
serious question.)

But "function" is not a useful type. Why not? Because if all you know
is that timestwo is a function, then you have no idea what an expression
like "timestwo(foo)" means. You couldn't write working programs, or
read them, if all you knew about functions was that they were functions.
As a type, "function" is incomplete.
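(A small illustration of that incompleteness; greet is a made-up companion
function, not one from the thread:)

```javascript
function timestwo(x) { return x * 2 }         // latent type: number -> number
function greet(name) { return "hi " + name }  // latent type: string -> string

// The runtime tag cannot tell these two apart:
typeof timestwo  // "function"
typeof greet     // "function"
```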

Yes, function is a parameterized type, and they've left out the
parameter values.

By my definition, though, the latent type of timestwo is "number ->
number". Any programmer looking at the function can figure out that
this is its type, and programmers do exactly that when reasoning about a
program.
Gotcha.


(Aside: technically, you can pass timestwo something other than a
number, but then you get NaN back, which is usually not much use except
to generate errors. I'll ignore that here; latent typing requires being
less rigorous about some of these issues.)

So, where do tagged values fit into this? Tags help to check types at
runtime, but that doesn't mean that there's a 1:1 correspondence between
tags and types. For example, when an expression such as "timestwo(5) *
3" is evaluated, three checks occur that are relevant to the type of
timestwo:

1. Before the function call takes place, a check ensures that timestwo
is a function.

2. Before the multiplication in "x * 2", a check ensures that x is a number.

3. When timestwo returns, before the subsequent multiplication by 3, a
check ensures that the value timestwo returned is a number.

These three checks correspond to the three pieces of information
contained in the function type signature "number -> number".
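(The same checks can be observed from within Javascript itself; note that
for checks 2 and 3 Javascript "checks" by coercing, yielding NaN rather
than raising an error, as the aside above mentioned:)

```javascript
function timestwo(x) { return x * 2 }

// Check 1: the callee must carry the "function" tag;
// e.g. (5)(3) would fail here with "TypeError: 5 is not a function".

// Checks 2 and 3: in Javascript these are coercions rather than hard
// errors, so a non-number flows through as NaN instead of being rejected.
timestwo(5) * 3           // 30
isNaN(timestwo("foo"))    // true
```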

However, these dynamic checks still don't actually tell us the type of a
function. All they do is check that in a particular case, the values
involved are compatible with the type of the function. In many cases,
the checks may enforce a signature that's either more or less specific
than the function's type, or they may enforce only an incomplete
signature -- e.g., the return value doesn't need to be checked when
evaluating "arr = timestwo(5)".

I used a function just as an example. There are many other cases where
a value's tag doesn't match the static (or latent) type of the terms
through which it flows. A simple example is an expression such as:

(flag ? 5 : "foo")

Here, the latent type of this expression could be described as "number |
string". There won't be a runtime tag anywhere which represents that
type, though, since the language implementation never deals with the
actual type of expressions, except in those degenerate cases where the
type is so simple that it happens to be a 1:1 match to the corresponding
tag. It's only the programmer that "knows" that this expression has
that type.
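(This is easy to confirm in Javascript: the observed tag is always the tag
of one branch, never the union:)

```javascript
var flag = true
typeof (flag ? 5 : "foo")   // "number"

flag = false
typeof (flag ? 5 : "foo")   // "string"

// No runtime tag ever denotes the union "number | string" itself;
// that type lives only in the programmer's reasoning.
```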


Hmmm. Another place where the static type isn't the same thing as
the runtime type occurs in languages with subtyping.

Question: if a language *does* record complete RTTI, and the
languages does *not* have subtyping, could we then say that
the runtime type information *is* the same as the static type?


Marshall
 
