What is Expressiveness in a Computer Language

Andreas Rossberg

Marshall said:
I don't follow. Are you saying that a set cannot be described
intensionally? How is "the set of all objects that implement the
Foo interface" not sufficiently abstract? Is it possible you are
mixing in implementation concerns?

"Values" refers to the concrete values existent in the semantics of a
programming language. This set is usually infinite, but basically fixed.
To describe the set of "values" of an abstract type you would need
"fresh" values that did not exist before (otherwise the abstract type
would be equivalent to some already existent type). So you'd need at
least a theory for name generation or something similar to describe
abstract types in a types-as-sets metaphor.

- Andreas
 
John W. Kennedy

Rob said:
Another language which has *neither* latent ("dynamic") nor
manifest ("static") types is (was?) BLISS[1], in which, like
assembler, variables are "just" addresses[2], and values are
"just" a machine word of bits.

360-family assembler, yes. 8086-family assembler, not so much.
 
rossberg

Darren said:
No. AFAIU, an ADT defines the type based on the operations. The stack
holding the integers 1 and 2 is the value (push(2, push(1, empty()))).
There's no "internal" representation. The values and operations are
defined by preconditions and postconditions.

Are you sure that you aren't confusing *abstract* with *algebraic* data
types? In my book, abstract types usually have an internal
representation, and that can even be stateful. I don't remember having
encountered a definition of ADT as restrictive as the one you describe.
Darren said:
Both a stack and a queue could be written in most languages as "values
that can only be accessed by a fixed set of operations" having the same
possible internal representations and the same function signatures.
They're far from the same type, because they're not abstract.

Different abstract types can have the same signature. That does not
make them the same type. The types are distinguished by their identity.

Likewise, two classes can have the same set of methods, without being
the same class (at least in languages that have nominal typing, which
includes almost all typed OOPLs).
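For instance, in CLOS terms (a minimal sketch; the class names are made
up), two classes can be structurally identical and still be distinct types:

(defclass celsius ()    ((magnitude :initarg :magnitude)))
(defclass fahrenheit () ((magnitude :initarg :magnitude)))

;; Same slots and the same possible operations, but distinct identities:
(typep (make-instance 'celsius :magnitude 20) 'fahrenheit)  ; => NIL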
Darren said:
I'm pretty sure in Pascal you could say

Type Apple = Integer; Orange = Integer;

and then vars of type apple and orange were not interchangeable.

No, the following compiles perfectly fine (using GNU Pascal):

program bla;
type
  apple = integer;
  orange = integer;
var
  a : apple;
  o : orange;
begin
  a := o
end.

- Andreas
 
Rob Thorpe

David said:
Rob said:
Vesa said:
Let me add another complex subtlety, then: the above description misses
an important point, which is that *automated* type checking is not the
whole story. I.e. that compile time/runtime distinction is a kind of
red herring.

I agree. I think that instead of "statically typed" we should say
"typed" and instead of "(dynamically|latently) typed" we should say
"untyped". [...]
It's certainly close enough to say that the *language* is untyped.

Indeed. Either a language has a type system and is typed or has no
type system and is untyped. I see very little room for confusion
here. In my experience, the people who confuse these things are
people from the dynamic/latent camp who wish to see types everywhere
because they confuse typing with safety or having well-defined
semantics.

No. It's because we use the things we call latent types for the same
purpose that programmers of statically typed languages use static types.

Programmers of statically typed languages ensure that the value of some
expression is of some type by having the compiler check it. Programmers
of latently typed languages check, if they think it's important, by
asking what the type of the result is.
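At a Lisp listener, for instance, that check is just a question (a
minimal sketch; the exact type printed is implementation-dependent):

(type-of (expt 2 64))    ; => BIGNUM in many implementations
(integerp (expt 2 64))   ; => T
(typep "Hi!" 'string)    ; => T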

The objection here is that advocates of statically typed languages seem
to be claiming the "type" as their own word, and asking that others use
their definitions of typing, which are really specific to their
subjects of interest.

As far as I can tell, the people who advocate using "typed" and "untyped"
in this way are people who just want to be able to discuss all languages in
a unified terminological framework, and many of them are specifically not
advocates of statically typed languages.

It's easy to create a reasonable framework. My earlier posts show simple
ways of looking at it that could be further refined; I'm sure there are
others who have already done this.

The real objection to this was that latently/dynamically typed
languages have a place in it. But some of the advocates of statically
typed languages wish to lump these languages together with assembly
language as "untyped" in an attempt to label them as unsafe.
 
Darren New

Andreas said:
Are you sure that you aren't confusing *abstract* with *algebraic* data
types?

I've never heard of algebraic data types. It's always been "abstract
data types". Perhaps I stopped studying it, and the terminology changed
when many people started to consider encapsulation as abstraction. I
have any number of old textbooks and journals that refer to these as
"abstract data types", including texts that show an abstract data type
and then explain how to program it as an encapsulated object class.
Andreas said:
No, the following compiles perfectly fine (using GNU Pascal):

That'll teach me to rely on 15-year-old memories. :) Maybe I'm
remembering the wishful thinking from when I used Pascal.
 
Andreas Rossberg

Rob said:
It's easy to create a reasonable framework.

Luca Cardelli has given the most convincing one in his seminal tutorial
"Type Systems", where he identifies "typed" and "safe" as two orthogonal
dimensions and gives the following matrix:

        | typed | untyped
 -------+-------+-----------
 safe   | ML    | Lisp
 unsafe | C     | Assembler

In this jargon, "dynamically typed" is simply untyped but safe, while
"weakly typed" is typed but unsafe.
Rob said:
The real objection to this was that latently/dynamically typed
languages have a place in it. But some of the advocates of statically
typed languages wish to lump these languages together with assembly
language as "untyped" in an attempt to label them as unsafe.

No, see above. And I would assume that that is how most proponents of
the typed/untyped dichotomy understand it.

- Andreas
 
Darren New

John said:
360-family assembler, yes. 8086-family assembler, not so much.

And Burroughs B-series, not at all. There was one "ADD" instruction, and
it looked at the data in the addresses to determine whether to add ints
or floats. :)
 
Dimitri Maziuk

George Neuner sez:
You can't totally prevent it ... if the index computation involves
types having a wider range ....
The point is really that the checks that prevent these things must be
performed at runtime and can't be prevented by any practical type
analysis performed at compile time. I'm not a type theorist but my
opinion is that a static type system that could, a priori, prevent the
problem is impossible.

Right, but if you look carefully at the paragraph I originally fup'ed
to, you'll notice that it doesn't say "runtime" or "compile-time".
I was commenting on the "array bounds are not a typing problem" bit -- it
is, if you define the index as a distinct type, and there is at least one
language that supports that. In this context I don't think it matters
when the check is performed.

It does matter in the larger context of this thread, though. IMO the
important part here is that the crash report clearly points to the
caller code that generated the illegal index, not to my library code.

That is the basic argument in favour of compile time error checking,
extended to runtime errors. I don't really care if it's the compiler
or runtime that tells the luser "your code is broken", as long as it
makes it clear it's *his* code that's broken, not mine.
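For illustration, a Common Lisp sketch (the type name and bounds are
invented) of an index given its own distinct type, where the runtime
check blames the caller's value:

(deftype table-index () '(integer 0 9))

(defun fetch (table i)
  ;; CHECK-TYPE signals a TYPE-ERROR naming the offending value,
  ;; so the report points at the caller's broken index, not at
  ;; the library's AREF.
  (check-type i table-index)
  (aref table i))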

Dima
 
Pascal Bourguignon

Matthias Blume said:
Of course, some statically typed languages handle this sort of thing
routinely.


Your sort function from above also has a specific type -- a type which
represents the fact that the objects to be sorted must be acceptable
input to the comparison function.

Well, not exactly. sort is a higher level function. The type of its
arguments is an implicit parameter of the sort function.

(sort "Hello World" (function char<=))
--> " HWdellloor"

(sort '(52 12 42 37) (function <=))
--> (12 37 42 52)

(sort (list (make-instance 'person :name "Pascal")
            (make-instance 'unit :name "Pascal")
            (make-instance 'programming-language :name "Pascal"))
      (lambda (a b) (string<= (class-name (class-of a))
                              (class-name (class-of b)))))
--> (#<PERSON #x205763FE>
     #<PROGRAMMING-LANGUAGE #x205765BE>
     #<UNIT #x205764DE>)


In Common Lisp, SORT is specified to take a sequence -- of type SEQUENCE
= (or vector list), where a list must be a proper list -- and a function
taking two arguments (of any type) and returning a generalized boolean
(anything can be returned; NIL is false, anything else is true).


So you could say that:

(sort (sequence element-type)
      (function (element-type element-type) boolean))
--> (sequence element-type)

but element-type is not a direct parameter of sort, and can change from
call to call, even at the same call point:

(mapcar (lambda (s) (sort s (lambda (a b) (<= (sxhash a) (sxhash b)))))
        (list (vector 52 12 42 37)
              (list 52 12 42 37)
              (list "abc" 'def (make-instance 'person :name "Zorro") 76)))
--> (#(12 37 42 52)
     (12 37 42 52)
     ...)
Matthias Blume said:
...or you type-check your "black box" and make sure that no matter how
you will ever change the type of the inputs (in accordance with the
interface type of the box) you get a valid program.

When? At run-time? All the modifications I spoke of can be done at
run-time in Lisp.
 
Matthias Blume

Pascal Bourguignon said:
Well, not exactly.

What do you mean by "not exactly".
sort is a higher level function. The type of its
arguments is an implicit parameter of the sort function.

What do you mean by "higher-level"? Maybe you meant "higher-order" or
"polymorphic"?

[ rest snipped ]

You might want to look up "System F".

When what?
 
Marshall

Marshall said:
In this simple example,
the static case is better, but this is not free, and the cost
of the static case is evident elsewhere, but maybe not
illuminated by this example.

Ugh, please forgive my ham-fisted use of the word "better."
Let me try again:

In this simple example, the static case provides you with
a guarantee of type safety, but this is not free, and the
cost of the static case may be evident elsewhere, even
if not illuminated by this example.


Marshall
 
Pascal Bourguignon

Marshall said:
I still don't really see the difference.

I would not expect that the dynamic programmer will be
thinking that this code will have two numbers most of the
time but sometimes not, and fail. I would expect that in both
static and dynamic, the thought is that that code is adding
two numbers, with the difference being the static context
gives one a proof that this is so. In this simple example,
the static case is better, but this is not free, and the cost
of the static case is evident elsewhere, but maybe not
illuminated by this example.

This thread's exploration of the mindset of the two kinds
of programmers is difficult. It is actually quite difficult
(possibly impossible) to reconstruct mental states
through introspection. Nonetheless I don't see any
other way to proceed. Pair programming?

Well, this is a question of data flow. As I explained, there's a whole
body of functions that don't concretely process the data they get.
But of course, eventually you must write a function that does some
concrete processing on the data it gets. That's when you consider the
type of the values. Such functions may be generic functions with
methods dispatching on the actual type of the parameters, or you may
encounter some TYPECASE or COND inside the function before calling
non-abstract "primitives" that work only on some specific type.
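For example (a toy sketch; FROB and the DO-SOMETHING-* primitives are
placeholders):

(defgeneric frob (x))   ; dispatches on the actual class of X

(defmethod frob ((x string))  (do-something-with-a-string x))
(defmethod frob ((x integer)) (do-something-with-a-integer x))

;; or, equivalently, an explicit TYPECASE inside one function:
(defun frob* (x)
  (typecase x
    (string  (do-something-with-a-string x))
    (integer (do-something-with-a-integer x))
    (t       (error "Don't know how to frob a ~S" (type-of x)))))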
 
Pascal Bourguignon

Matthias Blume said:
What do you mean by "not exactly".


What do you mean by "higher-level"? Maybe you meant "higher-order" or
"polymorphic"?

Yes, that's what I wanted to say.
[ rest snipped ]

You might want to look up "System F".
[...]

When what?

When will you type-check the "black box"?

A function such as:

(defun f (x y)
  (if (g x)
      (h x y)
      (i y x)))

in the context of a given program could be statically type-inferred as
taking an integer and a string as arguments. If the compiler did this
inference, it could perhaps generate code specific to these types.

But it's always possible at run-time that new functions and new
function calls will be generated, such as:

(let ((x "two"))
  (eval `(defmethod g ((self ,(type-of x))) t))
  (eval `(defmethod h ((x ,(type-of x)) (y string))
           (,(intern (format nil "DO-SOMETHING-WITH-A-~A" (type-of x))) x)
           (do-something-with-a-string y)))
  (funcall (compile nil `(let ((x ,x)) (lambda () (f x "Hi!"))))))

Will you execute the whole type inference on the whole program "black
box" every time you define a new function? Will you recompile all the
"black box" functions to take into account the new types the arguments
can now have?

This wouldn't be too efficient. Let's just say that by default, all
arguments and variables are of type T, so the type checking is trivial,
and the generated code is, by default, totally generic.


Only the few concrete, low-level functions need to know the types of
their arguments and variables. In these functions, either the lisp
compiler will do the type inference (starting from the predefined
primitives), or the programmer will declare the types to inform the
compiler what to expect.

(defun do-something-with-a-string (x)
  (declare (string x))
  ...)

(defun do-something-with-a-integer (x)
  (declare (integer x))
  ...)

....

--
__Pascal Bourguignon__ http://www.informatimago.com/

PUBLIC NOTICE AS REQUIRED BY LAW: Any use of this product, in any
manner whatsoever, will increase the amount of disorder in the
universe. Although no liability is implied herein, the consumer is
warned that this process will ultimately lead to the heat death of
the universe.
 
Joe Marshall

Chris said:
So the compiler now attempts to prove theorems about the program, but
once it has done so it uses the results merely to optimize its runtime
behavior and then throws the results away. I'd call that not a
statically typed language, then.

You're assuming that type-correctness is an all-or-nothing property
(well, technically it *is*, but bear with me). What if the compiler is
unable to prove a theorem about the entire program, but *can* prove a
theorem about a subset of the program? The theorems would naturally be
conditional, e.g. `Provided the input is an integer, the program is
type-safe', or time-bounded, e.g. `Until the program attempts to invoke
function FOO, the program is type-safe.'

Of course, we could encode that by restricting the type of the input
and everything would be copacetic, but suppose there is a requirement
that floating point numbers are valid input. For some reason, our
program is not type-safe for floats, but as a developer who is working
on the integer math routines, I have no intention of running that code.
The compiler cannot prove that I won't perversely enter a float, but
it can prove that if I enter an integer everything is type-safe. I can
therefore run, debug, and use a subset of the program.

That's the important point: I want to run broken code. I want to run
as much of the working fragments as I can, and I want a `safety net' to
prevent me from performing undefined operations, but I want the safety
net to catch me at the *last* possible moment. I'm not playing it safe
and staying where the compiler can prove I'll be ok. I'm living
dangerously and wandering near the edge where the compiler can't quite
prove that I'll fail.

Once I've done the major amount of exploratory programming, I may very
well want to tighten things up. I'd like to have the compiler prove
that some mature library is type-safe and that all callers to the
library use it in a type-correct manner. I'd like the compiler to say
`Hey, did you know that if the user enters a floating point number
instead of his name that your program will crash and burn?', but I
don't want that sort of information until I'm pretty sure about what I
want the program to do in the normal case.
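A Lisp-flavoured sketch of that work style (the function names are
invented); the unfinished branch only fails when it is actually reached:

(defun area (shape)
  (ecase (first shape)
    (circle (* pi (expt (second shape) 2)))
    (square (handle-square shape))))   ; HANDLE-SQUARE not written yet

(area '(circle 2))   ; works today
(area '(square 3))   ; signals UNDEFINED-FUNCTION at the last moment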
 
Chris Smith

Chris Uppal said:
I think we're agreed (you and I anyway, if not everyone in this thread) that we
don't want to talk of "the" type system for a given language. We want to allow
a variety of verification logics. So a static type system is a logic which can
be implemented based purely on the program text without making assumptions
about runtime events (or making maximally pessimistic assumptions -- which comes
to the same thing really). I suggest that a "dynamic type system" is a
verification logic which (in principle) has available as input not only the
program text, but also the entire history of the program execution up to the
moment when the to-be-checked operation is invoked.

I am trying to understand how the above statement about dynamic types
actually says anything at all. So a dynamic type system is a system of
logic which, given a program and a path of program execution up to
this point, verifies something. We still haven't defined "something",
though. We also haven't defined what happens if that verification
fails. One or the other or (more likely) some combination of the two
must be critical to the definition in order to exclude silly
applications of it. Presumably you want to exclude from your definition
a dynamic "type system" that merely verifies that a value is non-negative
and, if so, executes the block of code following "then", and otherwise
executes the block of code following "else". Yet I imagine you don't
want to exclude ALL systems that allow the programmer to execute
different code when the verification fails (think exception handlers)
versus succeeds, nor exclude ALL systems where the condition is that a
value is non-negative.

In other words, I think that everything so far is essentially just
defining a dynamic type system as equivalent to a formal semantics for a
programming language, in different words that connote some bias toward
certain ways of looking at possibilities that are likely to lead to
incorrect program behavior. I doubt that will be an attractive
definition to very many people.
Chris Uppal said:
Note that not all errors that I would want to call type errors are necessarily
caught by the runtime -- it might go happily ahead never realising that it had
just violated one of the constraints of one of the logics I use to reason about
the program. That's what's known as an undetected bug -- but just because the
runtime doesn't see it, doesn't mean that I wouldn't say I'd made a type error.
(The same applies to any specific static type system too, of course.)

In static type system terminology, this quite emphatically does NOT
apply. There may, of course, be undetected bugs, but they are not type
errors. If they were type errors, then they would have been detected,
unless the compiler is broken.

If you are trying to identify a set of dynamic type errors, in a way
that also applies to statically typed languages, then I will read on.
Chris Uppal said:
But the checks the runtime does perform (whatever they are, and whenever they
happen), do between them constitute /a/ logic of correctness. In many highly
dynamic languages that logic is very close to being maximally optimistic, but
it doesn't have to be (e.g. the runtime type checking in the JVM is pretty
pessimistic in many cases).

Anyway, that's more or less what I mean when I talk of dynamically typed
languages and their dynamic type systems.

So my objections, then, are in the first paragraph.
Chris Uppal said:
[**] Although there are operations which are not possible, reading another
object's instvars directly for instance, which I suppose could be taken to
induce a non-trivial (and static) type logic.

In general, I wouldn't consider a syntactically incorrect program to
have a static type error. Type systems are, in fact, essentially a tool
to separate concerns; specifically, to remove type-correctness concerns
from the grammars of programming languages. By doing so, we are able at
least to considerably simplify the grammar of the language, and perhaps
also to increase the "tightness" of the verification without risking
making the language grammar context-sensitive. (I'm unsure about the
second part of that statement, but I can think of no obvious theoretical
reason to assume that combining a type system with an ordinary context-
free grammar would yield another context-free grammar. Then again,
formal languages are not my strong point.)
 
Chris F Clark

Pascal said:
Consider a simple expression like 'a + b': In a dynamically typed
language, all I need to have in mind is that the program will attempt to
add two numbers. In a statically typed language, I additionally need to
know that there must be a guarantee that a and b will always hold numbers.

Marshall said:
I would not expect that the dynamic programmer will be
thinking that this code will have two numbers most of the
time but sometimes not, and fail. I would expect that in both
static and dynamic, the thought is that that code is adding
two numbers, with the difference being the static context
gives one a proof that this is so.

I don't think the problem is down at the atom level. It's really
at the structure (tuple) level. If we view 'a + b' as applying the
function + to some non-atomic objects a and b, the question is how
well we specify what a and b are. From my limited exposure to
programs that use dynamic typing, I would surmise that developers who
use dynamic typing have very broad and open definitions of the "types"
of a and b, and they don't want the type system complaining that they
have nailed the definitions down quite yet. And, if you read some of
the arguments in this topic, one assumes that they want to even be
able to correct the "types" of a and b at run-time, when they find out
that they have made a mistake without taking the system down.

So, for a concrete example, let b be a "file" type (either text file
or directory) and we get a whole system up and running with those two
types. But, we discover a need for "symbolic links". So, we want to
change the file type to have 3 sub-types. In a dynamic type system,
because the file "type" is loose (it's operationally defined by what
function one applies to b), this isn't an issue. As each function is
recoded to deal with the 3rd sub-type, more of the program becomes
functional. If we need to demo the program before we have worked out
how symbolic links work for some operation, it is not a problem. Run
the application, but don't exercise that combination of type and
operation.

In many ways, this can also be done in languages with type inference.
I don't understand the process by which one does it, so I can't
explain it. Perhaps someone else will--please....

Back to the example, the file type is not an atomic type, but
generally some structure/tuple/list/record/class with members or
fields. A function which renames files by looking at only the name
field may work perfectly adequately with the new symbolic link subtype
without change because the new subtype has the same name field and
uses it the same way. A spell-check function which works on text
files using the "contents" field might need to be recoded for symbolic
links if the contents field for that subtype is "different" (e.g. the
name of the target). The naive spell check function might appear to
work, but really do the wrong thing (i.e. checking whether the target file
name is made of legal words). Thus, the type problems are not at the
atomic level (where contents is a string), but at the structure level,
where one needs to know which fields have which meanings for which
subtypes.
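A CLOS sketch of that scenario (all names invented): the operation that
only touches the name field works for the new subtype unchanged, while
the unported one simply signals if exercised:

(defclass file () ((name :initarg :name :accessor file-name)))
(defclass text-file (file) ((contents :initarg :contents :accessor contents)))
(defclass directory-file (file) ())
(defclass symbolic-link (file) ((target :initarg :target :accessor target)))

;; Looks only at NAME, so the new subtype needs no change:
(defgeneric rename* (f new-name))
(defmethod rename* ((f file) new-name)
  (setf (file-name f) new-name))

;; Recoded per subtype; until a SYMBOLIC-LINK method exists,
;; just don't exercise that combination of type and operation:
(defgeneric spell-check (f))
(defmethod spell-check ((f text-file))
  (check-words (contents f)))   ; CHECK-WORDS is hypothetical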

-Chris
 
Joachim Durchholz

Marshall said:
immutable = can't change
vary-able = can change

Clearly a contradiction in terms.

Not in mathematics.
The understanding there is that a "variable" varies - not over time, but
according to the whim of the usage. (E.g. if a function is displayed in
a graph, the parameter varies along the X axis. If it's used within a
term, the parameter varies depending on how it's used. Etc.)

Similarly for computer programs.
Of course, if values are immutable, the value associated with a
parameter name cannot vary within the same invocation - but it can still
vary from one invocation to the next.
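A trivial sketch:

(defun square (x) (* x x))
(square 3)   ; within this invocation, X is immutably 3,
(square 4)   ; but it varies from this invocation to the next.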

Regards,
Jo
 
Joachim Durchholz

Andreas said:
Joachim said:
It's worth noting, too, that (in some sense) the type of an object
can change over time[*].

No. Since a type expresses invariants, this is precisely what may
*not* happen.

No. A type is a set of allowable values, allowable operations, and
constraints on the operations (which are often called "invariants" but
they are invariant only as long as the type is invariant).

Andreas said:
The purpose of a type system is to derive properties that are known to
hold in advance.

That's just one of many possible purposes (a noble one, and the most
preeminent one in FPLs I'll agree any day, but it's still *not the
definition of a type*).
Andreas said:
A type is the encoding of these properties. A type
varying over time is an inherent contradiction (or another abuse of the
term "type").

No. It's just a matter of definition, essentially.
E.g. in Smalltalk and Lisp, it does make sense to talk of the "type" of
a name or a value, even if that type may change over time.
I regard it as a highly dubious practice to have things change their
types over their lifetime, but if there are enough other constraints,
type constancy may indeed have to take a back seat.

Regards,
Jo
 
rossberg

Joachim said:
No. It's just a matter of definition, essentially.
E.g. in Smalltalk and Lisp, it does make sense to talk of the "type" of
a name or a value, even if that type may change over time.

OK, now we are simply back full circle to Chris Smith's original
complaint that started this whole subthread, namely (roughly) that
long-established terms like "type" or "typing" should *not* be
stretched in ways like this, because that is technically inaccurate and
prone to misinterpretation.

- Andreas
 
