What is Expressiveness in a Computer Language


Joachim Durchholz

Chris said:
Joachim Durchholz replied:

I agree the tag is always there in the abstract.

Well - in the context of a discussion of dynamic and static typing, I'd
think that the semantic ("abstract") level is usually the best level of
discourse.
Of course, this being the Usenet, shifting the level is common (and can
even be helpful).
In the end, I'm trying to fit things which are "too big" and "too
slow" into much less space and time; using types to help me reliably
make my program smaller and faster is just one trick. [...]

However, I also know that my way of thinking about it is fringe.

Oh, I really agree that's an important application of static typing.

Essentially, which aspect of static typing is more important depends on
where your problems lie: if it's resource constraints, static typing
tends to be seen as a tool to keep resource usage down; if it's bug
counts, static typing tends to be seen as a tool to express static
properties.
These aspects are obviously not equally important to everybody.

Regards,
Jo
 

Joachim Durchholz

Darren said:
And Burroughs B-series, not at all. There was one "ADD" instruction, and
it looked at the data in the addresses to determine whether to add ints
or floats. :)

I heard that the Telefunken TR series had a tagged architecture.
This seems roughly equivalent to what the B-series did (does?).

Regards,
Jo
 

Joachim Durchholz

Anton said:
Sorry, that was a huge omission. (What I get for posting at 3:30am.)

The short answer is that I'm most directly referring to "the types in
the programmer's head".

Ah, finally that terminology starts to make sense to me. I have been
wondering whether there's any useful difference between "latent" and
"run-time" typing. (I tend to avoid the term "dynamic typing" because
it's overloaded with too many vague ideas.)
there are usually many possible static
type schemes that can be assigned to a given program.

This seems to apply to latent types as well.

Actually the set of latent types seems to be the set of possible static
type schemes.
Um, well, a superset of these - static type schemes tend to be slightly
less expressive than what the programmer has in his head. (Most type
schemes cannot really express things like "the range of this index
variable is such-and-so", and the boundary to general assertions about
the code is quite blurry anyway.)
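To make this concrete, here is a small hypothetical sketch in JavaScript (a language picked up later in the thread). The function name and error message are invented for illustration: the point is that the latent type of the index - "an integer in [0, xs.length)" - is richer than either the runtime tag (`number`) or a typical static signature, so the programmer's in-head constraint ends up expressed as an explicit runtime check.

```javascript
// Hypothetical example: the "latent type" of i is not just `number` but
// "an integer in [0, xs.length)". A typical static type system records
// only `number`; the range constraint lives in the programmer's head,
// made explicit here as a runtime check.
function elementAt(xs, i) {
  if (!Number.isInteger(i) || i < 0 || i >= xs.length) {
    throw new RangeError("index " + i + " outside [0, " + xs.length + ")");
  }
  return xs[i];
}

elementAt([10, 20, 30], 1);  // 20
// elementAt([10], 5);       // throws RangeError
```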
There's a close connection between latent types in the sense I've
described, and the "tagged values" present at runtime. However, as type
theorists will tell you, the tags used to tag values at runtime, as e.g.
a number or a string or a FooBar object, are not the same thing as the
sort of types which statically-typed languages have.

Would that be a profound difference, or is it just that annotating a
value with a full type expression would cause too much runtime
overhead?

In your terminology:
So, where do tagged values fit into this? Tags help to check types at
runtime, but that doesn't mean that there's a 1:1 correspondence between
tags and types.
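A quick JavaScript illustration of the missing 1:1 correspondence (JavaScript comes up again later in the thread): `typeof` exposes the runtime tags, and values that any reasonable static type system would keep apart all share one tag.

```javascript
// Runtime tags are much coarser than static types: typeof yields a
// small fixed set of tags, so distinct static types collapse into one.
console.log(typeof 3);          // "number"  (ints and floats share one tag)
console.log(typeof 3.14);       // "number"
console.log(typeof [1, 2, 3]);  // "object"  (arrays, plain objects, ...)
console.log(typeof { a: 1 });   // "object"
console.log(typeof null);       // "object"  (a well-known historical quirk)
```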

Would it be possible to establish such a correspondence? Would there be
common consensus that such a system should still be called "tags", or
are there other, even more profound differences?

Regards,
Jo
 

Joachim Durchholz

Marshall said:
It seems we have languages:
with or without static analysis
with or without runtime type information (RTTI or "tags")
with or without (runtime) safety
with or without explicit type annotations
with or without type inference

Wow. And I don't think that's a complete list, either.

I would be happy to abandon "strong/weak" as terminology
because I can't pin those terms down. (It's not clear what
they would add anyway.)
Indeed.


Uh, oh, a new term, "manifest." Should I worry about that?

I think that was OED usage of the term.
Well, darn. It strikes me that that's just a decision the language
designers made, *not* to record complete RTTI. (Is it going to be
claimed that there is an *advantage* to having only incomplete RTTI?
It is a serious question.)

In most cases, it's probably "we don't have to invent or look up
efficient algorithms that we can't think of right now".
One could consider this a sorry excuse or a wise decision to stick with
available resources; both views have their merits IMHO ;-)
Yes, function is a parameterized type, and they've left out the
parameter values.

Well, in JavaScript, the explicit type system as laid down in the
run-time type information is unparameterized.
You can criticize this as unsound, or inadequate, or whatever you wish,
and I'd like to agree with that ;-), but the type of timestwo is indeed
just "function".

*Your mental model* is far more detailed, of course.
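A minimal sketch of the point (the `timestwo` name comes from the thread; the second function is an invented companion): JavaScript's runtime type information for any function is just the tag "function", with no record of the parameter or result types the programmer has in mind.

```javascript
// JavaScript's RTTI for functions is unparameterized: both of these
// carry only the tag "function", however different their "mental" types.
const timestwo = function (x) { return x * 2; };
const greet = function (name) { return "hi " + name; };

console.log(typeof timestwo);  // "function"
console.log(typeof greet);     // "function" -- same tag, different intent
console.log(timestwo(21));     // 42
```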
Question: if a language *does* record complete RTTI, and the
language does *not* have subtyping, could we then say that
the runtime type information *is* the same as the static type?

Only if the two actually use the same type system.

In practice, I'd say that this is what most designers intend but what
implementors may not have gotten right ;-p

Regards,
Jo
 

Joachim Durchholz

Andreas said:
Luca Cardelli has given the most convincing one in his seminal tutorial
"Type Systems", where he identifies "typed" and "safe" as two orthogonal
dimensions and gives the following matrix:

       | typed | untyped
-------+-------+-----------
safe   | ML    | Lisp
unsafe | C     | Assembler

Now, jargon "dynamically typed" is simply untyped safe, while "weakly
typed" is typed unsafe.

Here's a matrix showing how most people that I know would fill in the terminology:

         | Statically   | Not         |
         | typed        | statically  |
         |              | typed       |
---------+--------------+-------------+
typesafe | "strongly    | Dynamically |
         | typed"       | typed       |
         | (ML, Pascal) | (Lisp)      |
---------+--------------+-------------+
not      | (no common   | "untyped"   |
typesafe | terminology) |             |
         | (C)          | (Assembly)  |
---------+--------------+-------------+

(Terms in quotes are challenged on a regular basis, or rarely if ever
applied.)

With the above terminology, it becomes clear that the opposite of
"(statically) typed" isn't "statically untyped", but "not statically
typed". "Statically typed" and "dynamically typed" aren't even
opposites, they just don't overlap.

Another observation: type safeness is more of a spectrum than a clearcut
distinction. Even ML and Pascal have ways to circumvent the type system,
and even C is typesafe unless you use unsafe constructs.
IOW from a type-theoretic point of view, there is no real difference
between the typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct usage
in practical programs.

Regards,
Jo
 

Vesa Karvonen

In comp.lang.functional Joachim Durchholz said:
[...] Even ML and Pascal have ways to circumvent the type system, [...]

Show me a Standard ML program that circumvents the type system.

-Vesa Karvonen
 

rossberg

Joachim Durchholz wrote:
Another observation: type safeness is more of a spectrum than a clearcut
distinction. Even ML and Pascal have ways to circumvent the type system,

No. I'm not sure about Pascal, but (Standard) ML surely has none. Same
with Haskell as defined by its spec. OCaml has a couple of clearly
marked unsafe library functions, but no unsafe feature in the language
semantics itself.
and even C is typesafe unless you use unsafe constructs.

Tautology. Every language is "safe unless you use unsafe constructs".
(Unfortunately, you can hardly write interesting programs in any safe
subset of C.)
IOW from a type-theoretic point of view, there is no real difference
between the typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct usage
in practical programs.

Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not. A soundness proof is obligatory for any serious
type theory, and failure to establish it simply is a bug in the theory.

- Andreas
 

Pascal Costanza

Joachim said:
Here's a matrix showing how most people that I know would fill in the terminology:

         | Statically   | Not         |
         | typed        | statically  |
         |              | typed       |
---------+--------------+-------------+
typesafe | "strongly    | Dynamically |
         | typed"       | typed       |
         | (ML, Pascal) | (Lisp)      |
---------+--------------+-------------+
not      | (no common   | "untyped"   |
typesafe | terminology) |             |
         | (C)          | (Assembly)  |
---------+--------------+-------------+

(Terms in quotes are challenged on a regular basis, or rarely if ever
applied.)

With the above terminology, it becomes clear that the opposite of
"(statically) typed" isn't "statically untyped", but "not statically
typed". "Statically typed" and "dynamically typed" aren't even
opposites, they just don't overlap.

Another observation: type safeness is more of a spectrum than a clearcut
distinction. Even ML and Pascal have ways to circumvent the type system,
and even C is typesafe unless you use unsafe constructs.
IOW from a type-theoretic point of view, there is no real difference
between the typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct usage
in practical programs.

It's also relevant how straightforward it is to distinguish between safe
and unsafe code, how explicit you have to be when you use unsafe code,
how likely it is that you accidentally use unsafe code without being
aware of it, what the generally accepted conventions in a language
community are, etc. pp.


Pascal
 

David Hopwood

Joachim said:
Here's a matrix showing how most people that I know would fill in the terminology:

         | Statically   | Not         |
         | typed        | statically  |
         |              | typed       |
---------+--------------+-------------+
typesafe | "strongly    | Dynamically |
         | typed"       | typed       |
         | (ML, Pascal) | (Lisp)      |

Pascal, in most of its implementations, isn't [memory] safe; it's in the same
box as C.
---------+--------------+-------------+
not      | (no common   | "untyped"   |
typesafe | terminology) |             |
         | (C)          | (Assembly)  |
---------+--------------+-------------+

I have to say that I prefer the labels used by Cardelli.
(Terms in quotes are challenged on a regular basis, or rarely if ever
applied.)

With the above terminology, it becomes clear that the opposite of
"(statically) typed" isn't "statically untyped", but "not statically
typed". "Statically typed" and "dynamically typed" aren't even
opposites, they just don't overlap.

Another observation: type safeness is more of a spectrum than a clearcut
distinction. Even ML and Pascal have ways to circumvent the type system,

Standard ML doesn't (with just the standard basis; no external libraries,
and assuming that file I/O cannot be used to corrupt the language
implementation).
and even C is typesafe unless you use unsafe constructs.
IOW from a type-theoretic point of view, there is no real difference
between the typesafe and not typesafe languages in the "statically
typed" column; the difference is in the amount of unsafe construct usage
in practical programs.

It is quite possible to design languages that have absolutely no unsafe
constructs, and are still useful.
 

David Hopwood

Chris said:
Do you reject that there could be something more general than what a
type theorist discusses? Or do you reject calling such things a type?

Later you write:


I'm particularly interested if something unsound (and perhaps
ambiguous) could be called a type system.

Yes, but not a useful one. The situation is the same as with unsound
formal systems; they still satisfy the definition of a formal system.
 

David Hopwood

David said:
Yes, but not a useful one. The situation is the same as with unsound
formal systems; they still satisfy the definition of a formal system.

I meant "inconsistent formal systems".
 

David Hopwood

Matthias said:
David Hopwood said:
Patricia said:
Vesa Karvonen wrote:
...

An example of a form of informal reasoning that (practically) every
programmer does daily is termination analysis. [...]
Given a program, it may be possible to formally
prove that it terminates.

To make the halting problem decidable one would have to do one of two
things: Depend on memory size limits, or have a language that really is
less expressive, at a very deep level, than any of the languages
mentioned in the newsgroups header for this message.

I don't think Vesa was talking about trying to solve the halting problem.

A type system that required termination would indeed significantly restrict
language expressiveness -- mainly because many interactive processes are
*intended* not to terminate.

Most interactive processes are written in such a way that they
(effectively) consist of an infinitely repeated application of some
function f that maps the current state and the input to the new state
and the output.

f : state * input -> state * output

This function f itself has to terminate, i.e., it has to be
guaranteed that after any given input, there will eventually be an
output. In most interactive systems the requirements are in fact much
stricter: the output should come "soon" after the input has been
received.

I am pretty confident that the f for most (if not all) existing
interactive systems could be coded in a language that enforces
termination. Only the loop that repeatedly applies f would have to be
coded in a less restrictive language.
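Matthias's structure can be sketched in JavaScript (the names `f` and `run` are assumptions for illustration, and the conceptually infinite input stream is replaced by a finite array so the sketch itself terminates):

```javascript
// f : state * input -> state * output. Each call does a bounded amount
// of work, so f is the part that a termination-enforcing language could
// accept.
function f(state, input) {
  const newState = state + input;
  const output = "total so far: " + newState;
  return [newState, output];
}

// The outer event loop: only this part would need a language that
// permits non-termination (here it ends when the input array does).
function run(initialState, inputs) {
  let state = initialState;
  const outputs = [];
  for (const input of inputs) {
    const [nextState, output] = f(state, input);
    state = nextState;
    outputs.push(output);
  }
  return outputs;
}

run(0, [1, 2, 3]);
// ["total so far: 1", "total so far: 3", "total so far: 6"]
```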

This is absolutely consistent with what I said. While f could be coded in
a system that *required* termination, the outer loop could not.

As I mentioned in a follow-up, event loop languages such as E enforce this
program structure, which would almost or entirely eliminate the need for
annotations in a type system that proves termination of some subprograms.
 

rossberg

Gabriel said:
|
| (Unfortunately, you can hardly write interesting programs in any safe
| subset of C.)

Fortunately, some people do, as a living.

I don't think so. Maybe the question is what a "safe subset" consists
of. In my book, it excludes all features that are potentially unsafe.
So in particular, C-style pointers are out, because they can easily
dangle, be uninitialised, whatever. Can you write a realistic C
program without using pointers?

- Andreas
 

Chris F Clark

Chris said:
I'm particularly interested if something unsound (and perhaps
ambiguous) could be called a type system. I definitely consider such
things type systems.

Marshall said:
I don't understand. You are saying you prefer to investigate the
unsound over the sound? ....
Again, I cannot understand this. In a technical realm, vagueness
is the opposite of understanding.

At the risk of injecting too much irrelevant philosophy into the
discussion, I will with great trepidation reply.

First, in the abstract: No, understanding is approximating. The world
is inherently vague. We make false symbolic models of the world which
are consistent, but at some level they do not reflect reality, because
reality isn't consistent. Only by abstracting away the inherent
infinite amount of subtlety present in the real universe can we come to
comprehensible models. Those models can be consistent, but they are
not the universe. The models, in their consistency, prove things which
are not true about the real universe.

Now, in the concrete: In my work, productivity is ultimately important.
Therefore, we live by the 80-20 rule in numerous variations. One of
the things we do to achieve productivity is simplify things. In fact,
we are more interested in an unsound simplification that is right 80%
of the time, but takes only 20% of the effort to compute, than a
completely sound (100% right) model which takes 100% of the time to
compute (which is 5 times as long). We are playing the probabilities.

It's not that we don't respect the sound underpinning, the model which
is consistent and establishes certain truths. However, what we want
is the rule of thumb which is far simpler and approximates the sound
model with reasonable accuracy. In particular, we accept two types of
unsoundness in the model. One, we accept a model which gives wrong
answers which are detectable. We code tests to catch those cases, and
use a different model to get the right answer. Two, we accept a model
which gets the right answer only when over-provisioned. For example,
if we know a loop will occasionally not converge in the n steps that
it theoretically should, we will run the loop for n+m steps until the
approximation also converges--even if that requires allocating m more
copies of some small element than are theoretically necessary. A
small waste of a small resource is OK if it saves a waste of a more
critical resource--the most critical resource of all being our project
timeliness.
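The over-provisioned loop described above might look like this hypothetical sketch (the function names, the Newton-style step function, and the tolerance are all invented for illustration):

```javascript
// Over-provisioning: run an iteration for up to n + m steps instead of
// the theoretical n, accepting a few wasted steps in exchange for
// convergence in the occasional corner case.
function iterateWithMargin(step, x0, n, m, eps) {
  let x = x0;
  for (let i = 0; i < n + m; i++) {
    const next = step(x);
    if (Math.abs(next - x) < eps) return next;  // converged
    x = next;
  }
  throw new Error("did not converge within n + m steps");
}

// Example: Newton's method for sqrt(2); n = 5 usually suffices, the
// margin m = 5 covers slow corners.
const root = iterateWithMargin(x => (x + 2 / x) / 2, 1, 5, 5, 1e-12);
```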

We do the same things in our abstract reasoning (including our type
model). The problems we are solving are too large for us to
understand. Therefore, we make simplifications, simplifications we
know are inaccurate and potentially unsound. Our algorithm then
solves the simplified model, which we think covers the main portion of
the problem. It also attempts to detect when it has encountered a
problem that is outside of the simplified region, on the edge so to
speak. Now, in general, we can figure out how to handle certain edge
cases, and we do. However, sometimes the edge of one part of the
algorithm interacts with an edge of another part of the algorithm and
we get a "corner case". Because our model is too simple and the
reasoning it allows is thus unsound, the algorithm will occasionally
compute the wrong results for some of those corners. We use the same
techniques for dealing with them. Dynamic tagging is one way of
catching when we have gotten the wrong answer in our type
simplification.

Marshall's last point:
I flipped a coin to see who would win the election; it came
up "Bush". Therefore I *knew* who was going to win the
election before it happened. See the problem?

Flipping one coin to determine an election is not playing the
probabilities correctly. You need a plausible explanation for why the
coin should predict the right answer and a track record of it being
correct. If you had a coin that had correctly predicted the previous
42 presidencies and you had an explanation why the coin was right,
then it would be credible and I would be willing to wager that the
coin could also tell us who the 44th president would be. One flip and
no explanation is not sufficient. (And to the abstract point, to me
that is all knowledge is: some convincing amount of examples and a
plausible explanation--anyone who thinks they have more is appealing
to a "knowledge" of the universe that I don't accept.)

I do not want incredible unsound reasoning, just reasoning that is
marginal, within a margin of safety that I can control. Everyone I
have known of has, as far as I can tell, made simplifying assumptions
to make the world tractable. Very rarely do we deal with the world
completely formally. Look at where that got Russell and Whitehead.
I'm just trying to be "honest" about that fact and find ways to
compensate for my own failures.

-Chris
 

Chris Smith

Joachim Durchholz said:
Sorry, I have to insist that it's not me who's stretching terms here.

All textbook definitions that I have seen define a type as the
set/operations/axioms triple I mentioned above.
No mention of immutability, at least not in the definitions.

The immutability comes from the fact (perhaps implicit in these
textbooks, or perhaps they are not really texts on formal type theory)
that types are assigned to expressions, and the program is not likely to
change as it executes. Even such oddities as "self-modifying code" are
generally formally modeled as a single program, which reduces to other
programs as it evaluates just like everything else. The original
program doesn't change in the formal model.

Hence the common (and, I think, meaningless) distinction that has been
made here: that static type systems assign types to expressions
(sometimes I hear variables, which is limiting and not really correct),
while dynamic type systems do so to values. However, that falls apart
when you understand what formal type systems mean by types. What they
mean, roughly, is "that label which we attach to an expression
according to fixed rules to help facilitate our proofs". Since dynamic
type systems don't do such proofs, I'm having trouble understanding what
could possibly be meant by assigning types to values, unless we redefine
type. That's what I've been trying to do. So far I've had only limited
success, although there is definite potential in one post by Chris Uppal
and another by Chris Clark (or maybe I'm just partial to other people
named Chris).
 

Scott David Daniels

Andreas said:
Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not. A soundness proof is obligatory for any serious
type theory, and failure to establish it simply is a bug in the theory.

So you claim Java and Objective C are "simply bugs in the theory."

--Scott David Daniels
 

Gabriel Dos Reis

(e-mail address removed)-sb.de writes:

[...]

| > and even C is typesafe unless you use unsafe constructs.
|
| Tautology. Every language is "safe unless you use unsafe constructs".
| (Unfortunately, you can hardly write interesting programs in any safe
| subset of C.)

Fortunately, some people do, as a living.

I must confess I'm sympathetic to Bob Harper's view expressed in

http://www.seas.upenn.edu/~sweirich/types/archive/1999-2003/msg00298.html

-- Gaby
 

Marshall

Joachim said:
What else?
(Genuinely curious.)

I left off a big one: nominal vs. structural

Also: has subtyping polymorphism or not, has parametric polymorphism or
not.


Marshall
 

Chris Smith

Anton van Straaten said:
The problem is that there are no useful sound definitions for the type
systems (in the static sense) of dynamically-typed languages. Yet, we
work with type-like static properties in those languages all the time,
as I've been describing.

I honestly don't find it very compelling that there are similarities
between the kinds of reasoning that go on in static type systems
versus those used by programmers in dynamically typed languages.
Rather, of course there are similarities. It would be shocking if there
were not similarities. These similarities, though, have nothing to do
with the core nature of types.

Basically, there are ways to reason about programs being correct.
Static type systems describe their reasoning (which is always axiomatic
in nature) by assigning "types" to expressions. Programmers in
dynamically typed languages still care about the correctness of their
programs, though, and they find ways to reason about their programs as
well. That reasoning is not completely axiomatic, but we spend an
entire lifetime applying this basic logical system, and it makes sense
that programmers would try to reason from premises to conclusions in
ways similar to a type system. Sometimes they may even ape other type
systems with which they are familiar; but even a programmer who has
never worked in a typed language would apply the same kind of logic as
is used by type systems, because it's a form of logic with which
basically all human beings are familiar and have practiced since the age
of three (disclaimer: I am not a child psychologist).

On the other hand, programmers don't restrict themselves to that kind of
pure axiomatic logic (regardless of whether the language they are
working in is typed). They also reason inductively -- in the informal
sense -- and are satisfied with the resulting high probabilities. They
generally apply intuitions about a problem statement that is almost
surely not written clearly enough to be understood by a machine. And so
forth.

What makes static type systems interesting is not the fact that these
logical processes of reasoning exist; it is the fact that they are
formalized with definite axioms and rules of inference, performed
entirely on the program before execution, and designed to be entirely
evaluatable in finite time bounds according to some procedure or set of
rules, so that a type system may be checked and the same conclusion
reached regardless of who (or what) is doing the checking. All that,
and they still reach interesting conclusions about programs. If
informal reasoning about types doesn't meet these criteria (and it
doesn't), then it simply doesn't make sense to call it a type system
(I'm using the static terminology here). It may make sense to discuss
some of the ways that programmer reasoning resembles types, if indeed
there are resemblances beyond just that they use the same basic rules of
logic. It may even make sense to try to identify a subset of programmer
reasoning that even more resembles... or perhaps even is... a type
system. It still doesn't make sense to call programmer reasoning in
general a type system.

In the same post here, you simultaneously suppose that there's something
inherently informal about the type reasoning in dynamic languages; and
then talk about the type system "in the static sense" of a dynamic
language. How is this possibly consistent?
We know that we can easily take dynamically-typed program fragments and
assign them type schemes, and we can make sure that the type schemes
that we use in all our program fragments use the same type system.

Again, it would be surprising if this were not true. If programmers do,
in fact, tend to reason correctly about their programs, then one would
expect that there are formal proofs that could be found that they are
right. That doesn't make their programming in any way like a formal
proof. I tend to think that a large number of true statements are
provable, and that programmers are good enough to make a large number of
statements true. Hence, I'd expect that it would be possible to find a
large number of true, provable statements in any code, regardless of
language.
So we actually have quite a bit of evidence about the presence of static
types in dynamically-typed programs.

No. What we have is quite a bit of evidence about properties remaining
true in dynamically typed programs, which could have been verified by
static typing.
We're currently discussing something that so far has only been captured
fairly informally. If we restrict ourselves to only what we can say
about it formally, then the conversation was over before it began.

I agree with that statement. However, I think the conversation
regarding static typing is also over when you decide that the answer is
to weaken static typing until the word applies to informal reasoning.
If the goal isn't to reach a formal understanding, then the word "static
type" shouldn't be used; and when that is the goal, it still shouldn't
be applied to processes that aren't formalized until they manage to
become formalized.

Hopefully, this isn't perceived as too picky. I've already conceded
that we can use "type" in a way that's incompatible with all existing
research literature. I would, however, like to retain some word which
actually has that meaning. Otherwise, we are just going to be
linguistically prevented from discussing it.
 
