What is Expressiveness in a Computer Language

Marshall

Chris said:
/Given/ a set of operations which I want to forbid,
I would like to say that a "dynamic type system" which prevents them executing
is doing very much the same job as a static type system which stops the code
compiling.

There are similarities (classifying values into categories, and checking
whether functions are sensibly invoked according to these categories)
and there are differences (existential quantification of success vs.
universal quantification of success).
They are certainly not identical. (I'm not sure if by "very much the
same" you meant identical or not.)

Of course, we can talk about what kinds of operations we want to forbid, but
that seems (to me) to be orthogonal to this discussion. Indeed, the question
of dynamic/static is largely irrelevant to a discussion of what operations we
want to police, except insofar as certain checks might require information
which isn't available to a (feasible) static theorem prover.

Sometimes the things you want to check are emergent properties
of a program rather than local properties. I don't see how runtime
checks can do anything with these properties, at least not without
a formal framework. An example is deadlock freedom. There
exist type systems that can prove code free of deadlock or
race conditions. How would you write a runtime check for that?
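
To make that concrete, here is a minimal sketch (the class and code are
mine, purely illustrative): a program that deadlocks only under one
unlucky interleaving. A runtime check can at best witness such a run;
it can never certify all runs, which is what the property demands.

public class MayDeadlock {
    public static void main(String[] args) {
        final Object a = new Object(), b = new Object();
        // t1 takes a then b; t2 takes b then a: a lock-order cycle
        Thread t1 = new Thread(() -> { synchronized (a) { synchronized (b) { } } });
        Thread t2 = new Thread(() -> { synchronized (b) { synchronized (a) { } } });
        t1.start();
        t2.start();
        // most runs finish; occasionally both threads block forever
    }
}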


Marshall
 
Marshall

Chris said:
I find it too weak, as if to say: "well, ok, it can't actually corrupt memory
as such, but the program logic is still apt to go all over the shop"...

It is impossible to evaluate your claim without a specific definition
of "go all over the shop." If it means subject to Method Not
Understood,
then we would have to say that Smalltalk can in fact go all over
the shop. If it means subject to memory corruption, then it can't.

In any event, the denotation is quite clear: Smalltalk does not
formally assign types to expressions during a static analysis
phase, and Smalltalk is not subject to memory corruption.

I have a problem with the whole line of reasoning that goes
"someone might think term x means something bad, so we
can't use it." It's unfalsifiable. It also optimizes for malicious
use of the terms. Both are bad properties to have as design
principles, at least in this context.


Marshall
 
Patricia Shanahan

Chris said:
Patricia, perhaps I'm misunderstanding you here. To make the halting
problem decidable is quite a different thing than to say "given a
program, I may be able to prove that it terminates" (so long as you also
acknowledge the implicit other possibility: I also may not be able to
prove it in any finite bounded amount of time, even if it does
terminate!)

The fact that the halting problem is undecidable is not a sufficient
reason to reject the kind of analysis that is performed by programmers
to convince themselves that their programs are guaranteed to terminate.
Indeed, proofs that algorithms are guaranteed to terminate even in
pathological cases are quite common. Indeed, every assignment of a
worst-case time bound to an algorithm is also a proof of termination.

This is, of course, a very good reason to reject the idea that the
static type system for any Turing-complete language could ever perform
this same kind of analysis.

Of course, we can, and should, prefer programs we can prove terminate.
Languages can make it easier or harder to write provably terminating
programs.

Patricia
 
Marshall

Andreas said:
... And the reason is that "type" has a
well-established use in theory. It is not just my "assumption"; it has
been established practice for 80 or so years.

Wouldn't it be fair to say it goes back at least to
Principia Mathematica, 1910?

So far, this discussion has
not revealed the existence of any formal work that would provide a
theory of "dynamic types" in the sense it is used to characterise
"dynamically typed" languages.

Indeed, the idea I am starting to get is that much of what is going
on in the dynamic/latent/untyped/whatever world is simply informal,
conceptual-level static analysis. The use of the runtime system
is a formal assistant to this informal analysis. In other words,
formal existential quantification is used as a hint to help do
informal universal quantification.

Now, given that explanation, the formal analysis looks better.
But that's not the whole story. We also have the fact that
the informal analysis gets to run on the human brain, while
the static analysis has to run on a lousy microprocessor
with many orders of magnitude less processing power.
We have the fact that the formal static analysis is of
limited power. We have the fact that historically,
static languages have forbidden the use of the formal
existential method preferred by the DT crowd until
*after* the static analysis has passed, thus defeating
their purpose. And probably still more things that I
don't appreciate yet. So I'll now throw around some
terms (in an attempt to look smart :) and say,
late binding, impredicativity.


Marshall
 
Chris Uppal

Chris said:
Perhaps, if you wanted to be quite careful about your distinctions, you
may want to choose different words for the two. However, then you run
into that same pesky "you can't choose other people's terminology" block
again.

There may be more flexibility in this area than you'd think. I get the
impression that (after only a few years hard work) the terms "weak typing" and
"strong typing", used as synonyms for what we are here calling "dynamic typing"
and "static typing", are starting to die out.

So be of good hope !

I may yet be persuaded to talk of "static type checks" vs "dynamic checks" (with
a slight, veiled, sneer on the word "type" -- oh, so very limiting ;-)

I'd be more careful in the analogy, there. Since type theory (as
applied to programming languages, anyway) is really just a set of
methods of proving arbitrary statements about program behaviors,

Not "arbitrary" -- there is only a specific class of statements that any given
type system can address. And, even if a given statement can be proved, then it
might not be something that most people would want to call a statement of type
(e.g. x > y).


It seems to me that most (all ? by definition ??) kinds of reasoning where we
want to invoke the word "type" tend to have a form where we reduce values (and
other things we want to reason about) to equivalence classes[*] w.r.t the
judgements we wish to make, and (usually) enrich that structure with
more-or-less stereotypical rules about which classes are substitutable for
which other ones. So that once we know what "type" something is, we can
short-circuit a load of reasoning based on the language semantics, by
translating into type-land where we already have a much simpler calculus to
guide us to the same answer.

(Or representative symbols, or... The point is just that we throw away the
details which don't affect the judgements.)

a corresponding "informal type theory" would just be the entire set of
informal reasoning by programmers about their programs.

I think that, just as for static theorem proving, there is informal reasoning
that fits the "type" label, and informal reasoning that doesn't.

I would say that if informal reasoning (or explanations) takes a form
more-or-less like "X can [not] do Y because it is [not] a Z", then that's
taking advantage of exactly the short-cuts mentioned above, and I would want to
call that informal type-based reasoning. But I certainly wouldn't call most of
the reasoning I do type-based.

-- chris
 
Marshall

Joe said:
I think I mean a little of each of these.

Nearly *every* program I have ever used is buggy, so this is actually a
normal state of affairs even when I'm not debugging or developing.

That makes some sense to me.

It seems quite reasonable, as a language feature, to be
able to say to the computer: I know there are a few type
errors; just run this damn program as best you can.

I do this quite often. Sometimes I'll develop `in the debugger'. I'll
change some piece of code and run the program until it traps. Then,
without exiting the debugger, I'll fix the immediate problem and
restart the program at the point it trapped. This technique takes a
bit of practice, but if you are working on something that's complex and
has a lot of state, it saves a lot of time because you don't have to
reconstruct the state every time you make a change.

Wow, interesting.

(I say the following only to point out differing strategies of
development, not to say one or the other is right or bad or
whatever.)

I occasionally find myself doing the above, and when I do,
I take it as a sign that I've screwed up. I find that process
excruciating, and I do everything I can to avoid it. Over the
years, I've gotten more and more adept at trying to turn
as many bugs as possible into type errors, so I don't have
to run the debugger.

Now, there might be an argument to be made that if I had
been using a dynamic language, the process wouldn't be
so bad, and I wouldn't dislike it so much. But maybe:

As a strawman: suppose there are two broad categories
of mental strategies for thinking about bugs, and people
fall naturally into one way or the other, the way there
are morning people and night people. One group is
drawn to the static analysis, one group hates it.
One group hates working in the debugger, one group
is able to use that tool very effectively and elegantly.

Anyway, it's a thought.

It is often the case that the proximate cause of a bug is nowhere near
where the fix needs to be applied. This is especially true with the
really hard bugs. A static type system will tell me that there (may)
exist a path through the code that results in an error, but the runtime
check that fails will provide a witness.

Ah, the witness! I see the witness idea as supporting my thesis
that some people are using formal existential reasoning as a
tool for informal universal reasoning.

Very interesting. I want to again emphasize that I am not
interested in labelling one or the other of these approaches
as better or worse; rather, I want to understand both of
them, particularly to see if it is possible to have a single
language that incorporates both approaches.

This is not to say that I don't find one approach more appealing
than the other; rather it is to assert that I can tell the difference
between my personal preference and Truth. (Sometimes.)


Marshall
 
Marshall

Joe said:
What are you, some sort of neo-static pseudo-relational
object-disoriented fascist?

I prefer to think in terms of "object orientatedness."

But you're close. I'd say rather that I'm a quasi-theoretical,
neo-relational, crypto-logical fascist. But definitely fascist.

Once after a lengthy debate with a schemer, he
dubbed me (good naturedly) "Il Duce." Heh.


Marshall
 
Chris Smith

Chris said:
My immediate thought is that the same question is equally applicable (and
equally interesting -- i.e. no simple answer ;-) for static typing.

I don't see that. Perhaps I'm missing something. If I were to define a
type system for such a language that is vaguely Java-based but has
NotZero as a type, it would basically implement your option [D], and
contain the following rules:

1. NotZero is a subtype of Number
2. Zero is a subtype of Number

3. If x : Number and k : Zero, then "if (x != k) a else b" is well-typed
iff "a" is well-typed when the type-environment is augmented with the
proposition "x : NotZero" and "b" is well-typed when the type-
environment is augmented with the new proposition "x : Zero".

Yes, I know that Java doesn't do this right now in similar places (e.g.,
instanceof), and I don't know why not. This kind of thing isn't even
theoretically interesting at all, except that it makes the proof of
type-soundness look different. As far as I can see, it ought to be
second-nature in developing a type system for a language.
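
(As a sketch of the flow-sensitive idea in rule 3: later versions of
Java did add exactly this kind of environment augmentation for
instanceof, via pattern matching. The NotZero type itself stays
hypothetical; here the branch narrows Object to Integer instead.)

static int succOrZero(Object x) {
    if (x instanceof Integer i) {
        // inside this branch the type environment has been augmented
        // with the proposition "i : Integer", so "i + 1" is well-typed
        return i + 1;
    }
    return 0; // out here, no such proposition holds
}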

In that case, the first example is well-typed, and also runs as
expected. The second bit of code is not well-typed, and thus
constitutes a type error. The way we recognize it as a type error,
though, is in the fact that the compiler aborts while checking the
typing relation, rather than because of the kind of problem prevented.
Of course, it would be possible to build a static type system in which
different things are true. For example, Java itself is a system in
which both terms are well-typed.

From a dynamic type perspective, the litany of possibilities doesn't
really do much for defining a dynamic type system, which I thought was
our goal in this bit of the thread. I may not have been clear enough
that I was looking for a priori grounds to classify these two cases by
the definition. I was assuming the existing behavior of Java, which
says that both accomplish the same thing and react in the same way, yet
intuitively we seem to think of one as a type error being "solved",
whereas the other is not.

Anyway, you just wrote another message that is far closer to what I feel
we ought to be thinking about than my question here... so I will get to
reading that one, and perhaps we can take up this point later when and
if it is appropriate.
 
Marshall

Chris said:
How expressive does that end up being for real problems ? I mean obviously in
some sense it's crippling, but how much of a restriction would that be for
non-academic programming. Could I write an accountancy package in it, or an
adventure game, or a webserver, with not too much more difficulty than in a
similar language without that one aspect to its type system ?

Hmm, come to think of it those all have endless loops at the outer level, with
the body of the loop being an event handler, so maybe only the handler should
be required to be guaranteed to terminate.

An example of a non-Turing-complete language (at least the core
language; the standard is getting huge) with guaranteed termination,
that is used everywhere and quite useful, is SQL.

I wouldn't want to write an accounting package in it; its abstraction
facilities are too weak. But the language is much more powerful
than it is usually given credit for.


Marshall
 
Marshall

Darren said:
Now substitute "<" for "+" and see if you can make the same argument. :)

If your point is about overloading, then I don't see how it affects my
point. My point was, I don't see why you'd be writing code using
operators that you thought might not be applicable to the operands.


Marshall
 
Marshall

Chris said:
Marshall said:
Ouch; I have a really hard time understanding this.

I can't see how you'd call + on a and b if you think they might
not be numbers. If they could be something other than numbers,
and you're treating them as if they are, is that sort of like
doing a case analysis and only filling in one of the cases?
If so, wouldn't you want to record that fact somehow?

The obvious answer -- I don't know if it's what Pascal meant -- is that
they might be 4x4 matrices, or anything else that behaves predictably
under some operation that could be called addition. As a mathematical
analogy, the entire mathematical field of group theory comes from the
idea of performing operations on values without really knowing what the
values (or the operations) really are, but only knowing a few axioms by
which the relationship between the objects and the operation is
constrained.

[As an interesting coincidence, group theory papers frequently use the
addition symbol to represent the (arbitrary) binary relation of an
Abelian group. Not that this has anything to do with choosing "+" for
the example here.]

Sure. This is what abstract algebra is all about; group theory is
just an example. In particular, much of Stepanov's life work
has been developing abstract algebra within the framework
of a static type system.

http://www.stepanovpapers.com/
In particular, see the paper "Greatest Common Measure, the
Last 2500 Years."

Programming languages do this all the time, as well. The most popular
example is the OO sense of the word polymorphism. That's all about
being able to write code that works with a range of values regardless of
(or, at least, a range that is less constraining than equality in) types.

Exactly.


Marshall
 
Marshall

Chris said:
How is neither one less than complete?

In the same way that a mule is less than a complete horse.

It is a small point, and probably not worth much attention.
In fact, I was probably just being fussy in the first place.


Marshall
 
Marshall

Rob said:
Can I make a type in C that can only have values between 1 and 10?
How about a variable that can only hold odd numbers, or, to make it
more difficult, say fibonacci numbers?

Well, of course you can't in *C*; you can barely zip your pants with C.
But I believe you can do the above in C++, can't you?
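
(For comparison, a sketch of the usual move in Java, where the honest
answer is "not statically": a class whose constructor enforces the
invariant, so any OneToTen value that exists is known to be in range,
though the check itself runs at run time. The class is mine, purely
illustrative.)

final class OneToTen {
    final int value;
    OneToTen(int value) {
        if (value < 1 || value > 10)
            throw new IllegalArgumentException("out of range: " + value);
        this.value = value;
    }
}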


Marshall
 
Marshall

Chris said:
Marshall wrote:

[me:]
I have to object to the term "hybrid".

Java has a static type system.
Java has runtime tags and tag checks.

It has both, agreed, but that isn't the full story. I think I explained what I
meant, and why I feel the term is justified as well as I can in the second half
of the paragraph you quoted. I doubt if I can do better.

Maybe you are thinking that I mean that /because/ the JVM does verification,
etc, at "runtime" the system is hybrid ?

Anyway that is /not/ what I mean. I'm (for these purposes) completely
uninterested in the static checking done by the Java to bytecode translator,
javac. I'm interested in what happens to the high-level, statically typed, OO
language called "java bytecode" when the JVM sees it. That language has a
strict static type system which the JVM is required to check. That's a
/static/ check in my book -- it happens before the purportedly correct code is
accepted, rather than while that code is running.

I am also completely uninterested (for these immediate purposes) in the run
time checking that the JVM does (the stuff that results in
NoSuchMethodException, and the like). In the wider context of the thread, I do
want to call that kind of thing (dynamic) type checking -- but those checks are
not why I call the JVMs type system hybrid either.

Oh well, having got that far, I may as well take another stab at "hybrid".
Since the JVM is running a static type system without access to the whole text
of the program, there are some things that it is expected to check which it
can't. So it records preconditions on classes which might subsequently be
loaded. Those are added to the static checks on future classes, but -- as far
as the original class is concerned -- those checks happen dynamically. So part
of the static type checking which is supposed to happen, has been postponed to
a dynamic check. It's that, and /only/ that which I meant by "hybrid".

That's fair, I guess. I wouldn't use the terms you do, but maybe that
doesn't matter very much assuming we understand each other
at this point.



Marshall
 
Darren New

Marshall said:
point. My point was, I don't see why you'd be writing code using
operators that you thought might not be applicable to the operands.

I see. I thought your point was that you might write code for which you
didn't know the type of arguments. Unless you equate "type" with what
Smalltalk calls "protocol" (i.e., the type is the collection of
operators applicable to values in the type), these are two different
statements.
 
Duane Rettig

Marshall said:
Ouch; I have a really hard time understanding this.

I can't see how you'd call + on a and b if you think they might
not be numbers. If they could be something other than numbers,
and you're treating them as if they are, is that sort of like
doing a case analysis and only filling in one of the cases?
If so, wouldn't you want to record that fact somehow?

But suppose + were polymorphic and someone (dynamically) added a
method on + to handle strings? If the type error were that a and b
happened to be strings, there would be at least two courses of action
in the development of the program: limit a and b so that they are
never strings at this point, or define a method on + to add correct
functionality for the + operator. Note that I use the term "correct",
because the correctness of the subprogram (including any added methods
on +) depends both on the correctness of usage within the language,
and also correctness of extensibilities allowed by the language for
the purpose of satisfying goals of the program. In the many cases
where the eventual goals of the program are not known at the beginning
of the design, this extensibility is a good thing to have.

Note that the above paragraph explicitly leaves out Common Lisp,
because + is not extensible in CL. CL does, however, allow for methods
to be defined on functions not already defined by the standard, so
the argument could still apply to some generic function named my-+,
or even on my-package::+ where the CL defined + operator is shadowed.
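
A rough sketch of what "dynamically adding a method on +" could look
like outside Lisp (all names are mine; this imitates CLOS-style generic
functions in Java, dispatching on the first argument's class):

import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

final class GenericPlus {
    private static final Map<Class<?>, BinaryOperator<Object>> methods = new HashMap<>();

    static void defineMethod(Class<?> c, BinaryOperator<Object> op) {
        methods.put(c, op);
    }

    static Object plus(Object a, Object b) {
        BinaryOperator<Object> op = methods.get(a.getClass());
        if (op == null)
            throw new IllegalArgumentException("no method on + for " + a.getClass());
        return op.apply(a, b);
    }
}
// later, while the program runs:
//   GenericPlus.defineMethod(String.class, (a, b) -> (String) a + (String) b);
// and plus("abc", "def") now works instead of signalling an error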
 
Duane Rettig

Darren New said:
Now substitute "<" for "+" and see if you can make the same argument. :)

Sure. Properly extended to handle strings, "abc" < "def" might return
true.
 
Chris Smith

Chris said:
Not "arbitrary" -- there is only a specific class of statements that any given
type system can address. And, even if a given statement can be proved, then it
might not be something that most people would want to call a statement of type
(e.g. x > y).

[My understanding is that we're discussing static types and type theory
here. If you mean dynamic types, then what I'm about to say won't
apply.]

I acknowledge that there are things that type systems are not going to
be capable of proving. These are not excluded a priori from the set of
problems that would count as type errors if the language's type system
caught them. They are merely excluded for pragmatic reasons from the
set of errors that someone would be likely to write a type system to
check. That is the case simply because there is a decided disadvantage
in having a compiler that isn't guaranteed to ever finish compiling your
program, or that doesn't finish in polynomial time on some
representative quantity in nearly all cases, etc.

On the other hand, statements about program behavior that can be proved
in reasonable time frames are universally valid subjects for type
systems. If they don't look like type errors now, they nevertheless
will when the type system is completed to demonstrate them. If someone
were to write a language in which it could be statically proved that a
pair of expressions has the property that when each is evaluated to
their respective values x and y, (x > y) will be true, then proving this
would be a valid subject of type theory. We would then define that
property as a type, whose values are all pairs that have the property
(or, I suppose, a nominal type whose values are constrained to be pairs
that have this property, though that would get painful very quickly),
and then checking this property when that type is required -- such as
when I am passing (x - y) as a parameter to a function that requires a
positive integer -- would be called type checking.
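
(Sketch only, names mine: Java can't prove x > y statically, but the
shape of the idea shows up if the property is reified as a type whose
values can only be constructed once the property holds; the constructor
check is the run-time stand-in for the static proof.)

final class Positive {
    final int value;
    private Positive(int value) { this.value = value; }
    static Positive of(int n) {
        if (n <= 0)
            throw new IllegalArgumentException(n + " is not positive");
        return new Positive(n);
    }
}
// a function declared as f(Positive p) may rely on p.value > 0;
// passing (x - y) means writing f(Positive.of(x - y))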

In fact, we've come full circle. The initial point on which I disagreed
with Torben, while it was couched in a terminological dispute, was
whether there existed a (strict) subset of programming errors that are
defined to be type errors. Torben defined the relationship between
static and dynamic types in terms of when they solved "type errors",
implying that this set is the same between the two. A lot of the
questions I'm asking elsewhere in this thread are chosen to help me
understand whether such a set truly exists for the dynamic type world;
but I believe I know enough that I can absolutely deny its existence
within the realm of type theory. I believe that even Pierce's phrase
"for proving the absence of certain program behaviors" in his definition
of a type system only excludes possible proofs that are not interesting
for determining correctness, if in fact it limits the possible scope of
type systems at all.

However, your definition of type errors may be relevant for
understanding concepts of dynamic types. I had understood you to say
earlier that you thought something could validly qualify as a dynamic
type system without regard for the type of problem that it verifies
(that was denoted by "something" in a previous conversation). I suppose
I was misinterpreting you.

So your definition was:
It seems to me that most (all ? by definition ??) kinds of reasoning where we
want to invoke the word "type" tend to have a form where we reduce values (and
other things we want to reason about) to equivalence classes[*] w.r.t the
judgements we wish to make, and (usually) enrich that structure with
more-or-less stereotypical rules about which classes are substitutable for
which other ones. So that once we know what "type" something is, we can
short-circuit a load of reasoning based on the language semantics, by
translating into type-land where we already have a much simpler calculus to
guide us to the same answer.

(Or representative symbols, or... The point is just that we throw away the
details which don't affect the judgements.)

I don't even know where I'd start in considering the forms of reasoning
that are used in proving things. Hmm. I'll have to ponder this for a
while.

I think that, just as for static theorem proving, there is informal reasoning
that fits the "type" label, and informal reasoning that doesn't.

So while I disagree in the static case, it seems likely that this is
true of what is meant by dynamic types, at least by most people in this
discussion. I'd previously classified you as not agreeing.
 
David Hopwood

Chris said:
Maybe you are thinking that I mean that /because/ the JVM does verification,
etc, at "runtime" the system is hybrid ?

Anyway that is /not/ what I mean. I'm (for these purposes) completely
uninterested in the static checking done by the Java to bytecode translator,
javac. I'm interested in what happens to the high-level, statically typed, OO
language called "java bytecode" when the JVM sees it. That language has a
strict static type system which the JVM is required to check. That's a
/static/ check in my book -- it happens before the purportedly correct code is
accepted, rather than while that code is running.

I am also completely uninterested (for these immediate purposes) in the run
time checking that the JVM does (the stuff that results in
NoSuchMethodException, and the like). In the wider context of the thread, I do
want to call that kind of thing (dynamic) type checking -- but those checks are
not why I call the JVMs type system hybrid either.

Oh well, having got that far, I may as well take another stab at "hybrid".
Since the JVM is running a static type system without access to the whole text
of the program, there are some things that it is expected to check which it
can't. So it records preconditions on classes which might subsequently be
loaded.

It does, but interestingly, it wasn't originally intended to. It was intended
to be possible to check each classfile based only on that class' explicit
dependencies, without having to record constraints on the loading of future
classes.

However, the designers made various mistakes with the design of ClassLoaders
which resulted in the original JVM type system being unsound, and hence insecure.
The problem is described in
<http://www.cis.upenn.edu/~bcpierce/courses/629/papers/Saraswat-javabug.html>
(which spells my name wrong, I just noticed).

The constraint-based approach, proposed by Liang and Bracha in
<http://www.cs.purdue.edu/homes/jv/smc/pubs/liang-oopsla98.pdf>, was adopted
in order to fix this. There were other, simpler, solutions (for example,
restricting ClassLoader delegation to a strict tree structure), but the
responsible people at Sun felt that they were insufficiently expressive.
I thought, and still think, that they were mistaken. It is important for the
type system to be no more complicated than necessary if there is to be any
confidence in the security of implementations.

In any case, as shown by
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4670071>, current JVM
implementations do *not* reliably support the patterns of ClassLoader delegation
that were intended to be enabled by Liang and Bracha's approach.

[This "bug" (it should be classed as an RFE) is, inexplicably given how few
programs are affected by it, the most voted-for bug in Sun's tracking system.
I get the impression that most of the commentators don't understand how
horribly complicated it would be to fix. Anyway, it hasn't been fixed for
4 years.]
 
David Hopwood

Pascal said:
What about this: You get a type error when the program attempts to
invoke an operation on values that are not appropriate for this operation.

Examples: adding numbers to strings; determining the string-length of a
number; applying a function on the wrong number of parameters; applying
a non-function; accessing an array with out-of-bound indexes; etc.

This makes essentially all run-time errors (including assertion failures,
etc.) "type errors". It is neither consistent with, nor any improvement
on, the existing vaguely defined usage.
 
