What is Expressiveness in a Computer Language

David Hopwood

Chris said:
I don't think that's true. Maybe /some/ people do confuse the two, but I am
certainly a counter-example ;-)

The tag (if any) is part of the runtime machinery (or, if not, then I don't
understand what you mean by the word), and while that is certainly a reasonable
approximation to the type of the object/value, it is only an approximation,
and -- what's more -- is only an approximation to the type as yielded by one
specific (albeit abstract, maybe even hypothetical) type system.

Yes. I should perhaps have mentioned that people sometimes mean "protocol"
rather than "tag" or "type" (a protocol being the set of messages that an object
can respond to, roughly speaking).
If I send #someMessage to a proxy object which has not had its referent set
(and assuming the default value, presumably some variant of nil, does not
understand #someMessage), then that's just as much a type error as sending
#someMessage to a variable holding a nil value.

It's an error, certainly. People usually call it a type error. But does that
terminology actually make sense?

Typical programming languages have many kinds of semantic error that can occur
at run-time: null references, array index out of bounds, assertion failures,
failed casts, "message not understood", ArrayStoreExceptions in Java,
arithmetic overflow, divide by zero, etc.

Conventionally, some of these errors are called "type errors" and some are
not. But there seems to be little rhyme or reason to this categorization, as
far as I can see. If in a particular language, both array index bounds errors
and "message not understood" can occur at run-time, then there's no objective
reason to call one a type error and the other not. Both *could* potentially
be caught by a type-based analysis in some cases, and neither *is* caught
by such an analysis in that language.

A more consistent terminology would reserve "type error" for errors that
occur when a typechecking/inference algorithm fails, or when an explicit
type coercion or typecheck fails.

According to this view, the only instances where a run-time error should be
called a "type error" are:

- a failed cast, or no match for any branch of a 'typecase' construct.
Here the construct that fails is a coercion of a value to a specific type,
or a check that it conforms to that type, and so the term "type error"
makes sense.

- cases where a typechecking/inference algorithm fails at run-time (e.g.
in a language with staged compilation, or dynamic loading with link-time
typechecking).

In other cases, just say "run-time error".
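
For concreteness, here is a minimal Python sketch of the proposed
distinction (the as_string helper is invented for illustration): a failed
explicit typecheck or coercion is a "type error"; an out-of-range index is
just a "run-time error".

    def as_string(x):
        # Explicit coercion/typecheck; a failure here is a "type error".
        if not isinstance(x, str):
            raise TypeError("expected str, got " + type(x).__name__)
        return x

    items = ["a", "b", "c"]

    try:
        as_string(42)        # explicit typecheck fails -> "type error"
    except TypeError as e:
        print("type error:", e)

    try:
        items[7]             # bounds failure -> just a "run-time error"
    except IndexError as e:
        print("run-time error:", e)
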
If I then assign the referent
of the proxy to some object which does understand #someMessage, then it is not
a type error to send #someMessage to the proxy. So the type has changed, but
nothing in the tag system of the language implementation has changed.

In the terminology I'm suggesting, the object has no type in this language
(assuming we're talking about a Smalltalk-like language without any type system
extensions). So there is no type error, and no inconsistency.

Objects in this language do have protocols, so this situation can be described
as a change to the object's protocol, which changes whether a given message
causes a protocol error.
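
A rough Python analogue of the proxy example (Python standing in for
Smalltalk; the Proxy class and names are invented): nothing in the tag
machinery changes when the referent is set, yet the protocol error
disappears.

    class Proxy:
        def __init__(self):
            self._referent = None

        def __getattr__(self, name):
            # Forward every message to the referent; None understands
            # almost nothing, like an unset proxy.
            return getattr(self._referent, name)

    p = Proxy()
    try:
        p.upper()                 # referent unset: a protocol error
    except AttributeError as e:
        print("protocol error:", e)

    p._referent = "hello"         # change the referent at run time
    print(p.upper())              # the same message now succeeds: HELLO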
 
Marshall

Matthias said:
No, it is not a contradiction. See the mathematical usage of the word
"variable".

I am not saying that this kind of terminology isn't common; what
I'm saying is that it isn't good. And I think I gave a pretty clear
and solid definition of the two words, and those definitions are
decidedly contradictory.

Immutable variables can stand for different values at
different times, even without mutation involved, usually because they
are bound by some construct such as lambda.

Well, that's a good point actually. Parameters are variable
no matter how you look at it. I was speaking more in terms
of locals.

Mutation is not the only way for an expression to evaluate to
different values over time.

I suppose the difficulty arises from the difference
between the connotation of "mutate" in a PLT context, which is
support for destructive assignment, and the meaning of the word
in the larger context.

Anyway, I don't see how your sentence above contradicts
my sentence. We do not use the term "update" to describe
the process of binding values to parameters upon function
invocation. (Am I wrong?)


Marshall
 
Marshall

Chris said:
Really? I can see that in a strong enough static type system, many
dynamic typing features would become unobservable and therefore would be
pragmatically excluded from any probable implementations... but I don't
see any other kind of mutual exclusion between the two.

Well, it strikes me that some of what the dynamic camp likes
is the actual *absence* of declared types, or the necessity
of having them. At the very least, requiring types vs. not requiring
types is mutually exclusive.

But again, my dynamic kung fu is very weak, so I may not know
what I'm talking about when I represent the dynamic side.


Marshall
 
George Neuner

George Neuner sez:

My vague recollection is that no, it won't unless _you_ explicitly code an
unchecked type conversion. But it's been a while.


You can't totally prevent it ... if the index computation involves
types having a wider range, frequently the solution is to compute a
wide index value and then narrow it. But if the wider value is out of
range for the narrow type you have a problem.

Using the illegal wide value in a checked narrowing conversion will
throw a CONSTRAINT_ERROR exception - it doesn't matter whether you
access the array directly using the wide value or try to assign the
value to a variable of the narrow index type. Using the wide value
unchecked will access memory beyond the array, which is not what you
wanted and may cause a crash.

The point is really that the checks that prevent these things must be
performed at runtime and can't be replaced by any practical type
analysis performed at compile time. I'm not a type theorist but my
opinion is that a static type system that could, a priori, prevent the
problem is impossible.
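
To make this concrete, a small Python sketch of the same situation
(invented names; the explicit range check stands in for Ada's checked
narrowing conversion and its CONSTRAINT_ERROR):

    NARROW_LO, NARROW_HI = 0, 9          # range of the narrow index type
    table = [i * i for i in range(NARROW_LO, NARROW_HI + 1)]

    def narrow(wide):
        # Checked narrowing conversion: this test can only happen at run time.
        if not (NARROW_LO <= wide <= NARROW_HI):
            raise OverflowError(str(wide) + " out of range for narrow index")
        return wide

    wide_index = 3 * 4                   # computed in the "wider" type
    try:
        print(table[narrow(wide_index)]) # the narrowing check fires
    except OverflowError as e:
        print("constraint error:", e)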

George
 
David Hopwood

Rob said:
Vesa said:
In comp.lang.functional Anton van Straaten said:
Let me add another complex subtlety, then: the above description misses
an important point, which is that *automated* type checking is not the
whole story. I.e. that compile time/runtime distinction is a kind of
red herring.

I agree. I think that instead of "statically typed" we should say
"typed" and instead of "(dynamically|latently) typed" we should say
"untyped". [...]
It's certainly close enough to say that the *language* is untyped.

Indeed. Either a language has a type system and is typed or has no
type system and is untyped. I see very little room for confusion
here. In my experience, the people who confuse these things are
people from the dynamic/latent camp who wish to see types everywhere
because they confuse typing with safety or having well-defined
semantics.

No. It's because the things that we call latent types are used for the
same purpose that programmers of statically typed languages use static
types for.

Programmers of statically typed languages ensure that the value of some
expression is of some type by having the compiler check it. Programmers of
latently typed languages check, if they think it's important, by asking
what the type of the result is.

The objection here is that advocates of statically typed languages seem
to be claiming the "type" as their own word, and asking that others use
their definitions of typing, which are really specific to their
subjects of interest.

As far as I can tell, the people who advocate using "typed" and "untyped"
in this way are people who just want to be able to discuss all languages in
a unified terminological framework, and many of them are specifically not
advocates of statically typed languages.
 
Greg Buchholz

George said:
You can't totally prevent it ... if the index computation involves
types having a wider range, frequently the solution is to compute a
wide index value and then narrow it. But if the wider value is out of
range for the narrow type you have a problem.
....snip...

The point is really that the checks that prevent these things must be
performed at runtime and can't be replaced by any practical type
analysis performed at compile time. I'm not a type theorist but my
opinion is that a static type system that could, a priori, prevent the
problem is impossible.

I haven't been following this thread too closely, but I thought the
following article might be of interest...

Eliminating Array Bound Checking through Non-dependent types.
http://okmij.org/ftp/Haskell/types.html#branding

"There is a view that in order to gain static assurances such as an
array index being always in range or tail being applied to a non-empty
list, we must give up on something significant: on data structures such
as arrays (to be replaced with nested tuples), on general recursion, on
annotation-free programming, on clarity of code, on well-supported
programming languages. That does not have to be the case. The present
messages show non-trivial examples involving native Haskell arrays,
index computations, and general recursion. All array indexing
operations are statically guaranteed to be safe -- and so we can safely
use an efficient unsafeAt provided by GHC seemingly for that purpose.
The code is efficient; the static assurances cost us no run-time
overhead. The example uses only Haskell98 + higher-ranked types. No new
type classes are introduced. The safety is based on: Haskell type
system, quantified type variables, and a compact general-purpose
trusted kernel."
 
Chris F Clark

Chris said:
In that sense, a static type system is eliminating tags, because the
information is pre-computed and not explicitly stored as a part of the
computation. Now, you may not view the tag as being there, but in my
mind if there exists a way of performing the computation that requires
tags, the tag was there and that tag has been eliminated.

Joachim Durchholz replied:
On a semantic level, the tag is always there - it's the type (and
definitely part of an axiomatic definition of the language).
Tag elimination is "just" an optimization.

I agree the tag is always there in the abstract.

However, for the work I do the optimization of the tag at runtime is
important, and we specifically change things into types when we know
the system can do that optimization, because then we are getting the
system to do the work we would otherwise have to do and validating that the job
is done correctly. So, I care that the tag is eliminated in practice
(and remains in theory--I have to have both).

In the end, I'm trying to fit things which are "too big" and "too
slow" into much less space and time, using types to help me reliably
make my program smaller and faster is just one trick. It's a really
great and non-obvious one though, and one I'm glad I learned. Any
algebra I can learn that helps me solve my problems better is
appreciated.

However, I also know that my way of thinking about it is fringe. Most
people don't think that the purpose of types is to help one write
reliably tighter code.

Still, knowing about dynamic typing (tagging) and static typing helped
me understand this trick. Thus, conflating the two meanings may at
some level be confusing. However, for me, they aided understanding
something that I needed to learn.

-Chris
 
David Hopwood

Marshall said:
Well, it strikes me that some of what the dynamic camp likes
is the actual *absence* of declared types, or the necessity
of having them.

So why aren't they happy with something like, say, Alice ML, which is
statically typed, but has a "dynamic" type and type inference? I mean
this as a serious question.
At the very least, requiring types vs. not requiring
types is mutually exclusive.

Right, but it's pretty well established that languages that don't
require type *declarations* can still be statically typed.
 
George Neuner

I haven't been following this thread too closely, but I thought the
following article might be of interest...

Eliminating Array Bound Checking through Non-dependent types.
http://okmij.org/ftp/Haskell/types.html#branding


That was interesting, but the authors' method still involves runtime
checking of the array bounds. IMO, all they really succeeded in doing
was turning the original recursion into CPS and making the code a
little bit clearer.

George
 
Benjamin Franksen

Pascal said:
There is, of course, room for research on performing static type checks
in a running system, for example immediately after or before a software
update is applied, or maybe even on separate type checking on software
increments such that guarantees for their composition can be derived.
However, I am not aware of a lot of work in that area, maybe because the
static typing community is too focused on compile-time issues.

Not everyone is. For instance, Don Stewart has been enormously successful in
deploying such a system for Haskell (very much a statically typed language)
in a practically usable way. It is called hs-plugins (see
http://www.cse.unsw.edu.au/~dons/hs-plugins/), a framework for run-time
loading and re-loading of Haskell modules (in source form or already
compiled, giving different levels of security). Far from being a purely
academic exercise, there are interesting applications, including yi, an
extensible editor, and lambdabot, an IRC bot, both available from the same
home page.

Cheers,
Ben
 
Chris Smith

Marshall said:
Well, it strikes me that some of what the dynamic camp likes
is the actual *absence* of declared types, or the necessity
of having them. At the very least, requiring types vs. not requiring
types is mutually exclusive.

So you're saying, then, that while static typing and dynamic typing are
not themselves mutually exclusive, there are people whose concerns run
as much in the "not statically typed" direction as in the "dynamically
typed" direction? I agree that this is undoubtedly true. That (not
statically typed) seems to be what gets all the attention, as a matter
of fact. Most programmers in modern languages assume, though, that
there will be some kind of safeguard against writing bad code with
unpredictable consequences, so in practice "not statically typed"
correlates strongly with "dynamically typed".

Nevertheless, the existence of languages that are clearly "both"
suggests that they should be considered separately to at least some
extent.
 
Chris Smith

David Hopwood said:
Typical programming languages have many kinds of semantic error that can occur
at run-time: null references, array index out of bounds, assertion failures,
failed casts, "message not understood", ArrayStoreExceptions in Java,
arithmetic overflow, divide by zero, etc.

Conventionally, some of these errors are called "type errors" and some are
not. But there seems to be little rhyme or reason to this categorization, as
far as I can see. If in a particular language, both array index bounds errors
and "message not understood" can occur at run-time, then there's no objective
reason to call one a type error and the other not. Both *could* potentially
be caught by a type-based analysis in some cases, and neither *is* caught
by such an analysis in that language.

Incidentally, yes! Filtering out the terminology stuff [as hard as this
may be to believe on USENET where half the world seems to be trolls, I
really was not so aware when I originally posted of how some communities
use terminology and where the motivations come from], this was my
original point. In the static sense, there is no such thing as a type
error; only an error that's caught by a type system. I don't know if
the same can be said of dynamic types. Some people here seem to be
saying that there is a universal concept of "type error" in dynamic
typing, but I have yet to see a good precise definition (nor a good
precise definition of dynamic typing at all).

In either case it doesn't make sense, then, to compare how static type
systems handle type errors versus how dynamic systems handle type
errors. That's akin to comparing how many courses are offered at a five-star
restaurant versus how many are offered by the local university. (Yes, that's
an exaggeration, of course. The word "type" in the static/dynamic typing
worlds at least has a closer-to-common root.)
A more consistent terminology would reserve "type error" for errors that
occur when a typechecking/inference algorithm fails, or when an explicit
type coercion or typecheck fails.

I am concerned as to whether that actually would turn out to have any
meaning.

If we are considering array length bounds checking by type systems (and
just to confuse ourselves, by both static and dynamic type systems),
then is the error condition that gets raised by the dynamic system a
type error? Certainly, if the system keeps track of the fact that this
is an array of length 5, and uses that information to complain when
someone tries to treat the array as a different type (such as an array
of length >= 7, for example), then surely that's a type error, right?
Does the reference to the seventh index constitute an "explicit" type
coercion? I don't know. It seems pretty explicit to me, but I suspect
some may not agree.

The same thing would certainly be a type error in a static system, if
indeed the static system solved the array bounds problem at all.
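
To illustrate the question being asked (the SizedArray class is invented;
either exception choice is defensible, which is exactly the point):

    class SizedArray:
        def __init__(self, *items):
            self.items = list(items)

        def at(self, i):
            if not (0 <= i < len(self.items)):
                # Raising IndexError here would be equally defensible;
                # which name we pick is the terminological question.
                raise TypeError("index %d treats this as an array of length "
                                ">= %d, but its type is array of length %d"
                                % (i, i + 1, len(self.items)))
            return self.items[i]

    a = SizedArray(1, 2, 3, 4, 5)
    try:
        a.at(6)
    except TypeError as e:
        print(e)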

While this effort to salvage the term "type error" in dynamic languages
is interesting, I fear it will fail. Either we'll all have to admit
that "type" in the dynamic sense is a psychological concept with no
precise technical definition (as was at least hinted by Anton's post
earlier, whether intentionally or not) or someone is going to have to
propose a technical meaning that makes sense, independently of what is
meant by "type" in a static system.
In the terminology I'm suggesting, the object has no type in this language
(assuming we're talking about a Smalltalk-like language without any type system
extensions).

I suspect you'll see the Smalltalk version of the objections raised in
response to my post earlier. In other words, whatever terminology you
think is consistent, you'll probably have a tough time convincing
Smalltalkers to stop saying "type" if they did before. If you exclude
"message not understood" as a type error, then I think you're excluding
type errors from Smalltalk entirely, which contradicts the psychological
understanding again.
 
Rob Warnock

+---------------
| > So, will y'all just switch from using "dynamically typed" to "latently
| > typed", and stop talking about any real programs in real programming
| > languages as being "untyped" or "type-free", unless you really are
| > talking about situations in which human reasoning doesn't come into play?
|
| I agree with most of what you say except regarding "untyped".
|
| In machine language or most assembly the type of a variable is
| something held only in the mind of the programmer writing it, and
| nowhere else. In latently typed languages though the programmer can
| ask what they type of a particular value is. There is a vast
| difference to writing code in the latter kind of language to writing
| code in assembly.
|
| I would suggest that at least assembly should be referred to as
| "untyped".
+---------------

Another language which has *neither* latent ("dynamic") nor
manifest ("static") types is (was?) BLISS[1], in which, like
assembler, variables are "just" addresses[2], and values are
"just" a machine word of bits.

However, while in BLISS neither variables nor values are typed,
operators *are* "typed"; that is, each operator specifies how
it will treat its input machine word(s) and how the machine word(s)
of bits it produces should be interpreted. So "+" is (mod 2^wordsize)
[unsigned?] integer addition, and "FADR" is floating-point addition
with rounding (as opposed to "FADD", which truncates), and so on.
So this (legal but nonsensical!) BLISS:

x := .y FMPR (.x - 13);

would, in C, have to be written roughly like this:

{ int t = x - 13; *(float *)&x = *(float *)&y * *(float *)&t; }

On the PDP-10, at least, both of them would generate this assembler code:

move t1, x
subi t1, 13
fmpr t1, y
movem t1, x

So is BLISS "typed" or not? And if so, what is that kind of typing called?


-Rob

[1] "Basic Language for the Implementation of Systems Software",
see <http://en.wikipedia.org/wiki/BLISS>. Created at CMU,
added to by DEC, and used by CMU, DEC, and a few others in
the '70s and '80s.

[2] Well, approximately. A BLISS variable is, conceptually at least,
really a "byte-pointer" -- a triple of a word address, a byte-size,
and a byte-position-within-word -- even on target architectures
other than the DEC PDP-10 [which had hardware byte-pointer types].
The compiler (even on the PDP-10) optimizes away LDB/DPB accesses
into natively addressable load/store sizes, when possible.
 
Rob Warnock

+---------------
| Anton van Straaten wrote:
| > 3. A really natural term to refer to types which programmers reason
| > about, even if they are not statically checked, is "latent types". It
| > captures the situation very well intuitively, and it has plenty of
| > precedent -- e.g. it's mentioned in the Scheme reports, R5RS and its
| > predecessors, going back at least a decade or so (haven't dug to check
| > when it first appeared).
|
| Can you be more explicit about what "latent types" means?
| I'm sorry to say it's not at all natural or intuitive to me.
| Are you referring to the types in the programmer's head,
| or the ones at runtime, or what?
+---------------

Here's what the Scheme Standard has to say:

http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-4.html
1.1 Semantics
...
Scheme has latent as opposed to manifest types. Types are
associated with values (also called objects) rather than with variables.
(Some authors refer to languages with latent types as weakly typed
or dynamically typed languages.) Other languages with latent types
are APL, Snobol, and other dialects of Lisp. Languages with manifest
types (sometimes referred to as strongly typed or statically typed
languages) include Algol 60, Pascal, and C.

To me, the word "latent" means that when handed a value of unknown type
at runtime, I can look at it or perform TYPE-OF on it or TYPECASE or
something and thereby discover its actual type at the moment[1], whereas
"manifest" means that types[2] are lexically apparent in the code.


-Rob

[1] I added "at the moment", since I remembered that in Common Lisp
one may change the type of a value at runtime, specifically, a
CLOS instance may change type "out from under you" if someone
performs a CHANGE-CLASS on it or redefines its CLASS definition.
[Though maybe the latter is more a change of the *type* itself
rather than a change of the *object's* type per se.]

[2] Usually of variables or locations, but sometimes of expressions.
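
A small Python sketch of that operational reading of "latent" (here
match/case plays the role of TYPECASE, and type() of TYPE-OF; the
describe function is invented, and match/case needs Python 3.10+):

    def describe(value):
        match value:                       # a TYPECASE-like dispatch
            case int():
                return "an integer"
            case str():
                return "a string"
            case _:
                return "something else: " + type(value).__name__

    for v in (42, "foo", 3.14):
        print(type(v).__name__, "->", describe(v))   # TYPE-OF analogue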
 
Chris Smith

Rob Warnock said:
Another language which has *neither* latent ("dynamic") nor
manifest ("static") types is (was?) BLISS[1], in which, like
assembler, variables are "just" addresses[2], and values are
"just" a machine word of bits.

I'm unsure that it's correct to describe any language as having no
latent typing, in the sense that's being used in this thread. It might
be more correct to say "no specified latent typing" and/or "no latent
typing beyond what is provided by the execution environment, including
the CPU, virtual memory system, etc." as appropriate. I am aware of no
hardware environment that really accepts all possible values for all
possible operations without the potential of somehow signaling a type
violation.
 
Pascal Costanza

David said:
So why aren't they happy with something like, say, Alice ML, which is
statically typed, but has a "dynamic" type and type inference? I mean
this as a serious question.

Note: I haven't yet worked with such a language, but here is my take anyway.

A statically typed language requires you to think about two models of
your program at the same time: the static type model and the dynamic
behavioral model. A static type system ensures that these two
_different_ (that's important!) perspectives are always in sync. This is
especially valuable in settings where you know your domain well and want
to rely on feedback by your compiler that you haven't made any mistakes
in encoding your knowledge. (A static type system based on type
inferencing doesn't essentially change the requirement to think in two
models at the same time.)

A dynamically typed language is especially well suited when you don't
(yet) have a good idea about your domain and you want to use programming
especially to explore that domain. Some static typing advocates claim
that static typing is still suitable for exploring domains because of
the compiler's feedback about the preliminary encoding of your
incomplete knowledge, but the disadvantages are a) that you still have
to think about two models at the same time when you don't even have
_one_ model ready and b) that you cannot just run your incomplete
program to see what it does as part of your exploration.

A statically typed language with a dynamic type treats dynamic typing as
the exception, not as the general approach, so this doesn't help a lot
in the second setting (or so it seems to me).

A language like Common Lisp treats static typing as the exception, so
you can write a program without static types / type checks, but later on
add type declarations as soon as you get a better understanding of your
domain. Common Lisp implementations like CMUCL or SBCL even include
static type inference to aid you here, which gives you warnings but
still allows you to run a program even in the presence of static type
errors. I guess the feedback you get from such a system is probably not
"strong" enough to be appreciated by static typing advocates in the
first setting (where you have a good understanding of your domain).
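
A rough analogue of that workflow in Python rather than Common Lisp (an
external checker such as mypy plays the role of CMUCL/SBCL's type
inference here, warning without preventing execution):

    def area(w, h):                # exploratory version, no declarations
        return w * h

    def area_typed(w: float, h: float) -> float:   # later, annotated
        return w * h

    # A static checker such as mypy would flag this call, but the program
    # still runs, much as CMUCL/SBCL warn yet let you execute anyway.
    print(area_typed("ab", 3))     # prints "ababab"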


Pascal
 
Pascal Costanza

Chris said:
While this effort to salvage the term "type error" in dynamic languages
is interesting, I fear it will fail. Either we'll all have to admit
that "type" in the dynamic sense is a psychological concept with no
precise technical definition (as was at least hinted by Anton's post
earlier, whether intentionally or not) or someone is going to have to
propose a technical meaning that makes sense, independently of what is
meant by "type" in a static system.

What about this: You get a type error when the program attempts to
invoke an operation on values that are not appropriate for this operation.

Examples: adding numbers to strings; determining the string-length of a
number; applying a function to the wrong number of arguments; applying
a non-function; accessing an array with out-of-bound indexes; etc.
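
Most of these can be observed directly in Python; note that Python itself
labels the out-of-bounds case IndexError rather than TypeError, which is
exactly the terminological question at issue:

    for thunk in (
        lambda: 1 + "two",          # adding numbers to strings
        lambda: len(42),            # string-length of a number
        lambda: (lambda x: x)(),    # wrong number of arguments
        lambda: 5(3),               # applying a non-function
        lambda: [1, 2, 3][7],       # out-of-bounds array index
    ):
        try:
            thunk()
        except (TypeError, IndexError) as e:
            print(type(e).__name__, "-", e)
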
I suspect you'll see the Smalltalk version of the objections raised in
response to my post earlier. In other words, whatever terminology you
think is consistent, you'll probably have a tough time convincing
Smalltalkers to stop saying "type" if they did before. If you exclude
"message not understood" as a type error, then I think you're excluding
type errors from Smalltalk entirely, which contradicts the psychological
understanding again.

Sending a message to an object that does not understand that message is
a type error. The "message not understood" machinery can be seen either
as a way to escape from this type error in case it occurs and allow the
program to still do something useful, or to actually remove (some)
potential type errors. Which view you take probably depends on what your
concrete implementation of "message not understood" does. For example,
if it simply forwards the message to another object that is known to be
able to respond to it, then you remove a potential type error; however,
if it pops up a dialog box to ask the user how to continue from here, it
is still a type error, but just gives you a way to deal with it at runtime.
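
A tiny Python sketch of the "forwarding" case (the Forwarder class is
invented; __getattr__ plays the role of the message-not-understood hook):

    class Forwarder:
        def __init__(self, delegate):
            self._delegate = delegate

        def __getattr__(self, name):
            # The message-not-understood hook forwards to a delegate that
            # is known to respond, removing the potential type error.
            return getattr(self._delegate, name)

    f = Forwarder([1, 2, 2, 3])
    print(f.count(2))     # the list understands count; prints 2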


Pascal
 
ilitzroth

A statically typed language requires you to think about two models of
your program at the same time: the static type model and the dynamic
behavioral model. A static type system ensures that these two
_different_ (that's important!) perspectives are always in sync.

I have trouble understanding your use of the phrase "model of a program".
If it is a system that behaves according to the rules of your program, then
statements about your program should consider *all* possible models. If it
is a formal system that makes statements about properties of your program,
then the static type system is a simplified model that is suitable for
automatic analysis, and your runtime model is in most cases nonexistent.
Can you give a definition of a "model of a program"? Can you explain why
Lisp doesn't have two (SBCL does do a lot of typechecking and gives
type errors)?
This is
especially valuable in settings where you know your domain well and want
to rely on feedback by your compiler that you haven't made any mistakes
in encoding your knowledge. (A static type system based on type
inferencing doesn't essentially change the requirement to think in two
models at the same time.)

It is also valuable when you don't know your domain very well and you
want to rely on feedback by your compiler that you haven't made any
mistakes in encoding your limited knowledge.
A dynamically typed language is especially well suited when you don't
(yet) have a good idea about your domain and you want to use programming
especially to explore that domain.

In the sense that you can start writing code without the compiler pointing
out all but the most glaring holes in your program, I agree. Most of your
arguments aren't very convincing, though, and the truth is that I have seen
Lisp programmers using the debugger to find out that you can't add a number
and a hashtable. The static view was not there, and the dynamic view must
have been too complicated, so they had nothing to think about.

Immanuel
 
Andreas Rossberg

Marshall said:
What prohibits us from describing an abstract type as a set of values?

If you identify an abstract type with the set of underlying values then
it is equivalent to the underlying representation type, i.e. there is no
abstraction.
References?

This is 1973, actually, but most relevant:

James H. Morris, Jr., "Types Are Not Sets",
Proc. 1st ACM Symposium on Principles of Programming Languages, 1973.
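
One way to see Morris's point in code (the Celsius/Fahrenheit classes are
invented for illustration): both abstract types below have exactly the same
underlying set of values, yet they are not the same type, so identifying a
type with its value set would erase the abstraction.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Celsius:
        degrees: float

    @dataclass(frozen=True)
    class Fahrenheit:
        degrees: float

    def freezing(t: Celsius) -> bool:
        return t.degrees <= 0.0

    print(freezing(Celsius(-5.0)))      # True
    # freezing(Fahrenheit(-5.0))        # same underlying values, but a
    #                                   # static checker rejects this call
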
There is no reason why we must limit ourselves to "standard set theory"
any more than we have to limit ourselves to standard type theory.
Both are progressing, and set theory seems to me to be a good
choice for a foundation. What else would you use?

I'm no expert here, but Category Theory is a preferred choice in many
areas of PLT.

- Andreas
 
