strong/weak typing and pointers

Steven Bethard

Diez B. Roggisch said:
Strong/weak typing is about how much you care about types at all - in PHP, it's
perfectly legal to add strings to numbers - the string simply gets
converted to a number beforehand, with that conversion yielding 0 when
nothing useful and numberlike can be extracted from the string. So typing
is weak, as it doesn't constrain the possible operations on variables with
certain values.

By this definition, Python is also weakly typed:
>>> 1 + 2.0
3.0

Your definition above calls PHP weakly typed because it performs implicit
string->number conversions. The Python code above performs an implicit
int->float conversion.

This is not the definition of strongly- and weakly-typed that I'm used to. When
I use strongly-typed, I mean that a block of memory associated with an object
cannot be reinterpreted as a different type of object. For example, in a
weakly-typed language like C, we can do:

#include <stdio.h>

struct A {
    char c;
    int i;
};
struct B {
    float f;
    char c;
};
int main(int argc, char* argv[]) {
    struct A a = {'c', 1024};
    /* reinterpret the memory of a as a struct B */
    struct B* b = (struct B*)&a;
    printf("'%c' %d %f '%c'\n", a.c, a.i, b->f, b->c);
    return 0;
}

And get the following output:
'c' 1024 149611321247666274304.000000 ' '

C allows me to reinterpret a char as a float and an int as a char. In the first
case, we get the floating-point number represented by the byte holding the
character 'c' together with the padding bytes that follow it. In the second
case, b->c reads only the first byte of the int 1024; with a 4-byte int that
byte is 0, so printf emits a NUL character, which shows up here as a blank.

The point here is that I consider C weakly typed because, with no error of any
sort, it allows me to reinterpret a block of memory in as many ways as I like.
A strongly typed language like Python does not allow this. Even in my Python
example above, we're not *reinterpreting* the block of memory representing 1 as
a floating point value; we're *coercing* the integer 1 into the floating point
value 1.0 (which probably means allocating a new float variable at the C level)
before performing the addition.
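
The difference between coercing and reinterpreting can be sketched in Python itself. The snippet below is only an illustration, assuming a 4-byte int and IEEE 754 single-precision floats; it uses the standard struct module to copy the raw bytes of the integer 1 and decode them as a float, which is roughly what the C cast above does. The names coerced, raw and reinterpreted are made up for this example.

```python
import struct

n = 1

# Coercion: build a new float object with the same numeric value.
coerced = float(n)

# Reinterpretation: take the 4 bytes that encode the int 1 and
# decode the very same bit pattern as a single-precision float.
raw = struct.pack("<i", n)
reinterpreted = struct.unpack("<f", raw)[0]

print(coerced)        # 1.0
print(reinterpreted)  # a tiny denormal value, nowhere near 1.0
```

Note that struct works on an explicit copy of the bytes; it does not touch the memory of the int object itself.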

Steve
 
Diez B. Roggisch

Steven Bethard said:
By this definition, Python is also weakly typed:

>>> 1 + 2.0
3.0

I'm not sure how things are implemented internally, but I'd still say my
definition is correct even for this example: The + operator can be viewed
as overloaded with the signature

(float, int) -> float

But it's not for (string, int) - although * is, for example.
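
Seen from the Python side, that overloading looks roughly like this. It is a small sketch only; the helper name try_add is hypothetical and exists just to capture the error instead of stopping the script.

```python
def try_add(left, right):
    """Return left + right, or the name of the exception that was raised."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__

print(try_add(1.0, 2))   # 3.0: + accepts (float, int)
print(try_add("a", 1))   # TypeError: + rejects (str, int)
print("a" * 3)           # aaa: * accepts (str, int)
```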
The point here is that I consider C weakly typed because, with no error of
any sort, it allows me to reinterpret a block of memory in as many ways as
I like.
A strongly typed language like Python does not allow this. Even in my
Python example above, we're not *reinterpreting* the block of memory
representing 1 as a floating point value; we're *coercing* the integer 1
into the floating point value 1.0 (which probably means allocating a new
float variable at the C level) before performing the addition.

This is definitely weak typing.

The question remains whether permanent coercions, as PHP (and afaik Perl)
performs them, can also be considered weak typing, as you won't end up with
an error for more or less anything you do.

I say yes, but maybe that's a matter of taste.
 
Steven Bethard

Diez B. Roggisch said:
I'm not sure how things are implemented internally, but I'd still say my
definition is correct even for this example: The + operator can be viewed
as overloaded with the signature

(float, int) -> float

But it's not for (string, int) - although * is, for example.

So you would say that in PHP the + operator cannot be viewed as overloaded with
the signature (string, int) -> string? I don't know PHP, so could you maybe
give an example of why you think this is so?
The question remains whether permanent coercions, as PHP (and afaik Perl)
performs them, can also be considered weak typing, as you won't end up with
an error for more or less anything you do.

Sorry, I don't know what "permanent coercions" means. Could you explain?
"Permanent coercions" makes me expect something like:
1.0

where b's value has been coerced into a float (and reassigned to b) because of
the addition of a. I'm guessing this isn't what you meant...
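
For comparison, a short sketch of what Python actually does: the int operand is converted only for the duration of the addition, and the name b keeps referring to the original int. The variable names and values are chosen here purely for illustration.

```python
a = 1.5
b = 1

total = a + b   # b is coerced to a float only inside the addition

print(total)              # 2.5
print(b)                  # 1
print(type(b).__name__)   # int
```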

Steve
 
Diez B. Roggisch

So you would say that in PHP the + operator cannot be viewed as overloaded
with the signature (string, int) -> string? I don't know PHP, so could you
maybe give an example of why you think this is so?

If you do this:

"a" + 10

you end up with 10 - if the string doesn't contain anything that can be
interpreted as a number, the coercion results in 0.

Sure, that behaviour can be seen as overloaded, too. But overloaded
functions usually make some sort of sense, where this technique masks
errors by _always_ trying to interpret values as useful to every operation.
Sorry, I don't know what "permanent coercions" means. Could you explain?
"Permanent coercions" makes me expect something like:

I'm no native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

What I wanted to say is that PHP uses coercions or overloaded operators for
nearly everything. Sometimes this is totally silent, sometimes nothing
happens and a warning is issued - which might be configurable to be treated
as an actual error, I'm not sure about that.

So while there might in fact be type information present, it's rarely used -
which I consider as being weak.

Does that make more sense?
 
Gabriel Zachmann

wall of abstraction". A Smalltalk programmer would say that
Python is more weakly typed than Smalltalk for user-defined types.

which brings me to another related question.

I understand strong/weak typing is more like a continuum --
is there kind of a programming languages hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

I'd be interested in a hierarchy containing some of the following
languages: ANSI-C++, C, Perl, Python, Pascal, ML.
(because these happen to be some of the languages i know a bit ;-) )

Would the following be justifiable?

Perl < C < C++ < Pascal < Python < ML ?

Or does anyone have a pointer?

Best regards,
gab.


--
/-------------------------------------------------------------------------\
| There are works which wait, |
| and which one does not understand for a long time; [...] |
| for the question often arrives a terribly long time after the answer. |
| (Oscar Wilde) |
+-------------------------------------------------------------------------+
| (e-mail address removed)-bonn.de __@/' www.gabrielzachmann.org |
\-------------------------------------------------------------------------/
 
Alex Martelli

Gabriel Zachmann said:
I've read up quite a bit about strong/weak typing, and static and dynamic
typing, and it seems to me that, while static/dynamic typing is a pretty
well-defined concept, the definition of strong/weak typing is not so
clear-cut.

"four legs good two legs bad". In the end, it's what works best...


Alex
 
Steven Bethard

Diez B. Roggisch said:
If you do this:

"a" + 10

you end up with 10 - if the string doesn't contain anything that can be
interpreted as a number, the coercion results in 0.

Sure, that behaviour can be seen as overloaded, too. But overloaded
functions usually make some sort of sense, where this technique masks
errors by _always_ trying to interpret values as useful to every operation.

Ahh, I understand now. I would still call this coercion with an overloaded
operator. A horrible language decision, certainly, but not a mark of weak
typing -- note that Python can give you exactly the same behavior if you want it:
>>> class phpint(int):
...     def __add__(self, other):
...         try:
...             other = int(other)
...         except ValueError:
...             other = 0
...         return super(phpint, self).__add__(other)
...     __radd__ = __add__
...
>>> phpint(10) + "a"
10

This doesn't mean that Python has suddenly become a weakly-typed language. It
just means that I've implemented some poor coercion choices in the language. =)
I'm no native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

Ahh. Gotcha. I would probably say "He (regularly changed/repeatedly
changed/used to change) his clothes." Thanks for the clarification.
What I wanted to say is that php uses coercions or overloaded operators for
nearly everything. Sometimes this is totally silent, sometimes nothing
happens and a warning is issued - which might be configured to be treated
as an actual error, I'm not sure about that.

Totally clear now, thanks. Basically you would say that the more implicit
coercions a language performs, the more weakly typed it is. This diverges from
the common use of the terms strong and weak typing in the PL literature, which
is why I was confused.
So while there might in fact be type information present, it's rarely used -
which I consider as being weak.

Well, the type information is probably used all the time (I wouldn't be surprised
if somewhere in the PHP internals something like my __add__ method above was
defined), but it's used implicitly, so the programmer might never see it.

Steve
 
Steven Bethard

Gabriel said:
I understand strong/weak typing is more like a continuum --
is there kind of a programming languages hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

Well, as you can probably see from the discussion, the definition of strong/weak
typing isn't even agreed upon, so I'd be wary of giving a hierarchy. ;) The two
main interpretations that I've seen in this thread:

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)

The answer to your question depends on which one of these definitions you're
interested in. Definition (1) will have a much flatter hierarchy than
definition (2). Which definition are you interested in?

Steve

[1] see my example at
http://mail.python.org/pipermail/python-list/2004-November/248983.html
 
Gabriel Zachmann

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)

Is either of them a subset of the other, generally speaking?

The answer to your question depends on which one of these definitions you're
interested in. Definition (1) will have a much flatter hierarchy than
definition (2). Which definition are you interested in?

both, if you don't mind ;-)

cheers,
gab.


 
Gabriel Zachmann

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

So, according to that, Perl is strongly typed?

Thanks a lot in advance,
Gabriel.

 
Steven Bethard

Gabriel Zachmann said:
(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as
another type[1]. (This is the definition usually used in Programming
Languages literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)

Is either of them a subset of the other, generally speaking?

Not really -- they're pretty much orthogonal.

For example, C is weakly-typed (PL theory definition) but has only a few
implicit coercions (e.g. int->float with the + operator). Of course, C only has
a very few basic types so it doesn't have as many chances to support implicit
coercions. For example, there is no string type in C (only arrays of
characters), so it wouldn't really make sense to talk about some sort of
string->int coercion.

Python is strongly-typed (PL theory definition) and also has only a few implicit
coercions (e.g. int->float with the + operator).
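
As a rough illustration of how short that list is, the sketch below checks the result type of + for a few mixed numeric operands (int with float, float with complex, and bool, which is a subclass of int). It is not meant as an exhaustive catalogue, and the names pairs and result_types are made up for the example.

```python
pairs = [(1, 2.0), (1.5, 2j), (True, 1)]

# Record the type of the result for each mixed-type addition.
result_types = [type(left + right).__name__ for left, right in pairs]

for (left, right), name in zip(pairs, result_types):
    print(type(left).__name__, "+", type(right).__name__, "->", name)
```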

ML is strongly-typed (PL theory definition) and has *very* few (perhaps none?)
implicit coercions, e.g.:

# 1 + 1.0;;

Characters 4-7:
1 + 1.0;;
^^^
This expression has type float but is here used with type int

not even the int->float conversion common to C, Java, Python, etc. is supported.

I don't know enough about Pascal, Perl or PHP to tell whether they are
weakly-typed or strongly-typed (PL theory definition). Taking advantage of weak
typing isn't something I do much, so even in languages that I have passing
familiarity with, I've generally used only the strongly typed features. To
someone who knows about Pascal, Perl or PHP: Can you reinterpret an object's
memory block as a different object like you can in C?

I've never had to code in Pascal, but my understanding was that there weren't
too many implicit coercions... Please correct me on this one if anyone knows
better! Examples in this thread suggest that both Perl and PHP have large
numbers of implicit coercions.

both, if you don't mind

I'm not willing to commit to anything for languages that I'm not really familiar
with, but here's what I'd say:
From weakly-typed to strongly-typed (PL theory definition):

C < Java, Python, ML

Basically, there's no hierarchy, just weakly-typed and strongly-typed. You
might get a hierarchy if you had some languages that allowed only some (but not
all) objects to be reinterpreted.
From many implicit coercions to few implicit coercions:

C, Java, Python < ML

Of course, this isn't really helpful because C, Java and Python all contain only
a few implicit coercions (e.g. int->float with the + operator). You'll have to
get a PHP/Perl/Pascal expert in to make any claims about them.

Steve
 
Christophe Cavalaria

Steven said:
Totally clear now, thanks. Basically you would say that the more implicit
coercions a language performs, the more weakly typed it is. This diverges
from the common use of the terms strong and weak typing in the PL
literature, which is why I was confused.

One could say that the common definition of weakly typed languages cannot
apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type. What we can do is extend the definition of
weakly typed language to something like: "A weakly typed language is a
language that often uses incorrectly some piece of data by applying to it
the wrong type". Such a definition would include any language that is too
liberal with type coercion, like PHP.
 
Christophe Cavalaria

Gabriel said:
which brings me to another related question.

I understand strong/weak typing is more like a continuum --
is there kind of a programming languages hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

I'd be interested in a hierarchy containing some of the following
languages: ANSI-C++, C, Perl, Python, Pascal, ML.
(because these happen to be some of the languages i know a bit ;-) )

Would the following be justifiable?

Perl < C < C++ < Pascal < Python < ML ?

Or does anyone have a pointer?

If by ML you think of OCaml you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process because the unmarshal
function trusts the caller to cast the result to the right type. In fact, it
seems impossible to write a correct typesafe marshaling module in OCaml
since there is no RTTI info in the language for an anonymous piece of data.
 
Steven Bethard

Christophe Cavalaria said:
One could say that the common definition of weakly typed languages cannot
apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type.

While I agree that it would be kind of foolish to implement a dynamically typed
language that wasn't strongly-typed (PL theory definition), there's no reason
you *couldn't* -- your language would just need to provide some sort of 'cast'
function that did the conversion at runtime.

For example, say the memory block for an object instance was allocated like:

[0] pointer to class object
[1] pointer to dictionary object (holding variables)

Then maybe you could do something like:
int1, int2 = cast(object(), list(int))
Given, I can't see any use for such behavior, but you *could* do it if you had a
weakly-typed dynamic language and you knew how it allocated objects. At some
level, you always have bits -- whether it makes any sense to reinterpret them
depends on exactly what the bits originally meant.
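
As it happens, CPython's standard ctypes module comes close to such a cast function, although it is a foreign-function library rather than part of the core language, and it operates on C-level values, not on arbitrary Python objects. A minimal sketch, assuming c_int and c_float are both 4 bytes on the platform; the variable names are illustrative only.

```python
import ctypes

number = ctypes.c_int(1024)

# View the memory of the C int through a float pointer, much like
# *(float *)&number in C. No bytes are copied or converted.
as_float = ctypes.cast(ctypes.pointer(number),
                       ctypes.POINTER(ctypes.c_float)).contents

before = as_float.value
print(number.value)   # 1024
print(before)         # a tiny denormal float, not 1024.0

# Both names alias the same memory: writing through one changes the other.
as_float.value = 1.0
print(hex(number.value))  # 0x3f800000, the bit pattern of the float 1.0
```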
What we can do is extend the definition of weakly typed language to something
like: "A weakly typed language is a language that often uses incorrectly some
piece of data by applying to it the wrong type". Such a definition would include
any language that is too liberal with type coercion, like PHP.

Don't get me wrong -- I do understand your point. In every case I can think of,
there is no reason to want weak-typing (PL theory definition) in a
dynamically-typed language. On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.

Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is the *correct*
response given the PHP language definition.

In addition, you *can* create a statically-typed language that is strongly typed
(PL theory definition) but also very liberal with type coercion. What would you
call such a language? Since being liberal with type coercion and allowing bit
reinterpretation are orthogonal, why not keep the two separate terms?

My issue here is that I don't think we should confuse an already easily confused
term by giving it a second meaning. If there aren't any dynamically typed
languages that are also weakly-typed, that's ok -- it doesn't mean we should
change the meaning of "weakly-typed" for these languages.

Steve
 
Steven Bethard

Christophe Cavalaria said:
If by ML you think of OCaml you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process because the unmarshal
function trusts the caller to cast the result to the right type.

Thanks, I knew I'd read something like that somewhere. Totally surprised me too
'cause I figured that, of all people, ML folks would be the most afraid of a
module like this. =)

Steve
 
Christophe Cavalaria

Steven said:
Christophe Cavalaria said:
One could say that the common definition of weakly typed languages
cannot apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type.

While I agree that it would be kind of foolish to implement a dynamically
typed language that wasn't strongly-typed (PL theory definition), there's
no reason you *couldn't* -- your language would just need to provide some
sort of 'cast' function that did the conversion at runtime.

For example, say the memory block for an object instance was allocated
like:

[0] pointer to class object
[1] pointer to dictionary object (holding variables)

Then maybe you could do something like:
int1, int2 = cast(object(), list(int))
Given, I can't see any use for such behavior, but you *could* do it if you
had a weakly-typed dynamic language and you knew how it allocated objects. At
some level, you always have bits -- whether it makes any sense to
reinterpret them depends on exactly what the bits originally meant.

In any good dynamically typed language, the object must know what it is, thus
there is no way to do a reinterpret_cast like in C or C++. It is
meaningless. Doing it anyway is insane, as you have pointed out. Its only goal
would be to add the flaws of the weakly, statically typed languages to a
contrived example.
Don't get me wrong -- I do understand your point. In every case I can
think of, there is no reason to want weak-typing (PL theory definition) in
a dynamically-typed language. On the other hand, I haven't really seen any
good cases for wanting weak-typing in a statically-typed language either.

Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is *correct*
response given the PHP language definition.

Incorrectly is a view of the mind in that case. From my point of view it is
incorrect. And you could argue that taking a float * and casting it into an
int * gives us a predictable behaviour. It'll still be wrong to do it
unless it was exactly what you wanted to do. And in that case you can
create explicit language constructs to do the trick.
In addition, you *can* create a statically-typed language that is strongly
typed (PL theory definition) but also very liberal with type coercion. What
would you call such a language? Since being liberal with type coercion and
allowing bit reinterpretation are orthogonal, why not keep the two separate
terms?

My issue here is that I don't think we should confuse an already easily
confused term by giving it a second meaning.

I see your point but I must add that I wasn't giving that term a second
meaning. Just like in mathematics when you take a theory and you create a
new theory that encompasses the old one, I was giving a new definition for
the term.
 
Christophe Cavalaria

Steven said:
Thanks, I knew I'd read something like that somewhere. Totally surprised
me too 'cause I figured that, of all people, ML folks would be the most
afraid of a module like this. =)

Steve

Marshaling (Python calls it pickling ;) ) is something needed in the
standard library of any good language. Too bad for them that the OCaml
language makes it impossible to implement.
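
For contrast, a small sketch of why the problem does not arise in the same way for Python's pickle: the serialized bytes record what kind of object they describe, so the caller never has to assert a type for the result. (pickle has a separate caveat: loading data from an untrusted source can execute arbitrary code.) The sample values are arbitrary.

```python
import pickle

samples = (42, "forty-two", [4, 2])
restored_items = []

for original in samples:
    payload = pickle.dumps(original)
    restored = pickle.loads(payload)
    restored_items.append(restored)
    # The type comes back with the value; no cast is needed.
    print(type(restored).__name__, restored == original)
```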
 
Steven Bethard

Christophe Cavalaria said:
Incorrectly is a view of the mind in that case. From my point of view it is
incorrect. And you could argue that taking a float * and casting it into an
int * gives us a predictable behaviour. It'll still be wrong to do it
unless it was exactly what you wanted to do. And in that case you can
create explicit language constructs to do the trick.

The problem is, "exactly what you wanted to do" varies from programmer to
programmer. Some programmers may actually want "a" + 10 == 10. I can imagine
code that takes a string as input and adds its integer value to an int counter.
Invalid input should not increment the counter and may be silently ignored. I
would not write code this way, but some people would, and would want the code to
work the way PHP does.
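
In Python that intent would have to be spelled out explicitly. A minimal sketch, where the function name add_input and the sample inputs are hypothetical:

```python
def add_input(counter, text):
    """Add the integer value of text to counter; ignore invalid input."""
    try:
        return counter + int(text)
    except ValueError:
        # Mirrors the PHP behaviour described above: bad input adds nothing.
        return counter

counter = 0
for text in ["3", "a", "10"]:
    counter = add_input(counter, text)

print(counter)  # 13
```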

So yes, "incorrectly is a view of the mind", but since we can't know what every
programmer is thinking, and it's extremely unlikely that every programmer will
agree on what's correct or incorrect for every example, we have to take the
language definition as the measuring stick for correct or incorrect.

Given that, your definition of weakly-typed:
"A weakly typed language is a language that often uses incorrectly some piece
of data by applying to it the wrong type"
either would not call PHP a weakly-typed language, because the data is not used
incorrectly according to the language definition, or would not know what to call
PHP, because incorrectly cannot be defined in a way that applies correctly to
all programmers.

I prefer not extending "weakly-typed" in this way because it makes the term less
well-defined.

Steve
 
Greg Ewing

Diez said:
I'm no native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

We would say "regularly", "frequently", "habitually", or
something like that. In English, "permanently" means
"once and for all".
 
