strong/weak typing and pointers

Diez B. Roggisch · Nov 2, 2004

Where do you get that idea from? Modern compilers are aware of printf,

The format string has to be static? That's news to me.

No, but if it is, the compiler can perform type checking:

http://www.ugrad.physics.mcgill.ca/reference/Gcc/gcc_4.html#SEC84

Steven Bethard · Nov 2, 2004

Diez B. Roggisch said:
Strong/weak typing is about how much you care about types at all - in PHP, it's
perfectly legal to add strings to numbers - the string simply gets
converted to a number beforehand, that conversion yielding 0 when
nothing useful and number-like can be extracted from the string. So typing
is weak, as it doesn't constrain the possible operations on variables with
certain values.

By this definition, Python is also weakly typed:
3.0

Your definition above calls PHP weakly typed because it performs implicit
string->number conversions. The Python code above performs an implicit
int->float conversion.

This is not the definition of strongly- and weakly-typed that I'm used to. When
I use strongly-typed, I mean that a block of memory associated with an object
cannot be reinterpreted as a different type of object. For example, in a
weakly-typed language like C, we can do:

#include <stdio.h>

struct A {
    char c;
    int i;
};
struct B {
    float f;
    char c;
};
int main(int argc, char* argv[]) {
    struct A a = {'c', 1024};
    /* View the bytes of a through an unrelated struct type. */
    struct B* b = (struct B*)&a;
    printf("'%c' %d %f '%c'\n", a.c, a.i, b->f, b->c);
    return 0;
}

And get the following output:
'c' 1024 149611321247666274304.000000 ' '

C allows me to reinterpret a char as a float and an int as a char. In the first
case, we get the floating-point number represented by the bits that hold the
character 'c' (plus whatever padding bytes follow it). In the second case, b->c
reads only the first byte of the int 1024, which is zero, so a NUL character is
printed and shows up as a blank.
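To see which bytes each field reads, the layout can be emulated in Python with the struct module. This is only a sketch under assumptions that the post does not state: a little-endian machine, a 4-byte int and float, and three zero-filled padding bytes after the leading char. Real padding bytes are indeterminate, so the float printed here will not match the output above.

```python
import struct

# Assumed layout of struct A: char c, 3 padding bytes, 4-byte int i (little-endian).
raw = struct.pack("<c3xi", b"c", 1024)

# Read the same 8 bytes with the assumed layout of struct B:
# 4-byte float f, char c, 3 padding bytes.
f, c = struct.unpack("<fc3x", raw)

print(raw.hex())  # the bytes shared by both views
print(f)          # built from the byte of 'c' plus the (zeroed) padding
print(c)          # the first byte of the int 1024
```

Under these assumptions c comes back as the zero byte, because the first byte of 1024 (0x00000400) is 0.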

The point here is that I consider C weakly typed because, with no error of any
sort, it allows me to reinterpret a block of memory in as many ways as I like.
A strongly typed language like Python does not allow this. Even in my Python
example above, we're not *reinterpreting* the block of memory representing 1 as
a floating point value; we're *coercing* the integer 1 into the floating point
value 1.0 (which probably means allocating a new float variable at the C level)
before performing the addition.
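The difference between the two operations can be sketched in Python. The function names coerce and reinterpret are made up for this illustration, and reinterpret only emulates a C-style cast by serializing the int to 32 bits with the struct module; Python does not expose the memory of the int object itself.

```python
import struct

def coerce(i: int) -> float:
    """Build a new float with the same numeric value, as an int + float addition does."""
    return float(i)

def reinterpret(i: int) -> float:
    """Read the 32 bits of an int as if they were an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<i", i))[0]

print(coerce(1))                # 1.0
print(reinterpret(1))           # a tiny subnormal number, nowhere near 1.0
print(reinterpret(0x3F800000))  # 1.0, because 0x3F800000 is the bit pattern of 1.0
```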

Steve

Diez B. Roggisch · Nov 2, 2004

Strong/weak typing is about how much you care of types at all - in php,

By this definition, Python is also weakly typed:

3.0

I'm not sure how things are implemented internally, but I'd still say my
definition is correct even for this example: The + operator can be viewed
as overloaded with the signature

(float, int) -> float

But it's not for (string, int), although * is, for example.
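A quick way to check which operand pairs Python's operators accept; the helper name supported is hypothetical and exists only for this sketch.

```python
import operator

def supported(op, left, right):
    """Return True if the operator accepts this pair of operands, False on TypeError."""
    try:
        op(left, right)
        return True
    except TypeError:
        return False

print(supported(operator.add, 1.0, 2))   # True: (float, int) -> float
print(supported(operator.add, "a", 10))  # False: no + for (str, int)
print(supported(operator.mul, "a", 3))   # True: * repeats the string
```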

The point here is that I consider C weakly typed because, with no error of
any sort, it allows me to reinterpret a block of memory in as many ways as
I like.
A strongly typed language like Python does not allow this. Even in my
Python example above, we're not *reinterpreting* the block of memory
representing 1 as a floating point value; we're *coercing* the integer 1
into the floating point value 1.0 (which probably means allocating a new
float variable at the C level) before performing the addition.

This is definitely weak typing.

The question remains whether permanent coercions, as PHP (and afaik Perl) do them, can
also be considered weak typing, as you won't end up with an error for more
or less anything you do.

I say yes, but maybe that's a matter of taste.

Steven Bethard · Nov 2, 2004

Diez B. Roggisch said:
I'm not sure how things are implemented internally, but I'd still say my
definition is correct even for this example: The + operator can be viewed
as overloaded with the signature

(float, int) -> float

But it's not for (string, int), although * is, for example.

So you would say that in PHP the + operator cannot be viewed as overloaded with
the signature (string, int) -> string? I don't know PHP, so could you maybe
give an example of why you think this is so?

The question remains whether permanent coercions, as PHP (and afaik Perl) do them, can
also be considered weak typing, as you won't end up with an error for more
or less anything you do.

Sorry, I don't know what "permanent coercions" means. Could you explain?
"Permanent coercions" makes me expect something like:
1.0

where b's value has been coerced into a float (and reassigned to b) because of
the addition of a. I'm guessing this isn't what you meant...

Steve

Diez B. Roggisch · Nov 2, 2004

So you would say that in PHP the + operator cannot be viewed as overloaded with
the signature (string, int) -> string? I don't know PHP, so could you maybe
give an example of why you think this is so?

If you do this:

"a" + 10

you end up with 10 - if the string doesn't contain anything interpretable
as a number, the coercion results in null.

Sure, that behaviour can be seen as overloaded, too. But overloaded
functions usually make some sort of sense, where this technique masks
errors by _always_ trying to interpret values as useful to every operation.

Sorry, I don't know what "permanent coercions" means. Could you explain?
"Permanent coercions" makes me expect something like:

I'm not a native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

What I wanted to say is that php uses coercions or overloaded operators for
nearly everything. Sometimes this is totally silent, sometimes nothing
happens and a warning is issued - which might be configured to be treated
as an actual error, I'm not sure about that.

So while there might in fact be type information present, it's rarely used -
which I consider to be weak.

Does that make more sense?

Gabriel Zachmann · Nov 2, 2004

wall of abstraction". A Smalltalk programmer would say that

Python is more weakly typed than Smalltalk for user-defined types.

which brings me to another related question.

I understand strong/weak typing is more like a continuum --
is there a kind of programming language hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

I'd be interested in a hierarchy containing some of the following
languages: ANSI-C++, C, Perl, Python, Pascal, ML.
(because these happen to be some of the languages I know a bit ;-) )

Would the following be justifiable?

Perl < C < C++ < Pascal < Python < ML ?

Or does anyone have a pointer?

Best regards,
gab.

--
/-------------------------------------------------------------------------\
| There are works which wait, |
| and which one does not understand for a long time; [...] |
| for the question often arrives a terribly long time after the answer. |
| (Oscar Wilde) |
+-------------------------------------------------------------------------+
| (e-mail address removed)-bonn.de __@/' www.gabrielzachmann.org |
\-------------------------------------------------------------------------/

Alex Martelli · Nov 2, 2004

Gabriel Zachmann said:
I've read up quite a bit about strong/weak typing, and static and dynamic
typing, and it seems to me that, while static/dynamic typing is a pretty
well-defined concept, the definition of strong/weak typing is not so
clear-cut.

"four legs good two legs bad". In the end, it's what works best...

Alex

Steven Bethard · Nov 2, 2004

Diez B. Roggisch said:
If you do this:

"a" + 10

you end up with 10 - if the string doesn't contain anything interpretable
as a number, the coercion results in null.

Sure, that behaviour can be seen as overloaded, too. But overloaded
functions usually make some sort of sense, where this technique masks
errors by _always_ trying to interpret values as useful to every operation.

Ahh, I understand now. I would still call this coercion with an overloaded
operator. A horrible language decision, certainly, but not a mark of weak
typing -- note that Python can give you exactly the same behavior if you want it:
>>> class phpint(int):
...     def __add__(self, other):
...         try:
...             other = int(other)
...         except ValueError:
...             other = 0
...         return super(phpint, self).__add__(other)
...     __radd__ = __add__
...
>>> "a" + phpint(10)
10

This doesn't mean that Python has suddenly become a weakly-typed language. It
just means that I've implemented some poor coercion choices in the language. =)

I'm not a native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

Ahh. Gotcha. I would probably say "He (regularly changed/repeatedly
changed/used to change) his clothes." Thanks for the clarification.

What I wanted to say is that php uses coercions or overloaded operators for
nearly everything. Sometimes this is totally silent, sometimes nothing
happens and a warning is issued - which might be configured to be treated
as an actual error, I'm not sure about that.

Totally clear now, thanks. Basically you would say that the more implicit
coercions a language performs, the more weakly typed it is. This diverges from
the common use of the terms strong and weak typing in the PL literature, which
is why I was confused.

So while there might in fact be type information present, it's rarely used -
which I consider to be weak.

Well, the type information is probably used all the time (I wouldn't be surprised
if somewhere in the PHP internals something like my __add__ method above was
defined), but it's used implicitly, so the programmer might never see it.

Steve

Steven Bethard · Nov 2, 2004

Gabriel said:
I understand strong/weak typing is more like a continuum --
is there a kind of programming language hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

Well, as you can probably see from the discussion, the definition of strong/weak
typing isn't even agreed upon, so I'd be wary of giving a hierarchy.

The two main interpretations that I've seen in this thread:

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)

The answer to your question depends on which one of these definitions you're
interested in. Definition (1) will have a much flatter hierarchy than
definition (2). Which definition are you interested in?

Steve

[1] see my example at
http://mail.python.org/pipermail/python-list/2004-November/248983.html

Gabriel Zachmann · Nov 2, 2004

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)

Is either of them a subset of the other, generally speaking?

The answer to your question depends on which one of these definitions you're
interested in. Definition (1) will have a much flatter hierarchy than
definition (2). Which definition are you interested in?

both, if you don't mind ;-)

cheers,
gab.


Gabriel Zachmann · Nov 2, 2004

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

So, according to that, Perl is strongly typed?

Thanks a lot in advance,
Gabriel.


Steven Bethard · Nov 2, 2004

Gabriel Zachmann said:
(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as
another type[1]. (This is the definition usually used in Programming
Languages literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)


Is either of them a subset of the other, generally speaking?

Not really -- they're pretty much orthogonal.

For example, C is weakly-typed (PL theory definition) but has only a few
implicit coercions (e.g. int->float with the + operator). Of course, C only has
a very few basic types so it doesn't have as many chances to support implicit
coercions. For example, there is no string type in C (only arrays of
characters), so it wouldn't really make sense to talk about some sort of
string->int coercion.

Python is strongly-typed (PL theory definition) and also has only a few implicit
coercions (e.g. int->float with the + operator).

ML is strongly-typed (PL theory definition) and has *very* few (perhaps none?)
implicit coercions, e.g.:

# 1 + 1.0;;

Characters 4-7:
1 + 1.0;;
^^^
This expression has type float but is here used with type int

not even the int->float conversion common to C, Java, Python, etc. is supported.

I don't know enough about Pascal, Perl or PHP to tell whether they are
weakly-typed or strongly-typed (PL theory definition). Taking advantage of weak
typing isn't something I do much, so even in languages that I have passing
familiarity with, I've generally used only the strongly typed features. To
someone who knows about Pascal, Perl or PHP: Can you reinterpret an object's
memory block as a different object like you can in C?

I've never had to code in Pascal, but my understanding was that there weren't
too many implicit coercions... Please correct me on this one if anyone knows
better! Examples in this thread suggest that both Perl and PHP have large
numbers of implicit coercions.

both, if you don't mind

I'm not willing to commit to anything for languages that I'm not really familiar
with, but here's what I'd say:

From weakly-typed to strongly-typed (PL theory definition):

C < Java, Python, ML

Basically, there's no hierarchy, just weakly-typed and strongly-typed. You
might get a hierarchy if you had some languages that allowed only some (but not
all) objects to be reinterpreted.

From many implicit coercions to few implicit coercions:

C, Java, Python < ML

Of course, this isn't really helpful because C, Java and Python all contain only
a few implicit coercions (e.g. int->float with the + operator). You'll have to
get a PHP/Perl/Pascal expert in to make any claims about them.

Steve

Christophe Cavalaria · Nov 2, 2004

Steven said:
Totally clear now, thanks. Basically you would say that the more implicit
coercions a language performs, the more weakly typed it is. This diverges
from the common use of the terms strong and weak typing in the PL
literature, which is why I was confused.

One could say that the common definition of weakly typed languages cannot
apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type. What we can do is extend the definition of
weakly typed language to something like: "A weakly typed language is a
language that often uses incorrectly some piece of data by applying to it
the wrong type". Such a definition would include any language that is too
liberal with type coercion, like PHP.

Christophe Cavalaria · Nov 2, 2004

Gabriel said:
which brings me to another related question.

I understand strong/weak typing is more like a continuum --
is there a kind of programming language hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

I'd be interested in a hierarchy containing some of the following
languages: ANSI-C++, C, Perl, Python, Pascal, ML.
(because these happen to be some of the languages I know a bit ;-) )

Would the following be justifiable?

Perl < C < C++ < Pascal < Python < ML ?

Or does anyone have a pointer?

If by ML you think of OCaml you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process because the unmarshal
function trusts the caller to cast the result in the good type. In fact, it
seems impossible to write a correct typesafe marshaling module in OCaml
since there is no rtti info in the language for an anonymous piece of data.

Steven Bethard · Nov 2, 2004

Christophe Cavalaria said:
One could say that the common definition of weakly typed languages cannot
apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type.

While I agree that it would be kind of foolish to implement a dynamically typed
language that wasn't strongly-typed (PL theory definition), there's no reason
you *couldn't* -- your language would just need to provide some sort of 'cast'
function that did the conversion at runtime.

For example, say the memory block for an object instance was allocated like:

[0] pointer to class object
[1] pointer to dictionary object (holding variables)

Then maybe you could do something like:
int1, int2 = cast(object(), list(int))
Given, I can't see any use for such behavior, but you *could* do it if you had a
weakly-typed dynamic language and you knew how it allocated objects. At some
level, you always have bits -- whether it makes any sense to reinterpret them
depends on exactly what the bits originally meant.
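As it happens, Python's standard ctypes module offers something close to such a cast function, though only for C-compatible buffers that ctypes manages, not for ordinary Python objects. A minimal sketch, assuming a platform where a C float is a 4-byte IEEE 754 single:

```python
import ctypes

# A C-compatible float living in a ctypes-managed buffer (not a Python float object).
value = ctypes.c_float(1.0)

# Reinterpret the same four bytes as a 32-bit integer, with no conversion.
as_int = ctypes.cast(ctypes.pointer(value), ctypes.POINTER(ctypes.c_int32)).contents

print(hex(as_int.value))  # the bit pattern of 1.0

# Both names refer to the same memory: writing new bits changes the float.
as_int.value = 0x40000000
print(value.value)        # 2.0, since 0x40000000 is the bit pattern of 2.0
```

Whether this counts against Python in the sense discussed here is debatable, since the reinterpretation happens in a foreign-function library and never touches a regular Python object.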

What we can do is extend the definition of weakly typed language to something
like: "A weakly typed language is a language that often uses incorrectly some
piece of data by applying to it the wrong type". Such a definition would include
any language that is too liberal with type coercion, like PHP.

Don't get me wrong -- I do understand your point. In every case I can think of,
there is no reason to want weak-typing (PL theory definition) in a
dynamically-typed language. On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.

Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is the *correct*
response given the PHP language definition.

In addition, you *can* create a statically-typed language that is strongly typed
(PL theory definition) but also very liberal with type coercion. What would you
call such a language? Since being liberal with type coercion and allowing bit
reinterpretation are orthogonal, why not keep the two separate terms?

My issue here is that I don't think we should confuse an already easily confused
term by giving it a second meaning. If there aren't any dynamically typed
languages that are also weakly-typed, that's ok -- it doesn't mean we should
change the meaning of "weakly-typed" for these languages.

Steve

Steven Bethard · Nov 2, 2004

Christophe Cavalaria said:
If by ML you think of OCaml you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process because the unmarshal
function trusts the caller to cast the result in the good type.

Thanks, I knew I'd read something like that somewhere. Totally surprised me too
'cause I figured that, of all people, ML folks would be the most afraid of a
module like this. =)

Steve

Christophe Cavalaria · Nov 3, 2004

Steven said:
Christophe Cavalaria said:

One could say that the common definition of weakly typed languages
cannot apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type.


While I agree that it would be kind of foolish to implement a dynamically
typed language that wasn't strongly-typed (PL theory definition), there's
no reason you *couldn't* -- your language would just need to provide some
sort of 'cast' function that did the conversion at runtime.

For example, say the memory block for an object instance was allocated
like:

[0] pointer to class object
[1] pointer to dictionary object (holding variables)

Then maybe you could do something like:
int1, int2 = cast(object(), list(int))
Given, I can't see any use for such behavior, but you *could* do it if you
had a weakly-typed dynamic language and you knew how it allocated objects. At
some level, you always have bits -- whether it makes any sense to
reinterpret them depends on exactly what the bits originally meant.

In any good dynamically typed language, the object must know what it is, thus
there is no way to do a reinterpret_cast like in C or C++. It is
meaningless. Doing it anyway is insane, as you have pointed out. Its only goal
is to add the flaws of weakly typed static languages to a
contrived example.

Don't get me wrong -- I do understand your point. In every case I can
think of, there is no reason to want weak-typing (PL theory definition) in
a dynamically-typed language. On the other hand, I haven't really seen any
good cases for wanting weak-typing in a statically-typed language either.

Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is the *correct*
response given the PHP language definition.

Incorrectly is a view of the mind in that case. From my point of view it is
incorrect. And you could argue that taking a float * and casting it into an
int * gives us a predictable behaviour. It'll still be wrong to do it
unless it was exactly what you wanted to do. And in that case you can
create explicit language constructs to do the trick.

In addition, you *can* create a statically-typed language that is strongly
typed (PL theory definition) but also very liberal with type coercion. What
would you call such a language? Since being liberal with type coercion and
allowing bit reinterpretation are orthogonal, why not keep the two separate terms?

My issue here is that I don't think we should confuse an already easily
confused term by giving it a second meaning.

I see your point but I must add that I wasn't giving that term a second
meaning. Just like in mathematics when you take a theory and you create a
new theory that encompasses the old one, I was giving a new definition for
the term.

Christophe Cavalaria · Nov 3, 2004

Steven said:
Thanks, I knew I'd read something like that somewhere. Totally surprised
me too 'cause I figured that, of all people, ML folks would be the most
afraid of a module like this. =)

Steve

Marshaling (Python calls it pickling) is something needed in the
standard library of any good language. Too bad for them that the OCaml
language makes it impossible to implement.
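For comparison, a small sketch of the Python side using the standard pickle module. The helper name roundtrip is made up for this example. The pickled bytes carry enough type information for pickle.loads to rebuild each object, so the caller never has to state which type it expects back.

```python
import pickle

def roundtrip(obj):
    """Serialize and deserialize obj; the type is recovered from the pickled data."""
    return pickle.loads(pickle.dumps(obj))

for original in (42, "forty-two", [4, 2]):
    restored = roundtrip(original)
    # Print the recovered type and whether the value survived unchanged.
    print(type(restored).__name__, restored == original)
```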

Steven Bethard · Nov 3, 2004

Christophe Cavalaria said:
Incorrectly is a view of the mind in that case. From my point of view it is
incorrect. And you could argue that taking a float * and casting it into an
int * gives us a predictable behaviour. It'll still be wrong to do it
unless it was exactly what you wanted to do. And in that case you can
create explicit language constructs to do the trick.

The problem is, "exactly what you wanted to do" varies from programmer to
programmer. Some programmers may actually want "a" + 10 == 10. I can imagine
code that takes a string as input and adds its integer value to an int counter.
Invalid input should not increment the counter and may be silently ignored. I
would not write code this way, but some people would, and would want the code to
work the way PHP does.
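In Python that behavior has to be spelled out explicitly. A minimal sketch, with add_lenient as a hypothetical helper name; note that int() is not identical to PHP's string-to-number rules, so this only mirrors the "invalid input adds nothing" part:

```python
def add_lenient(counter: int, text: str) -> int:
    """Add the integer value of text to counter; silently ignore invalid input."""
    try:
        return counter + int(text)
    except ValueError:
        # Invalid input leaves the counter unchanged.
        return counter

total = 0
for item in ["3", "a", "7"]:
    total = add_lenient(total, item)
print(total)  # 10
```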

So yes, "incorrectly is a view of the mind", but since we can't know what every
programmer is thinking, and it's extremely unlikely that every programmer will
agree what's correct or incorrect for every example, we have to take the
language definition as the measuring stick for correct or incorrect.

Given that, your definition of weakly-typed:
"A weakly typed language is a language that often uses incorrectly some piece
of data by applying to it the wrong type"
either would not call PHP a weakly-typed language, because the data is not used
incorrectly according to the language definition, or would not know what to call
PHP, because incorrectly cannot be defined in a way that applies correctly to
all programmers.

I prefer not extending "weakly-typed" in this way because it makes the term less
well-defined.

Steve

Greg Ewing · Nov 3, 2004

Diez said:
I'm not a native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

We would say "regularly", "frequently", "habitually", or
something like that. In English, "permanently" means
"once and for all".
