Origin and History of Dot Syntax

Keith H Duggar

Can anyone point me to the origin and history of the dot
syntax for accessing structures?

Were there languages prior to C that used it?
Who invented it?
etc.
 
Tim Rentsch

Can anyone point me to the origin and history of the dot
syntax for accessing structures?

Were there languages prior to C that used it?
Who invented it?
etc.

Possibly it came from Cobol originally. Certainly the dot syntax for
accessing structures was used in PL/I. And C itself was derived
(indirectly) from BCPL, which also used dot for structure access if I
am not mistaken. But I'm fairly sure BCPL came into being only after
the dot syntax for structure access was well established in other
major languages.
 
Martin Ambuhl

Keith said:
Can anyone point me to the origin and history of the dot
syntax for accessing structures?

Were there languages prior to C that used it?

PL/1 used it. The only language I used before PL/1 that had structures
with named elements was COBOL, but that used a different way of
accessing them.
 
Lew Pitcher

Martin said:
PL/1 used it. The only language I used before PL/1 that had structures
with named elements was COBOL, but that used a different way of
accessing them.

To COBOL, the '.' is a statement terminator, sort of like the semi-colon (';')
is to C.

Reference to members of a 'structure' (a "record" in COBOL) is done directly,
as COBOL doesn't have an implicit 'structure model'.

You /can/ have more than one COBOL record using the same data names (and not
necessarily to reference the same type of data, or placed at the same
displacement in the record), and access to these data names is done through
the OF clause.

For instance

01  NAME-RECORD.
    03  FIRST-NAME      PIC X(20).
    03  MIDDLE-INITIAL  PIC X.
    03  LAST-NAME       PIC X(25).

01  DATA-RECORD.
    03  SIN-NUMBER      PIC 9(16).
    03  FIRST-NAME      PIC X(15).
    03  LAST-NAME       PIC X(15).


MOVE LAST-NAME OF NAME-RECORD TO LAST-NAME OF DATA-RECORD.



--
Lew Pitcher

Master Codewright & JOAT-in-training | GPG public key available on request
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.
 
Chris Dollin

Tim said:
Possibly it came from Cobol originally. Certainly the dot syntax for
accessing structures was used in PL/I. And C itself was derived
(indirectly) from BCPL, which also used dot for structure access if I
am not mistaken.

You are mistaken. BCPL doesn't *have* structures but, since all the
data-values are the same size, vectors can (and do) serve much the
same purpose. The vector-indexing operation is ! and may be used infix
(with an implied sum) or prefix.

My memory is shouting at me that "." is just a legal character in
BCPL names, but my reference book is at home, so I can't check.

A later (or possibly Cambridge-specific) extension to BCPL was the
"field selector", allowing bit-slices out of a value; I seem to
recall that this uses the infixed keyword OF.

Pop11 uses . for postfix function application, which makes it look
like structure access, but I don't know off-hand whether this existed
in POP2 or whether it was a later invention based on .-for-structures.
 
Tim Rentsch

Chris Dollin said:
You are mistaken. BCPL doesn't *have* structures but, since all the
data-values are the same size, vectors can (and do) serve much the
same purpose. The vector-indexing operation is ! and may be used infix
(with an implied sum) or prefix.

My memory is shouting at me that "." is just a legal character in
BCPL names, but my reference book is at home, so I can't check.

A later (or possibly Cambridge-specific) extension to BCPL was the
"field selector", allowing bit-slices out of a value; I seem to
recall that this uses the infixed keyword OF.

Interesting. Thank you for the correction.

It's been a *long* time since I looked at any BCPL code. :)
 
William McNicol

Chris said:
You are mistaken. BCPL doesn't *have* structures but, since all the
data-values are the same size, vectors can (and do) serve much the
same purpose. The vector-indexing operation is ! and may be used infix
(with an implied sum) or prefix.

My memory is shouting at me that "." is just a legal character in
BCPL names, but my reference book is at home, so I can't check.

I still have my August 1974 manual on the bookshelf right by my
computer, but it has been a v. long time since I ever wrote (or read)
any BCPL code.

"." is indeed just a legal character in identifiers (which must start
with a letter).

Of course, if one declares identifiers

person.name
person.age
etc,

Then it already looks to our modern eyes like structure access even
though the language didn't actually have structures...

Chris said:
A later (or possibly Cambridge-specific) extension to BCPL was the
"field selector", allowing bit-slices out of a value; I seem to
recall that this uses the infixed keyword OF.

Indeed, using the value of BCPL expression

(SLCT 3:4:5) OF X

would be more or less equivalent to value of the C expression:

(*(X+5) >> 4) & 7

Assigning to such a BCPL expression writes the correct number of low
order bits of the value into the specified bits.

Chris said:
Pop11 uses . for postfix function application, which makes it look
like structure access, but I don't know off-hand whether this existed
in POP2 or whether it was a later invention based on .-for-structures.

Cheers,

William.
 
CBFalconer

William said:
Chris Dollin wrote:
.... snip ...

I still have my August 1974 manual on the bookshelf right by my
computer, but it has been a v. long time since I ever wrote (or
read) any BCPL code.

"." is indeed just a legal character in identifiers (which must
start with a letter).

Of course, if one declares identifiers

person.name
person.age
etc,

Then it already looks to our modern eyes like structure access
even though the language didn't actually have structures...

Exactly. I used this feature in 8080 assembly code to interface
with Pascal (which uses the . for record fields) with several
assemblers which accepted it as just another char. This was in the
mid to late '70s.

Pascal has had the feature since about 1968. I don't know about
Algol.
 
Chris Dollin

CBFalconer said:
William McNicol wrote: [snip]
I still have my August 1974 manual on the bookshelf right by my
computer, but it has been a v. long time since I ever wrote (or
read) any BCPL code.

"." is indeed just a legal character in identifiers (which must
start with a letter).

Of course, if one declares identifiers

person.name
person.age
etc,

Then it already looks to our modern eyes like structure access
even though the language didn't actually have structures...

Exactly. I used this feature in 8080 assembly code to interface
with Pascal (which uses the . for record fields) with several
assemblers which accepted it as just another char. This was in the
mid to late '70s.

Pascal has had the feature since about 1968. I don't know about
Algol.

Algol 60 didn't have structures. Algol 68 (guess) had, if I recall
correctly, OF as the field selector. (Stares abstractedly into space.)
Dunno if I have a '68 book at home. Maybe not.
 
William McNicol

Chris said:
CBFalconer wrote:

William McNicol wrote:
[snip]
I still have my August 1974 manual on the bookshelf right by my
computer, but it has been a v. long time since I ever wrote (or
read) any BCPL code.

"." is indeed just a legal character in identifiers (which must
start with a letter).

Of course, if one declares identifiers

person.name
person.age
etc,

Then it already looks to our modern eyes like structure access
even though the language didn't actually have structures...

Exactly. I used this feature in 8080 assembly code to interface
with Pascal (which uses the . for record fields) with several
assemblers which accepted it as just another char. This was in the
mid to late '70s.

Pascal has had the feature since about 1968. I don't know about
Algol.


Algol 60 didn't have structures. Algol 68 (guess) had, if I recall
correctly, OF as the field selector. (Stares abstractedly into space.)
Dunno if I have a '68 book at home. Maybe not.

Well, on the same shelf, I still have my Algol68-R User guide....

'selector OF primary' is exactly what they had, so

name OF person etc...

Cheers,

William.
 
Albert van der Horst

CBFalconer said:
William McNicol wrote: [snip]
I still have my August 1974 manual on the bookshelf right by my
computer, but it has been a v. long time since I ever wrote (or
read) any BCPL code.

"." is indeed just a legal character in identifiers (which must
start with a letter).

Of course, if one declares identifiers

person.name
person.age
etc,

Then it already looks to our modern eyes like structure access
even though the language didn't actually have structures...

Exactly. I used this feature in 8080 assembly code to interface
with Pascal (which uses the . for record fields) with several
assemblers which accepted it as just another char. This was in the
mid to late '70s.

Pascal has had the feature since about 1968. I don't know about
Algol.

Algol 60 didn't have structures. Algol 68 (guess) had, if I recall
correctly, OF as the field selector. (Stares abstractedly into space.)
Dunno if I have a '68 book at home. Maybe not.

Indeed Algol 60 was a very early language, and lacked structures
as FORTRAN did.
As for Algol 68, they did most everything right, a very usable
language till this day. I did a seminar about automata theory
that was attended by van Wijngaarden too. University of Utrecht
the Netherlands, 1969.
For sure. `of' is a keyword. `` complex.x '' (c-code) becomes
`` x OF complex'' (Algol68) with uppercase stropping, i.e. all
keywords in uppercase. (Published code mostly has keywords in
bold, very neat.)

While I wasn't looking a neighbour implemented an Algol68 compiler,
available for GNU Linux, http://www.xs4all.nl/~jmvdveer
or google for algol68g. Missing still are the parallel processing
constructions, but the implementation is recent and in full
development swing. (Parallel constructions are mostly there to
allow the compiler to optimise, could be interesting for
SMP.)

Groetjes Albert

--
 
Chris Dollin

Albert said:
CBFalconer said:
William McNicol wrote: [snip]
I still have my August 1974 manual on the bookshelf right by my
computer, but it has been a v. long time since I ever wrote (or
read) any BCPL code.

"." is indeed just a legal character in identifiers (which must
start with a letter).

Of course, if one declares identifiers

person.name
person.age
etc,

Then it already looks to our modern eyes like structure access
even though the language didn't actually have structures...

Exactly. I used this feature in 8080 assembly code to interface
with Pascal (which uses the . for record fields) with several
assemblers which accepted it as just another char. This was in the
mid to late '70s.

Pascal has had the feature since about 1968. I don't know about
Algol.

Algol 60 didn't have structures. Algol 68 (guess) had, if I recall
correctly, OF as the field selector. (Stares abstractedly into space.)
Dunno if I have a '68 book at home. Maybe not.

Indeed Algol 60 was a very early language, and lacked structures
as FORTRAN did.
As for Algol 68, they did most everything right,

Implicit dereferencing was a clever abomination which, thankfully, only
one later language (as far as I know) succumbed to, and that only in
carefully controlled circumstances.

They botched lexical closures.

I think we're sufficiently far off-topic to stop now.
 
Albert van der Horst

Chris said:
Implicit dereferencing was a clever abomination which, thankfully, only
one later language (as far as I know) succumbed to, and that only in
carefully controlled circumstances.

In my opinion the rvalues and lvalues and sequence points of C are the
abomination, and indeed Algol68 was very careful and theoretically sound
to explain what happens if you say
a := a + b ;
In C you have
a = a + b
and it is theoretically messy and very ad hoc that one a is an lvalue
and one a is an rvalue.
In algol68 it says: a is the name of a place where you can store
a number. You can't add two places where you can store numbers.
So we fetch the numbers for you.
This is called coercion.

Actually c does some kind of coercions :
{
float b=1.;
float c;
c = 1 + b;
c = 1;
}
You can't add an int and a float. So another ad hoc rule is introduced:
both are converted to a double first. (I know, it has since been changed
into another ad hoc rule.) In the second assignment to c, we have
again a similar situation. Suddenly it is called an "implicit cast".

Algol 68 said, as a general rule that an integer can be coerced into
a float and explains when and why this can be done. At least for me
the theoretical frame work was sound, and I found it much easier to
remember what an actual language construct does, than in c.

Chris said:
They botched lexical closures.

This got much publicity, because they were so proud that the
original report didn't have a single omission or unclarity.
Still it was a rather obscure thing, not likely to occur
in a practical program.
I said they did "most" things right. At a very early stage.
I don't retreat from that.

C has never recovered from the way you have to define types
(inside out), which makes expressions like
const * * const char * const p; 1)
very unintuitive.
What is worse, it survives in successors like Java and C++.
I can't find a language mistake of this magnitude in Algol68.

Chris said:
I think we're sufficiently far off-topic to stop now.

I do my best to keep it on topic. Comparing standard-c to
other languages is on topic, as far as I know.

1) This is a caricature, I can't be bothered to insert a
valid expression here, or maybe it is.

Groetjes Albert
 
Lawrence Kirby

Interesting, I've seen it suggested that C doesn't need the -> operator
because . could be given this behaviour if its left operand has pointer
to structure or union type.

Albert said:
In my opinion the rvalues and lvalues and sequence points of C are the
abomination, and indeed Algol68 was very careful and theoretically sound
to explain what happens if you say
a := a + b ;
In C you have
a = a + b
and it is theoretically messy and very ad hoc that one a is an lvalue and
one a is an rvalue.
In algol68 it says: a is the name of a place where you can store a
number. You can't add two places where you can store numbers. So we
fetch the numbers for you.
This is called coercion.

C isn't all that different. Based on the standard, in the expression
a = a + b both of the a's are lvalues. As an operand of + the 2nd a is
converted to the value of the object it designates.

Albert said:
Actually c does some kind of coercions :
{
float b=1.;
float c;
c = 1 + b;
c = 1;
}
You can't add an int and a float. So another ad hoc rule is introduced:
both are converted to a double first. (I know, it has since been changed
into another ad hoc rule.) In the second assignment to c, we have
again a similar situation. Suddenly it is called an "implicit cast".

These are all called conversions. There is no such thing as an "implicit
cast" in C, a cast is an operator of the form (type)expr and there is
nothing implicit about it. I have heard the term "implicit cast" used
informally but it is not terminology used in the language definition, the
correct term is "conversion".

Albert said:
Algol 68 said, as a general rule that an integer can be coerced into a
float and explains when and why this can be done. At least for me the
theoretical frame work was sound, and I found it much easier to remember
what an actual language construct does, than in c.

C has what it calls the "Usual Arithmetic conversions". Admittedly these
have grown a bit unwieldy as the language developed, but at least the
rules that involve floating point types aren't too bad. Rules relating to
just integers are a mess though.

....

Albert said:
C has never recovered from the way you have to define types (inside
out), which makes expressions like
const * * const char * const p; 1)
very unintuitive.

Also invalid. Maybe a declaration like

const char **const *const p;

The things closest to p are the things that most directly define it - here
p is a const pointer to something. The something is a const pointer to
pointer to const char. You could even write it as for example

const char **const (*const p);

The vast majority of declarations are simpler than that. Perhaps the most
awkward part is that the declarator syntax includes both prefix and
postfix elements (arrays and function declarators). Perhaps if *
(indirection) was a postfix operator and declarations had a corresponding
postfix syntax for it, things would be simpler. You wouldn't need the ->
operator as ptr->member would be naturally ptr*.member. However declaring
a function that returns a pointer to char would be like

char func(arg) *;

which puts the * well away from things it relates to, especially if
there is a large parameter list.

Albert said:
What is worse, it survives in successors like Java and C++.

In Java the form is quite limited due to the lack of explicit pointers,
perhaps making it more manageable.

Albert said:
I can't find a language mistake of this magnitude in Algol68.

You may be right, but it has its charm, perhaps even some advantages.
I don't remember seeing a proposal of how a more conventional declaration
syntax might be integrated into C. I've yet to be convinced that C's way
is such a huge mistake. It seems to work well enough in practice.

Lawrence
 
Lew Pitcher

Lawrence said:
Interesting, I've seen it suggested that C doesn't need the -> operator
because . could be given this behaviour if its left operand has pointer
to structure or union type.

From Chapter 6 ("Structures") of "The C Programming Language" by Brian W.
Kernighan & Dennis M. Ritchie (Copyright 1978 by Bell Telephone Laboratories)

<quote>

The declaration

struct date *pd;

says that pd is a pointer to a structure of type date. The notation
exemplified by

pd->year

is new. If p is a pointer to a structure, then

p->member-of-structure

refers to the particular member. (The operator -> is a minus sign followed
by >.)

* Since pd points to the structure, the year member could also be referred to
* as
*
* (*pd).year
*
* but pointers to structures are so frequently used that the -> notation is
* provided as a convenient shorthand.

</quote>

So, it appears that it's more than a suggestion :)

[snip]


--
Lew Pitcher
IT Consultant, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
 
Arthur J. O'Dwyer

Lawrence said:
Interesting, I've seen it suggested that C doesn't need the -> operator
because . could be given this behaviour if its left operand has pointer
to structure or union type.

From Chapter 6 ("Structures") of "The C Programming Language" by Brian W.
Kernighan & Dennis M. Ritchie (Copyright 1978 by Bell Telephone Laboratories) [...]
* (*pd).year
</quote>

So, it appears that it's more than a suggestion :)

You misunderstood. That's an explicit dereference in real C. Lawrence
was saying that you could, potentially, in a language that would not be C
as we know it today, write

pd.year

and get the same effect as the expression

pd->year or the equivalent (*pd).year

in C. This is because there's no ambiguity in such an expression; since
we know 'pd' is of type "pointer to struct date," we know that we must
dereference it down to a struct type before applying the dot operator.
Algol went that route, but C didn't.
I suppose the rationale was that since pointers and structs are vastly
different beasts in C (what with explicit memory management and pointer
arithmetic and so on), it ought to be clear whether 'pd.year' is accessing
a struct member or dereferencing a (possibly aliased) pointer. I tend to
agree with this line of reasoning.

The all-out Algol approach (AIUI) wouldn't work in C because it would
make constructs like

int *x, *y;
int *z = x+y;

ambiguous: Are we implicitly dereferencing 'x', 'y', or both of them?

-Arthur
 
Chris Dollin

Albert said:
In my opinion the rvalues and lvalues and sequence points of C are the
abomination,

The sequence points are a separate issue, of course, and play little
(if any) part in the lvalue/rvalue distinction.

Albert said:
and indeed Algol68 was very careful and theoretically sound
to explain what happens if you say
a := a + b ;
In C you have
a = a + b
and it is theoretically messy and very ad hoc that one a is an lvalue
and one a is an rvalue.

It is neither messy nor adhoc. It's just a rule. It's a rule that
caters to the common case and requires making the uncommon case
explicit, as opposed to Algol 68, which recognises the common case
most (but not all - and I *have* been bitten by this) of the time
by careful (complicated and extensive) balancing of the coercion
rules.

Albert said:
In algol68 it says: a is the name of a place where you can store
a number. You can't add two places where you can store numbers.
So we fetch the numbers for you.
This is called coercion.

Yes, I understand the mechanism.

Albert said:
This got much publicity, because they were so proud that the
original report didn't have a single omission or unclarity.

Achieved (he said rather unfairly, but he's still saying it)
by making the report as a whole unclear. Well, opaque anyway.

Albert said:
Still it was a rather obscure thing, not likely to occur
in a practical program.

Pah. Hundreds of Common Lisp, Scheme, Pop11, Smalltalk, and Simula
programmers fall over laughing hysterically.

It's only "not likely to occur" if you don't let people do it.

Albert said:
I said they did "most" things right. At a very early stage.
I don't retreat from that.

C has never recovered from the way you have to define types
(inside out), which makes expressions like
const * * const char * const p; 1)
very unintuitive.

With respect, almost any way of expressing this type is likely
to be unintuitive, typedef or no typedef. (After a minor correction
to

char const * * const * const p

it's a legal type expression.)

Albert said:
What is worse, it survives in successors like Java and C++.

C++ I'll grant you, but not Java - the type syntax is so small
that the placement of [] is hardly an issue.

Albert said:
I can't find a language mistake of this magnitude in Algol68.


I do my best to keep it on topic. Comparing standard-c to
other languages is on topic, as far as I know.

I think not, actually.

Albert said:
1) This is a caricature, I can't be bothered to insert a
valid expression here, or maybe it is.

Nearly, it was.
 
Chris Dollin

Lawrence said:
Interesting, I've seen it suggested that C doesn't need the -> operator
because . could be given this behaviour if its left operand has pointer
to structure or union type.

That would be a much more controlled use of implicit dereferencing than
Algol 68's. (Which essentially keeps variables as `ref T` values as long
as it can, until it can only satisfy a type constraint by removing a `ref`.
The language can't distinguish between a variable and a constant reference.)
 
