Dennis Ritchie -- An Appreciation

Keith Thompson

nroberts said:
Because in computers, memory is one large array like structure.
That's the fundamental abstraction that our hardware attempts to
create.
[...]

Memory needn't be monolithic like that. That's certainly a common
organization, but it's not the only one, and it's not one that's imposed
or even strongly encouraged by C's memory model.

In C, given two distinct declared objects:

int x;
int y;

there is no defined relationship between their locations in memory.
It's typically the case that (&x < &y || &x > &y) is true, but the
behavior of attempting to evaluate that expression is undefined.

Memory within each object is contiguous (in the sense that it can
consistently be treated as a 1-dimensional array of bytes). No such
relationship necessarily exists between distinct objects.
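A minimal sketch of both points, assuming a hosted implementation that provides the optional uintptr_t type. The helper names byte_sum, same_object and address_before are made up for illustration. Viewing an object as bytes and testing pointers for equality are well defined; ordering the addresses of distinct objects is only possible after converting to an integer, and the resulting order is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Any object can be inspected as an array of unsigned char:
   add up the n bytes of the object that p points to. */
unsigned long byte_sum(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    unsigned long sum = 0;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += bytes[i];
    return sum;
}

/* == and != are well defined even for pointers to distinct objects;
   the relational operators < and > are not. */
int same_object(const int *a, const int *b)
{
    return a == b;
}

/* Comparing the converted integers is allowed, but the order is
   implementation-defined and says nothing portable about layout. */
int address_before(const int *a, const int *b)
{
    return (uintptr_t)(const void *)a < (uintptr_t)(const void *)b;
}
```

Calling byte_sum(&x, sizeof x) stays inside one object, so it is fine on any implementation; address_before(&x, &y) compiles and runs, but which answer it gives is up to the implementation.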
 
nroberts

Because in computers, memory is one large array like structure.
That's the fundamental abstraction that our hardware attempts to
create.

[...]

Memory needn't be monolithic like that.  That's certainly a common
organization, but it's not the only one, and it's not one that's imposed
or even strongly encouraged by C's memory model.

In C, given two distinct declared objects:

    int x;
    int y;

there is no defined relationship between their locations in memory.
It's typically the case that (&x < &y || &x > &y) is true, but the
behavior of attempting to evaluate that expression is undefined.

Memory within each object is contiguous (in the sense that it can
consistently be treated as a 1-dimensional array of bytes).  No such
relationship necessarily exists between distinct objects.

OK, I'll bite.

How does this save the argument that these are C specific issues?

To answer I'd expect that you can come up with a logical framework by
which an object does not have to be contiguous, or appear to be. Or,
you could come up with a logical framework in which the layout of
objects within objects is not significantly altered in use when you
invert the relationship. Meaning that something like

struct employee { string name; int age; } employees[];

always makes as much sense as:

struct employees { string names[]; int ages[]; };

To me, the grouping of data here is not equivalent and my opponents
are telling me that this is a **C language specific issue**.

I can see why one might prefer the latter for performance reasons, and
I've never disputed this. But that is not the argument being made.
I've also never disputed that you can come up with a higher
abstraction, which will cost more on the standard computer, through
which you could operate on either structure with the same interface.

The argument being made is that these two methods of data organization
could be logically equivalent if it wasn't written in C. I for one
cannot think of any way in which it is possible to make that true and
I've been repeatedly asking those who made this claim to tell me how
it could, only to receive abstract interfaces and performance criteria
to prefer the latter version in reply.

If you can show how data grouping and hierarchy become non-issues in
environments unlike the standard, von Neumann-esque memory model I'm all
ears.

To reiterate what this argument is even about, it started with this
statement by Richard Harter:

As a side note, there is a subtle bias in C in favor of arrays of
structs rather than structs of arrays. If we are using indices rather
than pointers it doesn't affect the code much. Thus

a[i].x /* array of structs vs */
a.x[i] /* struct of arrays */

are much the same. However if we use pointers the choice matters.
Frex suppose we are walking through x to find a particular record.
With an array of structs we have something like:

for (ap=a;ap<ap_end;ap++) {
    if (f(ap->x)) calc(ap);
}

Writing the equivalent code using a struct of arrays is not quite so
simple. :)

END
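For comparison, here is a sketch of the two loops side by side. The record layout, the predicate f and the use of a running total in place of calc are all assumptions made for the example. The point it illustrates is that with an array of structs one pointer names a whole record, while with a struct of arrays a record is named by the container plus an index.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* hypothetical record count */

struct rec  { int x; int y; };        /* element of an array of structs */
struct recs { int x[N]; int y[N]; };  /* struct of arrays */

static int f(int x) { return x > 1; } /* hypothetical predicate */

/* Array of structs: one pointer identifies a whole record. */
int sum_matching_aos(const struct rec *a, size_t n)
{
    int total = 0;
    for (const struct rec *ap = a; ap < a + n; ap++)
        if (f(ap->x))
            total += ap->y;       /* stands in for calc(ap) */
    return total;
}

/* Struct of arrays: a pointer into x[] identifies only one field,
   so the record is reached through the container and an index. */
int sum_matching_soa(const struct recs *r, size_t n)
{
    int total = 0;
    assert(n <= N);
    for (size_t i = 0; i < n; i++)
        if (f(r->x[i]))
            total += r->y[i];     /* stands in for calc(r, i) */
    return total;
}
```

Neither loop is hard to write; the difference is in what has to be handed to the function that processes a record.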

I've tried repeatedly to get someone to explain why that's even
remotely a C specific issue and again all I've heard so far is that,
"You could have an abstraction that..." and, "Sometimes it's faster to
do..." and every time I try to explain why that's got nothing really
to do with C someone comes along with some completely new, unrelated
thing. Can you get it back on topic and show why a hypothetical non-
preference for the standard model can salvage their argument?

Actually, some have even made the claim that an abstraction on top of
this that hid the difference between the two organization methods
would be impossible to write in C. I don't expect you to be able to
salvage that one though.
 
nroberts

I'm going to try to get this argument back on track to see if there's
anything to glean from it or if my initial assumption, that people
were just making crazy statements to win an argument, is in fact true.

It's been said that there is a bias in C that favors arrays of structs
rather than structs of arrays. The basis of this statement has so far
seemed to be that there are two ways, among others, that one might
organize a list of records:

1. As an array of records:

struct record
{
type0 field0;
type1 field1;
...
} records[];

2. As a structure of arrays:
struct
{
type0 field0[];
type1 field1[];
...
} records;

Alternatives that have not been brought up are things like having
individual variables for each element of each record, random placement
in a blob, some sort of scattered matrix, quantum entanglement...etc.
So the premise at this point would be that you are organizing data in
a hierarchy of containers and contents, otherwise there's nothing to
discuss.

For the claim that there is a bias in C toward #1 over #2 there must
be a way in which this preference does not exist outside of C
somewhere. My assumption to date, which seems clear to me at least,
is that no such environment exists or can exist given our
understanding of logic and computability.

A couple of valid attempts have been made to show that such an
environment is possible:

1. You could invent a "language X" that has some other form of
container that pretty much logically resembles the first case, but
uses a functional syntax such that you could implement it internally
with the latter version (though why you would is questionable since
this is not exposed and so can't be taken advantage of).

2. There are cases WITHIN C in which the supposed bias is inverted,
such as when you want performance gains for functions that operate on
the entire record set, but only on one field...and also because
expressing a generic function that can operate on a particular,
supplied field within a generic record (like a generic sum function
that could sum up ages within a list of person records) is so
complicated that you generally don't. In my opinion this is a very
valid argument why this idea that C is causing any bias to #1 is in
error, but I've let that slide hoping for refinement.

I have responded to both by saying that "language X" can be written in
C (this seems to be a debatable proposition but nobody's said why so
far) since it's just a higher abstraction than was originally proposed
and that sure, one might prefer the less obvious organization in some
situations due to the performance gains you might get in the standard
memory model, but that these are not issues specific in any way to C.

The newest argument seems to be that in non-standard memory models,
which C kind of assumes in some ways, language X could take advantage
of some yet unspecified memory addressing method that C would be
incapable of. I concede that this could possibly be true but I don't
believe this does anything to the problem either. Remember that it is
assumed at this point that you're attempting to organize data into a
container/contained relationship, which is an issue that I believe
would still come up with ANY KIND of abstraction one could possibly
come up with. There is always going to be a selection of logical
organizations one can use for their data and which one picks is an
issue of the data and its use, but has nothing to do with syntax or
memory model or even if objects exist in memory in any kind of
recognizable form we currently expect, or anything that could be said
to be a component of C specifically. So long as there exists a
mapping between our logical view and the memory model we don't
actually care what that is unless we're trying to exploit it, which
again won't be language specific.

I don't believe this begs the question. Without a need to organize
data into container and contained there is no sense by which it can be
said that C has a bias toward one method or another. We must be
trying to organize data in some sense for there to be anything to
discuss. If we were simply talking about random values spattered
about in a completely decoupled manner, then C actually prefers
independent variables..and so would any other language we could think
of. We're not though. If we were going even further and talking
about a system in which objects themselves are not coherent and
mappable to logical constructs then I think the entire idea of
computing goes kaput.

It is true that we could possibly come up with some memory model that
is not standard in which the cost of abstracting a "language X" is not
paid because it's actually closer to the machine's memory model than
structures and arrays possibly. I don't believe this solves the issue
either though because we can push language X into reductio ad absurdum
by proposing its obvious extension: field(records_blob, index_0,
index_1, field_name). This expresses a hierarchy. Does it make more
sense to iterate by index_0 or index_1? Under what conditions? At
what point do we decide that instead of having index_0 through index_N
do we actually return some sort of container as our data type? Will
it make as much sense then to invert the relationship and if not does
that constitute a bias in the language?

Working with simple examples has hidden the bigger, higher level issue
to some people it seems.

Another issue here is that "language X" is incomplete. We've not
devised a declarative syntax for it. If we assume that either
declaration we might use in C can be used in "language X" and then
manipulated through the field syntax, is there any reason at all
anymore to use the second version? The whole argument for even using
the second version has been entirely dependent upon the nature of C
and the standard memory model. Does not the very structure of the
data make the first version more understandable for humans? If so,
how could this be a language issue? If not, please explain.

Even further, the field() syntax proposed in "language X" seems to
mimic the first version more than the second. Given the freedom to
evolve a completely new language that avoids the biases in C we've
created something that very much resembles that which is being said is
C's bias. The whole motivating factor for our attempt to abstract is
not, "How can we get C to make #1 look more like #2," but, "How can we
get C to give us the speed advantages and generic re-usability we want
from #2, but a logical structure that's more resembling #1 because
that's what makes more sense to us." Unfortunately, the answer isn't
so easy when it comes to composing generic functions because this is
something that C is simply not good at. The speed issue on the other
hand can be taken care of by providing macros that turn a #2
implementation into something resembling language X.
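A rough sketch of such a macro, with the caveat that FIELD, struct people and total_age are names invented here and not anything proposed earlier in the thread. The macro reads like record-first access (#1) but expands to column-first storage (#2).

```c
#include <assert.h>

#define MAX_RECORDS 8  /* hypothetical capacity */

struct people {        /* layout #2: a struct of arrays */
    int    age[MAX_RECORDS];
    double salary[MAX_RECORDS];
};

/* Hypothetical field() accessor. It expands to an lvalue, so it can
   be used on either side of an assignment. */
#define FIELD(recs, i, name) ((recs).name[(i)])

/* Sum one field over the first n records. */
int total_age(const struct people *p, int n)
{
    int total = 0;
    assert(n >= 0 && n <= MAX_RECORDS);
    for (int i = 0; i < n; i++)
        total += FIELD(*p, i, age);
    return total;
}
```

Since the expansion happens at compile time there is no run-time cost over writing p->age[i] directly, but the macro does nothing for the generic-function problem: total_age still has to name the field.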

Now, if something I've just said is wrong then please explain how. If
your only reply is that "nobody's saying ..." or some other red
herring then I'm not interested; if you don't think anyone's saying it
then there's no reason to argue about it...if the fact that the very
next person contradicts you isn't clue enough. If you've legitimately
got a reason why C specifically causes a bias toward the first
implementation above then I'm all ears. Even if you've got a
legitimate reason why "C like" languages have this bias I'm all
ears...what possible language doesn't?
 
Ben Bacarisse

nroberts said:
Perhaps you came in late and are unfamiliar with the original
statements and just found yourself on the wrong side of a discussion.
You weren't the one to originally claim that C syntax creates these
biases but you ended up arguing with me so I assumed you were
supporting it.

I think I've understood the origins of this thread. To be sure, I've
just reviewed it and I've not come up with any surprises.
Because in computers, memory is one large array like structure.
That's the fundamental abstraction that our hardware attempts to
create.

I disagree with this, but that could lead to a whole new (and off topic)
discussion. Even if this is universally true, are you saying that it is
why C links logical structure to memory layout? If so I don't see the
connection. My own view is that C does this because it is a low-level
language designed to be "close to the machine".
Which is why quite a while back I argued that what people claiming
that C's syntax causes a bias like this are actually arguing for is
abstractions. The further away you get from the fundamental
abstraction that the hardware provides, a long array, the more
operations have to be conducted to project the "higher" abstraction
onto the "lower" one. This is why languages like "language X" cost
more in terms of performance. C is an attempt to provide just enough
abstraction to be useful across all computers, which means it is as
close to the underlying abstraction as necessary.

There may be a cost, but it is not inevitable. In general, higher-level
languages have higher costs, but what has been called "language X" was
suggested by me because it would, in some cases, *reduce* run-time costs
(and there is no compulsion to use its non-C features where they don't
improve performance).

It's incidental to this argument, but C, quite correctly, already distances
itself from the "large array" abstraction. It does so because a large
array is *not* the view of memory that all hardware provides, and
the designers of C wanted to include such hardware as possible targets.
Actually, C doesn't insist on this at all, it's just the simplest
mapping.

I mean contiguous as far as C's view of memory is concerned -- that you
can choose to view any object as a char array whose elements have
consecutive addresses. I don't think it matters to either of our points
of view how these are mapped by the hardware. The programmer sees only
one view of objects -- as contiguous regions of storage. Any access
that violates this model can't be supported by C's model of objects.

This was my original point. You could add the syntax to, say, access a
column from an array of rows, but such a thing could not be an object
like any other -- you could never pass a pointer to it to a function
that operates on an ordinary C array, for example.
This is an incorrect view of higher level languages. Since computers
are what they are, every language inevitably links logical structure
to memory layout. It has to. It can provide an interface that does
not, but that interface comes at a cost and underneath you can be sure
that it projects that interface into the hardware's abstraction in a
very determined manner, just like C does.

Maybe we are using the same words in different ways. Let's use an
example: I want an array of cons cells for my Lisp interpreter. The
logical way to define this is

struct cell { int car, cdr; } heap[1000];

If I have a function

int find(int array[], int needle);

that can search an array for a particular value, C does not permit me to
speak of the whole array of car values in the heap. First, it has no
syntax for this (maybe heap[].car?) but, more significantly its object
model would not permit it.
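To make the limitation concrete, here is a sketch of the usual workaround: a search that is told the stride between elements. The function find_strided and its parameters are hypothetical. It can walk the car values in place, but only because it was written for strided data; an ordinary function that expects a contiguous int array still cannot be handed that column.

```c
#include <assert.h>
#include <stddef.h>

struct cell { int car, cdr; };

/* Hypothetical strided search. base points to the first of n records,
   each stride bytes long, with the int of interest at byte offset
   offset inside every record. Returns the index of the first record
   whose int equals needle, or -1 if there is none. */
int find_strided(const void *base, size_t n, size_t stride,
                 size_t offset, int needle)
{
    const unsigned char *p = base;
    assert(stride >= offset + sizeof(int));
    for (size_t i = 0; i < n; i++) {
        const int *elem = (const int *)(const void *)(p + i * stride + offset);
        if (*elem == needle)
            return (int)i;
    }
    return -1;
}
```

A call such as find_strided(heap, 1000, sizeof heap[0], offsetof(struct cell, car), needle) searches the car column; passing a plain int array with stride sizeof(int) and offset 0 searches that too.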

Given what you've written above, I suspect we agree here. Your use of
"link" might be "there must be *some* correspondence" and mine is
"there is a restricted correspondence".
Because C doesn't provide higher abstractions. The abstractions you
are looking for CAN be created in C though and the lower level
abstractions that it provides are a nice way to describe the mapping.
In fact, most languages that we can think of originally start by being
written in a language like C, if they ever even go past that point and
become self-hosting.

On the other hand, what you seem to want, and I agree that it would be
quite difficult to provide in C (or ANY language) is an abstraction
that can, through the way you use it, interpret what the best layout
would be. This is some sort of super-optimizer and indeed would
require a lot of research to provide. The easiest method would
probably be to invent a new language and provide a compiler that did
this; you'd probably write it in C or C++. On the other hand, for
most of the problems you'd attempt to solve with this abstraction it
would probably be best to let the programmer decide what to turn on
and when. This would, at least, be a lot simpler to implement and
would solve most real-world issues. This is probably why I can't
think of a single language that implements anything like what you seem
to expect from C.

I don't expect it all. I posted to say why it is more than a case of
tweaking C's syntax.

Also, I think I posted that I doubt the pay-off of being able to choose
the layout would be worth the effort due to the rarity of cases where it
matters. If that is true, the effort of doing it automatically is
surely beyond the pale. Just because I make a point on one side of a
debate does not mean that I am dedicated to the cause!

I am in favour, however, of languages that provide highly expressive
constructs, and being able to refer to both columns and rows of a matrix
in equally flexible ways is a big advantage. Algol 68 did this. I am
less sure about doing the same for structs of arrays and arrays of
structs, but it seems to be a nice feature.
Something similar that perhaps actually does provide what you'd
expect, would be a state based template pattern that switches
implementation based on statistics of its use to date. This would be
both more powerful and less powerful than a new language with a really
smart optimizer...depending on your needs. This, and anything more
determinable, can be made in C without much difficulty.

I think we have different metrics for difficulty! For one thing, I
would factor in the cost of lost expressiveness -- having a flexible
layout (in C) would affect how the objects can be referenced, would it
not?
 
nroberts

Maybe we are using the same words in different ways.  Let's use an
example: I want an array of cons cells for my Lisp interpreter.  The
logical way to define this is

  struct cell { int car, cdr; } heap[1000];

If I have a function

  int find(int array[], int needle);

that can search an array for a particular value, C does not permit me to
speak of the whole array of car values in the heap.  First, it has no
syntax for this (maybe heap[].car?) but, more significantly its object
model would not permit it.

Let's assume I agreed with all that 100% (truth is more like 80-90 but
whatever).

Could you explain to me again why this serves as a reason that C has a
bias toward constructs similar to what you have as "cell" above rather
than something like:

struct
{
int cars[1000];
int cdrs[1000];
} cells;
 
BartC

nroberts said:
I'm going to try to get this argument back on track to see if there's
anything to glean from it or if my initial assumption, that people
were just making crazy statements to win an argument, is in fact true.

It's been said that there is a bias in C that favors arrays of structs
rather than structs of arrays. The basis of this statement has so far
seemed to be that there are two ways, among others, that one might
organize a list of records:

1. As an array of records:

struct record
{
type0 field0;
type1 field1;
...
} records[];

2. As a structure of arrays:
struct
{
type0 field0[];
type1 field1[];
...
} records;

Alternatives that have not been brought up are things like having
individual variables for each element of each record, random placement
in a blob, some sort of scattered matrix, quantum entanglement...etc.
So the premise at this point would be that you are organizing data in
a hierarchy of containers and contents, otherwise there's nothing to
discuss.

If at any point you need a single record as an independent struct object,
then only (1) will give you that. With (2), the fields of a single record
are spread over several arrays.

Perhaps that's the bias that's being talked about.

If there is never any need to be able to pass around an individual, compact
record, then there is no bias. Nor is there really any need to collect all
the parallel arrays into a single struct.
For the claim that there is a bias in C toward #1 over #2 there must
be a way in which this preference does not exist outside of C
somewhere.

OK, so you're disputing not just the existence of a bias, but the fact that
it must be unique to C in order to exist?

Certainly many languages work exactly the same as C, and a few will be able
to extract a single record whether method (1) or (2) is
used, but the internal problems that need to be solved to do that are the
same.

But you seem to be making a big deal out of a simple throw-away remark.
 
Keith Thompson

nroberts said:
I'm going to try to get this argument back on track to see if there's
anything to glean from it or if my initial assumption, that people
were just making crazy statements to win an argument, is in fact true.

It's been said that there is a bias in C that favors arrays of structs
rather than structs of arrays. The basis of this statement has so far
seemed to be that there are two ways, among others, that one might
organize a list of records:

1. As an array of records:

struct record
{
type0 field0;
type1 field1;
...
} records[];

2. As a structure of arrays:
struct
{
type0 field0[];
type1 field1[];
...
} records;

I haven't made any claims about any bias one way or the other.

But I'll mention that (2) is illegal, and (1) is valid only if it's
a declaration, not a definition.

Most arrays, I suspect, have a length that is not determinable
at compile time, and are allocated via malloc() and then accessed
via pointers. (I have no statistics on this.)

If I need N "struct record"s, I can do something like this:

struct record *records = malloc(N * sizeof *records);
if (records == NULL) {
    /* error handling */
}
for (int i = 0; i < N; i++) {
    /* do something with records[i] */
}

With the second approach, you'd need two allocations, one for an array
of type0 objects and another for an array of type1 objects.
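A sketch of what those two allocations might look like. The typedefs for type0 and type1 are placeholders, and struct columns, columns_alloc and columns_free are hypothetical names. The sketch ignores possible overflow in the size multiplication.

```c
#include <assert.h>
#include <stdlib.h>

typedef int    type0;  /* placeholder field types, assumed here */
typedef double type1;

struct columns {
    type0 *field0;
    type1 *field1;
    size_t count;
};

/* Allocate storage for n records as two separate arrays.
   Returns 0 on success; on failure returns -1 and leaves nothing
   allocated. n is assumed to be nonzero. */
int columns_alloc(struct columns *c, size_t n)
{
    assert(c != NULL && n > 0);
    c->field0 = malloc(n * sizeof *c->field0);
    c->field1 = malloc(n * sizeof *c->field1);
    if (c->field0 == NULL || c->field1 == NULL) {
        free(c->field0);   /* free(NULL) is a no-op */
        free(c->field1);
        c->field0 = NULL;
        c->field1 = NULL;
        c->count = 0;
        return -1;
    }
    c->count = n;
    return 0;
}

/* Release both arrays and reset the descriptor. */
void columns_free(struct columns *c)
{
    free(c->field0);
    free(c->field1);
    c->field0 = NULL;
    c->field1 = NULL;
    c->count = 0;
}
```

The extra bookkeeping is modest, but every allocation, failure path and release now comes in pairs, where the array of structs needs one of each.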

Even if you have a fixed number of items (say, 100), there's not a
whole lot of practical difference between

struct record {
type0 field0;
type1 field1;
} records[100];

and

struct {
type0 field0[100];
type1 field1[100];
} records;

-- until you want to deal with individual records. For example, the
array-of-structs method makes it easier to write a function that deals
with a single record:

void func(struct record rec);

whereas the struct-containing-arrays method means that it would have to
be written as:

void func(type0 field0, type1 field1);

If your usage pattern doesn't require that kind of thing, and you
find the struct-containing-arrays method easier to deal with for
some reason, then by all means use it.
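If a function taking a struct record is wanted anyway, the struct-containing-arrays layout can be bridged by copying. In this sketch the placeholder typedefs and the names struct columns, record_get and record_put are assumptions. Note that the gathered record is a copy and not a view, so changes have to be written back explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define COUNT 100

typedef int    type0;  /* placeholder field types, assumed here */
typedef double type1;

struct record  { type0 field0; type1 field1; };
struct columns { type0 field0[COUNT]; type1 field1[COUNT]; };

/* Gather record i into a standalone struct. */
struct record record_get(const struct columns *c, size_t i)
{
    struct record rec;
    assert(i < COUNT);
    rec.field0 = c->field0[i];
    rec.field1 = c->field1[i];
    return rec;
}

/* Scatter a standalone record back into the columns at index i. */
void record_put(struct columns *c, size_t i, struct record rec)
{
    assert(i < COUNT);
    c->field0[i] = rec.field0;
    c->field1[i] = rec.field1;
}
```

With these, func(record_get(&records, i)) works for a function that takes its argument by value, at the cost of one copy per call.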
 
Ben Bacarisse

nroberts said:
Maybe we are using the same words in different ways.  Let's use an
example: I want an array of cons cells for my Lisp interpreter.  The
logical way to define this is

  struct cell { int car, cdr; } heap[1000];

If I have a function

  int find(int array[], int needle);

that can search an array for a particular value, C does not permit me to
speak of the whole array of car values in the heap.  First, it has no
syntax for this (maybe heap[].car?) but, more significantly its object
model would not permit it.

Let's assume I agreed with all that 100% (truth is more like 80-90 but
whatever).

Could you explain to me again why this serves as a reason that C has a
bias toward constructs similar to what you have as "cell" above rather
than something like:

struct
{
int cars[1000];
int cdrs[1000];
} cells;

It doesn't. The bias (and I don't like that term but it seems to have
stuck) is that if you are forced to choose the second for reasons of
efficiency, you have no simple way to recover (and process) the cells.
At least this is how I understood Richard Harter's original remark.

I prefer to think of it simply as a limitation -- and for C, quite a
reasonable one. Relaxing it would, I think, require a wholesale
re-thinking of what C considers to be an object.

Replace the struct with an array of two items and I think it becomes
clear that it's reasonable to access either version along either rows or
columns. This is probably the most significant version of "the bias" --
declare a 2D (or higher) array and you are stuck with the order you've
picked. For many algorithms, it is natural to treat rows and columns
interchangeably.
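A small sketch of the asymmetry for a 2D array, with the dimensions and the function names sum, row_sum and col_sum chosen for the example. A row is itself an array object and can be passed straight to a function that takes an int array; a column is not an object, so here it is copied into a temporary first.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2
#define COLS 3

/* Ordinary function over a contiguous int array. */
int sum(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* m[r] is an array of COLS ints, so it can be passed directly. */
int row_sum(int m[ROWS][COLS], size_t r)
{
    assert(r < ROWS);
    return sum(m[r], COLS);
}

/* The elements of a column lie COLS ints apart, so they are
   gathered into a temporary array before sum() can see them. */
int col_sum(int m[ROWS][COLS], size_t c)
{
    int tmp[ROWS];
    assert(c < COLS);
    for (size_t r = 0; r < ROWS; r++)
        tmp[r] = m[r][c];
    return sum(tmp, ROWS);
}
```

Declaring the array as int m[COLS][ROWS] instead would swap which of the two gets the direct call.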
 
Ben Bacarisse

nroberts said:
I'm going to try to get this argument back on track to see if there's
anything to glean from it or if my initial assumption, that people
were just making crazy statements to win an argument, is in fact true.

It's been said that there is a bias in C that favors arrays of structs
rather than structs of arrays.

I don't think that claim has been made. I've posted another reply where
I try to clarify what I think has been said, but I re-word it below.
The basis of this statement has so far
seemed to be that there are two ways, among others, that one might
organize a list of records:

1. As an array of records:

struct record
{
type0 field0;
type1 field1;
...
} records[];

2. As a structure of arrays:
struct
{
type0 field0[];
type1 field1[];
...
} records;

Alternatives that have not been brought up are things like having
individual variables for each element of each record, random placement
in a blob, some sort of scattered matrix, quantum entanglement...etc.
So the premise at this point would be that you are organizing data in
a hierarchy of containers and contents, otherwise there's nothing to
discuss.

For the claim that there is a bias in C toward #1 over #2 there must
be a way in which this preference does not exist outside of C
somewhere. My assumption to date, which seems clear to me at least,
is that no such environment exists or can exist given our
understanding of logic and computability.

A couple of valid attempts have been made to show that such an
environment is possible:

1. You could invent a "language X" that has some other form of
container that pretty much logically resembles the first case, but
uses a functional syntax such that you could implement it internally
with the latter version (though why you would is questionable since
this is not exposed and so can't be taken advantage of).

The advantage was intended to be efficiency (presumably of cache use).
That's why the two layouts were not explicitly exposed.
2. There are cases WITHIN C in which the supposed bias is inverted,
such as when you want performance gains for functions that operate on
the entire record set, but only on one field...and also because
expressing a generic function that can operate on a particular,
supplied field within a generic record (like a generic sum function
that could sum up ages within a list of person records) is so
complicated that you generally don't. In my opinion this is a very
valid argument why this idea that C is causing any bias to #1 is in
error, but I've let that slide hoping for refinement.

Yes, that's because the "bias" is not for one over the other. It's that
having picked one (for whatever reason) you are barred from using
natural accesses to the other organisation. (This is clearest with 2D
arrays.)
I have responded to both by saying that "language X" can be written in
C (this seems to be a debatable proposition but nobody's said why so
far) since it's just a higher abstraction than was originally proposed
and that sure, one might prefer the less obvious organization in some
situations due to the performance gains you might get in the standard
memory model, but that these are not issues specific in any way to C.

This is very far from the original debate. The original point was that
C did not have either. I went a little further and said that they go
against C's view of objects, so they can't be added to C without
a major overhaul of the language.
The newest argument seems to be that in non-standard memory models,
which C kind of assumes in some ways, language X could take advantage
of some yet unspecified memory addressing method that C would be
incapable of.

I don't recall this argument at all. It's possible I've missed it, but
it sounds very odd indeed. What is a non-standard memory model? (Or,
conversely, what is a standard one?)
I concede that this could possibly be true but I don't
believe this does anything to the problem either. Remember that it is
assumed at this point that you're attempting to organize data into a
container/contained relationship, which is an issue that I believe
would still come up with ANY KIND of abstraction one could possibly
come up with. There is always going to be a selection of logical
organizations one can use for their data and which one picks is an
issue of the data and its use, but has nothing to do with syntax or
memory model or even if objects exist in memory in any kind of
recognizable form we currently expect, or anything that could be said
to be a component of C specifically. So long as there exists a
mapping between our logical view and the memory model we don't
actually care what that is unless we're trying to exploit it, which
again won't be language specific.

I don't follow this at all.
I don't believe this begs the question. Without a need to organize
data into container and contained there is no sense by which it can be
said that C has a bias toward one method or another. We must be
trying to organize data in some sense for there to be anything to
discuss. If we were simply talking about random values spattered
about in a completely decoupled manner, then C actually prefers
independent variables..and so would any other language we could think
of. We're not though. If we were going even further and talking
about a system in which objects themselves are not coherent and
mappable to logical constructs then I think the entire idea of
computing goes kaput.

It is true that we could possibly come up with some memory model that
is not standard in which the cost of abstracting a "language X" is not
paid because it's actually closer to the machine's memory model than
structures and arrays possibly. I don't believe this solves the issue
either though because we can push language X into reductio ad absurdum
by proposing its obvious extension: field(records_blob, index_0,
index_1, field_name). This expresses a hierarchy. Does it make more
sense to iterate by index_0 or index_1? Under what conditions? At
what point do we decide that instead of having index_0 through index_N
do we actually return some sort of container as our data type? Will
it make as much sense then to invert the relationship and if not does
that constitute a bias in the language?

Working with simple examples has hidden the bigger, higher level issue
to some people it seems.

I suspect (mainly because I don't know what you are saying here) that
the problem is the reverse -- that the lessons of a simple example are
being lost in ill-defined terminology.
Another issue here is that "language X" is incomplete. We've not
devised a declarative syntax for it. If we assume that either
declaration we might use in C can be used in "language X" and then
manipulated through the field syntax, is there any reason at all
anymore to use the second version? The whole argument for even using
the second version has been entirely dependent upon the nature of C
and the standard memory model. Does not the very structure of the
data make the first version more understandable for humans? If so,
how could this be a language issue? If not, please explain.

Again, I just don't follow this. In case anyone still cares, what you
called "language X" was the proposition of adding a keyword
("transpose", I think) to flip the layout of 2D data structures. This
would permit the programmer to use the natural declaration in situations
where the transposed layout was more efficient (always assuming that
such situations exist).

Another language idea has been discussed (and it's independent of the
first one) which is the ability to "slice" a struct with array members
to get a struct, or to slice an array of structs to get an array of the
members. (One would presumably extend this to array of arrays as well.)

The transpose idea lets you use the natural declaration even when you
want the layout implied by the other. You still have only one
convenient way to access the data -- the one determined by the
declaration. The second idea lets you access the data structure in
either way.
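
For readers joining late, the two layouts under discussion can be sketched in C roughly as follows. This is only an illustration, assuming that #1 refers to the array of structs and #2 to the struct of arrays. The names record, records, N_RECORDS and the two summing functions are made up here and do not come from anyone's post.

```c
#include <assert.h>
#include <stddef.h>

#define N_RECORDS 4  /* hypothetical size, for illustration only */

/* Layout #1 (assumed): array of structs. Each record is contiguous. */
struct record {
    int    id;
    double value;
};

/* Layout #2 (assumed): struct of arrays. Each field is contiguous. */
struct records {
    int    id[N_RECORDS];
    double value[N_RECORDS];
};

/* Sum one field over n records stored as an array of structs. */
double sum_values_aos(const struct record *r, size_t n)
{
    double total = 0.0;
    assert(r != NULL);
    for (size_t i = 0; i < n; i++)
        total += r[i].value;   /* steps over the id member each time */
    return total;
}

/* Sum the same field when the records are stored as a struct of arrays. */
double sum_values_soa(const struct records *r, size_t n)
{
    double total = 0.0;
    assert(r != NULL && n <= N_RECORDS);
    for (size_t i = 0; i < n; i++)
        total += r->value[i];  /* walks one dense array of doubles */
    return total;
}
```

The declaration fixes both the layout and the access syntax: r[i].value in one case, r->value[i] in the other. Standard C has no way to write the first declaration and get the second layout, which is what the "transpose" proposal was about.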
Even further, the field() syntax proposed in "language X" seems to
mimic the first version more than the second. Given the freedom to
evolve a completely new language that avoids the biases in C we've
created something that very much resembles that which is being said is
C's bias. The whole motivating factor for our attempt to abstract is
not, "How can we get C to make #1 look more like #2,"

Actually, I do like that idea (the access either one as if it were the
other) but I agree that it's not the original motivation.
but, "How can we
get C to give us the speed advantages and generic re-usability we want
from #2, but a logical structure that's more resembling #1 because
that's what makes more sense to us."
Exactly!

Unfortunately, the answer isn't
so easy when it comes to composing generic functions because this is
something that C is simply not good at.

and that was my original point, though I put it more strongly. Altering
the relationship between declaration and location contravenes C's view
of objects.
The speed issue on the other
hand can be taken care of by providing macros that turn a #2
implementation into something resembling language X.

But the complaint was not that you can't use layout #2 when you need to,
but that some otherwise natural accesses become convoluted. I don't see
how macros can do more than paper over the cracks.
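
To make the macro suggestion concrete, here is a minimal sketch of what such a macro might look like over a struct-of-arrays layout. FIELD, struct points, N_POINTS and translate are hypothetical names chosen for this example; nothing like them was actually posted.

```c
#include <assert.h>
#include <stddef.h>

#define N_POINTS 3  /* hypothetical size, for illustration only */

/* Struct-of-arrays layout: all x values together, all y values together. */
struct points {
    int x[N_POINTS];
    int y[N_POINTS];
};

/* Hypothetical accessor in the spirit of field(blob, index, name).
   FIELD(p, i, x) expands to ((p).x[(i)]), which is an lvalue. */
#define FIELD(blob, index, name) ((blob).name[(index)])

/* Example use: shift every point by (dx, dy). */
void translate(struct points *p, int dx, int dy)
{
    assert(p != NULL);
    for (size_t i = 0; i < N_POINTS; i++) {
        FIELD(*p, i, x) += dx;
        FIELD(*p, i, y) += dy;
    }
}
```

The macro only changes how the access is spelled. It cannot hand back a pointer to "point i" as a single struct, because no such object exists in this layout, so code that wants to pass one record around still has to be written differently.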
Now, if something I've just said is wrong then please explain how. If
your only reply is that "nobody's saying ..." or some other red
herring then I'm not interested;

Eh? Sorry to have wasted your time. I thought it would be useful to
explain what I think people have been saying, but if you had put this
upfront I would not have replied (it's been quite a time consuming reply
up to this point).
 
N

Nick Keighley

not how I'd use "logical"

I don't think so. I know we've been talking array-of-structs and
structs-of-arrays. But consider arrays-of-arrays. C implements 2d
arrays (or as near as C gets to 2-d arrays) as a column array
containing row arrays. It doesn't *have* to do this. Fortran, I
believe, implements it as a row array containing column arrays. The
most efficient way to access a 2-d array is affected by this. The
PHYSICAL layout is different but the LOGICAL layout isn't. Similar
things could be done with structs and arrays.
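
The difference between the two physical layouts can be written down as index arithmetic. The sketch below is illustrative only; ROWS, COLS and the function names are invented for the example. It computes where the logical element a[r][c] lands under each convention and checks the C case against a real array.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS ((size_t)3)  /* hypothetical dimensions */
#define COLS ((size_t)4)

/* Element offset of a[r][c] from a[0][0] when whole rows are stored
   one after another, which is what C does for int a[ROWS][COLS]. */
size_t row_major_offset(size_t r, size_t c)
{
    assert(r < ROWS && c < COLS);
    return r * COLS + c;
}

/* Element offset of the same logical element when whole columns are
   stored one after another, which is the Fortran convention. */
size_t col_major_offset(size_t r, size_t c)
{
    assert(r < ROWS && c < COLS);
    return c * ROWS + r;
}

/* Measure the offset the C implementation actually uses, in elements. */
size_t measured_offset(size_t r, size_t c)
{
    static int a[ROWS][COLS];
    assert(r < ROWS && c < COLS);
    return (size_t)((const char *)&a[r][c] - (const char *)a) / sizeof(int);
}
```

The logical element (r, c) is the same in both cases. Only the mapping to memory changes, and with it which loop order touches memory sequentially.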
Perhaps you came in late and are unfamiliar with the original
statements and just found yourself on the wrong side of a discussion.
You weren't the one to originally claim that C syntax creates these
biases but you ended up arguing with me so I assumed you were
supporting it.

he's probably agreeing with you on one thing and disagreeing with
another. I for instance don't think it's just syntax but I'm yet to be
convinced its a law of nature or something.
Because in computers, memory is one large array like structure.
That's the fundamental abstraction that our hardware attempts to
create.

consider the 8xx86 for more bizarre memory layout
Which is why quite a while back I argued that what people claiming
that C's syntax causes a bias like this are actually arguing for is
abstractions.  The further away you get from the fundamental
abstraction that the hardware provides, a long array, the more
operations have to be conducted to project the "higher" abstraction
onto the "lower" one.  This is why languages like "language X" cost
more in terms of performance.

not necessarily. I thought the point was X laid things out in memory
in a different way to C and that this may give a performance
improvement to certain algorithms
 C is an attempt to provide just enough
abstraction to be useful across all computers, which means it is as
close to the underlying abstraction as necessary.

the underlying "hardware abstraction"? C provides an abstraction that
matches well (but not perfectly) with much modern hardware. It's pretty
close to the hardware. I'm not sure what "as close as necessary"
means. Who decides what is necessary? C is a great success at what it
does no doubt.
Actually, C doesn't insist on this at all, it's just the simplest
mapping.

C does insist on this.
struct Point {int x; int y;} pt;

&pt.x < &pt.y is well defined and true.
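
A small sketch of that guarantee, using a struct Point with members x and y. The function names are made up for the example. Within a single struct, C requires that a later-declared member have a higher address, so the comparison is well defined, unlike a relational comparison between the addresses of two separately declared objects.

```c
#include <assert.h>
#include <stddef.h>

struct Point {
    int x;
    int y;
};

/* Both pointers point into the same struct object, so the relational
   comparison is defined, and x was declared first, so it yields 1. */
int members_in_declaration_order(const struct Point *pt)
{
    assert(pt != NULL);
    return &pt->x < &pt->y;
}

/* The same fact expressed with offsetof, without needing an object. */
int offsets_in_declaration_order(void)
{
    return offsetof(struct Point, x) < offsetof(struct Point, y);
}
```

The implementation may insert padding between members, but it may not reorder them, and the first member always sits at offset 0.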

well it *could*, but it doesn't
This is an incorrect view of higher level languages.  Since computers
are what they are, every language inevitably links logical structure
to memory layout.  It has to.

it doesn't have to be fixed and static though. logically contiguous
items could be sprayed all over physical memory.
 It can provide an interface that does
not, but that interface comes at a cost and underneath you can be sure
that it projects that interface into the hardware's abstraction in a
very determined manner, just like C does.

well C does this but I don't see why an HLL must do this.
Because C doesn't provide higher abstractions.  The abstractions you
are looking for CAN be created in C though and the lower level
abstractions that it provides are a nice way to describe the mapping.
In fact, most languages that we can think of originally start by being
written in a language like C, if they ever even go past that point and
become self-hosting.

On the other hand, what you seem to want, and I agree that it would be
quite difficult to provide in C

I don't think anyone is asking for this. Hence "language X"
(or ANY language) is an abstraction
that can, through the way you use it, interpret what the best layout
would be.

or a special type of inside-out data structure.
 This is some sort of super-optimizer and indeed would
require a lot of research to provide.  The easiest method would
probably be to invent a new language and provide a compiler that did
this; you'd probably write it in C or C++.  On the other hand, for
most of the problems you'd attempt to solve with this abstraction it
would probably be best to let the programmer decide what to turn on
and when.  This would, at least, be a lot simpler to implement and
would solve most real-world issues.  This is probably why I can't
think of a single language that implements anything like what you seem
to expect from C.

no one "expects" this from C
 
N

nroberts

OK, so you're disputing not just the existence of a bias, but the fact that
it must be unique to C in order to exist?

No, I'm claiming that the bias must be unique to C in order to be
unique to C. The claim that it WAS unique to C is what I'm trying to
fathom.
 
J

John Gordon

In said:
AFAIK, nobody made such a claim, certainly not I. AFAICT, the whole
notion is an invention of yours.

"As a side note, there is a subtle bias in C in favor of arrays of
structs rather than structs of arrays." -- Richard Harter
 
J

James Kuyper

"As a side note, there is a subtle bias in C in favor of arrays of
structs rather than structs of arrays." -- Richard Harter

I see nothing in there about the bias being unique to C.
 
J

John Gordon

I see nothing in there about the bias being unique to C.

I may have responded to the wrong sub-thread.

If I'm reading my attributions correctly, nroberts said this:

It's been said that there is a bias in C that favors arrays of structs
rather than structs of arrays.

And then someone (Ben Bacarisse?) responded with this:

I don't think that claim has been made.

So I provided a quote of just such a claim.

The quote does not state the bias is unique to C, but it does state that
C has the bias, which is the point I was responding to.
 
J

James Kuyper

I may have responded to the wrong sub-thread.

Apparently. I've tracked down the details below.
If I'm reading my attributions correctly, nroberts said this:

It's been said that there is a bias in C that favors arrays of structs
rather than structs of arrays.

And then someone (Ben Bacarisse?) responded with this:

I don't think that claim has been made.

So I provided a quote of just such a claim.

Ben's message containing that text was
<0.1a58db304bd70f0e7d15.20111111045341GMT.87zkg3xnsa.fsf@bsb.me.uk>.

The message you posted that I was responding to,
<[email protected]>, had a different context. It cites none
of the above text. The last item in its References: header was

In that message, the text you quoted was immediately preceded by this text:
 
N

nroberts

I may have responded to the wrong sub-thread.

If I'm reading my attributions correctly, nroberts said this:

  It's been said that there is a bias in C that favors arrays of structs
  rather than structs of arrays.

And then someone (Ben Bacarisse?) responded with this:

  I don't think that claim has been made.

So I provided a quote of just such a claim.

The quote does not state the bias is unique to C, but it does state that
C has the bias, which is the point I was responding to.

Yeah, I misspoke above. I sometimes get caught by straw men people
throw up and lose track of the conversation for a moment. The whole
unique to C thing isn't mine at all. That was never the claim but I
did get confused and misspeak.

I don't expect that it is unique to C, only that by saying it is IN C
(which WAS stated, as you and I have now both quoted) that there's
something outside of C that doesn't have it...or could be. Otherwise
saying that it's in C is a pointless, inflammatory statement...sort of
like saying there's a bias in American culture toward breathing air.
Even though Americans indeed breathe air, so does everyone and
everything else living so the bias isn't exactly an American one.
Just like I'd expect you to back the bias claim by showing me any
possible human culture that does not breathe air, I expect someone to
be able to show any possible programming language or form of
computational logic that doesn't contain THIS "bias".

It's been said I'm making a big deal about a "throw away statement".
If that's the case fine...but the only reason I'm continuing here is
that people keep arguing against me. If it really was a throw away
statement, why are so many people asserting that it's either a) true,
or b) never said even though it can be quoted repeatedly. Maybe it
wasn't meant, then I'd hope the person who said it would explain
that. As far as I know to date it WAS meant and there are many
supporters. I'm just asking them to explain themselves.

At this point I'm giving up. My initial assumption was that people
had gotten so locked into their point of view that they were talking
nonsense to back it up. I was asking for clarification in case I was
wrong and all I've gotten is a bunch of hand wrangling. Obviously I
was correct and this conversation can never go anywhere.
 
N

nroberts

It was Ben.





But you quoted me and when you did you snipped what Roberts had
written.  This is what you wrote:

BEGIN
In <[email protected]> (e-mail address removed) (Richard



"As a side note, there is a subtle bias in C in favor of arrays of
structs rather than structs of arrays." -- Richard Harter
END

and here is where you got my remark from.  If you had read my entire
post and hadn't snipped out what Roberts wrote you would have seen:

BEGIN



AFAIK, nobody made such a claim, certainly not I.  AFAICT, the whole
notion is an invention of yours.
END

Lesson: Be careful about what you snip and always make sure you have
the right context.

Actually, you're mixing and matching here. The bits you're quoting
come from different lines of conversation. The one that Mr. Gordon
was replying to was correctly stated, the one you're pasting into
contained my misspeaking.

You're also completely ignoring the fact that I already corrected the
mistake regarding "unique to C". Now you're just harping on something
that was never meant in order to ignore the real issue at hand, which
is getting YOU to explain what YOU said, which very much was that _C_
has this bias. If there's nothing special about C in any way
regarding this bias, why is it C's?
 
J

John Gordon

This is a fallacious argument based on a faulty analogy. Simply
observing that X has property A does not establish that there is
anything that does not have property A. It does not oblige me to show
that there could be something that does not have property A.

Strictly speaking you're right of course, but I do see his point.

If I hear someone say that French cooking is delicious, that leads me
to believe that the speaker holds French cooking in higher regard than
other cuisines.

Yes, it's possible that the speaker believes all cooking is delicious,
but in that case, why did he or she bother to single out French cooking?
It would have been more accurate and less effort to simply say that all
cooking is delicious.

But the speaker did choose to single out French cooking, and in my opinion
that can carry meaning.

(Of course this is all hugely subject to context. If the speaker in my
example was specifically responding to a question about French cooking,
then that changes everything.)
 
N

nroberts

This is a fallacious argument based on a faulty analogy.  Simply
observing that X has property A does not establish that there is
anything that does not have property A.  It does not oblige me to show
that there could be something that does not have property A.

Thank you for finally confirming my assessment that your statement was
never meant to mean anything. Now that I know you're prone to posting
gibberish just to stroke your ego I don't have to worry that I'm
missing anything fundamental when I don't see the point.
 
