Dennis Ritchie -- An Appreciation

nroberts

Not red herrings, a real observation about idioms in different languages.


Well, that we agree on.  It seems you were uninterested in reading the
explanation.


Okay, let's start from the top.

Imagine that there IS only one programming language, ever: C.  And ask
the question:
        I want to represent a number of items, each of which has several
        traits.  Should I represent these as an array of structures, or
        as several arrays (possibly stored in a higher level structure)?

It seems pretty clear to me that, in C, the answer is nearly always the
former.  In cases where there's an overwhelmingly strong reason to do it
the other way, such as working on a vector processor where it is *VITAL*
that all x be in one block of contiguous memory, and all y in another
block of contiguous memory, it's a lot of work and very easy to get it
wrong.

That constitutes a clear bias towards one way of approaching things.
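
A minimal sketch of the two layouts being argued about, for readers following along. The names (`struct particle`, `struct particles`, `N_ITEMS`) are invented for illustration and do not come from any post in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define N_ITEMS 4  /* hypothetical fixed size, for illustration only */

/* Array of structures: each item's traits sit together. */
struct particle {
    float x, y;
};

/* Structure of arrays: each trait sits in its own contiguous block. */
struct particles {
    float x[N_ITEMS];
    float y[N_ITEMS];
};

/* Sum the x trait in the AoS layout: step over whole structs. */
float sum_x_aos(const struct particle *p, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i)
        total += p[i].x;
    return total;
}

/* Sum the x trait in the SoA layout: one plain contiguous array. */
float sum_x_soa(const struct particles *p, size_t n) {
    assert(n <= N_ITEMS);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i)
        total += p->x[i];
    return total;
}
```

Both functions compute the same thing; the difference is that the SoA version only works here because `N_ITEMS` is a compile-time constant.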

Now what if there were other languages?  Imagine, if you will, a language
in which the native data structure is a 2D array, in which both rows and
columns are primary objects which can be manipulated easily.

What type are the columns? If "any" then how does this language store
arbitrary length strings in this matrix? "It just does," is not an
answer since the very fallacy of what you're claiming is right here in
this detail.
 In a language
like this, you might find that the degree of bias towards viewing things
one way rather than another was weaker.  It might even be so weak that
it wasn't really obvious that doing things one way rather than the other
created a distinction you could perceive in code.

What happens when we want to work in 3d? Should I keep an array of
these things or one of these things full of arrays? Do you really
mean to assert that the answer to this question is not, "It depends on
what you want to represent and how you want to manipulate it?" If
not, then you've not proven your case.

If you'd read even the most recent posts here you'd be aware of this
issue already.
Think of, say, C's way of looking at arrays, where clearly each row is
an object, but each column is not an object.  Now contrast this with, say,
the scripting language for a spreadsheet program, in which "column A"
and "row 1" are both primary objects.

Basically you are proposing a matrix in which you can access any value
in it based on any ordering of the keys. OK. This is a new
abstraction that has nothing to do with arrays or structures and it is
going to be useful for its own distinct cases that are not entirely
congruent, though there is some overlap, with those of structures and
arrays. For one thing, all the contents must be of the exact same
type or we're talking about some sort of fantasy language that works
via fairy dust or something.

As I said before, it is absurd to talk about a "bias in C toward
arrays of structures" by stipulating some foreign abstraction that
works differently, is useful for different problems, etc...

Would it make sense to you if I asserted that C has a bias toward
using arrays by pointing out the existence of linked lists and binary
trees? What if I show you iterators from C++ that let you treat them
as the same, something that's much more difficult in C?

Would it make sense to you if I asserted that C has a bias toward
arrays by pointing out that one could think up a language in which
everything was a variable and no complex data types existed?

Would it make sense to you if I asserted that C has a bias toward
array based formula calculations by asserting the existence of
"language X" in which I could assign an arbitrary formula to variable
f, and then simply use its inversion by calling f' when in C the
easiest method is probably to calculate slices and then do a search?
Essentially that's what your argument boils down to.

How about if I assert that C has a bias toward named variables if I
assert a language in which calling "rand()" and dereferencing it will
get me the value I'm trying to get, no matter what it is, every time?

Assuming you say 'no' to those, and I don't see how you can't, why
does it make sense to you then to assert that C has a bias toward
arrays of structures by stipulating the hypothetical existence of a
matrix structure that works via some unspecified, mysterious logic??

This matrix also does nothing to respond when there really is a
logical hierarchy to the data being described, as has been stipulated
in every example and argument in the conversation to date. Even when
a special "language" was created in which one could gain access to
individual data fields in our blob we ended up with an interface in
which our entities' individual features were grouped together in a
single chunk representing an entire entity rather than vice versa.
Even though we could access items through a syntax that didn't care if
the underlying data type was completely backward or even spread
randomly through memory, the logical grouping was static, concrete,
and obvious.
So I put it to you that:

1.  It is entirely meaningful to talk about C having a bias towards one
way of doing things rather than another, without any need to refer to
other languages or tools.

Great assertion, now show it.
2.  C in fact has a bias towards arrays of structures, rather than
structures of arrays, just as it has a bias towards arrays of rows
rather than arrays of columns.

Great assertion. You've done even LESS to show this. In fact, you're
introducing entirely new, unsupported assertions here.
3.  Other languages do not always have as strong a bias.

Name one.
 
Seebs

What type are the columns? If "any" then how does this language store
arbitrary length strings in this matrix? "It just does," is not an
answer since the very fallacy of what you're claiming is right here in
this detail.

Probably as references-to-other-stuff. But that's the thing; say that
the only things in the array are { typeinfo, reference } pairs. In C,
you can still process rows easily and efficiently and columns are still
hard.
What happens when we want to work in 3d? Should I keep an array of
these things or one of these things full of arrays? Do you really
mean to assert that the answer to this question is not, "It depends on
what you want to represent and how you want to manipulate it?" If
not, then you've not proven your case.

Interesting question, and I don't actually know. But! I don't care.
I just need to show the biasing once.
Basically you are proposing a matrix in which you can access any value
in it based on any ordering of the keys. OK.

That's logically similar, but it's not the same thing as "you can access
either rows or columns".
Assuming you say 'no' to those, and I don't see how you can't, why
does it make sense to you then to assert that C has a bias toward
arrays of structures by stipulating the hypothetical existence of a
matrix structure that works via some unspecified, mysterious logic??

Well, uhm. I use spreadsheets. These matrix structures exist. There
are *multiple* languages that have structures which work this way.
This matrix also does nothing to respond when there really is a
logical hierarchy to the data being described, as has been stipulated
in every example and argument in the conversation to date.

Except I'm not sure that's really been stipulated. If you're thinking
naturally in C, you *have* to think of the hierarchy in terms of the
array of structures. Working with one "column" of that logical object
doesn't make *sense*.
Great assertion, now show it.

Doesn't need to be shown. A bias towards doing things one way can be
shown entirely within one language; you just show that it's easy one way
and hard the other.
Name one.

Except I already did, repeatedly. SQL. Whatever scripting language is
used by basically every spreadsheet ever to offer one. I think someone
else claimed that Algol 68 let you talk about both columns and rows of
a matrix.

-s
 
nroberts

The same is true with regard to C and array of structure being
preferable to organize the kinds of data that have been discussed to
date.  This preference is intrinsic to the nature of the data, not to
the language used to express its formulation.

I'm not sure this is the case.  At one point I was doing some vectorizing
and it turned out that the nature of the data was such that I was best served
by using several arrays of 1024 objects and manually keeping track of the
parallels.  Which, in C, was a giant pain to do.  But it was *necessary*
that the x coordinates of four consecutive objects be stored contiguously.
(And on that hardware, moving individual scalars around was insanely
expensive, involving shifts and rotates and masks, because there were no
instructions for manipulating single 32-bit values.)

Let's consider a hunk of data.  The data I propose we consider is
topographical data; we are considering a region of land, and we have
records of the average height above sea level of each square meter section
of a 1km square area.

So:

        float heights[1000][1000];

Now consider a natural C operation:

        float avg(float a[], int n) {
                double total = 0.0;
                for (int i = 0; i < n; ++i)
                        total += a[i];
                return (float) (total / n);
        }

Consider the ease with which we determine the average height of each
row of data:

        for (int i = 0; i < 1000; ++i)
                averages[i] = avg(heights[i], 1000);

Now consider how we determine the average height of each column of
data.

C has a very strong bias which is in no way intrinsic to the data being
looked at.
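
A hedged sketch of the asymmetry described above: a row can be handed straight to a generic averaging routine, while a column needs a loop of its own. `COLS`, `row_avg` and `col_avg` are hypothetical names, the width is shrunk from 1000 to keep the example small, and `avg` here is a variant with the total initialized.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 4  /* hypothetical width; the post uses 1000 */

/* Generic average over a contiguous float array. */
float avg(const float a[], size_t n) {
    assert(n > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return (float)(total / n);
}

/* A row is already a float array, so avg() applies directly. */
float row_avg(float h[][COLS], size_t r) {
    return avg(h[r], COLS);
}

/* A column is not an array object, so it needs its own loop. */
float col_avg(float h[][COLS], size_t rows, size_t c) {
    assert(rows > 0);
    double total = 0.0;
    for (size_t r = 0; r < rows; ++r)
        total += h[r][c];
    return (float)(total / rows);
}
```

The column version is not hard to write, but it cannot reuse `avg`, which is the point being made about what the language treats as a ready-made object.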


You're just using the wrong data structure for your data. Your data
very much DOES have an intrinsic organization, that of a matrix.
Since a matrix is not what you've chosen to use as your data
structure, the structure and data don't entirely play well together
and you've created, by your own design, an apparent "bias".

The problem here is that you're using a hierarchical data structure
for data that has no hierarchical structure. You are either saying
that each X meter contains N(Y) meters, or vice versa. You're no
longer talking about a block of square meters, but rows of 1xN meters
split into N square meters. This is not true of your data though;
this can be proven by simply inverting the array relationship and
showing that it really does make just as much sense, as you've
mentioned. Equating that with a bias in C though is highly
questionable to say the least.

What you really want is the matrix structure you proposed before. It
contains x,y coordinate data without being an array of y's or array of
x's. C does not provide this as a raw data type but you could
certainly create the structure and provide an interface that worked as
you'd desire. Unfortunately, it won't be as close to the underlying
data types of the machine you're targeting most likely, and thus
you'll end up paying a bit for this abstraction.

Of course, you'd be quite insane to assert that all cases when you'd
be tempted to use an array of arrays construct that the relationship
can be invented quite so easily. As such, I can't understand why
you'd make the statement that this supposed bias has nothing to do
with data.
 
Seebs

You're just using the wrong data structure for your data. Your data
very much DOES have an intrinsic organization, that of a matrix.
Since a matrix is not what you've chosen to use as your data
structure, the structure and data don't entirely play well together
and you've created, by your own design, an apparent "bias".

But C doesn't have a matrix. It has arrays-of-arrays. And that means
that it introduces a necessary bias, in its native representations,
towards a hierarchy.

There are languages in which a matrix really is a first-class thing, in
which you can pick columns or rows easily. Those languages have less of
a bias here than C does.

-s
 
nroberts

Probably as references-to-other-stuff.  But that's the thing; say that
the only things in the array are { typeinfo, reference } pairs.

Then why not have structure of array here? Looks like your problem
came back.
 In C,
you can still process rows easily and efficiently and columns are still
hard.

No they're not!
Interesting question, and I don't actually know.  But!  I don't care.
I just need to show the biasing once.

Not true. What you're trying to do here is solve for a special case
and call it universal. And you've not "shown the bias", you've
declared a solution for a small subset of problems in which the "bias"
appears. Since my contention is that the bias is always there, you've
done nothing to argue your case.
That's logically similar, but it's not the same thing as "you can access
either rows or columns".

Yes, it is. You're trying to treat arrays of arrays as matrices and
they're not. A matrix can be converted into vectors through an
indexing operation on either row or column. Arrays of arrays do not
behave this way quite specifically because they're not matrixes.
Well, uhm.  I use spreadsheets.  These matrix structures exist.  There
are *multiple* languages that have structures which work this way.


Except I'm not sure that's really been stipulated.

It has.
 If you're thinking
naturally in C, you *have* to think of the hierarchy in terms of the
array of structures.

No you don't. There are many alternative data types you can use and/
or create.
 Working with one "column" of that logical object
doesn't make *sense*.

It makes sense if it does, it doesn't if it doesn't.
Doesn't need to be shown.  A bias towards doing things one way can be
shown entirely within one language; you just show that it's easy one way
and hard the other.

You've gone off the deep end now.
Except I already did, repeatedly.  SQL.

I responded to that one.
 Whatever scripting language is
used by basically every spreadsheet ever to offer one.  I think someone
else claimed that Algol 68 let you talk about both columns and rows of
a matrix.

Don't know what this is supposed to respond to. You never bothered to
answer any of the questions that try to get you to respond to why you
think bringing up more data types and abstractions has anything to do
with the discussion and just do it again and again.

Let me know when you've got a new song to sing. I've already heard
this one, responded to it repeatedly, it's not really interesting.
 
nroberts

But C doesn't have a matrix.  It has arrays-of-arrays.
Correct.

 And that means
that it introduces a necessary bias, in its native representations,
towards a hierarchy.
Nonsense.

There are languages in which a matrix really is a first-class thing, in
which you can pick columns or rows easily.  Those languages have less of
a bias here than C does.

No. They have a larger selection of *built-in* data types. You can
make a matrix in C.

Further, a matrix simply does not solve the "bias". It works as a
better data type for *some* classes of problems in which this "bias"
exists, but not all. In all honesty, a matrix is even less
conceptually similar to an array of structures than anything anyone
else has brought up to date.

Use a matrix where the data is best represented by one. Use a
structure of arrays where THAT makes sense. Use an array of
structures where THAT makes sense. Use the data structure for the
job!!! This has nothing whatsoever to do with language and even less
to do with syntax!
 
Richard Damon

I would agree that you appear not to be buying logic. :)


Maybe you were clear about this, but honestly, I think his original statement
made it pretty clear. Furthermore, it's easy to come up with an example
without a comparable bias: SQL. In SQL, it's just as practical to grab a
column from a table as a row. You can think of a table just as easily as
a structure of arrays as you can as an array of structures.

But in C, if you do an array of struct foo, there is no reasonable way to
get the "array" of just one member of each structure in the array. That's
not a natural thing in C. You can't treat it like you'd treat any other
array.

So I think it's pretty clear that C has such a bias. I think *most* languages
have this bias. But the thing is, I'm not really interested in whether it is
a stronger bias in C than it is in other languages, or a weaker bias. What I
care about is whether, in C, it is going to be practical to use a structure
of arrays rather than an array of structures, and the answer is "no".

-s

From my experience with SQL, it is very much based on an array of
"structs" (aka rows) in operation.

Your example later of "select salary from employees" just points out that
SQL has some built in operators to COPY a given field from every record
into a new array to do operations on it.

Note also that in SQL, everything is pretty much looked at as this "2d"
array of structs. Even things like taking the MAX, SUM or AVG of a field
is really just a shorthand of representing something like:

        for all i WHERE record meets the WHERE clause:
                operation(record.field)

The existence of the WHERE clause basically says that even if possible to
address just that column of the records, the loop still needs to in
general actually iterate through the records first to select which items
to operate on.
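
The loop described above could be sketched in C roughly as follows. This is only an illustration of the row-at-a-time reading of an aggregate, not a claim about how any DBMS executes a query; `struct employee`, its fields and `sum_salary_in_dept` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct employee {      /* hypothetical record type */
    int dept;
    double salary;
};

/* Roughly: SELECT SUM(salary) FROM employees WHERE dept = ? */
double sum_salary_in_dept(const struct employee *rec, size_t n, int dept) {
    assert(rec != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        if (rec[i].dept == dept)      /* the WHERE clause */
            total += rec[i].salary;   /* operation(record.field) */
    return total;
}
```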

Indexes are also sort of arrays of structs, where the struct is a
value/record pointer pair, but may actually not be organized as an
"array" but some other organization to do better at keeping a sorted
list that is easy to search.
 
Seebs

Then why not have structure of array here? Looks like your problem
came back.

Tell me how to do a structure containing three arrays of size not known until
runtime? Note: Three separate allocations doesn't count. You have to be
able to allocate a single structure.

That's one of the big hints as to what's natural in C: You can allocate an
array of structures without having to know the size until runtime. You cannot
allocate a structure of arrays without having to know the size until runtime.
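
A sketch of the contrast, under stated assumptions: `struct point`, `struct points`, `alloc_aos` and `alloc_soa` are hypothetical names. The array-of-structures case is one call. The closest single-allocation equivalent for the other layout uses a C99 flexible array member and carves the three arrays out by hand, and its members are pointers rather than arrays, so whether it counts as a "structure of arrays" in the sense meant here is debatable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct point { float x, y, z; };

/* Array of structures: one call, size chosen at run time. */
struct point *alloc_aos(size_t n) {
    return calloc(n, sizeof(struct point));
}

/* "Structure of arrays": the struct cannot declare three runtime-sized
 * array members, so it holds pointers into one trailing block. */
struct points {
    size_t n;
    float *x, *y, *z;
    float data[];   /* flexible array member (C99) */
};

struct points *alloc_soa(size_t n) {
    /* Sketch only: treat an overflowing request as a programming error. */
    assert(n <= (SIZE_MAX - sizeof(struct points)) / (3 * sizeof(float)));
    struct points *p = calloc(1, sizeof *p + 3 * n * sizeof(float));
    if (p == NULL)
        return NULL;
    p->n = n;
    p->x = p->data;          /* first n floats  */
    p->y = p->data + n;      /* second n floats */
    p->z = p->data + 2 * n;  /* third n floats  */
    return p;
}
```

Both results are released with a single `free()`, but only the first layout comes without manual offset bookkeeping.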
I responded to that one.

But your response didn't change the fact that, in SQL, a single column of
a table is just as much a manipulatable object as a single row of a table.
You can delete a row, or a column, without affecting the rest of the table.
Let me know when you've got a new song to sing. I've already heard
this one, responded to it repeatedly, it's not really interesting.

I'm not gonna bother. I can't tell whether you're lying or stupid; I can't
rule out the possibility that it's both. However, we're discussing something
here which is, so far as I can tell, immediately obvious to anyone I've talked
to who has ever programmed seriously in *any* language, and you're either
incapable of getting it or pretending not to get it to save face. I don't
care which. The chances that you will ever say anything that I, or anyone
else, will care about are too low for me to spend time reading your posts
when I could be doing important stuff, like playing MMOs or chasing a cat
around the house. Both of these are more likely to yield desirable outcomes.

-s
 
Seebs

From my experience with SQL, it is very much based on an array of
"structs" (aka rows) in operation.

It certainly is in some ways, but...
Your example later of "select salary from employees" just points out that
SQL has some built in operators to COPY a given field from every record
into a new array to do operations on it.

Is it really copying everything? I somehow doubt that large databases are
actually copying everything for all the views and such that they maintain.
Note also that in SQL, everything is pretty much looked at as this "2d"
array of structs. Even things like taking the MAX, SUM or AVG of a field
is really just a shorthand of representing something like:
for all i WHERE record meets the WHERE clause:
operation(record.field)

The existence of the WHERE clause basically says that even if possible to
address just that column of the records, the loop still needs to in
general actually iterate through the records first to select which items
to operate on.

I'm not sure it does -- seems to me that operations can work on a few
things and then stop, for one thing.

My main point, though, is that in SQL, one or two or three columns from a
table can be treated as themselves a table-like-thing, without any need
for the user to explicitly create and populate this new data structure; that's
very different from the way that C handles that kind of thing.

I'm not talking about underlying internal representations, but about what
kinds of things the language makes it easy or hard to express.

-s
 
nroberts

Tell me how to do a structure containing three arrays of size not known until
runtime?  Note:  Three separate allocations doesn't count.  You have to be
able to allocate a single structure.

That's YOUR problem. I'm not the one trying to say it's completely
natural to do what you're saying is completely natural to do. You've
asserted the existence of a language in which it is as natural to put
the struct on the outside as the inside and now you want me to make
that work for you??? I'm not trying to solve this problem, I'm just
pointing out that it still exists. This is your problem to solve if
you want to make your case.
 
David Thompson

On Wed, 09 Nov 2011 00:06:23 +0000, Ben Bacarisse
I am saying that the organisation of data in memory, and the
organisation of data into logical structures are orthogonal, but many
languages (like C) link the two together.

In high-level languages, arrays are used to represent data sequences --
there need be no implication that one array element is next to its
successor in memory. Similarly, structures are used to group data into
units that can be manipulated as one. C insists that such a unit be
contiguous in memory, but this does not follow inevitably from the
desire to group two coordinates into a point.

A weak example is Algol 68 arrays. Given a 2D array, any row or column
of it can be passed to a procedure that wants a 1D array. Algol 68 does
not afford the same flexibility to arrays of structures or structures of
arrays (and I don't know any language that does), but a language *could*
do that, and it would not be without value to do so. C can't do it
without completely overhauling its notion of an object.
Fortran >= 90 allows you to pass (usually by reference, though not
mandated), or set a (fat) pointer to, any array section of the correct
rank -- i.e. if you have a 4dim array and callee wants 2dim, you can
slice any 2 subscripts, full or partial, but not 0 1 3 or 4 *.
Similarly you can pass or point to a field across an array of
structures (except it calls structures 'derived types').

(* You can also effectively overload on rank. Thus you could call both
MYFUNC( A(1:10, 3, 2:7, 4) ) and MYFUNC( A(2, 3, 4, 5:8) ) but those
are actually two different MYFUNC's.)

F9X provides no particular features for structure of arrays. It does
allow the compiler (normally) to choose storage order of fields in any
structure; IMLE this is more often useful for nonarray fields.

SQL (already noted elsethread) effectively hides the storage. Most if
not all DBMSes allow you to dynamically modify row storage (e.g.
sorting, partitioning) and column types (or at least sizes) without
altering embedded or precompiled DML that references that table --
sometimes even during execution of the DML.

<snip>
 
Malcolm McLean

I started with BASIC so, yeah, but if the C answer is 'call 'strcat''
i think that is the end of the story.
strcat() isn't a very good function, because it's often hard to make
sure that the buffer is big enough to hold two runtime-entered
strings.

It's easy to write char *cat(const char *str1, const char *str2) to
return an allocated string. The function is not included in the
standard library because it was decided not to allow string functions
to depend on malloc().
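
A minimal sketch of such a function, assuming the signature given above. The error handling (returning NULL on allocation failure, asserting on NULL arguments) is an assumption of this sketch, not something specified in the post.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated concatenation of str1 and str2,
 * or NULL if allocation fails. The caller must free() the result. */
char *cat(const char *str1, const char *str2) {
    assert(str1 != NULL && str2 != NULL);
    size_t len1 = strlen(str1);
    size_t len2 = strlen(str2);
    char *out = malloc(len1 + len2 + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, str1, len1);
    memcpy(out + len1, str2, len2 + 1);  /* copies the terminating '\0' too */
    return out;
}
```

Because the buffer is sized from the inputs, the overflow risk described for strcat() does not arise; the cost is that every caller now owns a pointer it has to free.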
 
Nick Keighley

Nick Keighley wrote:

could you leave a sensible amount of context in? You are replying to
something I wrote nearly a month ago!!

The discussion was about C++ operator overloading.

[extra indentation added to show context of my use of "string
concatenation"]
If that's the only example [of C++ operator overloading] that
you've seen though then you're obviously not looking at a lot of C++.

give some examples then. Arithmetic types need them but what else?
string concatenation. And?

you'll note here that I was asking for examples of overloaded
operators.
I started with BASIC so, yeah, but if the C answer is 'call 'strcat''
i think that is the end of the story.

you could safely concatenate strings in C++ without using an
overloaded operator. With a more sophisticated string type you could
do the same in C. What exactly is your point?
 
Joe keane

you could safely concatenate strings in C++ without using an
overloaded operator.

kind of my point
With a more sophisticated string type you could do [the] same in C.

Of course, everyone has his own library to do this.
What exactly is your point?

I think operator overloading is a gain of about zero.

I've never written in C and been like 'darn it, if i only had operator
overloading!'.

Even in C++ i use it sparingly, unless someone tells me 'add some
operators cause the customer will think they're nifty'.

C++ has:

classes

derivation

ctor/dtor concept & exception handling

templates & STL

These things, if they're done right, can make an F-15 (or if they're not
done right, can get you blown up by an F-15).

The operator overloading is a candy bar.
 
jacob navia

Le 30/11/11 00:42, Ian Collins a écrit :
Until you have to implement a specialised numeric type or a smart
pointer.

In general operator overloading is useful for numbers and arrays.

1) For numbers, it allows the user to create any kind of new numerical
types and keep most of his software.

lcc-win has used this feature extensively in the context of an extension
to the C language. It allows several extended-precision number types,
such as qfloat (350 or 450 bits), bignums, and complex numbers.

2) For arrays, it allows implementing length-delimited strings using the
natural array syntax:

String s = "abc"; // overloaded assignment operator
s[2] = 'C'; // overloaded index operator


This simplifies porting from normal C strings to the new type
of strings.

Flames >/dev/null
 
Jorgen Grahn

I think operator overloading is a gain of about zero.

I've never written in C and been like 'darn it, if i only had operator
overloading!'.

Operator overloading ties in with many of the other C++ features -- I
don't think it would be very useful as an isolated feature.

/Jorgen
 
jacob navia

Le 30/11/11 22:42, Jorgen Grahn a écrit :
Operator overloading ties in with many of the other C++ features -- I
don't think it would be very useful as an isolated feature.

/Jorgen

That is because you did not know that Fortran has operator
overloading, C#, even Java. I have added operator overloading
to the lcc-win compiler and it is very useful. See the other message
about this in this thread.
 
Quentin Pope

Le 30/11/11 22:42, Jorgen Grahn a écrit :
That is because you did not know that Fortran has operator overloading,
C#, even Java. I have added operator overloading to the lcc-win compiler
and it is very useful. See the other message about this in this thread.

Sorry, but it is NOT useful. In those other languages, it is a curse. It
leads to many bugs, and plenty of confusing code where it is difficult to
tell what a simple-looking line of code is actually doing without digging
through piles of other code to track down the relevant overloading.

Overloading is not the C way. It is not standard C and not portable.

C has a much better solution - explicit function calls. Look at GMP for a
good example of a C interface for numerical types without any need for
overloading.

/* QP */
 
