Dennis Ritchie -- An Appreciation


Nick Keighley

On Oct 30, 4:56 pm, Nick Keighley wrote:
> On Oct 30, 8:07 am, Malcolm McLean wrote:

> The data in the program, not the code.
>
> Most things naturally fall into arrays of arrays. For instance a
> protein consists of an array of atoms,

doesn't sound a natural data type. Surely a protein is an array of
amino acids? The organisation of the atoms is not a linear structure.

> each of which has an element
> type and an x y z position. The atoms are grouped into amino acid
> residues. The residues are grouped into chains. Then you might be
> working on more than one protein.
>
> That's what most data is like.

looks to me more like you've chosen it to be like that.

> Arrays are by far the most commonly
> used data structure in C.

because there isn't much choice

> They're the only one which has explicit
> syntactical support.

structs, strings, unions

> That's not to say you never use other structures, there's a case for
> representing the bonds between atoms as a graph, for instance, you
> might want to do that for some applications.

again wouldn't arrays of structs be more natural than structs of
arrays?
 

James Kuyper

> You have to use std::vectors everywhere with the C++ version, if you
> want to call f. You have to use arrays of ints everywhere with the C
> version if you want to do the same.

That's why most C++ algorithms take iterator arguments rather than
container arguments. An algorithm, once it's correctly expressed in
terms of iterators, can work equally well on the elements of an array of
double as on a std::vector<int> or a std::list<unsigned long>. You only
have to write the template once, no matter how the data's actually
stored (as long as the algorithm's iterator requirements are met).
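
As a rough illustration of that point, here is a minimal sketch of one template applied to three different kinds of storage. The function name count_greater is hypothetical and chosen only for this example; it needs nothing from its iterators beyond ++, != and *.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical example algorithm: counts the elements in [first, last)
// that compare greater than `limit`. It only uses ++, != and * on the
// iterators, so any input iterator (including a raw pointer) will do.
template <typename InputIt, typename T>
std::size_t count_greater(InputIt first, InputIt last, const T& limit)
{
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first > limit) {
            ++n;
        }
    }
    return n;
}
```

The same definition can be called with pointers into an array of double, with std::vector<int>::iterator, or with std::list<unsigned long>::iterator; the compiler instantiates a separate function for each.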
 

Malcolm McLean

> That's why most C++ algorithms take iterator arguments rather than
> container arguments. An algorithm, once it's correctly expressed in
> terms of iterators, can work equally well on the elements of an array of
> double as on a std::vector<int> or a std::list<unsigned long>. You only
> have to write the template once, no matter how the data's actually
> stored (as long as the algorithm's iterator requirements are met).

But not everyone uses the iterator system consistently.

You need two requirements. You need the iterators to be set up
correctly. But you also need functions like add and divide to be
defined correctly. The mean of a list of integers is not usually an
integer.
 

James Kuyper

> But not everyone uses the iterator system consistently.

"Not everyone uses X consistently" is a fairly accurate statement for
just about any X. It's a problem with everything, and is not specific
to C++ iterators.

> You need two requirements. You need the iterators to be set up
> correctly. But you also need functions like add and divide to be
> defined correctly. The mean of a list of integers is not usually an
> integer.

That's true; which is why a C++ algorithm that performed a generic mean
calculation would have to be templated by the type to be used for
calculating the mean. It would have to be the user's responsibility to
choose that type in a way that allows the expressions "sum += *iterator"
and "sum /= count" to do what he wants done.
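
A minimal sketch of what such a template might look like, assuming the caller names the accumulation type explicitly. The parameter name Acc and the choice to assert on an empty range are assumptions made for this example, not something specified in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Sketch only: `Acc` is the caller-chosen type used both for the running
// sum and for the result. The caller is responsible for picking a type in
// which "sum += *first" and "sum /= count" do what is wanted.
template <typename Acc, typename InputIt>
Acc mean(InputIt first, InputIt last)
{
    Acc sum = Acc();            // value-initialised: zero for arithmetic types
    std::size_t count = 0;
    for (; first != last; ++first) {
        sum += *first;
        ++count;
    }
    assert(count != 0);         // the mean of an empty range is not defined
    sum /= static_cast<Acc>(count);
    return sum;
}
```

With int a[] = {1, 2, 4}, calling mean<double>(a, a + 3) gives roughly 2.33, while mean<int>(a, a + 3) gives 2 because the division is done in int. Which of those is "right" is exactly the choice left to the user.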
 

Malcolm McLean

> On 10/31/2011 09:08 AM, Malcolm McLean wrote:
>
> "Not everyone uses X consistently" is a fairly accurate statement for
> just about any X.  It's a problem with everything, and is not specific
> to C++ iterators.

No, but it's particularly bad in this case, because the whole point of
the template system is to avoid having to rewrite code or "adapter"
routines (e.g. to convert an array of size_ts to an array of DWORDs).
 

Ben Bacarisse

Malcolm McLean said:
> I'm not sure I've even got the syntax right, it's so long since I did
> this, but the root of the problem goes something like this
>
> template <class Iterator, class numeric>
> numeric standard_deviation(Iterator begin, Iterator end)
> {
>   while (begin != end)
>   {
>     /* code here */
>     ++begin;
>   }
>   /* There's something wrong with our standard deviation, so I want
>      to call a tried and tested function to get the mean, which is foo() */
>   foo();
>   /* The problem is that, even though under the bonnet I've just been
>      passed a double *, the only way of calling foo() is to construct a
>      temporary vector */
> }
>
> template <class numeric>
> numeric foo(std::vector<numeric> &list)
> {
> }
>
> In C, you can have the problem that we might have floats or doubles or
> long doubles. The whole point of the template idea is that you only
> have to write the routines once. But you need iron tight discipline to
> achieve this.

No, I meant some C code which gets more complex in C++. That was your
claim -- that some kinds of code get more complex in C++. This snippet
demonstrates that you can't hack a call to a function with one type from
a template that uses another. That's true, but not what you were
talking about. For one thing, it's very hard to write this sort of
generic code in C, so it's unfair to complain that it's "unhackable" and
"complex" in C++. The C for a type-generic loop would be horrendous.
 

nroberts

> You'll see kludges in any language!  Null objects are an abomination and
> anyone using them should be tarred and feathered by their peers.

Nonsense. They are quite preferable to littering 'if (xx != 0)' all
over the place in a great number of situations. Any time it is quite
common for an object to be non-existent and the behavior is always the
same, a null object encapsulates the situation quite nicely. Client
code doesn't need to care, nor should it.
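
For readers who have not met the pattern, here is a small sketch of the idea. All the names (Logger, MemoryLogger, NullLogger, process) are hypothetical and invented for this illustration; the point is only that the client function contains no "is there a logger?" test.

```cpp
#include <string>
#include <vector>

// Hypothetical interface used only for this illustration.
struct Logger {
    virtual ~Logger() {}
    virtual void log(const std::string& msg) = 0;
};

// A real implementation that remembers every message it is given.
struct MemoryLogger : Logger {
    std::vector<std::string> lines;
    void log(const std::string& msg) override { lines.push_back(msg); }
};

// The null object: same interface, deliberately does nothing.
struct NullLogger : Logger {
    void log(const std::string&) override {}
};

// Client code: no 'if (logger != 0)' check is needed anywhere.
int process(int value, Logger& logger)
{
    logger.log("processing");
    return value * 2;
}
```

A caller that wants no logging passes a NullLogger; a caller that wants logging passes a MemoryLogger. process() is the same in both cases.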
 

nroberts

> noooooo! This way lies the Hungarian Lunacy. If your functions are
> small and localised then the variable will be defined close by and we
> can see what type it is.

Nobody said anything about Hungarian notation. That's a straw man.

> void refrog (MyType x, MyType y)
> {
>     x = y++;
> }

OK, well I'm looking at that function and I have no idea what you are
intending with it. I really doubt that knowing the definitions of
MyType would help me either.

> a well chosen type tells you the semantics of the type

int q;

Tell me, what is the intended use of q?

void (*f)(int,int,double);

Tell me, what does f do?

You should be able to tell me since the types tell you enough, right?
 

nroberts

> No, I meant some C code which gets more complex in C++.  That was your
> claim -- that some kinds of code get more complex in C++.  This snippet
> demonstrates that you can't hack a call to a function with one type from
> a template that uses another.  That's true, but not what you were
> talking about.  For one thing, it's very hard to write this sort of
> generic code in C, so it's unfair to complain that it's "unhackable" and
> "complex" in C++.  The C for a type-generic loop would be horrendous.

Let's compare them. The basic for_each implementation in C++:

template < typename ForwardIterator, typename Fun >
void for_each(ForwardIterator start, ForwardIterator end, Fun fun)
{
    for (; start != end; ++start) fun(*start);
}

Use:

void print_int(int i) { std::cout << i; }

std::vector<int> vect;
....
std::for_each(vect.begin(), vect.end(), print_int);

Or now:

std::for_each(vect.begin(), vect.end(), [](int i) { std::cout << i; });

I, personally, don't find that very complicated.

Now the C version. I'm sure a C expert can do better than me, but
here's my untested attempt:

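#include <stddef.h>  /* size_t */

/* The callback takes char const* so that it matches both the pointer
   passed to it below and the print_int example that follows. */
void for_each(char const* start, size_t size, size_t count,
              void (*fun)(char const*))
{
    while (count--)
    {
        fun(start);
        start += size;
    }
}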

Use:

void print_int(char const* ptr) { printf("%d", *((int const*)ptr)); }

int array[COUNT];
....
for_each((char const*)array, sizeof(int), COUNT, print_int);

I don't find that particularly difficult to get, but as a C++
developer I am annoyed by all the casting (especially upon review
before posting when I noticed I forgot the const on the int cast).
There's also room for user error if the wrong size is passed in, which
the compiler does for you with the C++ version. Neither of these is
insurmountable though.

Of course, generally you don't write your functions to work on char*
when they don't, so you'd probably be creating a wrapper for
print_int. In that way it would be reminiscent of having to create
functor objects for binding more than 2 parameters. The new C++
solves this issue, but most here probably are not familiar with bind
and lambdas.
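
For anyone who has not used them, here is a short sketch of the two C++11 facilities mentioned, each passing extra state into std::for_each without a hand-written functor class. The helper names (add_scaled, sum_scaled_lambda, sum_scaled_bind) are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper taking three parameters: adds scale * value to total.
void add_scaled(long& total, int scale, int value)
{
    total += static_cast<long>(scale) * value;
}

// Lambda version: the capture list carries the extra state.
long sum_scaled_lambda(const std::vector<int>& v, int scale)
{
    long total = 0;
    std::for_each(v.begin(), v.end(),
                  [&total, scale](int x) { total += static_cast<long>(scale) * x; });
    return total;
}

// std::bind version: the first two arguments are bound up front and the
// element is supplied through the placeholder _1.
long sum_scaled_bind(const std::vector<int>& v, int scale)
{
    using namespace std::placeholders;
    long total = 0;
    std::for_each(v.begin(), v.end(),
                  std::bind(add_scaled, std::ref(total), scale, _1));
    return total;
}
```

Both functions return the same result for the same input; which reads better is largely a matter of taste.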
 

nroberts

> I'm not sure I've even got the syntax right, it's so long since I did
> this, but the root of the problem goes something like this
>
> template <class Iterator, class numeric>
> numeric standard_deviation(Iterator begin, Iterator end)
> {
>   while (begin != end)
>   {
>     /* code here */
>     ++begin;
>   }
>   /* There's something wrong with our standard deviation, so I want
>      to call a tried and tested function to get the mean, which is foo() */
>   foo();
>   /* The problem is that, even though under the bonnet I've just been
>      passed a double *, the only way of calling foo() is to construct a
>      temporary vector */
> }
>
> template <class numeric>
> numeric foo(std::vector<numeric> &list)
> {
> }
>
> In C, you can have the problem that we might have floats or doubles or
> long doubles. The whole point of the template idea is that you only
> have to write the routines once. But you need iron tight discipline to
> achieve this.

So, let me see if I understand you correctly...you find C++ more
complicated because you're not allowed to call functions that expect
one type with a different one and that if you need to you have to do
some conversions?

I think I can live with that. I wonder, what would happen if I
created a function expecting to be passed a handle to a managed buffer
structure and tried to pass it a double*...
 

Malcolm McLean

> No, I meant some C code which gets more complex in C++.  That was your
> claim -- that some kinds of code get more complex in C++.  This snippet
> demonstrates that you can't hack a call to a function with one type from
> a template that uses another.  That's true, but not what you were
> talking about.  For one thing, it's very hard to write this sort of
> generic code in C, so it's unfair to complain that it's "unhackable" and
> "complex" in C++.  The C for a type-generic loop would be horrendous.

The point is, you're unlikely to make this particular mistake in C.

You'll write

double stdev(double *x, int N)

double mean(double *x, int N)

So if one person writes mean and the other stdev, the code will likely
plug together perfectly nicely.
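
A sketch of how the two might fit together, written in the C-like subset of C++. The bodies are assumptions: the post gives only the signatures, the pointers are made const here, and the population form of the standard deviation (dividing by N) is chosen arbitrarily.

```cpp
#include <cmath>

/* Mean of N doubles. Assumes N > 0. */
double mean(const double *x, int N)
{
    double total = 0.0;
    for (int i = 0; i < N; ++i) {
        total += x[i];
    }
    return total / N;
}

/* Population standard deviation (divides by N; the post does not say
   which variant is intended). Calls mean() directly on the same array. */
double stdev(const double *x, int N)
{
    const double m = mean(x, N);
    double ss = 0.0;
    for (int i = 0; i < N; ++i) {
        ss += (x[i] - m) * (x[i] - m);
    }
    return std::sqrt(ss / N);
}
```

Because both functions take a pointer and a count, stdev() can hand its arguments straight to mean() with no conversion.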

Even in C, it is possible to mess things up. You can say

typedef float real;

real stdev(real *x, int N)

Thinking you're doing everyone a big favour by allowing your code to
toggle between double and float. You're not. All you're doing is
polluting the namespace and adding complexity.

Another problem is that, frequently, you want to get the mean and
deviation of fields of structures. So

double stdev(void *array, int offset, size_t size, int N)

Would be a more generic function. However this is probably a bad idea
because it's making the call hard to understand, and forcing the
implementer of "mean" to do messy casts and pointer arithmetic. As
always, the toy example is a bad one. mean() is so frequently
required, and so trivial to calculate, that there's an argument for a
generic stepping function.
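
To make the trade-off concrete, here is a sketch of a stepping mean over one double field of an array of structures, again in the C-like subset of C++. The Atom struct and the name mean_field are hypothetical, and size_t is used for the offset where the signature above uses int.

```cpp
#include <cstddef>
#include <cstring>

/* Hypothetical record type, loosely echoing the protein example. */
struct Atom {
    int element;
    double x, y, z;
};

/* Mean of one double field across N records.
   array:  start of the first record
   offset: byte offset of the field within a record (e.g. from offsetof)
   size:   size of one record in bytes
   Assumes N > 0 and that the field really is a double. */
double mean_field(const void *array, std::size_t offset, std::size_t size, int N)
{
    const unsigned char *p = static_cast<const unsigned char *>(array) + offset;
    double total = 0.0;
    for (int i = 0; i < N; ++i) {
        double v;
        std::memcpy(&v, p + static_cast<std::size_t>(i) * size, sizeof v);
        total += v;
    }
    return total / N;
}
```

A call looks like mean_field(atoms, offsetof(Atom, x), sizeof(Atom), n). Nothing stops a caller from passing the offset of the int field or the wrong record size, which is the kind of messiness the post is warning about.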

Generally, you don't have this sort of problem. Functions written in C
can be made reusable with a minimum of effort, and can be embedded in
other programs with a minimum of fuss. That's much less true of
functions that use elaborate container systems. They tend to be
difficult to integrate.
 

Malcolm McLean

> So, let me see if I understand you correctly...you find C++ more
> complicated because you're not allowed to call functions that expect
> one type with a different one and that if you need to you have to do
> some conversions?
>
> I think I can live with that.  I wonder, what would happen if I
> created a function expecting to be passed a handle to a managed buffer
> structure and tried to pass it a double*...

The point is, if you write C, you won't write a function that demands
a managed buffer, unless the function really strictly needs it. In C++
you often will, which forces everyone to use your managed buffer
system. The C++ method is tolerable if everyone uses the stl
conventions, which is that a list is passed in as an iterator to the
start of the list and an iterator to one past the end. But the fact is
that everyone doesn't use the stl conventions. Partly it's a failure
of education, partly it's that they are hard to explain and it's
difficult to make programmers see the benefits, partly it's
historical, stl was bolted on to C++ at a late date, so people grew up
using C-style arrays instead of vectors. Whatever the reason, the
system isn't used as it should be.
 

nroberts

> The point is, if you write C, you won't write a function that demands
> a managed buffer, unless the function really strictly needs it.

Well, if that's your point then you have a very odd way of stating it.

Of course, the point is quite debatable since it's actually impossible
to write a function that uses a static buffer in C. You are always
going to have to pass in a size and pointer.

> In C++
> you often will, which forces everyone to use your managed buffer
> system.

Gotta debate the first. The second is a reasonable point though, if
you write a function that expects a vector you are indeed forced to
use one in order to call it. I don't think this is different from any
other statically typed language though.

> The C++ method is tolerable if everyone uses the stl
> conventions, which is that a list is passed in as an iterator to the
> start of the list and an iterator to one past the end. But the fact is
> that everyone doesn't use the stl conventions. Partly it's a failure
> of education, partly it's that they are hard to explain and it's
> difficult to make programmers see the benefits, partly it's
> historical, stl was bolted on to C++ at a late date, so people grew up
> using C-style arrays instead of vectors. Whatever the reason, the
> system isn't used as it should be.

It seems to me that you're too long out of the C++ loop to make good
judgments about what is common.
 

nroberts

> Generally, you don't have this sort of problem. Functions written in C
> can be made reusable with a minimum of effort, and can be embedded in
> other programs with a minimum of fuss. That's much less true of
> functions that use elaborate container systems. They tend to be
> difficult to integrate.

Why, because you said so?
 

nroberts

> The C++ method is tolerable if everyone uses the stl
> conventions, which is that a list is passed in as an iterator to the
> start of the list and an iterator to one past the end. But the fact is
> that everyone doesn't use the stl conventions. Partly it's a failure
> of education, partly it's that they are hard to explain and it's
> difficult to make programmers see the benefits, partly it's
> historical, stl was bolted on to C++ at a late date, so people grew up
> using C-style arrays instead of vectors. Whatever the reason, the
> system isn't used as it should be.

Maybe putting this argument of yours in an alternate perspective would
show you how vaporous it is.

My general experience with C is that C developers don't know the first
thing about cohesion, modularizing, etc... They're all proud hackers
that peck at things until they work, without any regard to maintenance
and reuse. They base statically sized arrays on assumptions that are
far from guaranteed and do nothing to assert those preconditions.
They scatter global variables throughout their systems and write
functions that read and write to many unrelated, global concepts at
once so that it is impossible to separate the concepts into distinct
modules or put them under unit test without instantiating the entire
system. They write and use functions like 'gets' that assume things
that are quite literally impossible to check, allowing all manner of
security exploits and crash conditions; this method of thinking among
C developers is so bad they even do this in their standard library.
They use convoluted methods of error handling like singular, global
variables that can be written to anywhere along the call stack so that
it's impossible to tell which function caused the error or if the code
you're getting is even the right one...and then they use such systems
in multi-threaded architectures such that the error you get might not
even BE in the call stack anywhere.

All that and much, much, much more. In my experience, C developers
lack all discipline and design skills...and don't even see the problem
with that. Basic principles of engineering are lost on them.

I must therefore conclude that C is a bad, complex language. How
stupid am I?

Now it must be said, in non-sarcasm mode, that though the above is
indeed a true reflection of C coders I have met...I realize that it's
not the majority since the vast majority are people I've not met.
Further it's not C specific as this is the general trend I see in all
software development cultures. In my experience, good developers are
rare no matter what language they speak. Most of the good ones know how
to write good code in more than one.
 

Malcolm McLean

> Why, because you said so?

No, because the containers used by caller and the containers used by
callee have to be compatible. The more elaborate your container
system, the less likely that is to be achieved, and the more that can
go wrong. The worst situation is where you have semi-compatibility.
Consider this, for example

template <class numeric, class iterator>
double mean(iterator start, iterator end)
{
  numeric tot(0);
  int N = 0;
  iterator it;

  for (it = start; it != end; ++it)
  {
    tot += *it;
    N++;
  }
  return (double)(tot / (double) N);
}

There's a problem with this, which will become apparent when you call
the function with a small data type. However when you test it with
short sequences, which you can calculate by hand, it will probably
appear OK. It's a semi-compatible generic function.
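
One plausible reading of the problem hinted at here is that the running total is kept in the element type, so it can overflow or wrap on longer or larger inputs. The sketch below isolates that effect; the name mean_accumulating_in is hypothetical, the range is assumed non-empty, and the wrap-around figure assumes an 8-bit unsigned char.

```cpp
#include <cstddef>

// Accumulates in `Acc` and returns the mean as a double. Illustrative only.
template <typename Acc, typename InputIt>
double mean_accumulating_in(InputIt first, InputIt last)
{
    Acc total = Acc();
    std::size_t n = 0;
    for (; first != last; ++first) {
        // If Acc is a narrow unsigned type, this conversion wraps modulo 2^bits.
        total = static_cast<Acc>(total + *first);
        ++n;
    }
    return static_cast<double>(total) / static_cast<double>(n);
}
```

With unsigned char data {1, 2, 3} and Acc = unsigned char the result is 2.0, so a hand-checked test passes. With {200, 200} the total wraps from 400 to 144 and the result is 72.0, while Acc = unsigned long gives the expected 200.0.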
 

James Kuyper

> No, but it's particularly bad in this case, because the whole point of
> the template system is to avoid having to rewrite code or "adapter"
> routines (e.g. to convert an array of size_ts to an array of DWORDs).

Yes, and if used properly, it achieves that goal. Like anything else, it
can be used improperly, a fact that's only worth bringing up in
connection with an argument that explains why you think that improper
use is more likely than it would be in corresponding C code.

Given how much more complicated the C code would have to be to implement
anything similar to the genericity of the C++ code, and the
corresponding lack of type safety in any reasonable C approximation to
genericity, I find that rather unlikely. However, I'm willing to listen
to an argument to the contrary. I haven't seen that argument, just
assertions that it's true.
 

Malcolm McLean

> Well, if that's your point then you have a very odd way of stating it.

Most functions operate on lists. C provides one easy way to set up a
list, which is as an array. The consequence is that, in C programs,
lists are usually arrays, unless there's a pressing performance need
for another structure. The result is that, _in practice_ functions
tend to be inter-operable. The interfaces are kept clean and simple.
You can even, very frequently, call a C function from another
language.
 

nroberts

> No, because the containers used by caller and the containers used by
> callee have to be compatible. The more elaborate your container
> system, the less likely that is to be achieved, and the more that can
> go wrong. The worst situation is where you have semi-compatibility.
> Consider this, for example
>
> template <class numeric, class iterator>
> double mean(iterator start, iterator end)
> {
>   numeric tot(0);
>   int N = 0;
>   iterator it;
>
>   for (it = start; it != end; ++it)
>   {
>     tot += *it;
>     N++;
>   }
>   return (double)(tot / (double) N);
> }
>
> There's a problem with this, which will become apparent when you call
> the function with a small data type. However when you test it with
> short sequences, which you can calculate by hand, it will probably
> appear OK. It's a semi-compatible generic function.

There are a LOT of problems with the above code. It was written by
someone that doesn't know C++.

However, why don't you point out the problem you are talking about so
that we can all be in on the secret. What is the problem you see
there that represents a flaw in the use or design of C++?
 

Ben Bacarisse

nroberts said:
>> No, I meant some C code which gets more complex in C++.  That was your
>> claim -- that some kinds of code get more complex in C++.  This snippet
>> demonstrates that you can't hack a call to a function with one type from
>> a template that uses another.  That's true, but not what you were
>> talking about.  For one thing, it's very hard to write this sort of
>> generic code in C, so it's unfair to complain that it's "unhackable" and
>> "complex" in C++.  The C for a type-generic loop would be horrendous.
>
> Let's compare them. The basic for_each implementation in C++:
>
> template < typename ForwardIterator, typename Fun >
> void for_each(ForwardIterator start, ForwardIterator end, Fun fun)
> {
>     for (; start != end; ++start) fun(*start);
> }
>
> Use:
>
> void print_int(int i) { std::cout << i; }
>
> std::vector<int> vect;
> ...
> std::for_each(vect.begin(), vect.end(), print_int);
>
> Or now:
>
> std::for_each(vect.begin(), vect.end(), [](int i) { std::cout << i; });
>
> I, personally, don't find that very complicated.

No, nor do I, but I am not sure that this helps. Presumably you, like
me, do not agree that it is common for things to get more complex in C++.
The *language* is more complex, but the code is often simpler.

> Now the C version. I'm sure a C expert can do better than me, but
> here's my untested attempt:

The main difference is that a C expert would avoid most of the casts by
using void * and initialisation.

> void for_each(char const* start, size_t size, size_t count,
>               void (*fun)(char const*))

In the general case you can't make start point to const data (and your
C++ version doesn't use const).

> {
>     while (count--)
>     {
>         fun(start);
>         start += size;
>     }
> }
>
> Use:
>
> void print_int(char const* ptr) { printf("%d", *((int const*)ptr)); }
>
> int array[COUNT];
> ...
> for_each((char const*)array, sizeof(int), COUNT, print_int);

sizeof array[0] is better.

> I don't find that particularly difficult to get, but as a C++
> developer I am annoyed by all the casting (especially upon review
> before posting when I noticed I forgot the const on the int cast).
> There's also room for user error if the wrong size is passed in, which
> the compiler does for you with the C++ version. Neither of these is
> insurmountable though.

It's worse. There's no type checking at all! Also what the function
can do quickly becomes so limiting that it's not worth bothering.
Either that or you end up with file-scope variables to pass data to the
function. In almost every case, if you write a C call-back, provide for
a general data pointer too:

typedef void callback(void *item, void *general_data);
void for_each(void *start, size_t size, size_t count, callback *cb);
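
A possible implementation along those lines, written in the C-like subset of C++. Two things here are assumptions rather than part of the declarations above: the function is renamed for_each_item to avoid confusion with std::for_each, and it takes the general data pointer as an extra fifth parameter so that it can be forwarded to the callback. The add_int callback is a hypothetical example.

```cpp
#include <cstddef>

typedef void callback(void *item, void *general_data);

/* Calls cb once per element, forwarding the caller's general_data pointer.
   size is the element size in bytes; no type checking is possible. */
void for_each_item(void *start, std::size_t size, std::size_t count,
                   callback *cb, void *general_data)
{
    unsigned char *p = static_cast<unsigned char *>(start);
    while (count--) {
        cb(p, general_data);
        p += size;
    }
}

/* Hypothetical callback: treats item as int and general_data as a long
   running total. Nothing verifies that either assumption is true. */
void add_int(void *item, void *general_data)
{
    *static_cast<long *>(general_data) += *static_cast<int *>(item);
}
```

A call such as for_each_item(array, sizeof array[0], COUNT, add_int, &total) accumulates into a local variable, with no file-scope state needed.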

We'll have to wait for someone on the "other" side of the debate to
illustrate the kind of code they have in mind.

> Of course, generally you don't write your functions to work on char*
> when they don't, so you'd probably be creating a wrapper for
> print_int. In that way it would be reminiscent of having to create
> functor objects for binding more than 2 parameters. The new C++
> solves this issue, but most here probably are not familiar with bind
> and lambdas.

If you try to match the type safety of the C++, you'll probably end up
with a macro system.
 
