reference type for C


Malcolm McLean

Then you have to know the boundaries of that subset (or of an even
smaller subset of that subset) to avoid going beyond them. I don't see
how you can know the boundaries of the subset without understanding the
difference between your mental model and the one described in the
standard.

Nor do I know why any serious C programmer would be satisfied with not
understanding the language.
As long as the subset is both expressive enough to allow you to do
what you want easily, and Turing complete, there's no particular reason to go beyond it.
C would be nicer with decent 2D arrays, but you don't have them, and it's
no great hardship to go array2d[y*width+x]. So we don't need all the
complications of the 2D syntax. Even in the rare cases when we know the
dimensions at compile time, and can use a 2D array, we might as well
use the index calculation method, if we use it everywhere else in the
program.
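A minimal sketch of the index-calculation approach, assuming a row-major layout. The helper names grid_at and grid_fill are made up for illustration and are not from the post.

```c
#include <stddef.h>

/* Hypothetical helper: address of element (x, y) in a row-major
   buffer that is `width` elements wide. */
static int *grid_at(int *grid, size_t width, size_t x, size_t y)
{
    return &grid[y * width + x];
}

/* Hypothetical helper: store x + 10*y in every cell, so each value
   encodes its own coordinates. */
static void grid_fill(int *grid, size_t width, size_t height)
{
    size_t x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            *grid_at(grid, width, x, y) = (int)(x + 10 * y);
}
```

The same two functions work whether the buffer is an automatic array of known size or a block obtained from malloc, since only a pointer and the width are passed.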
I don't understand that at all. Are you saying that a fixed-size array
isn't likely to be as long as 100 elements, and is more likely to be
allocated via malloc with a size based on the program's input?

A lookup table indexed by 8-bit characters will have 256 elements.
Input buffers are commonly 1024 or more characters. Fixed-size data for
some common algorithms can be much bigger than that.

In any case, your claim that

int array[100];

"isn't declaring an array" is quite simply false.
For some reason we need a list of the number of days worked, per year
of an employee's service with the company.
It's hard to say what the maximum is; recently, for example, the UK
retirement age was raised from 65 to 67. But there's absolutely no
way someone could have as many as 100 years' service.

So

int getsizeofgoldwatch(int recruitmentdate, DATABASE *db)
{
    int daysworked[100];
    int currentdate = getcurrentdate();
    int N = currentdate - recruitmentdate + 1;
    int i;

    for(i=0;i<N;i++)
        daysworked[i] = getdaysworked(db, recruitmentdate + i);

    return fancyalgorithm(daysworked, N);
}

So the buffer is 100 ints, the array is the first N.

Now if we decide we don't want to take up all that stack space, or
if we decide that we need to support any length of service (in case
we employ Methuselah or something), we can go to a buffer on the
heap without changing much of the code.
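A rough sketch of what the heap version might look like. getdaysworked_stub and fancyalgorithm_stub are stand-ins I have invented for the undeclared helpers in the post, and passing currentdate as a parameter is also an assumption made to keep the example self-contained.

```c
#include <stdlib.h>

/* Stand-ins for the post's undeclared helpers; both are assumptions. */
static int getdaysworked_stub(int year)
{
    return 200 + year % 5;
}

static int fancyalgorithm_stub(const int *days, int n)
{
    int i, total = 0;

    for (i = 0; i < n; i++)
        total += days[i];
    return total;
}

/* Same shape as getsizeofgoldwatch(), but the buffer holds exactly N
   ints on the heap. Returns -1 if N is not positive or malloc fails. */
static int goldwatch_heap(int recruitmentdate, int currentdate)
{
    int N = currentdate - recruitmentdate + 1;
    int *daysworked;
    int i, result;

    if (N <= 0)
        return -1;
    daysworked = malloc((size_t)N * sizeof *daysworked);
    if (daysworked == NULL)
        return -1;

    for (i = 0; i < N; i++)
        daysworked[i] = getdaysworked_stub(recruitmentdate + i);

    result = fancyalgorithm_stub(daysworked, N);
    free(daysworked);   /* freed at the same level that allocated it */
    return result;
}
```

The loop and the call to the algorithm are unchanged; only the declaration, the failure checks and the free() differ. Note that the fixed-size version relies on N never exceeding 100, a check the heap version does not need.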
 

Rosario1903

As long as you don't change the local copy, yes. But you can:

void sum(int *a, int *b, int n) {
    int i;
    int c[]={3,1,4,1,5,9,2,6};
    if(n<=8) b=c;   /* changes only the local copy of b */
    for(i=0;i<n;i++) a[i] += b[i];
}


This does not change the pointer variable b in the caller of sum;
it changes only its copy on the stack...
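A small sketch of that point, with hypothetical names (fallback, first_after_retarget): reassigning a pointer parameter retargets only the callee's copy.

```c
/* Hypothetical data the callee retargets its parameter to. */
static int fallback[3] = {7, 8, 9};

static int first_after_retarget(int *b)
{
    b = fallback;      /* the caller's pointer is unaffected */
    return b[0];
}

/* Returns 1 if the caller's pointer still refers to its own array
   after the call. */
static int caller_still_sees_own_array(void)
{
    int mine[3] = {1, 2, 3};
    int *p = mine;

    (void)first_after_retarget(p);
    return p == mine && p[0] == 1;
}
```

Writing through the pointer (b[0] = ...) before retargeting it would, by contrast, be visible to the caller.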
 

Rosario1903

And independently of that, incrementing a number *is* an arithmetic
operation. Addition is repeated incrementing, multiplication is
repeated addition, and exponentiation is repeated multiplication.
And they're all arithmetic.

I agree with this.
 

Rosario1903

Gareth Owen writes:

In the mental model I use, expressions of array type and parameters
of array type are special cases (and I can point to the paragraphs
in the standard that rigorously describe those special cases).

Arrays as function arguments and sizeof applied to function
parameters are not special cases.
In my mental model pointer arguments to functions, which are of
basic types, are "arrays", except in the relatively unusual case
of a "return pointer", when a function needs to return two values.

Structure arguments can be arrays or singletons. You've just
got to know.

int array[100];

isn't "declaring an array", it's reserving a buffer on the stack.

From what I have seen, it reserves space for 100 ints on the stack.
The name "array", together with the address of that stack memory,
exists only *in the compiler's memory*, for use during compilation.
int *array = malloc(100 * sizeof(int));

is doing exactly the same thing, except the buffer is reserved on
the heap.

This is not true.
It reserves one pointer variable named "array" on the stack,
then calls malloc,
and stores the address of the allocated block in that variable.
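One way to see the difference described in this reply is through sizeof. This is only a sketch; the function name is made up, and "stack" and "heap" are the usual implementation terms rather than anything the standard requires.

```c
#include <stdlib.h>

/* Returns 1 if sizeof reports the whole buffer for the array but only
   the pointer's own size for the malloc'ed version; -1 if malloc fails. */
static int array_and_pointer_differ(void)
{
    int array[100];                          /* 100 ints of automatic storage */
    int *ptr = malloc(100 * sizeof *ptr);    /* one pointer, plus a heap block */
    int differ;

    if (ptr == NULL)
        return -1;

    /* sizeof array covers all 100 elements; sizeof ptr is one pointer. */
    differ = sizeof array == 100 * sizeof(int)
          && sizeof ptr == sizeof(int *);

    array[0] = ptr[0] = 0;   /* both can still be indexed the same way */
    free(ptr);
    return differ;
}
```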
 

glen herrmannsfeldt

(snip, I wrote)
As long as you don't change the local copy, yes. But you can:
void sum(int *a, int *b, int n) {
    int i;
    int c[]={3,1,4,1,5,9,2,6};
    if(n<=8) b=c;   /* changes only the local copy of b */
    for(i=0;i<n;i++) a[i] += b[i];
}

This does not change the pointer variable b in the caller of sum;
it changes only its copy on the stack...

Yes. But in the case of call by reference, it doesn't make any sense
to even ask about changing the local copy (which may or may not be
on a stack) of the pointer.

-- glen
 

Eric Sosman

Answer this question:

void func(T t1)
{
    T t2;
    // In this function, are t1 and t2 always the same type?
    // When are they not the same type?
}

Keith Thompson has already answered it, more than once in
this thread alone. I agree with his answer; if you gave it
serious thought you probably would, too.
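For readers who have not followed the earlier answers, here is a sketch of the case in question, assuming T is a typedef for an array type (the particular typedef is my own choice). GCC and Clang typically warn about sizeof applied to an array parameter, which is exactly the effect being shown.

```c
#include <stddef.h>

typedef int T[10];   /* hypothetical array typedef */

/* t1 is adjusted to `int *`; t2 really is `int[10]`.
   Returns 1 if the two have the same size, 0 otherwise. */
static int param_and_local_same_size(T t1)
{
    T t2;

    return sizeof t1 == sizeof t2;
}
```

On common implementations, where a pointer is smaller than ten ints, this returns 0. With a non-array T such as int or a struct, t1 and t2 would have the same type.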
 

James Kuyper

Uh. Look! A three-headed monkey!

*throws smoke bomb*

*runs away*


Huh. That is disturbing, because without a definition of the term, either
as an adjective or a noun, we run into some strange ambiguities.

Not really. The C standard cross-references ISO/IEC 2382-1 for the
definition of some terms not defined in the C standard itself (3p1).
However, last time I checked, ISO/IEC 2382-1 was too expensive for me to
justify buying it, so I can't be sure whether it defines "arithmetic".
However, for terms defined in neither standard, we can still fall back
on common English usage (with a bias toward usage in the world of
computer programming, for terms with specialized meanings in that context).

....
Which is interesting, because that means that in C99, so far as I
can tell, the phrase "pointer arithmetic" never occurs in normative
text (!). The index has "arithmetic, pointer" pointed to 6.5.6.
Although!

Yes - I pointed that out rather early in this very thread.

....
least, in the index under the heading "arithmetic operators". I note
also that the example of pointer arithmetic given in 6.5.6 uses array
subscripting.

As I pointed out in that earlier message, that example contains two
examples of what I would call pointer arithmetic, and two examples of
things that matched Malcolm's definition, "If you take the value of the
pointer expression and store it, then I'd define that as pointer
arithmetic.", with only one example that satisfied both definitions.
Therefore, the only thing that the standard says about "pointer
arithmetic" is not useful for distinguishing which of the two
definitions matches the concept behind that comment.

I'm not even sure about that one common example: "p += 1", since it
seems to be a case of the "counting on" that he has since made clear is
NOT something he considers to be arithmetic; but it does indeed involve
storage of the value of a pointer expression.
 

Keith Thompson

Gareth Owen said:
Because there's a difference between understanding the language in a way
that enables one to be a productive programmer, and understanding every
crevice of the language

If one remembers "Don't typedef arrays", you don't have to care about
the details of the "weird and inconsistent" behaviour of:

int func(T t)
{
    // when is T not a t.
    return (sizeof(T) == sizeof(t));
}

That might work if you never have to deal with code written by
someone else. But even then, it would be hard to be sure where
your restricted mental model conflicts with the C standard without
actually understanding the C standard. How many rules like "Don't
typedef arrays" would you have to memorize to avoid problems,
and how would you derive those rules?

Programmers do occasionally typedef arrays, and you might have to
understand and even modify their code.

The relationship between arrays and pointers is not some obscure
"crevice" of the language, it's part of its fundamental design.
(And it's very well explained in section 6 of the comp.lang.c FAQ,
http://www.c-faq.com/).
 

Keith Thompson

Gareth Owen said:
Keith Thompson said:
A parameter declared with an array type, even via a typedef (that's a
good point, BTW),

Thank you.
is "adjusted" to a pointer type. This is not a conversion, it's a
compile-time adjustment; as a parameter declaration, "int arr[]"
really *means* "int *arr".

Right. And this is unique to array-types-as-function parameters
(despite both you and Eric telling me there's nothing unique about array
types naming function parameters). Happy to have you in agreement.

I think you've misunderstood what we were saying. There is something
special about parameter declarations that are of array type; they're
adjusted to pointer type. There is nothing special about array
expressions passed as function arguments. The "decay" that occurs in
that context is not specific to function calls; it occurs in all
contexts other than the three exceptions that we've already described.
This is a distinct rule from the decay rule above; the language could
have had either rule without the other.
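A sketch of the decay rule and its exceptions, taking the three exceptions to be the ones C99 lists: the operand of sizeof, the operand of unary &, and a string literal used to initialize an array. The function names are invented for the example.

```c
#include <stddef.h>

/* Each function returns 1 if the stated property holds. */

static int sizeof_sees_whole_array(void)
{
    int a[4] = {0};
    return sizeof a == 4 * sizeof a[0];      /* no decay under sizeof */
}

static int address_of_sees_whole_array(void)
{
    int a[4] = {0};
    int (*pa)[4] = &a;                       /* no decay under unary & */
    return sizeof *pa == sizeof a;
}

static int literal_initializes_array(void)
{
    char s[] = "hi";                         /* literal is copied, not decayed */
    return sizeof s == 3;                    /* 'h', 'i', '\0' */
}

static int decays_elsewhere(void)
{
    int a[4] = {1, 2, 3, 4};
    int *p = a;                              /* decay in an initializer */
    return p == &a[0] && *(a + 2) == 3;      /* and in pointer arithmetic */
}
```

None of these involves a function call with an array argument, which is the point being made: passing an array to a function is just one more context in which the last case applies.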

My mental model works. Does yours?

Yes. TMTOWTDI.
Taken together, these rules mean that a program like this:

#include <stdio.h>
#include <ctype.h>

void capitalize(char s[]) {
    /* cast avoids undefined behaviour if char is signed and s[0] is negative */
    s[0] = toupper((unsigned char)s[0]);
}

int main(void) {
    char message[] = "hello, world";
    capitalize(message);
    puts(message);
}

works correctly, and *can* be "understood" via a mental model that
doesn't recognize the decay of array expressions, the adjustment of
array parameter declarations, or the fact that the [] operator takes a
pointer operand. And that's my biggest problem with the way C handles
arrays and pointers: it can lead learners down a false path.

Do you not agree, that, say a talented programmer (in a language other
than C) would look at that and say:

"capitalize(message) has mutated the value of the object named 'message'
so it's visible in the caller. That looks a lot more like
pass-by-reference semantics than pass-by-value."

Use a typedef, and it looks doubly so.

Yes, of course it *looks* like that. And this talented programmer,
unfamiliar with C, would assume that "sizeof s", used inside the
function, would yield some meaningful value related to the size of
an array.

You have to understand the rules as they're stated in the standard
to know that sizeof doesn't work in that context. And if you
understand the actual rules, I don't understand why you'd build this
confusing and complex mental model. It's like using a thorough
understanding of Newtonian physics to construct a really precise
system of geocentric epicycles to explain the apparent motion of
the planets. Why bother?
 

BartC

Keith Thompson said:
I think you've misunderstood what we were saying. There is something
special about parameter declarations that are of array type; they're
adjusted to pointer type.

You're right there's something 'special' about them! An entire schism in the
type system has been created to deal with this. When I was playing around
last year converting one language (which didn't have this schism) to C
(which did), it caused endless problems.

So, outside of a function formal parameter list, you might have the set of
types array, array*, array** and so on which declare what you might expect
(a value array, followed by a pointer and pointer to pointer).

Use exactly the same types in a parameter list, then they become array*,
array**, array*** and so on; they all move up one place, just so it becomes
impossible to specify an 'array' type (ie. an array value type) as a
parameter.

Such a type is illegal to use as a parameter in C, but that could just have
raised a constraint violation instead of introducing this extra quirk into
the type system. And would have left open a possibility to be able to pass
actual arrays by value at some point.
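A sketch of the "move up one place" effect, using function pointer initializations that compile without casts to show the adjusted parameter types. All names here are made up for illustration.

```c
#include <stddef.h>

static int first(int a[10])       { return a[0]; }      /* really int *a       */
static int first_row(int a[][10]) { return a[0][0]; }   /* really int (*a)[10] */
static int via_ptr(int (*a)[10])  { return (*a)[0]; }   /* stays int (*a)[10]  */

/* These initializations need no casts, which shows what the parameter
   types were adjusted to. */
static int (*const fp1)(int *)       = first;
static int (*const fp2)(int (*)[10]) = first_row;
static int (*const fp3)(int (*)[10]) = via_ptr;

static int demo_adjustment(void)
{
    int row[10] = {42};
    int one = 5;

    /* first() also accepts a pointer to a lone int: the 10 in its
       declaration is not enforced. */
    return fp1(row) + fp2(&row) + fp3(&row) + fp1(&one);
}
```

Here demo_adjustment() adds 42 three times and 5 once. A parameter written as a pointer to an array keeps its element count in the type, so that form is the closest C gets to a checked array parameter.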
 

Malcolm McLean

That might work if you never have to deal with code written by
someone else. But even then, it would be hard to be sure where
your restricted mental model conflicts with the C standard without
actually understanding the C standard. How many rules like "Don't
typedef arrays" would you have to memorize to avoid problems,
and how would you derive those rules?
Basically the rule is that you have buffers, and you pass round
pointers to them with counts. So virtually all your functions
are of the form

int foo(type *x, int N)

because mostly data comes in lists. If you need a function that operates
on scalars, normally it's already in a library somewhere. If you need
a pointer to a pointer, what exactly are you doing? If the list is
naturally of a length fixed at compile time, isn't it usually easier
and clearer to pass in the length anyway, and make it general?

Obviously there are exceptions. You normally wouldn't bother passing
in 3 as the number of dimensions in the universe we happen to live
in, for example. But you don't typedef a "point" array either. You
just pass in a float *, with the understanding that that is a vector
of 3 x, y, z values.
(There's also the object paradigm, where the first parameter to
a collection of functions is an opaque structure. I'm not talking
about that here).
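A minimal sketch of that convention, assuming points are stored as three consecutive floats. dot3 and centroid are hypothetical names, not functions from any library.

```c
/* Each point is three consecutive floats: x, y, z. */
static float dot3(const float *a, const float *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* pts holds N points (3*N floats); the mean point is written to
   out[0..2]. Returns 0 on success, -1 if N is not positive. */
static int centroid(const float *pts, int N, float *out)
{
    int i, k;

    if (N <= 0)
        return -1;
    for (k = 0; k < 3; k++)
        out[k] = 0.0f;
    for (i = 0; i < N; i++)
        for (k = 0; k < 3; k++)
            out[k] += pts[3 * i + k];
    for (k = 0; k < 3; k++)
        out[k] /= (float)N;
    return 0;
}
```

The list length N is passed explicitly, while the 3 is left implicit, which matches the exception described above.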

You have a high level function that "owns" the buffer. If it's automatic
this is done for you, if it's on the heap you have a matching malloc()
and free() at the same level.

So everything is simple. You concentrate on writing the algorithms
that manipulate or process your lists.

If you need a linked list, tree, or other structure, then of course
that's got to be coded. But, except perhaps for linked lists, it's
very much a last resort. You don't use a tree or hash table until
N becomes so large that arrays become impractical, or unless the data
is fundamentally a tree or graph, which usually means it is understood
and presented to the user that way. Virtually everything is arrays, often
of structures which themselves contain nested arrays, because that's
what most things in the real world are. Lists of lists.
Programmers do occasionally typedef arrays, and you might have to
understand and even modify their code.
I not infrequently have to understand, and sometimes modify and
debug, code written in languages with which I am unfamiliar. You
can usually work out what is going on.
Code that relies on sizeof(x) != sizeof(typeofx) belongs in the
obfuscated C contest.
The relationship between arrays and pointers is not some obscure
"crevice" of the language, it's part of its fundamental design.
(And it's very well explained in section 6 of the comp.lang.c FAQ,
There are quirks, like int *ptr1, notaptr2; which make sense in terms
of a machine specified formal grammar, but are just gotchas to
a human programmer. sizeof(array) is one of those quirks.
 

Keith Thompson

BartC said:
You're right there's something 'special' about them! An entire schism in the
type system has been created to deal with this. When I was playing around
last year converting one language (which didn't have this schism) to C
(which did), it caused endless problems.

I wouldn't call it "an entire schism in the type system", but I'm
glad we're in agreement that array parameter declarations are a
special case.

I'm curious: do you agree that arrays passed as arguments in function
calls are *not* a special case, but just one of many contexts in
which arrays decay to pointers?

Arrays in C are second-class types. There's no rigorous definition
of "second-class", but C doesn't support assignment, comparison,
parameter passing, or function return values of array types.
There are array values, but there's nothing that operates directly
on those values. Sure, it would be nice if it were otherwise,
but it would be difficult to turn arrays into first-class types
without breaking existing code.
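As a side note, C does support assignment, parameter passing and return by value for structs, including structs that contain an array. A sketch with a hypothetical wrapper type:

```c
/* Hypothetical wrapper type: a struct containing an array. */
struct vec4 { int v[4]; };

/* The struct is passed by value, so this modifies only a copy. */
static struct vec4 doubled(struct vec4 in)
{
    int i;

    for (i = 0; i < 4; i++)
        in.v[i] *= 2;
    return in;           /* returned by value */
}

/* Returns 1 if the original is untouched and the copy was doubled. */
static int wrapper_demo(void)
{
    struct vec4 a = {{1, 2, 3, 4}};
    struct vec4 b;

    b = doubled(a);      /* struct assignment copies the array inside */
    return a.v[3] == 4 && b.v[3] == 8;
}
```

The array inside the struct is still second-class on its own: a.v = b.v would not compile, and a.v decays to a pointer in the usual contexts.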

I've already acknowledged that I dislike the way C handles arrays
and pointers. My point is that it's important to understand how
they're actually defined by the language. If you can't stand using
a language without first-class arrays, there are plenty out there
that have them.
So, outside of a function formal parameter list, you might have the set of
types array, array*, array** and so on which declare what you might expect
(a value array, followed by a pointer and pointer to pointer).

Use exactly the same types in a parameter list, then they become array*,
array**, array*** and so on; they all move up one place, just so it becomes
impossible to specify an 'array' type (ie. an array value type) as a
parameter.

Such a type is illegal to use as a parameter in C, but that could just have
raised a constraint violation instead of introducing this extra quirk into
the type system. And would have left open a possibility to be able to pass
actual arrays by value at some point.

C allows tremendous flexibility in dealing with arbitrary arrays and
similar data structures, building on a fairly small number of primitive
types and operations with a couple of bizarre special-case rules. It
does so at the expense of placing more burden on the programmer than on
the compiler to keep track of array sizes.

That's just the way it is.
 

glen herrmannsfeldt

(snip)
I've already acknowledged that I dislike the way C handles arrays
and pointers. My point is that it's important to understand how
they're actually defined by the language. If you can't stand using
a language without first-class arrays, there are plenty out there
that have them.
(snip)

C allows tremendous flexibility in dealing with arbitrary arrays and
similar data structures, building on a fairly small number of primitive
types and operations with a couple of bizarre special-case rules. It
does so at the expense of placing more burden on the programmer than on
the compiler to keep track of array sizes.
That's just the way it is.

Do you like the way Java treats arrays better?

Among other things, they do have a length that goes along with them.

-- glen
 

Malcolm McLean

Do you like the way Java treats arrays better?

Among other things, they do have a length that goes along with them.
It means that you don't have the same distinction between an "array"
and a "buffer". It's common in C for a buffer to be a bit bigger than
the data in it. In Java, whilst you can do this, it's confusing and
works against the language.
 

Ian Collins

Malcolm said:
It means that you don't have the same distinction between an "array"
and a "buffer". It's common in C for a buffer to be a bit bigger than
the data in it. In Java, whilst you can do this, it's confusing and
works against the language.

C++'s vector fills the role nicely, having both capacity() and size()
members. Most implementations also implement it efficiently enough to
pass (and return) by value.
 

Malcolm McLean

C++'s vector fills the role nicely, having both capacity() and size()
members. Most implementations also implement it efficiently enough to
pass (and return) by value.
Yes, the C++ vector is a standard library over the common C way
of doing things, which is to have a buffer which may only be partly
filled with data. If the array needs to grow indefinitely, the C
way is to call realloc(), but often C programmers impose an arbitrary
limit because of the complexity of coding this.
I don't see how you can implement a vector efficiently if it is passed
by value, however. Inherently you need to make a local copy, which
is O(N).
The STL way is to pass iterators to a controlled sequence. However a
lot of C++ programmers don't understand this and pass about the
collection objects instead. Sometimes it's necessary, because the
iterators aren't flexible enough to do everything you want efficiently.
But most often it's a case of not really getting it.
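A rough sketch of the partly filled buffer grown with realloc(), which is roughly the job capacity() and size() do for a vector. The struct and function names are invented here, and doubling from an initial 8 slots is just one possible growth policy.

```c
#include <stdlib.h>

/* Hypothetical growable buffer: `cap` slots allocated, `len` in use. */
struct intbuf {
    int *data;
    size_t len;
    size_t cap;
};

/* Appends one value, doubling the capacity when the buffer is full.
   Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
static int intbuf_push(struct intbuf *b, int value)
{
    if (b->len == b->cap) {
        size_t newcap = b->cap ? b->cap * 2 : 8;
        int *p = realloc(b->data, newcap * sizeof *p);

        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;
    }
    b->data[b->len++] = value;
    return 0;
}

/* Releases the storage and resets the buffer to empty. */
static void intbuf_free(struct intbuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

A buffer starts as {NULL, 0, 0}. Functions that only read the data can still take the plain (pointer, count) pair, passing data and len, so the O(N) copy mentioned above never arises.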
 

Martin Shobe

Martin Shobe said:
// Function on
void mutate(int bar[])
{
    bar[0] = 1;
}

int foo[3] = {0}; // foo is an array. It is not a pointer.
mutate(foo); // mutate does not modify a copy of foo; it modifies foo
printf("%d\n",foo[0]);

However you want to phrase it, that's the semantics of "foo was passed
by reference".
Sure you can say "Actually, the address of foo was passed by value",
but on a fundamental level *that's what pass by reference* means.

class myarr {
public:
    int arr_[3];
};

void mutate_by_reference(myarr& x)
{
    x.arr_[0] = 1;
}

void mutate_by_value(myarr x)
{
    x.arr_[0] = 1;
}


Which of those has the same semantics as the C example?

Neither. And rather obviously so.

Only with respect to sizeof(). Which is one of the other major
misfeatures of how the C type system handles arrays...
Not that I agree, but how about this then?

typedef int bar_t[3];

static void mutate(bar_t foo)
{
    foo[0] = 1;
}

static void mutate_by_reference(bar_t & foo)
{
    foo[1] = 1;
}

int main()
{
    bar_t bar{0, 0, 0};
    int baz{0};

    mutate(bar); // okay.
    mutate(&baz); // okay, mutate(bar_t) is actually mutate(int *);
    mutate_by_reference(bar); // okay.
    mutate_by_reference(&baz); // error. an int * isn't a bar_t.
    mutate_by_reference(baz); // error. an int isn't a bar_t.
}

Martin Shobe
 

Eric Sosman

*massive eye roll*

What happened to "There's nothing special about arrays as function parameters"?

Whom are you quoting? What was the context? Was it in this thread?

Function parameters spelled as arrays are rewritten as pointers
to the array element's type. Nobody disputes that; it's the crux of
the explanation Keith gave. What's in dispute is your statement in
<[email protected]>:

It's an array, with a hidden pass-by-reference ("pointer
decay") as part of the calling convention.

If you're unable to see why and how these differ, I give up.
 

BartC

Keith Thompson said:
I wouldn't call it "an entire schism in the type system", but I'm
glad we're in agreement that array parameter declarations are a
special case.

I'm curious: do you agree that arrays passed as arguments in function
calls are *not* a special case, but just one of many contexts in
which arrays decay to pointers?

I guess so.
Arrays in C are second-class types. There's no rigorous definition
of "second-class", but C doesn't support assignment, comparison,
parameter passing, or function return values of array types.
There are array values, but there's nothing that operates directly
on those values. Sure, it would be nice if it were otherwise,
but it would be difficult to turn arrays into first-class types
without breaking existing code.

Arrays were also largely second-class in the language I was working from.
However the language syntax and semantics were not altered so that it was
impossible to express first-class operations as they have been in C.

That means a few first-class ops on arrays can be implemented (assignment
for example), but it also means explicit address-of and deref operators and
so on often need to be used to make them work like C.

But I agree C should be left alone now; this would be too big a fundamental
change at this point.
 
