strange warning

Kenny McCormack

Malcolm McLean said:
"Here's an array of ten integers".
"OK, can you tell me their values?"
"42, 3, 62, 100, 1234 and, er, the last five are undefined".
"How do you know that the last five are undefined?"
"Well we allow up to ten accounts per user, but this particular
user has only five. So Naccounts = 5, and he's made that many
posts from each account".
"Ok, so you've got an array of five integers, you've got a buffer
which holds up to ten integers".

Indeed. We get it. Anyone with 2 brain cells to rub together and who
hasn't been totally Kool Aided by CLC gets it.
That's normal usage.

Not for Kiki, it isn't...

--
Here's a simple test for Fox viewers:

1) Sit back, close your eyes, and think (Yes, I know that's hard for you).
2) Think about and imagine all of your ridiculous fantasies about Barack Obama.
3) Now, imagine that he is white. Cogitate on how absurd your fantasies
seem now.

See? That wasn't hard, was it?
 
Keith Thompson

Malcolm McLean said:
"Here's an array of ten integers".
"OK, can you tell me their values?"
"42, 3, 62, 100, 1234 and, er, the last five are undefined".
"How do you know that the last five are undefined?"
"Well we allow up to ten accounts per user, but this particular
user has only five. So Naccounts = 5, and he's made that many
posts from each account".
"Ok, so you've got an array of five integers, you've got a buffer
which holds up to ten integers".

That's normal usage.

Not in my experience. What you've described is what I'd call an array
of ten integers, of which five happen to have unspecified values. I
have no problem calling it a "buffer" if that's what it is, but that's a
statement about how it's used, not just about what it is.

This:

int arr[10];

defines an array (more precisely an object of array type, even
more precisely an object of type int[10]) of 10 elements; that's
based on how the C standard defines the word "array", and on the
way the word is consistently used in my experience. The array
object consists of 10 int objects, each of which may or may not
have a defined value at the moment.

The concept of "the portion of an array object which currently
contains elements with defined values" certainly can be a useful
one (and it's not necessarily the first N elements), but again,
using the word "array" to refer to that concept will cause confusion.

Similarly:

char str[20];
strcpy(str, "hello");

str is an array with 20 elements. The first 6 elements of that array
contain a string; the remaining elements of the array have unspecified
values.
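
To make the distinction concrete, here is a minimal sketch (the struct and function names are mine, purely for illustration) that reports both the size of the array object and the length of the string stored in it. The first is fixed by the type; the second depends on the contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct sizes {
    size_t array_bytes;    /* sizeof the array object: fixed by its type */
    size_t string_length;  /* characters before the terminating '\0' */
};

/* Copies "hello" into a 20-element array and records both measurements. */
void measure_hello(struct sizes *out)
{
    char str[20];

    assert(out != NULL);
    strcpy(str, "hello");
    out->array_bytes = sizeof str;       /* 20, whatever the contents are */
    out->string_length = strlen(str);    /* 5; the string occupies 6 elements */
}
```

The array never stops being a char[20]; only the string inside it has length 5.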
 
Ian Collins

Keith said:
This has very little to do with Ian's situation, where the size of
the fixed array is specifically imposed by the real-world entity
the code is dealing with.

Indeed. It was Malcolm continually wandering off down random (dynamic?)
tangents rather than addressing the key point that I found frustrating.
 
Malcolm McLean

This:

int arr[10];

defines an array (more precisely an object of array type, even
more precisely an object of type int[10]) of 10 elements; that's
based on how the C standard defines the word "array", and on the
way the word is consistently used in my experience. The array
object consists of 10 int objects, each of which may or may not
have a defined value at the moment.
That's because C doesn't make a rigorous distinction between buffers
and arrays.

int x;
x = arr[0];

is undefined behaviour. Most programmers would say that if you can't
legally read an element of an array, the array doesn't contain that
element, so it's not an array of ten integers.

But loosely, most C programmers will say "here's an array of ten
integers, currently uninitialised", I agree.
 
Malcolm McLean

As a result, I could not make out what it was you were trying to say.
It turns out, you are simply advocating parameterising functions by the
size of the data they operate on. I'd be surprised if there is any
experienced programmer here who disagrees that this is generally the
right thing to do, or who does not have very good reasons for not doing
so when they don't.
Well, yes, it's surprising that such a simple claim generates such
a huge amount of discussion.
 
glen herrmannsfeldt

Keith Thompson said:
Not in my experience. What you've described is what I'd call an array
of ten integers, of which five happen to have unspecified values. I
have no problem calling it a "buffer" if that's what it is, but that's a
statement about how it's used, not just about what it is.

int arr[10];

Well, there are a few possibilities, which generate different
code, and there are reasons to use each:

int arr[10];

static int arr[10];

int *arr;
arr=malloc(10*sizeof(*arr));

defines an array (more precisely an object of array type, even
more precisely an object of type int[10]) of 10 elements; that's
based on how the C standard defines the word "array", and on the
way the word is consistently used in my experience. The array
object consists of 10 int objects, each of which may or may not
have a defined value at the moment.

The instructions needed to access an element of each one on some
early machines with C compilers might have led to a preference for
one over another, even if the optimal choice has changed over
the years.

-- glen
 
glen herrmannsfeldt

(snip)
Well, that's wrong.
It's quite normal when writing game grids to have an N width border
where N is the largest number of tiles any piece might move outside
of the legal play area - it cuts the need for potentially heavy "bounds
checking" since that "move" immediately flags up as an illegal location
because of the "out of bounds" flag set in that destination location's
cell.

It is also a favorite trick in sorting algorithms to speed up the
inner loop. Only testing one thing, instead of two, can make the
loop twice as fast. (Sometimes it can be done without an extra
cell, other times the extra is needed.)
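
A sketch of the "no extra cell" variant, using insertion sort as the example (my choice of algorithm, not necessarily the one meant above). The smallest element is moved to the front first, so the inner loop only has to compare values and never has to test the index.

```c
#include <assert.h>
#include <stddef.h>

/* Insertion sort with a sentinel: a[0] holds the minimum, so the inner
   loop needs one test (the comparison) instead of two. */
void insertion_sort_sentinel(int *a, size_t n)
{
    size_t i, j, min = 0;
    int key;

    assert(a != NULL || n == 0);
    if (n < 2)
        return;

    /* Find the smallest element and swap it into a[0]. */
    for (i = 1; i < n; i++)
        if (a[i] < a[min])
            min = i;
    key = a[min];
    a[min] = a[0];
    a[0] = key;

    for (i = 2; i < n; i++) {
        key = a[i];
        j = i;
        /* No "j > 0" test: a[0] <= key always stops the scan at j == 1. */
        while (a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}
```

Whether this is actually twice as fast depends on the machine and the compiler; the point is only that the index test disappears from the inner loop.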

-- glen
 
Keith Thompson

Malcolm McLean said:
This:

int arr[10];

defines an array (more precisely an object of array type, even
more precisely an object of type int[10]) of 10 elements; that's
based on how the C standard defines the word "array", and on the
way the word is consistently used in my experience. The array
object consists of 10 int objects, each of which may or may not
have a defined value at the moment.
That's because C doesn't make a rigorous distinction between buffers
and arrays.

But it clearly and unambiguously defines the word "array" (see N1570
6.2.5p20), and it does so in a manner that's inconsistent with the way
you've been using the word in this thread.
int x;
x = arr[0];

is undefined behaviour. Most programmers would say that if you can't
legally read an element of an array, the array doesn't contain that
element, so it's not an array of ten integers.

I don't know of *any* programmer (other than you) who would say that.

The elements of the array object "arr" are int objects. There are ten
of them.
But loosely, most C programmers will say "here's an array of ten
integers, currently uninitialised", I agree.

There's nothing loose about it; that's exactly what it is.

I urge you to find a better term for what you're incorrectly calling an
"array"; then perhaps there will be something to discuss.
 
Keith Thompson

Malcolm McLean said:
Well, yes, it's surprising that such a simple claim generates such
a huge amount of discussion.

That's not what generated the discussion.

But to address that point, suppose you have a chess program (N.B.:
*not* snakes and ladders). Its design does not allow for a board
size other than 8 by 8. Should chess-specific functions within
that program that deal with the board always have two additional
parameters that specify the size of the board (whose values will
always be 8 and 8)?
 
BartC

Keith Thompson said:
That's not what generated the discussion.

But to address that point, suppose you have a chess program (N.B.:
*not* snakes and ladders). Its design does not allow for a board
size other than 8 by 8. Should chess-specific functions within
that program that deal with the board always have two additional
parameters that specify the size of the board (whose values will
always be 8 and 8)?

For that purpose, an 8x8 or 1x64 data structure isn't really used as an
array, it's more of a type. It might just about be feasible to use structs
for that purpose instead, then clearly you wouldn't need to pass the sizes,
as that would be an inherent part of the struct data type.

So you wouldn't need to pass 8, 8 as actual parameters, but it's a good idea
not to hardcode the 8's. Just in case.

(I remember coding a sudoku puzzle solver once. These are generally 9x9, so
I hardcoded these values.

Then one day I had to upgrade it to 16x16 (because a newspaper was offering
a prize for a solution of this super-sudoku). Then I had to take all the 9s,
9x9s, 81s, and 3x3s hardcoded in the program, and change them to 16s,
16x16s, 256s and 4x4s! BTW I didn't win...)
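
A rough sketch of the struct idea, assuming hypothetical names (BOARD_RANKS, BOARD_FILES, struct chess_board, count_occupied). The dimensions are part of the type, so nothing is passed as a parameter, but the 8s are written in exactly one place.

```c
#include <assert.h>
#include <stddef.h>

/* The only place the board dimensions are spelled out. */
enum { BOARD_RANKS = 8, BOARD_FILES = 8 };

/* Wrapping the array in a struct makes the size part of the type. */
struct chess_board {
    unsigned char square[BOARD_RANKS][BOARD_FILES];  /* 0 means empty */
};

/* Chess-specific function: no width or height parameters needed. */
size_t count_occupied(const struct chess_board *b)
{
    size_t n = 0;
    int r, f;

    assert(b != NULL);
    for (r = 0; r < BOARD_RANKS; r++)
        for (f = 0; f < BOARD_FILES; f++)
            if (b->square[r][f] != 0)
                n++;
    return n;
}
```

This only helps with the storage and the loops; rules that depend on the board being 8 by 8 would still need reviewing if the constants ever changed.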
 
Ben Bacarisse

Malcolm McLean said:
Well, yes, it's surprising that such a simple claim generates such
a huge amount of discussion.

You cut, without comment, the reasons I gave why you should *not* be, in
my opinion, surprised.
 
glen herrmannsfeldt

(snip)
But to address that point, suppose you have a chess program (N.B.:
*not* snakes and ladders). Its design does not allow for a board
size other than 8 by 8. Should chess-specific functions within
that program that deal with the board always have two additional
parameters that specify the size of the board (whose values will
always be 8 and 8)?

I suppose nobody here watches Star Trek or even Big Bang Theory.

Star Trek has a 3D chess, which I think isn't 8 by 8 by ???.

I think Sheldon has one like it, but also came up with a three
player chess, though as I remember, he never found two other
people to play it with.

-- glen
 
James Kuyper

(snip)


I suppose nobody here watches Star Trek or even Big Bang Theory.

Star Trek has a 3D chess, which I think isn't 8 by 8 by ???.

Keith has already covered that in a later response:
I've seen variants of chess that don't use an 8-by-8 board --
but most of them don't use an N-by-N board for any value of N.
I've seen various forms of 3-D chess, and one version that has
three 4-by-8 sub-boards adjacent to a central triangle. A simple
change to the board size is, as far as I know, one of the *least*
likely variations you might encounter.


The Star Trek 3D chessboard has the same number of squares as a standard
chessboard, but distributed in several different levels. Rules for the
game were never provided in the series, but some were developed
afterwards by fans. No program could possibly implement those rules as a
simple side-effect of making the dimensions of the game board
adjustable. That's the point that Keith was making.
 
Malcolm McLean

For that purpose, an 8x8 or 1x64 data structure isn't really used as an
array, it's more of a type. It might just about be feasible to use structs
for that purpose instead, then clearly you wouldn't need to pass the sizes,
as that would be an inherent part of the struct data type.

So you wouldn't need to pass 8, 8 as actual parameters, but it's a good idea
not to hardcode the 8's. Just in case.
The snag is that if we do
#define CHESSBOARDWIDTH 8

we have to be very careful that all the code reacts properly when we replace
8 with another value. For a chess program, that's likely difficult.

Someone did mention the possibility that you might want guard squares to
speed up the move calculations. So you need 12 (to account for the knight),
and maybe a dummy piece type to occupy the squares.
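
For what it's worth, the guard-square layout might look something like the sketch below. All the names (BOARD, GUARD, PADDED, OFF_BOARD, count_knight_moves) are made up for the example, and it only counts knight destinations on an otherwise empty board.

```c
#include <assert.h>

/* 8 x 8 playing area plus a 2-square border on every side: 12 x 12. */
enum { BOARD = 8, GUARD = 2, PADDED = BOARD + 2 * GUARD };
enum cell { OFF_BOARD = 0, EMPTY = 1 };  /* OFF_BOARD is the dummy type */

static enum cell grid[PADDED][PADDED];  /* zero-initialised: all OFF_BOARD */

/* Mark the inner 8 x 8 region as playable; the border stays OFF_BOARD. */
void init_grid(void)
{
    int r, f;

    for (r = 0; r < BOARD; r++)
        for (f = 0; f < BOARD; f++)
            grid[r + GUARD][f + GUARD] = EMPTY;
}

/* Count knight destinations from (file, rank), both in 0..7.
   No bounds checks on the destination: the border absorbs every jump. */
int count_knight_moves(int file, int rank)
{
    static const int df[8] = { 1, 2, 2, 1, -1, -2, -2, -1 };
    static const int dr[8] = { 2, 1, -1, -2, -2, -1, 1, 2 };
    int i, n = 0;

    assert(file >= 0 && file < BOARD && rank >= 0 && rank < BOARD);
    for (i = 0; i < 8; i++)
        if (grid[rank + GUARD + dr[i]][file + GUARD + df[i]] != OFF_BOARD)
            n++;
    return n;
}
```

A border of 2 is enough only because the knight is the piece that can land furthest outside the board in a single step; sliding pieces stop at the first OFF_BOARD cell they meet.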
 
Malcolm McLean

But it clearly and unambiguously defines the word "array" (see N1570
6.2.5p20), and it does so in a manner that's inconsistent with the way
you've been using the word in this thread.
Which is fine, as long as we're talking about implementing C compilers.

But if we're discussing whether C makes the distinction between a
buffer with data in it and a buffer without in the best way, then
the fact that the C standard uses the term "array" for both types
of objects is hardly decisive. If we argue that they're not drawing
a distinction which they ought to draw, or which it would be better
if they did draw, then it's hardly surprising that they're also using
a word in a way that we might consider to be wrong.

In fact it depends on context. "array" can mean specifically data arranged
at contiguous locations in memory, it can mean data arranged at strides
in memory (so padding or secondary data allowed between elements), it
can mean a high-level structure which can have any underlying
representation, maybe cached to disk, but accessed through an index.
 
Kaz Kylheku

In fact it depends on context. "array" can mean specifically data arranged
at contiguous locations in memory, it can mean data arranged at strides
in memory (so padding or secondary data allowed between elements), it
can mean a high-level structure which can have any underlying
representation, maybe cached to disk, but accessed through an index.

Array can also refer to the disks themselves: the A in RAID.
 
David Brown

On 12/05/14 14:30, Malcolm McLean wrote:

Therefore, /if/ you can afford the resources (typically memory), then it
is almost always better to use fixed sizes that are determined at
compile time. A statically allocated array of fixed size is better in
almost every way than a dynamically malloc'ed array, except that you
cannot (easily) reuse the same memory for other purposes, and you need
to know the sizes at compile time.

This is the reason why high reliability and safety-critical systems
usually ban any sort of dynamic memory. And that is precisely the sort
of system that requires /real/ rigorous development methods, and /real/
good design. When you start using "dynamic this" and "general that",
you end up with a system that cannot be analysed and characterised
properly, and consequently has much greater risks for reliability.
The Turing machine running out of tape is a fundamental theoretical problem,
I agree.
You're mixing up "dynamic memory" with "non-fixed-size arrays". That's because C
doesn't enforce a rigorous distinction between an array and a buffer. C
programmers tend to say

char line[1024];
fgets(line, 1024, fp) ;

"Ok, I've got an array of 1024 bytes".
They don't. They've got a buffer of 1024 bytes, an array of however many characters
fgets happened to read, plus the nul.

You are inventing a distinction between what /you/ call a buffer, and
what /you/ call an array. I think it is likely that our opinions are
not as far apart as first seems regarding dynamic and fixed sizes in
different types of program - it is just that you have invented your own
ideas about what "fixed-size array" and "non-fixed-size array" mean.

So let's get this anchored in the /real/ world - the one in which we
program in C, as defined by the C standards, using terminology common in
the standards as well as literature about the C language.

char line[1024];

This defines a fixed-size array of 1024 characters. It can be used to
store anything the user likes, such as strings, characters, or any other
data. It /always/ has a fixed length of 1024 characters, and it is
/always/ an array. It might not always contain 1024 characters worth of
useful or valid data, but that does not affect the size of the array.

fgets(line, 1024, fp);

This reads at most 1023 characters from "fp" into "line" and appends a
terminating null character. We commonly say "line" is a "buffer" here -
that is one use of an array.

After the "fgets", line is /still/ a fixed-length array of 1024
characters. Only a certain number of characters in it are actually
valid - one can say "there are 'X' characters in buffer 'line'". But
the data type and size of line has never changed - it is always a
fixed-length array of fixed size 1024.
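
As a small illustration (LINE_CAPACITY and read_line are hypothetical names, not anything from the posts above), a wrapper can report how many characters of the buffer are currently valid while the array itself stays 1024 elements long:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { LINE_CAPACITY = 1024 };  /* size of the array, fixed at compile time */

/* Reads one line into the caller's 1024-element array.
   Returns the number of valid characters now in the buffer (not counting
   the terminating '\0'), or 0 on end of file or read error. */
size_t read_line(char line[LINE_CAPACITY], FILE *fp)
{
    assert(line != NULL && fp != NULL);
    if (fgets(line, LINE_CAPACITY, fp) == NULL)
        return 0;
    return strlen(line);  /* includes the '\n' if one was read */
}
```

The return value is the "X characters in buffer" figure; sizeof on the caller's array is 1024 before and after the call.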

Dynamic memory and algorithms which scale in N are obviously linked ideas, but
they're not quite the same thing. If we call strlen() on line, we'd expect strlen to
scale gracefully to any line length. That's obvious, few people would make the mistake
of hardcoding strlen() to a limited buffer size of 1024. But when it's less obvious,
people do often write dependencies into code that they shouldn't.

No, people who are serious programmers expect dependencies and
limitations on all aspects of the system and the code. We are not
programming for Turing machines - we understand that things are limited.
I certainly do /not/ expect strlen() to scale gracefully to any line
length - I expect it to be limited by the target, and /usually/ I expect
it to handle lines of any /practical/ length for the target and
application in question. But I don't expect it to "scale gracefully" to
strings longer than 2^31 on a 32-bit target. I don't expect it to be
happy with strings longer than 2^15 on a 16-bit target - even if the
target has more than 64K memory. On some targets, I don't expect it to
work at all on strings in flash, because I know that on some targets,
flash and ram have different memory spaces. For some types of
programming, I don't expect it to work at all because I cannot be sure
the target string is properly terminated. And on some targets, I don't
expect it to work for long strings because I have limitations on how
much time a function is allowed to take. And on some targets, I might
well make my own strlen() function hardcoded to a limit of 1024 in order
to minimise the damage if it were called on an unterminated string,
because I know that no valid string would be longer than 1024 bytes.
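
A minimal sketch of such a limited strlen(), assuming the 1024 limit from the paragraph above. MAX_VALID_LEN and bounded_strlen are illustrative names, not from any real library.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: no valid string exceeds 1024 bytes. */
enum { MAX_VALID_LEN = 1024 };

/* Like strlen(), but never examines more than MAX_VALID_LEN bytes.
   Returns MAX_VALID_LEN if no terminator is found within the limit. */
size_t bounded_strlen(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (n < (size_t)MAX_VALID_LEN && s[n] != '\0')
        n++;
    return n;
}
```

A caller can treat a result equal to MAX_VALID_LEN as "probably unterminated" and handle it as an error rather than as a length.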

You are making all sorts of unwarranted assumptions because you don't
understand the type of programming that is done in C, and you think your
own little niche of experience covers everything.


Of course, I don't disagree that people often write dependencies when
they should not, or have dependencies or limitations that are not
obvious and not documented, and people often fail to use static
assertions and static compile-time checks when they could catch
conflicts with these limitations.
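
One way such a compile-time check might look, assuming a C11 compiler and a made-up design in which line lengths are stored in an unsigned short (LINE_CAPACITY and stored_length are hypothetical):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { LINE_CAPACITY = 1024 };  /* hypothetical buffer size */

/* Documented limitation, checked by the compiler: every possible line
   length must fit in the unsigned short used to store it. */
_Static_assert(LINE_CAPACITY - 1 <= USHRT_MAX,
               "line lengths must fit in unsigned short");

/* Length of the string in a LINE_CAPACITY-sized buffer, as stored. */
unsigned short stored_length(const char *line)
{
    size_t n = 0;

    assert(line != NULL);
    while (n < (size_t)(LINE_CAPACITY - 1) && line[n] != '\0')
        n++;
    return (unsigned short)n;  /* cannot truncate, per the static assertion */
}
```

If someone later raises LINE_CAPACITY past what unsigned short can hold, the build fails at the assertion instead of the program silently truncating lengths.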
 
BartC

Keith Thompson said:
An array type *is* a type. What distinction are you trying to make?

I meant an application type. Its implementation in whatever primitive type
construct the language provides would be irrelevant.

If you wanted for some reason to represent the three r, g, b values of some
image pixels as an array: char[3], and had a bunch of pixel-processing
routines to work on such a type, would you bother to pass the 3 as an extra
argument to each?

It might be that you wanted to print such a value, and happened to have a
generic char-array printing routine, then in that case you would need to
be aware that this is a char[3] and would need to pass its length.

Or it might be a vector of 3 coordinates: double[3]. Such an array is not in
the same class as u or v in:

#define N 1500
double u[N],v[N];

So they are more like opaque types. In fact my examples could be implemented
as structs too, but arrays offer the opportunity to index the members by
number rather than by name. (This would be an advantage of having value
arrays: to provide that choice, without making compromises.)
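
Roughly what I mean, as a sketch with invented names (rgb_pixel, brightness, sum_bytes). The pixel routine takes a pointer to the whole array, so the 3 is carried by the type; the generic routine works on any byte array and so has to be told the length.

```c
#include <assert.h>
#include <stddef.h>

enum { RGB_CHANNELS = 3 };
typedef unsigned char rgb_pixel[RGB_CHANNELS];  /* application type */

/* Pixel-specific routine: the length is implied by the parameter type. */
unsigned brightness(rgb_pixel *p)
{
    assert(p != NULL);
    return ((unsigned)(*p)[0] + (*p)[1] + (*p)[2]) / RGB_CHANNELS;
}

/* Generic routine: works on any byte array, so the length is a parameter. */
unsigned long sum_bytes(const unsigned char *a, size_t n)
{
    unsigned long total = 0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

The channels are still indexed by number, which is the advantage over a struct with members r, g and b.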
 
Malcolm McLean

And on some targets, I might
well make my own strlen() function hardcoded to a limit of 1024 in order
to minimise the damage if it were called on an unterminated string,
because I know that no valid string would be longer than 1024 bytes.
So in fact you are making the "mistake". But you've given a sensible
justification. Unlike some other people, I don't assume that anyone who
disagrees with me is stupid or inexperienced, or knows nothing about C
outside of their own little world.

strlen() should in my view scale gracefully to any string length, until you hit
fundamental limits like the sizes of integers. We've often got to assume
that an "int" can represent any integer that we want, because trying to
code simple variables with arbitrary precision is just too difficult and time
consuming, outside of specialised mathematical software. That's just a
limitation of current processors that we have to accept, it's not something
we're happy about.

Why not put in a sanity test, if you know that no string will go over 1K?
It's not a completely stupid idea, but things like that have a way of persisting
well after the reason has gone away.
 
