Please help optimize (and standardize) this code...

Mark F. Haigh

gtippery wrote:
The platform limit is 64KB for any one item (i8086). The array would
be the limiting factor in this case (at something over 5,000
elements),

[OT]

The limits change depending on the compiler / memory model combination.
I know there's one out there (a huge model variant) using 16 bit int
and 32 bit size_t. Can't remember which compiler, and don't care to--
hopefully my DOS programming days are far behind me (pun intended),
never to return.

If size_t isn't actually an int (of some standard type), isn't that
going to be a problem in the for()? I thought the index variable had
to be an integer or enumeration.

The standard states (7.17) that "size_t [...] is the unsigned integer
type of the result of the sizeof operator". Use it like any other
integral type, including in for statements and subscripting operators
(i.e. []). Generally, size_t is a better choice than int for array
subscripting.

So, making a few simple modifications to my previously posted code:

---snip---
#include <stdio.h>

#define MAXCOL 2
#define NAMLEN 8
#define EXTLEN 3

struct NameExt
{
    char name[NAMLEN];
    char ext[EXTLEN];
};


int main(void)
{
    /* Some sample filenames, padded to exactly NAMLEN and EXTLEN chars */
    struct NameExt list[] = {
        { "One     ", "1  " },
        { "TwoTwo  ", "22 " },
        { "ThreeThr", "333" },
        { "Four    ", "4  " },
        { "FiveFive", "55 " },
        { "SixSixSi", "666" },
    };

    /* Print the array out, MAXCOL entries per line */
    size_t i;
    for(i = 0; i < sizeof(list) / sizeof(list[0]); i++)
        printf("%-*.*s.%-*.*s%s",
               NAMLEN, NAMLEN, list[i].name,
               EXTLEN, EXTLEN, list[i].ext,
               (i + 1) % MAXCOL ? " " : "\n");

    return 0;
}
--- snip ---

[mark@icepick ~]$ gcc -Wall -O2 -ansi -pedantic foo.c -o foo
[mark@icepick ~]$ ./foo
One     .1   TwoTwo  .22
ThreeThr.333 Four    .4
FiveFive.55  SixSixSi.666

I believe that's what you're looking for. That's the simplest code I
can think of off the top of my head.


Mark F. Haigh
(e-mail address removed)
 
gtippery

Mark said:
gtippery wrote:
If size_t isn't actually an int (of some standard type), isn't that
going to be a problem in the for()? I thought the index variable had
to be an integer or enumeration.

The standard states (7.17) that "size_t [...] is the unsigned integer
type of the result of the sizeof operator". Use it like any other
integral type, including in for statements and subscripting operators
(i.e. []). Generally, size_t is a better choice than int for array
subscripting.

I still don't understand why it's "better", by which I suppose you mean
"bigger" (meaning wider range - for positive values, of course).
Wouldn't an implementation with e.g. 16-bit size_t and 32-bit type int
meet the spec you quote? Do you mean it's _likely_ to be bigger? Or
do you just mean that for a given size, unsigned int is (probably)
"bigger" than signed int?

Or maybe I've missed the point. Other than range, is there some reason
specific to array subscripting to prefer size_t ?
So, making a few simple modifications to my previously posted code:

[snip: code and output, quoted in full from the post above]


That is indeed the desired output format, and indeed simple. With
luck, the compiler will factor out the constant expression in the for's
test expression.

I note you've changed the internal data representation somewhat, but I
assume it was just to simplify the example by using preinitialization.
I wanted to do that for the initial posting, but couldn't figure out
how. Is that an array of structures of strings? (As I mentioned, I
have trouble with C's declaration syntax, but I'm learning -- I hope.)
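
On the constant-expression point, a minimal sketch: for an ordinary (non-variable-length) array, sizeof is evaluated at compile time, so the loop bound needs no luck from the optimizer. The ARRAY_LEN macro, the flag array and list_length() are illustrative names, not part of the posted code.

```c
#include <stddef.h>

#define NAMLEN 8
#define EXTLEN 3

/* Illustrative macro (not from the thread): element count of an array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

struct NameExt
{
    char name[NAMLEN];   /* each member is an array of char ...        */
    char ext[EXTLEN];    /* ... so list is an array of such structures */
};

struct NameExt list[6];

/* This declaration compiles only because ARRAY_LEN(list) is an
   integer constant expression, known at compile time. */
char one_flag_per_entry[ARRAY_LEN(list)];

/* Illustrative accessor so the count can be checked at run time. */
size_t list_length(void)
{
    return ARRAY_LEN(list);
}
```

Note that the macro only works on a real array; applied to a pointer it silently divides the pointer size by the element size.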
 
Michael Mair

gtippery said:
Mark said:
gtippery wrote:
If size_t isn't actually an int (of some standard type), isn't that
going to be a problem in the for()? I thought the index variable
had
to be an integer or enumeration.

The standard states (7.17) that "size_t [...] is the unsigned integer
type of the result of the sizeof operator". Use it like any other
integral type, including in for statements and subscripting operators
(i.e. []). Generally, size_t is a better choice than int for array
subscripting.

I still don't understand why it's "better", by which I suppose you mean
"bigger" (meaning wider range - for positive values, of course).
Wouldn't an implementation with e.g. 16-bit size_t and 32-bit type int
meet the spec you quote? Do you mean it's _likely_ to be bigger? Or
do you just mean that for a given size, unsigned int is (probably)
"bigger" than signed int?

Or maybe I've missed the point. Other than range, is there some reason
specific to array subscripting to prefer size_t ?

size_t is the type of the result of the sizeof operator and the
type of argument taken by the dynamic memory allocation routines
malloc/calloc/realloc -- so, basically, every object you work with
has a size in bytes which can be expressed in size_t. The same
guarantee does not hold for short, int, long of either signed or
unsigned variety. So, using size_t for your index variables, you
can _never_ (*) go wrong. The downside is that you have to be
more careful with your loop tests as unsigned integer types never
can provide values <0.

(*) never: There are no absolutes. Using automatic or static
or dynamically allocated storage, within standard C you will be
on the safe side. Note that size_t has not to be large enough
to count, for example, the number of bytes in a file.
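
The caveat about loop tests can be sketched with a backwards scan. The helper name find_last and its "return n when not found" convention are assumptions made for the example only.

```c
#include <stddef.h>

/* Hypothetical helper: index of the last occurrence of c in
   buf[0..n-1], or n if c does not occur. */
size_t find_last(const char *buf, size_t n, char c)
{
    size_t i;

    /* With an unsigned index, "for (i = n - 1; i >= 0; i--)" never
       terminates: i >= 0 is always true and 0 - 1 wraps around.
       Testing before decrementing avoids that. */
    for (i = n; i-- > 0; ) {
        if (buf[i] == c)
            return i;
    }
    return n;
}
```

The same loop shape also handles n == 0 without a special case, since the test fails before the body ever runs.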

[snip: solved problem]


Cheers
Michael
 
Mark F. Haigh

gtippery wrote:
I still don't understand why it's "better", by which I suppose you mean
"bigger" (meaning wider range - for positive values, of course).
Wouldn't an implementation with e.g. 16-bit size_t and 32-bit type int
meet the spec you quote? Do you mean it's _likely_ to be bigger? Or
do you just mean that for a given size, unsigned int is (probably)
"bigger" than signed int?

Or maybe I've missed the point. Other than range, is there some reason
specific to array subscripting to prefer size_t ?

When you're talking about the *size* of things, use size_t. Take the
canonical example (incidentally, on a DOS-like platform):

#define SIZE 20000
char buf[SIZE];

Let's say it was changed to:

#define SIZE 40000

The thing is, any loop using a signed int (16 bit) index will fail
after element 32767, causing undefined behavior (signed integer
overflow). A size_t would make it to the maximum *size* buf can be.
The size of int is really unrelated to the maximum size of objects.
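
A minimal sketch of that example, assuming a hypothetical fill_buffer helper: because the index is a size_t, it can reach every element of buf no matter how wide int happens to be.

```c
#include <stddef.h>

#define SIZE 40000

char buf[SIZE];

/* Hypothetical helper: set every element of buf to value and return
   the number of elements written. */
size_t fill_buffer(char value)
{
    size_t i;

    /* sizeof buf has type size_t, so the index and the bound agree in
       type. A 16-bit int index would overflow when incremented past
       32767, well short of SIZE. */
    for (i = 0; i < sizeof buf; i++)
        buf[i] = value;
    return i;
}
```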

That is indeed the desired output format, and indeed simple. With
luck, the compiler will factor out the constant expression in the for's
test expression.

I note you've changed the internal data representation somewhat, but I
assume it was just to simplify the example by using preinitialization.
I wanted to do that for the initial posting, but couldn't figure out
how. Is that an array of structures of strings? (As I mentioned, I
have trouble with C's declaration syntax, but I'm learning -- I
hope.)

The internal data representation is the same as you need. C does not
include the terminating null ('\0') if there is no room for it:

6.7.8 Initialization

[...]

[#14] An array of character type may be initialized by a
character string literal, optionally enclosed in braces.
Successive characters of the character string literal
(including the terminating null character if there is room
or if the array is of unknown size) initialize the elements
of the array.

Since the size of each array is specified in each case (8 and 3,
respectively), if you provide exactly 8 and exactly 3 characters for
each initializer, the \0 will not be tacked on to the end. In other
words, the C implementation will not overflow the buffer for you, it'll
leave you to do that on your own. ;-)
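
A small sketch of the consequence, under the assumption of a hypothetical field_to_string helper: a field initialized with exactly NAMLEN characters has no terminator, so anything that reads it must carry its own length limit, as the precision in the printf format does.

```c
#include <stddef.h>
#include <stdio.h>

#define NAMLEN 8

/* Exactly NAMLEN characters: valid C, but no '\0' is stored. */
const char full_field[NAMLEN] = "ThreeThr";

/* Shorter literal: the '\0' fits and the remaining elements are zeroed. */
const char short_field[NAMLEN] = "One";

/* Hypothetical helper: copy a fixed-width, possibly unterminated field
   into out (at least NAMLEN + 1 bytes) as an ordinary C string. */
size_t field_to_string(char *out, const char *field)
{
    /* The precision stops the read after NAMLEN chars, so the missing
       terminator in full_field is never looked for. */
    return (size_t)sprintf(out, "%.*s", NAMLEN, field);
}
```

Passing full_field to strlen() or to a plain "%s" would read past the end of the array, which is the overflow being joked about above.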

Keep it up. We like the hard questions around here.


Mark F. Haigh
(e-mail address removed)
 
pete

Michael Mair wrote:
size_t is the type of the result of the sizeof operator and the
type of argument taken by the dynamic memory allocation routines
malloc/calloc/realloc -- so, basically, every object you work with
has a size in bytes which can be expressed in size_t. The same
guarantee does not hold for short, int, long of either signed or
unsigned variety.

The same guarantee has to hold for the highest ranking
unsigned type, which is unsigned long in C89.
size_t has the option of being smaller than the highest ranking
unsigned type, for implementations where that may be desirable.
 
gtippery

Michael Mair wrote:
....
Note that size_t has not to be large enough
to count, for example, the number of bytes in a file.

"has not to be"? Ah, was that idiom for "does not have to be", or typo
for "has got to be"?
 
gtippery

pete said:
The same guarantee has to hold for the highest ranking
unsigned type, which is unsigned long in C89.
size_t has the option of being smaller than the highest ranking
unsigned type, for implementations where that may be desirable.

I'm thinking perhaps a segmented or paged architecture. Wider
operands, but paged addressing using fewer address bits. Seems to me a
number of older computers were like this, as well as some present-day
microcontrollers (i8051?)

And some nominally 32-bit machines have a wider integer type supported
by their FPU. You can calculate with it, but you can't address with
it.
 
gtippery

Mark F. Haigh wrote:

....
The internal data representation is the same as you need. C does not
include the terminating null ('\0') if there is no room for it:

6.7.8 Initialization

[...]

[#14] An array of character type may be initialized by a
character string literal, optionally enclosed in braces.
Successive characters of the character string literal
(including the terminating null character if there is room
or if the array is of unknown size) initialize the elements
of the array.

Since the size of each array is specified in each case (8 and 3,
respectively), if you provide exactly 8 and exactly 3 characters for
each initializer, the \0 will not be tacked on to the end. In other
words, the C implementation will not overflow the buffer for you, it'll
leave you to do that on your own. ;-)

That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?
Keep it up. We like the hard questions around here.

That's _definitely_ a Good Thing <grin>.
 
Michael Mair

gtippery said:
Michael Mair wrote:
...


"has not to be"? Ah, was that idiom for "does not have to be", or typo
for "has got to be"?

The former ;-)

Cheers
Michael
 
Michael Mair

pete said:
Michael Mair wrote:



The same guarantee has to hold for the highest ranking
unsigned type, which is unsigned long in C89.
size_t has the option of being smaller than the highest ranking
unsigned type, for implementations where that may be desirable.

Thanks for the expansion -- I thought I'd better leave that
out as we then get to C89 vs. C99.
size_t does always fit.

-Michael
 
pete

gtippery wrote:
That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.
 
gtippery

pete said:
gtippery said:
That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.

That interpretation isn't very obvious to me from what you quote;
perhaps it is in the larger context. (Can't check, the copy I
downloaded only goes to 4.13, including the library. Either it got
renumbered or we're on different versions.)
Sounds to me like it's saying you can't initialize object z in object
y's initializer unless z is part of y.
 
Walter Roberson

:pete wrote:
:> char array[8] = "123456789";
:> is undefined.
:> It violates a "shall constraint".

:That interpretation isn't very obvious to me from what you quote;

The initializer is attempting to provide a value for the
"object" which is the character at array+8, but that object is
not part of the object array[] which is being initialized, since
array[] goes from array+0 to array+7.
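
For contrast with the too-long initializer, a sketch of the two forms that do fit. The variable names are made up for the example, and the rejected declaration is left in a comment so the block still compiles.

```c
/* Size left unspecified: the compiler counts nine characters plus the
   terminating '\0', giving ten elements. */
const char sized_by_compiler[] = "123456789";

/* Exactly nine elements: every character fits, and the '\0' is simply
   not stored. */
const char exact_fit[9] = "123456789";

/* Does not fit. This is the constraint violation discussed above, so
   a diagnostic is expected from the compiler:
   char too_small[8] = "123456789"; */
```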
 
Stephen Sprunk

gtippery said:
gtippery said:
That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.

That interpretation isn't very obvious to me from what you quote;
perhaps it is in the larger context. (Can't check, the copy I
downloaded only goes to 4.13, including the library. Either it got
renumbered or we're on different versions.)
Sounds to me like it's saying you can't initialize object z in object
y's initializer unless z is part of y.

Your initializer is trying to provide values for array[8] and array[9],
which are not contained within the char[8] object called "array". The
behavior is undefined.

S
 
pete

gtippery said:
pete wrote:
char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.
That interpretation isn't very obvious to me from what you quote;
perhaps it is in the larger context. (Can't check, the copy I
downloaded only goes to 4.13, including the library. Either it got
renumbered or we're on different versions.)
Sounds to me like it's saying you can't initialize object z in object
y's initializer unless z is part of y.

That's what I'm saying.
I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.
 
lawrence.jones

pete said:
I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.

That language had to change in C99 due to designated initializers. For
example:

int a[10] = {[11] = 0};

is invalid even though there are 10 objects and only one initializer.

-Larry Jones

Years from now when I'm successful and happy, ...and he's in
prison... I hope I'm not too mature to gloat. -- Calvin
 
Joe Wright

pete said:
I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.


That language had to change in C99 due to designated initializers. For
example:

int a[10] = {[11] = 0};

is invalid even though there are 10 objects and only one initializer.

-Larry Jones

Years from now when I'm successful and happy, ...and he's in
prison... I hope I'm not too mature to gloat. -- Calvin

Given..
int a[10] = {[11] = 0};
^^^^^^^^^^
...I have never seen an initializer like that. What do you think it
means? What is the type of this expression?
 
Dave Vandervies

Joe Wright said:
Given..
int a[10] = {[11] = 0};
^^^^^^^^^^
..I have never seen an initializer like that.

It's a C99ism.
What do you think it
means? What is the type of this expression?

It means precisely what 6.7.8#6 of n869 (and, unless the numbering has
changed, the same paragraph of C99) says it means, and has the type that
the same paragraph says it has.

(In this case, it's an attempt to initialize an array[10] of int with
an initializer of type array[more-than-10], which is what makes it an
appropriate example of an invalid initializer.)


dave
 
lawrence.jones

Joe Wright said:
Given..
int a[10] = {[11] = 0};
^^^^^^^^^^
..I have never seen an initializer like that. What do you think it
means? What is the type of this expression?

It's a designated initializer (a new feature in C99) -- it's an attempt
to initialize a[11] to 0 (which is invalid since a only has 10
elements). Since it's not an expression, it has no type. A more
complete (and valid!) example:

int a[10] = {0, 1, [7] = 7, 8, [2] = 2, 3, 4};

explicitly initializes a[0] to 0, a[1] to 1, a[2] to 2, a[3] to 3, a[4]
to 4, a[7] to 7, a[8] to 8, and implicitly initializes all the other
elements to 0. There's a similar construct for struct and union
members and they can be combined:

struct foo s = {.x = 10, .y = 20, .u.z = 4, .a[3] = 3};

-Larry Jones

What's Santa's definition? How good do you have to be to qualify as good?
-- Calvin
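
The two designated-initializer examples above can be checked with a short C99 sketch. The thread never shows the definition of struct foo, so the members below are assumed purely so that the initializer compiles.

```c
/* Assumed layout: only the member names come from the initializer in
   the post; the types and the second union member are guesses. */
struct foo {
    int x;
    int y;
    union { int z; double d; } u;
    int a[4];
};

/* Designators may appear in any order; an undesignated value goes to
   the element after the one most recently initialized. */
int a[10] = {0, 1, [7] = 7, 8, [2] = 2, 3, 4};

/* Members that are not named (here s.a[0] through s.a[2]) are
   initialized to zero. */
struct foo s = {.x = 10, .y = 20, .u.z = 4, .a[3] = 3};
```

This needs a C99 or later compiler mode; the -ansi flag used earlier in the thread selects C89, where designators are not part of the language.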
 
