initializing data at compile time

Bart Goeman

Hi,

I have a question about how to put redundant information in data
structures that are initialized at compile time. This is often necessary
for performance reasons and can't be done at run time (the data
structures are read-only).
Ideally one should be able to put the redundant
information there automatically so no mistakes are possible, but in a lot
of cases I see no way to do it.

stupid but simple example:
typedef unsigned char u8;
const char lbl0[3] = "cow";
const char lbl1[5] = "horse";
struct _header
{
const char * label;
u8 length;
};
#define fill_header(lbl) {(lbl),sizeof(lbl)}
const struct _header headers[]=
{
fill_header(lbl0),
fill_header(lbl1)
};

struct _vector
{
const int * data;
u8 header; //index into headers structure
u8 headerlength; //copy of the length field of the header
};

extern const int p0[3];
extern const int p1[4];

#define USEHEADER(h) (h), headers[(h)].length
const struct _vector vectorA[]=
{
{p0,USEHEADER(0)}, //label cow is used
{p1,USEHEADER(1)}, //label horse is used
};
const struct _vector vectorB[]=
{
{p0,0,3}, //label cow is used
{p1,1,5}, //label horse is used
};

First I define an array with strings plus their lengths (headers). Next I
define an array with data that references these strings. Each element of
the array holds an index into the headers, and also some redundant
information: the length of the header string. To initialize the redundant
information, I often can't use an automated method. vectorA does not
compile. So I have to put it there manually (vectorB), which is of course
error-prone and clumsy for big data structures. Does anyone know an elegant
method to automate things like this in C? Or do I have to generate the
data structures using other means (e.g. Perl)? I have the same type of
problems when I try to do some sanity checks on related structures at
compile time.

regards,
Bart Goeman
 
Jack Klein

Hi,

I have a question about how to put redundant information in data

What do you mean by "redundant information"? Why do you think you
need it?
structures, initialized at compile time. This is often necessary
for performance reasons and can't be done at run time (data
structures are read only)
Ideally one should be able to put the redundant
information there automatically so no mistakes are possible, but in a lot
of cases I see no way to do it.

stupid but simple example:
typedef unsigned char u8;
const char lbl0[3] = "cow";
const char lbl1[5] = "horse";

You realize that the arrays above are not strings, as they are missing
the terminating '\0'. Any passing of them to C string handling
functions produces undefined behavior.
struct _header

Don't do this, why do you think you need to start a tag name with an
underscore? All identifiers beginning with an underscore are reserved
by the implementation at file scope in both the ordinary and tag name
spaces.
{
const char * label;
u8 length;
};
#define fill_header(lbl) {(lbl),sizeof(lbl)}
const struct _header headers[]=
{
fill_header(lbl0),
fill_header(lbl1)
};

struct _vector
{
const int * data;
u8 header; //index into headers structure
u8 headerlength; //copy of the length field of the header
};

extern const int p0[3];
extern const int p1[4];

#define USEHEADER(h) (h), headers[(h)].length
const struct _vector vectorA[]=
{
{p0,USEHEADER(0)}, //label cow is used
{p1,USEHEADER(1)}, //label horse is used
};

Of course this does not work. You are trying to initialize a member
of a structure with static storage duration with the value of an
object. That is not a compile-time constant expression, and so is not
valid.
const struct _vector vectorB[]=
{
{p0,0,3}, //label cow is used
{p1,1,5}, //label horse is used
};

The real problem here is your intermediate structure. Why do you need
it? All it contains is a pointer to char and the length of the array
of chars, and you are putting the length of the array into the higher
level structure anyway. Why not just eliminate the intermediate
structure and put the char pointer and length directly in the final
array?

The other alternative is to omit the "redundant" size parameter from
the final structure. Why do you need it twice? What is the point of
the redundancy? What do you think it buys you in terms of robustness?
First I define an array with strings + the length (headers). Next I
define an array with data that references these strings. Each element of
the array holds an index into the headers, and also some redundant
information, the length of the header string. To initialize the redundant
information, I often can't use an automated method. vectorA does not
compile. So I have to put it there manually (vectorB), which is of course
error-prone and clumsy for big data structures. Does anyone know an elegant
method to automate things like this in C? Or do I have to generate the
data structures using other means? (e.g. perl) I have the same type of
problems when I try to do some sanity checks on related structures at
compile time.

regards,
Bart Goeman

You still haven't explained the need for the "redundant" information.
I can't think of any particular need for this in a program.

If you have a specific problem you are trying to solve, or a specific
result you are trying to receive, it would be better if you posted
again and explained what it is you are actually trying to accomplish.
Then perhaps we can suggest a better way to go about it.
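
For illustration, the first suggestion above (drop the intermediate table and store the label pointer and its length directly in the final array) might look like the sketch below. The names struct vector, VECTOR_ENTRY and check_vectors, and the contents of p0 and p1, are assumptions made for this example; the original post only declares p0 and p1 as extern.

```c
#include <assert.h>

typedef unsigned char u8;

/* Unterminated labels, as in the original post; the length is stored
   separately, so no '\0' is needed. */
const char lbl0[3] = "cow";
const char lbl1[5] = "horse";

/* Placeholder data (assumed; the original only has extern declarations). */
const int p0[3] = {1, 2, 3};
const int p1[4] = {4, 5, 6, 7};

struct vector {
    const int  *data;
    const char *label;        /* points straight at the label */
    u8          labellength;  /* filled in by sizeof, never typed by hand */
};

/* sizeof(lbl) is an integer constant expression, so it is a valid
   initializer for an object with static storage duration. */
#define VECTOR_ENTRY(d, lbl) { (d), (lbl), (u8)sizeof(lbl) }

const struct vector vectors[] = {
    VECTOR_ENTRY(p0, lbl0),   /* label cow is used */
    VECTOR_ENTRY(p1, lbl1),   /* label horse is used */
};

/* Optional run-time cross-check of the generated lengths. */
void check_vectors(void)
{
    assert(vectors[0].labellength == 3);
    assert(vectors[1].labellength == 5);
}
```

The macro only works when its second argument is an array (not a pointer), because sizeof must see the array type.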
 
Bart Goeman

On Sat, 18 Dec 2004 16:32:06 -0600, Jack Klein wrote:
Hi,

I have a question about how to put redundant information in data

What do you mean by "redundant information"? Why do you think you
need it?
structures, initialized at compile time. This is often necessary
for performance reasons and can't be done at run time (data
structures are read only)
Ideally one should be able to put the redundant
information there automatically so no mistakes are possible, but in a lot
of cases I see no way to do it.

stupid but simple example:
typedef unsigned char u8;
const char lbl0[3] = "cow";
const char lbl1[5] = "horse";

You realize that the arrays above are not strings, as they are missing
the terminating '\0'. Any passing of them to C string handling
functions produces undefined behavior.

Yes, I know. It's a simple example.
It saves some space, however, and Pascal-style strings are handier in
my application: I have to retrieve them over a serial line, and it's
easier if you know beforehand how many bytes you need to request. But that is beside
the point.
Don't do this, why do you think you need to start a tag name with an
underscore? All identifiers beginning with an underscore are reserved
by the implementation at file scope in both the ordinary and tag name
spaces.
You have a point. Bad practice.
{
const char * label;
u8 length;
};
#define fill_header(lbl) {(lbl),sizeof(lbl)}
const struct _header headers[]=
{
fill_header(lbl0),
fill_header(lbl1)
};

struct _vector
{
const int * data;
u8 header; //index into headers structure
u8 headerlength; //copy of the length field of the header
};

extern const int p0[3];
extern const int p1[4];

#define USEHEADER(h) (h), headers[(h)].length
const struct _vector vectorA[]=
{
{p0,USEHEADER(0)}, //label cow is used
{p1,USEHEADER(1)}, //label horse is used
};

Of course this does not work. You are trying to initialize a member
of structures with static storage duration with the value of an
object. This is not a compile time constant expression, and so is not
valid.

This is the core of my question. In C, headers[h].length is not a
compile-time constant expression, but every sane person will say it really
is a compile-time constant expression, since headers is a const
array. So I'm asking: is there any way to circumvent this C limitation?
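
One possible workaround that stays within plain C is to give each length a name that really is a constant expression, for example an enumeration constant, and let the macro paste that name instead of reading headers[]. This is only a sketch under assumptions: the names LEN_0, LEN_1 and check_vectorA are hypothetical, the tags drop the leading underscore, and p0 and p1 get placeholder contents.

```c
#include <assert.h>

typedef unsigned char u8;

const char lbl0[3] = "cow";
const char lbl1[5] = "horse";

/* Enumeration constants are integer constant expressions, unlike the
   value of a const object. The LEN_<index> names are hypothetical. */
enum { LEN_0 = sizeof(lbl0), LEN_1 = sizeof(lbl1) };

struct header { const char *label; u8 length; };

const struct header headers[] = {
    { lbl0, LEN_0 },
    { lbl1, LEN_1 },
};

struct vector {
    const int *data;
    u8 header;        /* index into headers */
    u8 headerlength;  /* copy of headers[header].length */
};

/* Placeholder data (assumed). */
const int p0[3] = {0};
const int p1[4] = {0};

/* h must be a literal index so that LEN_##h forms a valid name. */
#define USEHEADER(h) (h), LEN_##h

const struct vector vectorA[] = {
    { p0, USEHEADER(0) },  /* label cow is used */
    { p1, USEHEADER(1) },  /* label horse is used */
};

/* Run-time cross-check that the copies agree with the table. */
void check_vectorA(void)
{
    unsigned i;
    for (i = 0; i < sizeof(vectorA) / sizeof(vectorA[0]); i++)
        assert(vectorA[i].headerlength == headers[vectorA[i].header].length);
}
```

The cost is that the index passed to USEHEADER has to be a literal token, and the enum list has to be kept in step with the headers table by hand or by another macro.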


const struct _vector vectorB[]=
{
{p0,0,3}, //label cow is used
{p1,1,5}, //label horse is used
};

The real problem here is your intermediate structure. Why do you need
it? All it contains is a pointer to char and the length of the array of
chars, and you are putting the length of the array into the higher level
structure anyway. Why not just eliminate the intermediate structure and
put the char pointer and length directly in the final array?

The other alternative is to omit the "redundant" size parameter from the
final structure. Why do you need it twice? What is the point of the
redundancy? What do you think it buys you in terms of robustness?
First I define an array with strings + the length (headers). Next I
define an array with data that references these strings. Each element
of the array holds an index into the headers, and also some redundant
information, the length of the header string. To initialize the
redundant information, I often can't use an automated method. vectorA
does not compile. So I have to put it there manually (vectorB), which
is of course error-prone and clumsy for big data structures. Does
anyone know an elegant method to automate things like this in C? Or do I
have to generate the data structures using other means? (e.g. perl) I
have the same type of problems when I try to do some sanity checks on
related structures at compile time.

regards,
Bart Goeman

You still haven't explained the need for the "redundant" information. I
can't think of any particular need for this in a program.

If you have a specific problem you are trying to solve, or a specific
result you are trying to receive, it would be better if you posted again
and explained what it is you are actually trying to accomplish. Then
perhaps we can suggest a better way to go about it.

I wanted to give a simple and
short example, but apparently my example was far too simple.

Redundant information is often needed to speed up calculations. Speed is
far more of an issue for embedded applications than for desktop apps.
It buys absolutely nothing in terms of robustness (it makes things worse, of course),
but it should be possible to avoid the robustness issue by having
the compiler/preprocessor construct the values.

This redundant information can be calculated at the start of the program,
which is an easy solution, but that is a problem for embedded
applications with little RAM available: you want to store it in ROM/flash.
But C makes life difficult.

I will give you two real-world examples. In fact I know far more examples;
it's a problem I run into quite often.

Example 1:
I have a device with an LCD display. It can display messages.
A proportional font is stored in ROM: a const array that holds, for
each ASCII character, the width of the character, some other info and a
pointer to the actual bitmap. Second, there is another big const array
with a lot of fixed messages in it. If a message does not fit on the
screen (77 columns) you have to scroll it when you display it, so you need
to know the width of each message. To speed up the program you want to
store the width of a message, so you know immediately whether you have to
scroll. The width of the message is of course redundant; you can calculate
it at run time. But if plenty of ROM is available and time is not, this
information should be stored in ROM.

You know this before the program starts,
so you want to store the width in the messages
array, which has to be const and therefore has to be initialized
at compile time.

typedef unsigned char u8;
typedef unsigned short u16;

struct s_font
{
u8 width; //#columns
const u8 * pBitmap; //height is fixed(16), so 2 bytes/column
};

const struct s_font font[256]=
{
//fontdata
};

struct s_msg
{
const char * msg;
u16 width;
};

const struct s_msg msg[]=
{
{"hello world",0}, //width??
{"Too many characters",0}, //width??
{"bold",0}, //width??
};
 
jacob navia

You can store pascal strings in C using:

typedef struct pString {
int len; // Number of chars without the zero
char *str;
} PSTRING;

PSTRING foo = {
sizeof("myString")-1, // Avoid counting the trailing zero
"myString"
};

or you can

#define MAKE_PSTRING(a) { sizeof(a)-1,a}

then

PSTRING foo = MAKE_PSTRING("myString");
 
Chris Torek

... this is the core of my question. in C, headers[h].length is not a
compile-time constant expression, but every sane person will say it really
is a compile-time constant expression, since headers is a const
array. So I'm asking is there any way to circumvent this C limitation?

No. Or rather, yes, but it is not pleasing: "avoid writing such
code in the first place". :)

As Gildor said to Frodo, "You have not told me all concerning
yourself, and how then shall I choose better than you?" But with
that dangerous gift of advice, I think you should treat the C source
for these initializers as object code, generated from some other
source. Write a small compiler that reads the true source, and
generates a ".c" file containing the initialized data complete with
redundant information. This .c file, despite its name, is object
code, that is "linked" with the remaining code by running the
compilation phase *and* the link phase, rather than simply the link
phase.
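
As a rough sketch of that idea, a host-side generator for the message table from Example 1 could look like the code below. Everything here is an assumption made for illustration: char_width is a made-up stand-in for the real font table, and message_width, format_entry and emit_table are hypothetical names. Messages containing quotes or backslashes are not escaped.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the real font table: every character is
   6 columns wide except 'i' and 'l' (2 columns) and ' ' (3 columns). */
static unsigned char_width(unsigned char c)
{
    if (c == 'i' || c == 'l') return 2;
    if (c == ' ') return 3;
    return 6;
}

/* Width of a whole message in display columns. */
unsigned message_width(const char *s)
{
    unsigned w = 0;

    assert(s != NULL);
    while (*s != '\0')
        w += char_width((unsigned char)*s++);
    return w;
}

/* Format one initializer line of the form {"text",width},
   Returns what snprintf returns: the length of the formatted line. */
int format_entry(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "{\"%s\",%u},", msg, message_width(msg));
}

/* Write the complete generated table to out. */
void emit_table(FILE *out, const char *const *msgs, size_t n)
{
    char line[256];
    size_t i;

    fputs("/* generated file, do not edit */\n", out);
    fputs("const struct s_msg msg[]=\n{\n", out);
    for (i = 0; i < n; i++) {
        format_entry(line, sizeof line, msgs[i]);
        fprintf(out, "%s\n", line);
    }
    fputs("};\n", out);
}
```

A small driver would call emit_table(stdout, ...) with the list of messages and the build would redirect its output into a .c file that is then compiled with the rest of the program.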
 
Bart Goeman

On Sat, 18 Dec 2004 16:32:06 -0600, Jack Klein wrote in comp.lang.c:


If you have a specific problem you are trying to solve, or a specific
result you are trying to receive, it would be better if you posted
again and explained what it is you are actually trying to accomplish.
Then perhaps we can suggest a better way to go about it.

Example 2:
A controller has a lot of parameters. These are stored in a standard
format so they can be changed:
a PC can request info on all the parameters available in the controller,
display them in menus and write them back over a serial line.
I can give you some code, but it's rather big and ugly, so I'll describe the
basic properties.

Each menu is a two-dimensional matrix, with headers for all columns and
rows.

Each menu is defined by a struct
holding the number of columns in the menu and a pointer to an array with
the column headers:



typedef unsigned char u8;
#define SIZE(x) (sizeof(x)/sizeof(x[0]))
struct s_menu
{
const char * menuname;
u8 n_columnheaders;
const char ** columnheaders;
//other info
};

const char * clutches_columnheaders[]=
{
"forward",
"reverse",
"gear1",
"gear2",
"gear3",
"gear4"
};
const struct s_menu menus[]=
{
{"clutches",SIZE(clutches_columnheaders),clutches_columnheaders},
//other menus
};
#define MENU_CLUTCHES 0

There is another array of structs with all the rows in it.
Each element of the array is a struct with the name of the row, the
menu to which the row belongs and a pointer to the actual data:

struct s_parameter
{
u8 menu_id;
char * rowheader;
int * data;
//other info
};

A file s.def holds all the parameters:
DEFPARAMETER("Pressure on",MENU_CLUTCHES,clutchonpressure)
//other parameters

and I use it to define the array:

int clutchonpressure[6]; //the actual data used in the controller
const struct s_parameter parameters[]=
{
#define DEFPARAMETER(name,menuid,data) \
{ menuid, name, (int *) data },
#include "s.def"
#undef DEFPARAMETER
};

If anyone adds a new parameter, he can easily make
mistakes.
E.g. you can put a parameter with 5 elements in a menu with 6 columns.
Data corruption and difficult bugs are assured if you update the data over
a serial line, because the PC writes 6 elements.

So I'd like to check these data structures with compile-time assertions
(see http://www.jaggersoft.com/pubs/CVu11_3.html
http://www.jaggersoft.com/pubs/CVu11_5.html ).

#define COMPILE_TIME_ASSERT(pred) \
switch(0) {case 0: case (pred):;}

I want to test at compile time; I don't want to download it to
the controller (several minutes) to discover something is wrong.
Run-time checking is also a waste of code space, since all checking can be done at
compile time.

I'd like to test it as follows:
static void compile_time_asserts(void)
{
#define DEFPARAMETER(name,menuid,data) \
COMPILE_TIME_ASSERT(SIZE(data)==menus[menuid].n_columnheaders);
#include "s.def"
#undef DEFPARAMETER
}

This does not work because menus[menuid].n_columnheaders is not
a constant expression, according to C's interpretation. So I have to use
normal asserts, which means wasted code space and wasted run time, and the
error has to be reported in a meaningful way, which means more wasted code...

This is another example of a difficulty with initializing data structures
in C at compile time for which I do not know a good solution.
 
Keith Thompson

jacob navia said:
You can store pascal strings in C using:

typedef struct pString {
int len; // Number of chars without the zero
char *str;
} PSTRING;

PSTRING foo = {
sizeof("myString")-1, // Avoid counting the trailing zero
"myString"
};

or you can

#define MAKE_PSTRING(a) { sizeof(a)-1,a}

then

PSTRING foo = MAKE_PSTRING("myString");

The argument to MAKE_PSTRING has to be a string literal or an array.
You can't do the following, for example:

void func(char *str)
{
PSTRING foo = MAKE_PSTRING(str);
}

...

func("myString");

For full generality, you need to use strlen() to compute the length.

You might want several forms of MAKE_PSTRING (with different names):
using sizeof vs. calling strlen(), pointing to the argument vs.
malloc()ing a copy.
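
A minimal sketch of what such a family might look like, next to the compile-time macro from the earlier post. The function names pstring_from_cstr and pstring_copy are hypothetical, as is the choice to signal allocation failure with a NULL str and a len of 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct pString {
    int   len;   /* number of chars, not counting the terminating zero */
    char *str;
} PSTRING;

/* Compile-time form: the argument must be a string literal or an array. */
#define MAKE_PSTRING(a) { sizeof(a) - 1, a }

/* Run-time form (hypothetical name): works for any char pointer and
   points at the caller's storage. */
PSTRING pstring_from_cstr(char *s)
{
    PSTRING p;

    assert(s != NULL);
    p.len = (int)strlen(s);
    p.str = s;
    return p;
}

/* Run-time form that owns a malloc()ed copy (hypothetical name).
   On allocation failure str is NULL and len is 0.
   The caller releases the copy with free(p.str). */
PSTRING pstring_copy(const char *s)
{
    PSTRING p;
    size_t n;

    assert(s != NULL);
    n = strlen(s);
    p.len = 0;
    p.str = malloc(n + 1);
    if (p.str != NULL) {
        memcpy(p.str, s, n + 1);   /* copies the terminating zero too */
        p.len = (int)n;
    }
    return p;
}
```

Only the macro form can appear in the initializer of a const object placed in ROM; the two functions need RAM and run-time code, which is what the original question is trying to avoid.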
 
jacob navia

Keith said:
The argument to MAKE_PSTRING has to be a string literal or an array.

Obviously. Sorry if I did not write that down explicitly.
You can't do the following, for example:

void func(char *str)
{
PSTRING foo = MAKE_PSTRING(str);
}

...

func("myString");

For full generality, you need to use strlen() to compute the length.

Yes. I intended that macro within the context of
compile time only, not run time.
 
Bart Goeman

On Mon, 20 Dec 2004 00:35:33 +0000, Chris Torek wrote:
... this is the core of my question. in C, headers[h].length is not a
compile-time constant expression, but every sane person will say it really
is a compile-time constant expression, since headers is a const
array. So I'm asking is there any way to circumvent this C limitation?

No. Or rather, yes, but it is not pleasing: "avoid writing such
code in the first place". :)
Maybe I can avoid writing C programs? :)
As Gildor said to Frodo, "You have not told me all concerning

Who are Gildor and Frodo?
yourself, and how then shall I choose better than you?" But with
that dangerous gift of advice, I think you should treat the C source
for these initializers as object code, generated from some other
source. Write a small compiler that reads the true source, and
generates a ".c" file containing the initialized data complete with
redundant information. This .c file, despite its name, is object
code, that is "linked" with the remaining code by running the
compilation phase *and* the link phase, rather than simply the link
phase.

Hmm. For a single project I can write a few scripts that just do the
job (not an ideal solution in a Windows development environment, which is
often without makefiles).

But a generic solution seems like a lot of work to me! It's not only a
small (?) compiler, but also a new language, as you have to specify
not only the source data structures and the data but also the target data
structures and the conversion method. And to be really reusable in the
embedded world, it needs to support all types of compilers and
architectures: little/big endian, int/ptr sizes, alignment, bitfield
ordering....

I'm also taking a look at the m4 macro processor as a partial solution;
it seems more powerful than the C preprocessor to me (arrays/hashes and so
on). But I'm new to it.
 
Chris Torek

On Mon, 20 Dec 2004 00:35:33 +0000, Chris Torek wrote:
who are Gildor and Frodo?

Characters in "The Lord of the Rings" (though neither Gildor nor this
conversation appears in the movie version -- alas, all trimmed for time).
This is the source of one famous quote:

Frodo: [tells Gildor that they were supposed to meet Gandalf,
but he has not returned, and now the Black Riders seem to be
searching for him.]

Gildor: [pause, then] "I do not like this news. That Gandalf
should be late, does not bode well. But it is said: Do not
meddle in the affairs of wizards, for they are subtle and
quick to anger. The choice is yours: to go, or wait."

Frodo: "And it is also said: Go not to the Elves for counsel,
for they will say both no and yes."

Gildor: [laughs] "Is it indeed? Elves seldom give unguarded
advice, for advice is a dangerous gift, even from the wise to
the wise, and all courses may run ill. But what would you?
You have not told me all concerning yourself ..."

There are many variants of the famous quote, such as: "Do not meddle
in the affairs of dragons, for you are crunchy and taste good with
ketchup." Or, as urban legend at least has it, a sign in a men's
room: "Do not throw cigarette butts in the urinals," to which some
wit appended: "for they become soggy and hard to light."

I myself prefer to say "no and yes" a lot. :)
hmm. for a single project I can write a few scripts that just do the
job. (not an ideal solution in a windows development environment without
makefiles (often))

(There are versions of "make" for Windows.)
but a generic solution seems a lot of work to me! it's not only a
small (?) compiler, but also a new language as you have to specify
not only the source data structures and the data but also the target data
structures and the conversion method. And to be really reusable in an
embedded world, it needs to support all types of compilers &
architectures: little/big endian, int/ptr sizes, alignment, bitfield
ordering....

Indeed. Systems like this tend to start out as small, special-purpose
programs, and over time grow into monsters ("... all courses may run
ill"). The problem itself is inherently complex; only by ruling away
complexities can you achieve simple solutions.
I'm also taking a look at the m4 macro processor as a partial solution,
seems more powerful than the c preprocessor to me.

It is; but it is also cumbersome and error-prone. A small, special-
purpose program that handles only simple cases can detect errors
at "compile" (.src => .c translation) time, and generate code that
is always valid C code and never produces "assemble" or "link" time
errors from the C compiler. Since m4 does not understand C syntax,
this is not possible there.
 
Bart Goeman

On Mon, 20 Dec 2004 00:40:07 +0000, Bart Goeman wrote in comp.lang.c:

this does not work because menus[menuid].n_columnheaders is not
a constant expression, according to C's interpretation. so I have to use
normal asserts->wasted code space, wasted run time,
error has to be reported in a meaningful way->more wasted code...

this is another example of a difficulty with initializing data structures
in C at compile time for which I do not know a good solution.

I found a solution myself, abusing enums.

bash:pan$ cat menu.h
DEFMENU( MENU_CLUTCH,"Clutches",clutchheaders)
DEFMENU( MENU_GEAR,"Gears",gearheaders)

bash:pan$ cat parameter.h
DEFPARAMETER( P_CLUTCHON,"clutch on pressure",clutchon,MENU_CLUTCH)
DEFPARAMETER( P_CLUTCHOFF,"clutch off pressure",clutchoff,MENU_CLUTCH)
DEFPARAMETER( P_GEARPATT,"gear pattern",gearpattern,MENU_GEAR)

bash:pan$ cat main.c
int clutchon[3];
int clutchoff[3];
int gearpattern[4];

extern const char * const clutchheaders[3];
extern const char * const gearheaders[4];

#define SIZE(x) (sizeof(x)/sizeof(x[0]))
#define COMPILE_TIME_ASSERT(x) do \
{ switch (0) { case 0: case (x): ; } } while (0)

struct s_menu
{
const char * name;
const char * const * columnheaders;
int n_columnheaders;
};

//construct an enum for the menu ids
#define DEFMENU(id,name,headers) id,
enum menu_id
{
#include "menu.h"
};
#undef DEFMENU
//initialize menu array
#define DEFMENU(id,name,headers) \
{ name, headers, SIZE(headers) },
const struct s_menu menus[]=
{
#include "menu.h"
};
#undef DEFMENU

struct s_parameter
{
const char * name;
int * p;
int menu;
};

/*
* now we want to test if each parameter has the same number of
* elements as the corresponding header.
*/


/* this test does not compile */
#if 0
#define DEFPARAMETER(id,name,var,menuid) \
COMPILE_TIME_ASSERT(SIZE(var)==menus[menuid].n_columnheaders)
static void compile_time_asserts_fail(void)
{
# include "parameter.h"
}
#undef DEFPARAMETER
#endif

/*but this test does work! */
#define DEFMENU(id,name,headers) \
enum { SIZE_##id = SIZE(headers)};
#define DEFPARAMETER(id,name,var,menuid) \
COMPILE_TIME_ASSERT(SIZE(var)==SIZE_##menuid);
static void compile_time_asserts(void)
{
# include "menu.h"
# include "parameter.h"
}
#undef DEFMENU
#undef DEFPARAMETER


So by declaring an enum for each menu, it works.
#define DEFMENU(id,name,headers) \
const int SIZE_##id=SIZE(headers);
would be more natural but doesn't work: in C, a const int is not a constant expression.
It's all rather ugly, of course.
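
For readers who want to try the enum trick without creating menu.h and parameter.h, here is a single-file sketch of the same technique. It replaces the #include files with list macros (MENU_LIST and PARAMETER_LIST, names invented for this example) and fills the header arrays with placeholder strings; the rest follows the post above. It relies on C99 allowing a trailing comma in an enumerator list.

```c
#define SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* A false predicate produces two "case 0:" labels, which is a
   compile error; a true predicate compiles to nothing useful. */
#define COMPILE_TIME_ASSERT(pred) \
    do { switch (0) { case 0: case (pred): ; } } while (0)

/* Stand-ins for menu.h and parameter.h: each list takes the macro to
   apply as its argument, so no separate files are needed. */
#define MENU_LIST(X) \
    X(MENU_CLUTCH, "Clutches", clutchheaders) \
    X(MENU_GEAR,   "Gears",    gearheaders)

#define PARAMETER_LIST(X) \
    X(P_CLUTCHON,  "clutch on pressure",  clutchon,    MENU_CLUTCH) \
    X(P_CLUTCHOFF, "clutch off pressure", clutchoff,   MENU_CLUTCH) \
    X(P_GEARPATT,  "gear pattern",        gearpattern, MENU_GEAR)

int clutchon[3];
int clutchoff[3];
int gearpattern[4];

/* Placeholder header strings (assumed). */
const char *const clutchheaders[3] = { "forward", "reverse", "neutral" };
const char *const gearheaders[4]   = { "gear1", "gear2", "gear3", "gear4" };

/* One enumeration constant per menu, holding its column count. */
#define MENU_SIZE(id, name, headers) SIZE_##id = SIZE(headers),
enum menu_size { MENU_LIST(MENU_SIZE) };

/* Each parameter array must have as many elements as its menu has
   columns. */
#define CHECK_PARAMETER(id, name, var, menuid) \
    COMPILE_TIME_ASSERT(SIZE(var) == SIZE_##menuid);

void compile_time_asserts(void)
{
    PARAMETER_LIST(CHECK_PARAMETER)
}
```

Changing gearpattern to, say, int gearpattern[5] should make the translation unit fail to compile with a duplicate case label diagnostic, which is the whole point of the exercise.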
 
