jacob navia
I am writing software that provides a general facility for storing
any kind of object to disk and loading it back.
The intermediate format is XML, using a slightly modified version of
Microsoft's schema: xmlns="x-schema:xop-schema.xml"
Operation:
----------
The software generates several C functions that write the XML.
To make things more concrete, suppose the following setup:
typedef struct tagG {
    int tab[10];
} Tab;
typedef struct tagstruct {
    char a;
    short b;
    int c;
    unsigned d;
    long e;
    long long f;
    long double g;
    double h;
    char * str;
    Tab tab;
    struct tagstruct *Next;
} structure;
The "wizard" software generates the following functions:
----------------------------------------------
//@ Serialization function for structure structure
// ("initialized" is a flag defined elsewhere in the generated file)
int structureSerialize(structure *data,FILE *out)
{
    int i;
    unsigned char *p;

    if (data == NULL)
        return 0;
    if (!initialized) {
        InitXmlWriter(out);
        initialized=1;
    }
    // Object IDs are addresses. The cast matches %lx and assumes
    // that a pointer fits in an unsigned long.
    fprintf(out,"<Object id=\"ID%lx\" typename=\"structure\">\n",
            (unsigned long)data);
    fprintf(out,"\t<byte name=\"a\">%d</byte>\n",data->a);
    fprintf(out,"\t<int name=\"b\">%d</int>\n",data->b);
    fprintf(out,"\t<int name=\"c\">%d</int>\n",data->c);
    fprintf(out,"\t<unsignedInt name=\"d\">%u</unsignedInt>\n",data->d);
    fprintf(out,"\t<int name=\"e\">%ld</int>\n",data->e);   // long needs %ld
    fprintf(out,"\t<long name=\"f\">%lld</long>\n",data->f); // long long needs %lld
    // Type long double not supported natively.
    // Using hexadecimal encoding
    p = (unsigned char *)&data->g;
    fprintf(out,"\t<bin.hex name=\"g\">");
    for(i=0; i<12;i++) {
        fprintf(out,"%x",*(p++) & 0xff);
    }
    fprintf(out,"</bin.hex>\n");
    fprintf(out,"\t<double name=\"h\">%.15g</double>\n",data->h);
    // Assume char * points to strings
    fprintf(out,
        "\t<string name=\"str\" xml:space=\"preserve\">%s</string>\n",
        data->str);
    fprintf(out,"\t<IDREF name=\"tab\">ID%lx</IDREF>\n",
            (unsigned long)&data->tab);
    fprintf(out,"\t<IDREF name=\"Next\">ID%lx</IDREF>\n",
            (unsigned long)data->Next);
    fprintf(out,"</Object>\n");
    structureSerialize(data->Next,out); // follow the Next pointer
    TabSerialize(&data->tab,out); // Follow embedded structures
    return 1;
}
-----------------------------------------------------------------
When called, this function generates the following XML:
----------------------------------------------------
<Object id="ID12ff00" typename="structure">
    <byte name="a">-56</byte>
    <int name="b">3876</int>
    <int name="c">-254</int>
    <unsignedInt name="d">598877</unsignedInt>
    <int name="e">777899</int>
    <bin.hex name="g">000000080ff7f00</bin.hex>
    <double name="h">687.988877</double>
    <string name="str" xml:space="preserve">A string</string>
    <IDREF name="tab">ID12ff40</IDREF>
    <IDREF name="Next">ID0</IDREF>
</Object>
---------------------------------------------------------
Design principles:
------------------
1) The software will follow pointers and should be able to cope with
complicated and messy graphs, even if they contain loops.
To do this it records the address of each object stored.
(Not shown in the example above)
2) Since the address of each object is unique, the generated XML
contains no embedded objects, just references (pointers) to
other objects. All objects are stored under the ObjectStore
tag (not shown).
3) Open issues are what to do with:
A) Unions. In my opinion there is no way to know which of the
members of the union is valid, so unions will not be followed
and just stored in binary form.
B) Function pointers. There is no easy way to recover the name of
the function that a function pointer refers to.
Storing the raw pointer value is only useful if the program is
loaded at the same address when the data is read back.
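The address recording mentioned in point 1 could look roughly like the
following. This is only a sketch under my own assumptions: VisitedSet,
visited_check_and_add, Node and count_nodes are hypothetical names, and
a linear search stands in for whatever lookup structure the real
generator uses.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical visited set: a growable array of object addresses. */
typedef struct {
    const void **items;
    size_t count;
    size_t capacity;
} VisitedSet;

/* Returns 1 if addr was already recorded, 0 if it was added by this
   call, and -1 if memory ran out. */
static int visited_check_and_add(VisitedSet *set, const void *addr)
{
    size_t i;

    for (i = 0; i < set->count; i++)
        if (set->items[i] == addr)
            return 1;
    if (set->count == set->capacity) {
        size_t ncap = set->capacity ? set->capacity * 2 : 16;
        const void **grown = realloc(set->items, ncap * sizeof *grown);
        if (grown == NULL)
            return -1;
        set->items = grown;
        set->capacity = ncap;
    }
    set->items[set->count++] = addr;
    return 0;
}

/* Example object type with a pointer that may form a loop. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Counts the distinct nodes reachable from n. The walk stops as soon
   as it meets a node that was already recorded, so loops terminate. */
static int count_nodes(const Node *n, VisitedSet *seen)
{
    int total = 0;

    while (n != NULL && visited_check_and_add(seen, n) == 0) {
        total++;
        n = n->next;
    }
    return total;
}
```

A serializer would call visited_check_and_add at the top of each
generated function and return early when it reports 1. One caveat for
point 2: an embedded structure that is the first member of its
container has the same address as the container, so the key may need
to be the pair (address, type name) rather than the address alone.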
I have followed the literature on this a bit, and I have never
seen a C implementation, only C++ ones, where the problems are
much bigger than in C since they have to cope with multiple
inheritance hierarchies, templates, and so on. Happily, in C
everything is much simpler.
Questions:
Are any of you aware of an implementation of this in C?
What would you propose for unions and function pointers?
Are there any other standards for datatypes in XML besides
the one mentioned above?
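To make the function pointer question concrete, one option I can
imagine is a table that maps each serializable function to a name, so
that the name is stored instead of the address. The sketch below is
only an illustration under that assumption. Callback, FnEntry,
fn_table, on_open and on_close are hypothetical, and the table would
have to be written by hand or produced by the generator.

```c
#include <stddef.h>
#include <string.h>

typedef void (*Callback)(void);

/* Hypothetical table entry pairing a function with its name. */
typedef struct {
    const char *name;
    Callback fn;
} FnEntry;

/* Two example callbacks; the counters only give them distinct bodies. */
static int open_count;
static int close_count;
static void on_open(void)  { open_count++; }
static void on_close(void) { close_count++; }

static const FnEntry fn_table[] = {
    { "on_open",  on_open  },
    { "on_close", on_close },
};
#define FN_COUNT (sizeof fn_table / sizeof fn_table[0])

/* Name to write into the XML, or NULL if fn is not registered. */
static const char *fn_to_name(Callback fn)
{
    size_t i;

    for (i = 0; i < FN_COUNT; i++)
        if (fn_table[i].fn == fn)
            return fn_table[i].name;
    return NULL;
}

/* Function to restore when reading, or NULL if the name is unknown. */
static Callback name_to_fn(const char *name)
{
    size_t i;

    for (i = 0; i < FN_COUNT; i++)
        if (strcmp(fn_table[i].name, name) == 0)
            return fn_table[i].fn;
    return NULL;
}
```

This keeps the stored data valid even if the program is loaded at a
different address, at the cost of having to register every function
that may end up in a serialized pointer.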
Thanks in advance for your time
jacob