C Data Serialize ?


Corne' Cornelius

Hi,

I need to save a (complex) struct which has pointer members, to a file,
and load it again later. The actual memory which the pointers point to
must be saved, not the memory address.

Will i need to create de/serialization functions ? if so, what's a good
starting point ?

is it possible to check if a variable is a struct, pointer, etc.. ?

Thanks,
Corne'


 

signuts

Hi,

I need to save a (complex) struct which has pointer members, to a file,
and load it again later.

The only way I'm aware of doing this is to write the code yourself, so yes,
you need de/serialize functions.
if so, what's a good
starting point ?
I think this depends on the application. Is there binary data contained in
your structures? How many members are in the structure? What kind of
storage would be efficient for your application? etc.
The best way for me (or any of us) to help you is to see some example code.

I haven't had to save/load many complex structures, and when I do I lay out
the data in a config file (ini file format most often) and just have
functions to load the values read right back into the structure.
is it possible to check if a variable is a struct, pointer, etc.. ?
There is always some trickery you can do but it's all
architecture/platform specific. Best avoided in my opinion.
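
For what it's worth, the config-file approach described above could look roughly like the sketch below. The struct, its field names, and the function names person_save/person_load are all made up for illustration, and the format assumes the name contains no newline and fits in the fixed buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record; the field names are illustrative only. */
struct person {
    int id;
    char name[64];
};

/* Write the record as "key=value" lines. Returns 0 on success. */
int person_save(const struct person *p, const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "id=%d\n", p->id);
    fprintf(f, "name=%s\n", p->name);
    return fclose(f) == 0 ? 0 : -1;
}

/* Read the record back; unknown keys are ignored. Returns 0 on success. */
int person_load(struct person *p, const char *path)
{
    char line[128];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    memset(p, 0, sizeof *p);
    while (fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
        if (strncmp(line, "id=", 3) == 0)
            sscanf(line + 3, "%d", &p->id);
        else if (strncmp(line, "name=", 5) == 0)
            strncpy(p->name, line + 5, sizeof p->name - 1);
    }
    fclose(f);
    return 0;
}
```

A text format like this is slower and bulkier than a binary dump, but the file can be read and edited by hand and does not depend on the struct layout of the machine that wrote it.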
 

Corne' Cornelius

Hi,

Well, it would be nice to be able to load/save any struct, so
serialization is probably the answer.

eg.:

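#include <stdio.h>
#include <stdlib.h>  /* free */
#include <string.h>  /* strdup (POSIX), memset */

int main(int argc, char *argv[]) {
    FILE *f;
    struct _x {
        int id;
        char *name;
    } x = {
        187,
        NULL
    };

    x.name = strdup("corne");

    /* writes the raw struct: the int and the pointer value, not the string */
    f = fopen("test.out", "wb");
    fwrite(&x, 1, sizeof(struct _x), f);
    fclose(f);

    free(x.name);

    memset(&x, 0, sizeof(struct _x));

    /* reads the old address back into x.name; the string itself is gone */
    f = fopen("test.out", "rb");
    fread(&x, 1, sizeof(struct _x), f);
    fclose(f);

    return 0;
}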


Compile that with debugging, and have a look at 'x' before the write
and after the read. You'll see that the memory address of x.name was written
to the outfile, instead of the value at that memory address. This is
correct, I know, but I need to be able to restore a complete struct as
it was.

Thanks,
Corne'
 

Michael B Allen

Hi,

I need to save a (complex) struct which has pointer members, to a file,
and load it again later. The actual memory which the pointers point to
must be saved, not the memory address.

Will i need to create de/serialization functions ? if so, what's a good
starting point ?

This might be a good starting point:

http://www.ioplex.com/~miallen/encdec/

but you will need to write your own higher level de/serialization
functions to de/serialize the data pointed to by the pointers. See
tests/t3encdec.c.
is it possible to check if a variable is a struct, pointer, etc.. ?

Not enough to automate the serialization process. You must do it
explicitly.
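
To make "do it explicitly" concrete, here is a minimal sketch for the struct from the earlier post. It is not taken from the encdec library; the names record_write and record_read and the length-prefix layout are my own assumptions. It writes the fields one by one and stores the string's bytes instead of its address. It uses the host's byte order, so the file is only meant to be read back on the same kind of machine.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct record {
    int32_t id;
    char *name;          /* heap-allocated, NUL-terminated string */
};

/* Write id, then the string length, then the string bytes. 0 on success. */
int record_write(const struct record *r, FILE *f)
{
    uint32_t len = (uint32_t)strlen(r->name);
    if (fwrite(&r->id, sizeof r->id, 1, f) != 1) return -1;
    if (fwrite(&len, sizeof len, 1, f) != 1) return -1;
    if (len > 0 && fwrite(r->name, 1, len, f) != len) return -1;
    return 0;
}

/* Read the fields back in the same order, allocating fresh storage
   for name. The caller frees r->name. 0 on success. */
int record_read(struct record *r, FILE *f)
{
    uint32_t len;
    if (fread(&r->id, sizeof r->id, 1, f) != 1) return -1;
    if (fread(&len, sizeof len, 1, f) != 1) return -1;
    r->name = malloc((size_t)len + 1);
    if (r->name == NULL) return -1;
    if (len > 0 && fread(r->name, 1, len, f) != len) {
        free(r->name);
        r->name = NULL;
        return -1;
    }
    r->name[len] = '\0';
    return 0;
}
```

The restored pointer will almost certainly hold a different address than the original, which is fine: what survives the round trip is the data it points to.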

Mike
 

Default User

Artie said:
If the pointers in the struct are pointers to PODs and point to unique
data (i.e. pointers in different instances of the struct do not point to
the same memory) it's pretty simple; just dereference the pointer and
output the value.


PODS?



Brian Rodenborn
 

Malcolm

Default User said:
That's a C++ ism. Plain old data types, or a C structure.

class A a1;
class A a2;

memcpy(&a1, &a2, sizeof(class A));

is an error in C++.
 

Rich Grise

Well, all the structure has is the pointer. You haven't declared any
storage for what the pointer 'name' points at. I've never seen "strdup"
in use; judging from your code it allocates some memory a la malloc(),
which then gets freed later. In other words, when you free(x.name) then
whatever was in it goes into the bit bucket.

try this:

struct _x {
    int ID;
    char name[64];
} x = {
    247,
    "" /* or "Me" or "WhateverQ!@#$%^&*()" */
};

or

char mybuf[64] = "My Name";

struct _x {
    int ID;
    char *name;
} x = {
    247,
    mybuf
};

and save them and restore them to their own space.

But since you need the space for the data anyway, why not incorporate
it into the structure?
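
As a rough sketch of why embedding the storage helps: a struct with no pointer members can be written and read back whole, because every byte of its data lives inside the struct. The names struct flat and flat_roundtrip are invented for this example, and the file still depends on the padding, int size and byte order of the machine that wrote it.

```c
#include <stdio.h>
#include <string.h>

/* All data lives inside the struct; there is nothing to chase. */
struct flat {
    int id;
    char name[64];
};

/* Write *in to path, then read it back into *out. 0 on success. */
int flat_roundtrip(const struct flat *in, struct flat *out, const char *path)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    if (fwrite(in, sizeof *in, 1, f) != 1) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    if (fread(out, sizeof *out, 1, f) != 1) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return 0;
}
```

The cost is a fixed upper limit on the name length and wasted space for short names, which is the trade-off against the pointer version.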

Cheers!
Rich
 

Artie Gold

Default said:
I know what it is, I'm a C++ programmer too. I'm curious as to why the
OP mentioned it in this context.

Well, I wasn't the OP (but the `OR'[1]), but...

If the struct you want to serialize has instances of other structs
inside of it, you'd also have to be able to serialize them as well, etc.
etc.
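
One way to picture the "etc. etc." part: give each struct type its own write/read pair, and have the outer struct's functions call the inner struct's. The sketch below assumes two hypothetical types (address, employee) and helper names of my own choosing, trusts the length stored in the file, and uses the host's native layout for the integers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct address  { char *city; };                  /* inner struct */
struct employee { int id; struct address home; }; /* outer struct */

/* Store a string as its length followed by its bytes. */
static int put_string(const char *s, FILE *f)
{
    size_t len = strlen(s);
    if (fwrite(&len, sizeof len, 1, f) != 1) return -1;
    return fwrite(s, 1, len, f) == len ? 0 : -1;
}

/* Read a string written by put_string; caller frees the result. */
static char *get_string(FILE *f)
{
    size_t len;
    char *s;
    if (fread(&len, sizeof len, 1, f) != 1) return NULL;
    if (len == (size_t)-1) return NULL;           /* len + 1 would wrap */
    s = malloc(len + 1);
    if (s == NULL) return NULL;
    if (fread(s, 1, len, f) != len) { free(s); return NULL; }
    s[len] = '\0';
    return s;
}

int address_write(const struct address *a, FILE *f)
{
    return put_string(a->city, f);
}

int address_read(struct address *a, FILE *f)
{
    a->city = get_string(f);
    return a->city != NULL ? 0 : -1;
}

/* The outer serializer handles its own fields, then delegates. */
int employee_write(const struct employee *e, FILE *f)
{
    if (fwrite(&e->id, sizeof e->id, 1, f) != 1) return -1;
    return address_write(&e->home, f);
}

int employee_read(struct employee *e, FILE *f)
{
    if (fread(&e->id, sizeof e->id, 1, f) != 1) return -1;
    return address_read(&e->home, f);
}
```

This only works as written when no two pointers refer to the same object; shared or cyclic data needs some extra bookkeeping to avoid writing the same thing twice.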

That's all.

HTH,
--ag

[1] OR - original respondent?
 

goose

Rich Grise said:
Well, all the structure has is the pointer. You haven't declared any
storage for what the pointer 'name' points at. I've never seen "strdup"
in use; judging from your code it allocates some memory a la malloc(),
which then gets freed later. In other words, when you free(x.name) then
whatever was in it goes into the bit bucket.

try this:


and save them and restore them to their own space.

But since you need the space for the data anyway, why not incorporate
it into the structure?

I'm a little curious as to what exactly you think the OP wanted
to know. how does your posting answer his question ?

goose,
 

Default User

Artie said:
Default said:
I know what it is, I'm a C++ programmer too. I'm curious as to why the
OP mentioned it in this context.

Well, I wasn't the OP (but the `OR'[1]), but...

Sorry, I didn't mean OP.
If the struct you want to serialize has instances of other structs
inside of it, you'd also have to be able to serialize them as well, etc.
etc.


But what does that have to do with POD objects? I don't think that you
understand what that term means. You seem to think it means "made up of
primitive types and no pointers" or something. It doesn't.

Basically, POD only has meaning in C++ and refers to data types that are
compatible with C. It's not really defined that way in the Standard, but
that's the reason for the whole POD thing. It means things like, no
inheritance, no user-defined copy, no pointers to members, a bunch of
other C++ type stuff.

One of the key features of PODs is that they occupy contiguous storage,
they can be written to files and read back and get the same-valued
object, they can be copied with memcpy() and get the same-valued object,
etc.

To speak of POD types in C is pointless, as ALL data objects are PODs.



Brian Rodenborn
 

Artie Gold

You have enlightened me.
Thanks.

--ag
 

Default User

Artie said:
You have enlightened me.
Thanks.


Then my day is a success! Now if I can just transfer that success to the
Unscheduled Data Interface infrastructure module I'm looking at . . .




Brian Rodenborn
 

Richard Heathfield

Default User wrote:

One of the key features of PODS are that they occupy contiguous storage,
they can be written to files and read back and get the same-valued
object, they can be copied with memcpy() and get the same-valued object,
etc.

To speak of POD types in C is pointless, as ALL data objects are PODs.

If that is true, then...

/* Error-checking omitted for brevity */

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp = fopen("foo.bin", "wb"); /* open a file */
    char *p = malloc(1024);            /* A: get some memory, point at it with a "POD" */
    fwrite(&p, sizeof p, 1, fp);       /* B: write the POD to a file */
    fclose(fp);                        /* close the file */
    free(p);                           /* C: release the memory */
    fp = fopen("foo.bin", "rb");       /* reopen the file */
    fread(&p, sizeof p, 1, fp);        /* D: read POD back in */
    fclose(fp);
    return 0;
}

At the point marked A, p gains a value. At the point B, p - which is a POD
by your argument because this is a C program - has its value written to a
file. At the point C, p loses its value. It used to point somewhere useful,
but after the free(), its value is indeterminate. No matter! According to
your POD theory, we can restore its value by re-reading it from file (D)!

But of course we can't.

There is more to C than plain old data.

(And don't get me started on deep copies.)
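
For readers who have not met the term, a deep copy is the in-memory cousin of the serialization problem in this thread: plain assignment or memcpy() copies only the pointer, so both structs end up sharing one string. A minimal sketch, with struct rec and rec_deep_copy as invented names:

```c
#include <stdlib.h>
#include <string.h>

struct rec {
    int id;
    char *name;   /* NUL-terminated string owned by the struct */
};

/* Deep copy: duplicate the pointed-to string as well as the struct's
   own fields. The caller frees dst->name. Returns 0 on success. */
int rec_deep_copy(struct rec *dst, const struct rec *src)
{
    size_t n = strlen(src->name) + 1;   /* include the terminator */
    char *copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, src->name, n);
    dst->id = src->id;
    dst->name = copy;
    return 0;
}
```

After the call, changing or freeing the source's name leaves the copy untouched, which a shallow `*dst = *src` would not guarantee.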
 

Default User

Richard said:
At the point marked A, p gains a value. At the point B, p - which is a POD
by your argument

Well, it's not really my argument; every definition I've read of PODs
states that ALL C objects are PODs.
because this is a C program - has its value written to a
file. At the point C, p loses its value. It used to point somewhere useful,
but after the free(), its value is indeterminate.

How do you figure? It's still the same address, if it's valid it's
valid, if not then not. It's the same bit pattern.

No matter! According to
your POD theory, we can restore its value by re-reading it from file (D)!

But of course we can't.

I don't see what you are getting at. The fact that what it points to
went away doesn't enter into it.
There is more to C than plain old data.

Explain.

I think, like the previous poster, you are using a different definition
of POD.




Brian Rodenborn
 

Default User

Default said:
How do you figure? It's still the same address, if it's valid it's
valid, if not then not. It's the same bit pattern.

Zipping through the Standard, I do find that it says, "The value of a
pointer that refers to freed space is indeterminate." However, I think
that's a tiny flaw in my definition of what POD means. Really reading a
value from a file isn't much different from copying a value using
memcpy().

Interesting though. If the value is indeterminate, can you say it is the
same?

If I do this:

int *p;
int *q;

p = malloc(sizeof *p); /* skip err chk */
free (p);

memcpy (&q, &p, sizeof p);

Can we say the values of p and q are the same?



Brian Rodenborn
 

Chris Torek

How [can a bit pattern passed to free() become invalid]? It's still
the same address, if it's valid it's valid, if not then not. It's the
same bit pattern.

One canonical example -- which I believe to be *the* example that
got this put in the C standard, although I am just guessing --
occurs on the Intel 80x86 ("x86").

On this machine, pointers are 32 bits long, but composed of two
pieces called the "segment" and "offset". Ordinary "int"s, of
course, are only 16 bits long, so pointers must be loaded into
special pointer registers. The x86 is rather odd in that these
"pointer registers" are really two half-registers, a "segment
register" and an ordinary 16-bit integer register. The "offset"
portion of a 32-bit pointer is loaded into the ordinary register,
while the "segment" is loaded into one of the special segment
registers. (In keeping with the register-poor nature of this CPU,
there are only four segment registers, "cs", "ds", "ss", and "es" --
code, data, stack, and "extra" segment -- although later CPUs add two more
called "fs" and "gs".)

Although this CPU is rather slow and klunky, it does have a nice
security feature that catches all kinds of programming errors.
This is: each "segment" value can be automatically tested for
validity each time a new segment number is loaded into a segment
register. This means that the underlying system (library and/or
operating system) can catch improper accesses to memory, such as
the use of a pointer after that pointer is freed.

Naturally, any C compiler where the system is intended to get
the right answer, even if that takes more time, *uses* this feature.
This means that, in the sequence:

q = p;
...
free(p);
x = q->member;

the attempt to access "q->member" is caught at runtime. But
it also means that even something like:

free(p);
q = p; /* or: if (p == q) */

may trigger a runtime fault, if the compiler chooses to copy or
compare p to q by loading p's value -- which was valid before
free(), but is no longer -- into a segment/offset register pair.

Luckily for virus-writers, C programmers and compiler-writers seem
to prefer getting the wrong answer as fast as possible. Actually
*using* this segmentation feature causes them to detect this wrong
answer, while getting right answers slightly slower, which most
consider unacceptable. So in practice, users of the 80x86 do not
see this behavior, even though it would stop so many software bugs.
(Note that these same segment registers also allow the system to
prevent execution of data as if it were code, stopping most of the
remaining virus-mechanisms. Again, this would prevent getting the
wrong answer as fast as possible, so it is not much used.)
 

Richard Heathfield

It might not be the same address, even if it's the same bit pattern.
Consider, for example, a protected mode environment, where a "pointer" may
really just be an entry in a dynamic lookup table. After being freed, that
"address" might literally no longer exist.
It's the same bit pattern.

And this matters how, precisely? :)
Zipping through the Standard, I do find that it says, "The value of a
pointer that refers to freed space is indeterminate." However, I think
that's a tiny flaw in my definition of what POD means. Really reading a
value from a file isn't much different from copying a value using
memcpy().

Interesting though. If the value is indeterminate, can you say it is the
same?

No, I don't think it's useful to claim that we can say anything at all about
an indeterminate value other than that it /is/ indeterminate.
If I do this:

int *p;
int *q;

p = malloc(sizeof *p); /* skip err chk */
free (p);

memcpy (&q, &p, sizeof p);

Can we say the values of p and q are the same?

Very good question. Since p's value is indeterminate, it's tempting to say
that the behaviour is undefined. But I'd be curious to hear from an expert
on this subject, given that p's bit pattern is (IMHO) rather unlikely to be
a trap representation.
 

Default User

Richard said:
Default User wrote:
It might not be the same address, even if it's the same bit pattern.
Consider, for example, a protected mode environment, where a "pointer" may
really just be an entry in a dynamic lookup table. After being freed, that
"address" might literally no longer exist.


And this matters how, precisely? :)

Yes, it all comes down to "what is the value". The bit representation
may be the same, but the value is "indeterminate" once it is freed. So as I
said, I think it's more a matter of me giving a flawed presentation on
PODs rather than an exception to the rule about all C objects being
PODs.

Very good question. Since p's value is indeterminate, it's tempting to say
that the behaviour is undefined. But I'd be curious to hear from an expert
on this subject, given that p's bit pattern is (IMHO) rather unlikely to be
a trap representation.

I don't remember the rules: is it OK to compare freed pointers to other
pointers? Only null pointers? No comparisons, period?
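
As far as I can tell from the Standard, any use of the value of a pointer after the object it pointed to has been freed is undefined, and that includes comparing it. A common way to sidestep the question is to overwrite the pointer with a null pointer at the moment it is freed, since comparing against a null pointer is always fine. A small sketch, where free_and_clear is a made-up helper name:

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and overwrites it with a null pointer,
   so later comparisons look at a well-defined value. */
void free_and_clear(int **pp)
{
    free(*pp);
    *pp = NULL;
}
```

Used as `free_and_clear(&p);`, after which `if (p == NULL)` is a valid test. It only fixes the one pointer it is handed; any other copy of the old address (such as q in the memcpy example) is still indeterminate.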



Brian Rodenborn
 
