What's wrong with this code ? (struct serialization to raw byte str

A

Alfonso Morra

Hi,

I am at the end of my tether now - after spending several days trying to
figure how to do this. I have finally written a simple "proof of
concept" program to test serializing a structure containing pointers
into a "flattened" bit stream.

Here is my code (it dosen't work).

I would be grateful for any feedback that helps fix this. My intention
is to build on this example, and use the ideas here, to be able to
persist any data structure (I'll write different routines for difdferent
stuctures, just to keep things simple). Anyway, here's the code:


#include "stdlib.h"
#include "stdio.h"
#include "string.h"


typedef struct {
int l ;
double d ;
char* s; /* Null terminated string */
} MyStruct ;


void * pack(size_t *size, MyStruct* m);
MyStruct *unpack(void* block);


int main(int argc, char* argv[]) {

MyStruct *in = (MyStruct*)malloc(sizeof(*in)), *out = NULL;
unsigned char *memblock = NULL ;
size_t size ;

in->l = 1000 ;
in->d = 3.142857;
in->s = strdup("Simple Text" ); /*did I need to strdup? */

memblock = (unsigned char*)pack(&size, in) ;
out = unpack(memblock) ;

printf("Int member has value : %d (expected : %d)", out->l, in->l ) ;
printf("Double member has value : %f (expected : %f)", out->d, in->d ) ;
printf("Int member has value : %s (expected : %s)", out->s, in->s ) ;

free(in->s) ;
free(in) ;
free(out->s) ;
free(out) ;

}




void * pack(size_t *size, MyStruct* m) {
unsigned char *buff = NULL ;
size_t len, length ;

length = strlen(m->s) ;
len = sizeof(int) + sizeof(double) + sizeof(size_t) +
(length+1)*sizeof(char) ;
buff = (unsigned char*)malloc(len) ;

/*copy int*/
memmove(buff, &(m->l), sizeof(int)) ;
/*copy double*/
memmove(buff + sizeof(int), &(m->d), sizeof(double)) ;
/*store length of string*/
memmove(buff + sizeof(int) + sizeof(int), &length, sizeof(size_t));
/*copy string*/
memmove(buff + sizeof(int) + sizeof(double), m->s,
(strlen(m->s)+1)*sizeof(char)) ;

*size = len ;
return buff ;
}


MyStruct *unpack(void* block) {
int l, len ;
double d ;
char * s = NULL ;
MyStruct *p = NULL ;

/* get int*/
memcpy(&l, block, sizeof(int)) ;
/* get double*/
memcpy(&d, (unsigned char*)block + sizeof(int), sizeof(double)) ;
/* get string length*/
memcpy(&len, (unsigned char*)block + sizeof(int) + sizeof(double),
sizeof(size_t)) ;
/* get string*/
s = (char*)malloc(len+1) ;
memcpy(s,(unsigned char*)block + sizeof(int) + sizeof(double)+
sizeof(size_t),len) ;

p = (MyStruct*)malloc(sizeof(*p)) ;

p->l = l ;
p->d = d ;
p->s = s ;

/* free resource */
free(block) ;
block = NULL ;
return p ;
}
 
D

David G. Hong

Alfonso said:
Hi,

I am at the end of my tether now - after spending several days trying to
figure how to do this. I have finally written a simple "proof of
concept" program to test serializing a structure containing pointers
into a "flattened" bit stream.

Here is my code (it dosen't work).

Exactly how doesn't it work? Compile error? Segmentation Fault? Etc..
 
A

Alfonso Morra

David said:
Exactly how doesn't it work? Compile error? Segmentation Fault? Etc..

<snip>
(it dosen't work - compiles fine, pack appears to work, but unpack
retrieves jibberish and causes program to crash).
</snip>
 
T

tanmoy87544

Just one error (or rather on two lines): an error of frustration, I
believe, one you surely have found by now! But since you asked for
comments, here they are:

Alfonso said:
MyStruct *in = (MyStruct*)malloc(sizeof(*in)), *out = NULL;

It does not matter, but any particular reason for the cast? (I also try
to write sizeof *in since sizeof is an operator when applied to
objects, but that is stylistic and varies from person to person).
in->s = strdup("Simple Text" ); /*did I need to strdup? */

Depends on what you mean: you could copy the text by mallocing strlen+1
and then doing a strcpy. Alternatively, you could directly assign
"Simple Text" to in->s as long as you remember that different instances
of string literals need not refer to separate blocks of memory, and
this block of memory may not be modifiable.
memblock = (unsigned char*)pack(&size, in) ;

Any reason that pack returns void* which is being converted to unsigned
char*?

len = sizeof(int) + sizeof(double) + sizeof(size_t) +
(length+1)*sizeof(char) ;

An reason you chose not to use the more transparent sizeof m->l +
sizeof m->d + sizeof length + (length+1) * sizeof (char)?

Not using sizeof types, and only using sizeof objects often avoids
errors like the following
memmove(buff + sizeof(int), &(m->d), sizeof(double)) ;

You may know this, but in the absence of parentheses, the postfix ->d
is already evaluated before prefix &.
/*store length of string*/
memmove(buff + sizeof(int) + sizeof(int), &length, sizeof(size_t));
^^^^^^^^^^^
Huh? (similar mistake also on next line). Any reason you used memmove
in all of this? malloced memory can never overlap another area.

To avoid this kind of error, I try to maintain a running pointer curpos
which is incremented with each memcpy and forms the base for the target
of further memcpys.

Incidentally, why did you store the string length if you got it by
strlen in the first place, and if you are storing the final null byte
anyway? Not wrong, I am just confused.
/* free resource */
free(block) ;
block = NULL ;
return p ;

What does assigning to block achieve just before a return?

Incidentally, the last write to stdout from your code should end in a
newline, and two many characters without a newline is rather hard to
read!
 
A

Alfonso Morra

Just one error (or rather on two lines): an error of frustration, I
believe, one you surely have found by now! But since you asked for
comments, here they are:

Alfonso Morra wrote:




It does not matter, but any particular reason for the cast? (I also try
to write sizeof *in since sizeof is an operator when applied to
objects, but that is stylistic and varies from person to person).

I'm mixing C and C++ (the casts are required by the stricter C++ compiler)
Depends on what you mean: you could copy the text by mallocing strlen+1
and then doing a strcpy. Alternatively, you could directly assign
"Simple Text" to in->s as long as you remember that different instances
of string literals need not refer to separate blocks of memory, and
this block of memory may not be modifiable.

Ok. Duly noted. Not too keen on requesting mem all over the place. But I
was not sure a const char* would have been treated.
Any reason that pack returns void* which is being converted to unsigned
char*?

See first comment



An reason you chose not to use the more transparent sizeof m->l +
sizeof m->d + sizeof length + (length+1) * sizeof (char)?

Not using sizeof types, and only using sizeof objects often avoids
errors like the following




You may know this, but in the absence of parentheses, the postfix ->d
is already evaluated before prefix &.

Sorry, what's the error again? Could you eleaborate ? - is this why the
program crashes? How may I correct it ?
^^^^^^^^^^^
Huh? (similar mistake also on next line). Any reason you used memmove
in all of this? malloced memory can never overlap another area.

Better safe than sorry ;-)
Probably a case of being too cautious. Is there a performance penalty?.
Why should I care (i.e. if there is no discernable perfrmance hit caused
by the overheard of ensuring theres no overlap?)

That cryptic line is basically offseting the base pointer by the data
size (I admit it does look horrible). The reason I didn't directly
increment the ptr (buff) was that I am returning the base at the end of
the routine (I'm sure there's a better way of doing this).

In any case, I DO need to store the length of the string, so I can
unpack it safely (allowing for the '\0').

To avoid this kind of error, I try to maintain a running pointer curpos
which is incremented with each memcpy and forms the base for the target
of further memcpys.

Incidentally, why did you store the string length if you got it by
strlen in the first place, and if you are storing the final null byte
anyway? Not wrong, I am just confused.

Could be overkill, but it is a "cheap?" way of me ensuring that the
string is "kosher" - could probably do without it though..
What does assigning to block achieve just before a return?

That line is residue from some code that I had deleted before posting -
its redundant.
Incidentally, the last write to stdout from your code should end in a
newline, and two many characters without a newline is rather hard to
read!

Details, details, details ;-) I was more worried in getting the program
to work - but I take your point.
 
A

Alfonso Morra

Just one error (or rather on two lines): an error of frustration, I
believe, one you surely have found by now! But since you asked for
comments, here they are:

Alfonso Morra wrote:




It does not matter, but any particular reason for the cast? (I also try
to write sizeof *in since sizeof is an operator when applied to
objects, but that is stylistic and varies from person to person).




Depends on what you mean: you could copy the text by mallocing strlen+1
and then doing a strcpy. Alternatively, you could directly assign
"Simple Text" to in->s as long as you remember that different instances
of string literals need not refer to separate blocks of memory, and
this block of memory may not be modifiable.




Any reason that pack returns void* which is being converted to unsigned
char*?





An reason you chose not to use the more transparent sizeof m->l +
sizeof m->d + sizeof length + (length+1) * sizeof (char)?

Not using sizeof types, and only using sizeof objects often avoids
errors like the following




You may know this, but in the absence of parentheses, the postfix ->d
is already evaluated before prefix &.



^^^^^^^^^^^
Huh? (similar mistake also on next line). Any reason you used memmove
in all of this? malloced memory can never overlap another area.

To avoid this kind of error, I try to maintain a running pointer curpos
which is incremented with each memcpy and forms the base for the target
of further memcpys.

Incidentally, why did you store the string length if you got it by
strlen in the first place, and if you are storing the final null byte
anyway? Not wrong, I am just confused.




What does assigning to block achieve just before a return?

Incidentally, the last write to stdout from your code should end in a
newline, and two many characters without a newline is rather hard to
read!

Hi,

Ignore my last post. I have spotted the errors you highlighted. Many
thanks for your comments.
 
M

manoj1978

Alfonso said:
Hi,

I am at the end of my tether now - after spending several days trying to
figure how to do this. I have finally written a simple "proof of
concept" program to test serializing a structure containing pointers
into a "flattened" bit stream.

Here is my code (it dosen't work).
/*store length of string*/
memmove(buff + sizeof(int) + sizeof(int), &length, sizeof(size_t));
memmove(buff + sizeof(int) + sizeof(double), &length,
sizeof(size_t));
/*copy string*/
memmove(buff + sizeof(int) + sizeof(double), m->s,
(strlen(m->s)+1)*sizeof(char)) ;
memmove(buff + sizeof(int) + sizeof(double)+sizeof(size_t),
m->s,(strlen(m->s)+1)*sizeof(char));
 
M

manoj1978

Alfonso said:
Hi,

I am at the end of my tether now - after spending several days trying to
figure how to do this. I have finally written a simple "proof of
concept" program to test serializing a structure containing pointers
into a "flattened" bit stream.

Here is my code (it dosen't work).

I would be grateful for any feedback that helps fix this. My intention
is to build on this example, and use the ideas here, to be able to
persist any data structure (I'll write different routines for difdferent
stuctures, just to keep things simple). Anyway, here's the code:
memcpy(s,(unsigned char*)block + sizeof(int) + sizeof(double)+
sizeof(size_t),len) ;
memcpy(s,(unsigned char*)block + sizeof(int) + sizeof(double)+
sizeof(size_t),len+1) ;
 
T

tanmoy87544

Alfonso said:
I'm mixing C and C++ (the casts are required by the stricter C++ compiler)

But thou shalt not mix languages! In C++, use new, not malloc, and
avoid the cast. In C, the cast can hide warnings about errors on some
compilers, so avoid the cast. In short, avoid the cast! (In C, you
cannot always get away from casts, but you almost can.)

C is not a subset of C++. The rules of C are different for a reason,
so it will be useful to treat them as such. In C, void* is to be
treated as a generic pointer to and from which all non-function
pointers can be freely converted without casts as long as qualifiers
are okay.
Ok. Duly noted. Not too keen on requesting mem all over the place. But I
was not sure a const char* would have been treated.

In C, string literal does not have the type `const char *' unless you
tell the compiler to not obey the standard. It is `char *', and
attempts to modify it do not need diagnostics: they simply result in
undefined behaviour.

The main problem for you in not doing malloc etc. is how to know, in
general, whether to free the pointer.
See first comment

But why not have memblock as void*? Is that not more descriptive?
Avoid casts (whether in C or C++) when possible: most often they show
that you thought about the problem wrong. (Some const casts, and some
casts of structs etc. to member etc. are exceptions to the general
rules.)
Sorry, what's the error again? Could you eleaborate ? - is this why the
program crashes? How may I correct it ?

Look at the next line: you moved sizeof(double) bytes to buff +
sizeof(int) and then started writing at buff + sizeof(int) +
sizeof(int) ???

Yes, this error and the similar error of the next to next line
(unquoted) cause the program to crash on my machine. I will leave it
to you to see how to correct it.
Better safe than sorry ;-)

Huh? Why not use the language as it is supposed to be used.
Probably a case of being too cautious. Is there a performance penalty?.

Maybe: why do you think someone defined both memmove and memcpy?
Why should I care (i.e. if there is no discernable perfrmance hit caused
by the overheard of ensuring theres no overlap?)

The performance hit may or may not be discernible. But, the puzzlement
of the next programmer reading the code is discernible, and costs quite
as much.
That cryptic line is basically offseting the base pointer by the data
size (I admit it does look horrible). The reason I didn't directly
increment the ptr (buff) was that I am returning the base at the end of
the routine (I'm sure there's a better way of doing this).

That is why I suggested using curpos, a separate local variable.
Looking horrible is not the issue; to err is human as your program
demonstrated so succinctly. Avoid programming styles that are error
prone.
In any case, I DO need to store the length of the string, so I can
unpack it safely (allowing for the '\0').

Why not use strlen in unpack as well? Are you worried about the speed?
Could be overkill, but it is a "cheap?" way of me ensuring that the
string is "kosher" - could probably do without it though..

Don't understand this.

.deleted my embarrassing use of two instead of too.
Details, details, details ;-) I was more worried in getting the program
to work - but I take your point.

I think you were so worried that you missed the obvious error :) But,
as I said, you set yourself up by using a very bad coding style ...
 
T

tanmoy87544

Just one error (or rather on two lines): an error of frustration, I
believe, one you surely have found by now! But since you asked for
comments, here they are:

I apologize. manoj1 pointed out another error as well.
 
F

Flash Gordon

Alfonso said:
Hi,

I am at the end of my tether now - after spending several days trying to
figure how to do this. I have finally written a simple "proof of
concept" program to test serializing a structure containing pointers
into a "flattened" bit stream.

Here is my code (it dosen't work).

You should say *exactly* how it does not work. In this case, any
compiler warnings and what output you get on running it.
I would be grateful for any feedback that helps fix this. My intention
is to build on this example, and use the ideas here, to be able to
persist any data structure (I'll write different routines for difdferent
stuctures, just to keep things simple). Anyway, here's the code:


#include "stdlib.h"
#include "stdio.h"
#include "string.h"

You should use angle brackets for systems headers, e.g.
#include <stdlib.h>
#include <stdio.h>
#include said:
typedef struct {
int l ;
double d ;
char* s; /* Null terminated string */
} MyStruct ;


void * pack(size_t *size, MyStruct* m);
MyStruct *unpack(void* block);


int main(int argc, char* argv[]) {

MyStruct *in = (MyStruct*)malloc(sizeof(*in)), *out = NULL;

There is no need to cast the return value of malloc and no need for the
initialisation of out in this case. Also no need for the brackets on the
operand of sizeof.
MyStruct *out;
MyStruct *in = malloc(sizeof *in);

You should also check that the call to malloc succeeds. Although in this
example you could have used a variable of type MyStruct and not bothered
with the mallocing.
unsigned char *memblock = NULL ;
size_t size ;

in->l = 1000 ;
in->d = 3.142857;
in->s = strdup("Simple Text" ); /*did I need to strdup? */

strdup is not part of the C standard. In the case of this example you
could use
in->s = "Simple Text";
If you were to modify the string then you would have to allocate a block
for it and copy the data in.
memblock = (unsigned char*)pack(&size, in) ;
out = unpack(memblock) ;

printf("Int member has value : %d (expected : %d)", out->l, in->l ) ;
printf("Double member has value : %f (expected : %f)", out->d, in->d ) ;
printf("Int member has value : %s (expected : %s)", out->s, in->s ) ;

Typo above. The last printout is of String, not Int.
free(in->s) ;

This would go if it was not dynamically allocated, obviously.
free(in) ;

As would this.
free(out->s) ;
free(out) ;

}




void * pack(size_t *size, MyStruct* m) {
unsigned char *buff = NULL ;
unsigned char *p /* see usage below for the reason */
size_t len, length ;

length = strlen(m->s) ;

length = strlen(m->s) + 1;
You are always dealing with the entire block including the nul
termination so why not do the +1 once only?
len = sizeof(int) + sizeof(double) + sizeof(size_t) +
(length+1)*sizeof(char) ;

sizeof(char) is guaranteed to be 1 by the standard.
len = sizeof(int) + sizeof(double) + sizeof(size_t) +
length;

You can reduce the chance of error by changing this to
len = (sizeof m->l) + (sizeof m->d) + (sizeof length) +
length;
buff = (unsigned char*)malloc(len) ;

No need for the cast.
p = buff = malloc(len);

Check if malloc succeeded otherwise it could go bang.
/*copy int*/
memmove(buff, &(m->l), sizeof(int)) ;

You can use memcpy since you know there will be no overlap. also use
sizeof variable rather than type and less chance of error.
memcpy(p, &m->l, sizeof m->l);
p += sizeof m->l;
No, by using p you won't need to keep adding this each time without typo.

/*copy double*/
memmove(buff + sizeof(int), &(m->d), sizeof(double)) ;

memcpy(p, &m->d, sizeof m->d);
p += sizeof m->d;

/*store length of string*/
memmove(buff + sizeof(int) + sizeof(int), &length, sizeof(size_t));
^^^^^^^^^^^
See the problem with using sizeof(type) now?

memcpy(p, &length, sizeof length);
p += sizeof length;
/*copy string*/
memmove(buff + sizeof(int) + sizeof(double), m->s,
(strlen(m->s)+1)*sizeof(char)) ;

You already know the length of the string, why recalculate it?
memcpy(p, m->s, length);
*size = len ;
return buff ;
}


MyStruct *unpack(void* block) {
int l, len ;

You stored the length as a size_t not an int, so you need to extract it
as a size_t not an int. The two *are* different sizes on some systems.

Why bother with the other temporary variables?
double d ;
char * s = NULL ;
MyStruct *p = NULL ;

As before, a pointer to the current place in the block would be useful.
size_t length;
unsigned char *pos = block;
MyStruct *p = malloc(*p);

Check the malloc succeeds, of course.
/* get int*/
memcpy(&l, block, sizeof(int)) ;
memcpy(p->l, pos, sizeof p->l);
pos += sizeof p->l;

See previous comments.
/* get double*/
memcpy(&d, (unsigned char*)block + sizeof(int), sizeof(double)) ;
memcpy(p->d, pos, sizeof p->d);
pos += sizeof p->d;

See previous comments.
/* get string length*/
memcpy(&len, (unsigned char*)block + sizeof(int) + sizeof(double),
sizeof(size_t)) ;
memcpy(length, pos, sizeof length);
pos += sizeof length;

See previous comments.
/* get string*/
s = (char*)malloc(len+1) ;

Make sure the malloc succeeds, and don't cast the result Also, with my
previous mod you don't need the +1.
s = malloc(length);
memcpy(s,(unsigned char*)block + sizeof(int) + sizeof(double)+
sizeof(size_t),len) ;

Isn't it horrible by here?

memcpy(s, pos, length) ;

p = (MyStruct*)malloc(sizeof(*p)) ;

p->l = l ;
p->d = d ;
p->s = s ;

You don't need the above now since I malloced the space at the start.
/* free resource */
free(block) ;
block = NULL ;

The above assignment does exactly nothing useful. block was passed in by
value so this does not affect the copy in the caller.
return p ;
}

All of the above changes are untested, and you still need to add in
checks on malloc.
 
J

John Devereux

Alfonso Morra said:
I'm mixing C and C++ (the casts are required by the stricter C++ compiler)

Just in case you are not aware, you can call C functions from either C
or C++ modules provided you write the header file in a particular
way. What I do for all header files is this:

/* xxx.h header file for the xxx.c module */
#ifndef H_XXX
#define H_XXX
#ifdef __cplusplus
extern "C" {
#endif

... put your interface declarations here ...

#ifdef __cplusplus
}
#endif /*__cplusplus */
#endif /*H_XXX */

You can then happily include the same header file in C or C++
programs, and use the C functions from either.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,994
Messages
2,570,223
Members
46,814
Latest member
SpicetreeDigital

Latest Threads

Top