a struct parser

Ben Pfaff

I've written a program that, if given a struct definition (in a
simple, non-C format) will create the corresponding C struct
definition as well as functions to parse and write textual
representations of the values of the struct members as
key/value pairs. Essentially I wanted a 1:1 correspondence
between a configuration/parameter text file and a struct inside
the program.

This sounds like an interface definition language. In what ways
is it superior to existing IDLs such as XDR, which is defined by
an RFC and for which exist multiple compatible implementations?
 
Keith Thompson

jacob navia said:
Weiguang Shi wrote:

I made a mistake and posted my answer to the group instead of replying to
the sender only. I cancelled the message one second after
it appeared, but Mr. Sosman, who is very busy following my messages,
intercepted it and immediately started a polemic, since he is not
very busy (or apparently has nothing better to do).

Cancels very frequently don't work. Your message appeared on the
newsgroup, with no indication that it wasn't intended to be public,
and therefore available for public comment. (And Mr. Sosman's point
was a valid one.)
 
qed

Weiguang said:
I can accept some comments. Actually, it might be useful to hint to
the parser what option it should generate for each entry.

Well, ok, how would we do this? This is off the top of my head:

struct something1 {
    int slen;
    char last[1 /* @:len=$.slen */ ]; /* struct hack. */
};

struct something2 {
    int qty;
    typename * ptr /* @:len=$.qty */; /* variable length ptr. */
};

struct something3 {
    typename * ptr /* @:len=1 */; /* Just a pointer. */
};

struct something4 {
    char * ptr /* @:len=strlen($.ptr)+1 */; /* A C string. */
};

So presumably you would do a text substitution of $ with the current
structure variable name in your autogenerated (un)marshalling code. And
@:len would tell you the count of this field.
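
To make the $ substitution concrete, here is a rough sketch of what generated code for struct something2 might look like. Everything beyond the struct layout is an assumption for illustration: the placeholder typename is taken to be int, the function names something2_marshal and something2_unmarshal are hypothetical, and the wire format is simply the native in-memory bytes (so it is not portable between machines with different byte order or int size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* struct something2 from the post; "typename" is assumed to be int here. */
struct something2 {
    int qty;
    int *ptr /* @:len=$.qty */;
};

/* Hypothetical generator output for packing: "$" became "(*s)".
   Returns the number of bytes written, or 0 on error. */
size_t something2_marshal(const struct something2 *s,
                          unsigned char *out, size_t cap)
{
    size_t count, need;

    assert(s != NULL && out != NULL);
    if ((*s).qty < 0)
        return 0;
    count = (size_t)(*s).qty;            /* expansion of @:len=$.qty */
    need = sizeof s->qty + count * sizeof *s->ptr;
    if (need > cap)
        return 0;
    memcpy(out, &s->qty, sizeof s->qty);
    if (count > 0)
        memcpy(out + sizeof s->qty, s->ptr, count * sizeof *s->ptr);
    return need;
}

/* Hypothetical generator output for unpacking; allocates s->ptr.
   Returns 0 on success, -1 on error. */
int something2_unmarshal(struct something2 *s,
                         const unsigned char *in, size_t len)
{
    size_t count;

    assert(s != NULL && in != NULL);
    if (len < sizeof s->qty)
        return -1;
    memcpy(&s->qty, in, sizeof s->qty);
    if (s->qty < 0)
        return -1;
    count = (size_t)s->qty;              /* qty must be read before ptr */
    if ((len - sizeof s->qty) / sizeof *s->ptr < count)
        return -1;
    s->ptr = NULL;
    if (count > 0) {
        s->ptr = malloc(count * sizeof *s->ptr);
        if (s->ptr == NULL)
            return -1;
        memcpy(s->ptr, in + sizeof s->qty, count * sizeof *s->ptr);
    }
    return 0;
}
```

Note that the generator has to order the fields so that anything named in a @:len expression is unpacked before the field that depends on it.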

struct something5 {
    enum kind { k_W1, k_W2 } which;
    union {
        typename1 field1;
        typename2 field2;
    } x /* @:union(field1:$$.which==k_W1)
           @:union(field2:$$.which==k_W2) */ ;
};

So $$ would pop above the current structure (in this case a union) to
the containing structure, and you determine which of the union fields is
valid by looking at the which field in the structure above it. A little
annoying to parse it all out, but doable.
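
As a sketch of where the @:union conditions could end up, the generated code might turn each condition into a branch on the discriminant. The choices below are assumptions for illustration only: typename1 is taken to be int, typename2 to be double, the function names are hypothetical, and the tag is written as a native int followed by the bytes of the active member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* struct something5 from the post, with assumed member types. */
struct something5 {
    enum kind { k_W1, k_W2 } which;
    union {
        int field1;      /* typename1 assumed to be int */
        double field2;   /* typename2 assumed to be double */
    } x;
};

/* Writes the tag, then only the member selected by it.
   Returns the number of bytes written, or 0 on error. */
size_t something5_marshal(const struct something5 *s,
                          unsigned char *out, size_t cap)
{
    int tag;
    size_t n = sizeof tag;

    assert(s != NULL && out != NULL);
    tag = (int)s->which;
    if (cap < n)
        return 0;
    memcpy(out, &tag, sizeof tag);
    if (s->which == k_W1) {              /* $$.which==k_W1 selects field1 */
        if (cap - n < sizeof s->x.field1)
            return 0;
        memcpy(out + n, &s->x.field1, sizeof s->x.field1);
        n += sizeof s->x.field1;
    } else if (s->which == k_W2) {       /* $$.which==k_W2 selects field2 */
        if (cap - n < sizeof s->x.field2)
            return 0;
        memcpy(out + n, &s->x.field2, sizeof s->x.field2);
        n += sizeof s->x.field2;
    } else {
        return 0;                        /* no annotation matches this tag */
    }
    return n;
}

/* Reads the tag first, then the member it selects.
   Returns 0 on success, -1 on error. */
int something5_unmarshal(struct something5 *s,
                         const unsigned char *in, size_t len)
{
    int tag;

    assert(s != NULL && in != NULL);
    if (len < sizeof tag)
        return -1;
    memcpy(&tag, in, sizeof tag);
    in += sizeof tag;
    len -= sizeof tag;
    switch (tag) {
    case k_W1:
        if (len < sizeof s->x.field1)
            return -1;
        memcpy(&s->x.field1, in, sizeof s->x.field1);
        break;
    case k_W2:
        if (len < sizeof s->x.field2)
            return -1;
        memcpy(&s->x.field2, in, sizeof s->x.field2);
        break;
    default:
        return -1;
    }
    s->which = (enum kind)tag;
    return 0;
}
```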

So for serializing a bstring (http://bstring.sf.net/) we would need
something like:

struct tagbstring {
    int mlen /* @:unmarshall=$.slen+1 */;
    int slen;
    unsigned char * data /* @:len=slen+1 */;
};

So this extra field @:unmarshall gives an expression for overriding the
value that is generated when it is unpacked (because it is defined by
the memory size, which is instance dependent.)
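
A sketch of how @:unmarshall might play out for this struct follows. It assumes, purely for illustration, that mlen is not transmitted at all and is recomputed on the receiving side, that the wire format is a native int slen followed by slen+1 data bytes, and that the function names are hypothetical. It is not taken from the bstring library itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Layout as given in the post. */
struct tagbstring {
    int mlen /* @:unmarshall=$.slen+1 */;
    int slen;
    unsigned char *data /* @:len=slen+1 */;
};

/* Packs slen and slen+1 data bytes; mlen is deliberately skipped.
   Returns the number of bytes written, or 0 on error. */
size_t tagbstring_marshal(const struct tagbstring *s,
                          unsigned char *out, size_t cap)
{
    size_t count, need;

    assert(s != NULL && out != NULL);
    if (s->slen < 0 || s->data == NULL)
        return 0;
    count = (size_t)s->slen + 1;         /* @:len=slen+1 */
    need = sizeof s->slen + count;
    if (need > cap)
        return 0;
    memcpy(out, &s->slen, sizeof s->slen);
    memcpy(out + sizeof s->slen, s->data, count);
    return need;
}

/* Unpacks into freshly allocated storage and recomputes mlen.
   Returns 0 on success, -1 on error. */
int tagbstring_unmarshal(struct tagbstring *s,
                         const unsigned char *in, size_t len)
{
    size_t count;

    assert(s != NULL && in != NULL);
    if (len < sizeof s->slen)
        return -1;
    memcpy(&s->slen, in, sizeof s->slen);
    if (s->slen < 0)
        return -1;
    count = (size_t)s->slen + 1;
    if (len - sizeof s->slen < count)
        return -1;
    s->data = malloc(count);
    if (s->data == NULL)
        return -1;
    memcpy(s->data, in + sizeof s->slen, count);
    s->mlen = s->slen + 1;               /* @:unmarshall=$.slen+1 */
    return 0;
}
```

The sender's mlen describes the sender's allocation, so the receiver sets it to match the buffer it actually allocated.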

It would be interesting to see if this could be made into something
truly general and thus make things like CORBA and COM obsolete. Being
able to automarshall directly from a .h file would be extremely
valuable, IMHO. But as we can see just from these few examples, what
started as a simple idea (the @:len= ... idea) has needed some twisted
extensions (@:unmarshall, @:union, and $$ macros) to make it general to
real world scenarios.
 
Flash Gordon

qed said:
Weiguang said:
I can accept some comments. Actually, it might be useful to hint to
the parser what option it should generate for each entry.

Well, ok, how would we do this? This is off the top of my head:

struct something1 {
    int slen;
    char last[1 /* @:len=$.slen */ ]; /* struct hack. */
};

struct something2 {
    int qty;
    typename * ptr /* @:len=$.qty */; /* variable length ptr. */
};

struct something3 {
    typename * ptr /* @:len=1 */; /* Just a pointer. */
};

struct something4 {
    char * ptr /* @:len=strlen($.ptr)+1 */; /* A C string. */
};

For a string I would be inclined to just have something telling the
parser it is a string.

So presumably you would do a text substitution of $ with the current
structure variable name in your autogenerated (un)marshalling code. And
@:len would tell you the count of this field.

struct something5 {
    enum kind { k_W1, k_W2 } which;
    union {
        typename1 field1;
        typename2 field2;
    } x /* @:union(field1:$$.which==k_W1)
           @:union(field2:$$.which==k_W2) */ ;
};

So $$ would pop above the current structure (in this case a union) to
the containing structure, and you determine which of the union fields is
valid by looking at the which field in the structure above it. A little
annoying to parse it all out, but doable.

Now you need a mechanism for dealing with void* pointers. For example, a
linked list library that uses a pointer to void to point to the data so
you can put anything you like there. One option of course is to just
have the parser report it as being unserialisable.

Also, how are you going to deal with pointers to pointers, etc.?

So for serializing a bstring (http://bstring.sf.net/) we would need
something like:

struct tagbstring {
    int mlen /* @:unmarshall=$.slen+1 */;
    int slen;
    unsigned char * data /* @:len=slen+1 */;
};

So this extra field @:unmarshall gives an expression for overriding the
value that is generated when it is unpacked (because it is defined by
the memory size, which is instance dependent.)

It would be interesting to see if this could be made into something
truly general and thus make things like CORBA and COM obsolete. Being
able to automarshall directly from a .h file would be extremely
valuable, IMHO. But as we can see just from these few examples, what
started as a simple idea (the @:len= ... idea) has needed some twisted
extensions (@:unmarshall, @:union, and $$ macros) to make it general to
real world scenarios.

I can certainly see a use for it. So much so that I have something similar
(written some months back) for converting some data structures used as
the internal representation of database records into XML and back.
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc

 
Haude Daniel

This sounds like an interface definition language. In what ways
is it superior to existing IDLs such as XDR, which is defined by
an RFC and for which exist multiple compatible implementations?

It is superior in several ways:

- It was fun to write
- I have never heard of XDR
- $ apt-cache search xdr
librpc-ocaml-dev - Ocaml Sun RPC libraries
xdrawchem - an open-source version of ChemDraw
$

--Daniel
 
Haude Daniel

When will the tool go public?

When I have tested it some more and when I have finished the documentation.

Will you be willing to share the code now?

No, sorry. It is not self-explanatory and I don't like to give away unfinished,
undocumented and untested code. You might be better off by looking at the
stuff Ben Pfaff recommended which doesn't suffer from these shortcomings
(otherwise he probably wouldn't have recommended it).

--Daniel
 
Ben Pfaff

It is superior in several ways:

- It was fun to write

Well, that's always worthwhile. It's always reasonable to write
code to figure out for yourself how it should be done.

- I have never heard of XDR

I'm not sure that "I didn't bother to look into the related work"
is a good reason to do something.

- $ apt-cache search xdr

It doesn't show up under apt-cache because it's built into glibc.
It's also built into the Linux kernel, for that matter, if you
compile in NFS support.
 
Rod Pemberton

Weiguang Shi said:
Hi,

Is there a tool that, given a struct definition, generates a function
that parses binary data of this struct and a command that can be used
to construct binary data according to user-specified values for the
fields of this struct?

Your question reminds me of a record editor for a database. I programmed
one of those a number of years ago in PL/1. I ended up using a number of
tools to do the job. The final output had a single line for each piece of
data in the structure which contained the name of the data, type of data,
offset from the start of the structure, etc. I used a flex grammar to parse
the PL/1 data structure into a clean format. I then used a PL/1 program which
read the clean format and structure definition to output the size and
offsets of the data. This created a generic "map" for the record. This
generic map could then be loaded into the record editing program.

You should be able to do something similar in C with the help of a flex
grammar. I would start by looking at sizeof() and offsetof().
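
As a rough sketch of such a "map" in C, the table below records name, type, offset and size for each member using offsetof and sizeof. The struct record, the FIELD macro and the function print_map are all hypothetical names made up for this illustration; a flex-based tool would presumably generate the table rather than have it written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record type standing in for a real structure. */
struct record {
    int id;
    double price;
    char name[16];
};

/* One line of the generic map: what a member is and where it lives. */
struct field_map {
    const char *name;
    const char *type;
    size_t offset;
    size_t size;
};

/* Builds one map entry; sizeof does not evaluate its operand,
   so the null pointer is never dereferenced. */
#define FIELD(st, member, type_name) \
    { #member, type_name, offsetof(st, member), sizeof(((st *)0)->member) }

static const struct field_map record_map[] = {
    FIELD(struct record, id, "int"),
    FIELD(struct record, price, "double"),
    FIELD(struct record, name, "char[16]")
};

#define RECORD_MAP_LEN (sizeof record_map / sizeof record_map[0])

/* Prints one line per member, in the spirit of the map described above. */
void print_map(const struct field_map *map, size_t n, FILE *out)
{
    size_t i;

    assert(map != NULL && out != NULL);
    for (i = 0; i < n; i++)
        fprintf(out, "%-8s %-10s offset=%zu size=%zu\n",
                map[i].name, map[i].type, map[i].offset, map[i].size);
}
```

Calling print_map(record_map, RECORD_MAP_LEN, stdout) would list the members; the actual offsets depend on the compiler's padding rules, which is why they are computed rather than hard-coded.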


Rod Pemberton
 
Friedrich Dominicus

CBFalconer said:
Most of us would strongly echo Eric Sosman's warning. M. Navia is
well known for using non-standard constructs and creating
non-portable systems. As Richard Heathfield has pointed out, he
has seriously damaged his reputation here.

Oh, I didn't know that you are the speaker of comp.lang.c.

Nice to learn

Friedrich
 
Keith Thompson

Friedrich Dominicus said:
Oh, I didn't know that you are the speaker of comp.lang.c.

Nice to learn

No, he's one of many speakers here, who is as entitled to his opinion
(which I happen to share) as anyone else.
 
Herbert Rosenau

Oh, I didn't know that you are the speaker of comp.lang.c.

No, he is not THE speaker of clc but one of the most respected
regulars.

Nice to learn

True, you can learn many things from him. And yes, M. Navia has reduced
his reputation to less than zero already.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de, the home of German eComStation
eComStation 1.2 in German is here!
 
Eric Sosman

Herbert said:
No, he is not THE speaker of clc but one of the most respected
regulars.

True, you can learn many things from him. And yes, M. Navia has reduced
his reputation to less than zero already.

I don't think that's defensible. Mr. Navia is a hard
worker with considerable knowledge and a desire to help. His
failing ("the mote in thy neighbor's eye ...") is an inability
to distinguish C from non-C -- hence, my warning. I'm fairly
confident that Mr. Navia can write C if properly motivated.
 
Herbert Rosenau

I don't think that's defensible. Mr. Navia is a hard
worker with considerable knowledge and a desire to help. His
failing ("the mote in thy neighbor's eye ...") is an inability
to distinguish C from non-C -- hence, my warning. I'm fairly
confident that Mr. Navia can write C if properly motivated.

I've seen enough from him here to never trust any bit of C he has
written in practice.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de, the home of German eComStation
eComStation 1.2 in German is here!
 
