GDBM prototype magic.

vshenoy

Hi Guys,

I was going through the gdbm-1.8.3 source
(http://ftp.gnu.org/gnu/gdbm/gdbm-1.8.3.tar.gz) and found this strange
thing: all the exposed functions of gdbm work with a GDBM_FILE pointer
(which is returned by gdbm_open), but the implementations of the gdbm
functions (e.g. gdbm_open in gdbmopen.c) work with a structure called
gdbm_file_info.

Now GDBM_FILE is defined like this in gdbm.h, an exposed header:

typedef struct { int dummy[10];} *GDBM_FILE;

but gdbm_file_info is a structure which is a lot bigger than this.

For e.g. in gdbm.h gdbm_open is defined as :

extern int gdbm_store __P((GDBM_FILE, datum, datum, int));
/* __P(x) is x in standard C or C++; otherwise __P(x) is () */

but in its definition it is like this :

int
gdbm_store (dbf, key, content, flags)
gdbm_file_info *dbf;
datum key;
datum content;
int flags;

I have these questions. (I searched the previous archives of c.l.c, but
only found a mention of typedef struct {int dummy[10];} *GDBM_FILE; in a
post called "What makes C _not_ a subset of C++", which said that
this particular declaration is illegal in C++.)

1. How does this sort of code compile in standard C? (Which function
prototype matching rule of C applies here?) When I checked the
generated Makefile, no special flags were passed to gcc.

2. Why was this done? My first guess is that the author did not want
to expose the gdbm_file_info structure in gdbm.h, so that it can change
between releases while preserving compatibility, and also to keep the
external interface clean.

3. Is this safe? Will it work with all compilers and machines, i.e.
is it portable?

Any help is appreciated.
 
WANG Cong

vshenoy said:
1. How does this sort of code compiles in standard C ? ( What function
prototype matching rule of C applies here ?) When I checked in the
generated Makefile there wasn't any special flags that were passed to
gcc.

The function definition and its declaration do not match, so the
behavior is undefined.

It works in practice because the linker does not check prototypes.
2. Why was this done ? My first guess is because the author did not
want to expose the gdbm_file_info structure in gdbm.h so that he can
change it between different releases preserving compatibility (?) Also
to keep the external interface clean.

Check the source code or ask the author directly. ;)
3. Is this safe ? Will it work in all the compilers and machines i.e.
is it portable ?

No, it's not a good hack.
 
Barry Schwarz

Hi Guys,

I was going through gdbm-1.8.3 source (http://ftp.gnu.org/gnu/gdbm/
gdbm-1.8.3.tar.gz) and found this strange thing : all the exposed
functions of gdbm work with GDBM_FILE pointer (which is returned by
gdbm_open), but in the implementation of gdbm functions (e.g.
gdbm_open in gdbmopen.c)work with a structure called gdbm_file_info.

Now GDBM_FILE is defined like this in gdbm.h an exposed header :

typedef struct { int dummy[10];} *GDBM_FILE;

but gdbm_file_info is a structure which is a lot bigger than this.

Nowhere in the discussion below is there any indication of what a
gdbm_file_info looks like. Where do you get the idea it is bigger
than 10 int?
For e.g. in gdbm.h gdbm_open is defined as :

extern int gdbm_store __P((GDBM_FILE, datum, datum, int)); /* __P(x)
is x if it is standard C or C++ else _P(x) is () */

Why show us gdbm_store when your question is about gdbm_open?

This is not a definition. It is a declaration (also known as a
prototype). It serves two purposes: 1) to let the compiler check on
(and possibly convert) the arguments you pass, and 2) to let the
compiler generate code to use the return value.
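
To illustrate the first purpose, here is a minimal sketch. The names half and half_of_three are made up for this example and do not come from gdbm. Because the prototype is visible at the call site, the int argument is converted to double before the call is made:

```c
/* Prototype: tells the compiler the parameter type and the return type. */
static double half(double x);

/* With the prototype in scope, the int constant 3 is converted to 3.0
   before the call. (Hypothetical helper, for illustration only.) */
static double half_of_three(void)
{
    return half(3);
}

static double half(double x)
{
    return x / 2.0;
}
```

Without a prototype in scope, an old-style declaration would give the compiler no parameter type to convert to, which is why the exposed header matters to callers.
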
but in its definition it is like this :

int
gdbm_store (dbf, key, content, flags)
gdbm_file_info *dbf;
datum key;
datum content;
int flags;

Technically, this definition is inconsistent with the prototype. If
gdbm.h is not included in this source module, the compiler will not
know about the inconsistency. The linker never cares about arguments.
At the code level, there is no problem if gdbm_file_info is also a
structure because all pointers to structure have the same size and
representation.

This is a very old style function definition (pre-C89). While it is
still legal, it is a strong hint that you are looking at code that
predates the standard.
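
The claim about size and representation can be checked with a small sketch. The struct names below are hypothetical stand-ins for the exposed dummy struct and the internal one. C requires all pointers to structure types to have the same representation and alignment requirements, so both functions should return 1 on a conforming implementation:

```c
#include <string.h>

struct public_view  { int dummy[10]; };               /* what callers see      */
struct private_view { char *name; int fields[64]; };  /* what the library uses */

/* Returns 1 if both pointer types occupy the same number of bytes. */
static int same_pointer_size(void)
{
    return sizeof(struct public_view *) == sizeof(struct private_view *);
}

/* Copies the bytes of one pointer into the other pointer type and back.
   Returns 1 if the original pointer value is recovered. */
static int bytes_round_trip(void)
{
    static struct private_view object;
    struct private_view *original = &object;
    struct public_view *handle;
    struct private_view *recovered;

    memcpy(&handle, &original, sizeof handle);
    memcpy(&recovered, &handle, sizeof recovered);
    return recovered == original;
}
```

This only shows why the mismatched call works in practice. It does not make a declaration that is incompatible with the definition strictly conforming.
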
I have these questions : (I searched previous archives of c.l.c, but
only found mention of typedef struct {int dummy[10];} *GDBM_FILE; in a
post called "What makes C _not_ a subset of C++, where it said that
this particular declaration is illegal in C++)

It's off-topic but I wonder why unless one of the tokens is a reserved
word.
1. How does this sort of code compiles in standard C ? ( What function
prototype matching rule of C applies here ?) When I checked in the
generated Makefile there wasn't any special flags that were passed to
gcc.

Is gdbm.h included in the source module that contains gdbm_store? If
not, there is no prototype to match.
2. Why was this done ? My first guess is because the author did not
want to expose the gdbm_file_info structure in gdbm.h so that he can
change it between different releases preserving compatibility (?) Also
to keep the external interface clean.

Not uncommon but remember how old this is.
3. Is this safe ? Will it work in all the compilers and machines i.e.
is it portable ?

You would not normally be compiling gdbm_store. It would be in a
library that would be used by the linker to resolve your references to
it. As explained above, while there is an inconsistency, it is not the
type that manifests as an error in your generated code.

I would expect it to work only on those systems for which gdbm-1.8.3
is designed. Does gnu work on Windows also or is it limited to Unix?
Does gdbm_open eventually call fopen or does it use system specific
tricks to open the file? If the latter, I would be surprised if it
worked at all on my IBM mainframe.


Remove del for email
 
vshenoy

Hi Barry,
Nowhere in the discussion below is there any indication of what a
gdbm_file_info looks like. Where do you get the idea it is bigger
than 10 int?

This is the definition of gdbm_file_info (comments removed) :

typedef struct {
char *name;
int read_write;
int fast_write;
int central_free;
int coalesce_blocks;
int file_locking;
void (*fatal_err) ();
int desc;
gdbm_file_header *header;
off_t *dir;
cache_elem *bucket_cache;
int cache_size;
int last_read;
hash_bucket *bucket;
int bucket_dir;
cache_elem *cache_entry;
char header_changed;
char directory_changed;
char bucket_changed;
char second_changed;
} gdbm_file_info;

with some of the compound structures themselves being more than 10
bytes. It is defined in gdbmdefs.h. Also included in the same header
file is a header called "proto.h", which is somewhat similar to the
exposed gdbm.h except that it uses gdbm_file_info instead of
GDBM_FILE *.
This is not a definition. It is a declaration (also known as a
prototype). It serves two purposes: 1) to let the compiler check on
(and possibly convert) the arguments you pass, and 2) to let the
compiler generate code to use the return value.

I am sorry. It should have been "declared as" instead of "defined as".
Why show us gdbm_store when your question is about gdbm_open?

Sorry again for context switching so fast. Basically, what I wanted to
say is that all the exposed functions (like gdbm_store) work with
GDBM_FILE *, which is returned by gdbm_open, whereas internally (in the
source code) all functions receive gdbm_file_info as arguments returned
by gdbm_open in place of GDBM_FILE *.
Technically, this definition is inconsistent with the prototype. If
gdbm.h is not included in this source module, the compiler will not
know about the inconsistency. The linker never cares about arguments.

That is a good point. The source code doesn't include gdbm.h anywhere.
In fact gdbm.h is generated as a part of build process from files
called gdbm.proto and gdbm.proto2.
At the code level, there is no problem if gdbm_file_info is also a
structure because all pointers to structure have the same size and
representation.

Exactly. My question is: is this portable behavior? Can we rely on the
compiler silently converting back and forth between gdbm_file_info *
and GDBM_FILE *?
Is gdbm.h included in the source module that contains gdbm_store? If
not, there is no prototype to match.

Like I said above, no.
Not uncommon but remember how old this is.

Actually this is the latest gdbm version.
You would not normally be compiling gdbm_store. It would be in a
library that would be used by the linker to resolve your references to
it. As explained above, while there is an inconsistency, it is not the
type that manifests as an error in your generated code.

I was surprised seeing this kind of code in one of the most widely
used database libraries.
I would expect it to work only on those systems for which gdbm-1.8.3
is designed. Does gnu work on Windows also or is it limited to Unix?

It works on windows according to this page :

http://gnuwin32.sourceforge.net/packages/gdbm.htm
Does gdbm_open eventually call fopen or does it use system specific
tricks to open the file? If the latter, I would be surprised if it
worked at all on my IBM mainframe.
It uses open(2) for opening the file.
 
Barry Schwarz

vshenoy said:
Exactly. My question is that is this a portable behavior ? Can we rely
on compiler silently converting back and forth between gdbm_file_info
* and GDBM_FILE * ?

No. The compiler is not silently converting back and forth. In your
code, it is generating an argument of type GDBM_FILE (which is already
a pointer to struct so we don't put an asterisk after it). This is
the argument that will be passed during execution. The library
function will take this argument and treat it as a gdbm_file_info*.
The library function is allowed to do this (more correctly, will get
away with doing this) because the two pointer types are guaranteed to
have the same size and representation. Your code cannot set or
reference the members of the struct (by convention). Nor can it
create an object of this type (the struct has no tag and the typedef
defines only a pointer to struct type). Consequently, you don't
really care how big the struct really is or how badly the typedef
mismatches the actual struct contents.
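
The idea can be sketched in a single file. Everything named demo_* is hypothetical and only mimics the shape of the gdbm types. Unlike gdbm, which relies on mismatched declarations in separate source files, this sketch converts between the two pointer types with explicit casts, which is well defined here because malloc returns suitably aligned storage and the handle is never dereferenced through the dummy type:

```c
#include <assert.h>
#include <stdlib.h>

/* Public view: an opaque pointer type, shaped like the gdbm.h typedef. */
typedef struct { int dummy[10]; } *demo_handle;

/* Private view: the real layout, known only to the library code. */
typedef struct {
    const char *name;
    int cache_size;
} demo_file_info;

/* Allocates the real structure and hands it out as the opaque type. */
static demo_handle demo_open(const char *name, int cache_size)
{
    demo_file_info *dbf = malloc(sizeof *dbf);
    if (dbf == NULL)
        return NULL;
    dbf->name = name;
    dbf->cache_size = cache_size;
    return (demo_handle)dbf;
}

/* Converts the handle back to the real type before touching members. */
static int demo_cache_size(demo_handle handle)
{
    demo_file_info *dbf = (demo_file_info *)handle;
    assert(dbf != NULL);
    return dbf->cache_size;
}

static void demo_close(demo_handle handle)
{
    free(handle);
}
```

The caller only ever stores and passes the handle, so the size of the dummy struct never matters. A more common way to get the same effect today is to declare an incomplete struct type in the public header.
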
Actually this is the latest gdbm version.

That may be but the presence of a K&R style function definition
indicates this is old code being carried forward.
It uses open(2) for opening the file.

That is not a C standard function. I believe it's POSIX, so it
probably only works on systems that provide the POSIX headers and
libraries through some extension mechanism.


Remove del for email
 
