Need help with multi-dimensional arrays and functions

Paul David Buchan

Hello,

I'm attempting to write a program to read in database files (.dbf).
When I do it all as a single procedure in main, everything works.
However, what I really want is to pass the database filename to a function,
and have it pass back an array containing the database contents, and some
parameters telling me the dimensions of the array.
I've succeeded in getting my function to read in the dbf file, and it
returns the dimensions of the array to main, but the array is my stumbling
block. I keep seg-faulting.
I allocate memory for the array inside the function, because I won't know
how big the database is until the function opens it up and analyzes it.

So my question is: can I make this program work such that the function
"readfile" opens the database file, allocates memory for an array, and
then passes that array back to main, which had no prior knowledge of the
required size of the array?

I'm really struggling with the pointer concept, I'm afraid.
Any help is appreciated.

Below are two versions separated by asterisk lines. The difference is in
my treatment of array "input" and "input_array".
Sorry they're so long, but I didn't want to trim too much for fear of missing
something important.

I'm using GCC on Windows XP.

Dave Buchan
(e-mail address removed)

*********************************************
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

/*int readfile (char *, int, int, int, char ***); */

int main ()
{
int i,j,nrecords,nfields,nchars;
char filename[100];
char ***input;

strcpy (filename, "input.dbf");

readfile (filename,&nrecords,&nfields,&nchars,input);
printf ("\n%u %u %u", nrecords,nfields,nchars);
printf (" %s",filename);

for (i=0; i<nrecords; i=i+1) {
for (j=0; j<nfields; j=j+1) {
printf ("%s ",input[i][j]);
}
printf ("\n");
}

}





int readfile (filename,nrecords,nfields,nchars,input_array)
int *nrecords, *nfields, *nchars;
char filename[];
char ***input_array;
{
int i,j,k,c,b1,b2,b3,b4,headlen,reclen;
int nrows, ncols,max;
int *dbf,*fieldlen;
FILE *fi;

/* Attempt to open .dbf file */
fi = fopen (filename, "rb");
if (fi==NULL) {
printf ("Can't open .dbf file.\n");
exit (EXIT_FAILURE);
}

/* Count number of bytes in .dbf file */
i=0;
while ((b1=fgetc(fi)) !=EOF) {
i=i+1;
}
fclose (fi);

/* Allocate array for file contents */
dbf = (int *)malloc(i*sizeof(int));

/* Read .dbf file into array dbf */
i=0;
fi = fopen (filename, "rb");
while ((dbf[i]=fgetc(fi)) !=EOF) {
i=i+1;
}
fclose (fi);

/* Number of records (4 bytes) */
*nrecords=(dbf[7]*256*256*256)+(dbf[6]*256*256)+(dbf[5]*256)+dbf[4];

/* Length of header (2 bytes) */
headlen=(dbf[9]*256)+dbf[8];

/* Length of each record (2 bytes) */
reclen=(dbf[11]*256)+dbf[10];

/* Count number of fields in each record */
*nfields=0;
j=32;
while (dbf[j]!=13) {
j=j+32;
*nfields=*nfields+1;
}

/* Allocate array for field lengths */
fieldlen = (int *)malloc(*nfields*sizeof(int));

/* Populate array of field lengths (1 byte each) */
*nchars=0;
for (i=0; i<*nfields; i=i+1) {
fieldlen[i]=dbf[48+(i*32)];
if (fieldlen[i]>*nchars) {
*nchars=fieldlen[i];
}
}

nrows=*nrecords+1; /* Add 1 because of header */
ncols=*nfields;
/* Allocate 3-dimensional array nrows-by-ncols-by-nchars */
input_array = (char ***) malloc (nrows*sizeof(char **));
for (i=0; i<nrows; i=i+1) {
input_array[i] = (char **) malloc(ncols*sizeof(char *));
for (j=0; j<ncols; j=j+1) {
input_array[i][j] = (char *) malloc(*nchars*sizeof(char));
}
}

/* Initialize array contents to NULL */
for (i=0; i<nrows; i=i+1) {
for (j=0; j<ncols; j=j+1) {
for (k=0; k<*nchars; k=k+1) {
input_array[i][j][k]='\0';
}
}
}

/* Write field titles to array */
for (i=0; i<*nfields; i=i+1) {
j=0;
while (dbf[(i*32)+32+j] !=NULL) {
input_array[0][i][j]=dbf[(i*32)+32+j];
j=j+1;
}
}

/* Write all data fields to array */
for (i=0; i<*nrecords; i=i+1) {
c=1; /* Ignore Record Delete Flag */
for (j=0; j<*nfields; j=j+1) {
for (k=0; k<fieldlen[j]; k=k+1) {
input_array[i+1][j][k]=dbf[headlen+(i*reclen)+c];
c=c+1;
}
}
}

/* Trim off any trailing spaces, tabs, or newlines */
for (i=0; i<*nrecords; i=i+1) {
for (j=0; j<*nfields; j=j+1) {
for (k=fieldlen[j]-1; k>=0; k=k-1) {
if (input_array[i][j][k] !=' ' && input_array[i][j][k] !='\t'
&& input_array[i][j][k] !='\n') {
break;
}
}
input_array[i][j][k+1]='\0';
}
}

/* De-allocate memory
free (dbf);
free (fieldlen);
for (i=0; i<nrows; i=i+1) {
for (j=0; j<ncols; j=j+1) {
free((void *)input_array[i][j]);
}
}
free ((void *)input_array); */

return (EXIT_SUCCESS);
}

*********************************************
In the following version I attempt to treat array "input"
and "input_array" in the same manner as I treat nrecords,
nfields, and nchars.

*********************************************

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

/*int readfile (char *, int, int, int, char ***); */

int main ()
{
int i,j,nrecords,nfields,nchars;
char filename[100];
char ***input;

strcpy (filename, "input.dbf");

readfile (filename,&nrecords,&nfields,&nchars,&input);
printf ("\n%u %u %u", nrecords,nfields,nchars);
printf (" %s",filename);

for (i=0; i<nrecords; i=i+1) {
for (j=0; j<nfields; j=j+1) {
printf ("%s ",input[i][j]);
}
printf ("\n");
}
exit(1);

}





int readfile (filename,nrecords,nfields,nchars,input_array)
int *nrecords, *nfields, *nchars;
char filename[];
char ****input_array;
{
int i,j,k,c,b1,b2,b3,b4,headlen,reclen;
int nrows, ncols,max;
int *dbf,*fieldlen;
FILE *fi;

/* Attempt to open .dbf file */
fi = fopen (filename, "rb");
if (fi==NULL) {
printf ("Can't open .dbf file.\n");
exit (EXIT_FAILURE);
}

/* Count number of bytes in .dbf file */
i=0;
while ((b1=fgetc(fi)) !=EOF) {
i=i+1;
}
fclose (fi);

/* Allocate array for file contents */
dbf = (int *)malloc(i*sizeof(int));

/* Read .dbf file into array dbf */
i=0;
fi = fopen (filename, "rb");
while ((dbf[i]=fgetc(fi)) !=EOF) {
i=i+1;
}
fclose (fi);

/* Number of records (4 bytes) */
*nrecords=(dbf[7]*256*256*256)+(dbf[6]*256*256)+(dbf[5]*256)+dbf[4];

/* Length of header (2 bytes) */
headlen=(dbf[9]*256)+dbf[8];

/* Length of each record (2 bytes) */
reclen=(dbf[11]*256)+dbf[10];

/* Count number of fields in each record */
*nfields=0;
j=32;
while (dbf[j]!=13) {
j=j+32;
*nfields=*nfields+1;
}

/* Allocate array for field lengths */
fieldlen = (int *)malloc(*nfields*sizeof(int));

/* Populate array of field lengths (1 byte each) */
*nchars=0;
for (i=0; i<*nfields; i=i+1) {
fieldlen[i]=dbf[48+(i*32)];
if (fieldlen[i]>*nchars) {
*nchars=fieldlen[i];
}
}

nrows=*nrecords+1; /* Add 1 because of header */
ncols=*nfields;
/* Allocate 3-dimensional array nrows-by-ncols-by-nchars */
*input_array = (char ***) malloc (nrows*sizeof(char **));
for (i=0; i<nrows; i=i+1) {
*input_array[i] = (char **) malloc(ncols*sizeof(char *));
for (j=0; j<ncols; j=j+1) {
*input_array[i][j] = (char *) malloc(*nchars*sizeof(char));
}
}

/* Initialize array contents to NULL */
for (i=0; i<nrows; i=i+1) {
for (j=0; j<ncols; j=j+1) {
for (k=0; k<*nchars; k=k+1) {
*input_array[i][j][k]='\0';
}
}
}

/* Write field titles to array */
for (i=0; i<*nfields; i=i+1) {
j=0;
while (dbf[(i*32)+32+j] !=NULL) {
*input_array[0][i][j]=dbf[(i*32)+32+j];
j=j+1;
}
}

/* Write all data fields to array */
for (i=0; i<*nrecords; i=i+1) {
c=1; /* Ignore Record Delete Flag */
for (j=0; j<*nfields; j=j+1) {
for (k=0; k<fieldlen[j]; k=k+1) {
*input_array[i+1][j][k]=dbf[headlen+(i*reclen)+c];
c=c+1;
}
}
}

/* Trim off any trailing spaces, tabs, or newlines */
for (i=0; i<*nrecords; i=i+1) {
for (j=0; j<*nfields; j=j+1) {
for (k=fieldlen[j]-1; k>=0; k=k-1) {
if (*input_array[i][j][k] !=' ' && *input_array[i][j][k] !='\t'
&& *input_array[i][j][k] !='\n') {
break;
}
}
*input_array[i][j][k+1]='\0';
}
}

/* De-allocate memory
free (dbf);
free (fieldlen);
for (i=0; i<nrows; i=i+1) {
for (j=0; j<ncols; j=j+1) {
free((void *)input_array[i][j]);
}
}
free ((void *)input_array); */

return (EXIT_SUCCESS);
}
 
Szabolcs Borsanyi

Hello,

I'm attempting to write a program to read in database files (.dbf).
When I do it all as a single procedure in main, everything works.
However, what I really want is to pass the database filename to a function,
and have it pass back an array containing the database contents, and some
[snip]

Let's get to your code
*********************************************
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

/*int readfile (char *, int, int, int, char ***); */
Why is this a comment? Using a function prototype is nothing evil.
int main ()
{
int i,j,nrecords,nfields,nchars;
char filename[100];
char ***input;

strcpy (filename, "input.dbf");

readfile (filename,&nrecords,&nfields,&nchars,input);
Oops: readfile() receives only a copy of the uninitialised 'input' pointer,
so whatever it assigns to that copy never gets back to main().
Your second version does not have this problem. Why don't you pass the 'input'
pointer back as the return value of the function?
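The point above can be seen in a minimal sketch. The function names here are made up for illustration and are not part of the posted program. C passes arguments by value, so assigning to a pointer parameter only changes the function's own copy:

```c
#include <stdlib.h>

/* Hypothetical helpers, for illustration only. */

/* Broken: 'p' is a copy of the caller's pointer, so the caller never
   sees the new address. The block is freed here only to avoid a leak. */
void alloc_by_value(int *p, size_t n)
{
    p = malloc(n * sizeof *p);
    free(p);
}

/* Works: the caller passes the address of its pointer. */
int alloc_by_out_param(int **p, size_t n)
{
    *p = malloc(n * sizeof **p);
    return *p != NULL;
}

/* Also works, and is often simpler: hand the pointer back as the
   return value. */
int *alloc_by_return(size_t n)
{
    return malloc(n * sizeof(int));
}
```

After `alloc_by_value(a, 4)` the caller's `a` is unchanged, whereas `alloc_by_out_param(&b, 4)` and `c = alloc_by_return(4)` both leave the caller holding usable memory that it must later free.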
printf ("\n%u %u %u", nrecords,nfields,nchars); Are you sure about the %u ?
printf (" %s",filename);

for (i=0; i<nrecords; i=i+1) {
for (j=0; j<nfields; j=j+1) {
printf ("%s ",input[i][j]);
}
printf ("\n");
}
}


Let's see the second attempt
*********************************************
In the following version I attempt to treat array "input"
and "input_array" in the same manner as I treat nrecords,
nfields, and nchars.

*********************************************

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

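/*int readfile (char *, int, int, int, char ***); */ OK once you delete the /* and */ (the int parameters would also have to become int *, and char *** become char ****, to match the definition below).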

int main ()
{
int i,j,nrecords,nfields,nchars;
char filename[100];
char ***input;

strcpy (filename, "input.dbf");

readfile (filename,&nrecords,&nfields,&nchars,&input);
printf ("\n%u %u %u", nrecords,nfields,nchars);
%u is for unsigned ints. Yours are plain ints, so better use %d.
printf (" %s",filename);

for (i=0; i<nrecords; i=i+1) {
I wonder why you keep writing i=i+1, most people do i++, but your version
is equally correct.
for (j=0; j<nfields; j=j+1) {
printf ("%s ",input[i][j]);
}
printf ("\n");
}
exit(1);

Does 1 mean a kind of failure? Normally exit(0) is used for
the successful termination, but from the main(), a 'return 0' will do.

And now comes your lengthy function
int readfile (filename,nrecords,nfields,nchars,input_array)
int *nrecords, *nfields, *nchars;
char filename[];
char ****input_array;
This is the ancient (pre-ANSI) way of defining a function. A prototype-style
definition lets the compiler check the argument types.
{
int i,j,k,c,b1,b2,b3,b4,headlen,reclen;
int nrows, ncols,max;
int *dbf,*fieldlen;
FILE *fi;

/* Attempt to open .dbf file */
fi = fopen (filename, "rb");
if (fi==NULL) {
printf ("Can't open .dbf file.\n");
exit (EXIT_FAILURE);
}

/* Count number of bytes in .dbf file */
i=0;
while ((b1=fgetc(fi)) !=EOF) {
i=i+1;
}
fclose (fi);

/* Allocate array for file contents */
dbf = (int *)malloc(i*sizeof(int));
The (int*) is superfluous and can be misleading.
sizeof(*dbf) is slightly easier to maintain than sizeof(int)
/* Read .dbf file into array dbf */
i=0;
fi = fopen (filename, "rb");
You are reopening the file. Are you doing so just to get back to its
beginning? How about rewind() (or fseek())? Reopening should
also involve some error checking...
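As a rough sketch of the rewind() suggestion: the helper below is hypothetical (the name `read_whole_file` and its interface are assumptions, not part of the posted code). It counts the bytes, rewinds instead of reopening, and then reads the data into a buffer of unsigned char.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'ed copy of the file's bytes and
   stores the byte count in *len, or returns NULL on failure. */
unsigned char *read_whole_file(const char *path, size_t *len)
{
    FILE *fp = fopen(path, "rb");
    unsigned char *buf;
    size_t n = 0, got;

    if (fp == NULL)
        return NULL;

    /* First pass: count the bytes. */
    while (fgetc(fp) != EOF)
        n++;
    if (ferror(fp)) {
        fclose(fp);
        return NULL;
    }

    /* Go back to the start instead of closing and reopening. */
    rewind(fp);

    buf = malloc(n ? n : 1);   /* avoid a zero-size request */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* Second pass: a single fread call fetches everything. */
    got = fread(buf, 1, n, fp);
    fclose(fp);
    if (got != n) {
        free(buf);
        return NULL;
    }
    *len = n;
    return buf;
}
```

The caller owns the returned buffer and releases it with free() when done.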

[snip some code]
nrows=*nrecords+1; /* Add 1 because of header */
ncols=*nfields;
/* Allocate 3-dimensional array nrows-by-ncols-by-nchars */
*input_array = (char ***) malloc (nrows*sizeof(char **));
These casts in front of malloc are quite annoying. A C++ compiler
would certainly require them, but this is C, so let's save those
keystrokes.
for (i=0; i<nrows; i=i+1) {
*input_array[i] = (char **) malloc(ncols*sizeof(char *));

Oops!
*input_array[i] is *(input_array[i]), that is, input_array[i][0].
What you really wanted is (*input_array)[i]=...
The catch is the operator precedence.
for (j=0; j<ncols; j=j+1) {
*input_array[i][j] = (char *) malloc(*nchars*sizeof(char));

Again: (*input_array)[i][j] will be better
}
}

/* Initialize array contents to NULL */
for (i=0; i<nrows; i=i+1) {
for (j=0; j<ncols; j=j+1) {
for (k=0; k<*nchars; k=k+1) {
*input_array[i][j][k]='\0';

Yet again: (*input_array)[i][j][k]=0 will be better.
Notice that '\0' is the four character long-hand for 0 (both are ints).
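The precedence point can be illustrated with a small, hypothetical function (the name `make_squares` is an assumption, and `n` is assumed to be positive). Indexing binds tighter than the unary `*`, so the parentheses are needed whenever an out-parameter is indexed:

```c
#include <stdlib.h>

/* Hypothetical example: allocate 'n' ints through an out-parameter and
   fill them in. Returns 1 on success, 0 on allocation failure. */
int make_squares(int **out, int n)
{
    int i;

    *out = malloc((size_t)n * sizeof **out);
    if (*out == NULL)
        return 0;

    for (i = 0; i < n; i++) {
        /* (*out)[i] means "element i of the array that the caller's
           pointer points to". Writing *out[i] would parse as *(out[i]),
           which treats 'out' itself as an array of pointers and, for
           i > 0, reads past the caller's single pointer. */
        (*out)[i] = i * i;
    }
    return 1;
}
```

A caller would write `int *sq; make_squares(&sq, 5);` and later `free(sq);`.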

[snip], you'll have to correct the same error in a few more places.
/* De-allocate memory
free (dbf);
free (fieldlen);
for (i=0; i<nrows; i=i+1) {
for (j=0; j<ncols; j=j+1) {
free((void *)input_array[i][j]);

???????
Are you releasing the memory before returning from the function?
Where is the caller (the main()) supposed to find the data?
And please do not cast the pointers to (void*); this conversion takes
place automatically.
}
}
free ((void *)input_array); */
???????
Are you planning to call free() on the caller's memory? Not very
polite. These lines stayed in your code from the time when everything
was in main().
return (EXIT_SUCCESS); This too.
}
By the way, you do not strictly need to call free() on the malloced memory:
on mainstream operating systems it is all reclaimed when your program exits.
The friendly way to handle deallocation, if you need it, is to accompany
your read routine with a deallocator function.
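A minimal sketch of such an allocator/deallocator pair follows. The names `table_alloc` and `table_free` are hypothetical, the dimensions are assumed to be positive, and each string is given one extra byte for its terminator, which is an assumption about what the read routine needs.

```c
#include <stdlib.h>

/* Hypothetical pair of helpers; names and layout are assumptions. */

/* Free a table built by table_alloc(). Safe on a partly built table,
   because unused slots are NULL and free(NULL) does nothing. */
void table_free(char ***t, int nrows, int ncols)
{
    int i, j;

    if (t == NULL)
        return;
    for (i = 0; i < nrows; i++) {
        if (t[i] != NULL) {
            for (j = 0; j < ncols; j++)
                free(t[i][j]);
            free(t[i]);
        }
    }
    free(t);
}

/* Allocate nrows x ncols strings, each with room for nchars characters
   plus a terminating '\0'. calloc zero-fills the character data, so
   every string starts out empty. Returns NULL on failure. */
char ***table_alloc(int nrows, int ncols, int nchars)
{
    int i, j;
    char ***t = malloc((size_t)nrows * sizeof *t);

    if (t == NULL)
        return NULL;
    for (i = 0; i < nrows; i++)
        t[i] = NULL;              /* so table_free() can run at any time */

    for (i = 0; i < nrows; i++) {
        t[i] = malloc((size_t)ncols * sizeof *t[i]);
        if (t[i] == NULL)
            goto fail;
        for (j = 0; j < ncols; j++)
            t[i][j] = NULL;
        for (j = 0; j < ncols; j++) {
            t[i][j] = calloc((size_t)nchars + 1, 1);
            if (t[i][j] == NULL)
                goto fail;
        }
    }
    return t;

fail:
    table_free(t, nrows, ncols);
    return NULL;
}
```

With this split, the read routine can return the table to main(), and main() calls `table_free` once it has finished printing.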

Szabolcs
 
Paul David Buchan

Thanks Szabolcs!

Looks like a lot of good information there.
It's going to take me some time to understand it all.
I've been FORTRAN programming for years, so C is hard for me.

I really appreciate your comments.

Dave
 
Ben Bacarisse

Paul David Buchan said:
Looks like a lot of good information there.

There were a couple of key things not yet pointed out. You read the
whole dbf into an array of ints. But each character is just a char.
You could read the file in one single fread operation.

Secondly, having done that, you then go and allocate a 3D array of
char and copy the chars (stored as ints) from the dbf array to the
new one. This whole operation could be avoided if you had read the
file into a giant char array in the first place -- you would already have
all the data in (roughly) the right place and order. It would have
been simpler to provide access functions to that giant array rather
than moving all the data about!
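Access functions of that kind might look roughly like the sketch below. The function names are hypothetical; the byte offsets are the ones the posted code already uses (record count in bytes 4-7, header length in bytes 8-9, record length in bytes 10-11, 32-byte field descriptors starting at offset 32 with the field length at byte 16 of each). The buffer is assumed to hold the whole file as unsigned char.

```c
#include <stddef.h>

/* Hypothetical accessors over the raw file image. */

unsigned long dbf_nrecords(const unsigned char *dbf)
{
    return (unsigned long)dbf[4]
         | (unsigned long)dbf[5] << 8
         | (unsigned long)dbf[6] << 16
         | (unsigned long)dbf[7] << 24;
}

unsigned dbf_headlen(const unsigned char *dbf)
{
    return dbf[8] | (unsigned)dbf[9] << 8;
}

unsigned dbf_reclen(const unsigned char *dbf)
{
    return dbf[10] | (unsigned)dbf[11] << 8;
}

/* Width in bytes of field 'f' (0-based). */
unsigned dbf_fieldlen(const unsigned char *dbf, unsigned f)
{
    return dbf[32 + 32 * f + 16];
}

/* Pointer to the first byte of field 'f' in record 'r' (both 0-based).
   The bytes are NOT null terminated; use dbf_fieldlen() for the width. */
const unsigned char *dbf_field(const unsigned char *dbf,
                               unsigned long r, unsigned f)
{
    /* +1 skips the record's delete flag. */
    size_t off = dbf_headlen(dbf) + r * dbf_reclen(dbf) + 1;
    unsigned i;

    for (i = 0; i < f; i++)
        off += dbf_fieldlen(dbf, i);
    return dbf + off;
}
```

No range checking is done here; a real version would want to validate `r` and `f` against the record and field counts before indexing.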

However... It often is worth "parsing" a file like this and mangling
the data into a better form but I think you miss one key trick -- you
read arrays of chars but you don't turn them into strings (i.e. you
do not ensure they are null terminated). Now, it could be that DBF
files always have null-terminated strings, but that would be quite odd
for a DB file format that seems to have fixed size fields. If any of
your strings are not null terminated you will be in trouble when you
use %s to print them.
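One way to turn a fixed-width field into a proper string is sketched below. The helper name `field_to_string` is hypothetical. Note that the destination needs room for the field width plus one byte; a buffer of exactly the field width, as in the posted allocation, has no room for the terminator when a field is completely full.

```c
#include <stddef.h>

/* Hypothetical helper: copy a fixed-width field of 'width' bytes into
   'dst', drop trailing spaces, tabs and newlines, and terminate the
   result. 'dst' must have room for width + 1 bytes. Returns the length
   of the resulting string. */
size_t field_to_string(char *dst, const unsigned char *src, size_t width)
{
    size_t i, len = width;

    /* Find the end of the text without ever indexing below zero. */
    while (len > 0 && (src[len - 1] == ' ' || src[len - 1] == '\t'
                       || src[len - 1] == '\n'))
        len--;

    for (i = 0; i < len; i++)
        dst[i] = (char)src[i];
    dst[len] = '\0';   /* this is what makes it a C string */
    return len;
}
```

A field holding only blanks comes back as the empty string, and the result is always safe to print with %s.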
 
Paul David Buchan

Hi guys,

Ben, You're right! I actually had a brief moment of clarity
when I was thinking of leaving everything in the dbf array,
but I somehow got onto the track I did.

Regardless, I totally missed the fact that I didn't terminate the strings.
And now that you mention it, there may be a screw-up in my trimming
routine. I need to check the bounds on that. I haven't looked yet.

Thanks for the input.

Dave
 
Paul David Buchan

Szabolcs,

I'm pleased to report that I implemented your recommendations,
and now it works perfectly!

Thanks again,

Dave
 
