Reading a data file

W. eWatson

It seems that every year I have to write one simple C program.
If I stare at my only book on C, Comprehensive C, 1992 (way back), my
eyes start to glaze over. It's been a long time.

I'm using gcc.

I have a file something like this:

mydata.dat
0.345 12.613 8.100
13.118
8.224 0.777
22.420 0.546 99.000


7.3f specifier. The numbers are lined up, but sometimes there are only
one or two numbers on a line. Actually, I'd be happy to read just one
line. I think I might be able to carry the ball after that.

I want to read it, and display it on the console.

Can someone supply the details?
 
Tim Rentsch

You need stdio.h of course:

#include <stdio.h>

The file can be opened with the fopen() function:

FILE *input = fopen( "mydata.dat", "r" );

Lines from the file can be read into a buffer using
the fgets() function:

char line[ 100 ];
while( fgets( line, sizeof line, input ) ){
...

To get the values on a line, use the sscanf() function,
checking the function's return value to see how many
values on the line there were:

double a, b, c;
int n;

n = sscanf( line, " %lf %lf %lf", &a, &b, &c );
/* n == 3 means three values were read - a, b, c */
/* n == 2 means two values were read - a, b */
/* n == 1 means one value was read - a */
/* n anything else means no values were read */

... now use the values ...

When all done reading, the file can be closed with
the fclose() function:

}

fclose( input );

Production-quality code should have checks for error
returns and perhaps accommodate a non-fixed-size
line buffer. Now that you have a better idea where
to look, it shouldn't be too hard to figure out what
those things are and how to do them.

Disclaimer: all I've done is type this stuff in.
It hasn't been compiled or tested in any way. So
don't rely on it except as a hint to point you
in (what I think is) a generally good direction.
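
For reference, the fragments above can be assembled into one piece roughly as follows. This is a minimal sketch rather than the poster's own code: the helper names parse_line and print_values are made up for illustration, the assert is an added precondition check, and the fixed 100-byte line buffer is carried over from the post.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse up to three numbers from one line.
   Returns how many values were converted (0 to 3). */
static int parse_line(const char *line, double out[3])
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line, " %lf %lf %lf", &out[0], &out[1], &out[2]);
    return n < 0 ? 0 : n;   /* sscanf returns EOF for an empty line */
}

/* Hypothetical helper: read every line of an open stream, echo the
   values found on it, and return the total number of values. */
static int print_values(FILE *input)
{
    char line[100];
    int total = 0;

    assert(input != NULL);
    while (fgets(line, sizeof line, input)) {
        double v[3];
        int n = parse_line(line, v);

        for (int i = 0; i < n; i++)
            printf("%7.3f%c", v[i], i + 1 < n ? ' ' : '\n');
        total += n;
    }
    return total;
}
```

A caller would open the file with fopen("mydata.dat", "r"), check the result against NULL, pass the stream to print_values(), and finish with fclose().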
 
W. eWatson

Thanks. Is sscanf a typo for fscanf?

I played some more and got this:
#include<stdio.h>
#include<stdlib.h>

int
main(void)
{
    char str[70];
    FILE *p;
    if((p=fopen("outfile.txt","r"))==NULL){
        printf("\nUnable to open file string.txt");
        exit(1);
    }
    while(fgets(str,70,p)!=NULL)
        puts(str);
    fclose(p);
    exit(0);
}
Needs some adjustments for my situation.

Getting closer.
 
Ike Naar

int
main(void)
{
    char str[70];
    FILE *p;
    if((p=fopen("outfile.txt","r"))==NULL){
        printf("\nUnable to open file string.txt");

Wouldn't it make more sense to put the end-of-line character
at the end of the line, rather than at the beginning?
 
Jorgen Grahn

OP: really? Because there are standard programs for that, e.g. cat(1)
on Unix. If you do need to do more, do you have to do floating-point
operations on the numbers? In a way it would be better if you could
treat them as strings: "0.345" and so on. No numerical errors to
worry about.

Tim Rentsch said:
The file can be opened with the fopen() function:

FILE *input = fopen( "mydata.dat", "r" );

I'd prefer to read from stdin, and/or from a file whose name was provided
in argv. Nitpick perhaps, but I really dislike programs which have odd
user interfaces -- they are hard to automate.

/Jorgen
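
One way to get the interface suggested above is sketched below. The helper name open_input and the convention that "-" means standard input are assumptions made for this illustration; nothing in the thread prescribes them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: choose the input stream from the command line.
   With no argument (or "-") it returns stdin; otherwise argv[1] is
   taken as a file name. Returns NULL if that file cannot be opened. */
static FILE *open_input(int argc, char *argv[])
{
    assert(argv != NULL);
    if (argc < 2 || (argv[1][0] == '-' && argv[1][1] == '\0'))
        return stdin;
    return fopen(argv[1], "r");
}
```

With this in place, both `prog mydata.dat` and `prog < mydata.dat` would work. The caller should call fclose() only when the returned stream is not stdin.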
 
James Kuyper

Nope. See for example <http://linux.die.net/man/3/sscanf> - sscanf() allows you to scan a string for data rather than reading from stdin or some other file.

Parts of section 12 of the C-FAQ <http://c-faq.com> may help you understand why this is a good thing.

W. eWatson is probably not ready for the following complication, but
I'll mention it for his future consideration, when he's become more
familiar with C:

The behavior of the fscanf() family of functions is undefined when
parsing a string that represents a number too large to be represented in
the corresponding type. For example, a typical value for FLT_MAX is
3.40282346638528859812e+38F, so sscanf("4e38", "%f", &x) has undefined
behavior.

If you want to avoid this problem, you should use the strto*() family of
functions. They have defined behavior regardless of the contents of the
string that is parsed. If something goes wrong, they provide more
information about what went wrong than sscanf() does. Finally, if they
extract a complete number before reaching the end of the string, they
let you know where the number ended, so you can continue parsing the
rest of that string in whatever manner you wish. These advantages have a
cost - using those functions is more complicated than using sscanf(),
which is why I don't think W. eWatson is ready for it yet.

Example of use:

#include <errno.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    for(int arg=1; arg<argc; arg++)
    {
        char *endptr;

        printf("Argument %d:", arg);
        errno = 0;
        double d = strtod(argv[arg], &endptr);
        if(endptr == argv[arg])
        {
            printf("The subject sequence is empty "
                   "or does not have the expected form");
        }
        else if(errno == ERANGE)
        {
            printf("Conversion to double %sflowed.",
                   (d >= HUGE_VAL || d <= -HUGE_VAL) ? "over" : "under");
        }
        else
        {
            printf("%g", d);
            if(*endptr)
                printf(" The rest of the string is: \"%s\"", endptr);
        }
        putchar('\n');
    }
    return ferror(stdout) ? EXIT_FAILURE : EXIT_SUCCESS;
}
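
The same technique extends to a line holding several numbers, because the end pointer from one strtod() call is the starting point for the next. A minimal sketch, assuming a made-up helper named parse_doubles that simply stops at the first token it cannot convert:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert every number on one line with strtod().
   Stores at most max values in out and returns how many were stored.
   Stops at the first token that is not a number or is out of range. */
static size_t parse_doubles(const char *line, double *out, size_t max)
{
    size_t count = 0;
    const char *p = line;

    assert(line != NULL && out != NULL);
    while (count < max) {
        char *end;

        errno = 0;
        double d = strtod(p, &end);
        if (end == p)           /* nothing converted: end of line or junk */
            break;
        if (errno == ERANGE)    /* overflow or underflow: give up here */
            break;
        out[count++] = d;
        p = end;                /* continue right after this number */
    }
    return count;
}
```

For the sample data file, a line such as "8.224 0.777" would give a count of 2, and a blank line would give 0.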
 
glen herrmannsfeldt

W. eWatson said:
It seems every year, I have to write one simple C-program.
If I stare at my only book on C, Comprehensive C, 1992 (way back), my
eyes start to glaze over. It's been a long time.
(snip)
I have a file something like this:
mydata.dat
0.345 12.613 8.100
13.118
8.224 0.777
22.420 0.546 99.000
7.3f specifier. The numbers are lined up, but sometimes there are only
one or two numbers on a line. Actually, I'd be happy to read just one
line. I think I might be able to carry the ball after that.

A loop with scanf or fscanf, reading one at a time, will ignore newlines
(line boundaries). Sometimes that is good, sometimes not.

One problem that C doesn't help much with is knowing how many there
will be, so that you can allocate space for them.

-- glen
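
Both points can be illustrated together: a loop that reads one value at a time with fscanf(), storing the values in an array that grows as needed. This is a sketch under assumptions, not a recommendation from the post; the helper name read_all, the starting capacity of 16, and the doubling strategy are all arbitrary choices.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read every number from a stream, ignoring line
   boundaries, into a heap array that doubles in size when it fills up.
   On success stores the count in *count and returns the array, which
   the caller must free. Returns NULL if memory runs out. */
static double *read_all(FILE *in, size_t *count)
{
    size_t used = 0, cap = 16;
    double *data;
    double x;

    assert(in != NULL && count != NULL);
    data = malloc(cap * sizeof *data);
    if (data == NULL)
        return NULL;

    while (fscanf(in, "%lf", &x) == 1) {
        if (used == cap) {
            double *bigger = realloc(data, 2 * cap * sizeof *bigger);
            if (bigger == NULL) {
                free(data);
                return NULL;
            }
            data = bigger;
            cap *= 2;
        }
        data[used++] = x;
    }
    *count = used;
    return data;
}
```

Because "%lf" skips all leading white space, including newlines, the caller cannot tell from the result which values shared a line.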
 
Tim Rentsch

Jorgen Grahn said:
The file can be opened with the fopen() function:

FILE *input = fopen( "mydata.dat", "r" );

I'd prefer to read from stdin, and/or from a file whose name was
provided in argv. [snip]

I might agree with those sentiments, but my comments were
concerned only with answering OP's question. He wasn't
asking for a lesson in programming style, and I didn't feel
a need to give one -- just to answer his question in the
most direct and straightforward way I could. Anything more
would serve to dilute the answer, and thereby lessen its
value.
 
W. eWatson

Nope. See for example <http://linux.die.net/man/3/sscanf> - sscanf() allows you to scan a string for data rather than reading from stdin or some other file.

Parts of section 12 of the C-FAQ <http://c-faq.com> may help you understand why this is a good thing.
OK, I thought I'd give sscanf a try. It must be a somewhat new function
to C. My old C book dated 1992 doesn't mention it.

Anyway I wrote the following program test.c to read the test.dat file
below. sscanf is used, but the printf goes looney when executed.

/* test.c */
#include<stdio.h>
#include<stdlib.h>

int
main(void)
{
    float pxm[2][80];
    int i,j,k;
    float value;
    char str[70];
    FILE *p;

    if((p=fopen("test_array.dat","r"))==NULL){
        printf("\nUnable to open file test_array.dat");
        exit(1);
    }

    while(fgets(str,70,p)!=NULL){
        sscanf(str,"%5.1f", &value);
        printf("value = %5.1f \n", value);
    }
    fclose(p);
    exit(0);
}

test.dat.
100.0 35.3 3.1
18.2 41.7 27.4
986.3 11.6 21.0
 
Paul

sscanf() returns a value indicating how many items it converted.

You're not even checking the value sscanf() returns to see whether
it did anything.

In the example at the link below, they use "argsread" to hold the
result. Some of their scan specifications have room for three
conversions, and the returned value indicates whether none, some, or
all of them succeeded. If nothing got converted then, logically, there
is nothing to print. So the first thing to do in your test program is
print the equivalent of "argsread", to see whether the conversion is
going well. If it isn't, then figure out why not.

http://www.keil.com/support/man/docs/c166/c166_sscanf.htm

Paul
 
Ike Naar

sscanf(str,"%5.1f", &value);

The format specifier for sscanf cannot contain a fractional part,
so "%5.1f" is not valid. If you really want to specify a maximum
field width of 5, then use "%5f". Not sure why you would want this,
it would only read the first five characters of a number, so reading
"-12345.67" would yield the value -1234.0 which may come as a
surprise.
Simply using "%f" should work just fine.

It often makes sense to check sscanf's return value, to
see if the read operation succeeded.
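
The truncation described above is easy to reproduce. A small sketch, using two made-up wrapper functions so the formats can be compared side by side:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: read a float with a maximum field width of 5. */
static int scan_width5(const char *text, float *value)
{
    assert(text != NULL && value != NULL);
    return sscanf(text, "%5f", value);
}

/* Hypothetical wrapper: read a float with no field width. */
static int scan_plain(const char *text, float *value)
{
    assert(text != NULL && value != NULL);
    return sscanf(text, "%f", value);
}
```

Given "-12345.67", scan_width5() consumes only the five characters "-1234" and stores -1234.0, while scan_plain() reads the whole number. Both return 1, so the return value alone does not reveal the truncation.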
 
glen herrmannsfeldt

(snip, someone wrote)
The format specifier for sscanf cannot contain a fractional part,
so "%5.1f" is not valid. If you really want to specify a maximum
field width of 5, then use "%5f". Not sure why you would want this,
it would only read the first five characters of a number, so reading
"-12345.67" would yield the value -1234.0 which may come as a
surprise.
Simply using "%f" should work just fine.

In the early Fortran days it was usual for files on punched cards not
to have any space between the fields. The field widths tell how many
characters belong to each value. For Fortran formatted input with a
descriptor such as F5.1, five characters are read, and the last one
is taken to follow the decimal point unless an actual decimal point
is punched (or otherwise present in the input file).

I hadn't thought about it for years, and didn't remember that
C didn't have that feature.

-- glen
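
C has no direct equivalent, but the idea can be approximated by slicing the field out of the line by position and converting it separately. The sketch below is only loosely modelled on Fw.d input: the helper name read_fixed and its parameters are invented for illustration, and real Fortran input editing has more rules (blank handling, exponents) than this covers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, loosely modelled on Fortran's Fw.d input:
   take width characters of line starting at offset; if the field has
   no decimal point, treat its last `decimals` digits as the fraction.
   Returns 1 and stores the value on success, 0 if the field is
   missing or not a number. Assumes width is less than 32. */
static int read_fixed(const char *line, size_t offset, size_t width,
                      int decimals, double *value)
{
    char field[32];
    char *end;

    assert(line != NULL && value != NULL && width < sizeof field);
    if (strlen(line) <= offset)
        return 0;                       /* line too short for this field */
    strncpy(field, line + offset, width);
    field[width] = '\0';

    double d = strtod(field, &end);
    if (end == field)
        return 0;                       /* blank or non-numeric field */
    if (strchr(field, '.') == NULL)     /* implied decimal point */
        for (int i = 0; i < decimals; i++)
            d /= 10.0;
    *value = d;
    return 1;
}
```

With a card image of "1234567890" and two F5.1-style fields, read_fixed(line, 0, 5, 1, &v) gives 1234.5 and read_fixed(line, 5, 5, 1, &v) gives 6789.0.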
 
Ike Naar

In the early Fortran days it was usual for files on punched cards not
to have any space between the fields. The field widths tell how many
characters belong to each value. For Fortran formatted input with a
descriptor such as F5.1, five characters are read, and the last one
is taken to follow the decimal point unless an actual decimal point
is punched (or otherwise present in the input file).

Good catch, I hadn't thought about that possibility.
However it doesn't seem to apply to the file format used by the OP,
given the sample input file that was attached to the program.
Hopefully the OP can comment on this.
 
Eric Sosman

OK, I thought I'd give sscanf a try. It must be a somewhat new function
to C. My old C book dated 1992 doesn't mention it.

Burn that book. sscanf() is described on page 150 of
"The C Programming Language," published in (wait for it...)

Nineteen Seventy-Eight
 
Tim Rentsch

Your program has these lines:

float value;
...
sscanf(str,"%5.1f", &value);

What differences do you see between that code and what was shown
in my previously posted example:

double a, b, c;
int n;

n = sscanf( line, " %lf %lf %lf", &a, &b, &c );
/* n == 3 means three values were read - a, b, c */
/* n == 2 means two values were read - a, b */
/* n == 1 means one value was read - a */
/* n anything else means no values were read */

... now use the values ...

Which of the differences that you see might be significant?
Hint: if you don't know if a particular difference will be
significant, then it _might_ be significant.
 
W. eWatson

The format specifier for sscanf cannot contain a fractional part,
so "%5.1f" is not valid. If you really want to specify a maximum
field width of 5, then use "%5f". Not sure why you would want this,
it would only read the first five characters of a number, so reading
"-12345.67" would yield the value -1234.0 which may come as a
surprise.
Simply using "%f" should work just fine.

It often makes sense to check sscanf's return value, to
see if the read operation succeeded.
The columns are fixed, so I would expect
20.3
1.45
to appear as they are.

Interesting about %5f. It appears that a specifier with a decimal part,
e.g. %10.3f, is only available with printf (or a similar function). I
decided to capture the return value of sscanf. Here's what I have:
while(fgets(str,70,p)!=NULL){
n=sscanf(str,"%5f", &value);
printf("value = %5.1f %d\n", value,n);
}

value is float. str is char str[70];

Here's what the program produces:

value = 0.00 0
value = 0.00 0
value = 0.00 -1
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 -1
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
value = 0.00 0
....
value = 0.00 0
value = 0.00 -1
value = 0.00 -1
value = 0.00 0
value = 0.00 0
value = 0.00 -1
value = 0.00 -1
value = 0.00 -1
value = 0.00 -1

About 39 lines!!!
n goes negative?
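
For reference, sscanf() returns EOF (a negative value, typically -1) when it runs out of input before converting anything, and 0 when input is present but does not match the format. A line containing only white space, including a bare "\r\n", falls into the first case. Whether that explains the output above cannot be determined from the thread; the sketch below, with a made-up helper name, only shows the three possible results for "%f".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: describe what sscanf() reported for one line. */
static const char *scan_status(const char *line)
{
    float value;
    int n;

    assert(line != NULL);
    n = sscanf(line, "%f", &value);
    if (n == EOF)
        return "EOF: ran out of input before any conversion";
    if (n == 0)
        return "0: input present but not a number";
    return "1: one value converted";
}
```

Printing scan_status(str) for each line read by fgets() would show which lines are blank, which hold text, and which hold numbers.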
 
W. eWatson

Burn that book. sscanf() is described on page 150 of
"The C Programming Language," published in (wait for it...)

Nineteen Seventy-Eight
Rats. I missed it in the index. It's a lame description.
 
W. eWatson

Maybe the culprit is that I'm using a data file that was produced in
Win7? End of line incompatibility?
 
Eric Sosman

[...] Here's what I have:
while(fgets(str,70,p)!=NULL){
n=sscanf(str,"%5f", &value);
printf("value = %5.1f %d\n", value,n);
                   ^
                   one decimal place
}

value is float. str is char str[70];

Here's what the program produces:

value = 0.00 0
          ^^
          two decimal places

The code you've shown isn't the code you're running.
 
