NvrBst said:
I have a file full of data that I want to tokenize. My function works
as long as the data I want to grab isn't padded with whitespace.
However, I want to preserve that padding. Can I modify the fscanf
call to include it in the match?
---Example File---
MyKey1: INT, 3341, 1
MyKey2: STRING, Hello World, 1
MyKey3: STRING, , 1
--Format is Like so "KEYWORD: TYPE, Data1, Data2"---
fscanf(fFile, "%32[^:]: %32[^,], %32[^,], %d\n", P1, P2, P3, &P4);
When it gets to "MyKey3" it fails to match P3 and so returns 2.
I want P3 to be " ". Shouldn't "%32[^,]" match anything but ",",
spaces included? Is there a way around this, or a different way I
should be tokenizing such data?
Note: P1, P2 and P3 are each declared as char[32+1]; P4 is an int.
Thanks in advance. I'm using GNU GCC 4.3.2 on an Ubuntu machine with
the latest Eclipse CDT.
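On why the call above stops at "MyKey3": a space in a scanf format matches any run of whitespace (including none), so the ", " directive swallows the padding before "%32[^,]" ever sees it. A "%[" conversion must match at least one character, so it then fails on the comma that follows. A minimal sketch of one way around this, assuming the padding should end up in P3, is to drop the space after the commas in the format. The helper name parse_record is hypothetical, and sscanf on an in-memory line stands in for fscanf on the file:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse "KEYWORD: TYPE, Data1, Data2" from one line.
   p1, p2 and p3 must each point to at least 33 bytes.
   Returns the number of fields converted (4 on success). */
int parse_record(const char *line, char *p1, char *p2, char *p3, int *p4)
{
    assert(line && p1 && p2 && p3 && p4);
    /* No space after the second comma: %[ does not skip whitespace by
       itself, so the padding in front of Data1 is stored in p3.
       %d skips leading whitespace on its own. */
    return sscanf(line, "%32[^:]: %32[^,],%32[^,],%d", p1, p2, p3, p4);
}
```

With this format, "MyKey3: STRING, , 1" converts all four fields and leaves " " in P3, while "MyKey2: STRING, Hello World, 1" leaves " Hello World" (leading blank kept). When using fscanf directly on the file, the trailing "\n" of the original format is still needed to consume the line terminator.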
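I'd suggest writing your own overflow-free getline()-style function
that reads a whole line from the file, something like this (named
read_line here, since glibc's <stdio.h> already declares a getline):
#include <stdio.h>
#include <stdlib.h>
/* Read one line of any length; the caller frees the result.
   Returns NULL at end of file or on allocation failure. */
char *read_line(FILE *fp) {
    char *line = NULL, *tmp;
    int ch;                              /* int, so EOF is distinguishable */
    size_t size = 0;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (ch == '\r') continue;        /* drop CR from CRLF line ends */
        tmp = realloc(line, size + 2);   /* room for ch and the '\0' */
        if (!tmp) { free(line); return NULL; }
        line = tmp;
        line[size++] = (char)ch;
    }
    if (!line && ch == EOF) return NULL;            /* nothing left */
    if (!line && !(line = malloc(1))) return NULL;  /* empty line */
    line[size] = '\0';
    return line;
}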
Then you can parse the line obtained this way using regex.h functions
to match what you like, using "," as separator.
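As a rough sketch of what the regex.h route could look like for the "KEYWORD: TYPE, Data1, Data2" format, assuming POSIX extended regular expressions are available (regex.h is POSIX, not ISO C). The helper names copy_group and parse_with_regex and the pattern itself are illustrative assumptions, not something from this thread:

```c
#include <assert.h>
#include <regex.h>   /* POSIX, not ISO C */
#include <stdio.h>
#include <string.h>

/* Copy capture group m of line into out (outsz bytes), truncating if needed. */
static void copy_group(const char *line, const regmatch_t *m,
                       char *out, size_t outsz)
{
    size_t len = (size_t)(m->rm_eo - m->rm_so);
    assert(outsz > 0);
    if (len >= outsz) len = outsz - 1;
    memcpy(out, line + m->rm_so, len);
    out[len] = '\0';
}

/* Hypothetical helper: split "KEYWORD: TYPE, Data1, Data2".
   key, type and data must each hold at least 33 bytes.
   Returns 0 on a match, -1 otherwise. */
int parse_with_regex(const char *line, char *key, char *type,
                     char *data, int *n)
{
    regex_t re;
    regmatch_t m[5];          /* whole match plus 4 groups */
    char num[16];
    int rc = -1;

    /* Group 3 is [^,]* so it keeps its padding and may be a lone blank. */
    if (regcomp(&re, "^([^:]*): *([^ ,][^,]*),([^,]*), *([0-9]+)",
                REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, line, 5, m, 0) == 0) {
        copy_group(line, &m[1], key, 33);
        copy_group(line, &m[2], type, 33);
        copy_group(line, &m[3], data, 33);
        copy_group(line, &m[4], num, sizeof num);
        if (sscanf(num, "%d", n) == 1)
            rc = 0;
    }
    regfree(&re);
    return rc;
}
```

In real use the pattern would be compiled once and reused for every line; it is compiled per call here only to keep the sketch short.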
There are much simpler (and faster) ways to handle getline. Among
them is ggets, available at:
<http://cbfalconer.home.att.net/download/ggets.zip>
I don't know if I supplied this earlier, in this thread. At any
rate, try it out. #define TESTING to get a testable version.
Don't define TESTING for normal use.
/* ------- file tknsplit.h ----------*/
#ifndef H_tknsplit_h
# define H_tknsplit_h
# ifdef __cplusplus
extern "C" {
# endif
#include <stddef.h>
/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.
The caller must supply sufficient space in tkn to
receive any tkn; otherwise tkns will be truncated.
Returns: a pointer past the terminating tknchar.
This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.
released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
revised 2007-05-26 (name)
*/
const char *tknsplit(const char *src,     /* Source of tkns */
                     char        tknchar, /* tkn delimiting char */
                     char       *tkn,     /* receiver of parsed tkn */
                     size_t      lgh);    /* length tkn can receive */
                                          /* not including final '\0' */
# ifdef __cplusplus
}
# endif
#endif
/* ------- end file tknsplit.h ----------*/
/* ------- file tknsplit.c ----------*/
#include "tknsplit.h"
/* copy over the next tkn from an input string, after
skipping leading blanks (or other whitespace?). The
tkn is terminated by the first appearance of tknchar,
or by the end of the source string.
The caller must supply sufficient space in tkn to
receive any tkn; otherwise tkns will be truncated.
Returns: a pointer past the terminating tknchar.
This will happily return an infinity of empty tkns if
called with src pointing to the end of a string. Tokens
will never include a copy of tknchar.
A better name would be "strtkn", except that is reserved
for the system namespace. Change to that at your risk.
released to Public Domain, by C.B. Falconer.
Published 2006-02-20. Attribution appreciated.
Revised 2006-06-13 2007-05-26 (name)
*/
const char *tknsplit(const char *src,     /* Source of tkns */
                     char        tknchar, /* tkn delimiting char */
                     char       *tkn,     /* receiver of parsed tkn */
                     size_t      lgh)     /* length tkn can receive */
                                          /* not including final '\0' */
{
   if (src) {
      while (' ' == *src) src++;
      while (*src && (tknchar != *src)) {
         if (lgh) {
            *tkn++ = *src;
            --lgh;
         }
         src++;
      }
      if (*src && (tknchar == *src)) src++;
   }
   *tkn = '\0';
   return src;
} /* tknsplit */
#ifdef TESTING
#include <stdio.h>
#define ABRsize 6 /* length of acceptable tkn abbreviations */
/* ---------------- */
static void showtkn(int i, char *tok)
{
   putchar(i + '1'); putchar(':');
   puts(tok);
} /* showtkn */
/* ---------------- */
int main(void)
{
   char teststring[] = "This is a test, ,, abbrev, more";
   const char *t, *s = teststring;
   int i;
   char tkn[ABRsize + 1];
   puts(teststring);
   t = s;
   for (i = 0; i < 4; i++) {
      t = tknsplit(t, ',', tkn, ABRsize);
      showtkn(i, tkn);
   }
   puts("\nHow to detect 'no more tkns' while truncating");
   t = s; i = 0;
   while (*t) {
      t = tknsplit(t, ',', tkn, 3);
      showtkn(i, tkn);
      i++;
   }
   puts("\nUsing blanks as tkn delimiters");
   t = s; i = 0;
   while (*t) {
      t = tknsplit(t, ' ', tkn, ABRsize);
      showtkn(i, tkn);
      i++;
   }
   return 0;
} /* main */
#endif
/* ------- end file tknsplit.c ----------*/
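One thing to watch with the data from the original question: tknsplit skips leading blanks, so for a line like "MyKey3: STRING, , 1" the padded field comes back as an empty string rather than " ". If the padding has to survive, one option is a variant without the blank-skipping loop. A sketch, using the hypothetical name tknsplit_raw (it is not part of the posted code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of tknsplit that keeps leading blanks.
   Otherwise the same contract: copies up to lgh chars of the next
   token into tkn, always terminates it, and returns a pointer past
   the delimiter (or to the end of the string). */
const char *tknsplit_raw(const char *src, char tknchar,
                         char *tkn, size_t lgh)
{
    assert(tkn);
    if (src) {
        while (*src && (tknchar != *src)) {
            if (lgh) {
                *tkn++ = *src;
                --lgh;
            }
            src++;
        }
        if (*src) src++;      /* step over the delimiter */
    }
    *tkn = '\0';
    return src;
}
```

Splitting "MyKey3: STRING, , 1" first on ':' and then three times on ',' yields "MyKey3", " STRING", " " and " 1", so the caller decides which fields to trim.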