remove entry from text file [NOT A HOMEWORK]

T

tmp123

Roman said:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

options {
directory "/var/named";
listen-on { 192.168.11.22; 127.0.0.1; };
forwarders { 168.126.63.1; };
pid-file "/var/run/named.pid";
};

controls {
unix "/var/run/ndc" perm 0600 owner 0 group 0;
};

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
...

I bother only about "zone" entries and the goal of code to remove them only.
Here is the source code I'm done with. I'd like to hear reasonable criticism
and useful advices :) on optimization, bugs etc.

#include <stdio.h>
#include <string.h>
#include <strings.h>

#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;

if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
if (!found) {
fprintf(fn, buf);
fflush(fn);
}
continue;
}

idx += sizeof("zone ") - 1;

/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}

Thank you in advance.

With best regards, Roman Mashak. E-mail: (e-mail address removed)


Look for references about "lex" C source generator (or even "yacc" if
you have more complex requirements like this one)
 
R

Roman Mashak

Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

options {
directory "/var/named";
listen-on { 192.168.11.22; 127.0.0.1; };
forwarders { 168.126.63.1; };
pid-file "/var/run/named.pid";
};

controls {
unix "/var/run/ndc" perm 0600 owner 0 group 0;
};

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
....

I bother only about "zone" entries and the goal of code to remove them only.
Here is the source code I'm done with. I'd like to hear reasonable criticism
and useful advices :) on optimization, bugs etc.

#include <stdio.h>
#include <string.h>
#include <strings.h>

#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;

if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {
if (!found) {
fprintf(fn, buf);
fflush(fn);
}
continue;
}

idx += sizeof("zone ") - 1;

/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}

Thank you in advance.

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 
S

Sirius Black

Roman said:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):
<snip>
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

Here, because of opening the file for writing in present directory, it
will restrict the directory from which a user can run your program. If a
user does n't have write permission for the present directory, then he
will have to change the directory and run your program.

Probably you should have explained algorithm used in your program in
short paragraph.
 
B

boa

Roman said:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

options {
directory "/var/named";
listen-on { 192.168.11.22; 127.0.0.1; };
forwarders { 168.126.63.1; };
pid-file "/var/run/named.pid";
};

controls {
unix "/var/run/ndc" perm 0600 owner 0 group 0;
};

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};
...

I bother only about "zone" entries and the goal of code to remove them only.
Here is the source code I'm done with. I'd like to hear reasonable criticism
and useful advices :) on optimization, bugs etc.

#include <stdio.h>
#include <string.h>
#include <strings.h>
strings.h?


#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);

Maybe return EXIT_FAILURE to indicate an error?
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;

Again, EXIT_FAILURE.
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )

How about using tmpnam() instead?
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

See http://c-faq.com/stdio/feof.html
/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {

strlen("zone"), not sizeof().
if (!found) {
fprintf(fn, buf);
fflush(fn);

Why do you need to call fflush()?

}
continue;
}

idx += sizeof("zone ") - 1;

/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

How about a function that strips the white space?

/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx = '\0'; i = 0;
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}


That's a start, HTH.

Boa
 
M

Micah Cowan

Roman Mashak said:
Hello, All!

I implemented simple program to eliminate entry from the file having the
following structure (actually it's config file of 'named' DNS package for
those who care and know):

zone "." {
type hint;
file "root.cache";
};

zone "localhost" {
type master;
file "localhost";
};

zone "my.local" {
type master;
file "my.local";
allow-update { 127.0.0.1; 192.168.11.0/24; };
};

#include <stdio.h>
#include <string.h>
#include <strings.h>

The last of these is not Standard C; see my discussion later.
#define BUFSIZE 100

int main(int argc, char **argv)
{
FILE *f; /* original */
FILE *fn; /* new */
char buf[BUFSIZE] = { 0 };
register char *idx;
char sidx[BUFSIZE];
int found = 0; /* set if zone found */
int i = 0;

if (argc != 3) {
fprintf(stderr, "Usage: %s <zonename> <configfile>\n", *argv);
return (0);
}

printf("%s %s\n", argv[1], argv[2]);

/* open original */
if ( (f = fopen(argv[2], "r")) == NULL )
return -1;

A return of -1 will have an implementation-defined meaning... slightly
better would be to #include <stdlib.h> and return EXIT_FAILURE; which
is guaranteed to have the obvious meaning on any current or future
implementation.

Also, it would really pay to emit some sort of message here. You'll
really thank yourself when you accidentally open a non-existant file,
or non-readable.... etc.
if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )
return -1;

while ( !feof(f) ) {
fgets(buf, BUFSIZE, f);

/* skip leading spaces */
for (idx = buf; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

/* look for 'zone' token */
if ( strncasecmp(idx, "zone ", sizeof("zone ") - 1) != 0 ) {

Whenever I find myself having to use the exact same string literal
twice like this, it's time to farm it out. Using something like:

const char zone_token[] = "zone ";
...
if (strncasecmp(idx, zone_token, (sizeof zone_token)-1) != 0)

saves you from the problem of accidental typos. If the number of
characters between the two of those were to accidentally get out of
sync, it'd be a bug. And even though you have them exactly right now,
you might later need to change the code to use a different literal,
and accidentally change only one.

BTW, strncasecmp() is not a Standard C function (it's POSIX, which is
not on-topic here). The residents on this newsgroup insist that you do
not post code containing functions that are (a) not Standard C
functions and (b) not defined in the provided code. For the sake of
example code on this newsgroup, the easiest way for you to avoid
difficulties is probably to use strncmp() instead. Or, define your own
implementation of strncasecmp()...
if (!found) {
fprintf(fn, buf);
fflush(fn);
}
continue;
}

idx += sizeof("zone ") - 1;

If you decide to follow my advice, this would become:

idx += (sizeof zone_array) - 1;

Actually, my own preference has been to define a macro such as:

#define ARY_STRLEN(a) (sizeof(a)-1)

which leads to a slightly better expression, in terms of
self-documentation:

idx += ARY_STRLEN(zone_array) - 1;
/* skip spaces inside of token */
for (; *idx != '\0' && (*idx == ' ' || *idx == '\t'); idx++)
;

This is the same thing as:

idx += strspn(idx, " \t");
/* extract token from commas */
if (*idx++ != '"')
continue;

/* put token into buffer to compare */
while (*idx != '\0' && *idx != '"')
sidx[i++] = *idx++;

sidx = '\0'; i = 0;


The second statement on the line above is easy to miss. I'd recommend
giving each of these a separate line.
if (strcasecmp(sidx, argv[1]) != 0) {
found = 0;
fprintf(fn, buf);
fflush(fn);
}
else
found = 1;
}

fclose(f);
fclose(fn);

return 0;
}

The last pair of fclose()s are not strictly necessary, if you're not
going to check their return value. However, I'd recommend that you do
check their return value, so you can emit an error message if fn
didn't flush properly... actually, you should check the return code of
your fflush() calls, too.

I'll point out that the code above won't work properly if your config
file contains lines longer than 100 characters.

-Micah
 
M

Micah Cowan

boa said:
strlen("zone"), not sizeof().

You mean, strlen("zone ") (with the space); and there's nothing wrong
with sizeof: it's much more efficient.
Why do you need to call fflush()?

Could be useful if he's using another program to view the temporary
file while it's being written (UNIX tail...).

Slightly better might be to set line buffering on fn with setvbuf().

-Micah
 
P

Pedro Graca

Micah said:
You mean, strlen("zone ") (with the space); and there's nothing wrong
with sizeof: it's much more efficient.

Hmmmm ???

What's the type of a literal string in code?
I thought it was `const char *'



#include <stdio.h>
#include <string.h>

int main(void) {
int a, b, c, d, e, f;
char * z1 = "forty two";
char z2[] = "forty two";

a = strlen("forty two");
b = (int)sizeof("forty two");
c = (int)sizeof(char*);
d = (int)sizeof(const char*);
e = (int)sizeof z1;
f = (int)sizeof z2; /* Ah! A match */
printf("%d, %d, %d, %d, %d, %d\n",
a, b, c, d, e, f);
return 0;
}


No Micah, I wasn't doubting you (... well, maybe just a little)
I stand corrected.
 
P

pete

Pedro Graca wrote:
What's the type of a literal string in code?

An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.
 
D

David Holland

>
> How about using tmpnam() instead?

<OT>
tmpnam is insecure in Unix; he's right to avoid it.
</OT>

Now, for something on topic...
^^^
Don't do that. Ever.

If the input file happens to contain a '%' character, undefined
behavior ensues.

This branch loses the input line, which probably isn't what was
intended.
 
K

Keith Thompson

pete said:
An array of some number of char.
No implict const qualifier,
just trouble if you attempt to modify the object.

"trouble" being undefined behavior.
 
J

Jordan Abel

"trouble" being undefined behavior.

This seems like an annoying compromise - I think it should have been
const char - any implementation concerned with supporting old code could
have let it been a warning, or have a compiler switch to put it in a
non-conforming mode, perhaps one which would also allow the
modifications themselves.
 
K

Keith Thompson

S.Tobias said:
...

I beleive so, but what is C&V for that?

C99 6.4.5p5:
In translation phase 7, a byte or code of value zero is appended
to each multibyte character sequence that results from a string
literal or literals. The multibyte character sequence is then used
to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals,
the array elements have type char, and are initialized with the
individual bytes of the multibyte character sequence;
[discussion of wide string literals deleted]

C99 6.5.1p4:
A string literal is a primary expression. It is an lvalue with
type as detailed in 6.4.5.
 
P

pete

Keith said:
S.Tobias said:
...

I beleive so, but what is C&V for that?

C99 6.4.5p5:
In translation phase 7, a byte or code of value zero is appended
to each multibyte character sequence that results from a string
literal or literals. The multibyte character sequence is then used
to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals,
the array elements have type char, and are initialized with the
individual bytes of the multibyte character sequence;
[discussion of wide string literals deleted]

C99 6.5.1p4:
A string literal is a primary expression. It is an lvalue with
type as detailed in 6.4.5.

N869
6.3.2 Other operands
6.3.2.1 Lvalues and function designators
[#3] Except when it is the operand of the sizeof operator or
the unary & operator, or is a string literal used to
initialize an array, an expression that has type ``array of
type'' is converted to an expression with type ``pointer to
type'' that points to the initial element of the array
object and is not an lvalue. If the array object has
register storage class, the behavior is undefined.
 
M

Micah Cowan

Pedro Graca said:
Hmmmm ???

What's the type of a literal string in code?
I thought it was `const char *'

I wish the const were there....

But now. A string literal has type char[].

Since you posted some very helpful test code, I'll explain what you're
seeing...
#include <stdio.h>
#include <string.h>

int main(void) {
int a, b, c, d, e, f;
char * z1 = "forty two";
char z2[] = "forty two";

These two don't involve objects generated, as they're initializers,
and not assignments.
a = strlen("forty two");

Not much need to comment here.
b = (int)sizeof("forty two");

This "forty two" requires that the generated program have at least one
character array with a length of 10, and that is the result given by
the string literal, so sizeof will return 10.
c = (int)sizeof(char*);
d = (int)sizeof(const char*);
e = (int)sizeof z1;

The size of all three of these will be implementation-dependant, but
will certainly be the same, as they all test the same type.
f = (int)sizeof z2; /* Ah! A match */

And, as you've discovered, this should also give 10, for the
explicitly declared array of 10 characters.
printf("%d, %d, %d, %d, %d, %d\n",
a, b, c, d, e, f);
return 0;
}


No Micah, I wasn't doubting you (... well, maybe just a little)
I stand corrected.

Hey, doubt away! I'm certainly human, and have certainly said my share
of stupid things on this list, both formerly and recently. That's why
posting to comp.lang.c is a great way to learn the ins and outs of the
C language: if you say something that reveals a misperception of
reality, you will quickly be corrected. :)

-Micah
 
K

Keith Thompson

Jordan Abel said:
This seems like an annoying compromise - I think it should have been
const char - any implementation concerned with supporting old code could
have let it been a warning, or have a compiler switch to put it in a
non-conforming mode, perhaps one which would also allow the
modifications themselves.

I agree that it's annoying, but I think it was pretty much necessary.
Pre-ANSI C didn't have "const". Making string literals const would
break code like this:

int print_string(char *s)
{
return printf("s = \"%s\"\n", s);
}

print_string("hello");

The problem would occur on passing the string literal to a function
expecting a (non-const) char*, not on any attempt to actually write to
it. There would have been no good way to write this program in a
manner compatible with both K&R and ANSI C, though I suppose something
like

#ifdef __STDC__
#define CONST const
#else
#define CONST
#endif

int print_string(CONST char *s) ...

would have been ok.

gcc's "-Wwrite-strings" option causes string literals to be const (and
causes some misleading error messages).
 
M

Micah Cowan

Roman Mashak said:
Hello, David!
You wrote on Fri, 24 Feb 2006 00:30:00 +0000 (UTC):

??>>> if (!found) {
??>>> fprintf(fn, buf);
DH> ^^^
DH> Don't do that. Ever.

DH> If the input file happens to contain a '%' character, undefined
DH> behavior ensues.
Hm, I didn't find this issue in FAQ. What's the more robust and portable
method to put formatted output into file?

It's /not/ formatted output: that's the point. If he must use
fprintf(), he should use "%s", followed by buf. But fputs() or puts()
seem like more reasonable facilities.
??>>> /* extract token from commas */
??>>> if (*idx++ != '"')
??>>> continue;

DH> This branch loses the input line, which probably isn't what was
DH> intended.
Here it is supposed to put a pointer on to first occurence of ' " ' symbol
in the line.

You've missed the point. If the line doesn't specify a " after the
keyword "zone ", then it's "not the line you're looking for, move
along." The trouble is, that if it's not the line he's looking for, he
probably wants to print it out, as he does in other places, but forgot
to do that here.

-Micah
 
R

Roman Mashak

Hello, boa!
You wrote on Thu, 23 Feb 2006 20:16:14 +0100:

??>> #include <strings.h>

b> strings.h?
I should've apologized for using of non-standard stuffs :)
??>> #define BUFSIZE 100
??>>
??>> int main(int argc, char **argv)
??>> if ( (fn = fopen("named.conf.tmp", "w+")) == NULL )

b> How about using tmpnam() instead?
Manual of my system strongly recommends never to use this function cause of
being unsafe.
??>> return -1;
??>>
??>> while ( !feof(f) ) {
??>> fgets(buf, BUFSIZE, f);

b> See http://c-faq.com/stdio/feof.html
Yeas, that was the problem I run into: printing '\n' twice, BUT it happened
only if I removed entry from the middle of file, otherwise everything came
right.
Thanks for hint.

??>> if (!found) {
??>> fprintf(fn, buf);
??>> fflush(fn);

b> Why do you need to call fflush()?
I view the result file in the next console via "tail -f" (I'm using linux
presently).
??>> }
??>> continue;
??>> }
??>>

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 
M

Micah Cowan

Roman Mashak said:
Hello, Micah!
You wrote on Thu, 23 Feb 2006 23:07:14 GMT:

[skip]
thanks for giving me many pieces of advices.

??>> /* extract token from commas */
??>> if (*idx++ != '"')
??>> continue;
??>>
??>> /* put token into buffer to compare */
??>> while (*idx != '\0' && *idx != '"')
??>> sidx[i++] = *idx++;
??>>
??>> sidx = '\0'; i = 0;

MC> The second statement on the line above is easy to miss. I'd recommend
MC> giving each of these a separate line.
Is there any way in your opinion to somehow optimize this piece of code,
make it more elegant :) may be in less statements...


Well, personally, I'd probably have avoided the separate and
unnecessary buffer sidx[] altogether, and just stored the start and
end of the interesting text.

Aside from that, though, it might be slightly more efficient to use an
incremented pointer instead. If you had:

char *scur;

Then the loop could be:

for (scur = sidx; *idx != '\0' && *idx != '"'; ++scur, ++idx)
*scur = *idx;

which would at least eliminate the need for i.
??>> if (strcasecmp(sidx, argv[1]) != 0) {
??>> found = 0;
??>> fprintf(fn, buf);
??>> fflush(fn);
??>> }
??>> else
??>> found = 1;
??>> }
??>>
??>> fclose(f);
??>> fclose(fn);
??>>
??>> return 0;
??>> }

MC> The last pair of fclose()s are not strictly necessary, if you're not
What do you mean by that? OS should take care of unclosed descriptors by
yourself?

By itself, yes. The C implementation is required to make sure that
exit from the program properly closes any open files.
And they're not descriptors, they're file handles
(descriptors is a UNIX term for something related but different).

-Micah
 
R

Roman Mashak

Hello, Micah!
You wrote on Thu, 23 Feb 2006 23:07:14 GMT:

[skip]
thanks for giving me many pieces of advices.

??>> /* extract token from commas */
??>> if (*idx++ != '"')
??>> continue;
??>>
??>> /* put token into buffer to compare */
??>> while (*idx != '\0' && *idx != '"')
??>> sidx[i++] = *idx++;
??>>
??>> sidx = '\0'; i = 0;

MC> The second statement on the line above is easy to miss. I'd recommend
MC> giving each of these a separate line.
Is there any way in your opinion to somehow optimize this piece of code,
make it more elegant :) may be in less statements...

??>> if (strcasecmp(sidx, argv[1]) != 0) {
??>> found = 0;
??>> fprintf(fn, buf);
??>> fflush(fn);
??>> }
??>> else
??>> found = 1;
??>> }
??>>
??>> fclose(f);
??>> fclose(fn);
??>>
??>> return 0;
??>> }

MC> The last pair of fclose()s are not strictly necessary, if you're not
What do you mean by that? OS should take care of unclosed descriptors by
yourself?
MC> going to check their return value. However, I'd recommend that you do
MC> check their return value, so you can emit an error message if fn
MC> didn't flush properly... actually, you should check the return code of
MC> your fflush() calls, too.

MC> I'll point out that the code above won't work properly if your config
MC> file contains lines longer than 100 characters.
That's correct, but I assume sane config file won't contain strings longer
than 100 characters ;-)

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,994
Messages
2,570,223
Members
46,813
Latest member
lawrwtwinkle111

Latest Threads

Top