string search?

Bertram Trabant

Hello, I'm working on a little LAN game in the style of old text-only
MUDs and need a way to search for a string in a text
file (for example, for usernames). I know it works by looking
for the first letter and, if that matches, the second, and so on, but I don't know
how to write it. Any suggestions?
 
Richard Heathfield

Bertram Trabant said:
Hello, I'm working on a little LAN game in the style of old text-only
MUDs and would need to find a way to search for a string in a text
file (for example for usernames). I know it works in the way of looking
for the first letter, if matches the second and so on, but don't know
how to write it. Any suggestions?

Look up "Boyer-Moore" or "Knuth-Morris-Pratt".
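For comparison with those algorithms, the approach described in the question (match the first letter, then the second, and so on) is the naive search. A minimal sketch follows, assuming both strings are already in memory and NUL-terminated; the name naive_find is made up for illustration. In the worst case it does about n*m character comparisons, which is the cost Boyer-Moore and Knuth-Morris-Pratt are designed to avoid.

```c
#include <assert.h>
#include <stddef.h>

/* Naive search (hypothetical helper): returns a pointer to the first
   occurrence of needle in haystack, or NULL if there is none. */
const char *naive_find(const char *haystack, const char *needle)
{
    size_t i, j;

    assert(haystack != NULL && needle != NULL);
    for (i = 0; ; i++) {
        /* compare needle against haystack starting at offset i */
        for (j = 0; needle[j] != '\0' && haystack[i + j] == needle[j]; j++)
            continue;
        if (needle[j] == '\0')
            return haystack + i;   /* every needle char matched */
        if (haystack[i + j] == '\0')
            return NULL;           /* hit the end of haystack: no match */
    }
}
```

For short strings such as usernames this is usually fast enough; the standard library's strstr does the same job and is the more sensible choice in real code.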
 
Andrew Poelstra

Hello, I'm working on a little LAN game in the style of old text-only
MUDs and would need to find a way to search for a string in a text
file (for example for usernames). I know it works in the way of looking
for the first letter, if matches the second and so on, but don't know
how to write it. Any suggestions?

Well, you could attempt to read the third volume of Knuth's /The Art of
Computer Programming/. If you try it, I wish you all the luck in the
world.

Otherwise, you could try asking in comp.programming for pseudocode.
You'd probably get more help there.
 
Roland Pibinger

Hello, I'm working on a little LAN game in the style of old text-only
MUDs

A hobbyist program, I daresay.

and would need to find a way to search for a string in a text
file (for example for usernames). I know it works in the way of looking
for the first letter, if matches the second and so on, but don't know
how to write it. Any suggestions?

You have 2 possibilities depending on your requirements:
1. Search directly in the file using stdio functions like fopen,
fread, fgets, fgetc, fseek, ftell, fclose ...

2. Read the whole or parts (e.g. single lines) of the file into a
string and search the string. For that you need malloc and free for
(de-)allocation (or a fixed size buffer), the aforementioned stdio
functions and strstr to find the substring.

Good luck,
Roland Pibinger
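
The second possibility might look roughly like the sketch below, which reads one line at a time into a fixed-size buffer with fgets and searches it with strstr. The function name find_line and the buffer size are assumptions made for illustration. A line longer than the buffer is read in pieces, so a match that straddles two pieces would be missed and the line count would be off.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_MAX_LEN 256   /* assumed upper bound on line length */

/* Hypothetical helper: returns the 1-based number of the first line of f
   that contains needle, or 0 if no line does. Reads from the current
   file position. */
long find_line(FILE *f, const char *needle)
{
    char line[LINE_MAX_LEN];
    long lineno = 0;

    assert(f != NULL && needle != NULL);
    while (fgets(line, sizeof line, f) != NULL) {
        lineno++;
        if (strstr(line, needle) != NULL)
            return lineno;
    }
    return 0;   /* end of file or read error before any match */
}
```

Note that strstr finds substrings, so searching for "bob" also matches a line containing "bobby"; for exact username matches, strip the trailing newline and compare with strcmp instead.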
 
Clever Monkey

Bertram said:
Hello, I'm working on a little LAN game in the style of old text-only
MUDs and would need to find a way to search for a string in a text
file (for example for usernames). I know it works in the way of looking
for the first letter, if matches the second and so on, but don't know
how to write it. Any suggestions?

If you are going to be doing a fair amount of string manipulation in C
for this project then I suggest you do one of two things:

1. Implement a string handling library of your own
2. Use one of the many string libraries that are out there

The first is a fine way to get your chops down, but it may delay the
actual fun part of the development for you, namely writing the game.
 
Bertram Trabant

Roland said:
A hobbyist program, I daresay.

Oh yes, it's absolutely nothing professional, just a way to kill time
and make sarcastic comments about others.

2. Read the whole or parts (e.g. single lines) of the file into a
string and search the string. For that you need malloc and free for
(de-)allocation (or a fixed size buffer), the aforementioned stdio
functions and strstr to find the substring.

This looks good. Do commands like getc() use some sort of "cursor" in
the file? That would be a huge time-saver (Stupid code, clever file).

Also, thanks to everybody else who helped.
 
Flash Gordon

Roland said:
You may have a look at an introductory C book or an online tutorial
like:
http://www-ee.eng.hawaii.edu/Courses/EE150/Book/

I would not recommend this one. In the first C program it used the very
old fashioned definition of main:
main()
{
}

It failed to #include <stdio.h> before using printf.

It then went on to specify specific sizes for types (sizes that are
wrong on many modern systems) without mentioning that they depended on
the particular implementation.

In another program it went on to using scanf (again without a prototype
in scope) without checking the return value or ensuring that the
previous prompt will have been displayed. Then it only does while loops,
not for loops. I could go on.

I can't recommend this either. This started off looking better until it
suggested using gets in real programs and did not check the return value
of scanf and again did not ensure prompts were displayed.

Then the author goes on to assume that there is an implicit conversion
between a function pointer and an int. It was suggesting that C would accept
x = rand
when what was wanted was
x = rand()
Obviously, x is an int and rand is a function returning an int. As
everyone here knows, the first line would *require* a diagnostic.

The comp.lang.c FAQ, K&R2 (see the bibliography of the comp.lang.c FAQ)
and many other books that have been recommended here would be far better.
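
As a rough illustration of the input-handling points above (flush the prompt, avoid gets, check the conversion result), here is a minimal sketch. The function read_int and its parameters are invented for this example; it reads a whole line with fgets and converts it with sscanf, whose return value is checked. Input longer than the buffer is left in the stream, and a stricter version would use strtol to detect out-of-range values.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes prompt to out, reads one line from in and
   tries to parse an int from it. Returns 1 on success, 0 on end of file,
   read error, or if the line does not start with a number. */
int read_int(FILE *in, FILE *out, const char *prompt, int *value)
{
    char line[64];   /* assumed to be long enough for one input line */

    assert(in != NULL && out != NULL && prompt != NULL && value != NULL);
    fputs(prompt, out);
    fflush(out);     /* make sure the prompt is visible before reading */
    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    return sscanf(line, "%d", value) == 1;
}
```

A caller would typically write something like read_int(stdin, stdout, "How many? ", &n) and test the result before using n.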
 
Bertram Trabant

pete wrote:
They use a "file position indicator",
if that's what you mean.

This looks promising, is there any way of directly controlling it?
 
Richard Heathfield

Bertram Trabant said:
pete wrote:


This looks promising, is there any way of directly controlling it?

Look up ftell, fseek, fgetpos, and fsetpos.
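
A small sketch of how ftell and fseek can be combined to control the file position indicator: the function below (peek_line, a name made up for this example) remembers the current position, reads a line, and then seeks back so the next read sees the same line again. It assumes the stream supports positioning, which is usually true for disk files but not for pipes or terminals; in that case ftell or fseek reports failure and the function returns 0.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the next line of f into buf without
   consuming it. Returns 1 on success, 0 if nothing could be read or
   the stream cannot be repositioned. */
int peek_line(FILE *f, char *buf, int size)
{
    long pos;
    int ok;

    assert(f != NULL && buf != NULL && size > 0);
    pos = ftell(f);                       /* remember where we are */
    if (pos == -1L)
        return 0;                         /* stream is not seekable */
    ok = fgets(buf, size, f) != NULL;     /* advances the indicator */
    if (fseek(f, pos, SEEK_SET) != 0)     /* put it back */
        return 0;
    return ok;
}
```

fgetpos and fsetpos do the same job with an opaque fpos_t value, which can represent positions that do not fit in a long.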
 
Richard Heathfield

Flash Gordon said:
I would not recommend this one. In the first C program it used the very
old fashioned definition of main:
main()
{
}

Hardly a killer. After all, K&R2 does this as well.

It failed to #include <stdio.h> before using printf.

That's rather more serious. Taken with the other issues you mentioned, it's
enough to scupper any tutorial.

I can't recommend this either. This started off looking better until it
suggested using gets in real programs

Ouch!

The comp.lang.c FAQ, K&R2 (see the bibliography of the comp.lang.c FAQ)
and many other books that have been recommended here would be far better.

See also:

* http://www.eskimo.com/~scs/cclass/
* http://cprog.tomsweb.net/cintro.html
 
CBFalconer

Bertram said:
.... snip ...


This looks good. Do commands like getc() use some sort of "cursor" in
the file? That would be a huge time-saver (Stupid code, clever file).

To search for a single string you don't need any buffers. See the
following little demonstration to search for a string in a stream
without backtracking.

/*
Leor said:
I think so. Here's a version I just threw together:
*/

/* And here's another throw -- binfsrch.c by CBF */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <assert.h>

/* The difference between a binary and a text file, on read,
is the conversion of end-of-line delimiters. What those
delimiters are does not affect the action. In some cases
the presence of 0x1a EOF markers (MsDos) does.

This is a version of Knuth-Morris-Pratt algorithm. The
point of using this is to avoid any backtracking in file
reading, and thus avoiding any use of buffer arrays.
*/

size_t chrcount; /* debuggery, count of input chars, zeroed */

/* --------------------- */

/* Almost straight out of Sedgewick */
/* The next array indicates what index in id should next be
compared to the current char. Once the (lgh - 1)th char
has been successfully compared, the id has been found.
The array is formed by comparing id to itself. */
void initnext(int *next, const char *id, int lgh)
{
    int i, j;

    assert(lgh > 0);
    next[0] = -1; i = 0; j = -1;
    /* Stop at lgh - 1 so that only next[0..lgh-1] is written;
       the caller allocates exactly lgh entries. */
    while (i < lgh - 1) {
        while ((j >= 0) && (id[i] != id[j])) j = next[j];
        i++; j++;
        next[i] = j;
    }
#if (0)
    for (i = 0; i < lgh; i++)
        printf("id[%d] = '%c' next[%d] = %d\n",
               i, id[i], i, next[i]);
#endif
} /* initnext */

/* --------------------- */

/* reads f without rewinding until either EOF or *marker
has been found. Returns EOF if not found. At exit the
last matching char has been read, and no further. */
int kmpffind(const char *marker, int lgh, int *next, FILE *f)
{
    int j;  /* char position in marker to check */
    int ch; /* current char */

    assert(lgh > 0);
    j = 0;
    while ((j < lgh) && (EOF != (ch = getc(f)))) {
        chrcount++;
        /* on mismatch, fall back through next[] instead of rereading */
        while ((j >= 0) && (ch != marker[j])) j = next[j];
        j++;
    }
    return ch;
} /* kmpffind */

/* --------------------- */

/* Find marker in f, display following printing chars
up to some non printing character or EOF */
int binfsrch(const char *marker, FILE *f)
{
    int *next;
    int lgh;
    int ch;
    int items; /* count of markers found */

    lgh = strlen(marker);
    if (!(next = malloc(lgh * sizeof *next))) {
        puts("No memory");
        exit(EXIT_FAILURE);
    }
    else {
        initnext(next, marker, lgh);
        items = 0;
        while (EOF != kmpffind(marker, lgh, next, f)) {
            /* found, take appropriate action */
            items++;
            printf("%d %s : \"", items, marker);
            while (isprint(ch = getc(f))) {
                chrcount++;
                putchar(ch);
            }
            puts("\"");
            if (EOF == ch) break;
            else chrcount++;
        }
        free(next);
        return items;
    }
} /* binfsrch */

/* --------------------- */

int main(int argc, char **argv)
{
    FILE *f;

    f = stdin;
    if (3 == argc) {
        if (!(f = fopen(argv[2], "rb"))) {
            printf("Can't open %s\n", argv[2]);
            exit(EXIT_FAILURE);
        }
        argc--;
    }
    if (2 != argc) {
        puts("Usage: binfsrch name [binaryfile]");
        puts(" (file defaults to stdin text mode)");
    }
    else if (binfsrch(argv[1], f)) {
        printf("\"%s\" : found\n", argv[1]);
    }
    else printf("\"%s\" : not found\n", argv[1]);
    printf("%lu chars\n", (unsigned long)chrcount);
    return 0;
} /* main binfsrch */

--
Some informative links:
<http://www.geocities.com/nnqweb/>
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/>
 
Flash Gordon

Richard said:
Flash Gordon said:


Hardly a killer. After all, K&R2 does this as well.

I agree. If all that was wrong was a few old fashioned definitions it
would be worth pointing out but would not be a reason to avoid it.

That's rather more serious. Taken with the other issues you mentioned, it's
enough to scupper any tutorial.

Which is why I considered a brief critique worth the effort.

What makes it worse is that it is apparently a course that was run at this
educational establishment in the spring of '98. I was about to say that a
later course was a bit better, although still having a lot of
old-fashioned and out-of-date stuff for a course in 2006, then I looked more
carefully and saw this:
http://www-ee.eng.hawaii.edu/~tep/EE160/Book/chap2/subsection2.1.1.2.html#SECTION0011200000000000000
To be fair, some of the rest of it is not so bad and I saw several
examples that did include stdlib.h. However, then we get to
http://www-ee.eng.hawaii.edu/~tep/EE160/Book/chap7/section2.1.3.html#SECTION0013000000000000000
which says amongst other things:
| ... If the array is an integer array, (float array, character array,
| etc.) then the type of X is int * ( float *, char *, etc.). Thus,
| the declaration of an array causes the compiler to allocate the
| specified number of contiguous cells of the indicated type, as well as
| to allocate an appropriate pointer cell, initialized to point to the
| first cell of the array. This pointer cell is given the name of the
| array.
Then I think, well, that kind of teaching could explain all the effort
people have to go to on this group to explain the difference between
pointers and arrays.
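
To make that difference concrete, here is a minimal sketch (the function names are invented for illustration). If declaring an array really allocated a separate pointer cell named after the array, sizeof applied to the array name would give the size of a pointer. Instead it gives the size of the whole array, because the name denotes the array object itself and is only converted to a pointer in most expressions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: size of an array of 10 ints, taken via its name. */
size_t array_bytes(void)
{
    int a[10];

    /* a is the array itself here, not a pointer to its first element */
    assert(sizeof a == 10 * sizeof a[0]);
    return sizeof a;
}

/* Hypothetical demo: size of a real pointer to the first element. */
size_t pointer_bytes(void)
{
    int a[10];
    int *p = a;   /* here a is converted to &a[0] */

    (void)p;
    return sizeof p;
}
```

The two functions return different values on any implementation where a pointer is not exactly as large as ten ints, which covers every common platform.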

Some of it is not too bad, but you *really* don't want to know some of
the rest of what I saw.

To be fair, it mentioned fgets in the same sentence, but did not
recommend it over gets.

Your recommendations, on the other hand, I feel no need to check myself
for suitability. Your recommendation would be enough even if I did not
recognise who wrote the first tutorial from the URL.
 
Keith Thompson

Richard Heathfield said:
Bertram Trabant said:

Look up ftell, fseek, fgetpos, and fsetpos.

And don't expect them to work on anything other than disk files. You
can't rewind a keyboard.
 
Walter Roberson

Bertram Trabant wrote:
They use a "file position indicator",
if that's what you mean.

I'd forgotten that term was used in the standard.


I noticed something interesting (but not much related to the original
question) when I re-read the C89 section today, and I don't recall seeing
it before: that the actual address of a FILE object may be significant
and that a copy of the FILE object "may not necessarily serve"
in place of the original. It makes a certain kind of sense in practice,
allowing the implementation some optimizations in linking FILE objects
and implementation-level "underlying functions" such as read().
 
Spiros Bousbouras

Walter said:
I'd forgotten that term was used in the standard.


I noticed something interesting (but not much related to the original
question) when I re-read the C89 section today, and I don't recall seeing
it before: that the actual address of a FILE object may be significant
and that a copy of the FILE object "may not necessarily serve"
in place of the original. It makes a certain kind of sense in practice,
allowing the implementation some optimizations in linking FILE objects
and implementation-level "underlying functions" such as read().

I started a thread on this some time ago:
http://groups.google.co.uk/group/co...624d?lnk=gst&q=spiros&rnum=1#d0a89169bf8c624d
 
Roland Pibinger

I would not recommend this one.


I can't recommend this either.

IMO, for the student, accessibility of the content and the right 'pace'
are more important than 100% correctness.

Best regards,
Roland Pibinger
 
Richard Heathfield

Keith Thompson said:
And don't expect them to work on anything other than disk files. You
can't rewind a keyboard.

Oh yes you can. It's messy, but it can be done. (I wouldn't count on being
able to type on it afterwards, though.)
 
