automated test for sub-string search algorithms...

Chris M. Thomasson · Feb 20, 2010

This code includes my sub-string search algorithm, the automated test, and a
stress test for my algorithm:

http://clc.pastebin.com/f5cb3bbd2

This testing framework helped me find several bugs in my algorithm. I have
fixed them and so far, so good... I will continue to blast the shi% out of
my algorithm with this test and report any errors here.

The automated test randomly generates a comparand string, then it builds a
source string and inserts the comparand at random positions within it. It
records all the offsets, and the count of matches. Now, you can use this
information to verify that your search algorithm is getting everything
correct. Please refer to the `strstr_test_create()' function which generates
the random data. Then refer to the `strstr_test_xstrstr()' function which
actually verifies that my algorithm is getting things right.

AFAICT, it's a fairly decent way to blast your sub-string search algorithm
with tremendous forms of random data.

Enjoy!

BTW, if you have any improvements in mind, please share all of them.

;^)

Chris M. Thomasson · Feb 21, 2010

Chris M. Thomasson said:
This code includes my sub-string search algorithm, the automated test, and
a stress test for my algorithm:

http://clc.pastebin.com/f5cb3bbd2

[...]

There is a little bug on `line 68':

printf("source(%u): %s\n",
(unsigned long)src_size,
src);

The first formatter is not compatible with the damn argument! Here, let me
fix that:

http://clc.pastebin.com/f32655568

Sorry about that crap. Anyway, I changed a couple of things. One, I funnel
all debug messages through a function macro `DBG_PRINTF()' which
automatically disables output if `NDEBUG' is defined. Two, I made a little
tweak to the search algorithm itself. It now searches from left-to-right
after it determines that a rightmost character matches the rightmost
character from the comparand.

IMHO, this simple little test program comes in handy when you are trying to
debug a sub-string search algorithm. I can play around with all sorts of
improvements and see exactly where each one of them fail. One thing I am
exploring is reading 32/64-bits worth of characters at a time. This allows
me reduce the total number of comparisons...

Chris M. Thomasson · Feb 22, 2010

Chris M. Thomasson said:
This code includes my sub-string search algorithm, the automated test, and
a stress test for my algorithm:

http://clc.pastebin.com/f5cb3bbd2

This testing framework helped me find several bugs in my algorithm. I have
fixed them and so far, so good... I will continue to blast the shi% out of
my algorithm with this test and report any errors here.

Here is latest version of my sub-string search algorithm:

http://clc.pastebin.com/f5c1d3e5a

This one allows search to skip many more characters than before. I don't
think its like Boyer-Moore because I am not using two tables. Humm...

Any thoughts?

Chris M. Thomasson · Feb 23, 2010

Chris M. Thomasson said:
Here is latest version of my sub-string search algorithm:

http://clc.pastebin.com/f5c1d3e5a

This one allows search to skip many more characters than before. I don't
think its like Boyer-Moore because I am not using two tables. Humm...

Any thoughts?

I would say it's close to the following algorithm:

http://en.wikipedia.org/wiki/Boyer-Moore-Horspool_algorithm

I think I am initializing the hash table a little bit differently. For
instance, my bad character hash hit is zero, while the Horspool algorithm
would have the length of the comparand stored in the table.

automated sub-string search/replace algorithm...	40	Feb 28, 2010
Test-driven development of random algorithms	2	Nov 14, 2006
[SUMMARY] Maximum Sub-Array (#131)	0	Jul 19, 2007
Algorithm for fuzzy string matching	1	Mar 23, 2011
[SUMMARY] Word Search Generator (#159)	3	Apr 18, 2008
Search a Text File for a String, Return String to Function	1	Jul 29, 2006
The devolution of English language and slothful c.l.p behaviors exposed!	50	Jan 24, 2012
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	15	Apr 1, 2006

automated test for sub-string search algorithms...

Chris M. Thomasson

Chris M. Thomasson

Chris M. Thomasson

Chris M. Thomasson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads