Do we need a more generic sequence searching algorithm?

J

jimxoch

Dear list,

I have recently implemented a generic sequence searching template
function, named single_pass_search, which is more generic than
std::search and has better worst case complexity at the same time.

More specifically, the main advantage of the single_pass_search over
the std::search is its capability to search for a sub-sequence in a
search-range that is accessed through a pair of single-pass (input)
iterators, while std::search requires that the search-range must be
accessed via forward iterators at least. This practically means that
single_pass_search can for example search a file directly through a
pair of istream iterators, without requiring the interference of an
intermediate container. (Please see "http://www.codeproject.com/KB/stl/
single_pass_search.aspx" for more.)

Although, I believe that single_pass_search can probably be
interesting in theory, I still have some doubts regarding its
practical usefulness to the average C++ developer. Hence, I would be
really grateful to anyone that will express his opinion regarding the
single_pass_search usefulness. Please don't try to be nice to me, I
need real feedback, however some reasoning together with your opinion
would be very welcomed.

Thank you very much,
Jim Xochellis
 
M

Michael DOUBEZ

jimxoch a écrit :
Dear list,

I have recently implemented a generic sequence searching template
function, named single_pass_search, which is more generic than
std::search and has better worst case complexity at the same time.

More specifically, the main advantage of the single_pass_search over
the std::search is its capability to search for a sub-sequence in a
search-range that is accessed through a pair of single-pass (input)
iterators, while std::search requires that the search-range must be
accessed via forward iterators at least.

Which is logical since it returns an iterator on pointing to the
beginning of the sequence searched. With an Input iterator, the only
thing you can is return on the iterator after the sequence and you have
some internal storage requirements (for backtracking or for state
machine) that search doesn't have.

[snip]
Although, I believe that single_pass_search can probably be
interesting in theory, I still have some doubts regarding its
practical usefulness to the average C++ developer.

Perhaps, if I needed to discard data from an incoming flow (waiting for
a synchronization sequence). But since I would have some timeout
somewhere, I don't know.
 
J

jimxoch

I have recently implemented a generic sequence searching template
Which is logical since it returns an iterator on pointing to the
beginning of the sequence searched. With an Input iterator, the only
thing you can is return on the iterator after the sequence and you have
some internal storage requirements (for backtracking or for state
machine) that search doesn't have.

No internal storage requirements for buffering and no backtracking at
all!
single_pass_search is using good-suffix shifting, instead of
buffering,
for better performance. This is why its complexity is also improved.
It only needs a small lookup table for the good-suffix shifting.

Thank you,
-Jim
 
M

Michael DOUBEZ

jimxoch a écrit :
[snip]
you have
some internal storage requirements (for backtracking or for state
machine) that search doesn't have.

No internal storage requirements for buffering and no backtracking at
all!
single_pass_search is using good-suffix shifting, instead of
buffering,
for better performance. This is why its complexity is also improved.
It only needs a small lookup table for the good-suffix shifting.

Do you mean you have a general search algorithm in o(n) complexity
without computing a state machine before (such as in KMP).

I am indeed interested in seeing it.
 
J

jimxoch

Do you mean you have a general search algorithm in o(n) complexity
without computing a state machine before (such as in KMP).

I am indeed interested in seeing it.

Sorry, I need no backtracking, but I still need a small lookup-table,
(see previous e-mail) which has to be constructed in a pre-processing
phase. The internal algorithm is actually a KMP amelioration. (Which
is based on the good-suffix shifting technique.)

References:
"http://www.codeproject.com/KB/stl/single_pass_search.aspx"
"http://www.cs.rpi.edu/~musser/gp/gensearch1.pdf"

-Jim
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,968
Messages
2,570,153
Members
46,701
Latest member
XavierQ83

Latest Threads

Top