Hi!
Looking at the various solutions to the original problem, I wanted to
state my design goals so one could make a reasonable decision about
which code to use.
I try to keep the algorithmic complexity low. I try to reuse code; that
is, I use the STL, and therefore I stick to its idioms.
Hmm... I see a lot of really complex and strange code here when it's
not really necessary. Most of what people posted requires multiple
passes through the string, or a lot of shifting of bytes around (e.g.
something like Paavo's "while (string contains char) remove_char" is
going to do -way- more moving of data than necessary -- it shifts the
entire end of the string back every time through the loop). Sticking
to generic STL calls for finding and removing characters in the string
gains you nothing unless you are going to be finding and removing
elements from generic containers that don't provide random access
iterators (in which case the generic programming is a benefit). The
use of remove_if, such as in Frank's example, will get you equal
performance to the example below (remove_if may very well be
implemented the same way), except Frank's sort + binary search is
likely to have more overhead than a simple linear search for your
original requirements of removing a set of 3 or 4 bad characters only.
For removing large character sets, a binary search will perform
better, and the sort is unnecessary if the input is sorted to begin
with. But you can do the search in -constant- time, with no
pre-sorting either, if you make some assumptions about the max value
of a char and load valid characters into a lookup table first. You
know that you are using a string (or any type with random access
iterators). Just do something like this:
In-place, single pass through string, no unnecessary copies or moves:
void remove_chars (const string &bad, string &str) {
    string::iterator s, d;
    for (s = str.begin(), d = s; s != str.end(); ++ s)
        if (bad.find(*s) == string::npos)
            *(d ++) = *s;
    str.resize(d - str.begin());
}
That works because 'd' will always be behind or at the same position
as 's'. That in-place version can be made to work with generic
iterators as well as random access iterators if you replace the
resize() call with "erase(d, str.end())". Here is the same thing, but
placing the result in a destination buffer:
void remove_chars (const string &bad, const string &str, string &clean) {
    string::const_iterator s;
    clean = "";
    clean.reserve(str.size()); // do not perform extra realloc + copies.
    for (s = str.begin(); s != str.end(); ++ s)
        if (bad.find(*s) == string::npos)
            clean += *s;
}
Example use:
{
    string s = "a m|e|s|s|y s,t,r,i,n,g", c;
    remove_chars("mesy", s, c);
    remove_chars("|,", s);
    cout << c << endl << s << endl;
}
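For what it's worth, here is a rough sketch of the lookup table idea mentioned above, in case the set of bad characters ever gets large. The name remove_chars_table is made up for this post, and the sketch assumes a char fits in the range 0..UCHAR_MAX, so one flag per possible value is enough. It marks the bad characters in the table rather than the valid ones, and it uses erase() instead of resize() at the end. Treat it as an illustration, not a tuned implementation:

```cpp
#include <climits>
#include <string>

// Hypothetical helper (not from the code above): same in-place,
// single-pass removal, but the "is this char bad?" test is a table
// lookup instead of a linear search through 'bad'.
void remove_chars_table (const std::string &bad, std::string &str) {
    // One flag per possible unsigned char value; assumes the table
    // size UCHAR_MAX + 1 is small enough to sit on the stack.
    bool is_bad[UCHAR_MAX + 1] = { false };
    for (std::string::const_iterator b = bad.begin(); b != bad.end(); ++ b)
        is_bad[static_cast<unsigned char>(*b)] = true;

    // 'd' trails 's', so kept characters are never overwritten
    // before they are read.
    std::string::iterator s, d;
    for (s = str.begin(), d = s; s != str.end(); ++ s)
        if (!is_bad[static_cast<unsigned char>(*s)])
            *(d ++) = *s;
    str.erase(d, str.end());
}
```

Called as remove_chars_table("|,", s) on the example string above, it leaves "a messy string" in s, same as the linear search version. For 3 or 4 bad characters the table setup probably costs more than it saves, so this only pays off for bigger sets or long strings.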
Jason