Splitting strings

Alan Woodland

Hi,

I was looking for a clean, generic way to split strings around a
character using STL algorithms. The best I could manage was this
example, which isn't exactly great to say the least.

#include <cassert>
#include <vector>
#include <algorithm>
#include <functional>
#include <string>
#include <sstream>
#include <iterator>
#include <iostream>

namespace {
  template <typename T>
  struct SplitHelper {
    std::basic_ostringstream<typename T::value_type> next;
    std::vector<T> result;
    typename T::value_type match;

    // Predicate with a side effect: on a separator, store the token
    // collected so far and reset the stream.
    static bool test(SplitHelper& h, const typename T::value_type c) {
      if (c == h.match) {
        h.result.push_back(h.next.str());
        h.next.str(T());
      }

      return c == h.match;
    }
  };
}

template <typename T>
std::vector<T> split(const T& str, const typename T::value_type c='/') {
  SplitHelper<T> h;
  h.match = c;
  h.result.reserve(std::count(str.begin(), str.end(), c));
  std::remove_copy_if(str.begin(), str.end(),
                      std::ostream_iterator<typename T::value_type>(h.next),
                      std::bind1st(std::ptr_fun(&h.test), h));
  h.result.push_back(h.next.str());
  return h.result;
}

int main() {
  const std::string path = "Hello/cruel/world";
  const std::vector<std::string>& result = split(path);
  std::cout << result.size() << std::endl;
  assert(3==result.size());
  std::cout << result[0] << std::endl;
  std::cout << result[1] << std::endl;
  std::cout << result[2] << std::endl;
  return 0;
}

Is this really the tidiest way to do this using STL algorithms?
Obviously it wouldn't be hard at all to do just using a for loop and two
pointers, but I was trying to do this 'the STL way'.

Thanks for any suggestions,
Alan
 
Jeff Flinn

Alan said:
Hi,

I was looking for a clean, generic way to split strings around a
character using STL algorithms. The best I could manage was this
example, which isn't exactly great to say the least.

[SNIP]

Is this really the tidiest way to do this using STL algorithms?
Obviously it wouldn't be hard at all to do just using a for loop and two
pointers, but I was trying to do this 'the STL way'.

Thanks for any suggestions,
Alan

If you want to 'split' any string based on any character use
boost::tokenizer, regex, xpressive or spirit. If you want to walk a file
path use boost::filesystem.

See www.boost.org

Jeff
 
White Wolf

Jeff said:
Alan said:
Hi,

I was looking for a clean, generic way to split strings around a
character using STL algorithms. The best I could manage was this
example, which isn't exactly great to say the least. [SNIP]
Is this really the tidiest way to do this using STL algorithms?
Obviously it wouldn't be hard at all to do just using a for loop and two
pointers, but I was trying to do this 'the STL way'.

Thanks for any suggestions,
Alan

If you want to 'split' any string based on any character use
boost::tokenizer, regex, xpressive or spirit. If you want to walk a file
path use boost::filesystem.

See www.boost.org


I agree that boost.org provides the solution, but if we are into
that, then it is string_algo and split, and not the rest you mention.

However the OP's question was not how to split a path, or where he could
find a library that provides split functionality. He (see quotes) very
clearly said that a) he wants to make this and b) using STL algorithms
(not Boost, not for loops).

BR, WW
 
White Wolf

Alan Woodland wrote:
[SNIP]
Is this really the tidiest way to do this using STL algorithms?
Obviously it wouldn't be hard at all to do just using a for loop and two
pointers, but I was trying to do this 'the STL way'.

I rarely use the STL algorithms, so my approach may be stupid. But I
believe that what you need is not a copy_if kind of thing. What you do
is a sort of copying of all the elements, because even the separator
changes the output: it starts a new string. So plain copy will do:

output_iterator copy( input_iterator start, input_iterator end,
output_iterator dest );

Where start and end are the begin/end of your string.

The output iterator needs to be a special insert iterator that adapts
our "container" (std::vector<std::string>) into an output iterator that
will add the character to the last existing string if the character is
not a delimiter and add a new empty string otherwise.

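struct split_result {
  // std::back_insert_iterator looks these typedefs up on the container.
  typedef char value_type;
  typedef char const &const_reference;

  explicit split_result(char splitchar) : r(1), sc(splitchar) {}
  void push_back(char const &c) {
    if (c==sc) {
      r.push_back(std::string());  // separator: start a new, empty string
    } else {
      r.back()+=c;                 // anything else: append to the last string
    }
  }
  std::vector<std::string> const &result() const { return r; }
private:
  std::vector<std::string> r;
  const char sc;
};

split_result sr('/');
std::copy(str.begin(), str.end(),std::back_inserter(sr));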

I have not tried to compile this, it is off the top of my head, but I
think it demonstrates the idea.

BR, WW
 
red floyd

Alan said:
Hi,

I was looking for a clean, generic way to split strings around a
character using STL algorithms. The best I could manage was this
example, which isn't exactly great to say the least.

[SNIP]

Is this really the tidiest way to do this using STL algorithms?
Obviously it wouldn't be hard at all to do just using a for loop and two
pointers, but I was trying to do this 'the STL way'.

What's wrong with using an istringstream?
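
For what it's worth, here is a minimal sketch of what an istringstream version might look like, assuming the idea is to call std::getline with the separator as its delimiter. The name split_with_stream is made up for illustration and is not from anyone's post.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name invented for this sketch): reads
// separator-terminated chunks from a string stream, one per token.
std::vector<std::string> split_with_stream(const std::string& text,
                                           char separator = '/') {
    std::vector<std::string> tokens;
    std::istringstream in(text);
    std::string token;
    // std::getline extracts up to (and discards) the next separator.
    while (std::getline(in, token, separator)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Note that this behaves differently from the other versions in the thread at the edges: an empty input gives no tokens at all, and a trailing separator does not produce a final empty token. Whether that matters depends on what the split is supposed to do.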
 
Jeff Flinn

White said:
[SNIP]

I agree that boost.org provides the solution, but if we are into
that, then it is string_algo and split, and not the rest you mention.

Aah, forgot that, thanks.

White said:
However the OP's question was not how to split a path, or where he could

But his example is exactly that.

White said:
find a library that provides split functionality. He (see quotes) very
clearly said that a) he wants to make this and b) using STL algorithms
(not Boost, not for loops).

So why not broaden the OP's knowledge of the solution domain? If there
were a direct and easy way of doing this with standard algorithms (just
what does one mean by STL these days?), there would not have been all of
the aforementioned ways of skinning this cat. If boost is usable the OP
will use it; if not, it's a source for alternative methods that may or
may not be doable with C++ library or language facilities. As a matter
of fact, regex is part of C++ TR1.

Jeff Flinn
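
As a rough illustration of the regex route mentioned above, here is a sketch that assumes the C++11 <regex> header (the standardized descendant of the TR1 regex library) rather than TR1 or Boost itself. The name split_with_regex is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name invented for this sketch): splits `text`
// wherever `separator_pattern` matches.
std::vector<std::string> split_with_regex(
        const std::string& text,
        const std::string& separator_pattern = "/") {
    const std::regex separator(separator_pattern);
    // Submatch index -1 asks the token iterator for the text between
    // matches instead of the matches themselves.
    std::sregex_token_iterator first(text.begin(), text.end(), separator, -1);
    std::sregex_token_iterator last;  // default-constructed: end of sequence
    return std::vector<std::string>(first, last);
}
```

The separator is a pattern here, so a literal character that is special in regex syntax (a dot, for instance) would need escaping. That is heavier machinery than a single-character split needs, but it does show the job can be done with library facilities alone.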
 
James Kanze

Alan said:
I was looking for a clean, generic way to split strings around
a character using STL algorithms. The best I could manage was
this example, which isn't exactly great to say the least.
[...]
Is this really the tidiest way to do this using STL algorithms?

Certainly not. If it were, I don't think anyone would use the
STL. I haven't understood all of it, but I don't see why you
would need a stringstream, for example. Something as simple as:

std::vector< std::string >
split( std::string const& original, char separator = ':' )
{
    std::vector< std::string >
                        result;
    typedef std::string::const_iterator
                        TextIter;
    TextIter            end = original.end();
    TextIter            current
        = std::find( original.begin(), end, separator );
    result.push_back( std::string( original.begin(), current ) );
    while ( current != end ) {
        ++ current;
        TextIter            next
            = std::find( current, end, separator );
        result.push_back( std::string( current, next ) );
        current = next;
    }
    return result;
}

(This is just off the top of my head, so there may be some
issues with border conditions. For that matter, you haven't
really defined what the function should do precisely enough for
anyone to write it correctly.)

Alan said:
Obviously it wouldn't be hard at all to do just using a for
loop and two pointers, but I was trying to do this 'the STL
way'.

The STL way is to use iterators (instead of pointers) and
algorithms. You still need the outer loop; you could probably
design a special iterator, based on std::string::const_iterator,
and returning an std::string when dereferenced, then use
something like std::copy and a back inserter, but that's really
more complexity than you want. (If the standard iterators used
the GoF idiom, it would be very simple, but the STL idiom is
designed to make everything twice as complex as needs be.)
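
To make the special-iterator idea concrete, here is one possible sketch. Everything in it (the names split_iterator and split_with_iterator, the default '/' separator, the choice of an input iterator) is an assumption made for illustration, not something specified in the thread. It is meant to yield the same tokens as the std::find loop above, including an empty string after a trailing separator.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical input iterator: dereferencing yields the current token.
class split_iterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef std::string             value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef const std::string*      pointer;
    typedef const std::string&      reference;

    // A default-constructed iterator marks the end of the sequence.
    split_iterator() : cur_(), end_(), sep_(), done_(true) {}

    split_iterator(std::string::const_iterator first,
                   std::string::const_iterator last, char sep)
        : cur_(first), end_(last), sep_(sep), done_(false) { read(); }

    reference operator*() const { return token_; }
    pointer operator->() const { return &token_; }

    split_iterator& operator++() {
        if (cur_ == end_) {
            done_ = true;   // the last token has already been delivered
        } else {
            ++cur_;         // step over the separator
            read();         // and collect the token that follows it
        }
        return *this;
    }
    split_iterator operator++(int) {
        split_iterator old(*this);
        ++*this;
        return old;
    }

    friend bool operator==(const split_iterator& a, const split_iterator& b) {
        if (a.done_ || b.done_) return a.done_ == b.done_;
        return a.cur_ == b.cur_;
    }
    friend bool operator!=(const split_iterator& a, const split_iterator& b) {
        return !(a == b);
    }

private:
    // Copy [cur_, next separator) into token_ and park cur_ on the separator.
    void read() {
        std::string::const_iterator next = std::find(cur_, end_, sep_);
        token_.assign(cur_, next);
        cur_ = next;
    }

    std::string::const_iterator cur_, end_;
    char sep_;
    bool done_;
    std::string token_;
};

// The split itself is then a single std::copy into a back inserter.
std::vector<std::string> split_with_iterator(const std::string& text,
                                             char sep = '/') {
    std::vector<std::string> result;
    std::copy(split_iterator(text.begin(), text.end(), sep), split_iterator(),
              std::back_inserter(result));
    return result;
}
```

The call site ends up as one std::copy, but the iterator is several times longer than the plain loop, which rather supports the point about complexity.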
 
