allocators and string

Christopher

Admittedly, I've never had to worry about making a custom allocator. I
think that is what I need. I need to replace a broken buffer, where it
looks like the only real goal was to allocate large chunks of storage
at a time instead of every insertion.

Where do I start?

It isn't standard or documented how much a string allocates when an
insertion causes it to run out of space, is it?
I've heard things like "double the current."

If I want the string to allocate space for 50,000 elements at a
time, every time, how do I go about that?
 
Christopher

> I think STL allocators do not decide how much to allocate, they just
> allocate.
>
> You can wrap std::string in a custom class and use the capacity() and
> reserve() member functions for checking and enlarging the buffer.
> However, from your description it sounds like std::string might not be
> the best fit here. What problem are you trying to resolve?
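
A minimal sketch of the wrapper idea above, assuming a hypothetical class name `ChunkedString` and the 50,000-character step from the question. Note that `reserve()` only guarantees a capacity at least as large as the request; an implementation may round up or apply its own growth policy, so exact 50,000-element steps cannot be enforced this way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical wrapper (not from the thread): grows the underlying
// std::string in multiples of a fixed step via capacity()/reserve().
class ChunkedString {
public:
    explicit ChunkedString(std::size_t step = 50000) : step_(step) {
        assert(step_ > 0);
        buf_.reserve(step_);  // first chunk up front
    }

    void append(const char* data, std::size_t len) {
        const std::size_t needed = buf_.size() + len;
        if (needed > buf_.capacity()) {
            // Round the request up to the next multiple of step_.
            const std::size_t steps = (needed + step_ - 1) / step_;
            buf_.reserve(steps * step_);
        }
        buf_.append(data, len);
    }

    const std::string& str() const { return buf_; }
    std::size_t capacity() const { return buf_.capacity(); }

private:
    std::size_t step_;
    std::string buf_;
};
```

One design caveat: growing by a fixed step means each growth may copy the whole buffer, so for very large inputs the total copying grows quadratically with the data size. That is the usual argument for geometric growth instead.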


Well, the broken implementation that I am trying to replace is just a
buffer to grab XML from an http response. The author evidently tried
to make it faster by allocating configurable-sized chunks of chars and
adding them to a linked list as they come in, then, when extracting,
allocating one buffer after calculating the total size of all chunks
and copying them into it. It doesn't seem to make much of a difference,
and it throws exceptions which I've given up trying to fix. I read about
something similar called "simple segregation" when looking up allocators.

The things I know of that would affect the design are:
* I get variable sized char data from the http library.
* I get the data in multiple calls and need to append often.
* The total size of the data tends to be very large, ~50,000 characters
worth of xml document on average, but can range from tiny up to a gig of data.
* I need to give the caller, who wishes to extract the entire contents,
everything in contiguous memory at once.
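
For reference, the chunk-list design described above can be sketched roughly as follows. This is not the original author's code: the class name `ChunkListBuffer` is hypothetical, and for simplicity each incoming piece is stored as its own chunk rather than in configurable-sized blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical sketch: keep incoming pieces in a linked list, then
// allocate once and copy everything when the caller extracts.
class ChunkListBuffer {
public:
    void append(const char* data, std::size_t len) {
        chunks_.emplace_back(data, len);  // one allocation per piece
        total_ += len;
    }

    // Returns the whole contents in one contiguous string.
    std::string extract() const {
        std::string out;
        out.reserve(total_);              // single allocation of the full size
        for (const std::string& chunk : chunks_) {
            out += chunk;
        }
        assert(out.size() == total_);
        return out;
    }

    std::size_t size() const { return total_; }

private:
    std::list<std::string> chunks_;
    std::size_t total_ = 0;
};
```

The trade-off is visible in the sketch: appends never move existing data, but extraction copies every byte once more and briefly holds two copies of the contents in memory.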
 
Christopher

P.S. I could just use std::wstring as is, but if speed by eliminating
allocations is possible, it would be a good time for me to learn how
to do it instead of always depending on default STL behavior.
 
Goran

> Well, the broken implementation that I am trying to replace is just a
> buffer to grab XML from an http response. The author evidently tried
> to make it faster by allocating configurable-sized chunks of chars and
> adding them to a linked list as they come in, then, when extracting,
> allocating one buffer after calculating the total size of all chunks
> and copying them into it. It doesn't seem to make much of a difference,
> and it throws exceptions which I've given up trying to fix. I read about
> something similar called "simple segregation" when looking up allocators.
>
> The things I know of that would affect the design are:
> * I get variable sized char data from the http library.
> * I get the data in multiple calls and need to append often.
> * The total size of the data tends to be very large, ~50,000 characters
> worth of xml document on average, but can range from tiny up to a gig of data.
> * I need to give the caller, who wishes to extract the entire contents,
> everything in contiguous memory at once.

That's a shame. Normally, XML parsers know how to receive some sort of
a stream and work off that. Streams get rid of all your
considerations.

That said, if you know approx. max size, perhaps, as Paavo said, you
should play with reserve(). std::vector might be a better fit, too.

Goran.
 
Christopher

> Wow, 1G XML files! I hope you have 64-bit compilations.

Yea, tell me about it. I've never seen a project where anything and
everything is represented as xml rather than having a domain and
classes for representing business objects. It is out of my hands. It's
ironic that there is an issue with speed and a lesser issue with
memory usage.

> I assume that the size of the packet is not known in advance, otherwise
> it would be easy to allocate the right buffer in the beginning.

Correct.

> So what was the problem again? Are you concerned about speed, memory
> usage or something else?

Speed primarily. Memory also, but not as much as speed.

> Have you actually verified that standard
> std::string append functionality does not meet your needs? The
> exponential growth of the buffer is specifically designed to give
> amortized linear complexity in such usage scenarios, without knowing the
> required buffer size in advance it would be hard to compete with that.
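
One rough way to check the growth behaviour on a given implementation is to count how often `capacity()` changes while appending. The helper name `count_reallocations` below is hypothetical, and the exact growth factor is implementation-specific, so the count will differ between standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends `pieces` blocks of `piece_len` chars to an
// empty std::string and counts how many times its capacity changes.
std::size_t count_reallocations(std::size_t pieces, std::size_t piece_len) {
    assert(piece_len > 0);
    const std::string piece(piece_len, 'x');
    std::string buf;
    std::size_t reallocs = 0;
    std::size_t last_cap = buf.capacity();
    for (std::size_t i = 0; i < pieces; ++i) {
        buf.append(piece);
        if (buf.capacity() != last_cap) {  // buffer was regrown
            ++reallocs;
            last_cap = buf.capacity();
        }
    }
    return reallocs;
}
```

With geometric growth, appending a thousand 1,000-character pieces should trigger only a few dozen reallocations at most rather than one per append, which is what makes a hand-rolled chunk scheme hard to beat without a real measurement.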

Well, who is to say that std::string doesn't suit my needs until there
is something to compare it against. It is more of an investigation
into whether or not a better speed could be achieved. The original
author of the broken buffer claimed to achieve better speed, but the
performance tests are skewed and incorrect anyway.
 
