Implementing deque with a couple of vectors?


chris

Hello,

I've recently been trying to understand the various structures supplied
by C++, and the one I find most confusing is deque.

One quick question about this. It seems most implementations of deque
are quite complex. Why couldn't I implement deque with a couple of
vectors V, W, where the deque is the elements of V in reverse order
followed by W? This would appear to me to satisfy all the conditions,
and be significantly simpler. Am I missing something obvious?
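
For concreteness, I mean something along these lines (just a rough sketch
with made-up names, not a real implementation):

// V holds the front half in reverse order, W holds the back half, so the
// deque reads V[V.size()-1], ..., V[0], W[0], ..., W[W.size()-1].
#include <cstddef>
#include <vector>

template <class T>
class TwoVectorDeque {
    std::vector<T> V;   // front half, stored reversed
    std::vector<T> W;   // back half, stored in order
public:
    void push_front(const T& x) { V.push_back(x); }
    void push_back (const T& x) { W.push_back(x); }

    std::size_t size() const { return V.size() + W.size(); }

    T& operator[](std::size_t i) {
        if (i < V.size())
            return V[V.size() - 1 - i];   // element lives in the reversed half
        return W[i - V.size()];           // element lives in the ordinary half
    }

    void pop_front() { V.pop_back(); }    // assumes V is non-empty
    void pop_back () { W.pop_back(); }    // assumes W is non-empty
};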

Chris
 

Catalin Pitis

chris said:
Hello,

I've recently been trying to understand the various structures supplied by
C++, and the one I find most confusing is deque.

One quick question about this. It seems most implementations of deque are
quite complex. Why couldn't I implement deque with a couple of vectors V, W,
where the deque is the elements of V in reverse order followed by W? This
would appear to me to satisfy all the conditions, and be significantly
simpler. Am I missing something obvious?

Chris

There are some differences between vectors and deques. A deque is very
efficient at front and back insertions (constant time). A deque doesn't
store the elements in one contiguous memory block, like a vector does, but
in linked blocks. This means that it doesn't reallocate memory when the size
becomes greater than the capacity (as a vector does), but instead
allocates a new memory block and links it to the last block used.

br/
Catalin
 

Phlip

Catalin said:
There are some differences between vectors and deques. A deque is very
efficient at front and back insertions (constant time). A deque doesn't
store the elements in one contiguous memory block, like a vector does, but
in linked blocks. This means that it doesn't reallocate memory when the size
becomes greater than the capacity (as a vector does), but instead
allocates a new memory block and links it to the last block used.

Herb Sutter recommends deque as the default container without a reason to
use another.

New questions:

Is deque more time efficient than vector? Is it more memory efficient than
list?

deque sounds similar to a data structure which I'll call "Rope". That's
designed for extra-long and highly dynamic strings. When you insert a
character into the middle of a block, Rope might split the block, and
might put the new character into the lower block, with a reserved area after
the character, anticipating more. This naturally tunes for editors.

Does deque split blocks like that at insert time?

Block splitting requires eventual heap compaction. How could deque support
that? (Custom memory allocators need not apply!)
 

chris

Catalin said:
There are some differences between vectors and deques. A deque is very
efficient at front and back insertions (constant time). A deque doesn't
store the elements in one contiguous memory block, like a vector does, but
in linked blocks. This means that it doesn't reallocate memory when the size
becomes greater than the capacity (as a vector does), but instead
allocates a new memory block and links it to the last block used.

I've read claims that a deque can do constant-time insertions at the
beginning and end and constant-time random access. However, is it possible
to satisfy both of these requirements? I suspect that the insertions at
the beginning and end are only required to be amortized constant time?
If constant-time random access is to be achieved, then we can't simply
be using a list of blocks, else accessing an arbitrary element would be
linear?

Chris
 

Richard Herring

chris said:
Hello,

I've recently been trying to understand the various structures supplied
by C++, and the one I find most confusing is deque.

One quick question about this. It seems most implementations of deque
are quite complex. Why couldn't I implement deque with a couple of
vectors V, W, where the deque is the elements of V in reverse order
followed by W? This would appear to me to satisfy all the conditions,
and be significantly simpler. Am I missing something obvious?

What happens when you repeatedly pop from one end until one of the two
vectors is empty?
 

chris

Richard said:
What happens when you repeatedly pop from one end until one of the two
vectors is empty?

Ah yes, I see your point. If you get pushing on one end and popping from
the other, then one of the two vectors would continue to grow bigger and
bigger over time while the deque remained the same size. (I'm assuming
that this kind of pushing on one end and popping off the other
should take a vaguely constant amount of memory, although I'm not
convinced the standard is clear about that requirement.)

However, whenever we have to resize one of the vectors, we could
take the opportunity to do a rebalancing between the two vectors, which
could take twice as long as just resizing one vector but would still
have the same complexity, I believe (although I would have to think about
it...).
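
Roughly, I'm imagining something like this for the rebalance (just a
sketch, with made-up names, ignoring efficiency and exception safety):

#include <cstddef>
#include <vector>

// Gather the whole deque in front-to-back order, then split it evenly
// again. V holds the front half reversed, W holds the back half in order.
template <class T>
void rebalance(std::vector<T>& V, std::vector<T>& W) {
    std::vector<T> all(V.rbegin(), V.rend());    // front half, front-to-back
    all.insert(all.end(), W.begin(), W.end());   // followed by the back half

    const std::size_t half = all.size() / 2;

    // Refill V with the first `half` elements in reverse order,
    // and W with the remaining elements in normal order.
    V.assign(all.rend() - half, all.rend());
    W.assign(all.begin() + half, all.end());
}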

Chris
 

Ioannis Vranos

Phlip said:
Herb Sutter recommends deque as the default container without a reason to
use another.

New questions:

Is deque more time efficient than vector? Is it more memory efficient than
list?


TC++PL 3 has a container comparison table on page 464 (§17.1.2).

From that table:

          []      List        Front       Back        Iterators
                  operations  operations  operations

vector    const   O(n)+                   const+      Ran

deque     const   O(n)        const       const       Ran


"In the iterators column, Ran means random-access iterator and Bi means
bidirectional iterator; the operations for a bidirectional operator are
a subset of those of a random-access iterator (19.2.1).

Other entries are measures of the efficiency of the operations. A const
entry means the operation takes an amount of time that does not depend
on the number of elements in the container. Another conventional
notation for constant time is O(1). An O(n) entry means the entry takes
time proportional to the number of elements involved. A + suffix
indicates that occasionally a significant extra cost is incurred. For
example, inserting an element into a list has a fixed cost (so it is
listed as const), whereas the same operation on a vector involves moving
the elements following the insertion point (so it is listed as O(n)).

Occasionally, all elements must be relocated (so I added a +).
The "big O" notation is conventional. I added the + for the benefit of
programmers who care about predictability in addition to average
performance. A conventional term for O(n)+ is amortized linear time.

Naturally, if a constant is large it can dwarf a small cost proportional
to the number of elements. However, for large data structures const
tends to mean "cheap", O(n) to mean "expensive", and O(log(n)) to mean
"fairly cheap". For even moderately large values of n, O(log(n)) is
closer to constant time than to O(n). People who care about cost must
take a closer look. In particular, they must understand what elements
are counted to get the n. No basic operation is "very expensive", that
is, O(n*n) or worse.

Except for string, the measures of costs listed here reflect
requirements in the standard. The string estimates are my assumptions.

These measures of complexity and cost are upper bounds. The measures
exist to give users some guidance as to what they can expect from
implementations. Naturally, implementers will try to do better in
important cases."
 

Ioannis Vranos

chris said:
I've read claims that a deque can do constant-time insertions at the
beginning and end and constant-time random access. However, is it possible
to satisfy both of these requirements? I suspect that the insertions at
the beginning and end are only required to be amortized constant time?
If constant-time random access is to be achieved, then we can't simply
be using a list of blocks, else accessing an arbitrary element would be
linear?


Since it is a small paragraph, here is what TC++PL 3 says about deque:


"17.2.3 Deque

A deque (it rhymes with check) is a double-ended queue. That is, a deque
is a sequence optimized so that operations at both ends are about as
efficient as for a list, whereas subscripting approaches the efficiency
of a vector:

template <class T, class A = allocator<T> > class std::deque {
    // types and operations like vector (§16.3.3, §16.3.5, §16.3.6)
    // plus front operations (§17.2.2.2) like list
};

Insertion and deletion of elements "in the middle" have vector-like
(in)efficiencies rather than list-like efficiencies. Consequently, a
deque is used where additions and deletions take place "at the ends".

For example, we might use a deque to model a section of a railroad or to
represent a deck of cards in a game:

deque<car> siding_no_3;
deque<Card> bonus;"
 

Herb Sutter

Phlip said:
Herb Sutter recommends deque as the default container without a reason to
use another.

I changed my mind a couple of years ago. Bjarne convinced me.

In C++ Coding Standards, due on bookstore shelves in the next two weeks,
you'll find the first three Items in the STL Containers section have
titles that begin with "use vector." In particular:

76. Use vector by default. Otherwise, choose an appropriate container.
77. Use vector and string instead of arrays.
78. Use vector (and string::c_str) to exchange data with non-C++ APIs.

http://www.gotw.ca/publications/coding.htm

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)
 

chris

Ioannis said:
Phlip said:
Herb Sutter recommends deque as the default container without a reason to
use another.

New questions:

Is deque more time efficient than vector? Is it more memory efficient than
list?



TC++PL 3 has a container comparison table on page 464 (§17.1.2).

From that table:

          []      List        Front       Back        Iterators
                  operations  operations  operations

vector    const   O(n)+                   const+      Ran

deque     const   O(n)        const       const       Ran

Thanks, this explains my mistake. I thought that deque only guaranteed
const+ on front and back operations, but it offers the stronger
guarantee const.

This is slightly misleading, as all the implementations I've seen only
promise a constant number of copy operations on elements of the deque. The
messing around with internal structures might still be const+ (i.e.
amortised constant time), and I believe this is unavoidable.

Chris
 

Ioannis Vranos

Herb said:
I changed my mind a couple of years ago. Bjarne convinced me.

In C++ Coding Standards, due on bookstore shelves in the next two weeks,
you'll find the first three Items in the STL Containers section have
titles that begin with "use vector." In particular:

76. Use vector by default. Otherwise, choose an appropriate container.
77. Use vector and string instead of arrays.
78. Use vector (and string::c_str) to exchange data with non-C++ APIs.

http://www.gotw.ca/publications/coding.htm


Now that you are here. :)

Making pin_ptr and interior_ptr possible for ref classes makes
interoperability with non-CLI code easier, where there are no CLI
specialisations.

So why not have pin_ptr and interior_ptr for ref and value types alike?

And what about arrays with stack semantics/deterministic destruction?

array<SomeClass> somearray;
 

Jerry Coffin

chris said:
[ ... ]
I've read claims that a deque can do constant-time insertions at the
beginning and end and constant-time random access. However, is it possible
to satisfy both of these requirements? I suspect that the insertions at
the beginning and end are only required to be amortized constant time?

Yes, amortized constant, not unqualified constant.
If constant-time random access is to be achieved, then we can't simply
be using a list of blocks, else accessing an arbitrary element would be
linear?

It can't be implemented (properly) using a linked list of blocks.
Normally it's done with a vector of (pointers to) blocks. You normally
pre-allocate space for a fairly large number of pointers and then
start filling it around the middle.

Occasionally, this vector of pointers will itself need to be resized,
but since this is an amortized constant operation itself, it can be
done as part of the amortized constant operation on the overall deque.

This gives constant-speed access to the deque, though the constant is
often slightly larger than for a vector. On a typical modern machine,
the difference will only be visible statistically, though: computing
the right address typically adds only a couple of clock cycles to a
process that varies from a few clocks (if the value is in L1 cache) to
thousands or even millions of cycles (if the value has been paged out
to disk, possibly across a network...).
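
The address computation amounts to something like this (a sketch only,
with invented names; not any particular library's code, and construction
and growth are omitted):

#include <cstddef>
#include <vector>

// A vector of pointers to fixed-size blocks ("the map"), plus the offset
// of the first element within the first used block.
template <class T>
class DequeSketch {
    static const std::size_t BLOCK = 64; // elements per block
    std::vector<T*> blocks_;             // pointers to allocated blocks
    std::size_t front_offset_;           // index of element 0 inside blocks_[0]
    std::size_t size_;                   // number of elements stored
public:
    T& operator[](std::size_t i) {
        std::size_t pos = front_offset_ + i;       // position counted from block 0
        return blocks_[pos / BLOCK][pos % BLOCK];  // one divide + two indexings: O(1)
    }
};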
 

Siemel Naran

Herb Sutter said:
I changed my mind a couple of years ago. Bjarne convinced me.

In C++ Coding Standards, due on bookstore shelves in the next two weeks,
you'll find the first three Items in the STL Containers section have
titles that begin with "use vector." In particular:

76. Use vector by default. Otherwise, choose an appropriate container.
77. Use vector and string instead of arrays.
78. Use vector (and string::c_str) to exchange data with non-C++ APIs.

http://www.gotw.ca/publications/coding.htm

That link is not working.
 

Uwe Schnitker

Herb Sutter said:
I changed my mind a couple of years ago. Bjarne convinced me.

May I ask: How did he manage to do that?
[Apart from the fact that he is your series editor ;-) ]

You never mentioned that change of mind in your writing, AFAIK.

Regards,

Uwe
 

Phlip

Herb said:
I changed my mind a couple of years ago. Bjarne convinced me.

Apologies for not tracking your mind for a couple of years. ;-)

In C++ Coding Standards, due on bookstore shelves in the next two weeks,
you'll find the first three Items in the STL Containers section have
titles that begin with "use vector." In particular:

76. Use vector by default. Otherwise, choose an appropriate container.
77. Use vector and string instead of arrays.
78. Use vector (and string::c_str) to exchange data with non-C++ APIs.

http://www.gotw.ca/publications/coding.htm

Are there Standard or implementation-specific answers to my other questions?

I ask them because, back in the olden days, I could run UEdit on AmigaDOS,
with a program that displayed memory as a stack of colored blocks. When I
typed a letter into the middle of a long document, I could see its block
split. Multiple edits here and there would fragment memory, and each block
would reserve area after its insertion point, getting ready for more. Then
if I idled the system's inputs, after UEdit's idle timer went off I could
see blocks shuffle around, until memory contained one packed list of
sequential blocks.

(I'm asking about the block splitting algorithms here - not about their
relations to physical memory. Virtual memory managers take care of much of
that nowadays...)
 

Herb Sutter

Ioannis said:
Now that you are here. :)

Making pin_ptr and interior_ptr possible for ref classes makes
interoperability with non-CLI code easier, where there are no CLI
specialisations.

So why not have pin_ptr and interior_ptr for ref and value types alike?

And what about arrays with stack semantics/deterministic destruction?

array<SomeClass> somearray;

We didn't get to those things in this release, but left open room for
those extensions.

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)
 

Ivan Vecerina

Phlip said:
Are there Standard or implementation-specific answers to my other
questions?

[once upon a time, Phlip wrote:]
Does deque split blocks like that at insert time?

No. The typical implementation of deque is a vector of fixed-size arrays,
with only the first and last of these arrays not being 'filled'.
It would probably be impossible to ensure constant-time random access
to deque's elements if the allocated blocks had a variable size.
But I can't think right now of another deque requirement that would
prevent the implementation you are thinking of.

Block splitting requires eventual heap compaction. How could deque support
that? (Custom memory allocators need not apply!)

For memory management, deque is restricted to what the interface
of its allocator (template parameter) provides: allocate & deallocate.
(This is already an annoyance for std::vector, which cannot 'realloc'.)

I ask them because, back in the olden days, I could run UEdit on AmigaDOS,
with a program that displayed memory as a stack of colored blocks. When I
typed a letter into the middle of a long document, I could see its block
split. Multiple edits here and there would fragment memory, and each block
would reserve area after its insertion point, getting ready for more. Then
if I idled the system's inputs, after UEdit's idle timer went off I could
see blocks shuffle around, until memory contained one packed list of
sequential blocks.

Such a 'rope' container is provided in the SGI STL (and probably in its
STLport derivative, and maybe others), but hasn't made it into the standard.
See: http://www.sgi.com/tech/stl/Rope.html
 

Ioannis Vranos

chris said:
Ah yes, I see your point. If you get pushing on one end and popping from
the other, then one of the two vectors would continue to grow bigger and
bigger over time while the deque remained the same size. (I'm assuming
that this kind of pushing on one end and popping off the other
should take a vaguely constant amount of memory, although I'm not
convinced the standard is clear about that requirement.)

However, whenever we have to resize one of the vectors, we could
take the opportunity to do a rebalancing between the two vectors, which
could take twice as long as just resizing one vector but would still
have the same complexity, I believe (although I would have to think about
it...).


Here is something I found on the web:


"In the HP implementation of the STL, a deque is implemented as a
collection of fixed sized buffers which actually hold the data. Another
structure, the buffer map, holds pointers to the currently allocated
data buffers".
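
As a rough illustration of that scheme (invented names, only push_front
shown; destructor, indexing and the rest omitted):

#include <cstddef>
#include <vector>

// Fixed-size buffers hold the data; a separate vector of pointers
// (the "buffer map") tracks them.
template <class T>
class BufferMapSketch {
    static const std::size_t BUF = 64;  // elements per buffer
    std::vector<T*> map_;               // pointers to allocated buffers
    std::size_t front_index_;           // global index of the first element
public:
    BufferMapSketch() : map_(1, new T[BUF]), front_index_(BUF / 2) {}

    void push_front(const T& x) {
        if (front_index_ == 0) {
            // Front buffer is full: allocate a new buffer and prepend its
            // pointer to the map. Only pointers move; elements stay put.
            map_.insert(map_.begin(), new T[BUF]);
            front_index_ = BUF;
        }
        --front_index_;
        map_[front_index_ / BUF][front_index_ % BUF] = x;
    }
};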
 

Tom Widmer

chris said:
Ah yes, I see your point. If you get pushing on one end and popping from
the other, then one of the two vectors would continue to grow bigger and
bigger over time while the deque remained the same size. (I'm assuming
that this kind of pushing on one end and popping off the other
should take a vaguely constant amount of memory, although I'm not
convinced the standard is clear about that requirement.)

However, whenever we have to resize one of the vectors, we could
take the opportunity to do a rebalancing between the two vectors, which
could take twice as long as just resizing one vector but would still
have the same complexity, I believe (although I would have to think about
it...).

That would break the iterator invalidation requirements of deque -
push_back and push_front must not invalidate references. This is why
using two (or any constant number of) vectors won't work.
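
For example, this is guaranteed for std::deque, but the two-vector scheme
couldn't honour it once a reallocation moves elements:

#include <cassert>
#include <deque>
#include <iostream>

// push_front/push_back on a std::deque must not invalidate references to
// existing elements (iterators are another matter), whereas a growing
// std::vector may move its elements and so invalidate references.
int main() {
    std::deque<int> d;
    d.push_back(42);
    int& r = d.front();          // reference to the element 42

    for (int i = 0; i < 100000; ++i) {
        d.push_front(i);         // must leave r valid
        d.push_back(i);          // must leave r valid
    }

    std::cout << r << '\n';      // still prints 42
    assert(r == 42);
    return 0;
}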

A degenerate implementation of deque is to have a vector of pointers
to elements, combined with an offset for the start element and an
offset for the end element. This is basically how Dinkumware's
implementation works for largish element types, and it has some
advantages (e.g. inserting elements anywhere doesn't require any
assignment or copying except within the vector<T*>, and doesn't
invalidate references (but will invalidate iterators, I think)).
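
Very roughly, something like this (a sketch only; invented names, with
growth when full, cleanup and exception safety omitted):

#include <cassert>
#include <cstddef>
#include <vector>

// A circular buffer of pointers to individually allocated elements, with
// an offset marking the first element. Elements never move, so references
// to them stay valid; only the T* pointers are shuffled when the buffer
// would grow.
template <class T>
class PtrDequeSketch {
    std::vector<T*> ptrs_;   // circular buffer of element pointers
    std::size_t first_;      // position of element 0 within ptrs_
    std::size_t size_;       // number of stored elements
public:
    explicit PtrDequeSketch(std::size_t cap = 16)
        : ptrs_(cap), first_(0), size_(0) {}

    T& operator[](std::size_t i) {
        return *ptrs_[(first_ + i) % ptrs_.size()]; // constant-time access
    }

    void push_back(const T& x) {
        assert(size_ < ptrs_.size());               // growth omitted in this sketch
        ptrs_[(first_ + size_) % ptrs_.size()] = new T(x);
        ++size_;
    }

    void push_front(const T& x) {
        assert(size_ < ptrs_.size());               // growth omitted in this sketch
        first_ = (first_ + ptrs_.size() - 1) % ptrs_.size();
        ptrs_[first_] = new T(x);
        ++size_;
    }
};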

Tom
 
