vector acces by index/iterator

A

Alex Vinokur

Peter van Merkerk said:
Alex Vinokur wrote: [snip]
What information about standard library implementation for g++ version 3.3.1 (cygming special) should be supplied?

Just the fact that you use it. The compiler and the standard library
implementation that comes with it are not one and the same thing. One
might decide not to use the standard library implementation that comes
with the compiler but a third party one instead.
[snip]

How can we know which implementation of the standard library is used (in g++)?
 
S

SaltPeter

Peter van Merkerk said:
"constant time" should not be taken too literally on many platforms.
Thanks to caching and virtual memory accessing of raw memory isn't a
constant time operation. Consequently it cannot be guaranteed that
accessing a element of a vector (or a C style array for that matter)
takes a fixed amount of time. I.e. accessing one element may take in the
order of nanoseconds, while accesing another element in the same vector
or array may take tens of milliseconds if it triggers a page fault. This
is concern when working with large datasets that don't fit into the
cache or physical memory.

Agreed, constant time should not be thought of as a fixed, literal value. In
fact, what i wrote is wrong. But i'm glad you've brought up the issue of
page faults and data set size. 2 of the many parameters that the standard
has no control over.
 
T

tom_usenet

Peter van Merkerk said:
Alex Vinokur wrote: [snip]
What information about standard library implementation for g++ version 3.3.1 (cygming special) should be supplied?

Just the fact that you use it. The compiler and the standard library
implementation that comes with it are not one and the same thing. One
might decide not to use the standard library implementation that comes
with the compiler but a third party one instead.
[snip]

How can we know which implementation of the standard library is used (in g++)?

If you haven't changed it, it will be libstdc++3 -
http://gcc.gnu.org/libstdc++/

GCC also works with STLPort, SGI's STL and Dinkumware's library, and
possibly others.

Tom
 
S

SaltPeter

Rob Williscroft said:
SaltPeter wrote in in
comp.lang.c++:


All this means is there is a *fixed* upper bound on access time, and also
that this upper bound is independant of the number of elements in the
container (*), it doesn't imply that access to specific elements might
not be faster than access to others.

Thats a good description you make: "an upper_bound to an access-time range".
As i already mentioned elsewhere, what i wrote is incorrect. Beleive it or
not, i've never equated "constant time" with a finite value.
For example on a 64 bit machine with 32 bit int's std::vector< int >
(or int array[ N ]) might have access to odd elements taking twice the
time as access to even elements. In such a case its the time to access
the odd elements that determines the fixed upper bound.

Thanks for the example. Another reason to convince myself that time is not
always what it seems.
 
R

Richard Herring

SaltPeter said:
Richard Herring said:
SaltPeter said:
news:[email protected]... [...]

My thinking is that since both an array and a vector use a single
contiguous
block of memory and that to access an individual element in either an
array or
vector requires the same pointer arithmetic that the performance would be
comparable in the presence of inlining for the vector operator[].

The standard guarentees that *any* element of the vector container is
accessed using a constant time period (so any element is accessed with the
same access period as the container's first element). There is a
misconception out there that a vector is like a list where each element
pointing to the next. This is not the case.

It's not a misconception I've noticed. YMMV.
Vectors use random iterators.
These aren't available to either the array or std::list.

Arrays have random-access iterators. They're called pointers.

A {random-access, bidirectional, forward, input, output} iterator is
anything satisfying the requirements of {random-access, bidirectional,
forward, input, output} iterator (24.1). A pointer into an array does
everything that's needed for a random-access iterator, therefore it is
one. As witness the fact that many implementations of std::vector use
raw pointers as their iterators.
And the array
doesn't even have a conception of what an iterator is.

The only iterator-related things that arrays lack are the functions
which return them.

Yep, well i deserved that response.

What i was refering to was the support for the said iterators as your last
comment mentions. In other words, an array doesn't have an end(). :)

Well, boost::array makes up for that deficiency.
 
K

Kai-Uwe Bux

Karl said:
Kai-Uwe Bux said:
Hilarious. Try running your tests in release mode with optimization.
You'll throw the array out the door. The larger the data set, the
farther the array will fall.

Are you saying that a vector could actually end up being faster than an
array with respect to retrieving items through operator[]?

I have seen the results of performance tests that seem to indicate this
possibility.

I am wondering how this is possible though.

My thinking is that since both an array and a vector use a single
contiguous block of memory and that to access an individual element in
either an array or vector requires the same pointer arithmetic that the
performance would be comparable in the presence of inlining for the
vector operator[].

Could you illustrate a scenario in which a vector might be faster?

Hi,

just whipped up some test code. Sorry for including the sys/timeb.h
header. Does someone know a standard way of doing this?

#include <ctime>
clock_t ticks = clock();
Thanks.
just a measurement on my machine (2.4GH, g++3.4.0 -O3):

457
601

Same here: VC++ 6.0, 2.4 Ghz

Debug:
vector 3485
array 581

Release:
vector 511
array 761


Now I think, std::vector can beat raw arrays because of smarter allocators:


#include <vector>
#include <iostream>
#include <ctime>
#include <memory>

template < typename T, typename Alloc = std::allocator<T> >
class stupid {
public:

typedef Alloc allocator;
typedef typename allocator::value_type value_type;
typedef typename allocator::size_type size_type;
typedef typename allocator::difference_type difference_type;
typedef typename allocator::pointer pointer;
typedef typename allocator::const_pointer const_pointer;
typedef typename allocator::reference reference;
typedef typename allocator::const_reference const_reference;

typedef pointer iterator;
typedef const_pointer const_iterator;
typedef typename std::reverse_iterator< iterator >
reverse_iterator;
typedef typename std::reverse_iterator< const_iterator >
const_reverse_iterator;

private:

pointer ptr;
size_type the_size;

public:

stupid ( size_type length ) :
ptr ( new T [ length ] ),
the_size ( length )
{
for ( iterator iter = this->ptr;
iter != this->ptr + the_size;
++ iter ) {
::new( static_cast<void*>(iter) ) T();
}
}

~stupid ( void ) {
iterator iter = ptr + the_size;
while ( iter > ptr ) {
-- iter;
iter->~T();
}
{
allocator alloc;
alloc.deallocate( ptr, the_size );
}
the_size = 0;
}

reference operator[] ( size_type index ) {
return( this->ptr[ index ] );
}

const_reference operator[] ( size_type index ) const {
return( this->ptr[ index ] );
}

}; // stupid

int main ( void ) {
const unsigned long l = 50000000;
{
std::vector< int > v ( l );
std::clock_t loop_start = std::clock();
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
std::clock_t loop_end = std::clock();
std::cout << loop_end - loop_start << std::endl;
}
{
int* v = new int [ l ];
std::clock_t loop_start = std::clock();
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
std::clock_t loop_end = std::clock();
std::cout << loop_end - loop_start << std::endl;
}
{
stupid< int > v ( l );
std::clock_t loop_start = std::clock();
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
std::clock_t loop_end = std::clock();
std::cout << loop_end - loop_start << std::endl;
}
}

320000
460000
310000


It appears that std::allocator<int> does something fancy. Maybe, it alligns
very large arrays so that it starts at a memory page boundary. This way,
paging and pointer arithmetic do not interfere.


Best

Kai-Uwe Bux
 
S

SaltPeter

Richard Herring said:
SaltPeter said:
Richard Herring said:
In message <[email protected]>, SaltPeter

[...]

My thinking is that since both an array and a vector use a single
contiguous
block of memory and that to access an individual element in either an
array or
vector requires the same pointer arithmetic that the performance
would
be
comparable in the presence of inlining for the vector operator[].

The standard guarentees that *any* element of the vector container is
accessed using a constant time period (so any element is accessed with the
same access period as the container's first element). There is a
misconception out there that a vector is like a list where each element
pointing to the next. This is not the case.

It's not a misconception I've noticed. YMMV.

Vectors use random iterators.
These aren't available to either the array or std::list.

Arrays have random-access iterators. They're called pointers.

A {random-access, bidirectional, forward, input, output} iterator is
anything satisfying the requirements of {random-access, bidirectional,
forward, input, output} iterator (24.1). A pointer into an array does
everything that's needed for a random-access iterator, therefore it is
one. As witness the fact that many implementations of std::vector use
raw pointers as their iterators.

And the array
doesn't even have a conception of what an iterator is.

The only iterator-related things that arrays lack are the functions
which return them.

Yep, well i deserved that response.

What i was refering to was the support for the said iterators as your last
comment mentions. In other words, an array doesn't have an end(). :)

Well, boost::array makes up for that deficiency.

Indeed: http://www.boost.org/doc/html/array.html
Thanks for your input.
 
D

Denis Remezov

Kai-Uwe Bux said:
Hilarious. Try running your tests in release mode with optimization.
You'll throw the array out the door. The larger the data set, the farther
the array will fall.

Are you saying that a vector could actually end up being faster than an
array with respect to retrieving items through operator[]?

I have seen the results of performance tests that seem to indicate this
possibility.

I am wondering how this is possible though.

My thinking is that since both an array and a vector use a single
contiguous block of memory and that to access an individual element in
either an array or vector requires the same pointer arithmetic that the
performance would be comparable in the presence of inlining for the vector
operator[].

Could you illustrate a scenario in which a vector might be faster?

Hi,

just whipped up some test code. Sorry for including the sys/timeb.h header.
Does someone know a standard way of doing this?

#include <vector>
#include <iostream>
#include <sys/timeb.h>

long int time_diff ( timeb a, timeb b ) {
return( ( a.time - b.time )*1000 + ( a.millitm - b.millitm ) );
}

int main ( void ) {
const unsigned long l = 50000000;
{
std::vector< int > v ( l );
timeb loop_start, loop_end;
ftime( &loop_start );
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
ftime( &loop_end );
std::cout << time_diff( loop_end, loop_start ) << std::endl;
}
{
int* v = new int [ l ];
timeb loop_start, loop_end;
ftime( &loop_start );
for ( unsigned long i = 0; i < l; ++i ) {
*( v + i ) = 5;
}
ftime( &loop_end );
std::cout << time_diff( loop_end, loop_start ) << std::endl;
}
}

just a measurement on my machine (2.4GH, g++3.4.0 -O3):
457
601

Thus, std::vector<int> was faster than int[].


I bet all the difference was due to plain memory caching.

You have to take into account the initialisation (and hence
caching) of the storage that line
std::vector< int > v ( l );
does in addition to the allocation.

For a fair comparison, add something to the effect of
std::fill_n(v, l, 0);

right after the line
int* v = new int [ l ];

and see what the results are now.

Denis
 
K

Kai-Uwe Bux

Denis said:
Kai-Uwe Bux said:
Hilarious. Try running your tests in release mode with optimization.
You'll throw the array out the door. The larger the data set, the
farther the array will fall.

Are you saying that a vector could actually end up being faster than an
array with respect to retrieving items through operator[]?

I have seen the results of performance tests that seem to indicate this
possibility.

I am wondering how this is possible though.

My thinking is that since both an array and a vector use a single
contiguous block of memory and that to access an individual element in
either an array or vector requires the same pointer arithmetic that the
performance would be comparable in the presence of inlining for the
vector operator[].

Could you illustrate a scenario in which a vector might be faster?

Hi,

just whipped up some test code. Sorry for including the sys/timeb.h
header. Does someone know a standard way of doing this?

#include <vector>
#include <iostream>
#include <sys/timeb.h>

long int time_diff ( timeb a, timeb b ) {
return( ( a.time - b.time )*1000 + ( a.millitm - b.millitm ) );
}

int main ( void ) {
const unsigned long l = 50000000;
{
std::vector< int > v ( l );
timeb loop_start, loop_end;
ftime( &loop_start );
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
ftime( &loop_end );
std::cout << time_diff( loop_end, loop_start ) << std::endl;
}
{
int* v = new int [ l ];
timeb loop_start, loop_end;
ftime( &loop_start );
for ( unsigned long i = 0; i < l; ++i ) {
*( v + i ) = 5;
}
ftime( &loop_end );
std::cout << time_diff( loop_end, loop_start ) << std::endl;
}
}

just a measurement on my machine (2.4GH, g++3.4.0 -O3):
457
601

Thus, std::vector<int> was faster than int[].


I bet all the difference was due to plain memory caching.

You have to take into account the initialisation (and hence
caching) of the storage that line
std::vector< int > v ( l );
does in addition to the allocation.

For a fair comparison, add something to the effect of
std::fill_n(v, l, 0);

right after the line
int* v = new int [ l ];

and see what the results are now.

Denis



You are right:

#include <vector>
#include <iostream>
#include <ctime>
#include <memory>

#include <kubux/bits/allocator.cc>
#include <kubux/bits/new_delete_allocator.cc>
#include <kubux/bits/malloc_free_allocator.cc>

template < typename T, typename Alloc = std::allocator<T> >
class stupid {
public:

typedef Alloc allocator;
typedef typename allocator::value_type value_type;
typedef typename allocator::size_type size_type;
typedef typename allocator::difference_type difference_type;
typedef typename allocator::pointer pointer;
typedef typename allocator::const_pointer const_pointer;
typedef typename allocator::reference reference;
typedef typename allocator::const_reference const_reference;

typedef pointer iterator;
typedef const_pointer const_iterator;
typedef typename std::reverse_iterator< iterator >
reverse_iterator;
typedef typename std::reverse_iterator< const_iterator >
const_reverse_iterator;

private:

pointer ptr;
size_type the_size;

public:

stupid ( size_type length ) :
ptr ( new T [ length ] ),
the_size ( length )
{
for ( iterator iter = this->ptr;
iter != this->ptr + the_size;
++ iter ) {
::new( static_cast<void*>(iter) ) T();
}
}

~stupid ( void ) {
iterator iter = ptr + the_size;
while ( iter > ptr ) {
-- iter;
iter->~T();
}
{
allocator alloc;
alloc.deallocate( ptr, the_size );
}
the_size = 0;
}

reference operator[] ( size_type index ) {
return( this->ptr[ index ] );
}

const_reference operator[] ( size_type index ) const {
return( this->ptr[ index ] );
}

}; // stupid

int main ( void ) {
const unsigned long l = 50000000;
{
std::vector< int > v ( l );
std::clock_t loop_start = std::clock();
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
std::clock_t loop_end = std::clock();
std::cout << "vector: " << loop_end - loop_start << std::endl;
}
{
int* v = new int [ l ];
std::fill_n(v, l, 0);
std::clock_t loop_start = std::clock();
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
std::clock_t loop_end = std::clock();
std::cout << "array: " << loop_end - loop_start << std::endl;
}
{
stupid< int, std::allocator<int> > v ( l );
std::clock_t loop_start = std::clock();
for ( unsigned long i = 0; i < l; ++i ) {
v = 5;
}
std::clock_t loop_end = std::clock();
std::cout << "stupid: " << loop_end - loop_start << std::endl;
}
{
std::vector<int> v ( l );
std::clock_t loop_start = std::clock();
for ( std::vector<int>::iterator i = v.begin();
i != v.end(); ++i ) {
*i = 5;
}
std::clock_t loop_end = std::clock();
std::cout << "ptr: " << loop_end - loop_start << std::endl;
}
{
int* v = new int [ l ];
std::fill_n(v, l, 0);
std::clock_t loop_start = std::clock();
for ( int* i = v; i < v+l; ++i ) {
*i = 5;
}
std::clock_t loop_end = std::clock();
std::cout << "ptr: " << loop_end - loop_start << std::endl;
}
}
vector: 320000
array: 320000
stupid: 350000
iterator: 340000
ptr: 340000


No surprises anymore.


Thanks

Kai-Uwe Bux
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,172
Messages
2,570,934
Members
47,478
Latest member
ReginaldVi

Latest Threads

Top