overloading the [] operator

  • Thread starter Joseph Paterson
  • Start date
J

Joseph Paterson

Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).

I've managed to overload the subscript operator like this:
T & operator [] (int n)
{
return (_m[n * _col]);
}

but I don't understand why you can't do:
T * operator [] (int n)
{
return (&_m[n * _col]);
}

instead. The first one works fine, but with the second one when I try
something like m[0][0] = 100.0, m[0][0] comes up as being an invalid
lvalue.
Could someone help me out here?

Cheers,

- Joseph Paterson
 
K

Kai-Uwe Bux

Joseph said:
Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).

I've managed to overload the subscript operator like this:
T & operator [] (int n)
{
return (_m[n * _col]);
}

but I don't understand why you can't do:
T * operator [] (int n)
{
return (&_m[n * _col]);
}

instead. The first one works fine, but with the second one when I try
something like m[0][0] = 100.0, m[0][0] comes up as being an invalid
lvalue.

Please, hit the FAQ. This topic has been beaten to death repeatedly in this
group and gives rise to a little bit flaming every once in a while. The
gist of it is:

a) Consider using operator()( row, col ) instead of operartor[][]. It is
easier to implement cleanly: the problem with [][]-syntax is that upon a
first try, you will settle on a quick and dirty solution where operator[]
return something like a vector (just as you indicate in your post). This is
bound to expose implementation details of the matrix class, which is
considered poor design. Details are in the FAQ.

b) If you are hard-headed / strongly-willed and determined to support the
[][]-syntax, overload operator[] to return a proxy object that contains a
reference to your matrix and knows the row. The proxy class itself will
overload operator[] (const) to return a (const) reference to the entry.
This trick allows to separate implementation and interface. Details are in
the FAQ.

c) If your matrix class aims to be more than merely a container, i.e., if
you plan on supporting linear algebra, please consider using one of the
many available libraries. A full-fledged matrix class is a project of
considerable size and difficulty. Don't call that upon yourself without
legitimate need unless you are looking for a character building and highly
educational exercise.


Best

Kai-Uwe Bux
 
J

Joseph Paterson

I actually am trying to implement a fully fledged matrix class. I'm
starting on a graphics engine, and I want to fully support matrices and
all possible operations you can do on them :) It's fun, and a learning
experience.

I'll have a look at the FAQ - thanks!

- Joseph Paterson
 
K

Kai-Uwe Bux

Joseph said:
I actually am trying to implement a fully fledged matrix class.

Oh well.
I'm starting on a graphics engine, and I want to fully support matrices
and all possible operations you can do on them :) It's fun, and a learning
experience.

Ok, you asked for it:

a) You may want to read up on expression templates. When it comes to things
like adding vectors / matrices, the introduction of temporaries truly
becomes a performance issue. Expression templates are used in most state of
the art matrix libraries to deal with this problem. If you are really
headed for a graphics engine, performance will become an issue pretty soon.

b) As far as syntactic sugar (like [][]-notation) is concerned, I found that
the most interesting use of proxy classed for column and row vectors is to
support elementary operations like so:

A.row(i) += some_scalar * A.row(j);

or

swap( A.col(i), A.col(j) );

However, to make that work is a little bit challenging.

I'll have a look at the FAQ - thanks!

That is a good point to start. You may also want to pay attention to what it
has to say about floating point arithmetic (given that you are very likely
bound to do numerical linear algebra). Be warned that numerical linear
algebra is difficult and there are many traps (beyond the ones in the FAQ)
that are hard to spot. What looks like a correct algorithm from your Linear
Algebra text can turn out to be numerically unstable.


Have fun

Kai-Uwe Bux
 
N

Noah Roberts

Kai-Uwe Bux said:
Joseph said:
Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).
Please, hit the FAQ. This topic has been beaten to death repeatedly in this
group and gives rise to a little bit flaming every once in a while. The
gist of it is:

Yes, read the FAQ on why matrix shouldn't have [].

Also, I can't find it but it must be in the faq somewhere...all your
variable names are broken. Don't use anything that starts with an
underscore. An undercore *postfix* is now commonly used to identify
member variables...I don't personally see the need but if you do, think
about doing it that way...it won't run into conflicts with the std lib
or implementation.
 
N

Noah Roberts

Joseph said:
Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).

I've managed to overload the subscript operator like this:
T & operator [] (int n)
{
return (_m[n * _col]);
}

but I don't understand why you can't do:
T * operator [] (int n)
{
return (&_m[n * _col]);
}

instead. The first one works fine, but with the second one when I try
something like m[0][0] = 100.0, m[0][0] comes up as being an invalid
lvalue.
Could someone help me out here?

You are not providing enough information. The first should NOT work
and the second should based on my interpretation of your post. But I
don't know what _m is or what T is so I'm just guessing. At any rate,
I can't answer your question as I obviously don't understand the
problem.
 
K

Kai-Uwe Bux

Noah said:
Kai-Uwe Bux said:
Joseph said:
Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).
[snip]

Also, I can't find it but it must be in the faq somewhere...all your
variable names are broken. Don't use anything that starts with an
underscore.
[reasonable recommendation snipped]

You cannot find it because it should not be there. The reserved identifiers
are [17.4.3.1.2]:

* any identifier that contains a *double* underscore "__".
* any identifier that begins with "_" followed by a *capital* letter.
* any identifier that begins with "_" is reserved in the *global namaspace*.

So, for local variables and parameters, the names are fine.



Best

Kai-Uwe
 
A

Axter

Kai-Uwe Bux said:
Noah said:
Kai-Uwe Bux said:
Joseph Paterson wrote:

Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).
[snip]

Also, I can't find it but it must be in the faq somewhere...all your
variable names are broken. Don't use anything that starts with an
underscore.
[reasonable recommendation snipped]

You cannot find it because it should not be there. The reserved identifiers
are [17.4.3.1.2]:

* any identifier that contains a *double* underscore "__".
* any identifier that begins with "_" followed by a *capital* letter.
* any identifier that begins with "_" is reserved in the *global namaspace*.

So, for local variables and parameters, the names are fine.

You should never name a local variable with an underscore followed by
an upper case letter.
IAW C++ standards that you quoted, names that begin with an underscore
and are followed by an uppercase letter, can be used for anything by
the implementation.
That means the implementation can use macros for this.
Example:
#define _T(x) L##x

If you have a local variable with the name _T, it will fail to be
portable to an implementation that uses above macro.
IAW C++ standard, a name that begins with an underscore and then
followed by a lower case letter are reserved for the implementation in
the global namespace.
So technically speaking, IAW C++ standard, you could name a local
variable that begins with an underscore and is followed by a lower case
letter. And your code should be portable.

However, this is not true in practice, because some implementations
also use these types of names with macros, which invade all namespaces,
and not just the global namespace.
IAW with the C++ standard, the following code should be portable:
int foo(int _itot, int _ttoi)
{
return _itot + _ttoi;
}

But the above code is not portable to windows platform that reference
tchar.h, because this header has macros named _itot and _ttoi.

So I recommend that leading undescore not be used at all, even in local
variables.
It doesn't make the code work any better, and it can make your code
non-portable to some common implementations.
 
A

Axter

Joseph said:
Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which are
respectively the array to store the data in, the number of rows and the
number of columns. _m is of type T, and is size _row * _col * sizeof
(T).

I've managed to overload the subscript operator like this:
T & operator [] (int n)
{
return (_m[n * _col]);
}

but I don't understand why you can't do:
T * operator [] (int n)
{
return (&_m[n * _col]);
}

instead. The first one works fine, but with the second one when I try
something like m[0][0] = 100.0, m[0][0] comes up as being an invalid
lvalue.
Could someone help me out here?
It should be something like the following:
T* operator[](int n) {return m_ + (col_*n);}

Check out the following link for an example:
http://code.axter.com/dynamic_2d_array.h
 
K

Kai-Uwe Bux

Axter said:
Kai-Uwe Bux said:
Noah said:
Kai-Uwe Bux wrote:
Joseph Paterson wrote:

Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which
are respectively the array to store the data in, the number of rows
and the number of columns. _m is of type T, and is size _row * _col
* sizeof (T). [snip]

Also, I can't find it but it must be in the faq somewhere...all your
variable names are broken. Don't use anything that starts with an
underscore.
[reasonable recommendation snipped]

You cannot find it because it should not be there. The reserved
identifiers are [17.4.3.1.2]:

* any identifier that contains a *double* underscore "__".
* any identifier that begins with "_" followed by a *capital* letter.
* any identifier that begins with "_" is reserved in the *global
namaspace*.

So, for local variables and parameters, the names are fine.

You should never name a local variable with an underscore followed by
an upper case letter.

IAW C++ standards that you quoted, names that begin with an underscore
and are followed by an uppercase letter, can be used for anything by
the implementation.
That means the implementation can use macros for this.
Example:
#define _T(x) L##x

If you have a local variable with the name _T, it will fail to be
portable to an implementation that uses above macro.

Note: the names the OP used are not of this form! We are all in agreement
about names like "_T".

IAW C++ standard, a name that begins with an underscore and then
followed by a lower case letter are reserved for the implementation in
the global namespace.
So technically speaking, IAW C++ standard, you could name a local
variable that begins with an underscore and is followed by a lower case
letter. And your code should be portable.

That is why the OP's names are fine.
However, this is not true in practice, because some implementations
also use these types of names with macros, which invade all namespaces,
and not just the global namespace.

Those are broken implementations.
IAW with the C++ standard, the following code should be portable:
int foo(int _itot, int _ttoi)
{
return _itot + _ttoi;
}

But the above code is not portable to windows platform that reference
tchar.h, because this header has macros named _itot and _ttoi.

tchar.h is a not a standard header. If you use third-party libraries, all
sorts of new macros could be defined. No naming scheme will prevent clashes
with certainty.
So I recommend that leading undescore not be used at all, even in local
variables.

Although I would agree that the advice is sound and reasonable from a
practical point of view (and from an aesthetical angle, too: leading
underscores are plain ugly); I would also maintain that workarounds for
problems with tchar.h and other third-party headers are not exactly topical
in this group.
It doesn't make the code work any better, and it can make your code
non-portable to some common implementations.



Best

Kai-Uwe Bux
 
A

Axter

Kai-Uwe Bux said:
Axter said:
Kai-Uwe Bux said:
Noah Roberts wrote:


Kai-Uwe Bux wrote:
Joseph Paterson wrote:

Hello,

I've created a class to store 2 dimensional matrices, and I've been
trying to overload the [] operator, so access to the elements of the
matrix is easy. I have 3 private variables, _m, _row and _col which
are respectively the array to store the data in, the number of rows
and the number of columns. _m is of type T, and is size _row * _col
* sizeof (T).
[snip]

Also, I can't find it but it must be in the faq somewhere...all your
variable names are broken. Don't use anything that starts with an
underscore.
[reasonable recommendation snipped]

You cannot find it because it should not be there. The reserved
identifiers are [17.4.3.1.2]:

* any identifier that contains a *double* underscore "__".
* any identifier that begins with "_" followed by a *capital* letter.
* any identifier that begins with "_" is reserved in the *global
namaspace*.

So, for local variables and parameters, the names are fine.

You should never name a local variable with an underscore followed by
an upper case letter.

IAW C++ standards that you quoted, names that begin with an underscore
and are followed by an uppercase letter, can be used for anything by
the implementation.
That means the implementation can use macros for this.
Example:
#define _T(x) L##x

If you have a local variable with the name _T, it will fail to be
portable to an implementation that uses above macro.

Note: the names the OP used are not of this form! We are all in agreement
about names like "_T".

IAW C++ standard, a name that begins with an underscore and then
followed by a lower case letter are reserved for the implementation in
the global namespace.
So technically speaking, IAW C++ standard, you could name a local
variable that begins with an underscore and is followed by a lower case
letter. And your code should be portable.

That is why the OP's names are fine.
However, this is not true in practice, because some implementations
also use these types of names with macros, which invade all namespaces,
and not just the global namespace.

Those are broken implementations.
IAW with the C++ standard, the following code should be portable:
int foo(int _itot, int _ttoi)
{
return _itot + _ttoi;
}

But the above code is not portable to windows platform that reference
tchar.h, because this header has macros named _itot and _ttoi.

tchar.h is a not a standard header. If you use third-party libraries, all
sorts of new macros could be defined. No naming scheme will prevent clashes
with certainty.
So I recommend that leading undescore not be used at all, even in local
variables.

Although I would agree that the advice is sound and reasonable from a
practical point of view (and from an aesthetical angle, too: leading
underscores are plain ugly); I would also maintain that workarounds for
problems with tchar.h and other third-party headers are not exactly topical
in this group.

The standard referrs to what the implementation can do with leading
underscore names.
There's no way of giving an example of this, without actually referring
to a specific implementation.
This paticular windows header, is not an isolated example.
There are other implementations that incorrectly use leading underscore
names followed by a lowercase letter.
IMHO, this is within the scope and context of this group.

This is not a workaround for a paticular header. It's a workaround for
all implementations that incorrectly use leading underscores followed
by a lowercase letter.
 
A

Axter

Noah said:
STILL trying to pass that off as an example are you???

If you are going to show people how to implement something like that at
least put a little effort into it.

Still passing yourself as an expert in this topic.
If you think you can do better, then do-so, and post it as an example.

If you think you have a better link with the same interface, then post
that.

If you don't have any thing better, then I recommend you find something
better to do then wasting this group's time with your post.
 
N

Noah Roberts

Axter said:
Still passing yourself as an expert in this topic.

Actually, no. You'll find you are the only one doing that here.
If you think you can do better, then do-so, and post it as an example.

template<typename T>
struct dynamic_2d_array
{
typedef T** type;
};

Actually, I wouldn't even bother with that.

Tell me what your class offers over that.
If you think you have a better link with the same interface, then post
that.

Why would I? It already exists in the standard.
If you don't have any thing better, then I recommend you find something
better to do then wasting this group's time with your post.

You waste people's time every time you post that piece of crap and tell
newbies to do it that way. Like I said, if you are going to create an
example then at least spend a little time on it. If that code took
more that 5 seconds to write I would be surprised. I _know_ it hasn't
been tested.

Hint:

map<int, dynamic_2d_array<double> > mtx_map;

But you know what. I am going to take you up on this challenge. I'm
going to fix your code for you. That way when you want to tell people
to implement op[][] at least you will have a halfway decent
implementation that won't also display horrid coding practices and
numerous bugs.
 
N

Noah Roberts

Axter said:
Still passing yourself as an expert in this topic.
If you think you can do better, then do-so, and post it as an example.

I am not posting this as an "example" as I don't agree with the idea.
I am also not posting it as an expert sample. I am posting this to
show some of the things a programmer should do differently than you did
if they insist on going that way and why your code "example" should not
be considered such.

If anyone believes the following code exhibits bad practice in any way
please speak up. I won't promise to agree but do listen and in cases
such as this I believe correctness is more important than my ego.

Note the use of whitespace!!

I did keep my lines to 80 chars but this "newsreader" may still break
the formatting. Does anyone have a suggestion of some place to put
this for better readability?

I don't normally comment so verbosely. In this case explainations and
reasoning are the important part...not necissarily the code itself. As
a coding standard I try to make my code itself do the explaining and
not make use of cryptic style that HAS to be commented.

One question I do have for the experts: In the copy constructor I
originally had new T[colCount_ * rowCount_] but this caused occasional
crashes (that were actually not so occasional). It is my understanding
that rowCount_ and colCount_ are initialized before the call to new T[]
so that should have worked. Using other's variables instead fixed the
problem for at least 20 runs so I am confident it is fixed. The
question is, why did it not work the first time, my failure or
compilers? If mine, where is my understanding flawed?

CODE: ==================================

// This class reimplements the dynamic_2d_array template created
// by David Maisonave and can be viewed at:
// http://code.axter.com/dynamic_2d_array.h
//
// This version fixes some of the bugs and poor coding practices in the
above
// implementation. I do not recommend doing anything like this, but if
you
// must, and want an example, I can't leave you at the mercy of a coder
that
// comes up with the above and calls himself an expert.
//
// Note: I am also not an expert. There are very likely places where
this could
// be improved. My heart is not into it as I disagree with the very
idea.

#include <cstdlib>
#include <cassert>

// Use typename or class? It doesn't really matter. One could
succesfully
// argue that the primary purpose of this class isn't to contain
classes though.
template< typename T >
class dynamic_2d_array
{
public:
dynamic_2d_array(size_t rowCount, size_t colCount)
: rowCount_(rowCount), colCount_(colCount),
storage_(new T[rowCount * colCount])
{
/*
* Changes: more descriptive variable names and a lack for a check
* of size for 0. Allocate any time. The reason for the
* later is simple: the check doesn't help, ignores the user's
* intention, and causes undefined behavior if the class is ever
* used.
*
* The correct way to do what the original code does, making sure
this
* class isn't used with 0 bounds is:
*/
assert(rowCount > 0 && colCount > 0);
/*
* You could throw an exception instead. All depends on if this
is an
* incorrect use of the class or a runtime error. I'm choosing
incorrect
* use of the class. Seems logical and is easier to implement.
*/
}

dynamic_2d_array(const dynamic_2d_array & other)
: rowCount_(other.rowCount_), colCount_(other.colCount_),
storage_(new T[other.rowCount_ * other.colCount_])
{
// In this case there is no need for the check at all as the
creation
// of a dynamic_2d_array with 0 bounds has been disallowed.
// Assert anyway to advertise the precondition...

assert(other.rowCount_ > 0 && other.colCount_ > 0 && other.storage_
!= 0);

std::copy(other.storage_,
other.storage_ + (rowCount_ * colCount_),
storage_);
}

~dynamic_2d_array() { delete [] storage_; }

// These definitions hide the fact that you are actually returning
// a pointer to insides. Clients are discouraged from depending on
this fact
// as it is not advertized by the interface. You could change this
at any
// time and not affect well behaved clients. Without these
definitions any
// changes would propegate to all callers.
typedef T* row_iterator;
typedef const T* const_row_iterator;

// Col iterator is much more complex and not in the original
interface.
// Personally I think the interface is not complete without this
definition
// but am not willing to spend the time implementing it. For one,
operator[]
// would then be totally inappropriate and implementing that operator
is the
// purpose of the original class.

row_iterator operator [] (size_t row)
{
return storage_ + (row * colCount_);
}

const_row_iterator operator [] (size_t row) const
{
return storage_ + (row * colCount_);
}

// That is the extent of the original dynamic_2d_array. Those are
all the
// messages that object can respond to. Not very dynamic. Here are
some
// more messages that will let this class at least be useful in that
it will
// do MORE, not LESS, than a vector or primitive array.

// bounds checking implementations of the above:
row_iterator row(size_t row)
{
if (!(row < rowCount))
{
throw std::eek:ut_of_range("Row index out of range");
}

return storage_ + (row * colCount_);
}

const_row_iterator row(size_t row) const
{
if (!(row > rowCount))
{
throw std::eek:ut_of_range("Row index out of range");
}

return storage_ + (row * colCount_);
}

// No way to protect the row iterator from invalid col bounds because
of the
// implementation. Changing row_iterator to a smarter object could
fix that.

// provide sizing knowledge, needed by almost any client:
size_t rowCount() const { return rowCount_; }
size_t colCount() const { return colCount_; }

// provide resizing - make the class live up to its name:
void resize(size_t rowCount, size_t colCount)
{
assert(rowCount > 0 && colCount > 0);

T * temp_storage = new T[rowCount * colCount];
memset(temp_storage, 0, sizeof(T) * rowCount * colCount);

size_t rows = std::min(rowCount, rowCount_);
size_t cols = std::min(colCount, colCount_);

for (size_t i = 0; i < rows; ++i)
{
std::copy(storage_ + (i * colCount_),
storage_ + (i * colCount_) + cols,
temp_storage + (i * colCount));
}

// operations that could cause an exception are finished.
// None of the following can. Strong guarantee...
std::swap(storage_, temp_storage);

rowCount_ = rowCount;
colCount_ = colCount;

delete [] temp_storage;
}

/*
* The lack of this operator is a major deficiency of the original
class.
* At one point it was hidden under the "protected" scope and I
mentioned that
* this was not the best as the class cannot be overriden anyway.
The
* original was better at that time but has since been "improved" to
not have
* or hide this operator. This means it is implemented by default
and will
* generate undefined behavior the first time it is used and the
objects
* destroyed. */
dynamic_2d_array & operator = (dynamic_2d_array other)
{
// Strong exception guarantee...
std::swap(storage_, other.storage_); // If this is new to you look
closely.
std::swap(rowCount_, other.rowCount_);
std::swap(colCount_, other.colCount_);

/* note that 'other' now owns my old storage and will delete it
when this
* function exits. */
}

// To finish the *advertized* interface of the class as commented in
the
// original we must provide direct access to the data storage for C
functons
// that need it. This is reasonable but dangerous.
// Probably better ways to do this:

T * c_array() { return storage_; } // no reason to pretend we are not
doing
// what we are doing...

// Comments should reflect the inherint danger of calling the above
function
// at the least. Use should be highly discouraged. Better yet,
don't
// implement it at all and/or do something better. Perhapse the more
costly
// but more correct way:
// std::auto_ptr<T> c_array() const;

/*
* other useful interface items that could be implemented:
* * explicit constructor from c array.
* * functional interface to get T object as f() or operator ().
*
* Do not be tempted to implement a non-explicit constructor from a c
array
* or to implement an implicit conversion to T* or T**.
*/

private:
T * storage_;
size_t rowCount_;
size_t colCount_;
};


#include <iostream>

int main()
{
typedef dynamic_2d_array<int> int_array;

int_array a1(5, 7);
assert(a1.colCount() == 7);
assert(a1.rowCount() == 5);

a1[3][5] = 20;
a1[4][6] = 30;

assert(a1[3][5] == 20);
assert(a1[4][6] == 30);

a1.resize(7, 10);

assert(a1[3][5] == 20);
assert(a1[4][6] == 30);

a1.resize(4, 6);

assert(a1[3][5] == 20);

int_array a2(a1);

assert(a1[3][5] == 20);

a1[1][1] = 100;

assert(a1[3][5] == 20);
assert(a1[1][1] == 100);

a2 = a1;

assert(a1[3][5] == 20);
assert(a1[1][1] == 100);
assert(a2[3][5] == 20);
assert(a2[1][1] == 100);

std::cout << "ALL IS WELL WITH THE INTERFACE.\n";

return 0;
}
 
A

Axter

Noah said:
Axter said:
Still passing yourself as an expert in this topic.
If you think you can do better, then do-so, and post it as an example.

I am not posting this as an "example" as I don't agree with the idea.
I am also not posting it as an expert sample. I am posting this to
show some of the things a programmer should do differently than you did
if they insist on going that way and why your code "example" should not
be considered such.

If anyone believes the following code exhibits bad practice in any way
please speak up. I won't promise to agree but do listen and in cases
such as this I believe correctness is more important than my ego.

Note the use of whitespace!!

I did keep my lines to 80 chars but this "newsreader" may still break
the formatting. Does anyone have a suggestion of some place to put
this for better readability?

I don't normally comment so verbosely. In this case explainations and
reasoning are the important part...not necissarily the code itself. As
a coding standard I try to make my code itself do the explaining and
not make use of cryptic style that HAS to be commented.

One question I do have for the experts: In the copy constructor I
originally had new T[colCount_ * rowCount_] but this caused occasional
crashes (that were actually not so occasional). It is my understanding
that rowCount_ and colCount_ are initialized before the call to new T[]
so that should have worked. Using other's variables instead fixed the
problem for at least 20 runs so I am confident it is fixed. The
question is, why did it not work the first time, my failure or
compilers? If mine, where is my understanding flawed?

The following is the variable members in your class.
private:
T * storage_;
size_t rowCount_;
size_t colCount_;
};

The problem with your code, is that a compiler initialize variables in
the order that the variables are listed in your class, and NOT in the
order that they're listed in the initialize list.
Example:
dynamic_2d_array(size_t rowCount, size_t colCount)
: rowCount_(rowCount), colCount_(colCount),
storage_(new T[rowCount_ * colCount_])

If your class tries to use member variables rowCount_ and colCount_ to
initialize storage_, the row and col variables will have random values,
because they haven't gotten initialized yet.
That's why it's important to list your member variables in the order
you wish them to get initialize.
private:
size_t rowCount_;
size_t colCount_;
T * storage_; //This should be last
};

The order in which you create the initialize list does not make any
difference as far as the order of initialization. What makes the
difference is the order in which you have the member variables declared
in your class.
This is a common beginners mistake.

Your code is not even compilable on a standard C++ compiler.
Your assignment operator is missing a return statement, so it returns
nothing.
And you're missing required header <algorithm>, which is needed for
std::copy and std::swap

There's no need for an assertion in your copy constructor, because if
the count is zero, you're already going to get an assertion in the
first constructor, and more over the std::copy will not copy anything
if either row or col variables are zero.

IMHO, it's a poor choice to put iterators in a class, and not have
begin and end member functions, which can easily be added with one line
of code.

The resize should avoid the memset command for two reasons.
First, it assumes T is a POD type, but you have no code validating it's
a POD type.
And secondly, it's inefficient.
You can set the values to zero when you call new by using the following
method:
T * temp_storage = new T[rowCount * colCount]();

Notice that the above line of code has () at the end.
IAW C++ standards, this will cause a POD type to be initialized to zero
value.
So not only is the above method more efficient to initialize POD type
to zero, but you also get the added benifit that if the type is not a
POD type, it will still be compatible.

Also, by adding the assertion for zero in the resize function, you
remove the ability for the class to clear it's data by performing a
resize(0,0).

In general, the idea of a resize function is good, and I'll probably
add a resize function to my class.
Also the c_array() function is a good idea.
The iterators would have been a good idea, if you would have added the
begin and end function.

I give you much credit for making the effort, which is more then I
thought you would do.
koodoos
 
N

Noah Roberts

Axter said:
The problem with your code, is that a compiler initialize variables in
the order that the variables are listed in your class, and NOT in the
order that they're listed in the initialize list.

Yep, the compiler generally warns one about this when you do it.
fixed.
Your code is not even compilable on a standard C++ compiler.

You mean on a C++ compiler that meets the standard in a specific way.

The compiler certainly should have warned me about the return of op= at
the least. I am disappointed it didn't...that is a common mistake.
(and the mistake was mine for not telling it to)

Fixed at any rate.

Obviously I rely a bit much on the compiler to tell me when I make
common errors. One can't keep track of all the little stuff and needs
decent tools to tell you when you've made basic mistakes or omisions.
I guess you could call this using the compiler as a crutch or failure
to understand...on the other hand, I don't feel my time is well spent
worrying about such things and if the tool will do that part of the
thinking for me it gives me more brain room for solving actual
problems.

And yes, I have numerous reference books because I can't memorize the
entire standard nor the libraries I use even on a daily basis. I am
fully aware of ordering, I just assumed I would be warned if I made a
mistake in that area and didn't concern myself with it....I was wrong.
In fact I had assumed so much that I didn't even look when it became a
problem.

But at least it gives you a chance to point out beginner mistakes in my
code. Difference between us is that mine is now fixed ;)
There's no need for an assertion in your copy constructor, because if
the count is zero, you're already going to get an assertion in the
first constructor, and more over the std::copy will not copy anything
if either row or col variables are zero.

That is noted in the code. The assertion is for the programmer.
Preconditions need to be documented. It doesn't hurt anything and an
assert is the best way to provide that documentation.

Besides, such /could/ happen and be caused by some sort of memory issue
- buffer overrun for instance. The assert would catch that rare
condition where without it there is just one more bug that could have
been caught but wasn't.

Nope, "unneccisary" assertion stays. It's good practice. It is doing
what it is supposed to, informing the programmer what the following
code assumes.
IMHO, it's a poor choice to put iterators in a class, and not have
begin and end member functions, which can easily be added with one line
of code.

Actually no they can't. What happens when you increment one? Right...

I called them iterators as in some ways they meet that definition and
there needs to be a typedef but there is one very important detail I
missed. I will fix it by calling them something else as I don't want
to do the work to implement them correctly.

Yet another reason I wouldn't even do this. Too much to account for,
too much to do...vector already exists and is a pretty darn GOOD wheel.
The resize should avoid the memset command for two reasons.
First, it assumes T is a POD type, but you have no code validating it's
a POD type.

Yes, I did make that assumption in that part of the code. That is poor
practice. Fixed.

Interestingly your code completely omits this initialization.
So not only is the above method more efficient to initialize POD type
to zero

It is actually unlikely to be any different at all...except that it
works with non-POD types.
Also, by adding the assertion for zero in the resize function, you
remove the ability for the class to clear it's data by performing a
resize(0,0).

That is ok because that would be an invalid use of that function. That
function is for resizing, not for clearing...one responsibility... I
debated having that assertion in there at all as it is not necissary
for the functioning of the code. However, I saw no major problem (just
inconveniences) with disallowing 0 bounds arrays by design as your
version does so it is something I kept. Allowing it here would be
inconsistant and invalid.

I will grant you though, the concept of not allowing 0 bounded arrays
may be flawed.

I would implement a clear() but at this point it isn't clear what that
would do. Yes, this functionality would probably be expected though
isn't actually necissary since the client can accomplish it with the
functionality that exists without breaking into the class or having to
track extra data this class should be tracking...whatever they think
clear() should do.

What constitutes a clearing would need to be defined and then it could
be implemented. It can't be a 0 bounded array though because that is
invalid at this point. The design would need to change for that.
Also the c_array() function is a good idea.

It is documented in your header but not implemented. I think it is a
bad idea myself and I explain that in the code. But in an effort to
provide the advertised design of dynamic_2d_array I added the
functionality against my better judgement....I did that a lot here.
The iterators would have been a good idea, if you would have added the
begin and end function.

Actually, no they are a bad idea...or at least much more involved that
you think. I'm removing the concept.

The necessity of begin()/end() is debatable but it would be consistant
with the STL in that ONE regard.


NEW CODE==================

// This class reimplements the dynamic_2d_array template created
// by David Maisonave and can be viewed at:
// http://code.axter.com/dynamic_2d_array.h
//
// This version fixes some of the bugs and poor coding practices in the
above
// implementation. I do not recommend doing anything like this, but if
you
// must, and want an example, I can't leave you at the mercy of a coder
that
// comes up with the above and calls himself an expert.
//
// Note: I am also not an expert. There are very likely places where
this could
// be improved. My heart is not into it as I disagree with the very
idea.

#include <cstdlib>
#include <cassert>
#include <algorithm>

// Use typename or class? It doesn't really matter. One could
succesfully
// argue that the primary purpose of this class isn't to contain
classes though.
template< typename T >
class dynamic_2d_array
{
public:
dynamic_2d_array(size_t rowCount, size_t colCount)
: rowCount_(rowCount), colCount_(colCount),
storage_(new T[rowCount * colCount]())
{
/*
* Changes: more descriptive variable names and a lack for a check
* of size for 0. Allocate any time. The reason for the
* later is simple: the check doesn't help, ignores the user's
* intention, and causes undefined behavior if the class is ever
* used.
*
* The correct way to do what the original code does, making sure
this
* class isn't used with 0 bounds is:
*/
assert(rowCount > 0 && colCount > 0);
/*
* You could throw an exception instead. All depends on if this
is an
* incorrect use of the class or a runtime error. I'm choosing
incorrect
* use of the class. Seems logical and is easier to implement.
*/
}

dynamic_2d_array(const dynamic_2d_array & other)
: rowCount_(other.rowCount_), colCount_(other.colCount_),
storage_(new T[other.rowCount_ * other.colCount_]())
{
// In this case there is no need for the check at all as the
creation
// of a dynamic_2d_array with 0 bounds has been disallowed.
// Assert anyway to advertise the precondition...

assert(other.rowCount_ > 0 && other.colCount_ > 0 && other.storage_
!= 0);

std::copy(other.storage_,
other.storage_ + (rowCount_ * colCount_),
storage_);
}

~dynamic_2d_array() { delete [] storage_; }

// These definitions hide the fact that you are actually returning
// a pointer to insides. Clients are discouraged from depending on
this fact
// as it is not advertized by the interface. You could change this
at any
// time and not affect well behaved clients. Without these
definitions any
// changes would propegate to all callers.
typedef T* row_reference;
typedef const T* const_row_reference;

// Col reference is much more complex and not in the original
interface.
// Personally I think the interface is not complete without this
definition
// but am not willing to spend the time implementing it. For one,
operator[]
// would then be totally inappropriate and implementing that operator
is the
// purpose of the original class.

// Why not "iterators"? Imagine using ++ on one and what you would
get.
// These are not row iterators and any other type would be bad
design.
// I'm too lazy to implement row iterators correctly - just as
complex as
// a col reference would be. That is an exercize left to the reader.

row_reference operator [] (size_t row)
{
return storage_ + (row * colCount_);
}

const_row_reference operator [] (size_t row) const
{
return storage_ + (row * colCount_);
}

// That is the extent of the original dynamic_2d_array. Those are
all the
// messages that object can respond to. Not very dynamic. Here are
some
// more messages that will let this class at least be useful in that
it will
// do MORE, not LESS, than a vector or primitive array.

// bounds checking implementations of the above:
row_reference row(size_t row)
{
if (!(row < rowCount))
{
throw std::eek:ut_of_range("Row index out of range");
}

return storage_ + (row * colCount_);
}

const_row_reference row(size_t row) const
{
if (!(row > rowCount))
{
throw std::eek:ut_of_range("Row index out of range");
}

return storage_ + (row * colCount_);
}

// No way to protect the row reference from invalid col bounds
because of the
// implementation. Changing row_reference to a smarter object could
fix that
// and allow a true row_iterator.

// provide sizing knowledge, needed by almost any client:
size_t rowCount() const { return rowCount_; }
size_t colCount() const { return colCount_; }

// provide resizing - make the class live up to its name:
void resize(size_t rowCount, size_t colCount)
{
assert(rowCount > 0 && colCount > 0);

T * temp_storage = new T[rowCount * colCount]();

size_t rows = std::min(rowCount, rowCount_);
size_t cols = std::min(colCount, colCount_);

for (size_t i = 0; i < rows; ++i)
{
std::copy(storage_ + (i * colCount_),
storage_ + (i * colCount_) + cols,
temp_storage + (i * colCount));
}

// operations that could cause an exception are finished.
// None of the following can. Strong guarantee...
std::swap(storage_, temp_storage);

rowCount_ = rowCount;
colCount_ = colCount;

delete [] temp_storage;
}

/*
* The lack of this operator is a major deficiency of the original
class.
* At one point it was hidden under the "protected" scope and I
mentioned that
* this was not the best as the class cannot be overriden anyway.
The
* original was better at that time but has since been "improved" to
not have
* or hide this operator. This means it is implemented by default
and will
* generate undefined behavior the first time it is used and the
objects
* destroyed. */
dynamic_2d_array & operator = (dynamic_2d_array other)
{
// Strong exception guarantee...
std::swap(storage_, other.storage_); // If this is new to you look
closely.
std::swap(rowCount_, other.rowCount_);
std::swap(colCount_, other.colCount_);

/* note that 'other' now owns my old storage and will delete it
when this
* function exits. */

return *this;
}

// To finish the *advertized* interface of the class as commented in
the
// original we must provide direct access to the data storage for C
functons
// that need it. This is reasonable but dangerous.
// Probably better ways to do this:

T * c_array() { return storage_; } // no reason to pretend we are not
doing
// what we are doing...

// Comments should reflect the inherint danger of calling the above
function
// at the least. Use should be highly discouraged. Better yet,
don't
// implement it at all and/or do something better. Perhapse the more
costly
// but more correct way:
// std::auto_ptr<T> c_array() const;

/*
* other useful interface items that could be implemented:
* * explicit constructor from c array.
* * functional interface to get T object as f() or operator ().
*
* Do not be tempted to implement a non-explicit constructor from a c
array
* or to implement an implicit conversion to T* or T**.
*/

private:
size_t rowCount_;
size_t colCount_;
T * storage_;
};
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,226
Members
46,815
Latest member
treekmostly22

Latest Threads

Top