Overloading operator []

kanze

Say you have a class Foo. The operator[] that is non-const returns a
proxy. There is a function operator const Foo&() const. Then there
is a function void operator=(const Foo&). What about other member
functions of class Foo? So that we can say
EnvelopeClass env;
env["abc"]; // returns EnvelopeClass::reference, which is conceptually a Foo&
env["abc"] = Foo(3); // ok
env["abc"].memfun(); // oops
For the last line to work, class EnvelopeClass::reference should have
a member function memfun() because class Foo has such a function.
Then we have to duplicate the entire interface of class Foo inside
EnvelopeClass::reference. Quite a nuisance.

This is a known problem. It is a major motivation behind the desire to
be able to overload operator.(). In the meantime, about the best you
can do is overload operator-> on the proxy, and require your users to
write things like:

env[ "abc" ]->memfun() ;

It exposes the fact that you are using a proxy, but until you can
overload operator.(), it is the best you can do.
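
A minimal sketch of the operator-> workaround described above. The class bodies are assumptions made for illustration only: Foo here is a hypothetical stand-in with a single counter, and the std::map storage is not taken from the original question.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the Foo of the question.
struct Foo {
    int value;
    Foo(int v = 0) : value(v) {}
    void memfun() { ++value; }  // some non-const member function
};

class EnvelopeClass {
public:
    class reference {
    public:
        reference(EnvelopeClass& env, const std::string& key)
            : env_(env), key_(key) {}

        // Assignment through the proxy: env["abc"] = Foo(3);
        void operator=(const Foo& rhs) { env_.data_[key_] = rhs; }

        // Lvalue-to-rvalue use: Foo f = env["abc"];
        operator const Foo&() const { return env_.data_[key_]; }

        // Member access: env["abc"]->memfun();
        Foo* operator->() const { return &env_.data_[key_]; }

    private:
        EnvelopeClass& env_;
        std::string key_;
    };

    reference operator[](const std::string& key) {
        return reference(*this, key);
    }

private:
    std::map<std::string, Foo> data_;  // assumed storage, for illustration
};

// Exercises the three ways of using the proxy.
int demo_envelope() {
    EnvelopeClass env;
    env["abc"] = Foo(3);     // goes through reference::operator=
    env["abc"]->memfun();    // goes through reference::operator->
    Foo copy = env["abc"];   // goes through the conversion operator
    return copy.value;
}
```

Calling demo_envelope() assigns through the proxy, calls memfun() through ->, and reads the value back through the conversion. Only operator-> had to be added to reach all of Foo's members; none of Foo's interface is duplicated in the proxy.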
 
kanze

Thomas Mang said:
James Kanze wrote:

[Code deleted...]
What advantage does one gain by writing a proxy class, instead of
simply returning an int&?
That is, why not simply write the following overloaded operator[]?
int& operator[](int index)

If you can return a reference, you do. If you can't, you use a proxy.

References can be dangerous, because they allow the user direct access
to your internals. Thus, for example, the problems with COW in
std::string -- if std::string::operator[] were allowed to return a
proxy, COW would be a lot easier, *and* would be a much bigger win.

Or consider a class which actually maintains the data on disk, and not
in memory. If the accesses are all through get and put (called by the
proxy), there is no problem; if you return a reference, however, you
don't know how long to lock the data in memory, and you don't know afterwards
whether it has been modified or not, so you don't know whether you have
to write the data back or not.
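
To make the get/put point concrete, here is a small sketch under stated assumptions: TrackedArray and its dirty flag are hypothetical names, and a std::vector stands in for the data that would really live on disk. Because every access goes through get or put, the container always knows whether anything was written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class TrackedArray {
public:
    class reference {
    public:
        reference(TrackedArray& owner, std::size_t index)
            : owner_(owner), index_(index) {}

        // Writing goes through put(), so the owner sees every modification.
        reference& operator=(int value) {
            owner_.put(index_, value);
            return *this;
        }

        // Reading goes through get() and never marks the data as modified.
        operator int() const { return owner_.get(index_); }

    private:
        TrackedArray& owner_;
        std::size_t index_;
    };

    explicit TrackedArray(std::size_t n) : data_(n, 0), dirty_(false) {}

    reference operator[](std::size_t index) { return reference(*this, index); }
    int operator[](std::size_t index) const { return get(index); }

    // True once any element has been written; a real class would use this
    // to decide whether the data has to be written back.
    bool dirty() const { return dirty_; }

private:
    int get(std::size_t index) const { return data_[index]; }
    void put(std::size_t index, int value) {
        data_[index] = value;
        dirty_ = true;
    }

    std::vector<int> data_;  // stands in for data kept on disk
    bool dirty_;
};
```

With `int& operator[]` instead, the class could not tell a read from a write, which is exactly the problem described above.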
 
Francis Glassborow

Andrey Tarasevich said:
Not true. We gave user full control of the value, which doesn't
necessarily mean that we "lost" control of it in negative sense of the
word, since the object that returned the reference might not even care
about this control.

But that is a design decision, and if you truly do not care make the
data public.
Not true.

There still is one (although thin) level of protection that is not
destroyed by returning a reference to private data. It doesn't expose
the actual location of the lvalue, i.e. it doesn't declare loud and
clear that the returned reference is bound to a subobject of the class.
Making a subobject 'public' immediately removes this last layer of
protection.

For example, making a subobject 'public' allows the user to make assumptions
about the lifetime of the subobject based on the lifetime of the entire
object. An accessor member function that returns a reference prevents the
user from making such assumptions.

The subtle problems that this introduces are well illustrated by the
c_str() member of std::string.

Now, I overstated my case, because containers have internal data
structures that we do not want to expose to users, yet we do want to
provide full access to the contained objects. Generally that is easily
supplied by either references or iterators. Sometimes, as in the
case of bitset<>, we cannot do that and have to provide a proxy to do the
work, but that is a different design problem because the individual
elements do not have independent existence.
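
The bitset<> case can be seen directly in the standard library: std::bitset<N>::operator[] on a non-const bitset returns std::bitset<N>::reference, a proxy class, because a single bit has no address to bind a bool& to. A short illustration (the function name is made up for this example):

```cpp
#include <bitset>
#include <cassert>

bool set_bit_through_proxy() {
    std::bitset<8> bits;

    // The result of operator[] is a proxy object, not a bool&.
    std::bitset<8>::reference bit = bits[3];
    bit = true;      // reference::operator=(bool) sets bit 3 in 'bits'

    bits[5] = bit;   // reference::operator=(const reference&) copies the bit value

    return bits.test(3) && bits.test(5) && bits.count() == 2;
}
```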
 
llewelly

Andrey Tarasevich said:
Not true. We gave user full control of the value, which doesn't
necessarily mean that we "lost" control of it in negative sense of the
word, since the object that returned the reference might not even care
about this control.

This is hairsplitting on the definition of control.
Not true.

There still is one (although thin) level of protection that is not
destroyed by returning a reference to private data. It doesn't expose
the actual location of the lvalue,

That depends on what you mean by location. The address of the lvalue
is exposed.
i.e. it doesn't declare loud and
clear that the returned reference is bound to a subobject of the
class.

#include <iostream>

class a
{
    int i;
public:
    int& foo() { return i; }
};

bool is_within(void* p, void* begin, void* end)
{
    return p >= begin and p < end;
}

int main()
{
    a b;
    std::cout << std::boolalpha << "is_within(&b.foo(), &b, &b + 1) == "
              << is_within(&b.foo(), &b, &b + 1) << std::endl;
}


I believe the expression 'is_within(&b.foo(), &b, &b + 1)' in the
above is required to return true.
Making a subobject 'public' immediately removes this last layer of
protection.
[snip]

I used to think so too. Then I encountered a real-life situation
where someone had come to rely on what is implied by the code
above.
 
Siemel Naran

For the last line to work, class EnvelopeClass::reference should have
a member function memfun() because class Foo has such a function.
Then we have to duplicate the entire interface of class Foo inside
EnvelopeClass::reference. Quite a nuisance.

This is a known problem. It is a major motivation behind the desire to
be able to overload operator.(). In the meantime, about the best you
can do is overload operator-> on the proxy, and require your users to
write things like:

env[ "abc" ]->memfun() ;

It exposes the fact that you are using a proxy, but until you can
overload operator.(), it is the best you can do.

So essentially operator[] returns a proxy pointer. However I'm not sure if
it solves the original problem. For const member functions we just want to
forward to real function. For non-const member functions we want to call
the alter function and then forward to the real function.

class Envelope::Reference {
public:
    void constmemfun() const {
        return d_object.constmemfun();
    }
    void memfun() const {
        d_envelope.alter();
        return d_object.memfun();
    }
private:
    Envelope& d_envelope;
    Envelope::Object& d_object;
};
 
kanze

Siemel Naran said:
For the last line to work, class EnvelopeClass::reference should
have a member function memfun() because class Foo has such a
function. Then we have to duplicate the entire interface of class
Foo inside EnvelopeClass::reference. Quite a nuisance.
This is a known problem. It is a major motivation behind the desire
to be able to overload operator.(). In the meantime, about the best
you can do is overload operator-> on the proxy, and require your
users to write things like:
env[ "abc" ]->memfun() ;
It exposes the fact that you are using a proxy, but until you can
overload operator.(), it is the best you can do.
So essentially operator[] returns a proxy pointer.

Sort of a mixture. It acts like an object on the left side of
assignment, and when an lvalue to rvalue conversion is called for, but
like a pointer if you use the -> operator:). Not an ideal situation, I
will admit.
However I'm not sure if it solves the original problem. For const
member functions we just want to forward to real function. For
non-const member functions we want to call the alter function and then
forward to the real function.
class Envelope::Reference {
public:
    void constmemfun() const {
        return d_object.constmemfun();
    }
    void memfun() const {
        d_envelope.alter();
        return d_object.memfun();
    }
private:
    Envelope& d_envelope;
    Envelope::Object& d_object;
};

Yet another problem.

In practice, I have only used such proxies within collections, and my
collections, like the STL, use value semantics. So you typically don't
have non-const member functions except for assignment; you also tend to
copy the object you're interested in out of the collection, and work
with the copy, potentially copying it back in.

I suppose if I had to deal with objects with behavior, I'd arrange for
operator[] (or the operator-> of the proxy) to return some sort of smart
pointer -- if the collection is caching objects, for example, you'd need
the smart pointer to tell you when you could free the object, for
example.
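
One way to read the smart-pointer idea, sketched with std::shared_ptr. This is an assumption about how it might be done, not a description of any existing class: Cache, Object and purge() are hypothetical names, and the use_count() test is only meaningful in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical object with behavior, stored in the cache.
struct Object {
    int hits = 0;
    void touch() { ++hits; }
};

class Cache {
public:
    // Returns a smart pointer; the object is created on first access.
    std::shared_ptr<Object> operator[](const std::string& key) {
        std::shared_ptr<Object>& slot = objects_[key];
        if (!slot) {
            slot = std::make_shared<Object>();
        }
        return slot;
    }

    // Frees every object that no client still holds. A use_count() of 1
    // means the cache itself owns the only remaining pointer.
    std::size_t purge() {
        std::size_t removed = 0;
        for (auto it = objects_.begin(); it != objects_.end();) {
            if (it->second.use_count() == 1) {
                it = objects_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::map<std::string, std::shared_ptr<Object>> objects_;
};
```

Users write `cache["a"]->touch();`, and the reference count tells the cache which objects are still in use when it wants to reclaim memory.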
 
Andrey Tarasevich

llewelly said:
That depends on what you mean by location. The address of the lvalue
is exposed.

I meant a completely different notion of location (OK, I agree that
"location" is rather bad choice of word). What I meant is explained
below (after "i.e."):
#include <iostream>

class a
{
    int i;
public:
    int& foo() { return i; }
};

bool is_within(void* p, void* begin, void* end)
{
    return p >= begin and p < end;
}

int main()
{
    a b;
    std::cout << std::boolalpha << "is_within(&b.foo(), &b, &b + 1) == "
              << is_within(&b.foo(), &b, &b + 1) << std::endl;
}

I believe the expression 'is_within(&b.foo(), &b, &b + 1)' in the
above is required to return true.

Yes, in this particular case it is required to return 'true'. (Actually,
reading 5.9/2 I can't immediately say whether the comparison between
'&b.foo()' and '&b + 1' is specified by the standard. But let's assume
that it is.) But this technique cannot legally be used to detect the
situation when the returned reference is bound to a subobject of given
object. The problem arises when the returned reference is _not_ bound to
a subobject. In this case the pointer comparison in the above
'is_within' function is unspecified (see 5.9/2), which means that
basically anything can be returned. In other words, when this function
returns 'true', it means one of two things:

1) pointer 'p' points to some location between 'begin' and 'end'
2) the comparison is unspecified and the returned result is nothing but
an accident - a particular case of unspecified result, that just happens
to be 'true' this time

Can you come up with a way to differentiate these two situations? I
don't believe it is possible.
Making a subobject 'public' immediately removes this last layer of
protection.
[snip]

I used to think so too. Then I encountered a real-life situation
where someone had come to rely on what is implied by the code
above.
...

As I said above, this code does not illustrate any legal technique,
which means that I can't accept it as an argument.
 
llewelly

Andrey Tarasevich said:
I meant a completely different notion of location (OK, I agree that
"location" is rather bad choice of word). What I meant is explained
below (after "i.e."):


Yes, in this particular case it is required to return 'true'. (Actually,
reading 5.9/2 I can't immediately say whether the comparison between
'&b.foo()' and '&b + 1' is specified by the standard. But let's assume
that it is.) But this technique cannot legally be used to detect the
situation when the returned reference is bound to a subobject of given
object. The problem arises when the returned reference is _not_ bound to
a subobject. In this case the pointer comparison in the above
'is_within' function is unspecified (see 5.9/2), which means that
basically anything can be returned. In other words, when this function
returns 'true', it means one of two things:

1) pointer 'p' points to some location between 'begin' and 'end'
2) the comparison is unspecified and the returned result is nothing but
an accident - a particular case of unspecified result, that just happens
to be 'true' this time
[snip]

This is quite similar to my response the first time I encountered this
issue. What I didn't realize is that most real programs must rely
on some unspecified, undefined, or implementation-defined
behavior (though such code is hopefully labeled, conditioned
according to platform, and most importantly confined), and in
certain areas (e.g., implementing garbage collectors) relying on
pointer comparisons is common.
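
As a sketch of what "labeled and confined" might look like for the is_within check, the comparison can be isolated in one helper that uses std::less. The standard specifies that std::less on pointer types yields a strict total order even where the built-in < is unspecified, but the order itself is implementation-defined, so whether a "within" answer for an unrelated pointer is meaningful remains a platform assumption. The struct name A is hypothetical.

```cpp
#include <cassert>
#include <functional>

// PLATFORM ASSUMPTION, confined to this helper: the total order provided
// by std::less<const void*> corresponds to address order, so that
// "begin <= p < end" means p points into the range.
inline bool is_within(const void* p, const void* begin, const void* end) {
    std::less<const void*> before;
    return !before(p, begin) && before(p, end);
}

struct A {
    int i;
    int& foo() { return i; }
};

// The case from the thread: the reference is bound to a subobject of b.
inline bool member_is_within_object() {
    A b;
    return is_within(&b.foo(), &b, &b + 1);
}
```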
 
Glen Low

No, you are making assumptions as to what the class designer should
expose. The point of using a proxy class is exactly that it gives the
class designer control whilst the return of a reference abandons it. If
we write any function that returns a plain reference to private data we
have exposed that data and lost control of it. The main motive for
having private data is exactly to avoid such circumstances. If a member
function returns a plain reference we might as well make the data
public.

In general I agree with you and disagree with the parent post -- it is
dangerous to return references. However there is one other case I can
think of besides containers where you may want to return references:
where you want to restrict to const access.

E.g.
class X
{
private:
Y y_;
public:
const Y& getReference () const { return y_; }
};

Here you don't want to expose the data member y_ to arbitrary change,
you want all clients (even those with a non-const reference to an X)
to only be able to see y_ and not change it.

The moral equivalent of:

class X
{
private:
Y y_;
public:
const Y& reference;
X (...): y_ (...), reference (y_) { }
};

.... though I'm not sure by C++ standard rules whether y_ exists at the
right time to get a reference to it.

Cheers,
Glen Low
http://www.pixelglow.com/
 
Maciej Sobczak

Hi,

Glen said:
The moral equivalent of:

class X
{
private:
Y y_;
public:
const Y& reference;
X (...): y_ (...), reference (y_) { }
};

... though I'm not sure by C++ standard rules whether y_ exists at the
right time to get a reference to it.

Members are initialized in the order of their declaration, independent
of their order in the initializer list.

In other words, 'y_' is initialized before 'reference', so 'reference'
is initialized with a reference to the already initialized 'y_'.
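
The declaration-order rule is easy to observe. In the sketch below, Tracer and Widget are hypothetical names used only for this demonstration; the initializer list deliberately names the members in the "wrong" order, which many compilers flag with a warning but still compile.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends its tag to a log when it is constructed.
struct Tracer {
    Tracer(std::string& log, char tag) { log += tag; }
};

struct Widget {
    Tracer first;   // declared first, so initialized first
    Tracer second;  // declared second, so initialized second

    // The initializer list names 'second' before 'first' on purpose;
    // the order written here does not affect the initialization order.
    explicit Widget(std::string& log) : second(log, '2'), first(log, '1') {}
};

// Returns the tags in the order the members were actually initialized.
std::string init_order() {
    std::string log;
    Widget w(log);
    (void)w;  // only the constructor's side effect matters here
    return log;
}
```

init_order() returns "12": 'first' is constructed before 'second' because of the declaration order, whatever the initializer list says.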

Interestingly:

class X
{
public:
const Y& reference;
X (...): y_ (...), reference (y_) { }
private:
Y y_;
};

Here, 'reference' is initialized with a reference to something that has
not yet been initialized, but that already has memory allocated. This is
a problem, and I cannot find in the Standard any clear statement that
would either bless or damn it. Some compilers accept this code. For
sure, *using* a reference to something that does not yet exist is forbidden.

--
Maciej Sobczak
http://www.maciejsobczak.com/

Distributed programming lib for C, C++, Python & Tcl:
http://www.maciejsobczak.com/prog/yami/
 
Daniel Spangenberg

Hello, Maciej Sobczak!

Maciej Sobczak wrote:
[snip]
Interestingly:

class X
{
public:
const Y& reference;
X (...): y_ (...), reference (y_) { }
private:
Y y_;
};

Here, 'reference' is initialized with a reference to something that has
not yet been initialized, but that already has memory allocated. This is
a problem, and I cannot find in the Standard any clear statement that
would either bless or damn it. Some compilers accept this code. For
sure, *using* a reference to something that does not yet exist is forbidden.

My interpretation of what is written in 3.7.3.1/p. 2 and in 3.8/p. 5 leads me
to conclude that your example should be well-defined.

Greetings,

Daniel
 
