Accessors vs public data members

Zap

[I already posted the following to comp.lang.c++.moderated and received
2 replies (which you can go and see if you are interested). I'm looking
for more discussion by posting again here.]


Widespread opinion is that public data members are evil, because if you
have to change the way the data is stored in your class you have to
break the code accessing it, etc.

After reading this (also copied below for easier reference):

http://groups.google.it/groups?hl=en&lr=&safe=off&[email protected]&rnum=95

I don't agree anymore.

Code that uses x = obj.data_get() and obj.data_set(x) is uglier, longer
to write and more difficult to read than x = obj.data and obj.data = x.

Furthermore, the chance that the implementation will need to be changed
in the future is in general rather small, I would say less than 10%: of
10 classes with data members, likely fewer than 1 will really need to be
modified in the future by putting code in place of raw data members.

To avoid problems with that one class in 10, the current widespread
practice is to use getters and setters to access members of the other 9
classes as well.

Jerry Coffin's approach seems to me much better than getters and
setters. However I'm trying to analyze the drawbacks of Coffin's
approach. Correct me if I am wrong:

- The compiler is forced to leave at least 1 byte for the nested class
X, while it would need zero for getters and setters.

- Inheriting from the class "whatever" won't let you change the getters
and setters. However: 1) with getters and setters you would have needed
to foresee the need for inheritance anyway and declare them virtual (*),
which usually nobody does; 2) if you foresee the need for inheritance,
you can also handle it with Coffin's technique, by putting a reference to
X inside "whatever" instead of an instance of X. You initialize the
reference in the constructor. In the derived class you initialize it to
refer to an XDerived.

- The standard allows only 1 implicit conversion to be performed. So if
the accessing code uses X (while reading), the 1 implicit conversion (to
int, in the example) is already used. If the old accessing code relies
on another implicit conversion being performed (but I see this as
rather unlikely), that code will break.
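
The second point (a reference to X inside "whatever", bound to an XDerived in the derived class) could look roughly like the sketch below. It is only one possible arrangement: the virtual operators, the clamping behaviour of XDerived, the heap allocation and the name whatever_derived are assumptions made for illustration, not part of Coffin's post.

```cpp
#include <cassert>
#include <memory>

// Proxy with virtual accessors, so a derived proxy can change what
// reading from and writing to the "member" does.
class X {
public:
    virtual ~X() {}
    virtual operator int() const { return value_; }
    virtual X& operator=(int new_value) {
        value_ = new_value;
        return *this;
    }
protected:
    int value_ = 0;
};

// Hypothetical derived proxy: clamps negative values to zero.
class XDerived : public X {
public:
    X& operator=(int new_value) override {
        value_ = new_value < 0 ? 0 : new_value;
        return *this;
    }
};

class whatever {
    std::unique_ptr<X> owned_;   // declared before x, so it is initialized first
public:
    X& x;                        // client code still writes obj.x
    whatever() : owned_(new X), x(*owned_) {}
protected:
    // Derived classes hand in the proxy they want x to refer to.
    explicit whatever(X* proxy) : owned_(proxy), x(*owned_) {}
};

class whatever_derived : public whatever {
public:
    whatever_derived() : whatever(new XDerived) {}
};
```

With this arrangement `obj.x = -5` stores -5 in a plain `whatever` and 0 in a `whatever_derived`, even when the latter is accessed through a `whatever&`. The price is an extra pointer plus a reference per object, and `whatever` is no longer copyable.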


These three drawbacks IMHO don't seem to justify the hassle of using
accessors in every class that might, in the future, need code instead
of data. However I'm waiting for comments...

Thanks for your comments :)

Zap


(*) Ok... sometimes inheritance is useful even without "virtual"



-----------------------------
[Jerry Coffin post copied for reference]

From: Jerry Coffin ([email protected])
Subject: Re: Member name style
Newsgroups: comp.lang.c++.moderated
Date: 1998/02/06

[email protected] says... said:
But what if in the future you change the attribute to be calculated or
obtained from some database? Then, all the dozens or hundreds of
places that directly access the member variable will have to be
modified to use the getter method. Using getter methods all the
time hides the internal implementation and lets you change that
internal implementation with no ripple effects to the rest of the
code.

When/if you really run into this, you can avoid the problem quite
easily. Instead of using a member function, and converting all access
to the variable to use the member function instead, you simply turn
the member variable into a class of its own. Provide that class with
conversion operators to the original type, and voila' your code gets
executed as necessary. E.g.:

class whatever {
public:
    int x;
    // ...
};

turns into:

class X {
public:
    operator int() {
        // access database or whatever to determine value
        return value;
    }
    X operator=(int new_value) {
        // set value in database, or whatever...
        return *this;
    }
};

class whatever {
public:
    X x;
    // ...
};

Now assignments to and from x use your code anyway, but avoid having
to make it clear in all other code that you're using code rather than
simply accessing a variable.
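
To make the idea concrete, here is a small compilable sketch of the same transformation. The `fake_store` struct is a hypothetical stand-in for "the database", and its counters exist only so the effect of the proxy code can be observed; none of these names come from the original post.

```cpp
#include <cassert>

// Hypothetical stand-in for external storage, with counters that
// record how often the proxy's code actually ran.
struct fake_store {
    int value = 0;
    int reads = 0;
    int writes = 0;
};

class X {
public:
    explicit X(fake_store& store) : store_(store) {}
    operator int() const {           // runs on every read of whatever::x
        ++store_.reads;
        return store_.value;
    }
    X& operator=(int new_value) {    // runs on every write to whatever::x
        ++store_.writes;
        store_.value = new_value;
        return *this;
    }
private:
    fake_store& store_;
};

class whatever {
public:
    explicit whatever(fake_store& store) : x(store) {}
    X x;
};

// Client code written against the old `int x;` member compiles unchanged.
int client_code(whatever& w) {
    w.x = 41;
    int copy = w.x;
    return copy + 1;
}
```

Plain reads and writes carry over, but not everything does: with this minimal proxy, expressions such as `w.x += 1`, `++w.x` or `int* p = &w.x` no longer compile unless the proxy grows the corresponding operators.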

Keep in mind that a single class should generally be responsible for
only a single thing. If use of a particular member variable starts to
become more complex than simply assigning to it/using its value, it
most likely belongs in a class of its own.

I'm not sure I'd particularly recommend this technique for new code,
but if you have some existing code that allows public access to its
variables, it can make long-term maintenance much easier.

I should add that problems can arise with this basic design as well.
In particular, if you can no longer reasonably represent the TYPE of
thing being dealt with as the original type at all, then you have to
make more widespread changes. If there's ANY chance of this happening
at all, you're probably best off to start with the return being a
class or at least a typedef rather than making it a built-in type.
This will generally ease any long term transitions you have to make.
 
Andrey Tarasevich

Zap said:
[I already posted the following to comp.lang.c++.moderated and received
2 replies (which you can go and see if you are interested). I'm looking
for more discussion by posting again here.]


Widespread opinion is that public data members are evil, because if you
have to change the way the data is stored in your class you have to
break the code accessing it, etc.

After reading this (also copied below for easier reference):

http://groups.google.it/groups?hl=en&lr=&safe=off&[email protected]&rnum=95

I don't agree anymore.

Code that uses x = obj.data_get() and obj.data_set(x) is uglier, longer
to write and more difficult to read than x=obj.data and obj.data=x .
...

The entire premise of this question is incorrect. These two methods are
not supposed to be interchangeable. Each one has its own uses and a
discussion about one being "uglier" and/or "more difficult to read" than
the other makes no sense.

Making a data member 'public' implements _explicit_ aggregation. It gives
client code full control of the data member. It provides the client with
information about the location of the data (i.e. lvalue access) and the
lifetime of the member (i.e. the relation of its lifetime to the
lifetime of the enclosing object).

Using a getter/setter pair implements _implicit_ aggregation, which has
a significantly higher level of abstraction. It doesn't give away anything
about the actual location (no lvalue access) and/or lifetime of the data
member.

In each particular case you have to choose and use the one that fits
your intended design better.
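
The lvalue-access distinction can be illustrated with a short sketch. The two classes and the `bump` helper are hypothetical names chosen for the example.

```cpp
#include <cassert>

struct with_public_member {
    int data = 0;
};

class with_accessors {
public:
    int data_get() const { return data_; }   // returns a copy, not an lvalue
    void data_set(int v) { data_ = v; }
private:
    int data_ = 0;
};

// Needs an lvalue: only callable on something the client can name directly.
void bump(int& target) { ++target; }

int demo_public() {
    with_public_member a;
    int* p = &a.data;     // the client learns where the data lives
    *p = 10;
    bump(a.data);         // and can pass it on by reference
    return a.data;
}

int demo_accessors() {
    with_accessors b;
    b.data_set(10);
    int copy = b.data_get();
    bump(copy);           // modifies the local copy only
    return b.data_get();
}
```

Here `demo_public()` yields 11 and `demo_accessors()` yields 10: the accessor pair never hands out the address of the stored value, so the class remains free to store it elsewhere or not at all.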
 
Paavo Helde

Zap said:
I don't agree anymore.

Code that uses x = obj.data_get() and obj.data_set(x) is uglier,
longer to write and more difficult to read than x=obj.data and
obj.data=x .

Yes, so long as getter/setter only gets and sets.

Zap said:
Furthermore, chances that in future the implementation will need to be
changed are in general rather small: I would say less than 10%: of 10
classes with data elements, likely less than 1 will really need to be
modified in the future by putting code in the place of raw data
members.

To avoid problems with that only class over 10, the current widespread
practice is to use getters and setters also to access members of the
other 9 classes.

Nope, to my mind the first reason for setters is to check/assert/update
the invariants of the class. They are also a good place to put
breakpoints on. I could do without getters most of the time, if there
were a 'readonly' keyword in addition to private/public.

If the class has no invariants (just collects a bunch of independent
variables) then you are right, just call it a struct and forget about C++
:)
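
A minimal sketch of a setter used for this purpose, assuming a hypothetical class `progress` whose only invariant is 0 <= percent <= 100. The const getter plays the role of the missing 'readonly' keyword.

```cpp
#include <cassert>
#include <stdexcept>

class progress {
public:
    // Read-only view of the member.
    int percent() const { return percent_; }

    // The single place where the invariant is checked, and a
    // convenient line for a breakpoint.
    void set_percent(int p) {
        if (p < 0 || p > 100)
            throw std::invalid_argument("percent out of range");
        percent_ = p;
    }
private:
    int percent_ = 0;
};

// Helper for the example: reports whether the setter refused a value.
bool rejects(progress& pr, int candidate) {
    try {
        pr.set_percent(candidate);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}
```

A rejected value leaves the object untouched, so the invariant holds after every call, whether it succeeded or threw.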

Zap said:
Jerry Coffin's approach seems to me much better than getters and
setters. However I'm trying to analyze the drawbacks of Coffin's
approach. Correct me if I am wrong:

(For background: Coffin's approach suggests replacing a member variable
with a dedicated class object having suitable assignment and conversion
operators defined, bringing an example of replacing the variable with an
object interacting directly with a database.)

You seem to think that Coffin's approach should be a replacement for the
common getter/setter mechanism. It is not; it is a replacement for a
public data member, in case you suddenly want to remove the data member.
So it should fairly be compared with the public data member approach
(which I'm trying to do below):

Zap said:
- The compiler is forced to leave at least 1 byte for the nested class
X, while it would need zero for getters and setters

Not necessarily, if the class just encapsulates the same variable (and is
not polymorphic), it is of the same size. If the variable is removed
totally, it can be even less.
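
The size claim can be checked at compile time. The sketch below assumes a hypothetical non-polymorphic proxy `int_proxy` that wraps a single int; the checks hold on the common implementations I am aware of, although the standard does not strictly promise the absence of padding.

```cpp
#include <cassert>

struct plain {
    int x;
};

// Non-polymorphic proxy that encapsulates the same variable.
class int_proxy {
public:
    operator int() const { return value_; }
    int_proxy& operator=(int v) { value_ = v; return *this; }
private:
    int value_;
};

struct with_proxy {
    int_proxy x;
};

// Non-virtual member functions take no storage inside the object.
static_assert(sizeof(int_proxy) == sizeof(int),
              "proxy is expected to be no larger than the int it wraps");
static_assert(sizeof(with_proxy) == sizeof(plain),
              "owner is expected to keep its size");
```

Making any of the proxy's functions virtual would add a vtable pointer on typical implementations and break both checks.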

Zap said:
- Inheriting the class "whatever" won't let you change the getters and
setters. However: 1) with the getters and setters you anyway needed to
foresee need for inheritance and declare them virtual (*), which
usually nobody does 2) if you foresee need for inheritance, you can do
that also with Coffin's technique, by putting a reference to X inside
"whatever" instead of an instance of X. You initialize the reference
in the constructor. In the derived class you initialize it pointing to
an XDerived.

As a plain POD member cannot be overridden anyway I don't see this as a
drawback.

Zap said:
- The standard allows only 1 implicit conversion to be performed. So
if the accessing code uses X (while reading), the 1 implicit
conversion (to int, in the example) is already used. If the old
accessor code is relying on another implicit conversion to be
performed (but I see this as rather unlikely), that code will break.

I've got a feeling that a lot of old-time C++ users would vote for fewer
implicit conversions :)

Zap said:
These three drawbacks IMHO don't seem to compensate for the hassle of
using accessors for all the classes which might, in the future, need
code instead of data. However I'm waiting for comments...

You have missed the major drawback of Coffin's approach: having to write
a separate class for each such member. In general, the setters should
verify the class invariants; this means that the proxy class (X in the
example) must know about and communicate with the owner class. To do
this, it has to store a reference to the owner (or use some ugly and
non-portable offsetof() technique), and the overall complexity of the
solution is easily quadrupled. So this approach suits only what Jerry
proposed it for: when you have a public data member in some widely-used
class and you suddenly want to replace the data member by something else.
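
As an illustration of that extra coupling, here is a sketch of a proxy that has to consult its owner. The class `interval`, its invariant lo <= hi and the name `lo_proxy` are assumptions made up for the example.

```cpp
#include <cassert>
#include <stdexcept>

class interval;   // the owner, defined below

// Proxy for interval::lo. It cannot check "lo <= hi" on its own,
// so it stores a reference back to the owning object.
class lo_proxy {
public:
    explicit lo_proxy(interval& owner) : owner_(owner), value_(0) {}
    operator int() const { return value_; }
    lo_proxy& operator=(int v);   // defined once interval is complete
private:
    interval& owner_;
    int value_;
};

class interval {
public:
    interval() : lo(*this), hi(10) {}
    // Copying would leave the copy's proxy referring to the original
    // owner, so it is disabled in this sketch.
    interval(const interval&) = delete;
    interval& operator=(const interval&) = delete;

    lo_proxy lo;   // checked through the proxy
    int hi;        // left as a plain member to keep the sketch short
};

lo_proxy& lo_proxy::operator=(int v) {
    if (v > owner_.hi)
        throw std::invalid_argument("lo must not exceed hi");
    value_ = v;
    return *this;
}
```

The back reference makes the proxy larger than the int it replaces and complicates copying. A complete version would also need a proxy for `hi`, since assigning to `hi` can break the same invariant.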

For me, I have had enough occasions where I just wanted to add some check
or breakpoint to a member variable's getting/setting, and waited 20
minutes for the projects to recompile after a header change, so now I
don't hesitate any more to define proper getters/setters up front. If
the profiler shows there is a bottleneck I will bring them inline, but
this has not yet happened ;-)

Zap said:
Thanks for your comments :)

Zap

You are welcome!
Paavo
 
Cy Edmunds

Seems like we've been through this before. :p

The issue always seems to be public data vs. get/set pairs. I almost never
use either. Here's why.

For small, simple concrete classes the constructor often suffices for
setting the values:

class rgb
{
private:
    unsigned char m_r, m_g, m_b;
public:
    rgb(unsigned char i_r, unsigned char i_g, unsigned char i_b) :
        m_r(i_r), m_g(i_g), m_b(i_b) {}
    unsigned char r() const {return m_r;}
    unsigned char g() const {return m_g;}
    unsigned char b() const {return m_b;}
};

rgb a(45, 21, 122);
a = rgb(31, 55, 22); // changed my mind

For large, complex classes you *absolutely* don't want get/set pairs because
the implementation is likely to be complex and subject to relatively
frequent change. These classes pretty much demand you think in object
oriented terms -- things like sending and receiving messages rather than
setting internal data values.

Of course sometimes I use a get/set pair but only if I have a good reason. A
public data member on the other hand is like a goto: you should feel like
taking a shower after writing one.
 
emofine

It has long been acknowledged that global variables are really bad
news. Nobody really asks why any more. Public data members fall into
the same category, IMHO. So many people have suffered so much for so
long due to public data members -- and implicit conversions -- it
should be second nature by now to avoid these, which are widely
regarded as evil.

If a class is so simple that it is mostly accessors and mutators, maybe
it should shrug off its class disguise and morph into its true form, a
struct (e.g. std::pair). If it is more complex, most public members
should be functions that perform fairly complex tasks, and there should
be little, if any, need for the value of member variables. If there is,
maybe some refactoring is in order?

When I *have* needed to access class data (and we all do at some point
- look at std::string::data() or c_str()), I thanked numerous deities
on countless occasions because I used accessors and mutators to protect
the invariants. (The same goes for protected data members. A company I
once worked for had to throw away an entire large and complex C++ class
library because of protected data members, and some bad design
decisions. But that's another story...)
 
E. Robert Tisdale

Zap wrote:

[snip]
cat whatever.h
#ifndef GUARD_WHATEVER_H
#define GUARD_WHATEVER_H 1

#include <iostream>

class whatever {
public:
    class X {
    private:
        double value;
    public:
        // operators
        operator int() const {
            return (int)value;
        }
        X operator=(int new_value) {
            value = new_value;
            return X((int)value);
        }
        // constructors
        X(int x = 0): value(x) { }
    };
    // representation
    X x;
    // declared in whatever rather than in X, so that
    // argument-dependent lookup on a whatever argument finds it
    friend
    std::ostream& operator<<(
        std::ostream& os, const whatever& w) {
        return os << (int)(w.x);
    }
    // ...
};

#endif//GUARD_WHATEVER_H
cat whatever.cc
#include <whatever.h>

int main() {
    whatever w;
    w.x = 13;
    std::cout << w << std::endl;
    return 0;
}
g++ -I. -Wall -ansi -pedantic -o whatever whatever.cc
./whatever
13
 
Jerry Coffin

It seems to me that there's been a bit of hyperbole on the part of a
number of posters in this thread.

First of all, I'd be the last to claim that this general technique can
replace _every possible_ thing that might look/feel/act a bit like an
accessor and/or mutator -- quite the contrary, there are clearly
situations in which it would be counterproductive.

At the same time, much of what has been said about the possible
weaknesses seems to me to have been exaggerated. In particular, while
it's true that _some_ class invariants can't (reasonably) be enforced
in a fashion such as this, it's certainly NOT true that _none_ of them
can be -- quite the contrary, I believe quite a few can be enforced
quite easily and effectively in this fashion. About a year ago or so,
I spent a week or so looking for various libraries and such in C++, and
went through them trying to find accessor and mutator (aka
getter/setter) functions. I haven't kept exact numbers, but roughly
half of these basically did nothing -- any value you tried to set was
assigned to the underlying data member, and anytime you called the
getter, it simply returned the value of the data member. IOW, around
half were there either "because the book said so" or (hopefully) with
an eye to a future in which they might become necessary, NOT because
they provided real utility in the code as it stood.

Of the remaining that actually did something, most simply enforced a
range on the data value -- i.e. the mutator didn't assign a new value
unless it fell within the correct range, but the accessor still just
returned whatever value was in the data member.

This technique can obviously handle the first group easily. While it
may be marginally less obvious, it can handle the second quite easily
as well. Just for example, a class template like this:

#include <stdexcept>

template <class T, T minimum, T maximum>
class ranged {
    T value;

    void assign(T const &val) {
        if ((val < minimum) || (val > maximum))
            throw std::out_of_range("value");
        value = val;
    }
public:
    operator T() { return value; }
    ranged(T initial=minimum) {
        assign(initial);
    }
    ranged const &operator=(T const &new_val) {
        assign(new_val);
        return *this;
    }
};

makes it trivial to enforce a specified range on an object. At least
from the code I looked at, a large majority of accessor and mutator
functions could be replaced by this alone (which seems to fit
reasonably well with Paavo's statement about a readonly keyword).
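
For illustration, here is how such a template might be used as a public member. The template is repeated so the sketch compiles on its own, with the conversion operator made const (a small change from the version above); the `thermostat` struct and the `try_set` helper are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

template <class T, T minimum, T maximum>
class ranged {
    T value;

    void assign(T const &val) {
        if ((val < minimum) || (val > maximum))
            throw std::out_of_range("value");
        value = val;
    }
public:
    operator T() const { return value; }
    ranged(T initial = minimum) { assign(initial); }
    ranged const &operator=(T const &new_val) {
        assign(new_val);
        return *this;
    }
};

// Hypothetical client class: the member is public, yet range-checked.
struct thermostat {
    ranged<int, 5, 30> target_celsius;
};

// Helper for the example: reports whether an assignment was accepted.
bool try_set(thermostat& t, int value) {
    try {
        t.target_celsius = value;   // looks like plain assignment
        return true;
    } catch (std::out_of_range const &) {
        return false;
    }
}
```

A default-constructed `thermostat` starts at the minimum (5), accepts 21 and refuses 31 while keeping the previous value. No per-member class had to be written, only the template instantiation.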

I'd also note that a template on this general order obviates writing a
new class for each such member variable -- as noted above, the majority
simply require instantiating a template with the correct values. I'd
also note that contrary to Paavo's assertion, enforcing a class
invariant normally does not require any sort of communication:
communication is necessary when/if something _dynamic_ is being
enforced, but an invariant (by its very nature) is static. Communication
becomes necessary only when the invariant is something like a static
relationship between two objects, either of which can vary dynamically
(e.g. X <= Y/2). This is certainly a possibility, but equally certainly
is NOT necessarily the case (nor in my experience is it even typically
the case). IOW, enforcing many (IME, most) class invariants
does not require any sort of on-going communication between the proxy
and the containing class. Typically, the invariant is supplied to the
proxy at instantiation, and that's the end of it.

There are certainly still cases that require more than this template
can provide. Some can/could be cured with more elaborate proxy
classes/templates, though most are also sufficiently specialized that
it would be more or less pointless for me to try to post examples for
imaginary purposes.

Even with those, a few cases remain in which a class' invariants are
sufficiently complex that the class itself is the lowest level at which
they can be enforced. At least in my experience, however, these cases
are unusual if not downright rare. Specifically, they're sufficiently
rare that I think it disproves the argument that explicit accessor and
mutator functions should be used even when they're not really needed --
if (for example) you hurt the readability of 2% of the code to be
consistent with the remaining 98%, that would probably be perfectly
reasonable. OTOH, hurting the readability of 98% of the code to be
consistent with the remaining 2% is just plain foolish.

IMO, the claim that explicit accessor and mutator functions provide a
higher level of abstraction is simply nonsense -- quite the contrary,
they require client code to include explicit knowledge of the
_mechanism_ (function calls) being used to enforce constraints. This
technique uses the same mechanism, but hides it so it looks like a
simple assignment. At least to me, this sounds like a _higher_ level of
abstraction. This is usually reflected in the syntax of the client
code: explicit accessor/mutator function usage quickly leads to code
that looks a great deal like assembly language (and not even very good
assembly language at that).

I also think it IS sensible to compare these techniques on the basis of
readability -- they can provide the same basic capabilities in a large
majority of cases, but one does so in a fashion that I'd posit is far
more readable and understandable than the other.

Andrey got into a discussion of lifetime and such, which seems to me
completely orthogonal to the question at hand. In nearly every case, an
accessor and mutator operate on one or more private data members of an
object. These do the same. In either case, it's unusual but entirely
possible for them to operate on data with a different lifetime (in the
case of this technique, that's typically done by the outer class
containing a reference to the proxy class instead of an instance of
it).

I also agree with Cy that when possible it's better to avoid all of the
above -- I've argued in the past (and continue to believe) that in
general classes should strive to provide higher-level operations,
rather than simple access to data. Nonetheless, the result of those
higher-level operations is more or less inevitably modifying the state
of the system, or producing data, or both. These correspond to
accessing and mutating data, possibly using the value returned by the
mutator. As such, using values/assignments as a _model_ of what's going
on can be useful, even when a great deal more may be going on "under
the hood", so to speak.

I'm only aware of one circumstance in which you'd really _expect_ this
technique to lead to a difference in size of the final class: when/if
the data being manipulated is external to the object. The proxy object
is given its own address even if it contains no data, using up one byte
even if it's never really used. Since, however, this storage is never
really used, it can easily be assigned the address of storage that
would otherwise be used for padding. As such, the difference typically
falls well within the range of variation expected from one compiler to
another, or even the same compiler with different flags. If you expect
LOTS of instances of a class containing a lot of proxies, this might be
a real concern, but I've yet to see it cause a problem in real life.
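
A small sketch of the size effect, using a hypothetical data-less proxy. Only the first check is guaranteed by the language; whether the proxy's byte lands in padding depends on the member layout and the platform ABI, so the helper function reports the result rather than asserting it.

```cpp
#include <cassert>

// Proxy with no data of its own: the value lives elsewhere.
struct empty_proxy {
    operator int() const { return 42; }          // placeholder for an external read
    empty_proxy& operator=(int) { return *this; } // placeholder for an external write
};

struct owner_no_proxy {
    char flag;
    int count;
};

struct owner_padded {
    char flag;
    empty_proxy x;   // sits between flag and count
    int count;
};

// Every complete object has a nonzero size, even an empty class.
static_assert(sizeof(empty_proxy) >= 1, "objects occupy at least one byte");

// True when the proxy's byte was absorbed by alignment padding.
bool proxy_fits_in_padding() {
    return sizeof(owner_padded) == sizeof(owner_no_proxy);
}
```

On typical platforms with 4-byte int alignment, `proxy_fits_in_padding()` is true here, because the bytes after `flag` were padding anyway. Placing several proxies next to each other, or after members that leave no padding, can make the owner grow.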

While it's true that this technique "uses up" (so to speak) the one
available implicit conversion when retrieving a value, I generally
consider this an advantage. Using the one implicit conversion prevents
other accidental/unintended conversions that can allow (for example) an
unintended overloaded function to be invoked.
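
To be precise about what is "used up": as far as I understand the rules, an implicit conversion sequence may contain at most one user-defined conversion, while standard conversions such as int to double can still follow it. The sketch below checks this with `std::is_convertible`; the types `int_proxy` and `meters` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Data-less proxy whose only job here is to convert to int.
struct int_proxy {
    operator int() const { return 7; }
};

// Hypothetical type that is implicitly constructible from int.
struct meters {
    meters(int v) : value(v) {}
    int value;
};

int length_of(meters m) { return m.value; }

// int -> meters needs one user-defined conversion: allowed implicitly.
static_assert(std::is_convertible<int, meters>::value,
              "int converts to meters implicitly");

// int_proxy -> int -> meters would need two: not allowed implicitly.
static_assert(!std::is_convertible<int_proxy, meters>::value,
              "a proxy does not chain into a second user-defined conversion");

// A standard conversion after the user-defined one is still fine.
static_assert(std::is_convertible<int_proxy, double>::value,
              "int_proxy -> int -> double is permitted");
```

So code that used to call `length_of(obj.x)` with an `int` member stops compiling once `x` becomes a proxy, and has to spell out the first step, e.g. `length_of(static_cast<int>(obj.x))`.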

Ultimately, the idea that data should be made private was based on the
fact that data was "dumb", and should be protected by "intelligent"
code. This technique does NOT expose "dumb" data -- what's really
exposed is still the "intelligent" code, but it's exposed in such a way
that client code doesn't require explicit knowledge that it's code.

In the end, any implication that this is radically different from using
accessors and mutators is simply and absolutely wrong. This technique
continues to use accessor and mutator functions. What it does is use a
consistent naming convention for the accessor and mutator functions,
and chooses names that C++ treats as overloaded operators. This, in
turn, allows the mechanism involved to be ignored by client code, but
in no way changes the mechanism itself.

The place that there's a real change is subtler: anything that becomes
publicly visible as a separate item (i.e. has its own accessor and/or
mutator) must be represented as a separate class. That isn't
necessarily a change in itself, but it may impose changes, and in a few
cases those changes may be problematic. I'd be the last to advise using
a technique when it causes problems, but I'd also advise against
discarding or ignoring something because it lacks universal
applicability -- after all, if that was the requirement, we'd all have
to give up C++ and for that matter, programming in general.
 
J

Jerry Coffin

It seems to me that there's been a bit of hyperbole on the part of a
number of posters in this thread.

First of all, I'd be the last to claim that this general technique can
replace _every possible_ thing that might look/feel/act a bit like an
accessor and/or mutator -- quite the contrary, there are clearly
situations in which it would be counterproductive.

At the same time, much of what has been said about the possible
weaknesses seems to me to have been exaggerated. In particular, while
it's true that _some_ class invariants can't (reasonably) be enforced
in a fashion such as this, it's certainly NOT true that _none_ of them
can be -- quite the contrary, I believe quite a few can be enforced
quite easily and effectively in this fashion. About a year ago or so,
I spent a week or so looking for various libraries and such in C++, and
went through them trying to find accessor and mutator (aka
getter/setter) functions. I haven't kept exact numbers, but roughly
half of these basically did nothing -- any value you tried to set was
assigned to the underlying data member, and anytime you called the
getter, it simply returned the value of the data member. IOW, around
half were there either "because the book said so" or (hopefully) with
an eye to a future in which they might become necessary, NOT because
they provided real utility in the code as it stood.

Of the remaining that actually did something, most simply enforced a
range on the data value -- i.e. the mutator didn't assign a new value
unless it fell within the correct range, but the accessor still just
returned whatever value was in the data member.

This technique can obviously handle the first group easily. While it
may be marginally less obvious, it can handle the second quite easily
as well. Just for example, a class template like this:

template <class T, T minimum, T maximum>
class ranged {
T value;

void assign(T const &val) {
if ((val < minimum) || (val > maximum))
throw std::eek:ut_of_range("value");
value = val;
}
public:
operator T() { return value; }
ranged(T initial=minimum) {
assign(initial);
}
ranged const &operator=(T const &new_val) {
assign(new_val);
return *this;
}
};

makes it trivial to enforce a specified range on an object. At least
from the code I looked at, a large majority of accessor and mutator
functions could be replaced by this alone (which seems to fit
reasonably well with Paavo's statement about a readonly keyword).

I'd also note that a template on this general order obviates writing a
new class for each such member variable -- as noted above, the majority
simply require instantiating a template with the correct values. I'd
also note that contrary to Paavo's assertion, enforcing a class
invariant normally does not require any sort of communication:
communication is necessary when/if something _dynamic_ is being
enforced, by an invariant (by its very nature) is static. Communication
becomes necessary only when the invariant is something like a static
relationship between two objects, either of which can vary dynamically
(e.g. X <= Y/2). This is certainly a possibility, but qually certainly
is NOT necessarily the case (nor in my experience is it even typically
the case). IOW, many (IME, most) enforcement of most class invariants
does not require any sort of on-going communication between the proxy
and the containing class. Typically, the invariant is supplied to the
proxy at instantiation, and that's the end of it.

There are certainly still cases that require more that this template
can provide. Some can/could be cured with more elaborate proxy
classes/templates, though most are also sufficiently specialized that
it would be more or less pointless for me to try to post examples for
imaginary purposes.

Even with those, a few cases remain in which a class' invariants are
sufficiently complex that the class itself is the lowest level at which
they can be enforced. At least in my experience, however, these cases
are unusual if not downright rare. Specifically, they're sufficiently
rare that I think it disproves the argument that explicit accessor and
mutator functions should be used even when they're not really needed --
if (for example) you hurt the readability of 2% of the code to be
consistent with the remaining 98%, that would probably be perfectly
reasonable. OTOH, hurting the readability of 98% of the code to be
consistent with the remaining 2% is just plain foolish.

IMO, the claim that explicit accessor and mutator functions provide a
higher level of abstraction is simply nonsense -- quite the contrary,
they require client code to include explicit knowledge of the
_mechanism_ (function calls) being used to enforce constraints. This
technique uses the same mechanism, but hides it so it looks like a
simple assignment. At least to me, this sounds like a _higher_ level of
abstraction. This is usually reflected in the syntax of the client
code: explicit accessor/mutator function usage quickly leads to code
that looks a great deal like assembly language (and not even very good
assembly language at that).

I also think it IS sensible to compare these techniques on the basis of
readability -- they can provide the same basic capabilities in a large
majority of cases, but one does so in a fashion that I'd posit is far
more readable and understandable than the other.

Andrey got into a discussion of lifetime and such, which seems to me
completely orthogonal to the question at hand. In nearly every case, an
accessor and mutator operate on one or more private data members of an
object. These do the same. In either case, it's unusual but entirely
possible for them to operate on data with a different lifetime (in the
case of this technique, that's typically done by the outer class
containing a reference to the proxy class instead of an instance of
it).

I also agree with Cy that when possible it's better to avoid all of the
above -- I've argued in the past (and continue to believe) that in
general classes should strive to provide higher-level operations,
rather than simple access to data. Nonetheless, the result of those
higher-level operations is more or less inevitably modifying the state
of the system, or producing data, or both. These correspond to
accessing and mutating data, possibly using the value returned by the
mutator. As such, using values/assignments as a _model_ of what's going
on can be useful, even when a great deal more may be going on "under
the hood", so to speak.

I'm only aware of one circumstance in which you'd really _expect_ this
technique to lead to a difference in size of the final class: when/if
the data being manipulated is external to the object. The proxy object
is given its own address even if it contains no data, using up one byte
even if it's never really used. Since, however, this storage is never
really used, it can easily be assigned the address of storage that
would otherwise be used for padding. As such, the difference typically
falls well within the range of variation expected from one compiler to
another, or even the same compiler with different flags. If you expect
LOTS of instances of a class containing a lot of proxies, this might be
a real concern, but I've yet to see it cause a problem in real life.
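
A minimal sketch of how one might check this on a given compiler, assuming
C++11 or later for static_assert. The types empty_proxy, without_proxy and
with_proxy are hypothetical, and the actual overhead is
implementation-dependent, so the code reports it rather than promising a
particular number.

```cpp
#include <cstddef>

// Hypothetical proxy with no data members of its own (its data would live
// elsewhere, e.g. in external storage).
struct empty_proxy {};

// Even an empty class object has nonzero size.
static_assert(sizeof(empty_proxy) >= 1, "empty objects still occupy storage");

struct without_proxy {
    char tag;
    int  count;
};

struct with_proxy {
    char        tag;
    empty_proxy p;    // may land in what would otherwise be padding
    int         count;
};

// Extra bytes the proxy member costs on this compiler (often 0).
std::size_t proxy_overhead() {
    return sizeof(with_proxy) - sizeof(without_proxy);
}
```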

While it's true that this technique "uses up" (so to speak) the one
available implicit conversion when retrieving a value, I generally
consider this an advantage. Using the one implicit conversion prevents
other accidental/unintended conversions that can allow (for example) an
unintended overloaded function to be invoked.
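
As a small illustration of that rule (a sketch only, assuming C++11 for
<type_traits>; int_proxy, meters and twice are hypothetical names), the
proxy converts to int implicitly, but the compiler will not chain a second
user-defined conversion on top of it:

```cpp
#include <type_traits>

// Hypothetical proxy exposing an int through a conversion operator.
struct int_proxy {
    int value;
    operator int() const { return value; }
};

// Hypothetical type with a converting constructor from int.
struct meters {
    int n;
    meters(int v) : n(v) {}
};

int twice(int x) { return 2 * x; }

// One user-defined conversion (int_proxy -> int) is allowed implicitly...
static_assert(std::is_convertible<int_proxy, int>::value,
              "proxy reads as int");

// ...but chaining a second one (int -> meters) is not, so a function taking
// meters cannot be picked up by accident when it is passed the proxy.
static_assert(!std::is_convertible<int_proxy, meters>::value,
              "no implicit int_proxy -> int -> meters chain");
```

So twice(p) compiles for an int_proxy p, while a call to a function taking
meters would need an explicit meters(p).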

Ultimately, the idea that data should be made private was based on the
fact that data was "dumb", and should be protected by "intelligent"
code. This technique does NOT expose "dumb" data -- what's really
exposed is still the "intelligent" code, but it's exposed in such a way
that client code doesn't require explicit knowledge that it's code.

In the end, any implication that this is radically different from using
accessors and mutators is simply and absolutely wrong. This technique
continues to use accessor and mutator functions. What it does is use a
consistent naming convention for the accessor and mutator functions,
and chooses names that C++ treats as overloaded operators. This, in
turn, allows the mechanism involved to be ignored by client code, but
in no way changes the mechanism itself.

The place that there's a real change is subtler: anything that becomes
publicly visible as a separate item (i.e. has its own accessor and/or
mutator) must be represented as a separate class. That isn't
necessarily a change in itself, but it may impose changes, and in a few
cases those changes may be problematic. I'd be the last to advise using
a technique when it causes problems, but I'd also advise against
discarding or ignoring something because it lacks universal
applicability -- after all, if that was the requirement, we'd all have
to give up C++ and for that matter, programming in general.
 
Rade

Jerry Coffin said:
It seems to me that there's been a bit of hyperbole on the part of a
number of posters in this thread.

I'd say that each poster defends their own point of view by exaggerating the
problems with the others' approach... I really believe that the truth is
somewhere in-between. Public members are good (e.g. in POD structs).
Accessor and mutator methods are good (to preserve invariants in a general
case). Coffin's approach is good (in simple cases, being simpler than
accessor/mutator approach, while allowing a level of additional logic to
control access to the data). Each approach is good for a certain purpose
(but not as good for others). Best is to know them all, and then to be able
to choose the right one.

What we can discuss here is actually for which purpose is one of these
approaches better, compared to other approaches. I'd like to present
something that seems to me as a boundary case. Suppose you want to write a
class that will represent a rectangle that can lock its aspect ratio. So it
should look to external users as a plain old struct:

struct RatioLockableRectangle
{
int width;
int height;
bool locked;
};

but if locked is true, the ratio width:height has to be preserved (i.e.
changing one of them should result in change of the other, except, perhaps,
in some cases where this would lead to division by zero).

Of course, this can't be done with public data members. Certainly, it can be
done with accessors/mutators. What I am interested in is seeing a solution that
uses Coffin's approach, but which is simple enough (it is easy to think of a
complicated solution, but is there a simple one?). Actually, is there a
coding pattern (e.g. some template wizardry?) that would be applicable for
this (and similar) cases?

Sorry for my bad English, I hope you can understand what I want to say.

Rade
 
Jerry Coffin

Rade said:
Suppose you want to write a
class that will represent a rectangle that can lock its aspect ratio. So it
should look to external users as a plain old struct:
struct RatioLockableRectangle
{
int width;
int height;
bool locked;
};
but if locked is true, the ratio width:height has to be preserved (i.e.
changing one of them should result in change of the other, except, perhaps,
in some cases where this would lead to division by zero).

IMO, this isn't really a borderline case at all. Both the possibilities
being discussed are ultimately related to the implementation of a
design, and the problem here (at least IMO) is with the design itself.

This strikes me as _very_ similar to the situation with circles and
ellipses. The problem is ultimately that even though a
RatioLockedRectangle and a SimpleRectangle (or whatever) look a lot
alike, they ultimately present different interfaces to the world:
scaling one is NOT the same as scaling the other.

As such, what we should have is really two separate classes, one of
which scales both axes identically, and the other of which allows you
to scale them separately:

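// placeholder so the sketch compiles; any position type will do
struct point { int x; int y; };

struct basic_rect {
// keep from repeating these in both classes:
int height;
int width;
point position;
};

class unlocked_rect;

class ratio_locked_rect : private basic_rect {
// lets unlocked_rect's converting constructor read our private base
friend class unlocked_rect;
public:
void scale(double factor) {
height *= factor;
width *= factor;
}

// defined below, once unlocked_rect is a complete type
explicit ratio_locked_rect(unlocked_rect const &other);
};

class unlocked_rect : private basic_rect {
// lets ratio_locked_rect's converting constructor read our private base
friend class ratio_locked_rect;
public:
void scale(double h_scale, double v_scale) {
width *= h_scale;
height *= v_scale;
}

explicit unlocked_rect(ratio_locked_rect const &other) {
width = other.width;
height = other.height;
position = other.position;
}
};

inline ratio_locked_rect::ratio_locked_rect(unlocked_rect const &other) {
height = other.height;
width = other.width;
position = other.position;
}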

That's undoubtedly incomplete in both cases, but should at least give
the basic idea. The bottom line is that these are a lot alike, but
they're ultimately NOT the same things. IMO, it's also better that
scaling be done as scaling, NOT as access (direct or otherwise) to the
variables that hold the size of the rectangle. It's also possible that
there could be a public base class for both that provides a common
interface for quite a bit of what both can do, but I haven't tried to
get into that here -- the question at hand seems to be related
primarily to the scaling, which is NOT common between the two, and I
haven't tried to deal with everything else you might want to do with a
rectangle.
 
Rade

IMO, this isn't really a borderline case at all. Both the possibilities
being discussed are ultimately related to the implementation of a
design, and the problem here (at least IMO) is with the design itself.

Well, this was just an example. I don't claim that the design is very good.
I just didn't want to question this particular design, but instead I wanted
to question applicability of different coding practices to this particular
design. As you see, I refrained from criticizing any of the coding
practices (yours and others), so please refrain from criticizing my design
for a while, until we see whether this example can give some benefit to all
of us.
This strikes me as _very_ similar to the situation with circles and
ellipses. The problem is ultimately that even though a
RatioLockedRectangle and a SimpleRectangle (or whatever) look a lot
alike, they ultimately present different interfaces to the world:
scaling one is NOT the same as scaling the other.

That is true, and that was the reason why I thought this example would be
interesting to the C++ community (many people are familiar with the "circle is
(not) an ellipse" example, so they may be interested in this problem as
well, at least because they "seem" to be close to each other).

[snip]

Now, my proposal of the solution... Actually, I am not sure how good the
following example is (I am not even sure whether it is 100%
standard-compliant, as I could just be lucky that I was able to compile it
with my MSVC 7.1 compiler). Certainly, the solution should be more polished
(e.g. it is inefficient: each property contains a pointer to the containing
class).

BTW, MSVC has an extension, __declspec(property), which allows you to
declare a fake class member and two real methods (accessor and mutator), so
your code seems to access the class member, while actually the accessor and
mutator are called. I am actually interested in whether this is possible in
pure standard C++, and I'd be happy if it is.

Anyway, here is the proposed solution:

#include <iostream>
#include <stdexcept> // std::runtime_error

// Read/Write property
template <
typename Val, // Type of the property
class T, // Class having this property
Val (T::*Get)() const, // Accessor method
void (T::*Set)(Val const &) // Mutator method
>
class RwProperty
{
public:
// Remembers its 'owner' object - I couldn't do better (so far)
RwProperty(T *owner): m_Owner(owner) { }

// Remembers its 'owner' object and sets the initial value
template <class U>
RwProperty(T *owner, U const &val): m_Owner(owner)
{
(m_Owner->*Set)(val);
}

// Assignment operator - invokes mutator
template <class U>
RwProperty &operator=(U const &val)
{
(m_Owner->*Set)(val);
return *this;
}

// Conversion to value type - invokes accessor
operator Val() const
{
return (m_Owner->*Get)();
}

// An example of operators returning lvalue
template <class U>
RwProperty &operator+=(U const &u)
{
(m_Owner->*Set)((m_Owner->*Get)() + u);
return *this;
}

// The way prefix operator can be implemented
RwProperty &operator++()
{
return operator+=(1);
}

// The way postfix operator can be implemented
Val operator++(int)
{
Val v = (m_Owner->*Get)();
(m_Owner->*Set)(v + 1);
return v;
}

// Add all other necessary operators here,
// so this RwProperty can mimic an ordinary member

private:
T *m_Owner;
};

// Rectangle that can lock its aspect ratio
class RatioLockableRectangle
{
private:

// Our RwProperty template is a friend of us, as it calls private
// accessors and mutators.
// However, my code compiles even without the following declaration,
// presumably because the template parameters are pointers to members,
// not actual members, so they can be used externally (?)
template <typename Val, class T, Val (T::*Get)() const,
void (T::*Set)(Val const &)>
friend class RwProperty;

// Real width
int m_width;

// Real height
int m_height;

// Accessor for width
int get_width() const
{
return m_width;
}

// Accessor for height
int get_height() const
{
return m_height;
}

// Mutator for width
void set_width(int const &val)
{
if (locked)
{
if (m_width != 0) m_height = val * m_height / m_width;
else if (val != 0)
throw std::runtime_error("locked, width = 0");
}
m_width = val;
}

// Mutator for height
void set_height(int const &val)
{
if (locked)
{
if (m_height != 0) m_width = val * m_width / m_height;
else if (val != 0)
throw std::runtime_error("locked, height = 0");
}
m_height = val;
}

public:
// Constructor just sets up the RwProperty members
RatioLockableRectangle()
: m_width(0), m_height(0), width(this), height(this), locked(false) { }

// The rest of the public part is just trivial
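// Standard C++ requires the qualified &Class::member form here
RwProperty<int, RatioLockableRectangle,
&RatioLockableRectangle::get_width,
&RatioLockableRectangle::set_width> width;
RwProperty<int, RatioLockableRectangle,
&RatioLockableRectangle::get_height,
&RatioLockableRectangle::set_height> height;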
bool locked;
};

// Test driver
int main()
{
RatioLockableRectangle rlr;
rlr.locked = false;

rlr.width = 10;
rlr.height = 20;
std::cout << rlr.width << " " << rlr.height << std::endl;

rlr.locked = true;
rlr.width = 15;
std::cout << rlr.width << " " << rlr.height << std::endl;
rlr.height = 40;
std::cout << rlr.width << " " << rlr.height << std::endl;
rlr.width += 2;
std::cout << rlr.width << " " << rlr.height << std::endl;

rlr.width = 3;
while (6 > rlr.width++)
{
std::cout << rlr.width << " " << rlr.height << std::endl;
}

for (rlr.width = 5; rlr.width < 10; ++rlr.width)
{
std::cout << rlr.width << " " << rlr.height << std::endl;
}

return 0;
}
 
