Difference between a pointer to no data member and a pointer to the first member

somenath

I am trying to understand the difference between a pointer to no data member and a pointer to the first member of a class.

Also I have learned that “The value returned from taking the member’s address, however, is always bumped up by 1”.
To understand this concept I wrote the following simple program.

#include<iostream>
using namespace std;
class A
{
public:
int data;
int data2;

};

int main(void)
{

A a1;
int A::*first_data_member =&A::data;
int A::*second_data_member =&A::data2;
int A::*no_data_member ;
cout<<"no_data_member = "<<(no_data_member)<<endl;
cout<<"first_data_member = "<<(first_data_member)<<endl;
cout<<"second_data_member = "<<(second_data_member)<<endl;


return 0;

}

The output of the above program is as follows.
=================================================
no_data_member = 1
first_data_member = 1
second_data_member = 1

================================================

I am not able to understand the output of the program.

My expectation is that if no_data_member is 1, then first_data_member should be no_data_member + 1 and second_data_member should be first_data_member + 1.

Please let me know where I am going wrong?
 
Öö Tiib

I am trying to understanding the difference between the pointer to the
no data member and pointer to the first member of class.

Also I have learned that “The value returned from taking the
member’s address, however, is always bumped up by 1”.

Which book was that? From your code it seems that you are confused
and trying things out by trial and error without understanding
their purpose. What you are trying to program should perhaps
(I am not even sure) look like this:

#include<iostream>
class A
{
public:
int data;
int data2;
};

int main()
{
A a1;
int* first_data_member =&a1.data;
int* second_data_member =&a1.data2;
int* no_data_member = NULL;
std::cout<<"no_data_member = "<<no_data_member<<std::endl;
std::cout<<"first_data_member = "<<first_data_member<<std::endl;
std::cout<<"second_data_member = "<<second_data_member<<std::endl;
}

The output of the above program is as follows.
=================================================
no_data_member = 0
first_data_member = 0xbf876138
second_data_member = 0xbf87613c

Hope it helps.
 
somenath

Which book was that? From your code it seems that you are confused
and trying things out by trial and error without understanding
their purpose. What you are trying to program should perhaps
(I am not even sure) look like this:

The book is "Inside the C++ Object Model". Actually I am trying to understand the possible layout of a pointer to data member. The book says that "The value returned from taking the member’s address, however, is always bumped up by 1." I am also trying to understand how the compiler solves the problem of "distinguishing between a pointer to no data member and a pointer to the first data member."

For this I was trying to create an example that has a pointer to no member of the class and pointers to the first and second members of the class, so that I can understand the layout of those pointers. But I think I have failed to do that.

In the book, the example given is as follows:
///Beginning of example
class Point3d {
public:
virtual ~Point3d();
// ...
protected:
static Point3d origin;
float x, y, z;
};

float Point3d::*p1 = 0;
float Point3d::*p2 = &Point3d::x;

// oops: how to distinguish?
if ( p1 == p2 ) {
cout << " p1 & p2 contain the same value — ";
cout << " they must address the same member!" << endl;
}

"To distinguish between p1 and p2, each actual member offset value is bumped up by 1. Hence, both the compiler (and the user) must remember to subtract 1 before actually using the value to address a member."

//End of example

#include<iostream>
class A
{
public:
int data;
int data2;
};

int main()
{
A a1;
int* first_data_member =&a1.data;
int* second_data_member =&a1.data2;
int* no_data_member = NULL;
std::cout<<"no_data_member = "<<no_data_member<<std::endl;
std::cout<<"first_data_member = "<<first_data_member<<std::endl;
std::cout<<"second_data_member = "<<second_data_member<<std::endl;
}

So my question is then how can we understand the fact that
"The value returned from taking the member’s address, however, is always bumped up by 1"
 
somenath

Paavo Helde wrote:

I started to wonder myself how the data member pointers are encoded, and
adapted your program to the following:

#include <iostream>
#include <cstring>   // memcpy
#include <stdint.h>
using namespace std;

class A
{
public:
    int data;
    int data2;
};

int main(void)
{
    int A::*first_data_member = &A::data;
    int A::*second_data_member = &A::data2;
    int A::*no_data_member = NULL;

    // Only inspect the representation if it fits a 32-bit integer.
    if (sizeof(int A::*) == 4) {
        int32_t x;

        memcpy(&x, &no_data_member, 4);
        cout << "no_data_member = " << x << endl;

        memcpy(&x, &first_data_member, 4);
        cout << "first_data_member = " << x << endl;

        memcpy(&x, &second_data_member, 4);
        cout << "second_data_member = " << x << endl;
    }
}

The output with my compiler is:

no_data_member = -1
first_data_member = 0
second_data_member = 4

So it appears a NULL data member pointer is encoded by integer value -1
and otherwise they just store byte offset of the data member inside the
object. But this is all implementation-specific so one must not rely on
this.

I don't understand why you are using memcpy in the above program. Also, could you please explain the meaning of "So it appears a NULL data member pointer is encoded by integer value -1 and otherwise they just store byte offset of the data member inside the object"?
Does it mean that the value of the pointer to no member is -1 and the pointer to the first member is 0 in this case? What exactly does "encoded" mean here?
 
Öö Tiib

Actually I am trying to understand the possible layout of the pointer
to the data member.

The standard does not specify it, which leaves compiler writers a free hand.
You can have "no member" with all bits set and no bumping of the others.
You can have "no member" equal to 0 and all others bumped up by 1.

Paavo Helde understood what you are attempting better than I did. If you
want to see what is inside something, do as he did: memcpy the contents
into a suitable integer (or into an array of bytes) and output that.
Paavo used a signed integer, so the "no member" value, which on his
compiler had all bits set, was displayed as "-1".

Sending the pointer to member to cout seems to result in "1"
when the pointer points at a member and in "0" when it is initialized
with 0. Since you left your "no_data_member" uninitialized, it is
very unlikely that it had that special "no member" value, and so
it displayed "1" too.
In this book it is said that "The value returned from taking the
member’s address, however, is always bumped up by 1.".

The book describes specific implementation; it may be so but
there are no guarantees anywhere.
Also trying to understand how compiler solve the problem of
" distinguishing between a pointer to no data member and a pointer to
the first data member."

Such knowledge can be useful if you want to make some platform-specific
debugging tool, but generally you should avoid writing code that relies
on it. Using undefined features without a very good reason and
abundant commentary in code may even get your work contract terminated.
 
Kalle Olavi Niemitalo

Richard Damon said:
To throw another kink into the problem, a simple offset is NOT
sufficient for a fully generalized member pointer.

The "Itanium C++ ABI" used by GCC on multiple platforms
<http://refspecs.linuxfoundation.org/cxxabi-1.86.html#member-pointers>
says a pointer to data member is just an offset; there is no provision
for virtual bases.

Conversion of a pointer to data member across a virtual base
seems disallowed by [conv.mem] and [expr.cast] anyway, so I don't
think your sample code is valid C++:
int derived::* p1 = &base::b2;

According to [expr.unary.op], the type of &base::b2 is "int base::*".
The result then cannot be converted to "int derived::*".

I'm not sure whether the rules are any different for pointers to
member functions.
 
Kalle Olavi Niemitalo

Richard Damon said:
It looks like perhaps the line should be

int derived::* p1 = &derived::b2;

Because class derived inherits int b2 from class base, the type
of &derived::b2 is "int base::*", just like &base::b2.
An example in [expr.unary.op] demonstrates exactly this.
 
James Kanze

Which book was that? From your code it seems that you are confused
and trying things out by trial and error without understanding
their purpose. What you are trying to program should perhaps
(I am not even sure) look like this:
#include<iostream>
class A
{
public:
int data;
int data2;
};
int main()
{
A a1;
int* first_data_member =&a1.data;
int* second_data_member =&a1.data2;
int* no_data_member = NULL;
std::cout<<"no_data_member = "<<no_data_member<<std::endl;
std::cout<<"first_data_member = "<<first_data_member<<std::endl;
std::cout<<"second_data_member = "<<second_data_member<<std::endl;
}
The output of the above program is as follows.
=================================================
no_data_member = 0
first_data_member = 0xbf876138
second_data_member = 0xbf87613c
Hope it helps.

His code used pointers to members, which are completely
different. (The expression "no data member" is a bit strange,
but it looks like he meant null pointers.)
 
James Kanze

I am trying to understanding the difference between the
pointer to the no data member and pointer to the first member
of class.

By "no data member", I presume you mean a null pointer (of type
pointer to member).
Also I have learned that “The value returned from taking the
member’s address, however, is always bumped up by 1”.

That's the usual implementation, but it certainly isn't
required. An implementation could also use -1 as the
representation for null pointers when they are pointers to
(non-function) member.
To understand this concept I wrote the following simple program.

using namespace std;
class A
{
public:
int data;
int data2;
};
int main(void)
{
A a1;
int A::*first_data_member =&A::data;
int A::*second_data_member =&A::data2;
int A::*no_data_member ;
cout<<"no_data_member = "<<(no_data_member)<<endl;
cout<<"first_data_member = "<<(first_data_member)<<endl;
cout<<"second_data_member = "<<(second_data_member)<<endl;
return 0;
}
The output of the above program is as follows.
=================================================
no_data_member = 1
first_data_member = 1
second_data_member = 1
================================================
I am not able to understand the output of the program.

I am :). But I can understand your surprise at the last entry.
My expectation is if no_data_member is 1 then
first_data_member sould be = no_data_member + 1 and
second_data_member should be = first_data_member +1.
Please let me know where I am going wrong?

Two things. The first is that you never initialize
no_data_member, so using it is undefined behavior. In this
case, what's happening is that it contains random junk (which
isn't 0). You should define it as:

int A::*no_data_member = NULL; // or nullptr, if you have C++11

The second is that there isn't an overload of << for outputting
pointers to members. So the compiler tries to find a conversion
for which there is an overload. For horrible historical
reasons, pointers (including pointers to members) convert to
bool, and there is a << defined for bool, so what you're seeing
is the default output for each of the pointers defined as
a bool.

Formally, for this sort of thing, you should be dumping the
pointers using something along the lines of:

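#include <iomanip>
#include <ostream>

template <typename T>
class Dump
{
    T const& myObj;
public:
    Dump( T const& obj ) : myObj( obj ) {}
    friend std::ostream& operator<<( std::ostream& dest, Dump const& obj )
    {
        // Save the formatting state by hand and restore it at the end
        // (a small RAII state saver class would do the same job).
        std::ios_base::fmtflags const oldFlags = dest.flags();
        char const oldFill = dest.fill( '0' );
        dest.setf( std::ios_base::hex, std::ios_base::basefield );
        unsigned char const* begin
            = reinterpret_cast<unsigned char const*>( &obj.myObj );
        unsigned char const* end = begin + sizeof( T );
        for ( unsigned char const* current = begin; current != end; ++ current ) {
            if ( current != begin ) {
                dest << ' ';
            }
            // Convert to int so that the byte is printed as a number,
            // not as a character.
            dest << std::setw(2) << static_cast<int>( *current );
        }
        dest.flags( oldFlags );
        dest.fill( oldFill );
        return dest;
    }
};

template <typename T>
Dump<T>
dump( T const& obj )
{
    return Dump<T>( obj );
}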

You can then do things like:

std::cout << "no data member = " << dump( no_data_member ) << std::endl;

and get a hex dump of the values.

This is a generic solution for looking at the representation of
any type, and should be generally useful if you're trying to
figure out what the compiler is doing. Depending on what you
are looking at, you may want to modify it to dump the data as
unsigned or size_t, rather than as bytes. Or in specific cases
(like this one, perhaps), you might want to copy the pointer's bytes
into an appropriately sized unsigned integer, and use that (a pointer
to member cannot be cast directly to an integer). (Pointers
to data members are almost always implemented as an integral
type under the hood.)
 
James Kanze

The book is "Inside the C++ Object Model".

This book describes one possible way of implementing C++. In
the case of pointers to data members, I think the solution he
describes is almost universal (but I'm not sure), but for other
things, there is a lot of variance.
Actually I am trying to understand the possible layout of the
pointer to the data member.

Typically, it's just an int under the hood.
In this book it is said that "The value returned
from taking the member’s address, however, is always bumped up
by 1." . Also trying to understand how compiler solve the
problem of " distinguishing between a pointer to no data
member and a pointer to the first data member."

You just answered your question. The int value in a null
pointer to member will be 0. Otherwise, the int value will be
the physical offset of the member in the class, incremented by
one to ensure that it can never be 0.
For this I was trying to create an example which will have
a pointer to the no member of class and first and second
member of the class . So that I can understand the layout of
those pointer. But I think I have failed to do that.
In the book the example given is as follows
///Beginning of example
class Point3d {
public:
virtual ~Point3d();
// ...
protected:
static Point3d origin;
float x, y, z;
};
float Point3d::*p1 = 0;
float Point3d::*p2 = &Point3d::x;
// oops: how to distinguish?
if ( p1 == p2 ) {
cout << " p1 & p2 contain the same value — ";
cout << " they must address the same member!" << endl;
}

This is a bad example for two reasons. First, it shouldn't
compile, because the members whose address you are taking are
protected (and you're not in a derived class). Second, the
class has a virtual function, which means that it will have
(typically) a vptr at the start address. So even without
incrementing the value, &Point3d::x will not be 0.
To distinguish between p1 and p2, each actual member offset
value is bumped up by 1. Hence, both the compiler (and the
user) must remember to subtract 1 before actually using the
value to address a member. "

The user doesn't have to remember anything. The compiler takes
care of everything.
 
James Kanze

[...]
I am not getting why are you using memcpy in the above program?
Because this is about the only legal way to study what bit
pattern some or another C++ object has.

But since this is just experimenting, we don't mind a little
undefined behavior that actually works:). (In fact, the
standard does guarantee that accessing the original object
through lvalues of type char or unsigned char works. Otherwise,
the memcpy couldn't work either.)
Also
could you please explain the meaning "So it appears a NULL data member
pointer is encoded by integer value -1 and otherwise they just store
byte offset of the data member inside the object" Does it mean that
the pointer to the no member value is -1 and pointer the first member
is 0 in this case?
[...]
Here we have objects of type 'int A::*'. My compiler reports that sizeof
(int A::*) is 4. This tells me that the object state is represented by 32
bits in the memory (because I know the byte has 8 bits in my
environment). From this it is already immediately clear that data member
pointers have nothing to do with actual machine code pointers, because
those have size of 64 bits in my environment and could not fit in 32 bits
in any way.

And this intrigues me. What happens if you create an object
along the lines of:

struct S
{
char makeItBig[ 5000000000 ];
int a;
};

On a machine with 64 bit linear addressing, I would expect this
to be possible. But if 'int A::*' has a size of 4, &S::a isn't
going to fit.
 
