Objects -- Instances vs. References

Hal Vaughan

I've had a few questions on objects, pointers, different instances and such
before. I'm self taught and this is one aspect of OOP that has been the
hardest for me to fully grasp. I think I've finally got the hang of it,
because now I'm looking at objects differently.

That leads to a few questions, to see if I really do "get it." Since
objects are instantiated once and every time they're used in a different
class or method, we're only dealing with a pointer to the one object, then
I'm wondering about these two code snippets:

Vector testVec = new Vector();
MyObject myObj = new MyObject();

//Adding myObj as item 0
testVec.add(myObj);

//Variation #1, used in another method elsewhere:
MyObject testObj = (MyObject) testVec.get(0);
testObj.doSomeMethod();

//Variation #2, also used elsewhere:
((MyObject) testVec.get(0)).doSomeMethod();

//Variation #3, also used elsewhere:
Object testObj = testVec.get(0);
((MyObject) testObj).doSomeMethod();

From what I understand, variations 1 & 3 may add another pointer to the
original myObj, but #2 would not do that -- just access the object
directly. But as I think about it more, I can see why *I* need the other
reference in version #1 to make it easier to follow, but I can also see
that the compiler would just keep track of the reference and not need to
even add another pointer -- just keep using the first one. Is that right?
When those variations are compiled, is there much (or any) difference that
affects size or speed of operation? Or any difference overall by the time
they're compiled?

This group has been most helpful with giving explanations and background on
things like this. I do have a number of reference books, but I'm trying to
fill in the gaps in my understanding. Any help on this is most appreciated.
While it doesn't affect whether or not I can get a program to work, it does
help my overall grasp of the language.

Thanks!

Hal
 
Stefan Ram

Hal Vaughan said:
Vector testVec = new Vector();
MyObject myObj = new MyObject();
testVec.add(myObj);
//Variation #1, used in another method elsewhere:
MyObject testObj = (MyObject) testVec.get(0);

If this is used "elsewhere", the identifier "testVec" might
refer to a different object than the identifier "testVec" in the
code containing "add". So one cannot assert much about it.

Even if it refers to the same object, it might have been
changed arbitrarily in the meantime.

Moreover, we do not know the meaning of "add" or "get", we can
only guess.

So there is so much to guess in order to analyze these code parts.

Therefore, it is often better to post a small compilable unit.
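
To make the concern concrete, here is a minimal compilable sketch. The names `StaleIndexDemo`, the nested `MyObject` and the helper `firstElementIsOriginal` are placeholders invented for illustration, not code from the original post. It shows that if anything inserts into the Vector between the `add` and the `get`, index 0 no longer holds the object that was added first.

```java
import java.util.Vector;

public class StaleIndexDemo {
    // Placeholder standing in for the poster's MyObject (hypothetical).
    static class MyObject {
        private final String label;
        MyObject(String label) { this.label = label; }
        String doSomeMethod() { return "called on " + label; }
    }

    // Returns true if element 0 is still the very same object that was added first.
    static boolean firstElementIsOriginal(boolean modifyInBetween) {
        Vector<Object> testVec = new Vector<>();
        MyObject myObj = new MyObject("original");
        testVec.add(myObj);

        if (modifyInBetween) {
            // Some other code inserts at the front, shifting myObj to index 1.
            testVec.add(0, new MyObject("intruder"));
        }

        MyObject testObj = (MyObject) testVec.get(0);
        return testObj == myObj; // == compares references, not contents
    }

    public static void main(String[] args) {
        System.out.println(firstElementIsOriginal(false)); // true
        System.out.println(firstElementIsOriginal(true));  // false
    }
}
```

With everything in one unit like this, a reader no longer has to guess what `add`, `get` or "elsewhere" mean.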
 
Hal Vaughan

Stefan said:
If this is used "elsewhere", the identifier "testVec" might
refer to a different object than the identifier "testVec" in the
code containing "add". So one cannot assert much about it.

Even if it refers to the same object, it might have been
changed arbitrarily in the meantime.

Moreover, we do not know the meaning of "add" or "get", we can
only guess.

So there is so much to guess in order to analyze these code parts.

Therefore, it is often better to post a small compilable unit.

Okay -- what if it is used in the same class, in a different method, with no
changes?

Hal
 
Robert Klemme

Hal said:
I've had a few questions on objects, pointers, different instances and such
before. I'm self taught and this is one aspect of OOP that has been the
hardest for me to fully grasp. I think I've finally got the hang of it,
because now I'm looking at objects differently.

That leads to a few questions, to see if I really do "get it." Since
objects are instantiated once and every time they're used in a different
class or method, we're only dealing with a pointer to the one object, then
I'm wondering about these two code snippets:

Vector testVec = new Vector();
MyObject myObj = new MyObject();

//Adding myObj as item 0
testVec.add(myObj);

//Variation #1, used in another method elsewhere:
MyObject testObj = (MyObject) testVec.get(0);
testObj.doSomeMethod();

//Variation #2, also used elsewhere:
((MyObject) testVec.get(0)).doSomeMethod();

//Variation #3, also used elsewhere:
Object testObj = testVec.get(0);
((MyObject) testObj).doSomeMethod();

From what I understand, variations 1 & 3 may add another pointer to the
original myObj, but #2 would not do that -- just access the object
directly. But as I think about it more, I can see why *I* need the other
reference in version #1 to make it easier to follow, but I can also see
that the compiler would just keep track of the reference and not need to
even add another pointer -- just keep using the first one. Is that right?
When those variations are compiled, is there much (or any) difference that
affects size or speed of operation? Or any difference overall by the time
they're compiled?

I am not exactly sure what you mean by "add pointer" and why this should
be something to be concerned about. Ignoring changes to testVec by
other code you simply get more variables which reference the object
which is referred to by myObj. Your assumption about the compiler not
having to create a second variable is wrong because it has no idea about
the semantics of Vector; so it does not know that get(0) actually
returns a reference to myObj. The situation might be different if you
simply used "MyObject testObj = myObj;" - the compiler could at least
detect the situation and optimize one of the two away if data flow
analysis shows that both cannot be out of sync (i.e. must under all
circumstances contain the same reference). Whether current Java
compilers actually do this I don't know. HTH
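
One way to see why the compiler cannot assume that get(0) returns the object that was added: `get` is an ordinary overridable method. The sketch below uses a made-up subclass, `SurprisingVector`, and a made-up helper, `returnsWhatWasAdded`; neither comes from the original post, and they are only meant as an illustration.

```java
import java.util.Vector;

public class OpaqueGetDemo {
    // Hypothetical Vector subclass whose get() ignores what was stored.
    static class SurprisingVector extends Vector<Object> {
        @Override
        public synchronized Object get(int index) {
            return new Object(); // hands back a fresh object on every call
        }
    }

    // True if get(0) hands back the same object that was just added.
    static boolean returnsWhatWasAdded(Vector<Object> vec) {
        Object myObj = new Object();
        vec.add(myObj);
        return vec.get(0) == myObj;
    }

    public static void main(String[] args) {
        System.out.println(returnsWhatWasAdded(new Vector<>()));         // true
        System.out.println(returnsWhatWasAdded(new SurprisingVector())); // false
    }
}
```

Both calls compile against the same `Vector` type, so the answer is only known at run time.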

Kind regards

robert


PS: MyObject might not be the best name for a /class/.
 
Mike Schilling

Hal said:
I've had a few questions on objects, pointers, different instances
and such before. I'm self taught and this is one aspect of OOP that
has been the hardest for me to fully grasp. I think I've finally got
the hang of it, because now I'm looking at objects differently.

That leads to a few questions, to see if I really do "get it." Since
objects are instantiated once and every time they're used in a
different class or method, we're only dealing with a pointer to the
one object, then I'm wondering about these two code snippets:

This sentence already indicates that you don't. Objects are instantiated
when the new operator executes. Period.

Being self-taught is fine, but there are times when you need to read a book.
Any good introduction to Java should explain objects vs. references to
objects. Sorry I can't recommend one, but I'm sure others reading this
can.
 
Hal Vaughan

Mike said:
This sentence already indicates that you don't. Objects are instantiated
when the new operator executes. Period.

I've had a number of people say that a new object is not instantiated if it
references an older object, but that only a reference to the original
object is passed. That was made clear by a number of posters in a thread
started by a previous question of mine.

So if I create a new object like this:

MyObject myObj = myVector.get(i);

I've been told that a new object is not created, but that myObj is merely a
pointer to the object already referred to in the Vector.

I'm sure it comes down to technical details, but it's hard to see why a
number of posters have said an object is not created and only a reference
to an existing object (in this case) is created, yet you state an actual
separate object is created.

Mike said:
Being self-taught is fine, but there are times when you need to read a
book. Any good introduction to Java should explain objects vs. references
to objects. Sorry I can't recommend one, but I'm sure others reading this
can.

Done that. Several. Without a previous background and a learning
disability (connected with symbol and word recognition, which makes dealing
with things like OOP especially difficult), it's not always easy without
getting some feedback.

Hal
 
Mike Schilling

Hal said:
I've had a number of people say that a new object is not instantiated
if it references an older object, but that only a reference to the
original object is passed. That was made clear by a number of
posters in a thread started by a previous question of mine.

So if I create a new object like this:

MyObject myObj = myVector.get(i);

I've been told that a new object is not created, but that myObj is
merely a pointer to the object already referred to in the Vector.

That's correct.

Hal said:
I'm sure it comes down to technical details, but it's hard to see why
a number of posters have said an object is not created and only a
reference to an existing object (in this case) is created, yet you
state an actual separate object is created.

No, I didn't say that. I said that an object is created only by the "new"
operator, for instance

new Integer(12); // creates a new object

while

MyObject myObj = myVector.get(i); // no call to "new"

does not create any objects.
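
A quick way to check this is to count constructor calls. The sketch below is only an illustration; the `created` counter, the class name `CreationCountDemo` and the helper `countAfterAliasing` are assumptions added here, not part of the original code.

```java
import java.util.Vector;

public class CreationCountDemo {
    // Placeholder MyObject that counts how often its constructor runs.
    static class MyObject {
        static int created = 0;
        MyObject() { created++; }
    }

    static int countAfterAliasing() {
        MyObject.created = 0;
        Vector<MyObject> myVector = new Vector<>();
        myVector.add(new MyObject());      // one "new": one object
        MyObject myObj = myVector.get(0);  // no "new": no object created
        MyObject another = myObj;          // still no "new"
        return MyObject.created;
    }

    public static void main(String[] args) {
        System.out.println(countAfterAliasing()); // 1
    }
}
```

Three variables and a Vector slot have held the reference, but the constructor ran exactly once.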
 
Hal Vaughan

Mike said:
That's correct.


No, I didn't say that. I said that an object is created only by the "new"
operator, for instance

new Integer(12); // creates a new object

while

MyObject myObj = myVector.get(i); // no call to "new"

does not create any objects.

Thanks. I follow that. That helps me get that straight.

Now, allowing for myObj of class MyObject being element 0 in testVec, in
this example, does the first line actually create source code, or is it
just used by the compiler to connect my new symbol, "testObj" with the
original "myObj"? In other words, when I'm using the first line to make it
clear for me, so I don't have to use
((MyObject) testVec.get(0)).doSomeMethod(), does adding the symbol that I, as
a human, need lead to the compiler also creating a new symbol, or does the
compiler not need it?

MyObject testObj = (MyObject) testVec.get(0);
testObj.doSomeMethod();

From what I understand (and when it comes to the compiler and byte code, my
understanding is very limited), it seems the JRE would not need that first
line, since all I'm doing is creating another symbol for the already existing
object. I, as a human, may need that symbol, but does the JRE need it?

That's the part I'm trying to grasp right now.

Thanks for your help, by the way. I do get a lot from books, but there are
some things I need clarified.

Hal
 
Chris Uppal

Hal said:
From what I understand, variations 1 & 3 may add another pointer to the
original myObj, but #2 would not do that -- just access the object
directly.

Not quite. #1 and #3 create /named/ variables (both called "testObj" as it
happens) which contain pointers to the original object, but #2 also involves a
new reference to the old object -- it's just that that reference is temporary
and is never given a name.

Neither you nor the compiler can /ever/ do anything with an object except
via a reference. The only choice you have is whether to store that reference
in a variable with a name, or leave it up to the compiler to put the reference
in a temporary place (on the stack, in fact) for only as long as it is needed.

In general there is little difference in performance (time or space) between
the two options -- it's just a matter of programming convenience.
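
As a rough illustration, the three variations can be put side by side in one compilable unit. The `calls` counter and the names `ThreeVariationsDemo`, `callAllThreeWays`, `testObj1` and `testObj3` are assumptions made for this sketch; the original post does not show what doSomeMethod() does.

```java
import java.util.Vector;

public class ThreeVariationsDemo {
    // Placeholder MyObject: doSomeMethod() just counts its own invocations.
    static class MyObject {
        int calls = 0;
        void doSomeMethod() { calls++; }
    }

    static int callAllThreeWays() {
        Vector<Object> testVec = new Vector<>();
        MyObject myObj = new MyObject();
        testVec.add(myObj);

        // Variation #1: reference stored in a named MyObject variable
        MyObject testObj1 = (MyObject) testVec.get(0);
        testObj1.doSomeMethod();

        // Variation #2: reference used as an unnamed temporary
        ((MyObject) testVec.get(0)).doSomeMethod();

        // Variation #3: reference stored as Object, cast at the call site
        Object testObj3 = testVec.get(0);
        ((MyObject) testObj3).doSomeMethod();

        return myObj.calls; // every call landed on the one object
    }

    public static void main(String[] args) {
        System.out.println(callAllThreeWays()); // 3
    }
}
```

All three calls reach the same object through a reference; the only difference is whether that reference was given a name.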

-- chris
 
Chris Uppal

[I wrote this, and then re-read your post, and I think you may have got further
than I originally thought (as indicated in my first paragraph) -- but I'm not
certain, and the thing's written already, so I may as well post it. Apologies if
it is only telling you what you already know]

Hal said:
From what I understand, variations 1 & 3 may add another pointer to the
original myObj, but #2 would not do that -- just access the object
directly. But as I think about it more, I can see why *I* need the other
reference in version #1 to make it easier to follow, but I can also see
that the compiler would just keep track of the reference and not need to
even add another pointer -- just keep using the first one. Is that right?
When those variations are compiled, is there much (or any) difference that
affects size or speed of operation? Or any difference overall by the time
they're compiled?

I think you are still working your way out of your original misunderstanding --
you are getting closer but are not there yet.

I'll try a slightly different approach from the ones others have taken and talk
about it in terms closer to the implementation. (The "implementation" is
completely hypothetical -- real JVMs may be very different).

Start with this statement.

MyObject myObj;

That doesn't create /anything/, it just tells the compiler to reserve a 32-bit
slot on the stack for later use.

Now consider this (as a complete statement all by itself):

new MyObject();

That tells the JVM to create a new instance of class MyObject. Let's say that
MyObjects require 132 bytes each (for their various instance fields, and for
each to contain a (hidden) reference back to the class it's an instance of).
So you have just allocated 132 bytes and called the MyObject constructor, which
will do any necessary initialisation. The Java system makes available a 32-bit
pointer or reference (the two terms are synonyms in this context) to that
object, but because we haven't any code to do anything with that, the pointer
is just discarded. In all likelihood the newly created object will be
reclaimed by the GC (garbage collector) in a short while. A short life, but a
happy one. Note that it /was/ a life -- the object could have done quite a lot
while it was alive (e.g. the constructor could contain code to delete all your
files) -- so it is in no sense trivial or "just an implementation detail" that
it was created.

If we now create a second instance of MyObject with:

myObj = new MyObject();

then the compiler will arrange to save the pointer to the new object in the
32-bit variable called "myObj" (remember we set up that variable earlier). You
can (I hope) see how there isn't "an object called myObj"; just a variable
called myObj which holds a reference/pointer to an (anonymous) object.

If we now create a third instance of MyObject with:

myObj = new MyObject();

then myObj will now contain a pointer to the third object; and the second one
will have (presumably) no more references to it, and will soon die.

We've now created three instances of MyObject. None of them have any right to
be considered to "be" myObj. They were all anonymous. They all were born,
two of them died, but none of them knew anything about a variable called myObj
since objects don't have any way to tell what variables point to them. There
may be one, there may be many. The object doesn't know or care -- except that
it will die once there are no more references to it left (but objects have a
stoical attitude and normally face death with indifference).

OK, now let's add another variable.

MyObject otherObj;

We've just told the compiler to reserve another 32-bit slot (on the stack).
We'll use that:

otherObj = myObj;

Now there are /two/ references to that third anonymous object we created. I'm
getting sick of typing anonymous object, so from now on I'm going to call it
"Fred". We have two references to Fred. Let's lose one of them:

myObj = null;

Now there is only one reference to Fred. If we lost that then Fred would
die -- hardly fair now that it has a name and everything. So, for safety:

Vector testVec = new Vector();
testVec.add(otherObj);

Now there are two references to Fred again. One is in the 32-bit variable
called otherObj, the second is somewhere buried away inside the Vector we have
just created.

OK. Now lose one reference to Fred:

otherObj = null;

Now the only reference to Fred is inside the Vector. If we throw that away:

testVec = null;

then the Vector object will (presumably) die. And when it does, so will
Fred....
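
The walk-through up to this point can be followed in code. In this sketch each instance gets a serial number so the three objects can be told apart; the `id` and `nextId` fields and the names `FredDemo` and `idHeldByVectorAtEnd` are assumptions added for illustration. Garbage collection itself is not shown, since when it runs is up to the JVM.

```java
import java.util.Vector;

public class FredDemo {
    // Placeholder MyObject with a serial number per instance.
    static class MyObject {
        static int nextId = 1;
        final int id = nextId++;
    }

    static int idHeldByVectorAtEnd() {
        MyObject.nextId = 1;
        MyObject myObj;                // a variable; no object yet
        new MyObject();                // object 1: its reference is discarded at once
        myObj = new MyObject();        // object 2
        myObj = new MyObject();        // object 3 ("Fred"); object 2 is now unreachable

        MyObject otherObj = myObj;     // two references to Fred
        myObj = null;                  // one reference left

        Vector<Object> testVec = new Vector<>();
        testVec.add(otherObj);         // two references again
        otherObj = null;               // only the Vector refers to Fred now

        return ((MyObject) testVec.get(0)).id;
    }

    public static void main(String[] args) {
        System.out.println(idHeldByVectorAtEnd()); // 3
    }
}
```

Both named variables end up null, yet the third object is still reachable through the Vector.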



The thing to remember is that:

MyObject myObj = new MyObject();

is /only/ a short-hand for:

MyObject myObj;
// other code which doesn't use myObj could go here
myObj = new MyObject();


-- chris
 
Matt Humphrey

Hal said:
Thanks. I follow that. That helps me get that straight.

Now, allowing for myObj of class MyObject being element 0 in testVec, in
this example, does the first line actually create source code, or is it
just used by the compiler to connect my new symbol, "testObj" with the
original "myObj"?

Source code is only what you type in. No lines ever create source code--they
*are* source code. The compiler uses them to produce bytecode, but it's
probably not helpful at this stage to think about what the compiler and
bytecode are doing.

I think you're still missing a key concept. Your variables "myObj" and
"testObj" are containers that hold references to objects. The object you
created does not have a name--which is why pictures in books are often
needed to establish the concept. This line:

MyObject myObj = new MyObject();

creates a new object with a unique reference floating in memory and puts
that reference into myObj. You might think of the reference as a hidden
name such that no two objects ever have exactly the same name, so for
convenience call this reference <R-101>. Floating out in memory, but
identified by <R-101> is all the information about that object you created.
You cannot call that object "myObj" because that is not its name. "myObj"
is the name of a variable whose contents is <R-101> (the reference to the
object). So if you continue with

MyObject myObj2 = new MyObject ();

You get another object and we'll call this one <R-231>. So we have 2
different instances and 2 variables whose contents point to those instances.

MyObject myObj3 = myObj;

This creates a new variable and puts into that variable the contents of the
variable of myObj, which is <R-101> the reference to the R-101 object.
There are still only 2 objects, but 3 variables. Now you can write

myObj = myObj2;

By now you should be saying to yourself, "Oh, this copies the contents of
the variable myObj2 to the variable myObj" and realize that that means that
whatever reference is contained in myObj2 has been copied into myObj. myObj
will now point to <R-231>.

Look here for a picture of what this means
http://www.iviz.com/memory pointer.gif

Exactly the same thing happens with your vector, with the reference being
placed inside the vector and then copied out to a new variable.
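
The four statements above can be run as one small program. The `tag` field is an assumption added here so each object can carry the label (R-101, R-231) used in the explanation; real references have no such visible name. `ReferenceCopyDemo` and `run` are likewise made-up names.

```java
public class ReferenceCopyDemo {
    // Placeholder MyObject carrying a label that stands in for its reference.
    static class MyObject {
        final String tag;
        MyObject(String tag) { this.tag = tag; }
    }

    // Returns the tags seen through myObj, myObj2 and myObj3 after the assignments.
    static String[] run() {
        MyObject myObj = new MyObject("R-101");
        MyObject myObj2 = new MyObject("R-231");
        MyObject myObj3 = myObj;   // copies the reference; still only two objects
        myObj = myObj2;            // myObj now refers to the R-231 object
        return new String[] { myObj.tag, myObj2.tag, myObj3.tag };
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", run())); // R-231, R-231, R-101
    }
}
```

The R-101 object is untouched by the last assignment; only the contents of the variable myObj changed.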

Hal said:
In other words, when I'm using the first line to make it
clear for me, so I don't have to use
((MyObject) testVec.get(0)).doSomeMethod(), does adding the symbol that I, as
a human, need lead to the compiler also creating a new symbol, or does the
compiler not need it?

MyObject testObj = (MyObject) testVec.get(0);
testObj.doSomeMethod();

From what I understand (and when it comes to the compiler and byte code, my
understanding is very limited), it seems the JRE would not need that first
line, since all I'm doing is creating another symbol for the already existing
object. I, as a human, may need that symbol, but does the JRE need it?

The purpose of variables is to give you a name for referring to values and
objects. A necessary milestone for beginning programming is to distinguish
the variable from the object (not using the variable name as the name of the
object and not trying to have variables refer to other variables). The
compiler does set aside space for the variable, but is free to eliminate it
or create new ones as long as the effect is the same.


Matt Humphrey (e-mail address removed) http://www.iviz.com/
 
Hal Vaughan

Chris said:
[I wrote this, and then re-read your post, and I think you may have got further
than I originally thought (as indicated in my first paragraph) -- but I'm not
certain, and the thing's written already, so I may as well post it. Apologies
if it is only telling you what you already know]

While there is some of what I already know, this helps a lot in clearing it
up in my head! Thanks to you and others in this thread for taking the time
to answer and helping me work this out. I appreciate it!

Hal
 
