gk said:
are bytecode and machine code the same ?
They are similar in concept but /very/ different in practice.
In both cases they are sequences of instructions from a specific "instruction
set". An instruction set is very like a programming language in that it
defines exactly what steps the machine should follow to execute some task.
Just as programming languages can be /very/ different from each other (perhaps
more so than you yet know), instruction sets can be very different from each
other. As it happens, the instruction set called "Java bytecode", which is
used to tell the JVM what to do, is about as different as you can get from the
instruction set called "IA32 machine code", which is used to tell a Pentium
processor (or similar) what to do. Even so, despite their differences, they
are both instruction sets.
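To make the difference concrete, here is roughly what a trivial method, say

    static int add(int a, int b) { return a + b; }

looks like in each instruction set. Both listings are simplified sketches
(javap -c will show you the real bytecode):

    // Java bytecode -- a stack machine: operands are pushed onto
    // an operand stack and instructions work on whatever is there
    iload_0      // push argument a
    iload_1      // push argument b
    iadd         // pop both, push their sum
    ireturn      // return the top of the stack

    // typical IA32 (Intel syntax) -- a register machine: instructions
    // name explicit registers and memory locations
    mov eax, [esp+4]   // load a into register eax
    add eax, [esp+8]   // add b to it
    ret                // the result is returned in eax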
Since any instruction set has a well-defined meaning (if you can work it out
from the documentation -- which is often difficult), it is /always/ possible to
write a program (in whatever programming language you like) which will
interpret those instructions and thus, in software, execute the target
instructions. That's called a "virtual machine". Similarly it is almost
always possible (if you have the money) to create a hardware implementation of
the same idea -- it still executes the instructions from the instruction set,
but since it's done in hardware (and, even more importantly, since it is
created by people with /staggeringly/ large budgets) it will normally run a
lot faster.
(But I'll come back to that later).
Now, think of the Java bytecode instruction set. It has an abstract
definition, and so someone sitting down to create a JVM only has to follow that
definition and they'll produce a correct implementation ("only" !). The thing
is that there are /lots/ of different ways that you can produce a correct
implementation -- some are easy to write but run slowly, others are less easy
to write and run quite a bit faster, some are ridiculously complicated to write
and run fastest of all. (Remember that I said that an important reason for
hardware to be fast was that the engineers have the budget to create very
complicated implementations ? The same thing happens in software -- if you
have a large enough budget, and enough /really/ good programmers, then you can
create a very fast JVM.)
One effect of that is that you can't ask "how does Java implement bytecodes ?",
or even "how does the JVM do it ?". It always depends on exactly /which/ JVM
you are talking about.
So, what implementation techniques are available ? The simplest is just a big
switch statement rather like this:
for (;;)
{
    int inst = nextInstruction();
    switch (inst)
    {
    case iadd: // ...do something...
        break;
    case isub: // ...do something...
        break;
    // ...and so on for the 200 or so instructions...
    }
}
That is simple, but it is also slow (probably somewhere between 10 and 20 times
slower than what we could achieve with more cleverness). But then, it does
have some huge advantages too. One is that it is simple to write. Another is
that it takes up very little memory, since it interprets the Java bytecodes
directly, and they are fairly compact (perhaps 10 times smaller than "equivalent" IA32
machine code -- although that varies a lot). If I remember correctly, JDK
1.0.2 (a long time ago) used that technique exclusively.
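If you want to see that shape actually run, here is a minimal, self-contained
sketch in Java. The two-opcode instruction set (PUSH/ADD/HALT) is made up for
illustration -- real JVM bytecode has around 200 opcodes and a much richer
execution model:

public class TinyInterpreter
{
    // A made-up instruction set, just to make the sketch runnable.
    static final int PUSH = 0;  // push the following operand onto the stack
    static final int ADD  = 1;  // pop two values, push their sum
    static final int HALT = 2;  // stop execution

    public static void main(String[] args)
    {
        int[] code = { PUSH, 2, PUSH, 3, ADD, HALT };
        int[] stack = new int[16];
        int sp = 0;   // stack pointer
        int pc = 0;   // program counter

        loop:
        for (;;)
        {
            int inst = code[pc++];
            switch (inst)
            {
            case PUSH:
                stack[sp++] = code[pc++];
                break;
            case ADD:
                int b = stack[--sp];
                int a = stack[--sp];
                stack[sp++] = a + b;
                break;
            case HALT:
                break loop;
            }
        }
        System.out.println(stack[sp - 1]);   // prints 5
    }
}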
One thing you can do to improve on that is to rewrite the above interpreter
loop in assembler -- that will gain you some extra speed (not a huge amount,
but some), but is quite a bit harder to do. I think that JDK 1.1 was the first
Sun JVM which had its main loop written in assembler.
A different approach would be to translate the bytecode into machine code. Two
simple approaches are to do that unconditionally as each class is loaded in, or
to wait until a method is executed before translating it (doing "Just In Time"
translation -- JITing). That is probably simpler than writing a hand-crafted
interpreter loop in assembler, and will run quite a lot faster. The problem is
that it takes up a lot of memory for "compiled" (by which I mean: translated
into machine code) methods which may never be executed or may be executed only
once. Another problem is that although it's not too difficult to translate
bytecode into machine code in a simple-minded way, the resulting machine code
is nothing like as fast as would be produced by, say, an optimising C compiler.
But if we take the time (and invest the development resources) to optimise the
machine code well, then the program will spend most of its time optimising
stuff, and so it will /still/ seem to run slowly. I think that JDK 1.2 was the
first from Sun to use this kind of technique.
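To give a feel for the translate-once idea without actually emitting machine
code, here is a loose analogy in pure Java, using the same made-up
PUSH/ADD/HALT instruction set as the interpreter sketch above. The bytecode is
decoded exactly once, into an array of directly executable operations; a real
JIT goes much further and emits native instructions, which this sketch does
not attempt:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class TinyJit
{
    static final int PUSH = 0, ADD = 1, HALT = 2;

    static class Frame { int[] stack = new int[16]; int sp = 0; }

    // Translate the bytecode once, when the "method" is first compiled.
    static List<Consumer<Frame>> translate(int[] code)
    {
        List<Consumer<Frame>> ops = new ArrayList<>();
        int pc = 0;
        while (code[pc] != HALT)
        {
            switch (code[pc++])
            {
            case PUSH:
                int value = code[pc++];            // operand decoded once, here
                ops.add(f -> f.stack[f.sp++] = value);
                break;
            case ADD:
                ops.add(f -> { int b = f.stack[--f.sp];
                               int a = f.stack[--f.sp];
                               f.stack[f.sp++] = a + b; });
                break;
            }
        }
        return ops;
    }

    public static void main(String[] args)
    {
        List<Consumer<Frame>> compiled =
            translate(new int[] { PUSH, 2, PUSH, 3, ADD, HALT });
        Frame f = new Frame();
        for (Consumer<Frame> op : compiled)
            op.accept(f);                          // no decode/dispatch loop
        System.out.println(f.stack[f.sp - 1]);     // prints 5
    }
}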
One obvious improvement would be to combine a simple interpreter with JITing.
So you use the slow interpreter for methods which are not called often, and
then use faster compiled code for the ones that are. Another possibility would
be to translate all methods to machine code before executing them the first
time, but only use a quick-and-easy translator at first and reserve the
optimising translator for methods which are run often. Or you could combine
all three -- use the interpreter the first few times the method was executed,
use a simple compiler if it is used more often, and use a complicated
optimising compiler for code which is executed a lot. That way you don't waste
time or space compiling or optimising methods which don't need it, and so can
afford to spend a lot more effort on optimising the code that /does/ need it.
That's the approach that Sun use these days. They call their version of it
"HotSpot" because it concentrates its optimisation efforts on the hot-spots in
the code. It is also massively complicated. That's mostly because optimisation
is always complicated (if it isn't complicated then you aren't trying hard
enough ;-). But it is also complicated because it has to change code (machine
code) while that code is running (and in a thread-safe way), and also has to be
able to keep track of what optimisations it has made, and why, so that it can
be ready to undo them again if something happens (like a new class being
loaded) which might invalidate the assumptions the optimisation was based on.
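Here is a hedged sketch of just the counting mechanism behind that tiering, in
plain Java. The threshold, and the idea of swapping in a pre-built fast path,
are stand-ins; real HotSpot uses per-method and per-loop counters, several
compiler tiers, and can deoptimise again, none of which is modelled here:

import java.util.function.IntBinaryOperator;

public class TieredSketch
{
    static final int COMPILE_THRESHOLD = 10; // made-up number

    int invocations = 0;
    IntBinaryOperator fastPath = null;       // filled in once the method is "hot"

    int add(int a, int b)
    {
        if (fastPath != null)
            return fastPath.applyAsInt(a, b); // tier 2: the "compiled" form
        if (++invocations >= COMPILE_THRESHOLD)
            fastPath = (x, y) -> x + y;       // pretend this lambda is JITed code
        return a + b;                         // tier 1: interpreter stand-in
    }

    public static void main(String[] args)
    {
        TieredSketch method = new TieredSketch();
        for (int i = 0; i < 20; i++)
            System.out.println(method.add(i, i)); // switches path after 10 calls
    }
}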
(d) when we compile with javac ----> is the JVM invoked ?
Yes, but only incidentally. The Java compiler, javac, happens to be written in
Java, so the program needs a JVM to run. However, you could create an
equivalent compiler (written in a different language) which didn't need a JVM
to execute.
(e) when we run with the java command --> is the JVM invoked ?
Very definitely yes. That's what the java command is /for/. It is, or it
contains (however you prefer to think of it) the software implementing the JVM
spec.
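To tie (d) and (e) together, with a hypothetical Hello.java:

    $ javac Hello.java   # runs the compiler; javac is a Java program, so a JVM starts to run it
    $ java Hello         # starts a JVM and asks it to execute the bytecode in Hello.class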
-- chris