jacob navia
In a recent discussion, we have seen the "regulars" group say they do
not use a debugger. Just reading the source code is enough. Wild claims
are being made. For instance, a certain Mr "Heathfield" claims
<quote >
<[email protected]>
On a number of occasions I have (correctly) debugged code that I haven't
even seen, let alone run through a debugger.
<end quote>
This series of installments is dedicated to people who aren't
super heroes, "regulars in clc" or similar, and will use a debugger.
I will assume, then, that you need a debugger and want to know how
it works.
------------------------------------------------------------------------
A debugger is a normal program that starts another program (the
"debuggee" or, in gdb terminology, the "inferior") that is going
to run under the supervision of the debugger.
The debugger uses the information generated by the compiler to
associate the source code with the machine addresses that the program
is executing, and shows the corresponding source code.
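The address-to-source association can be pictured as a lookup in a
sorted table. What follows is only a minimal sketch: the struct
line_entry layout and the name line_for_address are hypothetical, made
up for illustration, and real debug formats encode this table in a far
more compact way.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a (hypothetical) line table: the first machine address
   generated for a given source line. Rows are sorted by address. */
struct line_entry {
    uintptr_t address;
    int line;
};

/* Returns the source line of the last entry whose address is <= pc,
   or -1 if pc lies before the first entry. Binary search. */
int line_for_address(const struct line_entry *table, size_t count,
                     uintptr_t pc)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].address <= pc)
            lo = mid + 1;   /* candidate row, look further right */
        else
            hi = mid;       /* row starts after pc, look left */
    }
    return lo == 0 ? -1 : table[lo - 1].line;
}
```

When the inferior stops, the debugger would read its program counter,
call something like line_for_address(), and display that source line.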
The debugger takes control of the running program by establishing a
*breakpoint*. This can be done in (basically) two ways:
1: By inserting a breakpoint instruction at specific positions in the
code.
2: By using the compiler to generate stop instructions that will make
the program stop and wait for instructions from a supervisor program.
The first solution is used when the debugger has access to the
code stream and can modify it. This is the most common solution on
workstations and advanced CPUs.
The second solution is used when the code stream is not writable (for
instance, the code is in ROM or EPROM and it is not possible to insert
anything into it).
The second solution is also the only solution available when the CPU
has no "break" instruction and no single step execution mode. In that
case, the debugger must provide this functionality.
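To make the second solution concrete, here is a rough sketch of what
compiler-generated stops could look like. All the names (debug_check,
add_breakpoint, instrumented_sum) are hypothetical and invented for
this example; it is not how any particular compiler does it. The idea
is just that the code itself asks a supervisor, before each statement,
whether it should stop, so nothing has to be written into the code
stream at run time.

```c
#include <stddef.h>

#define MAX_BREAKPOINTS 16

static int breakpoint_lines[MAX_BREAKPOINTS]; /* lines where we must stop */
static size_t breakpoint_count;
static int stops_taken;                       /* how many times we stopped */

/* Supervisor side: register a line to stop at. Returns 0 on success. */
int add_breakpoint(int line)
{
    if (breakpoint_count == MAX_BREAKPOINTS)
        return -1;
    breakpoint_lines[breakpoint_count++] = line;
    return 0;
}

/* The hook the compiler would emit before each statement. A real one
   would block here and wait for commands from the supervisor; this
   sketch only counts the stop. */
void debug_check(int line)
{
    for (size_t i = 0; i < breakpoint_count; i++) {
        if (breakpoint_lines[i] == line) {
            stops_taken++;
            return;
        }
    }
}

/* What the instrumented output for a small function might look like. */
int instrumented_sum(int n)
{
    int total = 0;
    debug_check(1);
    for (int i = 1; i <= n; i++) {
        debug_check(2);
        total += i;
    }
    debug_check(3);
    return total;
}
```

The price is obvious: every statement pays for a call, even when no
breakpoint is set. That is why this approach is kept for the cases
where patching the code is impossible.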
Debug information formats
-------------------------
The information generated by the compiler varies from compiler to
compiler. Under Windows, we have the NBXX series of formats produced
by Microsoft Corp. for their debugger "CodeView", one of the most
innovative debuggers written to date.
The debug information under Unix can follow any of several standards
(this is an old Unix story: there are just too many standards to choose
from).
The COFF standard debug info is generated by compilers in certain
debugging environments, specifically for the DBX debugger under AIX,
as far as I know, although the format there seems to be an extended
COFF (XCOFF) and not plain COFF, which is very simple as debugging
formats go. (COFF means Common Object File Format.)
Under Unix we also have the "stabs" debug info format. This is a format
used by older versions of gcc (and lcc-win under Linux) that describes
the program with strings stored in the symbol table (hence the name:
Symbol TABle Strings).
And we have, more recently, the DWARF standard. This format is used by
more recent versions of gcc, and it's a VERY complex debug information
format, as complex as its target language, C++. Obviously DWARF people
will tell you that DWARF is language independent, but its main use is
within C++/C.
Under Windows, lcc-win uses the NB09 format from Microsoft Visual C++
version 4. I decided to stay within that format, and not "upgrade" to
the more "advanced" versions, NB10 and following, since the effort
needed was not worth the few extra features that the new format
provided. This is a binary format (unlike "stabs") composed of
variable-length records: two bytes for the length of the record, and
then a "Type" field that describes what the record contains.
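Walking a stream of such records is straightforward. The sketch below
is only an illustration, and it makes assumptions that are not spelled
out above: that both fields are little-endian, that the "Type" field is
also two bytes, and that the length counts the bytes that follow the
length field. The names walk_records and add_payload_size are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit little-endian value (assumed byte order). */
static uint16_t read_u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

typedef void (*record_fn)(uint16_t type, const unsigned char *payload,
                          size_t payload_len, void *ctx);

/* Calls visit() once per record. Returns the number of records seen,
   or -1 if a record is malformed or runs past the end of the buffer. */
int walk_records(const unsigned char *buf, size_t size,
                 record_fn visit, void *ctx)
{
    size_t pos = 0;
    int count = 0;

    while (pos < size) {
        if (size - pos < 2)
            return -1;                       /* no room for the length */
        size_t len = read_u16le(buf + pos);  /* bytes after the length */
        pos += 2;
        if (len < 2 || len > size - pos)
            return -1;                       /* no room for type/payload */
        uint16_t type = read_u16le(buf + pos);
        if (visit)
            visit(type, buf + pos + 2, len - 2, ctx);
        pos += len;                          /* skip to the next record */
        count++;
    }
    return count;
}

/* Example visitor: adds up the payload sizes into a size_t counter. */
void add_payload_size(uint16_t type, const unsigned char *payload,
                      size_t payload_len, void *ctx)
{
    (void)type;
    (void)payload;
    *(size_t *)ctx += payload_len;
}
```

The length prefix is what makes the format easy to extend: a reader
that does not know a record type can still skip over it.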
With the change from 32 to 64 bits, I needed to tweak this format, and
I foresee a major rewrite soon, since some of the fields needed are
not yet supported.
Operating system support
------------------------
The debugger needs to be called by the OS when something happens, and
obviously, it needs to be called before the inferior program starts, so
that breakpoints can be established at startup.
Under Windows there is a complex API that provides this essential
service to debuggers. It allows everything a debugger needs, like
getting notified when an event occurs, writing/reading the inferior
program's data space, etc.
Under Linux/Unix, there is the "ptrace" system call that provides this
service. It provides the same functionality a debugger needs:
controlled execution, single-stepping, reading/writing program memory,
etc.
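As an illustration of the ptrace side, here is a minimal Linux sketch
of a "debugger" that starts an inferior, gets called before the new
program executes its first instruction, and then lets it run. The
function name run_under_ptrace is hypothetical, error handling is
reduced to the bare minimum, and other Unix systems spell the ptrace
requests differently.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `path` under ptrace. Stores the inferior's exit code in
   *exit_code and returns the number of times the inferior stopped
   and handed control to us, or -1 on error. */
int run_under_ptrace(const char *path, char *const argv[], int *exit_code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Inferior: ask to be traced, then load the program. The
           kernel stops us with SIGTRAP when the exec completes. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        execv(path, argv);
        _exit(127);             /* exec failed */
    }

    /* Debugger: wait for events from the inferior. */
    int status, stops = 0;
    while (waitpid(pid, &status, 0) == pid) {
        if (WIFEXITED(status)) {
            *exit_code = WEXITSTATUS(status);
            return stops;
        }
        if (WIFSIGNALED(status))
            return -1;          /* inferior was killed by a signal */
        if (WIFSTOPPED(status)) {
            int sig = WSTOPSIG(status);
            stops++;
            /* A real debugger would set its breakpoints here. */
            if (sig == SIGTRAP)
                sig = 0;        /* the trap was for us, do not deliver it */
            if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == -1)
                return -1;
        }
    }
    return -1;
}
```

The first stop, right after the exec, is the moment mentioned above:
the inferior is loaded but has not run yet, so breakpoints can be
established at startup.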
Setting a breakpoint
--------------------
To set a breakpoint, the debugger reads the code at the address where
it wants to stop, saves it, and overwrites it with a breakpoint
instruction. This instruction will work like an interrupt: it stops
the program execution and gives control to the OS to service that
interrupt. The OS saves the context of the breakpoint (registers, etc)
and gives control to the debugger.
To resume from a breakpoint, the debugger writes back the instructions
that were previously stored at the breakpoint address, and either
executes them in single-step mode, rewriting the breakpoint after
control has passed over those instructions (that is a *permanent*
breakpoint), or just goes on (that was a *temporary* breakpoint).
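The save/overwrite/restore dance can be sketched on a plain byte
buffer. This is a simulation only: the buffer stands in for the
inferior's code, whereas a real debugger would go through the OS
(ptrace's PTRACE_PEEKTEXT/PTRACE_POKETEXT, or
ReadProcessMemory/WriteProcessMemory under Windows). The struct and
function names are hypothetical, and 0xCC is the x86 "int3" opcode;
other CPUs use other encodings.

```c
#include <stddef.h>

#define BREAK_OPCODE 0xCC   /* x86 int3 */

struct breakpoint {
    size_t address;         /* offset into the code buffer */
    unsigned char saved;    /* original byte we overwrote */
    int active;
};

/* Saves the original byte and plants the breakpoint instruction.
   Returns 0 on success, -1 if the address is out of range. */
int set_breakpoint(unsigned char *code, size_t size, size_t address,
                   struct breakpoint *bp)
{
    if (address >= size)
        return -1;
    bp->address = address;
    bp->saved = code[address];
    code[address] = BREAK_OPCODE;
    bp->active = 1;
    return 0;
}

/* Puts the original byte back. Returns 0 on success, -1 otherwise. */
int clear_breakpoint(unsigned char *code, size_t size,
                     struct breakpoint *bp)
{
    if (!bp->active || bp->address >= size)
        return -1;
    code[bp->address] = bp->saved;
    bp->active = 0;
    return 0;
}
```

For a permanent breakpoint the debugger would call something like
clear_breakpoint(), single-step the inferior over the restored
instruction, and then call set_breakpoint() again before continuing.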
Changing the point where execution resumes
------------------------------------------
The debugger can write any value into the inferior program's registers.
This means that the instruction pointer can be changed, making the
program resume at a different source line than the last one executed.
In the next installment we will discuss what happens when you crash.
You will see what other options are available besides "reading the
source code".