anyone interested in decompilation

QuantumG

Decompilation is the process of recovering human readable source code
from a program executable. Many decompilers exist for Java and .NET as
the program executables (class files) maintain much of the information
found in the source code. This is not true for machine code
executables however.

In recent years decompilation for machine code has moved from the
domain of crackpots and academic hopefuls to a number of real
technologies that are available to the general public. Decompilers for
machine code now exist which produce output that rivals disassemblers
as a tool for analysing programs for security flaws, malware or just
simply to see how something works. Full source code recovery that is
economically attainable will soon be a reality.

The legal challenges posed by this technology differ from country to
country. As such, much research is being done in secret in countries
that prohibit some uses of the technology, whereas some research is
being done more publicly in countries that have laws which support the
technology (Australia, for example).

Boomerang is an open source decompiler written (primarily) by two
Australian researchers. Open source projects need contributors. If
you have an interest in decompilation, we'd like to hear from you.
We're not only interested in talking to programmers. The project
suffers from a lack of documentation, tutorials and community. There
are many tasks that can be performed by users with minor technical
knowledge.

For more information on machine code decompilation see the Boomerang
web site (http://boomerang.sourceforge.net/). For interesting
technical commentary on machine code decompilation, see my blog
(http://quantumg.blotspot.com/).

Thanks for reading this message,

QuantumG
 
jaysome

Decompilation is the process of recovering human readable source code
from a program executable.

And the human readable source code looks something like this:

int V00000001;

V00000001 = function_that_returns_int();

if ( V00000001 > 9 )
{
    /* do something */
}
else
{
    /* do something else */
}

Many decompilers exist for Java and .NET as
the program executables (class files) maintain much of the information
found in the source code. This is not true for machine code
executables however.

Whether or not Java or .NET produce program executables that maintain
information that is "found in the source code" has no bearing on
whether Standard C does the same. Nor should it have any bearing on
whether other languages such as Ada or Basic or C++ do the same.

In recent years decompilation for machine code has moved from the
domain of crackpots and academic hopefuls to a number of real
technologies that are available to the general public. Decompilers for
machine code now exist which produce output that rivals disassemblers
as a tool for analysing programs for security flaws, malware or just
simply to see how something works. Full source code recovery that is
economically attainable will soon be a reality.

And some would still claim that anyone who wrote a decompiler and used
variable names like V00000001, when the original name was
reactor_overflow, could arguably be labeled a "crackpot".

The legal challenges posed by this technology differ from country to
country. As such, much research is being done in secret in countries
that prohibit some uses of the technology, whereas some research is
being done more publicly in countries that have laws which support the
technology (Australia, for example).

Can you tell us what countries currently ban the
"turn-hamburger-into-cow" tool?

Boomerang is an open source decompiler written (primarily) by two
Australian researchers. Open source projects need contributors. If
you have an interest in decompilation, we'd like to hear from you.
We're not only interested in talking to programmers. The project
suffers from a lack of documentation, tutorials and community. There
are many tasks that can be performed by users with minor technical
knowledge.

(I hope the source is written in C.)

Did you have a question about C?
 
Richard Heathfield

QuantumG said:

Boomerang is an open source decompiler written (primarily) by two
Australian researchers.

When you diff(1) the output of Boomerang on itself with the input to the
compiler and get no differences, call again. :)

<snip>
 
Keith Thompson

QuantumG said:
Decompilation is the process of recovering human readable source code
from a program executable. Many decompilers exist for Java and .NET as
the program executables (class files) maintain much of the information
found in the source code. This is not true for machine code
executables however. [...]
Thanks for reading this message,

Which, as far as I can tell, has nothing to do with the C programming
language, the topic of this newsgroup. Perhaps this would be topical
in comp.compilers.
 
QuantumG

jaysome said:
And the human readable source code looks something like this:
int V00000001;
<snip>
And some would still claim that anyone who wrote a decompiler and used
variable names like V00000001, when the original name was
reactor_overflow, could arguably be labeled a "crackpot".

It sure does. There isn't really a good name for this kind of "source
code". Some might use the term "obfuscated" but that implies that a
deliberate effort has been made to make the source code unreadable,
whereas this kind of output is typically generated by tools which are
trying to do the opposite. Another possible name is "symbol stripped"
but this is typically a term used to refer to a process performed on
binaries, not source code, and can therefore be confusing.

I've been thinking recently about a new term: the "compiler view" of
source code. To a compiler it really doesn't matter if a variable is
called reactor_overflow or V00000001. Armed with this new terminology
we can say something very insightful: the best output you can hope for
from an automatic decompiler which you have given a symbol stripped
binary is the compiler view of the original source code.
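
A minimal sketch of the "compiler view" idea, assuming a made-up function (check_reactor, proc1 and compare_views are hypothetical names, not Boomerang output). The two functions differ only in their identifiers, so a compiler has no reason to treat them differently:

```c
#include <assert.h>

/* Programmer view: the names carry domain knowledge.
   (check_reactor and reactor_overflow are hypothetical names.) */
static int check_reactor(int reactor_overflow)
{
    if (reactor_overflow > 9)
        return 1;   /* do something */
    return 0;       /* do something else */
}

/* Compiler view: identical logic with generic names, roughly what a
   decompiler could emit for a symbol stripped binary. */
static int proc1(int V00000001)
{
    if (V00000001 > 9)
        return 1;
    return 0;
}

/* The two views agree on every input tried here. */
static void compare_views(void)
{
    for (int i = -20; i <= 20; i++)
        assert(check_reactor(i) == proc1(i));
}
```

Calling compare_views() from a one-line main is enough to run the comparison. The only thing lost between the two versions is the meaning a human reads into the names.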

To get from the compiler view to the programmer view you need a lot of
user input - specifically, the user must supply domain knowledge. As
this is essentially a source-to-source transformation one can argue
that it is not really what a decompiler is for - after all, you could
do this with your favourite text editor - but some kind of tool support
will certainly help the reverse engineer out, and the core of those
tools will most likely be very similar to the core of a decompiler, so
why not integrate?

Can you tell us what countries currently ban the
"turn-hamburger-into-cow" tool?

Certain uses of decompilation are indeed banned in a lot of countries
of the world, making the development of decompilers suspect in those
countries. But as I cannot give any specific examples of people
secretly developing decompilers in these countries without violating
their trust, I guess I'll just drop the assertion. Sorry I brought it
up.

(I hope the source is written in C.)

Did you have a question about C?

We generate C. :) I want to talk about decompilers. If you don't
want to talk about decompilers, don't reply to a thread that is clearly
about decompilers. If, on the other hand, you're somehow trying to
"police" this newsgroup for off-topic discussions, allow me to suggest
that you might be more at home in a moderated newsgroup.

QuantumG
 
QuantumG

Keith said:
Which, as far as I can tell, has nothing to do with the C programming
language, the topic of this newsgroup. Perhaps this would be topical
in comp.compilers.

Gee, I'm sorry. Please don't kick me... oh wait, this isn't a
moderated newsgroup. I'll talk about whatever the hell I like. If you
don't want to talk about decompilers, don't reply to a thread which is
clearly about decompilers.

QuantumG
 
QuantumG

Richard said:
When you diff(1) the output of Boomerang on itself with the input to the
compiler and get no differences, call again. :)

Maybe one day, if the binary includes symbol information, that will be
possible. But it won't be done with an open source decompiler unless
people who want such a tool start contributing to its creation. More
likely is that you'll be able to use a commercially guarded decompiler
to decompile itself, but only after manually removing any binary
protection layers that have been wrapped around the output of the
compiler.

QuantumG
 
Gernot Frisch

Maybe one day, if the binary includes symbol information, that will be
possible.

How would one be so stupid as to put symbolic information in an
executable? I mean: What's the deal of a compiler then?

Turning a C executable back into C code is like turning a hamburger
into a cow. Period.
 
MQ

QuantumG said:
Gee, I'm sorry. Please don't kick me... oh wait, this isn't a
moderated newsgroup. I'll talk about whatever the hell I like.

Way to go, moron. You want to advertise a product in an inappropriate
place, get people interested, and then abuse them. You're right, this is
not a moderated group, and you are one of the reasons moderated groups
exist.
 
QuantumG

Gernot said:
Turning a C executable back into C code is like turning a hamburger
into a cow. Period.

Seems to me that all you're saying is that recovering symbols from a
program with no symbols is an impossible proposition. Ya know, we can test
this today, we don't even need a decompiler. Just take any old program
written in C, run the C preprocessor over it, replace any fancy for
loops with not-as-pretty while loops and replace every symbol with a
generic one (local1, global22, param4). There you go, you have the
absolute best output you could ever expect from a decompiler.
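
A rough sketch of that experiment on a tiny made-up function (sum_samples, MAX_SAMPLES and check_equivalent are hypothetical names chosen for illustration). The first version is what a programmer might write; the second is the same code after expanding the macro, turning the for loop into a while loop and swapping every symbol for a generic one:

```c
#include <assert.h>

/* Original, as a programmer might write it (hypothetical example). */
#define MAX_SAMPLES 4

static int sum_samples(const int *samples)
{
    int total = 0;
    for (int i = 0; i < MAX_SAMPLES; i++)
        total += samples[i];
    return total;
}

/* After the three steps: macro expanded to its value, for loop
   rewritten as a while loop, symbols replaced by generic names. */
static int proc1(const int *param1)
{
    int local1 = 0;
    int local2 = 0;
    while (local2 < 4) {
        local1 = local1 + param1[local2];
        local2 = local2 + 1;
    }
    return local1;
}

/* Both versions compute the same sum for the same input. */
static void check_equivalent(void)
{
    static const int global1[4] = { 3, 1, 4, 1 };
    assert(sum_samples(global1) == proc1(global1));
}
```

Reading proc1 cold, it is not a big leap to guess that local1 is a running total and local2 is a loop index, which is the kind of renaming a reverse engineer would do by hand.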

Now, are you honestly telling me that someone who hasn't seen the
original source code couldn't come up with sensible symbols, redo the
fancy for loops and add any C preprocessor macros that are appropriate?
That *really* doesn't sound that hard to me. Not when you consider
that people do the same thing with the output of disassemblers.

QuantumG
 
QuantumG

MQ said:
Way to go, moron. You want to advertise a product in an inappropriate
place, get people interested, and then abuse them. You're right, this is
not a moderated group, and you are one of the reasons moderated groups
exist.

Everyone thinks they're a traffic cop.

If you don't like it don't read the thread.

QuantumG
 
MQ

QuantumG said:
Everyone thinks they're a traffic cop.

No, like any community, we expect those who enter to respect the
protocol of the people who are in that community. I can't stop you
ranting on about decompilers, but there are much better places to do
it. It's just basic manners. Think about that next time someone
pushes in front of you when you are in a queue.

If you don't like it don't read the thread.

I was interested, I read your thread in alt.lang.asm. But it is
offensive the way you have treated some of the people here.

MQ
 
QuantumG

MQ said:
No, like any community, we expect those who enter to respect the
protocol of the people who are in that community. I can't stop you

I have been a member of this community and other C language
communities. Our protocol is to push out anyone with any actual
*interest* in the language and attack newbies who want their
programming homework done. I believe some of the people left in this
community are interested in decompilers and would like to talk about
them, but they feel they will be shouted down because it is off topic.
If all the traffic cops would just lighten up we'd have a much better
time of it. After all, I'm not talking about racing cars here.

I was interested, I read your thread in alt.lang.asm. But it is
offensive the way you have treated some of the people here.

Well, that's your opinion. Grow a thicker skin.

QuantumG
 
Ivan Vecerina

: > Richard Heathfield wrote:
: >> When you diff(1) the output of Boomerang on itself with the input
: >> to the
: >> compiler and get no differences, call again. :)
: >
: > Maybe one day, if the binary includes symbol information, that will
: > be
: > possible.
:
: How would one be so stupid as to put symbolic information in an
: executable? I mean: What's the deal of a compiler then?
:
: Turning a C executable back into C code is like turning a hamburger
: into a cow.
: Period.

You mean, like extracting DNA from the hamburger cells
to generate a clone ?

Provided you can make some reasonable assumptions about
the way that the hamburger was created, and that your
expectations for the reverse-engineered cow are not too
high, the biotech is not that far from that ... :D

Ivan
 
Kenny McCormack

I have been a member of this community and other C language
communities. Our protocol is to push out anyone with any actual
*interest* in the language and attack newbies who want their
programming homework done. I believe some of the people left in this
community are interested in decompilers and would like to talk about
them, but they feel they will be shouted down because it is off topic.
If all the traffic cops would just lighten up we'd have a much better
time of it. After all, I'm not talking about racing cars here.

You are so totally dead bang on, it is scary. In particular, it is
exactly right that they've set things up so that the only thing that can
be done is abusing the newbies.

But it won't do you any good. You see, the regs here are so totally
devoid of lives that this is all they have. And you're trying to take
it away from them. For shame!
 
Kenny McCormack

Ivan Vecerina said:
You mean, like extracting DNA from the hamburger cells
to generate a clone ?

Provided you can make some reasonable assumptions about
the way that the hamburger was created, and that your
expectations for the reverse-engineered cow are not too
high, the biotech is not that far from that ... :D

Good point. The regs are going to have to come up with a new metaphor.
 
Philip Potter

We generate C. :) I want to talk about decompilers. If you don't
want to talk about decompilers, don't reply to a thread that is clearly
about decompilers. If, on the other hand, you're somehow trying to
"police" this newsgroup for off-topic discussions, allow me to suggest
that you might be more at home in a moderated newsgroup.

You must be new round here. comp.lang.c is unmoderated, but that doesn't
mean it doesn't have an accepted remit. If you aren't going to talk about
the C language, you will get killfiled, ignored, and mocked, depending on
the poster. I'm not sure this is what you were trying to achieve.

Can I suggest you move to somewhere like comp.programming.misc?
 
Igmar Palsenberg

Gernot said:
How would one be so stupid as to put symbolic information in an
executable? I mean: What's the deal of a compiler then?

That remarkable feature is called 'debugging'. You know, when you fire
up your debugger, it knows that at some point, a certain variable
exists, and what it's called.

Turning a C executable back into C code is like turning a hamburger
into a cow. Period.

Ah well... We have a brainfucked instance here called 'UWV' that can
probably do that.



Igmar
 
Kenny McCormack

You must be new round here. comp.lang.c is unmoderated, but that doesn't
mean it doesn't have an accepted remit. If you aren't going to talk about
the C language, you will get killfiled, ignored, and mocked, depending on
the poster. I'm not sure this is what you were trying to achieve.

Thank you for proving all of Q's points.
 
