Obfuscating Ruby Code.

Lothar Scholz

Hello Will,


WD> Just for a little bit of fun, I wrote a dumb obfuscator. It takes a Ruby script
WD> and converts it to a C integer array with the ASCII values decreased by the
WD> array size and the letter's index in the array. This is then included in a C
WD> source file, which converts the code back to a char array in dynamically
WD> allocated memory and evaluates it with rb_eval_string_protect.

WD> Here's what it looks like:

WD> $ ls
WD> code.rb foo.c to_cary

WD> # We want to obfuscate code.rb.
WD> $ cat code.rb
WD> require 'openssl'
WD> puts "obfuscated ruby - woohoo!"
WD> puts "Here's a hash:"
WD> foo = OpenSSL::Digest::SHA1.new
WD> foo.update("obfuscate me baby")
WD> puts foo


WD> $ cat to_cary
WD> #!/usr/bin/ruby -w

WD> def file2cary(path)
WD>   header = "static int code[] = { "
WD>   footer = " };\n"
WD>   body = ''
WD>   me = 0

WD>   body << header
WD>   file = File.read(path)
WD>   file.each do |line|
WD>     line.split('').each do |chr|
WD>       body << "#{chr[0]-(file.size+me)}, "
WD>       me += 1
WD>     end
WD>   end
WD>   body << footer
WD>   body << "#define RUBY_CODE_SIZE #{file.size}\n"
WD>   return body
WD> end

WD> puts file2cary(ARGV[0])

WD> $ ./to_cary code.rb > code.h
WD> $ cat code.h

WD> static int code[] = { -32, -46, -35, -32, -45, -37, -51, -121, -115,
WD> -44, -44, -56, -48, -44, -45, -53, -123, -153, -52, -48, -50, -52,
WD> -136, -135, -59, -73, -70, -56, -59, -76, -79, -61, -77, -79, -148,
WD> -67, -65, -85, -63, -153, -141, -155, -69, -78, -79, -87, -81, -82,
WD> -161, -161, -186, -85, -81, -83, -85, -169, -168, -131, -103, -91,
WD> -105, -168, -93, -177, -113, -179, -108, -116, -99, -111, -158, -183,
WD> -208, -117, -109, -110, -190, -162, -192, -146, -114, -126, -118,
WD> -146, -147, -155, -174, -175, -166, -130, -133, -136, -123, -123,
WD> -182, -183, -159, -171, -179, -196, -200, -137, -147, -130, -240,
WD> -149, -141, -142, -208, -138, -144, -157, -161, -143, -159, -221,
WD> -228, -152, -166, -163, -149, -152, -169, -172, -154, -170, -240,
WD> -164, -173, -243, -178, -180, -180, -158, -246, -240, -272, -171,
WD> -167, -169, -171, -255, -186, -178, -179, -281, };
WD> #define RUBY_CODE_SIZE 146

WD> $ cat foo.c

WD> /*
WD> * debian-build: gcc -I/usr/lib/ruby/1.8/i386-linux
WD> -L/usr/lib/ruby/1.8/i386-linux -O2 foo.c -o foo -lruby1.8
WD> * redhat-build: gcc -I/usr/lib/ruby/1.8/i686-linux-gnu
WD> -L/usr/lib/ruby/1.8/i686-linux-gnu -O2 foo.c -o foo -lruby
WD> */

WD> #include "ruby.h"
WD> RUBY_EXTERN VALUE ruby_errinfo;

WD> #include "code.h"

WD> int main() {
WD>     int state=0, i;
WD>     char *real_code;

WD>     /* exactly RUBY_CODE_SIZE bytes: the buffer is not NUL-terminated */
WD>     real_code = (char *)malloc(sizeof(char[RUBY_CODE_SIZE]));
WD>     /* undo the to_cary transform: add the array size and the index back */
WD>     for(i=0;i<RUBY_CODE_SIZE;i++) {
WD>         real_code[i] = (char)(code[i]+(RUBY_CODE_SIZE+i));
WD>     }
WD>     ruby_init();
WD>     ruby_init_loadpath();
WD>     ruby_script("my_cool_code");
WD>     rb_eval_string_protect(real_code, &state);
WD>     if (state) {
WD>         rb_p(ruby_errinfo);
WD>     }
WD>     free(real_code);
WD>     return state;
WD> }


WD> $ gcc -I/usr/lib/ruby/1.8/i686-linux-gnu
WD> -L/usr/lib/ruby/1.8/i686-linux-gnu -O2 foo.c -o foo -lruby

WD> $ strip -s foo

WD> $ ./foo
WD> obfuscated ruby - woohoo!
WD> Here's a hash:
WD> 4fd04f3b648e92d2356c2ee577c2c2ff523bbee4

WD> # end of fake shell experience

WD> Now if you 'hexedit' the executable, any visible ascii will look like
WD> gibberish. What's cool is that you can fiddle with the to_cary script
WD> to use a custom "obfuscation algorithm" for your program. This should
WD> deter the average code prodder. Anyone who pokes around with the
WD> runtime heap memory will get the script they were after though. This
WD> also doesn't help much if your code is in a lot of separate files. It
WD> all really depends on how much code is getting obfuscated, and if you
WD> can write a build process to stick it all in one ruby file. There are
WD> no doubt much better ideas, but this was a fun experiment.

WD> I'm pretty sure you can write the above code, compile it and sell it
WD> without having to release it under the GPL since it isn't compiled into
WD> Ruby. You should ask a lawyer. If it's true, however, then I hereby
WD> release this code into the public domain ;-)

WD> Good luck!
WD> /wad
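
The transform Will describes (each byte minus the total size and its index, reversed again by the loop in foo.c) can be sketched as a round trip in current Ruby. This is only an illustration: the original to_cary script relies on Ruby 1.8 string behaviour, and the helper names `encode` and `decode` are made up here.

```ruby
# Encode each byte as byte - (size + index), mirroring what to_cary emits.
def encode(source)
  size = source.bytesize
  source.bytes.each_with_index.map { |byte, index| byte - (size + index) }
end

# Reverse the transform, as the for loop in foo.c does with code[i].
def decode(values)
  size = values.length
  bytes = values.each_with_index.map { |value, index| value + size + index }
  bytes.pack("C*").force_encoding("UTF-8")
end

sample = "require 'openssl'\n"
encoded = encode(sample)
puts encoded.inspect
puts decode(encoded) == sample
```

Because the offset depends only on the array length and the position, anyone who has the array can undo it without any key, which is the point the replies below make.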

This is not obfuscated Ruby. As I said, it takes 3 minutes to add the
following 4 lines of C to the parse.c file:


NODE*
rb_compile_file(f, file, start)
    const char *f;
    VALUE file;
    int start;
{
    { //------ Added by Lothar: copy every compiled source file to z:\src
        static int counter; char buf[100]; FILE *out, *in; static char s[150*1024];
        sprintf(buf,"z:\\src\\%d.rb");
        out=fopen(buf,"w"); in=fopen(f, "r"); fwrite(s, 1, fread(s, 1, 150*1024, in), out);
        fclose(out);fclose(in);
    } //---------------------------
    lex_gets = rb_io_gets;
    lex_input = file;
    lex_pbeg = lex_p = lex_pend = 0;
    ruby_sourceline = start - 1;

    return yycompile(f, start);
}

Compile Ruby, start it, and look in your z:\src directory: you will find
the files 1.rb, 2.rb, ... there.

Compilation takes 2 more minutes.

Okay, someone who doesn't know the internals as well as I do must add 30
minutes to find out that rb_compile_file is the routine that has to be
patched.
 
Lothar Scholz

Hello Jim,

JM> If that's the goal, why not use exerb with the ZLib option turned on?
JM> The resulting binaries can't be grepped for source.

JM> Sure, all someone would need to do is a little reverse engineering on
JM> exerb to figure out how to extract the source, or simply have memorized
JM> what a ZLib header looks like, but seems to me it's a "reasonable
JM> barrier" for the purpose you're describing.

JM> That said, it'd still be a great thing to have a general, more secure
JM> way of securing ruby source. I'd like to be able to take advantage of
JM> it as well.

JM> Jim Moy

All these methods are defeated by my 3-minute, 4-line hack.
You must change the interpreter; there is simply no other way.
Now that I have posted the patch, even a script kiddie who can use Google
can crack your code.
 
Lothar Scholz

LS> { //------ Added by Lothar
LS> static int counter; char buf[100]; FILE *out, *in; static char s[150*1024];
LS> sprintf(buf,"z:\\src\\%d.rb");

Oops, of course it should be

sprintf(buf,"z:\\src\\%d.rb",counter++);
 
Will Drewry

Hi Lothar:

LS> Hello Will,

LS> This is not obfuscated Ruby. As I said, it takes 3 minutes to add the
LS> following 4 lines of C to the parse.c file
LS> Best regards, emailto: scholz at scriptolutions dot com
LS> Lothar Scholz http://www.ruby-ide.com
LS> CTO Scriptolutions Ruby, PHP, Python IDE 's

Austin has already mentioned that it is obfuscation, not 'pirate-proof', but
if you want to up the ante, just statically link 'foo.c' with the Ruby
interpreter libraries. You can of course then insert your own library functions
with elfsh, but this makes it a bit harder. You could also just look at the
heap during runtime. Like I said, it's a "dumb obfuscator" :)

Thanks for the cool 3 min hack of my 15 min obfuscator!

/wad
 
Ruben

WD> Just for a little bit of fun, I wrote a dumb obfuscator. It takes a Ruby script
WD> and converts it to a C integer array with the ASCII values decreased by the
WD> array size and the letter's index in the array. This is then included in a C
WD> source file, which converts the code back to a char array in dynamically
WD> allocated memory and evaluates it with rb_eval_string_protect.

The problem is that you have the original code in memory as you run
the program. Just stopping the program while it's running and
inspecting the memory will give away the source without needing any
magic tricks. What I would do is run the program inside 'gdb',
interrupt it by sending it a signal, dump the memory currently in use
to a file, and scan the file for code.
In this case I just ran it with valgrind, and I guess I was lucky,
as you can see further down.
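
The "scan the file for code" step can be sketched in Ruby as a search for long runs of printable bytes, much like the Unix `strings` tool does. This is a rough illustration only: the helper name `printable_runs`, the minimum length of 8 and the fake dump below are assumptions, not something taken from the session further down.

```ruby
# Return runs of printable ASCII (plus tab and newline) that are at least
# min_length bytes long. The data is treated as raw bytes.
def printable_runs(data, min_length = 8)
  data.b.scan(/[\t\n\x20-\x7e]{#{min_length},}/n)
end

# Stand-in for a real memory dump: binary noise around a script fragment.
dump = "\x00\x01\xff\xfe".b +
       "puts \"obfuscated ruby - woohoo!\"\n".b +
       "\x00\x7f\x80".b

printable_runs(dump).each { |run| puts run }
```

On a real dump the input would come from something like `File.binread` on the dumped file, and the runs would still need to be read by a human to pick out the Ruby source.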

I think it's a better approach to use a good obfuscator for
method/variable/class/parameter renaming, and then encrypt the
obfuscated code. You could generate a private/public key pair for each
customer, encrypt the obfuscated code with the public key and encrypt
the private key with the license details of the customer (something
commonly used in shareware), then you write a decryption routine in C
(C extension) to decrypt the private key with the customer details
(name).
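
The idea of tying decryption to the customer's licence details can be illustrated with a deliberately simplified sketch. It does not implement the public/private key pair scheme described above; as an assumption made for brevity, it swaps in a single symmetric AES key derived from the customer name with PBKDF2, using Ruby's standard openssl library. The helper names (`key_from_license`, `encrypt_source`, `decrypt_source`) and the iteration count are hypothetical.

```ruby
require "openssl"

# Derive a 256-bit key from the licence details (here just the customer name).
def key_from_license(customer_name, salt)
  OpenSSL::PKCS5.pbkdf2_hmac(customer_name, salt, 100_000, 32,
                             OpenSSL::Digest.new("SHA256"))
end

# Encrypt the (already obfuscated) source for one customer.
def encrypt_source(source, customer_name)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.encrypt
  salt = OpenSSL::Random.random_bytes(16)
  cipher.key = key_from_license(customer_name, salt)
  iv = cipher.random_iv
  { salt: salt, iv: iv, data: cipher.update(source) + cipher.final }
end

# Decrypt with the same licence details; this is the part that would live in C.
def decrypt_source(package, customer_name)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.decrypt
  cipher.key = key_from_license(customer_name, package[:salt])
  cipher.iv = package[:iv]
  (cipher.update(package[:data]) + cipher.final).force_encoding("UTF-8")
end

package = encrypt_source("puts 'licensed code'\n", "Jane Doe")
puts decrypt_source(package, "Jane Doe")
```

Each customer gets a different ciphertext, but the decrypted source still ends up in memory before it is evaluated, so the memory inspection described above applies here as well.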

I don't know whether it's a good 'barrier' against piracy, but since
each copy would be encrypted with different keys, a simple patch would
be useless. But it'd be a better barrier against 'code-stealing' if
that's what you're concerned about.

The hard part is making a good obfuscator for Ruby. Is there already
some kind of analysis framework available for Ruby? It seems to me that
a lot of the work done for Ruby code refactoring would be very useful
for a code obfuscator. Isn't code obfuscation just a kind of automatic
code refactoring? (Well, with quite the opposite intention :)

Just my thoughts...

Ruben

========================================

ruben@beast ruben $ valgrind --gdb-attach=yes ./foo
==5680== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==5680== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==5680== Using valgrind-2.0.0, a program supervision framework for x86-linux.
==5680== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==5680== Estimated CPU clock rate is 1700 MHz
==5680== For more details, rerun with: -v
==5680==
==5680== Invalid read of size 1
==5680== at 0x402D114C: rb_str_new2 (in /usr/lib/libruby18.so.1.8.1)
==5680== by 0x4026A6CF: rb_eval_string (in /usr/lib/libruby18.so.1.8.1)
==5680== by 0x4026A7C2: rb_eval_string_protect (in /usr/lib/libruby18.so.1.8.1)
==5680== by 0x80486CC: (within /home/ruben/foo)
==5680== Address 0x412EB0B6 is 0 bytes after a block of size 146 alloc'd
==5680== at 0x40028A51: malloc (in /usr/lib/valgrind/vgskin_memcheck.so)
==5680== by 0x804867C: (within /home/ruben/foo)
==5680== by 0x40325C3B: __libc_start_main (in /lib/libc-2.3.2.so)
==5680== by 0x80485B0: (within /home/ruben/foo)
==5680==
==5680== ---- Attach to GDB ? --- [Return/N/n/Y/y/C/c] ---- n
==5680==
==5680== Conditional jump or move depends on uninitialised value(s)
==5680== at 0x40008DAE: _dl_relocate_object (in /lib/ld-2.3.2.so)
==5680== by 0x4040D48D: (within /lib/libc-2.3.2.so)
==5680== by 0x4000B265: _dl_catch_error (in /lib/ld-2.3.2.so)
==5680== by 0x4040D6F2: _dl_open (in /lib/libc-2.3.2.so)
==5680==
==5680== ---- Attach to GDB ? --- [Return/N/n/Y/y/C/c] ---- y
==5680== starting GDB with cmd: /usr/bin/gdb -nw /proc/5680/exe 5680
...

so, then just checking the malloc-ed memory...

(gdb) p /x (0x412EB0B6 - 146)
$1 = 0x412eb024
(gdb) dump memory testje_foo 0x412eb024 0x412EB0B6

ruben@beast ruben $ cat testje_foo
require 'openssl'
puts "obfuscated ruby - woohoo!"
puts "Here's a hash:"
foo = OpenSSL::Digest::SHA1.new
foo.update("obfuscate me baby")
puts foo
 
Lothar Scholz

Hello Will,

WD> Austin has already mentioned that it is obfuscation, not 'pirate-proof', but
WD> if you want to up the ante, just statically link 'foo.c' with the Ruby
WD> interpreter libraries. You can of course then insert your own library functions
WD> with elfsh, but this makes it a bit harder.

It does not really take more than 3 minutes if you have a scriptable
version of gdb: place a breakpoint on the function and have it trigger
a script that does the same as my previous patch.

But here you also get into the GPL trap. I don't think that your
customer would use a lawyer to make trouble, but maybe your competitor
would. At least I know a few who would do this, and I know a few
German hardcore Rubyists who would welcome it.

I know that obfuscation is not pirate-proof. And I would never use a
non-compiled language when writing something that must be (at least a
little bit) pirate-proofed. That's also one of the reasons why my IDE
is written in Eiffel and uses some Python scripting only for
unsophisticated things.

With a BSD-like license you could simply use your own function and
call the static "yycompile" function. This would make it much, much
harder: you have to manually track down the code in the disassembly,
and that's time consuming, especially if you also apply some of the
easier-to-write automatic code manipulations to the C code.
 
Tim Sutherland

Michael Neumann wrote:
MN> Hm, but you could replace all "method_name" methods with
MN> "obfuscated_method_name" (e.g. using a SHA1 hash function) and if you
MN> know all method names a priori, then you could use a perfect hash. Or
MN> if there should be a collision, then fall back using plain method names.

MN> Obfuscating method names should be doable, and without knowing the real
MN> names, it's much harder to read.

This causes problems with method_missing. e.g. what happens if you obfuscate
the `foo' method in the following code:

class Foo
  def method_missing(name, *args)
    name == :foo
  end
end

a = Foo.new
puts(a.foo)
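
A minimal sketch of what goes wrong, assuming a hypothetical renaming rule (`obfuscated_name`, a prefix plus the start of a SHA1 digest) that an obfuscator might apply to call sites. The rule is invented for illustration and is not taken from any existing tool.

```ruby
require "digest"

# Hypothetical renaming rule: "m_" plus the first 12 hex digits of the SHA1.
def obfuscated_name(name)
  "m_" + Digest::SHA1.hexdigest(name.to_s)[0, 12]
end

class Foo
  # The symbol :foo lives in the method body, where a renamer that only
  # rewrites definitions and call sites will not see it.
  def method_missing(name, *args)
    name == :foo
  end
end

a = Foo.new
puts a.foo                                  # call site before renaming
puts a.public_send(obfuscated_name(:foo))   # call site after renaming
```

The first call prints true and the second prints false: the renamed call still reaches method_missing, but the comparison against :foo no longer matches, so the program silently changes behaviour.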
 
Lothar Scholz

Hello Tim,

TS> This causes problems with method_missing. e.g. what happens if you obfuscate
TS> the `foo' method in the following code:

TS> class Foo
TS>   def method_missing(name, *args)
TS>     name == :foo
TS>   end
TS> end

TS> a = Foo.new
TS> puts(a.foo)

Right, things like this bite you not only in an obfuscator but also
in a refactoring tool, etc.

So a project that must use automatic source code manipulation tools
should start with a programming style guide listing the things that
shouldn't be used. At the moment I would be happy if we could get
everyone to make their files (at least all files from the standard
Ruby library) "requireable". With this we could write tools that have
access to the introspection features.

But we are still far away even from this (the python community is
in the same situation).
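
One reading of "requireable" is a file that only defines things when it is loaded and keeps its script behaviour behind a guard, so a tool can require it and then use introspection. The sketch below assumes that reading; the `Greeter` class is a made-up example.

```ruby
# A hypothetical library file that is safe to require: loading it has no
# side effects beyond defining the class.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

# Script behaviour runs only when the file is executed directly,
# not when a tool requires it.
if __FILE__ == $PROGRAM_NAME
  puts Greeter.new.greet("world")
end

# What a tool could do after requiring such a file: ask the runtime
# which methods exist and how many arguments they take.
Greeter.instance_methods(false).each do |method_name|
  arity = Greeter.instance_method(method_name).arity
  puts "#{method_name} takes #{arity} argument(s)"
end
```

A file that does its work at the top level cannot be inspected this way without also running it, which is the obstacle described above.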
 
jm

LS> Right, things like this bite you not only in an obfuscator but also
LS> in a refactoring tool, etc.

Couldn't obfuscation be viewed as a kind of negative refactoring? In
that, you're attempting to make the code "worse" instead of better
without changing the functionality.

J.
 
Jim Freeze

Hi --



I think (for this purpose anyway) encryption is one form of
obfuscation.

Encryption may be a form of obfuscation, but obfuscation
is not a form of encryption.
 
