Is it possible to protect python source code by compiling it to .pyc or .pyo?

Sam

I would like to protect my python source code. It need not be foolproof as long as it adds inconvenience to pirates.

Is it possible to protect python source code by compiling it to .pyc or .pyo? Does .pyo offer better protection?
 
Ned Batchelder

> I would like to protect my python source code. It need not be foolproof as long as it adds inconvenience to pirates.
>
> Is it possible to protect python source code by compiling it to .pyc or .pyo? Does .pyo offer better protection?

First, .pyc and .pyo are nearly identical: they are bytecode. The only
difference is that .pyo has been "optimized", which in this case simply
means that the asserts are gone (-O) and, with -OO, the docstrings too.
It is not difficult to see what a Python program does by looking at the
bytecode, and the standard library includes the dis module for
disassembling it.
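
To make that concrete, here is a minimal sketch of what dis shows. The function check_license and the string in it are made up purely for illustration; the point is only that constants, names and control flow all survive compilation.

```python
import dis
import io


def check_license(key):
    """Hypothetical licence check, invented for this example."""
    secret = "HUNTER2-0042"
    return key == secret


# Constants and local variable names are stored verbatim in the code object.
print(check_license.__code__.co_consts)
print(check_license.__code__.co_varnames)

# dis.dis() normally prints to stdout; capture the listing in a string instead.
buffer = io.StringIO()
dis.dis(check_license, file=buffer)
listing = buffer.getvalue()
print(listing)
```

The exact opcodes vary between Python versions, but the "secret" string and the comparison against it are plainly visible in every one of them.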

How to protect your code depends an awful lot on what kinds of secrets
are in the code, and how valuable those secrets are, and therefore how
hard someone will work to get at them.
 
Chris Angelico

> I would like to protect my python source code. It need not be foolproof as long as it adds inconvenience to pirates.
>
> Is it possible to protect python source code by compiling it to .pyc or .pyo? Does .pyo offer better protection?

The only difference between .pyo and .pyc is that the former has been
compiled with optimization turned on. And neither of them offers any
real security.

Even if you compiled it down to machine code, you wouldn't do much to
deter pirates. All you'd do is make it so they have to take your code
as a whole instead of piece-meal.

Fighting against piracy using technology is pretty much guaranteed to
be a losing battle. How much time and effort can you put in, versus
the whole rest of the world? And how much harassment will you permit
on your legitimate users in order to slow down a few who want to rip
you off? I've seen some programs - usually games - that put lots and
lots of checks in (checksumming the program periodically and crashing
if it's wrong, "calling home" and making sure the cryptographic hash
of the binary matches what's on the server, etc, etc)... and they
still get cracked within the first day. And then legitimate purchasers
like me have to deal with the stupidities (single-player games calling
home??), to the extent that it's actually more convenient to buy the
game and then install a cracked version from a torrent, than to
install the version you bought. And there's one particular game where
I've done exactly that. It's just way too much fiddliness to try to
make the legit version work.

Distribute your code with a copyright notice, accept that a few people
will rip you off, and have done with it.

ChrisA
 
Ethan Furman

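> Is it possible to protect python source code by compiling it to .pyc or
> .pyo? Does .pyo offer better protection?

No and no.

> Distribute your code with a copyright notice, accept that a few people
> will rip you off, and have done with it.

Yes. One of the nice things about Python is being able to fix bugs myself [1].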
 
Steven D'Aprano

> I would like to protect my python source code. It need not be foolproof
> as long as it adds inconvenience to pirates.

What makes you think that "pirates" will be the least bit interested in
your code? No offence intended, I'm sure you worked really, really hard
to write it, but the internet has hundreds of gigabytes of free and open
source software which is easily and legally available, not to mention
easily available (legally or not) non-free software at a relatively cheap
price. Chances are that your biggest problem will not be piracy, but
getting anyone to care or even notice that your program exists.

> Is it possible to protect python source code by compiling it to .pyc or
> .pyo? Does .pyo offer better protection?

Compiling to .pyc or .pyo will not give any protection from software
piracy, since they can just copy the .pyc or .pyo file. It will give a
tiny bit of protection from people reading your code, but any competent
Python programmer ought to be able to use the dis module to read the byte
code.
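
As a rough sketch of how little work that is, the following compiles a small module to a .pyc, then recovers and disassembles the code object using only the standard library. The module name pricing and its formula are invented for the example, and the 16-byte header size is an assumption that holds for Python 3.7 and later (older versions used a shorter header). Note also that since Python 3.5 the optimized files are named .opt-1.pyc and .opt-2.pyc rather than .pyo, but the contents are the same kind of bytecode.

```python
import dis
import importlib.util
import io
import marshal
import os
import py_compile
import tempfile

# A made-up module; pretend the formula is the trade secret.
SOURCE = '''
def discount(price):
    # this comment will not survive compilation, but the logic will
    return price * 0.85 - 3
'''

with tempfile.TemporaryDirectory() as workdir:
    src_path = os.path.join(workdir, "pricing.py")
    with open(src_path, "w") as handle:
        handle.write(SOURCE)

    # Produce the .pyc exactly as one might before shipping it.
    pyc_path = py_compile.compile(
        src_path, cfile=os.path.join(workdir, "pricing.pyc")
    )
    with open(pyc_path, "rb") as handle:
        raw = handle.read()

# Assumption: Python 3.7+ layout, i.e. a 16-byte header followed by a
# marshalled code object.
assert raw[:4] == importlib.util.MAGIC_NUMBER
module_code = marshal.loads(raw[16:])

# dis.dis() recurses into nested code objects, so the function body appears too.
buffer = io.StringIO()
dis.dis(module_code, file=buffer)
listing = buffer.getvalue()
print(listing)
```

The comment is gone, but the function name, the 0.85 multiplier and the arithmetic are all there for anyone who cares to look.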

Perhaps if you explain what your program is, and why you think it needs
protection, we can give you some concrete advice.
 
Joshua Landau

> I would like to protect my python source code. It need not be foolproof as long as it adds inconvenience to pirates.
>
> Is it possible to protect python source code by compiling it to .pyc or .pyo? Does .pyo offer better protection?

If you're worried about something akin to corporate espionage or
some-such, I don't know of a better way than ShedSkin or Cython. Both
of those will be far harder to snatch the source of. Cython will be
particularly easy to use as it is largely compatible with Python
codebases.

I offer no opinions, however, on whether this is a task worth doing. I
only suggest you consider the disadvantages and how they apply to your
individual case.
 
Tim Delaney

> > I would like to protect my python source code. It need not be foolproof
> > as long as it adds inconvenience to pirates.
> >
> > Is it possible to protect python source code by compiling it to .pyc or
> > .pyo? Does .pyo offer better protection?
>
> If you're worried about something akin to corporate espionage or
> some-such, I don't know of a better way than ShedSkin or Cython. Both
> of those will be far harder to snatch the source of. Cython will be
> particularly easy to use as it is largely compatible with Python
> codebases.

Indeed - I've only had one time someone absolutely insisted that this be
done (for trade secret reasons - there needed to be a good-faith attempt to
prevent others from trivially getting the source). I pointed them at Pyrex
(this was before Cython, or at least before it was dominant). They fully
understood that it wouldn't stop a determined attacker - this was a place
where a large number of the developers were used to working on bare metal.

If you're going to do this, I strongly suggest only using Cython on code
that needs to be obscured (and if applicable, performance-critical
sections). I'm currently working with a system which works this way - edge
scripts in uncompiled .py files, and inner code as compiled extensions. The
.py files have been really useful for interoperability purposes, e.g. I was
able to verify yesterday that one of the scripts had a bug in its
command-line parsing and I wasn't going insane after all.

Also, remember that any extension can be imported and poked at (e.g. in the
interactive interpreter). You'd be surprised just how much information you
can get that way just using help, dir, print and some experimentation. The
output I was parsing from one of the scripts was ambiguous, and it was one
where most of the work was done in an extension. I was able to poke around
using the interactive interpreter to understand what it was doing and obtain
the data in an unambiguous manner to verify against my parser.
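
For anyone who hasn't tried this, here is a small sketch of that kind of poking around. It uses the standard math module as a stand-in for a third-party compiled extension (an assumption for the sake of a runnable example; the same calls work on any importable extension module).

```python
import inspect
import math  # stand-in for a compiled extension you have no source for

# dir() lists everything the module exposes, compiled or not.
public_names = [name for name in dir(math) if not name.startswith("_")]
print(public_names)

# The docstring is what help() would display for this function.
doc_text = inspect.getdoc(math.hypot)
print(doc_text)

# There is no Python source to fetch for a compiled function...
try:
    inspect.getsource(math.hypot)
    has_source = True
except (TypeError, OSError):
    has_source = False
print(has_source)

# ...but nothing stops you from calling it and observing what it does.
print(math.hypot(3, 4))
```

The source stays hidden, but names, docstrings and behaviour are all open to experimentation.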

The only way to truly protect code is to not ship any version of it
(compiled or otherwise), but have the important parts hosted remotely under
your control (and do your best to ensure it doesn't become compromised).
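
A minimal sketch of that arrangement, using only the standard library: the sensitive function lives in a small HTTP service and the shipped code is just a thin client. Everything here is hypothetical (secret_score, ScoreHandler, remote_score, the JSON shape), the server runs on localhost only so the example is self-contained, and a real deployment would need authentication and TLS, which are left out.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def secret_score(values):
    """Hypothetical proprietary calculation; it only ever runs on the server."""
    return sum(v * v for v in values) % 97


class ScoreHandler(BaseHTTPRequestHandler):
    """Server side: accepts JSON input and returns only the result."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": secret_score(payload["values"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def remote_score(base_url, values):
    """Client side: the only code a customer would actually receive."""
    request = urllib.request.Request(
        base_url,
        data=json.dumps({"values": values}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # An empty ProxyHandler keeps the localhost demo away from any system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["score"]


# Demo: run the "remote" service on an ephemeral localhost port.
server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    result = remote_score(url, [1, 2, 3])
    print(result)
finally:
    server.shutdown()
    server.server_close()
```

The client can be disassembled as much as anyone likes; all it reveals is the URL and the request format. The cost is that the program no longer works offline and the service has to be kept running and secured.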

Tim Delaney
 
