Selling Python Software

  • Thread starter Will Stuyvesant

Andrew Dalke

Bengt Richter:
OTOH, we are getting to the point where rather big functionality can be put
on a chip or tamper-proof-by-anyone-but-a-TLA-group module. I.e., visualize
the effect of CPUs' having secret-to-everyone private keys, along with
public keys,

Actually, we aren't. There have been various ways to pull data off
of a smart card (I recall reading some on RISKS, but the hits I
found are about 5+ years old). In-circuit emulators get cheaper and
faster, just like the chips themselves. And when in doubt, you can
buy or even build your own STM pretty cheap -- in hobbyist range
even (a few thousand dollars).

Bengt Richter:
and built so they can accept your precious program code wrapped in a PGP encrypted
message that you have encrypted with its public key.

Some of the tricks are subtle, like looking at the power draw.
Eg, suppose the chip stops when it finds the key is invalid. That
time can be measured and gives clues as to how many steps it
went through, and even what operations were done. This can
turn an exponential search of key space into a linear one.

Bengt Richter:
This is not so magic. You could design a PC with a locked enclosure and special BIOS
to simulate this, except that that wouldn't be so hard to break into. But the principle
is there. Taking the idea to SOC silicon is a matter of engineering, not an idea break-through
(though someone will probably try to patent on-chip stuff as if it were essentially different
and not obvious ;-/)

But the counter principle (breaking into a locked box in an uncontrolled
environment) is also there. There are a lot of attacks against smart
cards (eg, as used in pay TV systems), which cause improvements (new
generation of cards), which are matched by counter attacks.

These attacks don't require the resources of a No Such Agency,
only dedicated hobbyists with experience and time on their hands.

Andrew
(e-mail address removed)
P.S.
I did have fun breaking the license protection on a company's
software. Ended up changing one byte. Took about 12 hours.
Would have been less if I knew Solaris assembly. And I did
ask them for permission to do so. :)
 
John J. Lee

Andrew Dalke said:
Bengt Richter:
OTOH, we are getting to the point where rather big functionality can be put
on a chip or tamper-proof-by-anyone-but-a-TLA-group module. I.e., visualize
the effect of CPUs' having secret-to-everyone private keys, along with public keys,

Actually, we aren't. There have been various ways to pull data off
of a smart card (I recall reading some on RISKS, but the hits I [...]

Right.


In-circuit emulators get cheaper and faster, just like the chips
themselves.

Though there will always be some consumer apps that will run too slow
like that. Maybe not the most important ones, though (email, sound
and video playing stuff, etc.).

And when in doubt, you can
buy or even build your own STM pretty cheap -- in hobbyist range
even (a few thousand dollars).

That's a thought: somebody is going to know (or be able to find
experimentally) exactly where to look on each chip, and once that fact
is out, I guess people are going to be selling motherboards / CPUs
with their private key stuck to them on a post-it note. :)

A mass-market for STMs -- maybe it's worth investing ;-)

Some of the tricks are subtle, like looking at the power draw.
Eg, suppose the chip stops when it finds the key is invalid. That
time can be measured and gives clues as to how many steps it
went through, and even what operations were done. This can
turn an exponential search of key space into a linear one.
[...]

Yeah, it's all very cute, and sounds like rocket science (which it is,
in some ways), but really it does all boil down to the same technique
I used as a child to crack my own bicycle chain combination lock (I
forgot the combination). I discovered you could hear the right
position on the dials individually, so I could crack it one dial at a
time. But that was a cheap lock, and there are others that don't do
that. Processors aren't locks, of course, and they give off so much
noise that I wonder if that's possible in their case.
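For what it's worth, the dial-at-a-time trick and the "chip stops when it finds the key is invalid" leak are the same idea, and it fits in a few lines. This is only a sketch under loud assumptions: the secret, the alphabet and the `leaky_check` function are made up, and a step counter stands in for whatever an attacker would actually measure (time or power draw), which on real hardware is far noisier.

```python
import string

SECRET = "k3y9"  # hypothetical secret, for illustration only
ALPHABET = string.ascii_lowercase + string.digits


def leaky_check(guess, secret=SECRET):
    """Compare left to right and stop at the first mismatch.

    Returns (ok, steps); `steps` stands in for what an attacker would
    infer from timing or power draw.
    """
    steps = 0
    for g, s in zip(guess, secret):
        steps += 1
        if g != s:
            return False, steps
    return len(guess) == len(secret), steps


def recover(length, check=leaky_check):
    """Recover the secret one position at a time, like the lock dials."""
    known = ""
    total_guesses = 0
    for pos in range(length):
        for ch in ALPHABET:
            total_guesses += 1
            # Pad with a character outside ALPHABET so only position `pos` varies.
            guess = known + ch + "~" * (length - pos - 1)
            ok, steps = check(guess)
            # A correct character lets the check run past position `pos`.
            if ok or steps > pos + 1:
                known += ch
                break
    return known, total_guesses


if __name__ == "__main__":
    found, guesses = recover(len(SECRET))
    print(found, guesses)
```

With 36 symbols and 4 positions, the leak caps the work at 4 * 36 guesses instead of 36 ** 4. A check that always does the same amount of work regardless of where the mismatch is (for example `hmac.compare_digest` in the standard library) removes this particular signal.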


John
 
Kyler Laird

...unless said SW developer keeps the extremely precious parts of his
SW safely on a network server under his control (yes, it IS possible
to technically secure that -- start with an OpenBSD install...:) and
only distributes the run-of-the-mill "client-oid" parts he doesn't
particularly mind about.

I can imagine the "client-oid" part being a simple shell script that
just runs SSH (perhaps with some tunnel commands for database
connections).

Especially for a command-line app, this seems like an *easy* and
complete answer. Heck, for some situations, it might even be considered
the final product. It'd be a snap for the developer to maintain.

--kyler
 
Svenne Krap

But decompiling a .pyc takes some work.

How is that done in the first place ? (not that I am that client, and
not that I need to know, it's just nice to be informed.)

I agree with most of you, copy protection does not work if the software
is essential to a large enough "market". But even simple protections
keep the casual pirate away, and after all, the most effective systems
are contracts and lawyers (I'm not glad it is that way, but it is).

Svenne
 
Peter Hansen

Svenne said:
How is that done in the first place ? (not that I am that client, and
not that I need to know, it's just nice to be informed.)

I agree with most of you, copy protection does not work if the software
is essential to a large enough "market". But even simple protections
keep the casual pirate away, and after all, the most effective systems
are contracts and lawyers (I'm not glad it is that way, but it is).

A program called "decompyle" can do it, though it's generally not
up to date with the latest Python version until some time afterwards.
It may not even be supported any more. Google would help...
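Short of a real decompiler, the standard `dis` module already gives a feel for how much a compiled code object keeps. The sketch below is not what decompyle does internally; it just inspects a made-up function (`price_with_tax` is hypothetical) to show that argument names, local names and referenced globals are all still present in the bytecode.

```python
import dis


def price_with_tax(net_amount, tax_rate=0.25):
    """Hypothetical 'secret' business logic, used only as a sample."""
    gross_amount = net_amount * (1 + tax_rate)
    return round(gross_amount, 2)


code = price_with_tax.__code__

# Argument and local names are stored in the compiled code object.
print(code.co_varnames)   # names of arguments and locals
print(code.co_names)      # global/attribute names the function refers to
print(code.co_consts)     # literal constants kept by the compiler

# The instruction stream itself is readable with the stdlib disassembler.
dis.dis(price_with_tax)
```

The exact opcodes and constants differ between Python releases, so the listing is best read as an indication of how readable the result is, not as a stable format.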

-Peter
 
Alex Martelli

Erik said:
If he wrote the software for the company under the agreement that they'd
buy it if they liked it, what's the likelihood that this is a concern at
all? I would hope that if he had a real concern that this was a likely
outcome -- that they'd talk to him, wait for him to deliver a product,
and then refuse to pay him -- that he wouldn't be dealing with them at
all anyway.

Well, in the real world of consulting (in Southern Europe, at least),
getting your invoices actually *paid* by the companies you've done work
for isn't necessarily trivial (which may explain why I "commute" to
AB Strakt Sweden, and also telework for them... they _do_ pay their
bills, and on time too!, besides being a great firm in other ways:).

This has little to do with "selling software", necessarily -- services
are of course more problematic still, but even if you sell durable
goods that might in theory be repossessed for non-payment, the amount
of trouble, cost and impact on cash-flow can be problematic. Francis
Fukuyama's "Trust: The Social Virtues and the Creation of
Prosperity" is highly recommended reading on the underlying problems --
but unless one wants to emigrate, even being fully aware of them
doesn't necessarily take one much closer to avoiding them:).


Alex
 
Alex Martelli

John said:
Alex Martelli said:
Part of the problem is, that the "warezdoodz culture" is stacked
against you. If you DO come up with a novel approach, that is a
[...]

Ah, stop right there (oops, too late!-). I think we're somewhat at
cross-purposes. I was talking about protecting something more at the
level of source code than running programs.

Oh, "shrouding"? Sure, you can do that. Many programs might
actually be _enhanced_ by that approach (at least the variable
and function names, while not helpful, aren't actively hostile:),
but probably not Python programs.

I mostly agree with you on the issue of protecting "binaries", but:
Part of the problem is, that the "warezdoodz culture" is stacked
against you. If you DO come up with a novel approach, that is a
[...]

Though information is indeed always incomplete, it seems a good bet
that war3zd00dz are not an issue for a consultant being hired by a
company to write a 1000 line program. Do you disagree?

A Python 1000-SLOC program may be about 200+ function points --
not exactly trivial (it may be equivalent to more than 10,000
lines of C, easily) though not earth-shaking. But, anyway,
we weren't talking about somebody being _hired_, but rather
wanting to sell what they independently came up with the idea
of developing -- there's a difference! And yes, it wouldn't
be the first time that a company deliberately exploits the warez
"circuit" to get programs cracked -- look around and you'll see
it's definitely NOT just games and the like that end up there.

Anyway, back to source vs. binaries. Obviously, code that's closer to
the "source" end of the spectrum has additional value. I'd got the
impression that something rather similar to the original source could
be recovered from Python byte-code, due to its high-level nature
(albeit obviously missing a lot of stuff -- including all those
valuable names). Certainly that's impossible with optimising
compilers (I should have stated this much more strongly in my last
message, of course -- there's no "may" or "guessing" involved there,
unlike the Python case, where I don't know the answer).

If you think you do, "you're in denial". Check out:

http://www.program-transformation.org/twiki/bin/view/Transform/DecompilationPossible
http://boomerang.sourceforge.net/
http://www.itee.uq.edu.au/~cristina/dcc.html#dcc

I suspect it must in some way be easier (but by multiplicative
constants, not O(N) easier...;-) for lower "semantic gaps" -- but
that intuition might well be misguided (it's close to "it must in
some way be easier to produce optimal machine code for a CISC
than for a RISC", and that's simply not true).


Alex
 
Bengt Richter

Bengt Richter:
public keys,

Andrew Dalke:
Actually, we aren't. There have been various ways to pull data off
of a smart card (I recall reading some on RISKS, but the hits I
found are about 5+ years old). In-circuit emulators get cheaper and
faster, just like the chips themselves. And when in doubt, you can
buy or even build your own STM pretty cheap -- in hobbyist range
even (a few thousand dollars).

Even if you knew exactly where on a chip to look, and it wasn't engineered
to have the key self-destruct when exposed, what would you do with the key?
You'd have the binary image of an executable meant to execute in the secret-room
processing core. How would you make it available to anyone else? You could re-encrypt
it with someone else's specific public key. Or distribute a program that does that,
along with the clear binary. But what if the program contains an auth challenge for the target
executing system? Now you have to reverse engineer the binary and see if you can modify it
to remove challenges and checks and still re-encrypt it to get it executed by other processors.
Or you have to translate the functionality to a program that runs in clear mode on the ordinary cores.
Sounds like real work to me, even if you have a decompyler and the inter-core comm specs.
Of course, someone will think it's fun work. And they would get to start over on the next program,
even assuming programs encrypted with the public key of the compromised system would be provided,
so there better not be a watermark left in the warez images that would identify the compromised
system. Or else they would get to destroy another CPU module to find its key. Probably easier the
second time, assuming no self-destruct stuff ;-)

Andrew Dalke:
Some of the tricks are subtle, like looking at the power draw.
Eg, suppose the chip stops when it finds the key is invalid. That
time can be measured and gives clues as to how many steps it
went through, and even what operations were done. This can
turn an exponential search of key space into a linear one.

That was then. Plus remember this would not be an isolated card chip that you can
probe, it's one or more specialized cores someplace on a general purpose
multi-cpu chip that you can't get at as a hobbyist, because opening it without destroying
what you want to look at requires non-hobby equipment, by design.

Andrew Dalke:
But the counter principle (breaking into a locked box in an uncontrolled
environment) is also there. There are a lot of attacks against smart
cards (eg, as used in pay TV systems), which cause improvements (new
generation of cards), which are matched by counter attacks.

These attacks don't require the resources of a No Such Agency,
only dedicated hobbyists with experience and time on their hands.

Sounds like an article of faith ;-)

Andrew
(e-mail address removed)
P.S.
I did have fun breaking the license protection on a company's
software. Ended up changing one byte. Took about 12 hours.
Would have been less if I knew Solaris assembly. And I did
ask them for permission to do so. :)

Changed a conditional jump to unconditional? Some schemes aren't so
static and centralized ...

I once ran into a scheme that IIRC involved a pre-execution snippet of code that had to
run full bore for _lots_ of cycles doing things to locations and values in
the program-to-be-executed that depended on precise timing and obscured info.
I guess the idea was that if someone tried to trace or step through it, it would
generate wrong locations and info and also stop short and the attacker would have
to set up to record memory addresses and values off the wires to figure out what to change,
but even then they would run into code that did mysterious randomly sprinkled
milestone checks, so capturing the core image after start wasn't free lunch either.
Plus it had to run in a privileged CPU mode, and it wasn't a stand-alone app, it
was part of an OS ... that wasn't open source ... and you didn't have the tools to rebuild...
This was just some code I stumbled on, I may have misunderstood, since I didn't
pursue it, being there for other reasons. But that was primitive compared to what you
could do with specialized chip design.

Regards,
Bengt Richter
 
Andrew Dalke

Bengt Richter:
Even if you knew exactly where on a chip to look, and it wasn't engineered
to have the key self-destruct when exposed, what would you do with the
key?

As I mentioned, I'm not a hardware guy. What I know is that
trying to hide code on a chip is open to its own sorts of attacks and
that at least some companies which have tried to do so (like pay-TV
companies) have had their systems broken, and not broken by the
efforts of a three letter agency.

Bengt Richter:
That was then.

Yup.

Crypto software is notoriously hard to write. Even with the good
libraries we have now, people make mistakes and misuse them.
Similarly, while people can make chips that are hard to decode,
doing so in practice is likely to be even harder.

Bengt Richter:
Plus remember this would not be an isolated card chip that you can
probe, it's one or more specialized cores someplace on a general purpose
multi-cpu chip that you can't get at as a hobbyist, because opening it without destroying
what you want to look at requires non-hobby equipment, by design.

Then perhaps a small business instead of a hobbyist. Still
doesn't require large government-based efforts to crack it.

Bengt Richter:
Changed a conditional jump to unconditional? Some schemes aren't so
static and centralized ...

That's what I did for that one.

Another time I was working for a company. We were shipping a time-locked
program and had forgotten about it (there was a big lay off a few months
before I came in). Then one day our customers started complaining that
it wasn't working. The builds were done at another site, so as a backup
plan I figured out how to break our own scheme. In that one I looked
for the time call then added some code to shift the return value some
arbitrary time in the future.

Bengt Richter:
I once ran into a scheme that IIRC involved a pre-execution snippet of code that had to
run full bore for _lots_ of cycles doing things to locations and values in
the program-to-be-executed that depended on precise timing and obscured
info.

Another one is the code in an old copy of MS Windows which gave a
warning message when run on top of DR-DOS (which had about 1/300th
of the marketplace). See
http://www.ddj.com/documents/s=1030/ddj9309d/

But these are defeatable. For example, run the program under an
emulator, save the state after the license check occurs, and restart
by reloading from that state. Or write a wrapper which intercepts
all I/O calls and returns the results from a known, good call (a
replay attack).
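The time-call trick mentioned earlier is the simplest of these wrappers to show. What follows is a toy sketch, not the scheme from either company: `EXPIRY`, `license_valid` and `run_with_shifted_clock` are invented names, and the binary patch is replaced by `unittest.mock.patch`, which swaps out `time.time` for the duration of one call.

```python
import time
from unittest import mock

EXPIRY = 1_000_000_000.0  # hypothetical expiry timestamp (seconds since epoch)


def license_valid():
    """Toy time lock: valid only while the clock reads before EXPIRY."""
    return time.time() < EXPIRY


def run_with_shifted_clock(func, offset_seconds):
    """Call func() while time.time() reports a shifted value."""
    real_time = time.time  # keep a handle on the unpatched clock

    def shifted():
        return real_time() + offset_seconds

    # Replace the attribute `time` on the `time` module for this call only.
    with mock.patch("time.time", shifted):
        return func()


if __name__ == "__main__":
    print(license_valid())                                       # lock has expired
    print(run_with_shifted_clock(license_valid, -time.time()))   # clock wound back
```

Which direction the clock has to be shifted depends on how the check is written; in this toy it has to go backwards. The program under test never notices, because it only ever sees the value the wrapper hands back.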

The system you propose seems to require a lot of changes to how
existing computer hardware is built, so that people can attach
dedicated compute resources. It's definitely an interesting idea,
and not just for program hiding. But a big change.

I'll end with a quote from E.E. "Doc" Smith's "Gray Lensman"
(p10 in my copy)

Also, there was the apparently insuperable difficulty of
the identification of authorized personnel. Triplanetary's best
scientists had done their best in the way of a non-counterfeitable
badge -- the historic Golden Meteor, which upon
touch impressed upon the toucher's consciousness an
unpronounceable, unspellable symbol -- but that best was not
enough. /What physical science could devise and synthesize,
physical science could analyze and duplicate/; and that
analysis and duplication had caused trouble indeed.

(I realize this is not at issue and that everyone agrees this
to be the case. I just like the quote ;)

Andrew
(e-mail address removed)
 
Will Stuyvesant

Thank you all for your input and thoughts! I am replying to my first
post since otherwise I would have to choose one of the threads and I
can't: several are useful.

I got the customer's interest via a CGI based webservice that did show
what I can do. Given this, now it would be very simple for me to keep
it as a webservice and give them a client script using the urllib
module or something like that...it would access the webservice and
they would never see the algorithm (I want to keep that to myself).
That is the solution I would like best. Not only because I want to
keep stuff to myself, but also because then I can easily upgrade to
new versions in one fell swoop if they or others are interested.
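A client along those lines could be as small as the sketch below. It is only an illustration: the URL is a placeholder (the real service is not named here), the content type is a guess, and it uses `urllib.request`, which is where the `urllib` functionality lives in current Python.

```python
import urllib.request

# Hypothetical endpoint; the thread never names the real service.
SERVICE_URL = "http://example.com/cgi-bin/transform"


def build_request(xml_text, url=SERVICE_URL):
    """Build a POST request carrying the XML document to transform."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


def transform(xml_text, url=SERVICE_URL, timeout=30):
    """Send the document to the server and return the transformed result."""
    with urllib.request.urlopen(build_request(xml_text, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

The customer would only ever hold this file; the algorithm stays behind the URL, and a new version is deployed by changing the server alone.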

But a program that has to be connected to the internet is not
acceptable, their boss says. Humph.

Maybe I should trust them and do normal business as I did before with
other customers already: just send them the commandline version of the
program; but somewhere in the back of my mind I feel uneasy about
this. Dunno why. Maybe just because they have a pointy haired boss.

Alex Martelli talked about the software business in Southern Europe. Well,
it's one big EU now, with the Italian Berlusconi as chairman (soon
appointed invulnerable for that job too?), and I feel it's the same
thing now here in mid-Europe too.

I guess I am going to send them an MIT licence (although I am afraid
those licences are pretty useless in The Netherlands), the .pyc files
for the algorithm and the utility modules, and a .py file for the main
program. Or maybe I am going to use upx or another .exe encrypter
(but then I have to find out first if I can wrap my head around the
usage of the latest py2exe thing). Then if I find out later they
copied the algorithm then maybe I have a case against them. Let's
just hope it doesn't come to that; I'd rather start another project.
But this little thing I like and it would be great if I could make
some extra money with it. Curious? It does XML transformations.
With a twist.

Will let you know how it works out, takes a couple of weeks perhaps
until they decide.

Thank you!
 
Peter Hansen

Will said:
I guess I am going to send them an MIT licence (although I am afraid
those licences are pretty useless in The Netherlands), the .pyc files
for the algorithm and the utility modules, and a .py file for the main
program.

No need even for that .py file, is there? You can easily execute
a .pyc file directly if the .py doesn't exist. To create it,
easiest way is just to do "import mymainmodule" from the interactive
prompt, then copy the resulting .pyc somewhere else.

Alternatively, keep your .py as the main entry point, but do nothing
inside it except import another module and execute code in it.
But if you can do that, then invoking that module directly is easy
too: "python -c 'import main; main.main()'" does the trick...
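If you'd rather script the compile step than do the import by hand, the standard `py_compile` module can write the .pyc wherever you like. A minimal sketch, assuming a throwaway one-line program and the made-up helper names `ship_as_pyc` and `run_pyc`:

```python
import os
import py_compile
import subprocess
import sys
import tempfile


def ship_as_pyc(source_text, workdir, name="main"):
    """Compile source to <name>.pyc in workdir and delete the .py file."""
    src = os.path.join(workdir, name + ".py")
    pyc = os.path.join(workdir, name + ".pyc")
    with open(src, "w") as f:
        f.write(source_text)
    # cfile= puts the .pyc next to the source instead of in __pycache__.
    py_compile.compile(src, cfile=pyc, doraise=True)
    os.remove(src)
    return pyc


def run_pyc(pyc_path):
    """Run the .pyc with the same interpreter that compiled it."""
    result = subprocess.run(
        [sys.executable, pyc_path], capture_output=True, text=True, check=True
    )
    return result.stdout


with tempfile.TemporaryDirectory() as tmp:
    pyc_file = ship_as_pyc('print("hello from bytecode")\n', tmp)
    output = run_pyc(pyc_file)
    print(output, end="")
```

The .py file is gone by the time the program runs, so this shows the sourceless case directly. It only works when the interpreter that runs the file matches the one that compiled it.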

-Peter
 
Erik Max Francis

Will said:
Maybe I should trust them and do normal business as I did before with
other customers already: just send them the commandline version of the
program; but somewhere in the back of my mind I feel uneasy about
this. Dunno why. Maybe just because they have a pointy haired boss.

I'd be a little concerned why they're asking for a trial demo which
they'll decide whether to pay you for afterward given that, as you've
said, you've already demonstrated the ability to do the job with a CGI
based service that they've already explored and is what in fact drew
them to you in the first place. At this point I'd probably press them on
what the difficulty in agreeing on contract fees up front would
be.

I agree with Peter's response, though -- it sounds like in this case,
simply distributing .pyc files (make sure you make it clear which
version of Python the .pyc files were created with, since the format
changes with releases!) would be sufficient obscurity to encourage them
to actually pay you once they've gotten the package and have
demonstrated it to work. I'd still wonder why they're insisting on in
effect a second trial demo, here; I now understand your initial
hesitation.
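On the version point: a .pyc starts with a release-specific magic number, so a mismatch can be checked before anything is imported. A small sketch, assuming a throwaway module called `algo.py` and a helper name, `pyc_matches_interpreter`, that I made up:

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc was compiled for the running interpreter."""
    with open(pyc_path, "rb") as f:
        magic = f.read(4)  # version-specific magic number at the start of the file
    return magic == importlib.util.MAGIC_NUMBER


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "algo.py")  # hypothetical module name
    with open(src, "w") as f:
        f.write("x = 1\n")
    pyc = py_compile.compile(src, cfile=src + "c")
    same_version = pyc_matches_interpreter(pyc)

    # Overwrite the header to mimic a file built by a different release.
    with open(pyc, "r+b") as f:
        f.write(b"\x00\x00\x00\x00")
    other_version = pyc_matches_interpreter(pyc)

print(same_version, other_version)
```

A check like this could go in a tiny launcher so the customer gets a clear message instead of a "bad magic number" import error.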
Will said:
I guess I am going to send them an MIT licence (although I am afraid
those licences are pretty useless in The Netherlands) ...

If you're planning on reselling this to other customers, why would you
want an MIT license?

I'd also make it very clear, if you haven't already -- before you give
them anything -- whether you're transferring the
copyright to them or they're merely buying a license to use it, which leaves you
with the ability to resell it to other clients.
 
John J. Lee

Even if you knew exactly where on a chip to look, and it wasn't

(Which knowledge is bound to become available -- I don't think any
leak is required.)

engineered to have the key self-destruct when exposed, what would

Exposed to what?

you do with the key? You'd have the binary image of an executable
meant to execute in the secret-room processing core. How would you

No, you already have that -- it's on your hard drive (the current
scheme is only about the processor & associated gubbins, if I read
Ross Anderson's page right).

make it available to anyone else?
[...]

Copy it.

I think the idea is something like this (got from Ross Anderson's TC
FAQ). The processor makes sure that a single process can only see
its own memory space. The processor also has a private key, and
knows how to take an md5 sum (or whatever), sign it with the key, and
send that off to the software author's server along with your
identity. The server checks that it was signed with your processor's
private key, and that you've paid for the software, and sends a
signed message back that tells your machine "OK". Obviously (hmm... I
should hesitate to use that word about anything related to security!),
if you have your machine's private key, you can play
"man-in-the-middle".
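To make the shape of that exchange concrete, here is a toy version. Everything in it is an assumption for illustration: the key is textbook RSA built from two small Mersenne primes (hopelessly insecure), SHA-256 stands in for "md5 (or whatever)", and the payment check and the network are left out entirely. It is not how any real trusted-computing scheme is specified.

```python
import hashlib

# Toy RSA key built from two small Mersenne primes. Far too small to be
# secure; it only stands in for the processor's key pair.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent, "inside the chip"


def digest(image):
    """Hash the program image and reduce it to an integer below N."""
    return int.from_bytes(hashlib.sha256(image).digest(), "big") % N


def processor_sign(image, private_exp=D):
    """What the chip would do: sign the hash with its private key."""
    return pow(digest(image), private_exp, N)


def server_verify(image, signature):
    """What the author's server would do: check with the public key."""
    return pow(signature, E, N) == digest(image)


if __name__ == "__main__":
    licensed = b"licensed program image"
    modified = b"modified program image"
    sig = processor_sign(licensed)
    print(server_verify(licensed, sig))   # genuine attestation
    print(server_verify(modified, sig))   # old signature does not carry over
    # Anyone holding the extracted private exponent can attest to anything.
    print(server_verify(modified, processor_sign(modified, D)))
```

The last line is the man-in-the-middle case: once the private exponent is out of the chip, the server cannot tell a forged attestation from a real one.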

Presumably the next phase is to make hard drives, etc. 'trusted'. I
couldn't find much useful stuff on this on the web. Anybody have any
good links to overviews of this?


John
 
Max M

Erik said:
I'd be a little concerned why they're asking for a trial demo which
they'll decide whether to pay you for afterward given that, as you've
said, you've already demonstrated the ability to do the job with a CGI
based service that they've already explored and is what in fact drew
them to you in the first place. At this point I'd probably press on
about what the difficulty is in agreeing on contract fees up front would
be.


I find this odd too. I have never had to deliver a working version of a
product before the customer decided if they wanted to pay me or not.

Just give them an offer for the delivered programme specifying which
features it will have.

Then they either say yes or no. And you won't have to go through all
that trouble.


regards Max M
 
John J. Lee

Alex Martelli said:
John J. Lee wrote: [...]
Though information is indeed always incomplete, it seems a good bet
that war3zd00dz are not an issue for a consultant being hired by a
company to write a 1000 line program. Do you disagree?
[...]
we weren't talking about somebody being _hired_, but rather
wanting to sell what they independently came up with the idea
of developing -- there's a difference! And yes, it wouldn't

Right. Substitute "a consultant selling a not-widely-distributed 1000
line program to a company" in what I said, though, and I think it's
still a good bet.

be the first time that a company deliberately exploits the warez
"circuit" to get programs cracked -- look around and you'll see
it's definitely NOT just games and the like that end up there.

Oh sure, but don't the vast majority tend to be far more widely
distributed than (I imagine, guessing of course) this 1000 line code
is? Maybe I'm just naive.


[...about decompilation: recovering source-like code from compiled code...]
[...]

OK, OK, not impossible if you have knowledge of the way compilers
actually do things (and, sigh... security is always about 'cheating',
isn't it). Still, even given that, the sample input / output on the
second two pages, though impressive, appear to show that doing it in
practice is far from a solved problem (assuming this is representative
of the state of the art). One would expect that it's far harder to do
this with optimised languages than with Python -- true?


John
 
