Encryption with Python?

  • Thread starter Blake T. Garretson
  • Start date
B

Blake T. Garretson

I want to save some sensitive data (passwords, PIN numbers, etc.) to
disk in a secure manner in one of my programs. What is the
easiest/best way to accomplish strong file encryption in Python? Any
modern block cipher will do: AES, Blowfish, etc. I'm not looking for
public key stuff; I just want to provide a pass-phrase.

I found a few modules out there, but they seem to be all but abandoned.
Most seem to have died several years ago. The most promising package
is A.M. Kuchling's Python Cryptography Toolkit
(http://www.amk.ca/python/code/crypto.html).

Is this the defacto Python encryption solution? What does everyone
else use? Any other suggestions? The SSLCrypto package
(http://www.freenet.org.nz/python/SSLCrypto/) may be a good alternative
too, but I am not sure if it is actively maintained.

Thanks,
Blake
 
F

Fredrik Lundh

Blake said:
I found a few modules out there, but they seem to be all but abandoned.
Most seem to have died several years ago.

a lack of recent releases can also mean that they just work...
Is this the defacto Python encryption solution? What does everyone
else use? Any other suggestions?

http://sandbox.rulemaker.net/ngps/m2/

is actively maintained, as far as I can tell.

you might also be able to find some useful stuff inside:

http://trevp.net/tlslite/

(see the utils package for a pure-python AES implementation)

</F>
 
P

Paul Rubin

Blake T. Garretson said:
I want to save some sensitive data (passwords, PIN numbers, etc.) to
disk in a secure manner in one of my programs. What is the
easiest/best way to accomplish strong file encryption in Python? Any
modern block cipher will do: AES, Blowfish, etc. I'm not looking for
public key stuff; I just want to provide a pass-phrase.

http://www.nightsong.com/phr/crypto/p3.py

It uses SHA1 in OFB mode and is fairly fast for a pure Python function.
I found a few modules out there, but they seem to be all but abandoned.
Most seem to have died several years ago. The most promising package
is A.M. Kuchling's Python Cryptography Toolkit
(http://www.amk.ca/python/code/crypto.html).

Nice toolkit, more flexible and powerful than p3.py, but also more
complicated.
 
I

Ivan Voras

Blake said:
I want to save some sensitive data (passwords, PIN numbers, etc.) to
disk in a secure manner in one of my programs. What is the
easiest/best way to accomplish strong file encryption in Python? Any
modern block cipher will do: AES, Blowfish, etc. I'm not looking for
public key stuff; I just want to provide a pass-phrase.

There's a pure python Blowish implementation at:
http://bofh.concordia.ca/blowfish.py

(It looks like you'll have to divide your data in 8 byte blocks
yourself, and pad the last block)
 
H

Henk-Jan de Jong

Blake said:
I want to save some sensitive data (passwords, PIN numbers, etc.) to
disk in a secure manner in one of my programs. What is the
easiest/best way to accomplish strong file encryption in Python? Any
modern block cipher will do: AES, Blowfish, etc. I'm not looking for
public key stuff; I just want to provide a pass-phrase.

I found a few modules out there, but they seem to be all but abandoned.
Most seem to have died several years ago. The most promising package
is A.M. Kuchling's Python Cryptography Toolkit
(http://www.amk.ca/python/code/crypto.html).

Is this the defacto Python encryption solution? What does everyone
else use? Any other suggestions? The SSLCrypto package
(http://www.freenet.org.nz/python/SSLCrypto/) may be a good alternative
too, but I am not sure if it is actively maintained.

Thanks,
Blake

There's a DES implementation at

http://home.pacific.net.au/~twhitema/des.html
 
A

Anthra Norell

I rolled my own for relatively short sequences, like passwords. The key is
an integer. To decrypt use the negative encryption key. I consider the
encryption unbreakable, as it is indistinguishable from a random sequence.

Frederic

###

def crypt (sequence, key):
import random
sign = (key > 0) * 2 - 1
random.seed (abs (key * sign))
s = ''
for i in xrange (len (sequence)):
r = random.randint (0, 255)
s += chr ((ord (sequence ) + r * sign) % 256)
return s

###

If unauthrorized use of the machine is a concern, crypt might not be an
appropriate name for the function. To help an intruder understand it, a doc
string such as the following one might be useful:

def cyrep (sequence, key):

"""
Cyclic Redundancy Preprocessor
Cyclic redundancy checks often fail to differentiate relatively
short
sequences that contain sequences of binary zeroes of different
length.
This preporcessor applies a keyed randomizing filter to improve the
capability of the cyclic redundancy algorithm. Use the output of
this
function as input to a CRC routine.
Negative keys cannot be used. If passed they are positivized
rather
than rejected.
"""

###

----- Original Message -----
From: "Blake T. Garretson" <[email protected]>
Newsgroups: comp.lang.python
To: <[email protected]>
Sent: Thursday, May 05, 2005 10:20 PM
Subject: Encryption with Python?
 
P

Paul Rubin

Anthra Norell said:
I rolled my own for relatively short sequences, like passwords. The
key is an integer. To decrypt use the negative encryption key. I
consider the encryption unbreakable, as it is indistinguishable from
a random sequence.

You're using the built-in random module which is designed to provide
only statistical randomness and not cryptographic security. It should
not be used for encryption. The math paper describing the function is
quite clear about that. There is a lot of subtlety to this stuff and
it's easy to make mistakes even if you know what you're doing. Even
using well-tested block ciphers (various people mentioned DES, AES,
and Blowfish modules) it's easy to make mistakes in choosing operation
modes, thinking you don't need authentication when you really do,
etc., etc. The book "Practical Cryptography" by Bruce Schneier and
Niels Ferguson is worth looking at if you want to see what you're
getting yourself into.

I hate to come across as plugging my own stuff too much, but

http://www.nightsong.com/phr/crypto/p3.py

is designed to take care of most of the tricky issues for you while
still having a very simple interface, and also be reasonably fast
(much faster for large messages than any of the pure Python block
cipher modules). Just use p3_encrypt(plain) to encrypt and
p3_decrypt(cipher) to decrypt. The main penalty you pay is that the
ciphertext is a couple dozen bytes longer than the plaintext. There
are cryptographically important reasons for that; don't try to escape
it without knowing exactly what's going on.
 
R

Robert Kern

Anthra said:
I rolled my own for relatively short sequences, like passwords. The key is
an integer. To decrypt use the negative encryption key. I consider the
encryption unbreakable, as it is indistinguishable from a random sequence.

Frederic

###

def crypt (sequence, key):
import random
sign = (key > 0) * 2 - 1
random.seed (abs (key * sign))
s = ''
for i in xrange (len (sequence)):
r = random.randint (0, 255)
s += chr ((ord (sequence ) + r * sign) % 256)
return s


The mind boggles.

You do realize that if I have two ciphertexts encrypted with the same
key, I can subtract them? Then I have a sequence, that while not
immediately readable, is just a straightforward combination of the two
plaintexts without any encryption.

This function is also vulnerable to a chosen-plaintext attack. The
underlying PRNG is definitely not suitable for cryptographic
applications. The documentation even says so!

http://docs.python.org/lib/module-random.html
"However, being completely deterministic, it is not suitable for all
purposes, and is completely unsuitable for cryptographic purposes."

Do yourself a favor and don't try to roll your own cryptographic
functions. Do everyone else a favor and don't call something
"unbreakable" unless you actually have the domain expertise to make that
determination.

And do read _Practical Cryptography_.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
A

Anthra Norell

Thanks a lot for the feedback. This is certainly a great learning
experience. It's a fascinating topic too. Without wishing to annoy, I'd be
interested in knowing more. I insert questions below.

----- Original Message -----
From: "Robert Kern" <[email protected]>
To: <[email protected]>
Sent: Saturday, May 07, 2005 11:11 AM
Subject: Re: Encryption with Python?

Anthra said:
I rolled my own for relatively short sequences, like passwords. The key is
an integer. To decrypt use the negative encryption key. I consider the
encryption unbreakable, as it is indistinguishable from a random sequence.

Frederic

###

def crypt (sequence, key):
import random
sign = (key > 0) * 2 - 1
random.seed (abs (key * sign))
s = ''
for i in xrange (len (sequence)):
r = random.randint (0, 255)
s += chr ((ord (sequence ) + r * sign) % 256)
return s


The mind boggles.

You do realize that if I have two ciphertexts encrypted with the same
key, I can subtract them? Then I have a sequence, that while not
immediately readable, is just a straightforward combination of the two
plaintexts without any encryption.


How do you know the same key was used?
What if the messages aren't the same length?
How does one reconstruct either or both of two messages from their
difference?
This function is also vulnerable to a chosen-plaintext attack. The
underlying PRNG is definitely not suitable for cryptographic
applications. The documentation even says so!

What is a plain-text attack? Is it you give me a plain text. I encrypt it
using the same keys and hand you the result? Would I comply with the
request?
http://docs.python.org/lib/module-random.html
"However, being completely deterministic, it is not suitable for all
purposes, and is completely unsuitable for cryptographic purposes."

The logic escapes me if it is understood to mean that cryptography, in order
to be usable, has to be indeterministic. If, however, it is meant to suggest
that it is the reversible, direct one-on-one relation between all the
characters of a plain text and its encryption that falls short of
state-of-the-art technology, I'd have no problem with that. That doesn't
mean, however, that an exercise is required to measure up to
state-of-the-art standards in order to be taken seriously. I do invent my
own solutions for simple problems. I find it more challenging than stalking
other people's solutions how ever superior and typically it also saves me
time. It is not my ambition to excel in any field, nor to flaunt my
erudition. It is my ambition to satisfy the requirements of a given task.
And this, I think, my solution does. I can be wrong and if so, I'd
appreciate being proven wrong. I have now received well-meaning advice for
which I am grateful. But I haven't been proven wrong. I claim that my
solution satisfies the requirements of the task at hand and challenge anyone
to prove the contrary. You can meet the challenge by deciphering the
following tiny message, knowing now the encryption method, which in practice
would not be known and, as it seems, would hardly be suspected by
deciphering experts for its amateurishness.
Do yourself a favor and don't try to roll your own cryptographic
functions. Do everyone else a favor and don't call something
"unbreakable" unless you actually have the domain expertise to make that
determination.

Here's the challenge. Prove this breakable

'\x10\x88d\x1d\xba\xa1\xdcK\x05w\x02/s\xa7Q0\xeb8\xb6Gx\xef\xcb\x1e=\xf5\x7f
\x9bI\xcb(\x87>\xa5\x04\xc1soF\xfd\xc6\xc6\xd9|\x971\xdb\xcdT\tw#\x86a\xdc\x
b8P\xfb=n\xda\x80\x9f\x84m\x12\x98\x98\xca=o\x0b\x8e\x08O\xb7\x0b\x04SC\x96\
xc7\xab*\x0b\x996\x06\x86\x83(\x8dQ\x9eG\x8f$\xb2x)\xa9fv\x0c1B\x9b\r\xde\xf
fc\x08'

When you give up we may try a plain-text attack. Okay?
And do read _Practical Cryptography_.

I certainly will.
 
S

Sergei Organov

Anthra Norell said:
Thanks a lot for the feedback. This is certainly a great learning
experience. It's a fascinating topic too. Without wishing to annoy, I'd be
interested in knowing more. I insert questions below.

There is a lot of information about the issues on the net. Before asking
questions, isn't it better to make some research? For example, your last
question is perfectly answered here:

<http://www.faqs.org/faqs/cryptography-faq/part02/>

"""
2.3. How do I present a new encryption scheme in sci.crypt?

``I just came up with this neat method of encryption. Here's some
ciphertext: FHDSIJOYW^&%$*#@OGBUJHKFSYUIRE. Is it strong?'' Without a
doubt questions like this are the most annoying traffic on sci.crypt.
...
"""

I'd suggest you to start your education here before you invent yet
another "simple" solutions for a complex problem:

<http://www.faqs.org/faqs/cryptography-faq/part01/>

HTH.
 
R

Robert Kern

Anthra said:
Thanks a lot for the feedback. This is certainly a great learning
experience. It's a fascinating topic too. Without wishing to annoy, I'd be
interested in knowing more. I insert questions below.

----- Original Message -----
From: "Robert Kern" <[email protected]>
To: <[email protected]>
Sent: Saturday, May 07, 2005 11:11 AM
Subject: Re: Encryption with Python?


Anthra said:
I rolled my own for relatively short sequences, like passwords. The key
is
an integer. To decrypt use the negative encryption key. I consider the
encryption unbreakable, as it is indistinguishable from a random
sequence.
Frederic

###

def crypt (sequence, key):
import random
sign = (key > 0) * 2 - 1
random.seed (abs (key * sign))
s = ''
for i in xrange (len (sequence)):
r = random.randint (0, 255)
s += chr ((ord (sequence ) + r * sign) % 256)
return s


The mind boggles.

You do realize that if I have two ciphertexts encrypted with the same
key, I can subtract them? Then I have a sequence, that while not
immediately readable, is just a straightforward combination of the two
plaintexts without any encryption.


How do you know the same key was used?


I don't necessarily, but if the same key *is* used, I'll know. The
differences of the ciphertexts will have distinctly non-random
characteristics (well, assuming the plaintexts aren't one-time pads
themselves).
What if the messages aren't the same length?

Then I just look at the common portion.
How does one reconstruct either or both of two messages from their
difference?

Analyzing patterns, frequencies, and also using prior knowledge I may
know or guess about the contents. This is only slightly worse (for the
attacker) than tackling a straight substitution cipher.
What is a plain-text attack?

"Chosen-plaintext," please.
Is it you give me a plain text. I encrypt it
using the same keys and hand you the result?

More or less.
Would I comply with the
request?

Depending on how your cryptosystem is deployed, you may not have a
choice. A cryptosystem ought to be resistent to chosen-plaintext attacks
regardless of how you intend to deploy it.
The logic escapes me if it is understood to mean that cryptography, in order
to be usable, has to be indeterministic.

The two characteristics "completely deterministic" and "unsuitable for
cryptographic purposes" are only mildly related. One *definitely*
wouldn't use any PRNG to generate keying material. You should rely on
physical sources of randomness. Most modern OSes have some way to access
reliable entropy from your machine. Unices usually have this as
/dev/random. But that's a separate issue.

Additionally, one should not use the Mersenne Twister PRNG, which is
what Python uses, as the PRNG for the stream cipher. Given a particular
value or series of values, it is possible to determine one's position in
the PRNG sequence and can, in essence derive the key. Some techniques
can be used to hide actual current state of the PRNG like applying a
cryptographic hash to the value from the PRNG. However, it's easier and
far better to just use a cryptographically strong PRNG from the start.
If, however, it is meant to suggest
that it is the reversible, direct one-on-one relation between all the
characters of a plain text and its encryption that falls short of
state-of-the-art technology, I'd have no problem with that. That doesn't
mean, however, that an exercise is required to measure up to
state-of-the-art standards in order to be taken seriously. I do invent my
own solutions for simple problems.

This is not a simple problem.

But fortunately, there *is* a relatively simple answer. Besides Paul
Rubin's p3.py (which to my limited ability to determine is probably
"best-of-breed" for the limits imposed upon it), there is a very
simply-coded stream cipher that *is* cryptographically strong if the
appropriate details are observed. It's called RC4 (or ARCFOUR). A
reasonably good cryptosystem built around that cipher is called
Ciphersaber-2.

http://ciphersaber.gurus.com/cryptanalysis.html

There are dozens of implementations floating around, but it's
ridiculously easy to code up by yourself.
I find it more challenging than stalking
other people's solutions how ever superior and typically it also saves me
time. It is not my ambition to excel in any field, nor to flaunt my
erudition. It is my ambition to satisfy the requirements of a given task.
And this, I think, my solution does. I can be wrong and if so, I'd
appreciate being proven wrong. I have now received well-meaning advice for
which I am grateful. But I haven't been proven wrong.

You haven't proved your claim that your cipher is "unbreakable." Your
notion of "unbreakable" is also flawed; "indistinguishable from 'random'
sequences" is only one part of cryptosystem security. For that matter,
you haven't proven that the ciphertexts produced are "indistinguishable
from random sequences." Note the plural. It's important.

You have a positive obligation to back your claim. I've pointed you to
two obvious attacks that can be made against your system and also to
resources that you can read to improve your knowledge of the issues. I
*don't* have an obligation to spend more of my time meeting your
arbitrary challenge. My reluctance to do so is not support for your claim.
I claim that my
solution satisfies the requirements of the task at hand and challenge anyone
to prove the contrary. You can meet the challenge by deciphering the
following tiny message, knowing now the encryption method, which in practice
would not be known

Bull. And irrelevant.

Kerckhoffs' Principle
"A cryptosystem should be secure even if everything about the system,
except the key, is public knowledge."

http://en.wikipedia.org/wiki/Kerckhoffs'_principle
and, as it seems, would hardly be suspected by
deciphering experts for its amateurishness.

This is not something to rely upon.
Here's the challenge. Prove this breakable

'\x10\x88d\x1d\xba\xa1\xdcK\x05w\x02/s\xa7Q0\xeb8\xb6Gx\xef\xcb\x1e=\xf5\x7f
\x9bI\xcb(\x87>\xa5\x04\xc1soF\xfd\xc6\xc6\xd9|\x971\xdb\xcdT\tw#\x86a\xdc\x
b8P\xfb=n\xda\x80\x9f\x84m\x12\x98\x98\xca=o\x0b\x8e\x08O\xb7\x0b\x04SC\x96\
xc7\xab*\x0b\x996\x06\x86\x83(\x8dQ\x9eG\x8f$\xb2x)\xa9fv\x0c1B\x9b\r\xde\xf
fc\x08'
>
When you give up we may try a plain-text attack. Okay?

I have better things to do with my time. It's not me that needs to prove
anything. You came here and made a claim. I am asking you to back it up.
Until you do so, you don't have much right to challenge me to prove
anything.
I certainly will.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
A

Anthra Norell

Robert,
Thanks a lot for your thorough explanations.

----- Original Message -----
From: "Robert Kern" <[email protected]>
To: <[email protected]>
Sent: Saturday, May 07, 2005 3:18 PM
Subject: Re: Encryption with Python?

( snip )
I don't necessarily, but if the same key *is* used, I'll know. The
differences of the ciphertexts will have distinctly non-random
characteristics (well, assuming the plaintexts aren't one-time pads
themselves).

The non-randomness of the difference is evidence of having guessed the key,
right? Why then do I need two samples? If I hack away at a single sample I
get a probably more conspicuous non-randomness twice as fast.

( snip )
Additionally, one should not use the Mersenne Twister PRNG, which is
what Python uses, as the PRNG for the stream cipher. Given a particular
value or series of values, it is possible to determine one's position in
the PRNG sequence and can, in essence derive the key. Some techniques
can be used to hide actual current state of the PRNG like applying a
cryptographic hash to the value from the PRNG. However, it's easier and
far better to just use a cryptographically strong PRNG from the start.

I don't doubt that, given a series (long enough), the postion can be
derived. I doubt, though, that a series is knowable, if another, unknown,
series has been added to it.
This is not a simple problem.

I thought the problem was concealing passwords from ones kids or
collaborators.
You haven't proved your claim that your cipher is "unbreakable." Your
notion of "unbreakable" is also flawed; "indistinguishable from 'random'
sequences" is only one part of cryptosystem security. For that matter,
you haven't proven that the ciphertexts produced are "indistinguishable
from random sequences." Note the plural. It's important.

I believe that a randomly distributed series utterly obliterates any
non-randomness or regularity of a second series, if the two are added. I
don't know how to produce a formal proof. It is just a hunch. It's actually
more than a hunch. It is a conviction. Not a certainty; a conviction. I'd be
delighted to be proved wrong (or right). The fact may be significant that we
module overflow back into the range.
So, if the distribution of my code is indeed random, it will display
no clue about the key and can only be broken with a brute-force attack. This
imposes a second requirement, which is that a brute-force attack outtries
the patience of a teraflop machine. The function I posted, quite obviously,
does not satisfy this second requirement. The function can easily be applied
in a manner, though, that takes care of that.
A third rquirement I cannot think of.
You have a positive obligation to back your claim. I've pointed you to
two obvious attacks that can be made against your system and also to
resources that you can read to improve your knowledge of the issues. I
*don't* have an obligation to spend more of my time meeting your
arbitrary challenge. My reluctance to do so is not support for your claim.

I am not aiming at a Nobel prize and certainly don't presume to impose on
your priorities. So the term 'obligation' seems not very useful here.
Bull. And irrelevant.

Irrelevant okay, if the OP agrees.

( snip )

Best regards,

Frederic
 
P

Paul Rubin

Anthra Norell said:
The non-randomness of the difference is evidence of having guessed the key,
right? Why then do I need two samples? If I hack away at a single sample I
get a probably more conspicuous non-randomness twice as fast.

No. Let's say you encrypt two ascii strings with the same key. The
high bit of each byte in the plaintext is zero. Therefore if you xor
the two ciphertexts together, the high bit of each byte of the result
xor'd ciphertexts will be zero. So just check for that and you've
immediately spotted the non-randomness.
I don't doubt that, given a series (long enough), the postion can be
derived. I doubt, though, that a series is knowable, if another, unknown,
series has been added to it.

You have to assume that the attacker has access to known
plaintext-ciphertext pairs. For example, you might not tell someone
the password you use now, but what about some old password that you
don't use any more? If the attacker knows your old password (maybe
because your sysop set it to some default value and had you change it
on your first login), and has the encrypted version, there's a known
plaintext.
I thought the problem was concealing passwords from ones kids or
collaborators.

Encryption is supposed to conceal data from knowledgable attackers
willing to burn significant resources to get at the data. That might
or might not describe your friends and collaborators.
I believe that a randomly distributed series utterly obliterates any
non-randomness or regularity of a second series, if the two are added.

This is a meaningless statement since you don't give any definition of
"randomly distributed series".
I don't know how to produce a formal proof. It is just a hunch. It's
actually more than a hunch. It is a conviction. Not a certainty; a
conviction. I'd be delighted to be proved wrong (or right).

If the keystream really can't be distinguished from random, then correct,
though there's still issues with key management (you mustn't use the same
key twice) and authentication.

Generating keystreams that are indistinguishable from random is an
extremely tricky subject, there are books and papers written about it, etc.
The fact may be significant that we module overflow back into the
range. So, if the distribution of my code is indeed random,

Your code used the Python built-in PRNG algorithm which is designed to
make output with similar statistical properties as actual random numbers,
for the purpose of running stuff like simulations. It makes no attempt
at all to be secure against attackers trying to figure out whether the
output is really random.
 
A

Anthra Norell

Paul,

I thank you too for your response. Let me just tell you what goes
through my mind.

----- Original Message -----
From: "Paul Rubin" <"http://phr.cx"@NOSPAM.invalid>
Newsgroups: comp.lang.python
To: <[email protected]>
Sent: Monday, May 09, 2005 9:45 PM
Subject: Re: Encryption with Python?

No. Let's say you encrypt two ascii strings with the same key. The
high bit of each byte in the plaintext is zero. Therefore if you xor
the two ciphertexts together, the high bit of each byte of the result
xor'd ciphertexts will be zero. So just check for that and you've
immediately spotted the non-randomness.

I don't follow. There is no bitwise correlation between a plain-text
character and its encoded equivalent. What's more, there is no detectable
correlation at all. Take a highly ordered plain text, such as a sting of
identical characters, e.g. 'AAAAAAAAAAAAAAA' and sequentially add a
deterministically generated random series (module 256 to keep them within
range), what you get is, quite obviously, a series of numbers that differs
from the added random series in only one respect: all values are shifted by
ord ('A'). The intervals from one byte to the next remain unchanged,
allowance made for the module 256 wrap-around. The intervals remaining
unchanged, the distribution and hence the randomness of the encryption
remains unchanged. Quite obviously, each one of the identical plain-text
characters very likely will be encrypted differently. Repeats would occur,
but they would occur randomly once every 256 times on an average.
You have to assume that the attacker has access to known
plaintext-ciphertext pairs. For example, you might not tell someone
the password you use now, but what about some old password that you
don't use any more? If the attacker knows your old password (maybe
because your sysop set it to some default value and had you change it
on your first login), and has the encrypted version, there's a known
plaintext.

Password management is certainly a problem, but of course is totally
unrelated to the quality of an encryption method.
Encryption is supposed to conceal data from knowledgable attackers
willing to burn significant resources to get at the data. That mightof
or might not describe your friends and collaborators.

I agree. Depending on a situation, a solution might or might not be
adequate.
This is a meaningless statement since you don't give any definition of
"randomly distributed series".

No, in fact I don't. I am quite confident that the library module 'random'
produces random distributions. As to the distribution of a non-random series
added to a random series, my intuition tells me that it remains random.

I don't think it would be difficult for a mathematician to prove or disprove
the hypothesis. I did come up with an informal proof. It is a function I
will add at the bottom of this message. You can copy and run it, if you have
the Image package installed.
If the keystream really can't be distinguished from random, then correct,
though there's still issues with key management (you mustn't use the same
key twice) and authentication.

The key is the seed of the random generator.
Generating keystreams that are indistinguishable from random is an
extremely tricky subject, there are books and papers written about it,
etc.

I agree. I wouldn't know how to design a random generator. Fortunately I
don't need to. I can use ready made ones.
Your code used the Python built-in PRNG algorithm which is designed to
make output with similar statistical properties as actual random numbers,
for the purpose of running stuff like simulations. It makes no attempt
at all to be secure against attackers trying to figure out whether the
output is really random.

Try out the following function. You need the Image package.

Regards

Frederic

###############################################

def informal_proof_of_randomness (text_file_name): # File must be longer
than 60K

X = 0
Y = 1

import random

NUMBER_OF_DOTS = 30000

# Make three lists to collect coordinate pairs
random_coordinates = NUMBER_OF_DOTS * [None]
plain_text_coordinates = NUMBER_OF_DOTS * [None]
encoded_text_coordinates = NUMBER_OF_DOTS * [None]

# Fill one with random numbers
for i in xrange (NUMBER_OF_DOTS):
x = random.randint (0,255)
y = random.randint (0,255)
random_coordinates = ((x,y))

# Fill a second one with ascii codes (plain text)
f = file (text_file_name, 'rb')
for i in xrange (NUMBER_OF_DOTS):
# No check if file runs out prematurely. Make sure it's at least 60K
x = ord (f.read (1))
y = ord (f.read (1))
plain_text_coordinates = ((x,y))
f.close ()

# Fill a third one with the sum of the two previous ones (proposed cipher
text)
for i in xrange (NUMBER_OF_DOTS):
encoded_text_coordinates = \
(random_coordinates [X] + plain_text_coordinates [X]) % 256,
\
(random_coordinates [Y] + plain_text_coordinates [Y]) % 256


import Image

WHITE = 1
BLACK = 0

# Make three images to visualize the distribution
random_image = Image.new ('1', (256,256), WHITE)
text_image = Image.new ('1', (256,256), WHITE)
encoded_text_image = Image.new ('1', (256,256), WHITE)

for coordinate in random_coordinates:
random_image.putpixel (coordinate, BLACK)

for coordinate in plain_text_coordinates:
text_image.putpixel (coordinate, BLACK)

for coordinate in encoded_text_coordinates:
encoded_text_image.putpixel (coordinate, BLACK)

# Visualize the distributions
random_image.show () # Looks like rain on a pavement
text_image.show () # Does not at all
encoded_text_image.show () # See for yourself

##############
 
P

Paul Rubin

Anthra Norell said:
I don't follow. There is no bitwise correlation between a plain-text
character and its encoded equivalent. What's more, there is no detectable
correlation at all.

Your question was how could you tell if two ciphertexts were encrypted
with the same key. Answer: suppose both plaintext are ascii strings.
Then both plaintexts have 0 as the top bit of every byte. So do this:

x = ciphertext1 xor ciphertext2

If ciphertext1 and ciphertext2 were encrypted with two different keys,
the top bit of x's bytes will be random-looking. If ciphertext1 and
ciphertext2 were encrypted with the same key, the top bit of each of
x's bytes will be 0. So just check whether the top bit of x is always
0. If it is, then ciphertexts 1 and 2 were probably encrypted with
the same key.
Password management is certainly a problem, but of course is totally
unrelated to the quality of an encryption method.

You're ignoring your own question. With a good encryption scheme,
finding out an old password doesn't tell you anything about new
messages. With your encryption scheme, finding out an old password
leaks information about the new one.
I agree. Depending on a situation, a solution might or might not be
adequate.

Since good encryption schemes that don't have significant performance
penalties are widely available, why mess with a crap scheme EVER? Why
use a solution that "might or might not be adequate" when you can use
one that's definitely ok?
No, in fact I don't. I am quite confident that the library module 'random'
produces random distributions.

The author of the algorithm doesn't agree with you. The documentation
is very explicit, it's no good for cryptography, and if you study how
it works you can see that it's easily distinguishable from random.
I don't think it would be difficult for a mathematician to prove or
disprove the hypothesis.

It's true of a genuine random keystream, but that's not what we're
talking about. We're talking about the output of python's random
module, which is not of cryptographic quality. It's fine for
statistical simulations in that it doesn't have correlations that are
likely to accidentally cause trouble. It's no good for defeating
adversaries who are looking for correlations on purpose. Lots of
people don't understand the difference. Please see the book "Security
Engineering" by Ross Anderson to get an idea of what you're up against.
I agree. I wouldn't know how to design a random generator. Fortunately I
don't need to. I can use ready made ones.

There are good ready ones available, but the one you're proposing to
use is not one of them and was not designed to be.
Try out the following function. You need the Image package.

That doesn't prove a thing.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,190
Members
46,736
Latest member
zacharyharris

Latest Threads

Top