encryption with python

Kirk Job Sluder

Paul Rubin said:
We're told there is already a secure database in the picture
somewhere, or at least one that inescapably contains cleartext SSN's,
so that's the system that should assign the ID numbers and handle
SSN-based queries.

Well, IMO just having cleartext SSNs is questionable practice unless you
need those SSNs to report to some other agency that takes SSNs. And
even so, you might want to limit access to plaintext SSNs to a limited
group, and give access to the hashed SSNs as a search key to a different
group.

A voice exemplar stored at enrollment time plus a question or two like
"what classes did you take last term" could easily give a pretty good
clue that the person saying the words/phrases is the legitimate
student.

In my experience the typical student has trouble remembering what
happened last week, much less last term. In addition, universities
frequently need to field questions from people who were students years
ago.

Are voice exemplars at that stage yet?

Customers legitimately want actual security without having to care how
hash functions work, just like they want safe transportation without
having to care about how jet engine turbopumps work. Air travel is
pretty safe because if the airline fails to maintain the turbopumps
and a plane goes down, there is hell to pay. There is huge legal and
financial incentive for travel vendors (airlines) to not cut corners
with airplane safety. But vendors who deploy incompetently designed
IT systems full of confidential data resulting in massive privacy
breaches face no liability at all.

I'm more than happy to agree to disagree on this, but I see it
differently. In aviation there certainly is a bit of risk-benefit
analysis going on in thinking about whether the cost of a given safety
is justified given the benefits in risk reduction.

Likewise, credit companies are currently making money hand-over-fist.
If an identity is compromised, it's cheaper for them to just close the
account, refund the money, and do their own fraud investigation after
the fact. Meanwhile, for every person who gets stung, there are a
hundred wanting convenience. In addition, the losses due to bad
cryptographic implementation appear to be trivial compared to the losses
due to social engineering.
 
Robert Kern

Kirk said:
Well, IMO just having cleartext SSNs is questionable practice unless you
need those SSNs to report to some other agency that takes SSNs.

Colleges generally do have such needs.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Ron Adam

Kirk said:
The only way to keep confidential stuff secure is to shred it, burn it,
and grind the ashes.

I think the fundamental problem is that most customers don't want
actual security. They want to be able to get their information by
calling a phone number and saying a few words/phrases they memorized in
childhood. Given the current market, it seems to be cheaper to deal
with breaks after the fact than to expect more from customers.

Security = Privacy in this context, and most customers do want privacy.

But also in this case, you are referring to two party security
situations, where the data is shared between a service provider and a
service consumer.

I would think that any n digit random number not already in the data
base would work for an id along with a randomly generated password that
the student can change if they want. The service provider has full
access to the data with their own set of id's and passwords, so in the
case of a lost id, they can just look it up using the customer's name
and/or ssn, or whatever they decide is appropriate. In the case of a
lost password, they can reset it and get another randomly generated
password.
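A minimal sketch of the random ID plus random password idea, assuming Python 3 and its standard `secrets` module. The names `new_student_id` and `new_password`, the 8-digit default, and the in-memory `existing_ids` set are hypothetical stand-ins for whatever the real database would provide.

```python
import secrets
import string

def new_student_id(existing_ids, digits=8):
    """Return a random ID with exactly `digits` digits that is not already in use."""
    low = 10 ** (digits - 1)
    while True:
        # low + randbelow(9 * low) is uniform over all numbers with `digits` digits
        candidate = low + secrets.randbelow(9 * low)
        if candidate not in existing_ids:
            existing_ids.add(candidate)  # assumption: the set stands in for the database
            return candidate

def new_password(length=12):
    """Return a random alphanumeric password the student can change later."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Because the ID is not derived from the SSN or birthdate, nothing about the student can be recovered from it; the cost is that the provider has to keep the lookup table.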

Or am I missing something?

Cheers,
Ron
 
James Stroud

Providing any kind of access to data involves creating a security hole.
This is the biggest flaw in most discussions of computer security.

There are "one-way" encryption functions where the result can't easily be
traced back to the input, but why do you need the input anyway? Here is my
quick-and-dirty student ID algorithm:

I have invented the perfect security protocol that solves a major problem with
the one-time-pad. The problem with most one-time-pad protocols is that you
still need to have the pad around, creating a major security hole. I have
solved that problem here. It has all of the steps of the usual one-time-pad
plus an extra step.

1. Generate a random number the size of your data.
2. XOR your data with it.
3. Destroy the original data.

Here is the additional step:

4. Destroy the random number.

You can see now that no adversary can reasonably reconstruct the plain text.
This protocol might be terribly inconvenient, though, because it makes the
original data inaccessible. Oh well, just a necessary byproduct of
theoretically perfect security.
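For reference, steps 1 and 2 are the ordinary one-time pad, which can be sketched in Python 3 as below. The function names `otp_encrypt` and `otp_decrypt` are hypothetical; the sketch only illustrates the XOR step and is not a recommendation for storing real data.

```python
import secrets

def otp_encrypt(data: bytes):
    """XOR data with a fresh random pad of the same length."""
    pad = secrets.token_bytes(len(data))
    ciphertext = bytes(d ^ p for d, p in zip(data, pad))
    return ciphertext, pad

def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """XOR with the same pad recovers the original data."""
    return bytes(c ^ p for c, p in zip(ciphertext, pad))
```

Step 4 amounts to throwing `pad` away, after which `otp_decrypt` has nothing to work with and the ciphertext is indistinguishable from random bytes.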

I hereby place this algorithm in the public domain. Use it freely.

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
Paul Rubin

Kirk Job Sluder said:
I'm more than happy to agree to disagree on this, but I see it
differently. In aviation there certainly is a bit of risk-benefit
analysis going on in thinking about whether the cost of a given safety
is justified given the benefits in risk reduction.

Likewise, credit companies are currently making money hand-over-fist.
If an identity is compromised, it's cheaper for them to just close the
account, refund the money, and do their own fraud investigation after
the fact.

You don't get it. Refunding the money improperly charged on a single
card doesn't begin to compensate for the hassle of undoing an identity
theft. If airlines worked the way you're suggesting the credit
industry should work, and a plane went down, the airline would be off
the hook by refunding your estate the price of your ticket. It's only
because they face much further-reaching liability than that, that they
pay so much attention to safety.
 
Marc 'BlackJack' Rintsch

Steven said:
last_number_used = 12345
usable_IDs = []

def make_studentID():
    global last_number_used
    global usable_IDs
    if not usable_IDs:
        # generate another batch of IDs in random order
        usable_IDs = range(last_number_used, last_number_used + 1000)
-       usable_IDs.sort(random.random())
+       random.shuffle(usable_IDs)
        last_number_used += 1000
    return usable_IDs.pop()
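
With the correction applied, a version that runs on Python 3 might look like the sketch below. Two things are assumptions beyond the posted code: the `import random` line, which the snippet needs but does not show, and the `list(...)` wrapper, needed because `range` no longer returns a list in Python 3. The names are lightly renamed for illustration.

```python
import random

last_number_used = 12345
usable_ids = []

def make_student_id():
    """Hand out IDs from shuffled batches of 1000 consecutive numbers."""
    global last_number_used, usable_ids
    if not usable_ids:
        # generate another batch of IDs in random order
        usable_ids = list(range(last_number_used, last_number_used + 1000))
        random.shuffle(usable_ids)
        last_number_used += 1000
    return usable_ids.pop()
```

The `random` module is not a cryptographic generator, so this only hides the order in which IDs were issued; the IDs themselves carry no personal information either way.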

Ciao,
Marc 'BlackJack' Rintsch
 
James Stroud

Ron Adam wrote:
I would think that any n digit random number not already in the data
base would work for an id along with a randomly generated password that
the student can change if they want. The service provider has full
access to the data with their own set of id's and passwords, so in the
case of a lost id, they can just look it up using the customer's name
and/or ssn, or whatever they decide is appropriate. In the case of a
lost password, they can reset it and get another randomly generated
password.

Or am I missing something?

Yes and no. Yes, you are theoretically correct. No, I don't think you have the
OP's original needs in mind (though I am mostly guessing here). The OP was
obviously a TA who needed to assign students a number so that they could
"anonymously" check their publicly posted grades and also so that he could do
some internal record keeping.


But, I'm thinking no one remembers college here anymore.

When I was in college (and when I TA'd) security was kind of flimsy. TAs kept
all records of SS#s, etc. (etc. includes birthdays here) in a gradebook (or
the rich ones kept them on a 5 1/4" floppy). Grades were reported publicly by
full SS#s, usually on a centralized cork-board. That was back in the
good-ole-days, before financial fraud was euphemised to "identity theft".

When I TA'd several years later, grades were reported by the last n digits of
the SS#. Some very security conscious TAs--or was it just me? I think it was
just me--solicited pass phrases from each student and grades were reported
based on the student generated pass phrase--and not on SS# or the like. These
phrases usually came in the form of "Buffs1" or "Kitty1979" (the latter
possibly revealing some information about a birthday, perhaps?). Some
students didn't submit pass phrases, for whatever reason. I think I did the
less convenient of the two most reasonable options, which was to withhold
reporting the grade to the student until they gave me a phrase. The other
option was to use a default pass phrase of the last n digits of the SS#.

The idea of combining ID information and encrypting it to create another ID is
a quantum leap beyond the primitive "last n digits of the SS#". Does it beat,
in theoretical terms, assigning random numbers? No. And it certainly doesn't
beat, in theoretical terms, my improved one-time-pad protocol (see my
previous email). I challenge even the most capable cryptographer to beat my
improved one-time-pad protocol for security (Oh wait, here it is: 1. Destroy
Data.) But it is convenient, especially if you discard the original
identifying information and store just the hashes. And as far as collisions
go, even if a class of 10,000 gives a 1% chance of collision, who is going to
TA a class of 10,000 students? If you can promise that kind of enrolment for
any department, much less any single class, there is a job in an Economics
department waiting for you out there, my friend.
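
The 1% figure can be checked with the usual birthday-bound approximation. The sketch below assumes IDs are spread uniformly over a 32-bit space (eight hex digits of a hash, which is an assumption about the scheme under discussion); the function name `collision_probability` is hypothetical.

```python
import math

def collision_probability(n_students, id_space):
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2N))."""
    return 1.0 - math.exp(-n_students * (n_students - 1) / (2.0 * id_space))

# Roughly 1.2% for 10,000 students sharing a 32-bit ID space
p_large = collision_probability(10_000, 2 ** 32)

# About one in a hundred thousand for a more typical class of 300
p_small = collision_probability(300, 2 ** 32)
```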

So what would be the alternative to ID information generated IDs? Have a 3xDES
encrypted database with the SS# and birthday stored as plain-text? Better
keep the encryption protocol secret! Oops. Screwed up already. I figured out
the encryption protocol: Encrypt database with 3xDES using a secret key.
Dang, security through obscurity. All they have to do is to get that secret
key and all those records are easily readable.

The point is that *something has to be kept secret* for encryption security to
work. Theoretically best would be a passphrase, or a passphrase to a really
big key. So, perhaps we could modify the algorithm from a few messages back,
in order to address the (assumed) *practical* considerations of the OP's
original query:

import sha

def encrypt(x, y, password):
    def _dosha(v):
        return sha.new(str(v) + str(password)).hexdigest()
    return int(_dosha(_dosha(x) + _dosha(y))[5:13], 16)

So now what is the criticism? That it's still a "secret algorithm" because the
password is "secret"?
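
The `sha` module used above was removed in Python 3. A rough present-day equivalent is sketched below; it swaps the nested SHA-1 calls for HMAC-SHA-256 from the standard `hmac` and `hashlib` modules, so it is a different construction and will not reproduce the IDs of the function above. The name `student_id` and the `|` separator are hypothetical choices.

```python
import hashlib
import hmac

def student_id(ssn, birthdate, password):
    """Derive a 32-bit ID from two identifying fields and a secret password."""
    # Join the fields with a separator so ("12", "3") and ("1", "23") differ
    message = f"{ssn}|{birthdate}".encode("utf-8")
    digest = hmac.new(password.encode("utf-8"), message, hashlib.sha256).hexdigest()
    # Keep eight hex digits, as the original does, giving a 32-bit number
    return int(digest[:8], 16)
```

As with the original, the result is only as secret as the password, and truncating to 32 bits keeps the small collision chance discussed earlier.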

James


--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
Paul Rubin

James Stroud said:
Yes and no. Yes, you are theoretically correct. No, I don't think
you have the OP's original needs in mind (though I am mostly
guessing here). The OP was obviously a TA who needed to assign
students a number so that they could "anonymously" check their
publicly posted grades and also so that he could do some internal
record keeping.

If that's all it's about, it's not a big deal. If it's for some central
administrative database that's more of a target, more care is warranted.

The idea of combining ID information and encrypting it to create

The info to be combined was the student's birthdate. Why would the TA
have access to either that or the SSN?

import sha

def encrypt(x, y, password):
    def _dosha(v):
        return sha.new(str(v) + str(password)).hexdigest()
    return int(_dosha(_dosha(x) + _dosha(y))[5:13], 16)

So now what is the criticism? That it's still a "secret algorithm"
because the password is "secret"?

That's sort of reasonable as long as the password really is secret and
you don't mind a small chance of two students getting the same ID
number once in a while. If the password is something that a TA types
into a laptop when entering grades and which goes away after the
course ends, it's not such a big deal. If it's a long-term key that
has to stay resident in a 24/7 server through the students' entire
time at the university and beyond, then the algorithm is the trivial
part and keeping the key secret is a specialized problem in its own
right. For example, financial institutions use special, tamper
resistant hardware modules for the purpose.

Could the OP please say what the exact application is? That might get
more useful responses if the question still matters.
 
James Stroud

The info to be combined was the student's birthdate.  Why would the TA
have access to either that or the SSN?

Speaking as a former TA, we had all that and a little more, if I remember
correctly. The "why" aspect is a little beyond me.

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
Ron Adam

James said:
Yes and no. Yes, you are theoretically correct. No, I don't think you have the
OP's original needs in mind (though I am mostly guessing here). The OP was
obviously a TA who needed to assign students a number so that they could
"anonymously" check their publicly posted grades and also so that he could do
some internal record keeping.

But, I'm thinking no one remembers college here anymore.

Last semester I took, I was able to check my grades by logging into a
web page with my student ID and using a password. The password default
was my SSN, we could change it. In any case students have read only
access and are not able to change anything. Not a big deal and very
little personal information was visible. If any one would have bothered

The point is that *something has to be kept secret* for encryption security to
work. Theoretically best would be a passphrase, or a passphrase to a really
big key. So, perhaps we could modify the algorithm from a few messages back,
in order to address the (assumed) *practical* considerations of the OP's
original query:

The actual database files should not be directly reachable, except by
the appropriate database administrators; everyone else should send and
retrieve information via a server, based on the user's access rights.

Is this a case where each account is encrypted with a different key in
addition to the access rights given to each user?

Cheers,
Ron
 
Kirk Job Sluder

Paul Rubin said:
You don't get it. Refunding the money improperly charged on a single
card doesn't begin to compensate for the hassle of undoing an identity
theft. If airlines worked the way you're suggesting the credit
industry should work, and a plane went down, the airline would be off
the hook by refunding your estate the price of your ticket. It's only
because they face much further-reaching liability than that, that they
pay so much attention to safety.

Oh, I'm not suggesting the credit industry should work that way. I'm
just saying that's the way they will work as long as they can push off
the costs for dealing with problems onto interest rates and other fees.
 
Kirk Job Sluder

Ron Adam said:
I would think that any n digit random number not already in the data
base would work for an id along with a randomly generated password
that the student can change if they want. The service provider has
full access to the data with their own set of id's and passwords, so
in the case of a lost id, they can just look it up using the customer's
name and/or ssn, or whatever they decide is appropriate. In the case
of a lost password, they can reset it and get another randomly
generated password.

Or am I missing something?

Not really. My suggestion is that in many cases, if the data is being
used only as a backup password or authentication token, there is no need
for that data to be stored in plaintext. For example, with the
ubiquitous "mother's maiden name" * there is frequently no need to
actually have "Smith," "Jones," or "Gunderson" in the database.
"bf65d781795bb91ee731d25f9a68a5aeb7172bc7" serves the same purpose.

There are other cases where one-way anonymity is better than a table
linking people to randomly generated userIDs. I'd rather use
cryptographic hashes for research databases than keep a table matching
people to random numbers hanging around. But I'm weird that way.

* I think "mother's maiden name" is a really poor method for backup
authentication because for a fair number of people in the U.S., it
will be identical to their current surname, and for the rest, it's
trivial to discover.
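
Storing only a hash of a backup answer, as suggested above, could be sketched as follows. This assumes Python 3 and uses a salted PBKDF2 hash from the standard library rather than the bare 40-hex-digit hash quoted in the post; the function names, the lower-casing of answers, and the iteration count are all hypothetical choices.

```python
import hashlib
import hmac
import os

def hash_answer(answer, salt=None):
    """Return (salt, digest) for a backup answer; only these are stored."""
    if salt is None:
        salt = os.urandom(16)
    # Normalize so "Gunderson" and " gunderson " compare equal
    normalized = answer.strip().lower().encode("utf-8")
    digest = hashlib.pbkdf2_hmac("sha256", normalized, salt, 100_000)
    return salt, digest

def check_answer(answer, salt, stored_digest):
    """Re-hash the offered answer and compare in constant time."""
    _, digest = hash_answer(answer, salt)
    return hmac.compare_digest(digest, stored_digest)
```

Hashing keeps "Gunderson" out of the database, but it does not make a guessable answer any harder to guess, which is the weakness the footnote describes.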
 
Ron Adam

Kirk said:
Not really. My suggestion is that in many cases, if the data is being
used only as a backup password or authentication token, there is no need
for that data to be stored in plaintext. For example, with the
ubiquitous "mother's maiden name" * there is frequently no need to
actually have "Smith," "Jones," or "Gunderson" in the database.
"bf65d781795bb91ee731d25f9a68a5aeb7172bc7" serves the same purpose.

For that matter, if the encrypted data is used as the key, then there is
no need to store the data, period. Oh... let's see, we'll just store the
password, and give them the data instead. Never mind it's a few thousand
characters or more. ;-) "Oh, and don't lose your account number BTW."

There are other cases where one-way anonymity is better than a table
linking people to randomly generated userIDs. I'd rather use
cryptographic hashes for research databases than keep a table matching
people to random numbers hanging around. But I'm weird that way.

Why would you need a table hanging around?

Most databases today are relational, so they are made up of lots of
linked tables of records and fields. And each user can have access to
some parts without having access to other parts. So couldn't you
create a separate account to access names and id numbers only?

Cheers,
Ron
 
Steven D'Aprano

Thank you to Mike Meyer, Kirk Sluder, and anyone who made constructive
comments and/or corrections to my earlier post about generating student
IDs as random numbers.

Especially thanks to Marc Rintsch who corrected a stupid coding mistake I
made. Serves me right for not testing the code.

Kirk pointed out that there is a good usage case for using a one-way
encryption function to encrypt a Social Security Number to the student ID:
you are prepared to deal with the inevitable, "I lost my
password/student ID, can you still look up my records?"

Whether the usefulness of that use outweighs the risks is not something we
can decide, but I hope the original poster is considering these issues and
not just blindly going for the technical solution.

For example, this is one possible way of dealing with students who have
lost their student ID:

- ask student for their name, d.o.b. and SSN;
- search the database for students whose name, d.o.b. and SSN match;
- if you have more than one match, there is a serious problem;
- otherwise you may consider that the student has proven their own
identity to you sufficiently, so you can safely tell them the student ID.
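
The steps above can be sketched in Python 3 as a plain lookup. The record layout (dicts with `name`, `dob`, `ssn`, and `student_id` keys) and the function name `find_student_id` are hypothetical; a real system would run the equivalent query against its database.

```python
def find_student_id(records, name, dob, ssn):
    """Return the student ID whose name, d.o.b. and SSN all match, or None."""
    matches = [
        r["student_id"]
        for r in records
        if r["name"] == name and r["dob"] == dob and r["ssn"] == ssn
    ]
    if len(matches) > 1:
        # More than one match signals a data problem, not a valid answer
        raise ValueError("more than one record matches; investigate the data")
    return matches[0] if matches else None
```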

No need for a function that calculates the ID from the SSN, with the
associated risk that Black Hats will break the algorithm and use the
student ID to steal students' SSNs.

In effect, this scheme uses the algorithm "look it up in a secure
database" as the one-way function. It is guaranteed to be
mathematically secure, although it is vulnerable to bad guys cracking
into the database.


Thanks also to James Stroud for his amusing extension to the one-time pad
algorithm. If you have a need to be able to reconstruct the data, then
of course you need some sort of cryptographic function that can encrypt
the data and decrypt it. But that begs the question of whether or not you
actually do need to be able to reconstruct the data. The point of my post
was that you may not need to, in which case a random number is as good as
any other ID.

James also protested that passwords are "security through obscurity",
since "All they have to do is to get that secret key and all those
records are easily readable."

Of course this is technically correct, but that's not what security
through obscurity means to folks in the security business. The difference
between security through obscurity and security through a secret key is
profound: if I reverse-engineer your secret algorithm, I can read
every record you have. But if I discover the secret key belonging to one
person, I can only read that person's messages, not anyone else's.

As James says, "The point is that *something has to be kept secret* for
encryption security to work." Absolutely correct. But now think of the
difference between having keys to your door locks, compared to merely
keeping the principle of the door handle secret.
 
