Voting Project Needs Python People

Alan Dechert

We've decided to go with Python for the demo of the voting machine I
described on this forum yesterday. Thanks to all of you for your feedback.

We have an excellent team together. We're looking for a few good Python
coders willing to volunteer for this free open source project. It will be
on sourceforge.net later today.

I can't offer you any money right now but I think it will be a good
opportunity for Python users and the Python community in general. It's
likely to be fairly high-profile and will gain significant exposure for
Python.

Here are some of the people involved:

University of California, Santa Cruz computer science graduate Adrianne Yu
Wang will be the project administrator and lead the project.

http://home.earthlink.net/~adechert/adrianne_resume.doc

She will get help from Jack Walther, a UCSC graduate student and big Python
fan.

http://www.cse.ucsc.edu/~jdw/

They will be advised by computer science professors Doug Jones (U of Iowa)
and Arthur Keller (UCSC).

http://www.cs.uiowa.edu/~jones/

http://www.soe.ucsc.edu/~ark/

Ed Cherlin will also act in a mainly advisory role but will assist with the
design of the project and documentation.

http://home.earthlink.net/~adechert/cherlin_resume.doc

We anticipate that a successful demo will help expedite funding for the
overall project, which is aimed at implementing a uniform, transparent voting
system. We have very prominent academics interested in the project from
many of the top universities around the country: political scientists,
lawyers, economists, computer scientists, and psychologists.

Stanford computer scientist David Dill has been helpful (he referred Ed
Cherlin and Prof Keller to me, among others). Professor Dill has gotten
involved in voting modernization issues in a big way.

http://www.verifiedvoting.org/index.asp

Henry Brady was my co-author for the original UC Berkeley proposal for
California.

http://home.earthlink.net/~adechert/src_proposal.html

Professor Brady is widely known for his papers and books on voting systems.
The main author of the Caltech-MIT reports from their voting project, Steve
Ansolabehere, was one of Henry's students when Henry taught at Harvard.
Henry has two Ph.D.s from MIT.

http://www.apsanet.org/new/2003chairs.cfm

We are pulling together a great voting modernization projects. It's an
opportunity to get in on it at an early stage. It should be rewarding for
you, your community, and democracy.

Please contact me if you want to join us.

-- Alan Dechert 916-791-0456
(e-mail address removed)
 
Peter Hansen

Alan said:
We are pulling together a great voting modernization projects. [snip]

Is there a word missing in the above? I can't parse it as-is, but it
looks like it wants the word "many" after the word "great"...

-Peter
 
Alan Dechert

Peter Hansen said:
Alan said:
We are pulling together a great voting modernization projects. [snip]

Is there a word missing in the above? I can't parse it as-is, but it
looks like it wants the word "many" after the word "great"...
Editing error. I started to say, "the mother of all voting modernization
projects."

You get the idea.

Alan Dechert
 
Harry George

Please post the sourceforge link when it is set up.

I was curious why http://www.notablesoftware.com/evote.html wasn't
mentioned. I had the impression that was an early (and continuing)
portal for these issues.

Is the intent to do opensource all the way to commodity chips? Else a
proprietary BIOS could be the weak link.
 
Andrew Dalke

Harry George:
Is the intent to do opensource all the way to commodity chips? Else a
proprietary BIOS could be the weak link.

In which way? The resulting code should be runnable on a wide range
of platforms, so there isn't a single source issue, and the actual vote is
on voter verifiable paper, so corruption of the BIOS won't be able to
affect anything other than the number of "something's wrong - this isn't
what I voted for" complaints.

Perhaps you were thinking of a pure electronic version?

Andrew
(e-mail address removed)
 
Peter Hansen

Alan said:
Peter Hansen said:
Alan said:
We are pulling together a great voting modernization projects. [snip]

Is there a word missing in the above? I can't parse it as-is, but it
looks like it wants the word "many" after the word "great"...
Editing error. I started to say, "the mother of all voting modernization
projects."

You get the idea.

I do now. Thanks. "The mother of all" has a distinctly different
character than "pulling together a great many". One hopes there will
not also be dozens of other such projects going on in parallel, wasting
resources.

-Peter
 
Harry George

Andrew Dalke said:
Harry George:

In which way? The resulting code should be runnable on a wide range
of platforms, so there isn't a single source issue, and the actual vote is
on voter verifiable paper, so corruption of the BIOS won't be able to
affect anything other than the number of "something's wrong - this isn't
what I voted for" complaints.

Perhaps you were thinking of a pure electronic version?

This is getting a bit off-topic, but is relevant to
Python-as-open-source-scripting-tool.

Yes, paper audit trail is essential. But I'm pretty sure that
conflicts between paper and electronic will result in court cases,
with significant chunks of ballots in limbo or thrown out on one
pretext or another. By choosing which precincts are thrown in limbo,
you can impact the overall results.

Here is a possible scenario:

1. Software chooses 1% of votes to change (big enough to have an
effect, small enough to maybe go unnoticed).

2. Paper is correct. Visual monitor is correct. Electronic storage
is changed. Voter leaves happy.

3. Results are posted based on electronic storage.

4. Only if enough people suspect trouble do we go to the paper trail.
At 1%, that may not happen. Yet a 2% swing is pretty big in many
settings.
 
Alan Dechert

Peter Hansen said:
Alan said:
Peter Hansen said:
Alan Dechert wrote:

We are pulling together a great voting modernization projects. [snip]

Is there a word missing in the above? I can't parse it as-is, but it
looks like it wants the word "many" after the word "great"...
Editing error. I started to say, "the mother of all voting modernization
projects."

You get the idea.

I do now. Thanks. "The mother of all" has a distinctly different
character than "pulling together a great many". One hopes there will
not also be dozens of other such projects going on in parallel, wasting
resources.
There are various public and private voting machine development efforts
going on, but nothing like ours. I'm the guy when it comes to a PC-based open
source voting machine with a printer. The reason I'm still at it is that
when people get the idea they'd like to work on such a thing and start
making inquiries, they wind up getting referred to me. When it comes to the
top experts in this area, there really aren't very many -- and we tend to
know each other. Some of the very top voting technology experts (e.g., Roy
Saltman and Doug Jones) are closely associated with this project. Several
key people on our project were referrals from Stanford computer scientist
David Dill. http://www.verifiedvoting.org/index.asp

Alan Dechert
 
Alan Dechert

Harry George said:
Here is a possible scenario:

1. Software chooses 1% of votes to change (big enough to have an
effect, small enough to maybe go unnoticed).
I don't think this is a possible scenario. However, it brings up an
interesting test for our full blown study (keep in mind, we're trying to
focus on getting the demo done even though people want to jump ahead to
speculate on every possible detail).

The question you raise here has to do with the chance of x% votes being
recorded differently on paper without y number of voters noticing. I think
it would also depend on the contest that was changed. That is, it's
probably more likely that the voter would remember his or her choice for
president than, say, county supervisor. This also points out some
difficulty with trying to test this with a mock election. The voter may not
take the selections seriously enough to care about remembering exactly how
they voted.

But taking your number, if you changed 1% of the votes in CA, you're talking
about on the order of 100,000 ballots (assuming you're talking about
changing only one contest... if you're talking about all the votes cast on
all races, then that's going to mean many times more ballots altered --
probably over one million). I think it's likely that some sizable fraction
*would* notice (again, depending somewhat on the contest). Say it's only
one tenth. That's 10,000 voters that will be complaining that the computer
changed the vote. That many complaints would set off fire alarms big time.
2. Paper is correct. Visual monitor is correct. Electronic storage
is changed. Voter leaves happy.
Okay, but now the electronic record doesn't match. With our system, the
paper is the authentic vote. There is no crisis because the paper ballots
are available for recount.
3. Results are posted based on electronic storage.
But it will be caught. There will be checks in place (in CA we already hand
verify 1% of the ballots at random after the election). And with
standardized laser printed output, automated scanning should be much faster
and more accurate than scanning hand marked ballots. Depending on how
elections are administered with this new equipment, it might be possible for
initial results to be posted incorrectly, but virtually impossible that the
tally would stand unchallenged. A guard against initial bad results would
entail some sampling before the results are announced. We can use some
statistics -- cumulative binomial distribution -- to get a pretty good
confidence level of correctness with a small sample.
4. Only if enough people suspect trouble do we go to the paper trail.
At 1%, that may not happen. Yet a 2% swing is pretty big in many
settings.
I don't think your argument is very substantial, but certainly these are
some issues a large-scale study of new voting technology should investigate.

Still, this has nothing to do with the demo I am talking about.

Alan Dechert
 
Paul Rubin

Alan Dechert said:
I don't think this is a possible scenario. However, it brings up an
interesting test for our full blown study (keep in mind, we're trying to
focus on getting the demo done even though people want to jump ahead to
speculate on every possible detail).

But something like that seems to have happened in Escambia County,
Florida, in 2000. Out of 21,500 absentee ballots cast, 296 (1.5% of
the total) were overvotes with three or more presidential candidates
checked. ZERO were overvotes with exactly two candidates checked.
Ballot tampering after the ballots were received is the most plausible
explanation. You said that in your system the paper ballots are
supposed to take priority over the electronic count if there is a
dispute (that's the whole point of having the paper ballots). So it
doesn't matter if the paper and electronic results don't match, and
the tampering doesn't have to happen while the voter can still see the
ballot.

Reference:

http://www.failureisimpossible.com/essays/escambia.htm

Note: Paul Lukasiak, the main author of that article, did some of the
most thorough analysis of the Florida debacle that I've seen. I hope
you will read a lot of his stuff in designing your real system, so
you'll be able to say how your system deals with the problems that he
identified in Florida.
 
Alan Dechert

Paul Rubin said:
But something like that seems to have happened in Escambia County,
Florida, in 2000. Out of 21,500 absentee ballots cast, 296 (1.5% of
the total) were overvotes with three or more presidential candidates
checked. ZERO were overvotes with exactly two candidates checked.
Ballot tampering after the ballots were received is the most plausible
explanation.
But that's a different scenario. As you described it, the voter never had a
chance to see the alteration. The scenario Harry described is where the
voter has the altered ballot in hand but doesn't notice.
You said that in your system the paper ballots are
supposed to take priority over the electronic count if there is a
dispute (that's the whole point of having the paper ballots). So it
doesn't matter if the paper and electronic results don't match, and
the tampering doesn't have to happen while the voter can still see the
ballot.
I don't see much of a point here. It will be very hard -- if not
impossible -- to tamper with the printout in a manner that would go
undetected. First of all, overvotes will not be possible at all. I can't
quite visualize how you figure someone will alter the printout. Take some
whiteout and cover one name and print in a new one? That would look pretty
obvious. Furthermore, the bar code would no longer match the text. In my
scheme, the tamperer would have no way to know how to alter the bar code to
match any alterations in the text.

Post election checks (canvass period) would involve hand checks, and scanner
checks of the bar code and the text. It all has to match.
Reference:

http://www.failureisimpossible.com/essays/escambia.htm

Note: Paul Lukasiak, the main author of that article, did some of the
most thorough analysis of the Florida debacle that I've seen. I hope
you will read a lot of his stuff in designing your real system, so
you'll be able to say how your system deals with the problems that he
identified in Florida.
I read as much as possible and will continue to study all of this. Keep in
mind that some of the people on our team are leading experts in the field.
They know all this stuff inside out. We'll bring in more experts once the
study is funded.

Nobody is saying this issue is simple. Almost everyone that has approached
the voting mess dilemma and tried to figure it out has grossly
underestimated the problem. I have to say I underestimated too but I have
stuck with it long enough and hard enough to get a handle on it. Our
Election Rules Database (the largest component of our proposed study) will
surface inordinate problems -- get them out in the open where we can deal
with them.

Alan Dechert
 
Matt Shomphe

Alan Dechert said:
We have an excellent team together. We're looking for a few good Python
coders willing to volunteer for this free open source project. It will be
on sourceforge.net later today.

Are there any prerequisites for joining the team?
 
Alan Dechert

Matt Shomphe said:
Are there any prerequisites for joining the team?
It would be nice if you know some Python. Do you?

BTW, I'm told it will take a day or two to get the project going on
sourceforge.net.

Alan Dechert
 
Harry George

Alan Dechert said:
But that's a different scenario. As you described it, the voter never had a
chance to see the alteration. The scenario Harry described is where the
voter has the altered ballot in hand but doesn't notice.

No, I said the paper and the CRT or LCD were correct. It was just the
electronic storage that was altered.
 
Alan Dechert

Harry George said:
No, I said the paper and the CRT or LCD were correct. It was just the
electronic storage that was altered.
Okay, right. My mistake. I read #1 as one scenario and #2 as a different
scenario.

Nonetheless, I maintain that such a game would be caught easily. Checking a
very small sample before announcing the preliminary (electronic) count would
catch it. If one percent of the electronic votes were altered (evenly
distributed), you will find one mismatch for sure after checking only a
dozen or two ballots. If the one percent is not evenly distributed, it will
show up as very suspicious results in the areas where the alterations were
concentrated. So, before announcing the electronic count, the result should
be given a common sense review. For example, if Orange County in CA shows
75% for the Democrat you know something's wrong (or San Francisco shows 75%
for the Republican). Then samples should be checked in various locations
with any unexpected looking results getting special attention.

If you find one mismatch while sampling, you know there's a problem and the
tally would be delayed until all the paper ballots have been scanned. Then
you do some manual checking of the results from scanning.

Then, you figure out how the rigging was carried out. If the machines were
rigged all over, this would imply a very large conspiracy -- a very large
risk for a large number of people doing something that will be caught with
absolute certainty (thus very unlikely to happen).

Alan Dechert
 
Alan Dechert

Andrew Dalke said:
Alan Dechert:

Actually, around 70 ballots.

The odds of finding one ballot to be wrong is 1%, so there's a 99%
chance that it's unaltered. There's a 0.99*0.99 chance that two are
unaltered, and in general a 0.99**n chance that n are unaltered.

To get even odds of noticing an error requires

0.99**n == 0.5
--> log(0.99)*n == log(0.5)
--> n == log(0.5) / log(0.99)

about 70 verifications.
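
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes each checked ballot is an independent draw with a fixed 1% chance of being altered, and the function names are made up for illustration.

```python
import math


def chance_all_unaltered(n, altered_fraction=0.01):
    """Probability that n independently checked ballots are all unaltered."""
    return (1.0 - altered_fraction) ** n


def checks_for_detection(target, altered_fraction=0.01):
    """Smallest n whose chance of hitting at least one altered ballot
    reaches `target` (0.5 means even odds)."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - altered_fraction))


if __name__ == "__main__":
    for n in (12, 24, 70):
        print(n, "checks -> chance of a hit:", 1.0 - chance_all_unaltered(n))
    print("even odds need", checks_for_detection(0.5), "checks")
```

Under these assumptions a dozen or two checks catch a 1% alteration only about 11% to 21% of the time, which is why the figure lands near 70 rather than 12 to 24.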
A likely story. I actually took combinatorics and a class in statistics in
college. But that was a long time ago. Since then, many brain cells have
died, tragically.

BTW, do you know about cumulative binomial distribution? I think we need to
include a tool like this to give us some "C.L" (confidence level) short of
verifying that every single paper ballot matches its electronic counterpart.

http://www.reliasoft.com/newsletter/2q2001/cumulative_binomial.htm

What I really want is a calculator. Do you know of a free calculator for
this? If not, could you make one? (for free, of course).

Alan Dechert
 
Andrew Dalke

Alan Dechert:
BTW, do you know about cumulative binomial distribution? I think we need to
include a tool like this to give us some "C.L" (confidence level) short of
verifying that every single paper ballot matches its electronic counterpart.

http://www.reliasoft.com/newsletter/2q2001/cumulative_binomial.htm

What I really want is a calculator. Do you know of a free calculator for
this? If not, could you make one? (for free, of course).

I don't know anything about it, although looking at it I think I understand
what it's trying to do. Here's the equation for those following along:

1 - C.L. = Sum(k=0..r) C(N, k) * (1-R)**k * R**(N-k)

where C(N, k) = N! / (k! * (N-k)!)

and C.L. is the confidence level,
N is the number of votes tested
r is the maximum allowable number of errors, and
R is the reliability (scanning errors, attempts at voting fraud, etc.)
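
As a rough sketch, the sum above translates almost directly into Python. The function and argument names are hypothetical, math.comb needs Python 3.8 or later, and plain floats are only adequate for modest N and r (for very large values the product can overflow or underflow).

```python
from math import comb


def confidence_level(n_tested, max_errors, reliability):
    """C.L. = 1 - Sum(k=0..r) C(N, k) * (1-R)**k * R**(N-k).

    n_tested    -- N, the number of votes tested
    max_errors  -- r, the maximum allowable number of errors
    reliability -- R, the chance that a single tested vote is good
    """
    tail = sum(
        comb(n_tested, k)
        * (1.0 - reliability) ** k
        * reliability ** (n_tested - k)
        for k in range(max_errors + 1)
    )
    return 1.0 - tail


if __name__ == "__main__":
    # Zero allowable errors, 99% reliability, 70 votes tested.
    print(confidence_level(70, 0, 0.99))
```

With r = 0 the sum collapses to R**N, so the example prints a value a little over 0.5, in line with the "about 70" estimate earlier in the thread.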

The problem is, I don't see how to use it. Here's the case I
had in mind. Suppose you have 100,000 votes and want to
be 90% certain that there is no more than 1% error. This means
you'll need to retest N ballots (the ballots "under test"). Then
the variables are:

C.L. = 0.90
N = to be determined
r = N * 0.01
R = ???

and I just don't know what to use for R. Let's say the error rate is <0.1%,
so R = 0.999. Then

1 - 0.9 = Sum(k=0..N*0.01) C(N, k) * (1-0.999)**k * (0.999)**(N-k)

and solve for N.

I wouldn't want to write a calculator for it without the verification of
someone who knows it better than I do. The only way I know to
solve this is iteratively, and given the large and small numbers involved,
I would worry about numerical problems like overflow and underflow.
(OTOH, for those cases, approximations like the Stirling expansion
for factorials come into play. But my math for this is entirely too
rusty.)
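
One way around the overflow worry is to work with logarithms. The sketch below is not a verified calculator: it assumes a known per-ballot error rate (1 - R), a fixed number of allowable errors rather than r = N * 0.01, and a plain linear search. It uses math.lgamma instead of a Stirling expansion, and all names are illustrative.

```python
from math import exp, lgamma, log


def log_binom_term(n, k, error_rate):
    """log of C(n, k) * p**k * (1-p)**(n-k), using lgamma for the factorials."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_comb + k * log(error_rate) + (n - k) * log(1.0 - error_rate)


def confidence_level(n, max_errors, error_rate):
    """1 minus the chance of seeing max_errors or fewer errors in n checks."""
    tail = sum(exp(log_binom_term(n, k, error_rate)) for k in range(max_errors + 1))
    return 1.0 - tail


def smallest_sample(target_cl, error_rate, max_errors=0, limit=1_000_000):
    """Iterate upward until the confidence level first reaches target_cl."""
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    for n in range(max_errors + 1, limit + 1):
        if confidence_level(n, max_errors, error_rate) >= target_cl:
            return n
    raise ValueError("no sample size up to limit reaches the target")


if __name__ == "__main__":
    print(smallest_sample(0.90, 0.01))      # zero mismatches tolerated
    print(smallest_sample(0.90, 0.01, 1))   # one mismatch tolerated
```

Each term stays a moderate-sized log until the final exp, so large N with small probabilities underflows quietly to 0.0 instead of raising an error.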

So no, I can't help you further than this.

BTW, I don't think you need to distribute a calculator for this. I think
you just need some tables which give values for a few select points,
as in:

# ballots | 75% certain     | 90% | 98% || 90% | 98%
          | of no more than | ..  |     ||     |
          | 1% error rate   | 1%  | 1%  || 5%  | 5%
------------------------------------------------------
100       | values ....
200       | values ....
....


Andrew
(e-mail address removed)
 
Paul Rubin

Alan Dechert said:
BTW, do you know about cumulative binomial distribution? I think we need to
include a tool like this to give us some "C.L" (confidence level) short of
verifying that every single paper ballot matches its electronic counterpart.

I'm skeptical of that approach. It assumes independent events. You
should probably talk to a real statistician. This is the kind of
stuff they do all the time.
 
Alan Dechert

BTW, I don't think you need to distribute a calculator for this. I think
you just need some tables which give values for a few select points,
as in:
I still want a calculator. Here's what Bill Buck ( (e-mail address removed) ) sent
me. It turns out there is a BINOMDIST function in Excel. I think it might
be what I want.

http://home.earthlink.net/~adechert/LOTAcceptanceCalculator.xls

Let me try to clarify what I'm after. The paper record should always match
the electronic record. So the number of allowable defects is zero. If there is a
mismatch found in the sample, we don't publish the electronic tally: we take
all the paper ballots and check them.

We are talking about an election conducted with computer-generated paper
ballots. The paper ballots represent the actual vote since these ballots
are what voters actually saw, verified, and cast. We will have an
electronic record obtained from the computers which should match each paper
ballot generated. We want to use the electronic record since it will give
us an instant result -- but we have to check it against the paper ballots to
be sure the election result is correct. So, in this scenario, the
electronic count is a prediction (or preliminary tally).

So, if by the preliminary electronic tally a candidate won a race by 1
percent, I want to know how many ballots we have to check (random sample) to
be certain that the result predicted is true.

When I put one million into this Confidence Level Calculator, and Acceptable
Quality Level of .01, a sample of 10,000 shows a confidence level of "1." A
sample of 2000 gives a C.L. of 0.999999998. Presumably, "1" is really
0.9999999999+ more 9s. Can we get more decimal places?
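
On the decimal places: Python's decimal module can carry as many digits as needed. The sketch below assumes the spreadsheet's zero-defect case reduces to 1 - (1 - AQL)**n, which reproduces the 0.999999998 figure for a sample of 2000 but has not been checked against the spreadsheet itself. The function name is hypothetical.

```python
from decimal import Decimal, getcontext

# Carry 60 significant digits so the run of 9s is not rounded up to 1.
getcontext().prec = 60


def confidence_zero_defects(sample_size, defect_fraction="0.01"):
    """Confidence that at least one defect shows up in the sample,
    computed as 1 - (1 - p)**n with arbitrary-precision decimals."""
    p = Decimal(defect_fraction)
    return 1 - (1 - p) ** sample_size


if __name__ == "__main__":
    for n in (2000, 10000):
        print(n, confidence_zero_defects(n))
```

Raising getcontext().prec further gives more digits if 60 is not enough for a larger sample.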

So I guess the Lot Fraction Defective is analogous to the predicted victory
margin. Is that right?

I would still like a standalone calculator that doesn't require Excel.

Alan Dechert
 
Alan Dechert

There's a lot of assumed knowledge in this discussion. I know nothing
about electronic voting systems, I know almost nothing about Python as
a programming language but I do know how to calculate values
associated with the cumulative binomial distribution. A Javascript
(sorry!) calculator for the binomial distribution amongst others can
be found at http://members.aol.com/iandjmsmith/EXAMPLES.HTM.
Javascript can be converted to Java, C etc quite easily. I would
imagine with my lack of knowledge of Python you'll probably be better
off doing the conversion yourself.

The code is quick and accurate and can handle large sample sizes and
very small probabilities which you probably need. There's even a
calculator for the Hypergeometric distribution should you wish to do
sampling without replacement calculations.


Ian Smith

P.S. the calculation of 70 is fine. If you don't want to calculate
anything much more difficult then you probably don't need a calculator
anyway.
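
Since a recount sample is drawn without replacement from a finite pile of ballots, the hypergeometric version mentioned above may be the closer model. A minimal sketch, assuming the altered ballots are scattered at random through the pile; the names are illustrative only and this is not Ian Smith's code.

```python
def miss_probability(total_ballots, altered, sample_size):
    """Chance that a sample drawn without replacement contains no altered ballot.

    Equals C(total - altered, n) / C(total, n), evaluated as a running
    product so that no huge factorials are ever formed.
    """
    if sample_size > total_ballots - altered:
        return 0.0  # the sample is too big to avoid every altered ballot
    prob = 1.0
    for i in range(sample_size):
        prob *= (total_ballots - altered - i) / (total_ballots - i)
    return prob


if __name__ == "__main__":
    # One million ballots, 1% altered, 2000 checked.
    print(1.0 - miss_probability(1_000_000, 10_000, 2000))
```

When the sample is a small fraction of the total, the result should come out close to the with-replacement figure 1 - 0.99**n.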
Thanks, Ian. I could not quite figure out if your binomial distribution
calculator would be applicable.

Here's what Bill Buck ( (e-mail address removed) ) sent me. It turns out there is a
BINOMDIST function in Excel. I think it might be what I want.

http://home.earthlink.net/~adechert/LOTAcceptanceCalculator.xls

Let me try to clarify what I'm after. The paper record should always match
the electronic record. So the number of allowable defects is zero. If there is a
mismatch found in the sample, we don't publish the electronic tally: we take
all the paper ballots and check them.

We are talking about an election conducted with computer-generated paper
ballots. The paper ballots represent the actual vote since these ballots
are what voters actually saw, verified, and cast. We will have an
electronic record obtained from the computers which should match each paper
ballot generated. We want to use the electronic record since it will give
us an instant result -- but we have to check it against the paper ballots to
be sure the election result is correct. So, in this scenario, the
electronic count is a prediction (or preliminary tally).

So, if by the preliminary electronic tally a candidate won a race by 1
percent, I want to know how many ballots we have to check (random sample) to
be certain that the result predicted is true.

When I put one million into this Confidence Level Calculator, and Acceptable
Quality Level of .01, a sample of 10,000 shows a confidence level of "1." A
sample of 2000 gives a C.L. of 0.999999998. Presumably, "1" is really
0.9999999999+ more 9s. Can we get more decimal places?

So I guess the Lot Fraction Defective is analogous to the predicted victory
margin. Is that right?

I would still like a standalone calculator that doesn't require Excel.

Alan Dechert
 
