building an online judge to evaluate Python programs

J

Jabba Laci

Hi,

In our school I have an introductory Python course. I have collected a
large list of exercises for the students and I would like them to be
able to test their solutions with an online judge (
http://en.wikipedia.org/wiki/Online_judge ). At the moment I have a
very simple web application that is similar to Project Euler: you
provide the ID of the exercise and the output of the program, and it
tells you if it's correct or not. However, it can only be used with
programs that produce an output (usually a short string or a number).

In the next step I would like to do the following. The user can upload
his/her script, and the system tests it with various inputs and tells
you if it's OK or not (like checkio.org for instance). How to get
started with this?

There are several questions:
* What is someone sends an infinite loop? There should be a time limit.
* What is someone sends a malicious code? The script should be run in a sandbox.

All tips are appreciated.

Thanks,

Laszlo
 
A

Aseem Bansal

However, it can only be used with programs that produce an output

Just interested, what else are you thinking of checking?
 
J

Jabba Laci

Let's take this simple exercise:

"Write a function that receives a list and decides whether the list is
sorted or not."

Here the output of the function is either True or False, so I cannot
test it with my current method.

Laszlo
 
J

John Gordon

In said:
Let's take this simple exercise:
"Write a function that receives a list and decides whether the list is
sorted or not."
Here the output of the function is either True or False, so I cannot
test it with my current method.

Make a master input file and a master output file for each exercise. If
the student program's output matches the master output when run from the
master input, then it is correct.
 
J

John Gordon

In said:
There are several questions:
* What is someone sends an infinite loop? There should be a time limit.

You could run the judge as a background process, and kill it after ten
seconds if it hasn't finished.
* What is someone sends a malicious code? The script should be run in a
sandbox.

You could run the judge from its own account that doesn't have access to
anything else. For extra security, make the judge program itself owned by
a separate account (but readable/executable by the judge account.)

I suppose you'd have to disable mail access from the judge account too.
Not sure how to easily do that.
 
N

Ned Batchelder

I just found Docker ( http://docs.docker.io/en/latest/faq/ ). It seems
sandboxing could be done with this easily.

At edX, I wrote CodeJail (https://github.com/edx/codejail) to use
AppArmor to run Python securely.

For grading Python programs, we use a unit-test like series of
challenges. The student writes problems as functions (or classes), and
we execute them with unit tests (not literally unittest, but a similar
idea). We also tokenize the code to check for simple things like, did
you use a while loop when the requirement was to write a recursive
function. The grading code is not open-source, unfortunately, because
it is part of the MIT courseware.

--Ned.
 
J

Jabba Laci

Hi Ned,

Could you please post here your AppArmor profile for restricted Python scripts?

Thanks,

Laszlo
 
N

Ned Batchelder

Hi Ned,

Could you please post here your AppArmor profile for restricted Python scripts?
Laszlo, the instructions are in the README, including the AppArmor
profile. It isn't much:

#include <tunables/global>

<SANDENV>/bin/python {
#include <abstractions/base>
#include <abstractions/python>

<SANDENV>/** mr,
# If you have code that the sandbox must be able to access, add lines
# pointing to those directories:
/the/path/to/your/sandbox-packages/** r,

/tmp/codejail-*/ rix,
/tmp/codejail-*/** rix,
}

Note that there are other protections beyond AppArmor, setrlimits is also used to limit some resource use.

--Ned.

BTW: Top-posting makes it harder to follow threads of conversations, better form is to add your comments below the person you're replying to.
Thanks,

Laszlo
 
D

Dennis Lee Bieber

Make a master input file and a master output file for each exercise. If
the student program's output matches the master output when run from the
master input, then it is correct.

As long as the student doesn't have access to the master in/out data,
but only examples...

Hearsay in my junior year at college was of a senior who couldn't
manage to get his program to work -- so he basically embedded lots of
output statements which basically wrote the expected output, based on
access to the test input data.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,982
Messages
2,570,185
Members
46,736
Latest member
AdolphBig6

Latest Threads

Top