Taint (like in Perl) as a Python module: taint.py

  • Thread starter Johann C. Rocholl

Johann C. Rocholl

The following is my first attempt at adding a taint feature to Python
to prevent os.system() from being called with untrusted input. What do
you think of it?

# taint.py - Emulate Perl's taint feature in Python
# Copyright (C) 2007 Johann C. Rocholl <[email protected]>
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


"""
Emulate Perl's taint feature in Python

This module replaces all functions in the os module (except stat) with
wrappers that will raise an Exception called TaintError if any of the
parameters is a tainted string.

All strings are tainted by default, and you have to call untaint on a
string to create a safe string from it.

Stripping, zero-filling, and changes to lowercase or uppercase don't
taint a safe string.

If you combine strings with + or join or replace, the result will be a
tainted string unless all its parts are safe.

It is probably a good idea to run some checks on user input before you
call untaint() on it. The safest way is to design a regex that matches
legal input only. A regex that tries to match illegal input is very
hard to prove complete.

You can run the following examples with the command
python taint.py -v
to test if this module works as designed.

>>> unsafe = 'test'
>>> os.system(unsafe)
Traceback (most recent call last):
TaintError
>>> safe = untaint(unsafe)
>>> tainted(safe)
False
>>> os.system(safe)
256
>>> safe + unsafe
u'testtest'
>>> safe.join([safe, unsafe])
u'testtesttest'
>>> tainted(safe + unsafe)
True
>>> tainted(safe + safe)
False
>>> tainted(unsafe.join([safe, safe]))
True
>>> tainted(safe.join([safe, unsafe]))
True
>>> tainted(safe.join([safe, safe]))
False
>>> tainted(safe.replace(safe, unsafe))
True
>>> tainted(safe.replace(safe, safe))
False
>>> tainted(safe.capitalize()) or tainted(safe.title())
False
>>> tainted(safe.lower()) or tainted(safe.upper())
False
>>> tainted(safe.strip()) or tainted(safe.rstrip()) or tainted(safe.lstrip())
False
>>> tainted(safe.zfill(8))
False
>>> tainted(safe.expandtabs())
True
"""

import os
import types


class TaintError(Exception):
    """
    This exception is raised when you try to call a function in the os
    module with a string parameter that isn't a SafeString.
    """
    pass


class SafeString(unicode):
    """
    A string class that you must use for parameters to functions in
    the os module.
    """

    def __add__(self, other):
        """Create a safe string if the other string is also safe."""
        if tainted(other):
            return unicode.__add__(self, other)
        return untaint(unicode.__add__(self, other))

    def join(self, sequence):
        """Create a safe string if all components are safe."""
        for element in sequence:
            if tainted(element):
                return unicode.join(self, sequence)
        return untaint(unicode.join(self, sequence))

    def replace(self, old, new, *args):
        """Create a safe string if the replacement text is also
        safe."""
        if tainted(new):
            return unicode.replace(self, old, new, *args)
        return untaint(unicode.replace(self, old, new, *args))

    def strip(self, *args):
        return untaint(unicode.strip(self, *args))

    def lstrip(self, *args):
        return untaint(unicode.lstrip(self, *args))

    def rstrip(self, *args):
        return untaint(unicode.rstrip(self, *args))

    def zfill(self, *args):
        return untaint(unicode.zfill(self, *args))

    def capitalize(self):
        return untaint(unicode.capitalize(self))

    def title(self):
        return untaint(unicode.title(self))

    def lower(self):
        return untaint(unicode.lower(self))

    def upper(self):
        return untaint(unicode.upper(self))


# Alias to the constructor of SafeString,
# so that untaint('abc') gives you a safe string.
untaint = SafeString


def tainted(param):
    """
    Check if a string is tainted.
    If param is a sequence or dict, all elements will be checked.
    """
    if isinstance(param, (tuple, list)):
        for element in param:
            if tainted(element):
                return True
        # No tainted element found: return False explicitly, not None.
        return False
    elif isinstance(param, dict):
        return tainted(param.values())
    elif isinstance(param, (str, unicode)):
        return not isinstance(param, SafeString)
    else:
        return False


def wrapper(function):
    """Create a new function that checks its parameters first."""
    def check_first(*args, **kwargs):
        """Check all parameters for unsafe strings, then call."""
        if tainted(args) or tainted(kwargs):
            raise TaintError
        return function(*args, **kwargs)
    return check_first


def install_wrappers(module, innocent):
    """
    Replace each function in the os module with a wrapper that checks
    the parameters first, except if the name of the function is in the
    innocent list.
    """
    for name, function in module.__dict__.iteritems():
        if name in innocent:
            continue
        if type(function) in [types.FunctionType,
                              types.BuiltinFunctionType]:
            module.__dict__[name] = wrapper(function)


install_wrappers(os, innocent = ['stat'])


if __name__ == '__main__':
    import doctest
    doctest.testmod()
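
The module above targets Python 2 (unicode, iteritems). As a rough illustration of the same idea, here is a minimal Python 3 sketch. It is not the author's code: it subclasses str instead of unicode, implements only __add__, and guards a hypothetical echo() function instead of patching the os module. The names guard, echo and guarded_echo are assumptions made for this example.

```python
import functools


class TaintError(Exception):
    """Raised when a guarded function receives a tainted string."""


class SafeString(str):
    """Marker subclass: instances are treated as untainted."""

    def __add__(self, other):
        result = str.__add__(self, other)
        # The sum stays safe only if the right operand is safe too.
        return result if tainted(other) else SafeString(result)


untaint = SafeString


def tainted(param):
    """Return True if param is, or contains, a plain (non-safe) string."""
    if isinstance(param, (tuple, list)):
        return any(tainted(element) for element in param)
    if isinstance(param, dict):
        return tainted(list(param.values()))
    if isinstance(param, str):
        return not isinstance(param, SafeString)
    return False


def guard(function):
    """Wrap function so that it refuses tainted arguments."""
    @functools.wraps(function)
    def check_first(*args, **kwargs):
        if tainted(args) or tainted(kwargs):
            raise TaintError(function.__name__)
        return function(*args, **kwargs)
    return check_first


def echo(text):
    """Hypothetical stand-in for a sensitive call such as os.system."""
    return "ran: " + text


guarded_echo = guard(echo)
```

Calling guarded_echo("ls") raises TaintError, while guarded_echo(untaint("ls")) goes through.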
 

Gabriel Genellina

On Mon, 05 Feb 2007 19:13:04 -0300, Johann C. Rocholl wrote:
> The following is my first attempt at adding a taint feature to Python
> to prevent os.system() from being called with untrusted input. What do
> you think of it?

A simple reload(os) will drop all your wrapped functions, leaving the
original ones.
I suppose you don't intend to publish the SafeString class - but if anyone
can get a SafeString instance in any way or another, he can convert
*anything* into a SafeString trivially.
And tainted() returns False by default?????

Sorry but in general, this won't work :(
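
Both objections can be reproduced in a few lines. The following is a sketch in Python 3, where reload lives in importlib; it patches the small stdlib module colorsys instead of os so that the demonstration has no side effects. The is_wrapper attribute and this cut-down SafeString are assumptions made for the example.

```python
import colorsys
import importlib


def wrapper(function):
    """Pass-through wrapper, marked so that it can be recognised later."""
    def check_first(*args, **kwargs):
        return function(*args, **kwargs)
    check_first.is_wrapper = True
    return check_first


# Monkey-patch one function of the module, as install_wrappers does for os.
colorsys.rgb_to_hsv = wrapper(colorsys.rgb_to_hsv)
wrapped_before = getattr(colorsys.rgb_to_hsv, "is_wrapper", False)

# Reloading re-executes the module body and rebinds the original function.
importlib.reload(colorsys)
wrapped_after = getattr(colorsys.rgb_to_hsv, "is_wrapper", False)


class SafeString(str):
    """Cut-down stand-in for the SafeString class of taint.py."""


# Any SafeString instance hands out its class, and the class is the
# untaint function, so nothing stops a caller from "blessing" any text.
some_safe_value = SafeString("ok")
forged = type(some_safe_value)("anything; at all")
```

After the reload, wrapped_after is False, and forged passes any isinstance(..., SafeString) check even though it was never validated.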
 

Ben Finney

Gabriel Genellina said:
> I suppose you don't intend to publish the SafeString class - but if
> anyone can get a SafeString instance in any way or another, he can
> convert *anything* into a SafeString trivially.

The point (in Perl) of detecting taint isn't to prevent a programmer
from deliberately removing the taint. It's to help the programmer find
places in the code where taint accidentally remains.

> And tainted() returns False by default?????
> Sorry but in general, this won't work :(

I'm inclined to agree that the default should be to flag an object as
tainted unless known otherwise.
 

Gabriel Genellina

On Mon, 05 Feb 2007 23:01:51 -0300, Ben Finney wrote:
> The point (in Perl) of detecting taint isn't to prevent a programmer
> from deliberately removing the taint. It's to help the programmer find
> places in the code where taint accidentally remains.

I'm not convinced at all of the usefulness of tainting.
How do you "untaint" a string? By checking some conditions?
Let's say you validate and untaint a string with its future use on a
command line in mind, so you assume it's safe to use in os.system calls - but
perhaps it still contains an SQL injection trap (and since it is untainted,
you use it anyway!).
Tainting may be useful for a short-lived string, one that is used in the
*same* process that created it. And in this case, unit testing may be a
good way to validate how the string is used throughout the program.
But if you store input text on a database or configuration file (username,
password, address...) it may get used again by *another* process, maybe a
*different* program, even months later. What to do? Validate all input for
any possible type of unsafe usage before storing them in the database, so
it is untainted? Maybe... but I'd say it's better to ensure things are
*done* *safely* instead of trusting a flag. (Uhmm, perhaps it's like "have
safe sex; use a condom" instead of "require an HIV certificate")

That is:
- for sql injection, use parametrized queries, don't build SQL statements
by hand.
- for html output, use any safe template engine, always quoting inputs.
- for os.system and similar, validate the command line and arguments right
before being executed.
and so on.
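
As a rough Python 3 illustration of "doing things safely" at the point of use, the sketch below relies only on the standard library. The table name users and the sample inputs are made up for the example, html.escape stands in for a real template engine, and shlex.quote shows quoting a single shell argument, which complements but does not replace validating the command line.

```python
import html
import shlex
import sqlite3

user_input = "x'; DROP TABLE users; --"

# SQL: pass the value as a parameter; never splice it into the statement.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (name TEXT)")
connection.execute("INSERT INTO users (name) VALUES (?)", (user_input,))
stored = connection.execute("SELECT name FROM users").fetchone()[0]

# HTML: escape the value at the moment it is written into the page.
html_fragment = "<p>" + html.escape("<script>alert(1)</script>") + "</p>"

# Shell: quote the argument right before it becomes part of a command line.
shell_word = shlex.quote("file; rm -rf ~")
```

The hostile text ends up stored verbatim as data, the markup is rendered inert, and the shell sees a single quoted word.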
 

Paul Rubin

Gabriel Genellina said:
> I'm not convinced at all of the usefulness of tainting.
> How do you "untaint" a string? By checking some conditions?

In Perl? I don't think you can untaint a string, but you can make a
new untainted string by extracting a regexp match from the tainted
string's contents.
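
A Python 3 analogue of that Perl idiom might look like the sketch below. The helper untaint_match, the pattern TEXT_FILE and this cut-down SafeString are hypothetical names introduced for the example; they are not part of taint.py.

```python
import re


class SafeString(str):
    """Cut-down stand-in for the SafeString class of taint.py."""


def untaint_match(pattern, value):
    """Return the first captured group as a SafeString.

    The whole input must match the pattern; otherwise ValueError is
    raised and no safe string is produced.
    """
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError("input does not match the allowed pattern")
    return SafeString(match.group(1))


# Allow-list pattern: word characters followed by ".txt", nothing else.
TEXT_FILE = r"([A-Za-z0-9_]+\.txt)"
```

untaint_match(TEXT_FILE, "report_2007.txt") returns a SafeString, while an input such as "x.txt; rm -rf ~" raises ValueError because the pattern describes legal input only.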

> Let's say, you validate and untaint a string, regarding its future
> usage on a command line, so you assume it's safe to use on os.system
> calls - but perhaps it still contains a sql injection trap (and being
> untainted you use it anyway!).

Well, OK, you didn't check it carefully enough, but at least you made
an attempt. Taint checking is a useful feature in Perl.

> Tainting may be useful for a short lived string, one that is used on
> the *same* process as it was created. And in this case, unit testing
> may be a good way to validate the string usage along the program.

Unit testing is completely overrated for security testing. It checks
the paths through the program that you've written tests for. Taint
checking catches errors in paths that you never realized existed.

> - for sql injection, use parametrized queries, don't build SQL
>   statements by hand.
> - for html output, use any safe template engine, always quoting inputs.
> - for os.system and similar, validate the command line and arguments
>   right before being executed.
> and so on.

Right, but it's easy to make errors and overlook things, and taint
checking catches a lot of such mistakes.
 

Johann C. Rocholl

> I'm inclined to agree that the default should be to flag an object as
> tainted unless known otherwise.

That's true. For example, my first attempt didn't prevent this:
os.open(buffer('/etc/passwd'), os.O_RDONLY)

Here's a stricter version:

def tainted(param):
    """
    Check if a parameter is tainted. If it's a sequence or dict, all
    values will be checked (but not the keys).
    """
    if isinstance(param, unicode):
        return not isinstance(param, SafeString)
    elif isinstance(param, (bool, int, long, float, complex, file)):
        return False
    elif isinstance(param, (tuple, list)):
        for element in param:
            if tainted(element):
                return True
        # No tainted element found: return False explicitly, not None.
        return False
    elif isinstance(param, dict):
        return tainted(param.values())
    else:
        # Anything not known to be harmless counts as tainted.
        return True
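
For comparison, a Python 3 sketch of the same default-deny check follows. It is an adaptation, not the code above: Python 3 has no long, file or buffer types, so they are dropped from the list of harmless types, and bytes and memoryview objects (the nearest equivalents of buffer) fall through to the final return True.

```python
class SafeString(str):
    """Cut-down stand-in for the SafeString class of taint.py."""


def tainted(param):
    """Default-deny check: only explicitly known values are untainted."""
    if isinstance(param, str):
        return not isinstance(param, SafeString)
    if isinstance(param, (bool, int, float, complex)):
        return False
    if isinstance(param, (tuple, list)):
        return any(tainted(element) for element in param)
    if isinstance(param, dict):
        # Only the values are checked, as in the original.
        return tainted(list(param.values()))
    # Unknown types such as bytes or memoryview are treated as tainted.
    return True
```

With this version, tainted(b'/etc/passwd') and tainted(memoryview(b'/etc/passwd')) are both True, so a wrapped os.open would reject them.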
 
