sql escaping module


David Bear

Being new to pgdb, I'm finding there are a lot of things I don't understand
when I read the PEP and the sparse documentation on pgdb.

I was hoping there would be a module that would properly escape longer text
strings to prevent sql injection -- and, among other things, make sure the
python string object ends up being a proper type for postgresql. I've
bought 3 books on postgresql and none of the code samples demonstrate this.

web searches for 'python sql escape string' yield way too many results.

Any pointers would be greatly appreciated.
 

Fredrik Lundh

David said:
Being new to pgdb, I'm finding there are a lot of things I don't understand
when I read the PEP and the sparse documentation on pgdb.

I was hoping there would be a module that would properly escape longer text
strings to prevent sql injection -- and, among other things, make sure the
python string object ends up being a proper type for postgresql. I've
bought 3 books on postgresql and none of the code samples demonstrate this.

web searches for 'python sql escape string' yield way too many results.

Any pointers would be greatly appreciated.

for x in range(1000000):
    print("USE PARAMETERS TO PASS VALUES TO THE DATABASE")

</F>
 

Fredrik Lundh

Fredrik said:
for x in range(1000000):
    print("USE PARAMETERS TO PASS VALUES TO THE DATABASE")

for an example, see "listing 2" in the following article:

http://www.amk.ca/python/writing/DB-API.html

(the database used in that example uses the "?" parameter style. your database
may prefer another style; check the paramstyle variable. see the DB API spec
for a description)

(a linuxjournal version of that article is linked from the pygresql site)

</F>
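
A minimal sketch of the paramstyle check described above. It uses the standard-library sqlite3 module purely as a stand-in because it needs no server; that choice, and the table and column names, are assumptions for illustration. pgdb and psycopg2 expose the same module-level `paramstyle` attribute, though they may report a different value.

```python
import sqlite3  # stand-in DB-API driver (assumption; the thread is about pgdb)

# Every DB-API 2.0 module advertises its placeholder syntax here.
print(sqlite3.paramstyle)  # sqlite3 reports "qmark", meaning "?" placeholders

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE notes (author TEXT, body TEXT)")  # hypothetical table

# The SQL text holds only placeholders; the values are passed separately.
body = "it's got 'quotes' and a ; semicolon"
cur.execute("INSERT INTO notes VALUES (?, ?)", ("david", body))

cur.execute("SELECT body FROM notes WHERE author = ?", ("david",))
stored = cur.fetchone()[0]
print(stored == body)  # the text comes back unchanged, quotes and all
```

With a driver whose paramstyle is "format" or "pyformat", the placeholders would be written as %s instead of ?, but the values are still passed as the second argument to execute().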
 

Tim Roberts

David Bear said:
Being new to pgdb, I'm finding there are a lot of things I don't understand
when I read the PEP and the sparse documentation on pgdb.

I was hoping there would be a module that would properly escape longer text
strings to prevent sql injection -- and, among other things, make sure the
python string object ends up being a proper type for postgresql. I've
bought 3 books on postgresql and none of the code samples demonstrate this.

All of the Python database modules will do this protection for you.
Example:

import psycopg2

db = psycopg2.connect(database='dbname')
c = db.cursor()
# The values go in as a separate tuple; the module does the quoting.
c.execute("INSERT INTO table1 VALUES (%s,%s,%s);", (var1, var2, var3))

Note that I have used a "comma", not the Python % operator, and I have not
done any quoting in the query. By doing that, I am instructing the
database module to do whatever protective quoting may be required for the
values I have passed, and substitute the quoted values into the string.

As long as you use that scheme, you should be safe from injection. It's
only when people try to do it themselves that they get in trouble, as in:

c.execute("INSERT INTO table1 VALUES ('%s','%s','%s');" % (var1, var2, var3))  # THIS IS WRONG
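
To make the difference between the comma and the % operator concrete, here is a small sketch. It uses the standard-library sqlite3 module (an assumption, chosen so the example runs without a PostgreSQL server, hence the "?" placeholders), and the `users` table and the hostile input are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")  # hypothetical table
cur.execute("INSERT INTO users VALUES ('alice', 1)")
cur.execute("INSERT INTO users VALUES ('bob', 0)")

hostile = "nobody' OR '1'='1"

# Wrong: the value is pasted into the SQL text, so its quote character
# closes the string literal and the rest is executed as SQL.
unsafe_sql = "SELECT name FROM users WHERE name = '%s'" % hostile
unsafe_rows = cur.execute(unsafe_sql).fetchall()

# Right: the value is passed as a parameter and is only ever treated as data.
safe_rows = cur.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows))  # 2: the injected OR clause matches every row
print(len(safe_rows))    # 0: no user has that literal name
```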
 

Frank Millman

David said:
Being new to pgdb, I'm finding there are a lot of things I don't understand
when I read the PEP and the sparse documentation on pgdb.

I was hoping there would be a module that would properly escape longer text
strings to prevent sql injection -- and, among other things, make sure the
python string object ends up being a proper type for postgresql. I've
bought 3 books on postgresql and none of the code samples demonstrate this.

web searches for 'python sql escape string' yield way too many results.

Any pointers would be greatly appreciated.

I think I know where David is coming from, as I also battled to
understand this. I think that I have now 'got it', so I would like to
offer my explanation.

I used to think that each DB-API module transformed the 'string +
parameters' into a valid SQL command before passing it to the db.
However, this is not what is happening.

Every modern database provides an API to allow applications to interact
with the database programmatically. Typically these are intended for C
programs, but other languages may be supported. The authors of the
various DB-API modules provide a python wrapper around this to allow
use from a python program.

Each of the APIs includes the capability of passing commands in the
form of 'string + parameters' directly into the database. This means
that the data values are never embedded into the SQL command at all,
and therefore there is no possibility of injection attacks.

The various APIs use different syntaxes for passing the parameters. It
would have been nice if the DB-API had specified one method, and left
it to the author of each module to transform this into the form
required by the underlying API. Unfortunately the DB-API allows a
choice of 'paramstyles'. There may be technical reasons for this, but
it does make supporting multiple databases awkward.

Frank Millman
 

Fredrik Lundh

Frank said:
Each of the APIs includes the capability of passing commands in the
form of 'string + parameters' directly into the database. This means
that the data values are never embedded into the SQL command at all,
and therefore there is no possibility of injection attacks.

another advantage with parameters is that if you do multiple operations which
differ only in parameters, the database may skip the SQL compilation and query
optimization passes.

Frank said:
The various APIs use different syntaxes for passing the parameters. It
would have been nice if the DB-API had specified one method, and left
it to the author of each module to transform this into the form
required by the underlying API. Unfortunately the DB-API allows a
choice of 'paramstyles'. There may be technical reasons for this, but
it does make supporting multiple databases awkward.

agreed.

on the other hand, it shouldn't be that hard to create a function that does this mapping
on the fly, so that drivers can be updated to support any paramstyle... time for a DB
API 3.0 specification, perhaps?

(I'd also like to see a better granularity; the current connection/cursor model is a
bit limited; a connection/cursor/query/result set model would be nicer, but I guess
ODBC gets in the way here...)

</F>
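
A rough sketch of the "multiple operations which differ only in parameters" case mentioned above, using `executemany()` from the DB-API. The sqlite3 driver and the `readings` table are assumptions for illustration. Whether the compilation step is actually skipped depends on the driver and the database, so treat this as showing the calling pattern only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE readings (sensor TEXT, value REAL)")  # hypothetical table

rows = [("a", 1.5), ("b", 2.25), ("c", 4.0)]

# One SQL string, many parameter tuples: the statement text never changes,
# which is what gives the database the chance to reuse its parsed form.
insert_sql = "INSERT INTO readings VALUES (?, ?)"
cur.executemany(insert_sql, rows)
conn.commit()

count = cur.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(count)  # 3
```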
 

Steve Holden

Fredrik said:
another advantage with parameters is that if you do multiple operations which
differ only in parameters, the database may skip the SQL compilation and query
optimization passes.

agreed.

indeed. I suspect (not having been involved in the decisions) that the
variations were to minimise the work module implementers had to do to
get their modules working.

Fredrik said:
on the other hand, it shouldn't be that hard to create a function that does this mapping
on the fly, so that drivers can be updated to support any paramstyle... time for a DB
API 3.0 specification, perhaps?

It would be a little tricky to convert name-based ("named" and
"pyformat", requiring a data mapping) parameterizations to positional
ones ("qmark", "numeric" and "format", requiring a data sequence) and
vice versa. It's probably a worthwhile effort, though.

Fredrik said:
(I'd also like to see a better granularity; the current connection/cursor model is a
bit limited; a connection/cursor/query/result set model would be nicer, but I guess
ODBC gets in the way here...)

Yes, it would at least be nice to include some of the more advanced ways
of presenting query results.

regards
Steve
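
For illustration, one direction of the conversion Steve describes (name-based "named" placeholders with a mapping, to positional "qmark" placeholders with a sequence) might look like the sketch below. The function name `named_to_qmark` is hypothetical and not part of any DB-API module. It deliberately ignores the tricky parts: a `:name` that appears inside a quoted SQL string literal or a comment would be rewritten too, which a real implementation would have to avoid.

```python
import re

# ":name" not preceded by another colon, so PostgreSQL casts like x::int are left alone.
_NAMED = re.compile(r"(?<!:):([A-Za-z_][A-Za-z0-9_]*)")

def named_to_qmark(sql, params):
    """Rewrite ':name' placeholders as '?' and return (sql, tuple_of_values).

    Hypothetical helper: does not skip string literals or comments.
    """
    values = []

    def replace(match):
        # Look the name up in the mapping, in order of appearance in the SQL.
        values.append(params[match.group(1)])
        return "?"

    return _NAMED.sub(replace, sql), tuple(values)

sql, args = named_to_qmark(
    "SELECT * FROM t WHERE a = :a AND b = :b AND c = :a",
    {"a": 1, "b": 2},
)
print(sql)   # SELECT * FROM t WHERE a = ? AND b = ? AND c = ?
print(args)  # (1, 2, 1)
```

A name used twice becomes two positional values, which is part of why the mapping is not a simple one-to-one renaming.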
 

David Bear

Fredrik said:
for x in range(1000000):
    print("USE PARAMETERS TO PASS VALUES TO THE DATABASE")

</F>

Yes. Fredrik and others. Thank you for the advice.

I now have the following code:

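import traceback
import pgdb

...
parmChar = '%s'
# Only the table and column names are interpolated into the SQL text here;
# the data values are passed separately to execute().
sqlInsert = "INSERT INTO %s (%s) VALUES (%s);" % (
    tn, ", ".join(fieldnames), ", ".join([parmChar] * len(fieldnames)))
try:
    cursor.execute(sqlInsert, datum)
except pgdb.DatabaseError:
    # format_exc() returns the traceback as a string; print_exc() returns None.
    logerror("Error on record insert \n %s \n %s" % (sqlInsert,
             traceback.format_exc()))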

I was not aware that the python db interface would just handle proper
escaping of python data types to proper postgresql data types.

Any other hints on database programming much appreciated.
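
One hint that follows from the code above: placeholders only cover values, so the table and column names still reach the SQL text through the % operator. If those names can ever come from outside the program, they need their own check. A minimal sketch, assuming plain unquoted identifiers; the helper name `build_insert` and the identifier rule are assumptions for illustration, not something pgdb provides:

```python
import re

# Assumed rule: letters, digits and underscores, not starting with a digit.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_insert(table, fieldnames, placeholder="%s"):
    """Build an INSERT statement with placeholders, rejecting odd identifiers."""
    for name in [table] + list(fieldnames):
        if not _IDENT.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    columns = ", ".join(fieldnames)
    marks = ", ".join([placeholder] * len(fieldnames))
    return "INSERT INTO %s (%s) VALUES (%s);" % (table, columns, marks)

print(build_insert("table1", ["name", "age"]))
# INSERT INTO table1 (name, age) VALUES (%s, %s);
```

The resulting string is then passed to cursor.execute() together with the data values, exactly as in the code above.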
 

David Bear

Steve Holden wrote:

My news server didn't get Frank's initial post to the group, so I'm glad that
Steve included it in his followup.

The statement above can cause relief or pain. Letting the DBAPI handle
proper string escapes, formatting, etc., is a big relief. However, I am
still wondering what happens under the covers. If I have a string '1\n'
that I've read from some source and I really intend on inserting it into
the database as a number 1, if the table column it goes into is of type int
or num or float, will the DBAPI really know what to do with the newline?
 

Frank Millman

David said:
The statement above can cause relief or pain. Letting the DBAPI handle
proper string escapes, formatting, etc., is a big relief. However, I am
still wondering what happens under the covers. If I have a string '1\n'
that I've read from some source and I really intend on inserting it into
the database as a number 1, if the table column it goes into is of type int
or num or float, will the DBAPI really know what to do with the newline?

Try it and see. This is what I get -
TypeError: not all arguments converted during string formatting

Different DBAPI modules may handle it differently.

Frank
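
Since modules may differ here, one way to sidestep the question is to convert the string in Python before it reaches the driver. A small sketch, again using sqlite3 as a stand-in (an assumption, as is the `counts` table); Python's own `int()` already ignores surrounding whitespace such as the trailing newline.

```python
import sqlite3

raw = "1\n"        # value as read from some source
number = int(raw)  # int() ignores leading/trailing whitespace, newline included

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE counts (n INTEGER)")  # hypothetical table

# The driver now receives a Python int, so no string cleanup is left to it.
cur.execute("INSERT INTO counts VALUES (?)", (number,))
stored = cur.execute("SELECT n FROM counts").fetchone()[0]
print(repr(stored))  # 1
```

If the string is not a valid number, int() raises ValueError in Python, before anything is sent to the database.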
 

Steve Holden

David said:
My news server didn't get Frank's initial post to the group, so I'm glad that
Steve included it in his followup.

The statement above can cause relief or pain. Letting the DBAPI handle
proper string escapes, formatting, etc., is a big relief. However, I am
still wondering what happens under the covers. If I have a string '1\n'
that I've read from some source and I really intend on inserting it into
the database as a number 1, if the table column it goes into is of type int
or num or float, will the DBAPI really know what to do with the newline?

Yes. If you read the DB API documentation
(http://www.python.org/peps/pep-0249.html) you will see that there's a
section on "Type Objects and Constructors". It's those that ensure a
value will be coerced into the required form if possible.

regards
Steve
 
