Loop through a dict changing keys

Gnarlodious

What is the best way (Python 3) to loop through dict keys, examine the
string, change them if needed, and save the changes to the same dict?

So for input like this:
{'Mobile': 'string', 'context': '<malicious code>', 'order': '7',
'time': 'True'}

I want to booleanize 'True', turn '7' into an integer, escape
'<malicious code>', and ignore 'string'.

Any elegant Python way to do this?

-- Gnarlie
 
Alexander Kapps

I think JSON could be of some use, but I've not used it yet,
otherwise something like this could do it:

#!/usr/bin/python3

from html import escape  # cgi.escape was removed in Python 3.8

def convert(string):
    # Try each converter in turn; escape() never raises, so it is the fallback.
    for conv in (int, lambda x: {'True': True, 'False': False}[x], escape):
        try:
            return conv(string)
        except (KeyError, ValueError):
            pass
    return string


d = {'Mobile': 'string',
     'context': '<malicious code>',
     'order': '7',
     'time': 'True'}

print(d)

# Replacing the values of existing keys while iterating is safe.
for key in d:
    d[key] = convert(d[key])

print(d)


$ ./conv.py
{'Mobile': 'string', 'context': '<malicious code>', 'order': '7', 'time': 'True'}
{'Mobile': 'string', 'context': '&lt;malicious code&gt;', 'order': 7, 'time': True}
 
MRAB

How about:

for key, value in my_dict.items():
    if value == "True":
        my_dict[key] = True
 
88888 dihedral

Is there an FAQ available here? Please check the official Python site and the ActiveState Python examples first, and also check the PLEAC comparisons of many programming languages.
 
PoD

How about

data = {
    'Mobile': 'string',
    'context': '<malicious code>',
    'order': '7',
    'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
    data[k] = types[k](v)
 
Jon Clements

Bit of nit-picking, but:
False
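
The nit-pick is presumably that bool() on any non-empty string returns True, so the bool entry in the types mapping would turn the string 'False' into True. A minimal sketch of one way around it, assuming only the exact strings 'True' and 'False' should be accepted; parse_bool is a hypothetical helper, not something posted in the thread:

```python
def parse_bool(text):
    """Map the exact strings 'True'/'False' to booleans; reject anything else."""
    try:
        return {'True': True, 'False': False}[text]
    except KeyError:
        raise ValueError('not a boolean: %r' % (text,))

# bool() only tests for emptiness, so both of these hold.
assert bool('True') is True
assert bool('False') is True

data = {'Mobile': 'string', 'context': '<malicious code>',
        'order': '7', 'time': 'False'}
# Same per-key mapping idea, with parse_bool swapped in for bool.
types = {'Mobile': str, 'context': str, 'order': int, 'time': parse_bool}

for k, v in data.items():
    data[k] = types[k](v)
```

With this mapping, data['time'] ends up as the boolean False rather than True.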
 
PoD

Bit of nit-picking, but:
False

Oops :) Brain fade.
 
Gnarlodious

data = {
    'Mobile': 'string',
    'context': '<malicious code>',
    'order': '7',
    'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
    data[k] = types[k](v)

Thanks for the tip, I didn't know you could do that. I ended up
filtering the values the bulky way, but it gives me total control over
what internet users feed my program.

-- Gnarlie
 
Steven D'Aprano

data = {
    'Mobile': 'string',
    'context': '<malicious code>',
    'order': '7',
    'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
    data[k] = types[k](v)

Thanks for the tip, I didn't know you could do that. I ended up
filtering the values the bulky way,

What is "the bulky way"?

but it gives me total control over
what internet users feed my program.

Why does this not fill me with confidence?

Jon Clements has already spotted a major bug in the above: using bool
as shown is not correct. Furthermore, converting '<malicious code>' into
a string does nothing, since it is already a string.

Gnarlodious, it is good that you are concerned about code injection
attacks, but defending against them is not simple or easy. I don't intend
to sound condescending, but when your response to being shown a simple
filter that maps keys to types is to say "I didn't know you could do
that", that's a good warning that your Python experience may not be quite
up to the job of out-guessing the sort of obscure tricks hostile
attackers may use.

If you think that defending against malicious code is simple, you should
read this blog post:

http://tav.espians.com/a-challenge-to-break-python-security.html

and the thread which inspired it:

http://mail.python.org/pipermail/python-dev/2009-February/086401.html


How do you sanitize user input?
 
Gnarlodious

How do you sanitize user input?

Thanks for your concern. This is what I now have, which merely expands
each value into its usable type (unquotes them):

# filter each value
try:
    var=int(var)
except ValueError:
    if var in ('False', 'True'):
        var=eval(var) # extract booleans
    else:
        var=cgi.escape(var)

This is really no filtering at all, since all CGI variables are
written to a dictionary without checking. However, if there is no
receiver for the value I should be safe, right?

I am also trapping some input at mod_wsgi, like php query strings. And
that IP address gets quarantined. If you can suggest what attack words
to block I'll thank you for it.

I also have a system to reject variables that are not in a list, but
waiting to see what the logfiles show before deploying it.

-- Gnarlie
http://Gnarlodious.com
 
Steven D'Aprano

Thanks for your concern. This is what I now have, which merely expands
each value into its usable type (unquotes them):

# filter each value
try:
    var=int(var)

Should be safe, although I suppose if an attacker passed (say) five
hundred thousand "9" digits, it might take int() a while to generate the
long int. Instant DOS attack.

A blunt object fix for that is to limit the user input to (say) 500
characters, which should be long enough for any legitimate input string.
But that will depend on your application.
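
A minimal sketch of that length check, assuming a 500-character cap; safe_int and MAX_INPUT_CHARS are hypothetical names chosen for illustration:

```python
MAX_INPUT_CHARS = 500  # assumed limit; tune for the application

def safe_int(text, limit=MAX_INPUT_CHARS):
    """Convert text to int, refusing oversized input before int() sees it."""
    if len(text) > limit:
        raise ValueError('input too long: %d characters' % len(text))
    return int(text)
```

Raising ValueError for oversized input means the caller's existing except ValueError branch handles it the same way as any other non-integer string. Newer CPython releases also cap the number of digits int() will parse by default, but an explicit check keeps the limit under the application's control.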


except ValueError:
    if var in ('False', 'True'):
        var=eval(var) # extract booleans

Well, that's safe, but slow, and it might encourage some future
maintainer to use eval in less safe ways. I'd prefer:

try:
    var = {'True': True, 'False': False}[var]
except KeyError:
    pass # try something else


(To be a little more user-friendly, use var.strip().title() instead of
just var.)
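
Put together, the lookup with the friendlier normalization might look like the sketch below; to_bool and BOOLEANS are hypothetical names, and returning None for "not a boolean" is just one possible convention:

```python
BOOLEANS = {'True': True, 'False': False}

def to_bool(text):
    """Return True/False for forgiving spellings, or None if text is not a boolean."""
    # strip() drops surrounding whitespace; title() turns 'true'/'TRUE' into 'True'.
    return BOOLEANS.get(text.strip().title())
```

Because nothing is evaluated, an unexpected string can only ever produce None, never run code.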


    else:
        var=cgi.escape(var)

This is really no filtering at all, since all CGI variables are written
to a dictionary without checking. However, if there is no receiver for
the value I should be safe, right?

What do you mean "no receiver"?

If you mean that you don't pass the values to eval, exec, use them in SQL
queries, call external shell scripts, etc., then that seems safe to me.
But I'm hardly an expert on security, so don't take my word on it. And it
depends on what you end up doing in the CGI script.

I am also trapping some input at mod_wsgi, like php query strings. And
that IP address gets quarantined. If you can suggest what attack words
to block I'll thank you for it.

That's the wrong approach. Don't block words in a blacklist. Block
everything that doesn't appear in a whitelist. Otherwise you're
vulnerable to a blackhat coming up with an attack word that you never
thought of. There's one of you and twenty million of them. Guess who has
the advantage?
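
As a rough sketch of the whitelist idea applied to variable names, assuming the four keys from the original example are the only expected fields (ALLOWED_KEYS and filter_fields are hypothetical names):

```python
ALLOWED_KEYS = {'Mobile', 'context', 'order', 'time'}  # assumed set of expected fields

def filter_fields(form):
    """Keep only expected keys; anything else is dropped without being inspected."""
    return {k: v for k, v in form.items() if k in ALLOWED_KEYS}

submitted = {'order': '7', 'time': 'True', 'debug': '1; DROP TABLE users'}
clean = filter_fields(submitted)
```

The unexpected 'debug' entry never reaches the rest of the program, no matter what its value contains, so there is no list of attack words to maintain.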
 
Chris Angelico

types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
    data[k] = types[k](v)

Thanks for the tip, I didn't know you could do that. I ended up
filtering the values the bulky way, but it gives me total control over
what internet users feed my program.

It should be noted that this will not in any way sanitize
data['context']. It calls the str() function on it, thus ensuring that
it's a string, but that's all. If you're needing to deal with
(potentially) malicious input, you'll want to swap in a function that
escapes it in some way (if it's going into a database, your database
engine will usually provide a 'quote' or 'escape' function; if it's to
go into a web page, I think cgi.escape is what you want).
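
For the web-page case, a minimal sketch using html.escape from the standard library, which is where this functionality lives in current Python 3 (cgi.escape was removed in Python 3.8); the sample strings are only illustrations:

```python
from html import escape

raw = '<malicious code>'
safe = escape(raw)          # angle brackets become HTML entities

# By default escape() also converts quote characters, which matters when
# the value ends up inside an HTML attribute.
quoted = escape('say "hi"')
```

Here safe is '&lt;malicious code&gt;' and quoted is 'say &quot;hi&quot;'. This only covers HTML output; values headed for a database still need the database driver's own parameter handling.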

ChrisA
 
88888 dihedral

Uh, sounds reasonable. If one loops over an index variable that could be altered during loop execution, the loop may not end as expected.
 
Ian Kelly

Uh, sounds reasonable, if one loops over an index variable that could be altered during the loop execution then the loop may not end as expected.

From the docs: "Iterating views while adding or deleting entries in
the dictionary may raise a RuntimeError or fail to iterate over all
entries."

Changing the values of existing entries while iterating is considered
to be safe, though.
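
A small sketch of the difference, using made-up sample data. Replacing values is fine, adding a key mid-loop raises RuntimeError, and iterating over a snapshot such as list(d) is one common way to rename keys:

```python
d = {'order': ' 7 ', 'time': ' True '}

# Safe: only the values of existing keys change.
for key in d:
    d[key] = d[key].strip()

# Not safe: adding a key changes the dict's size during iteration.
raised = False
broken = dict(d)
try:
    for key in broken:
        broken[key + '_copy'] = broken[key]
except RuntimeError:
    raised = True

# Renaming keys: loop over a snapshot of the keys, not the live dict.
for key in list(d):
    d[key.title()] = d.pop(key)
```

After this runs, d is {'Order': '7', 'Time': 'True'} and raised is True.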
 
Gnarlodious

Steven: Thanks for those tips, I've implemented all of them. Also only
allowing whitelisted variable names. Feeling much more confident.

-- Gnarlie
 