Loop through a dict changing keys

Gnarlodious · Oct 15, 2011

What is the best way (Python 3) to loop through dict keys, examine the
string, change them if needed, and save the changes to the same dict?

So for input like this:
{'Mobile': 'string', 'context': '<malicious code>', 'order': '7',
'time': 'True'}

I want to booleanize 'True', turn '7' into an integer, escape
'<malicious code>', and ignore 'string'.

Any elegant Python way to do this?

-- Gnarlie

Alexander Kapps · Oct 15, 2011

I think JSON could be of some use, but I've not used it yet,
otherwise something like this could do it:

#!/usr/bin/python

from cgi import escape

def convert(string):
    for conv in (int, lambda x: {'True': True, 'False': False}[x],
                 escape):
        try:
            return conv(string)
        except (KeyError, ValueError):
            pass
    return string

d = {'Mobile': 'string',
     'context': '<malicious code>',
     'order': '7',
     'time': 'True'}

print d

for key in d:
    d[key] = convert(d[key])

print d

$ ./conv.py
{'Mobile': 'string', 'order': '7', 'context': '<malicious code>',
'time': 'True'}
{'Mobile': 'string', 'order': 7, 'context': '&lt;malicious code&gt;',
'time': True}
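
The script above is written for Python 2 (print statement, cgi.escape), while the question asks about Python 3. A minimal Python 3 sketch of the same idea might look like the following. It assumes html.escape as a stand-in for cgi.escape, which was removed from the standard library in Python 3.8; the rest mirrors the posted code.

```python
from html import escape  # assumption: replaces cgi.escape (removed in Python 3.8)


def convert(string):
    """Try each converter in turn and return the first result that succeeds."""
    converters = (
        int,                                          # '7' -> 7
        lambda s: {'True': True, 'False': False}[s],  # 'True' -> True
        escape,                                       # anything else is HTML-escaped
    )
    for conv in converters:
        try:
            return conv(string)
        except (KeyError, ValueError):
            continue
    return string


d = {'Mobile': 'string', 'context': '<malicious code>', 'order': '7', 'time': 'True'}

# Assigning to existing keys while iterating is fine: no key is added or removed.
for key in d:
    d[key] = convert(d[key])

print(d)
```

Because escape() never raises for a string, it acts as the catch-all at the end of the chain; plain text such as 'string' passes through it unchanged.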

MRAB · Oct 15, 2011

How about:

for key, value in my_dict.items():
    if value == "True":
        my_dict[key] = True

88888 dihedral · Oct 15, 2011

Is there an FAQ available here? Please check the official Python site and the ActiveState Python examples first, and also the PLEAC comparisons of many programming languages.

PoD · Oct 16, 2011

How about

data = {
  'Mobile': 'string',
  'context': '<malicious code>',
  'order': '7',
  'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
  data[k] = types[k](v)

Jon Clements · Oct 16, 2011

How about

data = {
  'Mobile': 'string',
  'context': '<malicious code>',
  'order': '7',
  'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
  data[k] = types[k](v)

Bit of nit-picking, but:
False

PoD · Oct 16, 2011

How about

data = {
  'Mobile': 'string',
  'context': '<malicious code>',
  'order': '7',
  'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
  data[k] = types[k](v)

Bit of nit-picking, but:
False

Oops

Brain fade.

Gnarlodious · Oct 16, 2011

data = {
  'Mobile': 'string',
  'context': '<malicious code>',
  'order': '7',
  'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
  data[k] = types[k](v)

Thanks for the tip, I didn't know you could do that. I ended up
filtering the values the bulky way, but it gives me total control over
what internet users feed my program.

-- Gnarlie

Steven D'Aprano · Oct 17, 2011

data = {
  'Mobile': 'string',
  'context': '<malicious code>',
  'order': '7',
  'time': 'True'}
types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
  data[k] = types[k](v)

Thanks for the tip, I didn't know you could do that. I ended up
filtering the values the bulky way,

What is "the bulky way"?

but it gives me total control over
what internet users feed my program.

Why does this not fill me with confidence?

Jon Clements has already spotted a major bug in the above: using bool
as shown is not correct, because bool() of any non-empty string,
including 'False', is True. Furthermore, converting '<malicious code>'
into a string does nothing, since it is already a string.
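
To make the bool problem concrete, here is a small sketch. bool() tests truthiness rather than parsing text, so an explicit lookup is needed instead. The parse_bool name is a hypothetical helper introduced only for this illustration.

```python
# bool() reports truthiness; it does not interpret the text of a string.
assert bool('True') is True
assert bool('False') is True   # any non-empty string is truthy
assert bool('') is False       # only the empty string is falsy


def parse_bool(text):
    """Hypothetical helper: map the exact strings 'True'/'False' to booleans."""
    return {'True': True, 'False': False}[text]  # KeyError for anything else


result = parse_bool('False')
```

Dropping such a function into the types mapping in place of bool would give the behaviour the original poster asked for.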

Gnarlodious, it is good that you are concerned about code injection
attacks, but defending against them is not simple or easy. I don't intend
to sound condescending, but when your response to being shown a simple
filter that maps keys to types is to say "I didn't know you could do
that", that's a good warning that your Python experience may not be quite
up to the job of out-guessing the sort of obscure tricks hostile
attackers may use.

If you think that defending against malicious code is simple, you should
read this blog post:

http://tav.espians.com/a-challenge-to-break-python-security.html

and the thread which inspired it:

http://mail.python.org/pipermail/python-dev/2009-February/086401.html

How do you sanitize user input?

Gnarlodious · Oct 17, 2011

How do you sanitize user input?

Thanks for your concern. This is what I now have, which merely expands
each value into its usable type (unquotes them):

# filter each value
try:
    var=int(var)
except ValueError:
    if var in ('False', 'True'):
        var=eval(var) # extract booleans
    else:
        var=cgi.escape(var)

This is really no filtering at all, since all CGI variables are
written to a dictionary without checking. However, if there is no
receiver for the value I should be safe, right?

I am also trapping some input at mod_wsgi, like php query strings. And
that IP address gets quarantined. If you can suggest what attack words
to block I'll thank you for it.

I also have a system to reject variables that are not in a list, but
waiting to see what the logfiles show before deploying it.

-- Gnarlie
http://Gnarlodious.com

Steven D'Aprano · Oct 17, 2011

Thanks for your concern. This is what I now have, which merely expands
each value into its usable type (unquotes them):

# filter each value
try:
    var=int(var)

Should be safe, although I suppose if an attacker passed (say) five
hundred thousand "9" digits, it might take int() a while to generate the
long int. Instant DOS attack.

A blunt object fix for that is to limit the user input to (say) 500
characters, which should be long enough for any legitimate input string.
But that will depend on your application.
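
A minimal sketch of that length cap, checked before int() is ever called. The parse_int name and the 500-character limit are illustrative assumptions; the right limit depends on the application.

```python
MAX_INPUT_LEN = 500  # assumed cap; tune for the application


def parse_int(text, max_len=MAX_INPUT_LEN):
    """Hypothetical helper: refuse oversized input before calling int()."""
    if len(text) > max_len:
        raise ValueError('input longer than %d characters' % max_len)
    return int(text)


value = parse_int('7')
```

Raising ValueError for oversized input keeps the caller's existing "except ValueError" branch working unchanged.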

except ValueError:
    if var in ('False', 'True'):
        var=eval(var) # extract booleans

Well, that's safe, but slow, and it might encourage some future
maintainer to use eval in less safe ways. I'd prefer:

try:
    var = {'True': True, 'False': False}[var]
except KeyError:
    pass # try something else

(To be a little more user-friendly, use var.strip().title() instead of
just var.)
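
Putting the lookup and the strip().title() normalisation together might look like this. The names BOOLEANS and parse_bool are hypothetical, chosen just for the sketch.

```python
BOOLEANS = {'True': True, 'False': False}


def parse_bool(text):
    """Hypothetical helper: accept ' true ', 'FALSE', 'False', and so on."""
    # strip() drops surrounding whitespace; title() turns 'tRUE' into 'True'.
    return BOOLEANS[text.strip().title()]  # KeyError if not a boolean word


flag = parse_bool(' true ')
```

No eval is involved, so nothing the user sends is ever executed.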

    else:
        var=cgi.escape(var)

This is really no filtering at all, since all CGI variables are written
to a dictionary without checking. However, if there is no receiver for
the value I should be safe, right?

What do you mean "no receiver"?

If you mean that you don't pass the values to eval, exec, use them in SQL
queries, call external shell scripts, etc., then that seems safe to me.
But I'm hardly an expert on security, so don't take my word on it. And it
depends on what you end up doing in the CGI script.

I am also trapping some input at mod_wsgi, like php query strings. And
that IP address gets quarantined. If you can suggest what attack words
to block I'll thank you for it.

That's the wrong approach. Don't block words in a blacklist. Block
everything that doesn't appear in a whitelist. Otherwise you're
vulnerable to a blackhat coming up with an attack word that you never
thought of. There's one of you and twenty million of them. Guess who has
the advantage?
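
As a rough illustration of the whitelist idea applied to variable names: only keys that are explicitly expected survive, and everything else is dropped. The ALLOWED set reuses the key names from the example earlier in the thread, and filter_allowed is a hypothetical helper.

```python
ALLOWED = {'Mobile', 'context', 'order', 'time'}  # names taken from the thread's example


def filter_allowed(params, allowed=ALLOWED):
    """Hypothetical helper: keep only the expected variable names."""
    return {name: value for name, value in params.items() if name in allowed}


incoming = {'order': '7', 'time': 'True', 'cmd': 'rm -rf /'}
clean = filter_allowed(incoming)
```

An attacker can invent new names indefinitely, but none of them get past a check like this unless they are already on the list.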

Chris Angelico · Oct 17, 2011

types={'Mobile':str,'context':str,'order':int,'time':bool}

for k,v in data.items():
  data[k] = types[k](v)

Thanks for the tip, I didn't know you could do that. I ended up
filtering the values the bulky way, but it gives me total control over
what internet users feed my program.

It should be noted that this will not in any way sanitize
data['context']. It calls the str() function on it, thus ensuring that
it's a string, but that's all. If you're needing to deal with
(potentially) malicious input, you'll want to swap in a function that
escapes it in some way (if it's going into a database, your database
engine will usually provide a 'quote' or 'escape' function; if it's to
go into a web page, I think cgi.escape is what you want).
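
A sketch of what swapping in an escaping function could look like for the web-page case. It assumes Python 3's html.escape in place of cgi.escape (removed in Python 3.8), and uses a hypothetical parse_bool in place of bool; the converters name is also just for illustration.

```python
from html import escape  # assumption: Python 3 replacement for cgi.escape


def parse_bool(text):
    """Hypothetical helper used instead of bool()."""
    return {'True': True, 'False': False}[text]


# Each key maps to the function that should process its value.
converters = {'Mobile': str, 'context': escape, 'order': int, 'time': parse_bool}

data = {'Mobile': 'string', 'context': '<malicious code>', 'order': '7', 'time': 'True'}

for k, v in data.items():
    data[k] = converters[k](v)  # replaces values only; the key set is unchanged
```

This only covers HTML output. Values headed for a database should go through the driver's parameter binding rather than an HTML escaper.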

ChrisA

88888 dihedral · Oct 17, 2011

Uh, sounds reasonable, if one loops over an index variable that could be altered during the loop execution then the loop may not end as expected.

Ian Kelly · Oct 17, 2011

Uh, sounds reasonable, if one loops over an index variable that could be altered during the loop execution then the loop may not end as expected.

From the docs: "Iterating views while adding or deleting entries in
the dictionary may raise a RuntimeError or fail to iterate over all
entries."

Changing the values of existing entries while iterating is considered
to be safe, though.
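
The difference can be seen in a short sketch (CPython behaviour; the dict contents are made up for the example). Changing values is fine, adding a key mid-loop raises RuntimeError, and renaming keys works if the loop runs over a snapshot of the keys.

```python
d = {'a': '1', 'b': '2'}

# Safe: only the values of existing keys change, so the dict keeps its size.
for key in d:
    d[key] = int(d[key])

# Unsafe: adding a key during iteration changes the dict's size.
raised = False
try:
    for key in d:
        d[key + '_copy'] = d[key]
except RuntimeError:
    raised = True

# Renaming keys: iterate over list(d), a snapshot taken before any change.
for key in list(d):
    d[key.upper()] = d.pop(key)
```

The list(d) copy costs one extra list of keys, which is usually negligible next to the risk of a skipped or repeated entry.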

Gnarlodious · Oct 17, 2011

Steven: Thanks for those tips, I've implemented all of them. Also only
allowing whitelisted variable names. Feeling much more confident.

-- Gnarlie
