Yet Another Command Line Parser

Manlio Perillo

Regards.
In the standard library there are two modules for command line
parsing: optparse and getopt.
In the Python Cookbook there is another simple method for parsing,
using a docstring.

However, sometimes (actually, in all my small scripts) one has a simple
function whose arguments are chosen on the command line.

For this reason I have written a simple module, optlist, that parses
the command line as if it were a function's argument list.

It is simpler to show an example:


import optlist


def main(a, b, *args, **kwargs):
    print 'a =', a
    print 'b =', b

    print 'args:', args
    print 'kwargs:', kwargs

optlist.call(main)


And on the shell:
shell: script.py 10, 20, 100, x=1



Since one sometimes needs to keep the options around, I have provided an
alternative syntax. Here is an example:


import optlist

optlist.setup('a, b, *args, **kwargs')

print 'a =', optlist.a
print 'b =', optlist.b

print 'args:', optlist.args
print 'kwargs:', optlist.kwargs



Finally, the module is so small that I post it here:

-------------------------- optlist.py --------------------------------

import sys

# add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
_options = ' '.join(sys.argv[1:])

def call(func):
    """
    Call func, passing to it the arguments from the command line
    """
    exec('func(' + _options + ')')

def setup(template):
    """
    Template is a string containing the argument list.
    The command line options are evaluated according to the template
    and the values are stored in the module dictionary
    """
    exec('def helper(' + template +
         '):\n\tglobals().update(locals())')
    exec('helper(' + _options + ')')

----------------------------------------------------------------------



I hope that this is not 'Yet Another Useless Module' and that the
code is correct.

The only problem is that error messages are ugly.



Regards Manlio Perillo
 
Andrew Dalke

Manlio said:
# add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
_options = ' '.join(sys.argv[1:])

def call(func):
    """
    Call func, passing to it the arguments from the command line
    """
    exec('func(' + _options + ')')

The only problem is that error messages are ugly.

And it's a huge security hole. What if I did


script.py "x=6)\
import os
os.system('ls -l')"

Even if not a security hole, it's tricky to handle the
combined shell and Python escaping rules

script.py x="This is a string"

won't work, while

script.py 'x="This is a string"'

should. Embedding ! and \escaped characters should be
even more fun.
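
To see why the first form fails, it can help to look at what Python actually receives after the shell has removed its own quoting. The sketch below uses shlex.split from the standard library as a stand-in for a POSIX shell's word splitting (an approximation only; a real shell also expands !, $ and so on), and compile to check whether the resulting call would even parse. The helper names call_source and parses are made up for this illustration, and the code is written in Python 3 syntax.

```python
import shlex


def call_source(command_line):
    """Return the source optlist would build for a shell command line."""
    argv = shlex.split(command_line)  # roughly what ends up in sys.argv
    return 'func(' + ' '.join(argv[1:]) + ')'


def parses(source):
    """True if source is syntactically valid Python (nothing is executed)."""
    try:
        compile(source, '<argv>', 'exec')
    except SyntaxError:
        return False
    return True


unquoted = call_source('script.py x="This is a string"')
quoted = call_source('script.py \'x="This is a string"\'')

print(unquoted, parses(unquoted))
print(quoted, parses(quoted))
```

In the unquoted form the shell consumes the double quotes, so Python sees bare words where a string literal was intended.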

Andrew
(e-mail address removed)
 
Manlio Perillo

Manlio said:
# add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
_options = ' '.join(sys.argv[1:])

def call(func):
    """
    Call func, passing to it the arguments from the command line
    """
    exec('func(' + _options + ')')

The only problem is that error messages are ugly.

And it's a huge security hole.

I know that executing arbitrary code is a security hole.
However, it is intended for 'personal' use.
This way, my scripts need only a single line of code
for option handling.
Later, for production code, one can use getopt.

What if I did


script.py "x=6)\
import os
os.system('ls -l')"

A solution is to use eval, but this does not handle keyword arguments.
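
On the eval point: evaluating the bare argument list as a tuple does fail on x=1, but evaluating the whole call expression accepts keyword arguments. A minimal sketch follows, in Python 3 syntax; call_with_eval and the explicit argv parameter are hypothetical names used only for this illustration. Note that this is no safer than exec against hostile input, since an expression can still call arbitrary functions; it only rules out statements.

```python
def call_with_eval(func, argv):
    """Evaluate 'func(<argv>)' as a single expression."""
    source = 'func(' + ' '.join(argv) + ')'
    # only 'func' and the builtins are visible to the evaluated expression
    return eval(source, {'func': func})


def main(a, b, *args, **kwargs):
    return a, b, args, kwargs


argv = ['10,', '20,', '100,', 'x=1']
print(call_with_eval(main, argv))

# the bare argument list is not a valid expression because of 'x=1'
try:
    eval(' '.join(argv))
    tuple_form_ok = True
except SyntaxError:
    tuple_form_ok = False
print(tuple_form_ok)
```
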
Even if not a security hole, it's tricky to handle the
combined shell and Python escaping rules

script.py x="This is a string"

won't work, while

script.py 'x="This is a string"'

should. Embedding ! and \escaped characters should be
even more fun.

I'm not a shell expert, but isn't the solution simply to use ' or ''?

script.py x='\n'



Thanks and regards Manlio Perillo
 
Ian Bicking

Manlio said:
Regards.
In the standard library there are two modules for command line
parsing: optparse and getopt.
In the Python Cookbook there is another simple method for parsing,
using a docstring.

However, sometimes (actually, in all my small scripts) one has a simple
function whose arguments are chosen on the command line.

For this reason I have written a simple module, optlist, that parses
the command line as if it were a function's argument list.

It is simpler to show an example:


import optlist


def main(a, b, *args, **kwargs):
    print 'a =', a
    print 'b =', b

    print 'args:', args
    print 'kwargs:', kwargs

optlist.call(main)


And on the shell:
shell: script.py 10, 20, 100, x=1

I think it would be better if this was called like
script 10 20 100 --x=1

With something like:

import sys

def parse_args(args):
    kw = {}
    pos = []
    for arg in args:
        if arg.startswith('--') and '=' in arg:
            name, value = arg.split('=', 1)
            # drop the leading '--' so the name is a usable keyword
            kw[name[2:]] = value
        else:
            pos.append(arg)
    return pos, kw

def call(func, args=None):
    if args is None:
        args = sys.argv[1:]
    pos, kw = parse_args(args)
    func(*pos, **kw)


This isn't exactly what you want, since you want Python expressions
(e.g., 10 instead of '10'). But adding expressions (using eval) should
be easy. Or, you can be more restrictive, and thus safer:

def coerce(arg_value):
    # int() and float() raise ValueError for strings they cannot parse
    try:
        return int(arg_value)
    except ValueError:
        pass
    try:
        return float(arg_value)
    except ValueError:
        pass
    return arg_value

Or a little less restrictive, allowing for dictionaries and lists, but
still falling back on strings:

def coerce(arg_value):
    # as above for int and float
    # [:1] rather than [0], so an empty argument does not raise IndexError
    if arg_value[:1] in ('[', '{'):
        return eval(arg_value)
    return arg_value
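
Putting the pieces together, and as a possible middle ground between the two coerce variants, ast.literal_eval from the standard library accepts Python literals (numbers, strings, lists, dicts, tuples and so on) without evaluating arbitrary expressions. The following is a sketch in Python 3 syntax under that assumption, not a drop-in replacement for optlist; the function names mirror the ones above, and returning the function's result from call is an addition made for the example.

```python
import ast
import sys


def coerce(arg_value):
    """Turn a literal such as 10, 1.5 or [1, 2] into a value, else keep the string."""
    try:
        return ast.literal_eval(arg_value)
    except (ValueError, SyntaxError, TypeError):
        # not a Python literal: fall back on the raw string
        return arg_value


def parse_args(args):
    """Split args into positional values and --name=value keywords."""
    kw = {}
    pos = []
    for arg in args:
        if arg.startswith('--') and '=' in arg:
            name, value = arg[2:].split('=', 1)
            kw[name] = coerce(value)
        else:
            pos.append(coerce(arg))
    return pos, kw


def call(func, args=None):
    if args is None:
        args = sys.argv[1:]
    pos, kw = parse_args(args)
    return func(*pos, **kw)


def main(a, b, *args, **kwargs):
    return a, b, args, kwargs


# equivalent of: script 10 20 100 --x=1 --names=[1,2] hello
result = call(main, ['10', '20', '100', '--x=1', '--names=[1,2]', 'hello'])
print(result)
```

One consequence of this choice is that words such as None or True are also read as literals rather than strings, which may or may not be what a given script wants.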
 
Alex Martelli

Andrew Dalke said:
And it's a huge security hole. What if I did

script.py "x=6)\
import os
os.system('ls -l')"

Not to defend exec (ugly thing it is), but in this case I'm not sure
what the security hole would be. If I enter that tricky commandline at
a shell prompt, it will be just as if I had executed the 'ls -l' at the
same shell prompt; weird, but where is the huge security hole? It's not
as if there were setuid shell scripts (is there...? I sure hope not!-).

IOW, what's the difference between that and the commandline

script.py 'x=6' && ls -l

for example? The latter is no security hole, after all.

I understand and agree with the other criticisms you extend to the OP's
code, but this one leaves me perplexed. exec is a huge security hole if
you're doing it on untrusted data, data supplied by somebody other than
the uid running the script; but how are commandline arguments
'untrusted'...?


Alex
 
Andrew Dalke

Alex:
Not to defend exec (ugly thing it is), but in this case I'm not sure
what the security hole would be.

In some sense we're both right, or wrong. Security depends on
the system. If someone saw that code, found it interesting, added
it to a script, which passed through a few people to someone
who uses it as part of a public service, then it's possible a
malicious user of that service may be able to execute arbitrary
code on the server.

If I enter that tricky commandline at
a shell prompt, it will be just as if I had executed the 'ls -l' at the
same shell prompt; weird, but where is the huge security hole? It's not
as if there were setuid shell scripts (is there...? I sure hope not!-).

In that environment there are fewer problems.

Alex:
but how are commandline arguments 'untrusted'...?

I had to think about that for a bit. Much of the work I do
(for money or otherwise) ends up being called by some sort
of web interface or is the interface to such code. Much of
the data I use can come from untrusted sources. So I've
developed a programming habit of being distrustful of any
data I get, even if it's from me.

As a consequence that also means I don't need to think about
the multiple levels in the system.

Andrew
(e-mail address removed)
 
Alex Martelli

Andrew Dalke said:
In some sense we're both right, or wrong. Security depends on

Yeah, I see your POV.

Andrew Dalke said:
I had to think about that for a bit. Much of the work I do
(for money or otherwise) ends up being called by some sort
of web interface or is the interface to such code. Much of
the data I use can come from untrusted sources. So I've
developed a programming habit of being distrustful of any
data I get, even if it's from me.

As a consequence that also means I don't need to think about
the multiple levels in the system.

Yes: never trusting any data anywhere is a safer habit to acquire, and
if you do get into that mindset your code will have fewer risks of
vulnerabilities. An "Only the paranoids survive" kind of stance.

However, it's an interesting characteristic of security that it is _not_
free: each security measure, precaution and stance carries a cost in
terms of convenience and productivity. In any given situation, there
_are_ upper limits to the total amount of such costs that can and will
be borne in the name of security. Thus, I believe it's _good_ for
security to be aware exactly of how much security you're buying, what
threat you are warding off and to what extent, with each security
measure you are taking -- a cost/benefit analysis.

Many practices that weaken security _also_ damage code quality in other
ways, for example by being prone to hard-to-reproduce,
hard-to-track-down bugs. The 'exec' statement that you criticized
surely falls into that category. For _those_ practices, I believe that
cost/benefit analysis may well be nearly superfluous: the old slogan
"quality is free" has some truth to it, in as much as the costs of
making good quality code today tend to be repaid with interest in
lowering maintenance costs in the future, enhancing reusability, etc.

So I think a "knee-jerk reaction" against some kinds of 'code smells' is
quite OK. More generally, I'm not sure "knee-jerk security" is a net
win, though. The classic example is forcing people to use
12-characters-long, randomly assigned passwords that they can't change:
inevitably they _will_ write those passwords down somewhere, creating
far worse security risks than if some cost-benefit analysis had been
done to find a reasonable compromise between security and practicality.

These are rather general considerations, musings if you will, about
security and development practices, and in no way meant to defend the
'exec' statement you were criticizing.


Alex
 
Andrew Dalke

Alex:
However, it's an interesting characteristic of security that it is _not_
free: each security measure, precaution and stance carries a cost in
terms of convenience and productivity.

Almost completely agreed, though I think there are cases where
a solution with better security doesn't have that tradeoff.

Using Python is one .. haven't had to worry much about stack
overflows, etc. and I've been much more productive. :)

Andrew
(e-mail address removed)
 
Alex Martelli

Andrew Dalke said:
Alex:

Almost completely agreed, though I think there are cases where
a solution with better security doesn't have that tradeoff.

Using Python is one .. haven't had to worry much about stack
overflows, etc. and I've been much more productive. :)

Right, of course: more generally, increasing code quality tends to
enhance security as a side effect (since many bugs might potentially be
subject to security exploits), as I indicated, yet it also tends to
lower lifetime costs (since maintenance costs are such a high part of
lifetime costs).

Using Python _is_ a case of "increasing code quality"!-)


Alex
 
Manlio Perillo

Manlio said:
# add spaces to avoid errors like: 1 2, 3 4 -> (12, 34)
_options = ' '.join(sys.argv[1:])

def call(func):
    """
    Call func, passing to it the arguments from the command line
    """
    exec('func(' + _options + ')')

The only problem is that error messages are ugly.

And it's a huge security hole. What if I did


script.py "x=6)\
import os
os.system('ls -l')"

I'm not sure (it does not work in the Windows 'shell'); have you run this
code? Doesn't it raise a SyntaxError?
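
One way to check without running anything is to build the source string that call() would hand to exec and pass it to compile. The sketch below (Python 3 syntax; the helper names are made up for this illustration) suggests that the example exactly as quoted is rejected with a SyntaxError, because the text after x=6) does not start a new statement and the ')' appended by call() is left unbalanced. That does not close the hole, though: an argument that uses ';' and leaves the final parenthesis open compiles without complaint.

```python
def call_source(argv):
    """Source string that optlist.call would pass to exec."""
    return 'func(' + ' '.join(argv) + ')'


def parses(source):
    """True if source is valid Python; compile() does not execute it."""
    try:
        compile(source, '<argv>', 'exec')
    except SyntaxError:
        return False
    return True


# the quoted argument with the backslash and line breaks kept as typed
as_typed = call_source(["x=6)\\\nimport os\nos.system('ls -l')"])
# the same argument if the shell drops the backslash-newline pair
# (assumed here to be what a POSIX shell does inside double quotes)
as_received = call_source(["x=6)import os\nos.system('ls -l')"])
# a variant that ends the call itself and lets call() supply the last ')'
variant = call_source(["x=6); import os; os.system('ls -l'"])

print(parses(as_typed))
print(parses(as_received))
print(parses(variant))
```
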
Even if not a security hole, it's tricky to handle the
combined shell and Python escaping rules

script.py x="This is a string"

won't work, while

script.py 'x="This is a string"'

should. Embedding ! and \escaped characters should be
even more fun.

Actually on Windows the right syntax is:
script.py x='"This is a string"'


Thanks and regards Manlio Perillo
 
