all() is slow?

  • Thread starter OKB (not okblacke)
Ian Kelly

The use of exec also results in (seemingly) arbitrary constraints on
the input. Like, why can't "--" be a name? Because exec? Is there some
other reason?

That's by design, not because of exec. The names are supposed to be
actual Python names, things that can be used to designate keyword
arguments ("MyTuple(foo=42)") or to retrieve elements using attribute
lookup ("my_tuple.foo"). Using "--" as a name would be a syntax error
in either of those cases.
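
A small sketch of that rule, assuming a reasonably recent CPython in which namedtuple validates field names and raises ValueError. The names MyTuple and Bad are only illustrative.

```python
from collections import namedtuple

# Valid identifiers work both as keyword arguments and as attributes.
MyTuple = namedtuple("MyTuple", "foo bar")
my_tuple = MyTuple(foo=42, bar=7)
foo_value = my_tuple.foo

# "--" is not an identifier, so namedtuple rejects it before any class is built.
try:
    namedtuple("Bad", "foo -- bar")
except ValueError as err:
    rejected = str(err)
else:
    rejected = None
```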

Cheers,
Ian
 
alex23

I don't really know anything about him or why people respect him, so I
have no reason to share your faith.

But you're happy to accept the opinions of random posters saying "exec
is evil"? (And it's really not a good idea to be proud of your
ignorance...)

Like, why can't "--" be a name?

Why would you ever want it to be?

I don't like the use of exec, and I don't like the justification (it
seems handwavy).

As opposed to your in-depth critique?

I pointed this out in a thread full of people saying
"never EVER use exec this way", so it's obviously not just me that
thinks this is awful.

No, instead you have a thread full of people happy to criticise
something for which they're providing no alternative implementation.
You can't exactly say _why_ it's bad, other than other people have
echoed it, but you won't actually do anything about it.

I think somebody will read it and think this is a good idea.

Just as I thought.
 
alex23

CPython is slow. It's a naive interpreter. There's
almost no optimization during compilation. Try PyPy
or Shed Skin.

Sometimes people need to understand the performance characteristics of
CPython because it's what they have to use. Pointing them at
alternative implementations isn't an answer.
 
Devin Jeanpierre

(And it's really not a good idea to be proud of your
ignorance...)

I wasn't bragging.
But you're happy to accept the opinions of random posters saying "exec
is evil"? [...]
As opposed to your in-depth critique? [...]
No, instead you have a thread full of people happy to criticise
something for which they're providing no alternative implementation.
You can't exactly say _why_ it's bad, other than other people have
echoed it, but you won't actually do anything about it.

I said it was bad because I found it difficult to read and it was
"weird". I also mentioned that it's conceivable that it has security
flaws, but that's not as big a deal. I believe I also said that it was
bad because it resulted in """arbitrary""" limitations in
functionality. So, yes, I did say why it's bad, and it's not just
because other people say so. My reasons are weak, but that's a
different story.

I also mentioned the alternative implementation, which uses a dict.
There was even already a patch submitted to make namedtuple work this
way, so I don't think I had to be too specific. R. Hettinger rejected
this patch, which was what I was referring to when I was referring to
handwaviness.

So, no.
Just as I thought.

Woo condescension.

Devin
 
Steven D'Aprano

No, instead you have a thread full of people happy to criticise
something for which they're providing no alternative implementation. You
can't exactly say _why_ it's bad, other than other people have echoed
it, but you won't actually do anything about it.

In fairness there are alternative implementations. They are not identical
to the version using exec. There is at least one use-case for *not* using
exec, even at the cost of functionality: a restricted Python environment
without exec.

On the other hand, a restricted Python without exec is not actually
Python.
 
Steven D'Aprano

I don't really know anything about him or why people respect him, so I
have no reason to share your faith.

That's fine.

I don't expect you to take my word on it (and why should you, I could be
an idiot or a sock-puppet), but you could always try googling for
"Raymond Hettinger python" and see what comes up. He is not some fly-by
Python coder who snuck some dubious n00b code into the standard library
when no-one was looking :)

The mere fact that it was accepted into the standard library should tell
you that core Python developers consider it an acceptable technique.
That's not to say the technique is uncontroversial. But there are still
people who dislike "x if flag else y" and @decorator syntax --
controversy, in and of itself, isn't necessarily a reason to avoid
certain idioms.


Are you familiar with the idea of "code smell"?

http://www.codinghorror.com/blog/2006/05/code-smells.html
http://www.joelonsoftware.com/articles/Wrong.html

I would agree that the use of exec is a code smell. But that doesn't mean
it is wrong or bad, merely that it needs a second look before accepting
it. There's a world of difference between "You MUST NOT use exec" and
"You SHOULD NOT use exec".

See RFC 2119 if you are unclear on the difference:

http://www.ietf.org/rfc/rfc2119.txt


Well. It reads fine in a certain sense, in that I can figure out what's
going on (although I have some troubles figuring out why the heck
certain things are in the code). The issue is that what's going on is
otherworldly: this is not a Python pattern, this is not a normal
approach. To me, that means it does not read fine.

There's nothing inside the template being exec'ed that couldn't be found
in non-exec code. So if you're having trouble figuring out parts of the
code, the presence of the exec is not the problem.

Having said that, dynamic code generation is well known for often being
harder to read than "ordinary" code. But then, pointers are hard too.
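
For readers who have not seen the pattern under discussion, here is a rough sketch of template-plus-exec code generation. It uses the Python 3 spelling of exec; TEMPLATE and make_class are hypothetical and far smaller than the real namedtuple source, and the sketch deliberately does no validation of the names it pastes into the template.

```python
# Hypothetical template: the class body is ordinary Python, only the
# type name and argument list are filled in as text.
TEMPLATE = '''
class {typename}(tuple):
    __slots__ = ()
    def __new__(cls, {args}):
        return tuple.__new__(cls, ({args},))
'''

def make_class(typename, field_names):
    """Build a tuple subclass by formatting the template and exec-ing it."""
    source = TEMPLATE.format(typename=typename, args=", ".join(field_names))
    namespace = {}
    exec(source, namespace)  # names are pasted in as source text, unchecked here
    return namespace[typename]

Point = make_class("Point", ["x", "y"])
p = Point(1, 2)
```

Because the names are substituted as source text, anything that is not an identifier either fails to compile or changes the meaning of the generated code, which is one reason a real implementation checks them first.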


The use of exec also results in (seemingly) arbitrary constraints on the
input. Like, why can't "--" be a name? Because exec? Is there some other
reason?

Because Python doesn't allow "--" to be an attribute name, and so
namedtuple doesn't let you try:

t = namedtuple("T", "foo -- bar")(1, 2, 3)
print(t.foo)
print(t.--)
print(t.bar)
 
Devin Jeanpierre

I don't expect you to take my word on it (and why should you, I could be
an idiot or a sock-puppet), but you could always try googling for
"Raymond Hettinger python" and see what comes up. He is not some fly-by
Python coder who snuck some dubious n00b code into the standard library
when no-one was looking :)

Alright, I know *something* about him. I knew he was a core developer,
and that he was responsible for namedtuple. I also discovered (when I
looked up his activestate profile) some other stuff he wrote. I don't
really know anything about him outside of that -- i.e. I have no idea
what parts of Python he's contributed things to in the past that could
make me go, "oh, wow, _he_ did that?" and so on. I don't really feel
like a few minutes research would give me the right feel, it generally
has to come up organically.

Anyway, if we step back, for a trustworthy developer who wrote
something seemingly-crazy, I should be willing to suspend judgement
until I see the relevant facts about something that the developer
might have and I don't. But he did give the facts,
( http://bugs.python.org/issue3974 again) , and I'm not convinced.

Things can go terribly wrong when abusing exec e.g.
http://www.gossamer-threads.com/lists/python/bugs/568206 . That
shouldn't ever happen with a function such as this. exec opens doors
that should not be opened without a really good reason, and those
reasons don't strike me that way.
The mere fact that it was accepted into the standard library should tell
you that core Python developers consider it an acceptable technique.

I've seen core developers rail against the namedtuple source code. In
fairness, I don't believe exec was the subject of the rant --
nonetheless its presence isn't evidence of general support, and even
if it were, my tastes have always differed from that of the core
developers.
That's not to say the technique is uncontroversial. But there are still
people who dislike "x if flag else y" and @decorator syntax --
controversy, in and of itself, isn't necessarily a reason to avoid
certain idioms.

I think there's somewhat of a difference in magnitude of objections
between using exec as a hacked-together macro system, and using "x if
flag else y" when if statements would do.

If the exec trick is reasonable, we should normalize it in the form of
a real, useful macro system, that can protect us against exec's many
flaws (code injection, accidental syntax errors, etc.) and tell future
programmers how to do this safely and in a semi-approvable way.
I would agree that the use of exec is a code smell. But that doesn't mean
it is wrong or bad, merely that it needs a second look before accepting
it. There's a world of difference between "You MUST NOT use exec" and
"You SHOULD NOT use exec".

Do I really need a second look? I see exec, I wonder what it's doing.
It isn't doing anything that couldn't be done subjectively better with
e.g. a dict, so I disapprove of the usage of exec.
There's nothing inside the template being exec'ed that couldn't be found
in non-exec code. So if you're having trouble figuring out parts of the
code, the presence of the exec is not the problem.

There's more overhead going back and forth to the template, and
there's related things that I can't be sure are because of exec or
because of design decisions, etc. It makes code reading more
challenging, even if it's still possible. That said, sure, some of
these are problems with whatever else he's done.
Having said that, dynamic code generation is well known for often being
harder to read than "ordinary" code. But then, pointers are hard too.

And on the other other hand, Python lacks explicit support for both
pointers and code generation (unless you count strings and ctypes).
Because Python doesn't allow "--" to be an attribute name, and so
namedtuple doesn't let you try:

t = namedtuple("T", "foo -- bar")(1, 2, 3)
print(t.foo)
print(t.--)
print(t.bar)

'--' is a valid attribute name on virtually any object that supports
attribute setting (e.g. function objects). Of course, you need to use
setattr() and getattr(). Is this really the reason, or is it a
limitation caused primarily by the usage of exec and the need to
prevent code injection? If somebody added this feature later on, would
this create a security vulnerability in certain projects that used
namedtuple in certain ways?
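
A minimal illustration of the setattr()/getattr() point, assuming CPython, where function attributes live in an ordinary dict. The function name func is made up for the example, and other implementations may behave differently.

```python
def func():
    pass

# CPython accepts any string as an attribute name here; no identifier check is made.
setattr(func, "--", 123)
value = getattr(func, "--")

# The attribute is simply a key in the function's __dict__.
stored = "--" in func.__dict__
```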

Devin
 
gene heskett

Alright, I know *something* about him. I knew he was a core developer,
and that he was responsible for namedtuple. I also discovered (when I
looked up his activestate profile) some other stuff he wrote. I don't
really know anything about him outside of that -- i.e. I have no idea
what parts of Python he's contributed things to in the past that could
make me go, "oh, wow, _he_ did that?" and so on. I don't really feel
like a few minutes research would give me the right feel, it generally
has to come up organically.

Anyway, if we step back, for a trustworthy developer who wrote
something seemingly-crazy, I should be willing to suspend judgement
until I see the relevant facts about something that the developer
might have and I don't. But he did give the facts,
( http://bugs.python.org/issue3974 again) , and I'm not convinced.

Things can go terribly wrong when abusing exec e.g.
http://www.gossamer-threads.com/lists/python/bugs/568206 . That
shouldn't ever happen with a function such as this. exec opens doors
that should not be opened without a really good reason, and those
reasons don't strike me that way.

If this Python 'exec' essentially duplicates the bash version, then I
have found it quite useful. It was taught to me several years ago by
another teacher who was a long-time Solaris fan, and if it were to go
away, I have several bash scripts running here right now that would
require major rewrites.

Well, it's certainly not a new concept. All the major 'shell interpreters'
have it, why not Python?


Cheers, Gene
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: said:
Ever heard of .cshrc?
That's a city in Bosnia. Right?
(Discussion in comp.os.linux.misc on the intuitiveness of commands.)
 
Ethan Furman

Devin said:
Well. It reads fine in a certain sense, in that I can figure out
what's going on (although I have some troubles figuring out why the
heck certain things are in the code). The issue is that what's going
on is otherworldly: this is not a Python pattern, this is not a normal
approach. To me, that means it does not read fine.

Certainly it's a Python pattern -- it's what you do to dynamically
generate code.

The use of exec also results in (seemingly) arbitrary constraints on
the input. Like, why can't "--" be a name? Because exec? Is there some
other reason?

'--' not being allowed for a name has *nothing* to do with exec, and
everything to do with `--` not being a valid Python identifier.

'--' is a valid attribute name on virtually any object that supports
attribute setting (e.g. function objects). Of course, you need to use
setattr() and getattr(). Is this really the reason, or is it a
limitation caused primarily by the usage of exec and the need to
prevent code injection? If somebody added this feature later on, would
this create a security vulnerability in certain projects that used
namedtuple in certain ways?

So you think

somevar = getattr(my_named_tuple, '--')

is more readable than

somevar = my_named_tuple.spam

?

~Ethan~
 
Terry Reedy

'--' is a valid attribute name on virtually any object that supports
attribute setting (e.g. function objects).

ob.-- is not valid Python because '--' is not a name.
Of course, you need to use setattr() and getattr().

I consider the fact that CPython's setattr accepts non-name strings to
be a bit of a bug. Or if you will, leniency for speed. (A unicode name
check in Py3 would be much more expensive than an ascii name check in
Py2.) I would consider it legitimate for another implementation to only
accept names and to use a specialized name_dict for attribute dictionaries.

So I consider it quite legitimate for namedtuple to require real names
for the fields. The whole point is to allow ob.name access to tuple
members. Someone who plans to use set/getattr with arbitrary strings
should just use a dict instead of a tuple.
 
OKB (not okblacke)

John said:
CPython is slow. It's a naive interpreter. There's
almost no optimization during compilation. Try PyPy
or Shed Skin.

PyPy is interesting, but I use various libraries that make use of C
extension modules. I'm not going to compile them all myself, which is
apparently what I would need to do for PyPy. PyPy or other
implementations won't work for me unless they're completely drop-in
replacements for the interpreter.

--
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is
no path, and leave a trail."
--author unknown
 
Devin Jeanpierre

'--' not being allowed for a name has *nothing* to do with exec, and
everything to do with `--` not being a valid Python identifier.

The only reason valid python identifiers come into it at all is
because they get pasted into a string where identifiers would go, and
that string is passed to exec().

So really, does it have "nothing" to do with exec? Or does your
argument eventually boil down to the use of exec?
is more readable than

Of course not. I do, however, think that it's conceivable that I'd
want to key a namedtuple by an invalid identifier, and to do that,
yes, I'd need to use getattr().

Devin
 
Ian Kelly

Of course not. I do, however, think that it's conceivable that I'd
want to key a namedtuple by an invalid identifier, and to do that,
yes, I'd need to use getattr().

Care to give a real use case? You could even go a step further and
use, say, arbitrary ints as names if you're willing to give up
getattr() and use "ob.__class__.__dict__[42].__get__(ob,
ob.__class__)" everywhere instead. The fact that somebody might
conceivably want to do this doesn't make it a good idea, though.

I do find it a bit funny that you're criticizing a somewhat smelly
implementation detail by complaining that it doesn't support an
equally smelly feature.

Cheers,
Ian
 
Ethan Furman

Devin said:
The only reason valid python identifiers come into it at all is
because they get pasted into a string where identifiers would go, and
that string is passed to exec().

So really, does it have "nothing" to do with exec? Or does your
argument eventually boil down to the use of exec?

As I recall the big reason for namedtuples was things like

sys.version_info[1] # behind door number one is...

being much more readable as

sys.version_info.minor

In other words, the tuple offsets are named -- hence, namedtuples. And
only valid identifiers will work.
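
A short sketch of the readability point. The Version type below is a made-up stand-in for illustration; sys.version_info itself is provided by the interpreter.

```python
import sys
from collections import namedtuple

# The same element, reached by position and by name.
by_index = sys.version_info[1]
by_name = sys.version_info.minor

# A hypothetical namedtuple with the same kind of named offsets.
Version = namedtuple("Version", "major minor micro")
v = Version(3, 2, 1)
minor = v.minor  # same value as v[1]
```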

So, no, it has nothing to do with 'exec', and everything to do with the
problem namedtuple was designed to solve.

~Ethan~
 
Steven D'Aprano

Care to give a real use case?

A common use-case is for accessing fields from an external data source,
using the same field names. For example, you might have a database with a
field called "class", or a CSV file with columns "0-10", "11-20", etc.

Personally, I wouldn't bother using attributes to access fields, I'd use
a dict, but some people think it's important to use attribute access.
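
Two ways to cope with such field names, sketched under the assumption of a Python version whose namedtuple accepts the rename flag (2.7/3.1 or later). The column names and values are invented for the example.

```python
from collections import namedtuple

columns = ["class", "0-10", "name"]  # made-up external field names
values = ["A", 3, "widget"]

# Option 1: a dict keeps the original names as keys.
row = dict(zip(columns, values))

# Option 2: rename=True swaps invalid names for positional ones (_0, _1, ...).
Row = namedtuple("Row", columns, rename=True)
record = Row(*values)
```

With the dict the original names survive; with rename=True attribute access works, but "class" and "0-10" are only reachable as record._0 and record._1.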

You could even go a step further and use,
say, arbitrary ints as names if you're willing to give up getattr() and
use "ob.__class__.__dict__[42].__get__(ob, ob.__class__)" everywhere
instead. The fact that somebody might conceivably want to do this
doesn't make it a good idea, though.

Obviously you would write a helper function rather than repeat that mess
in-line everywhere.
 
Steven D'Aprano

The only reason valid python identifiers come into it at all is because
they get pasted into a string where identifiers would go, and that
string is passed to exec().

That is patently untrue. If you were implementing namedtuple without
exec, you would still (or at least you *should*) prevent the user from
passing invalid identifiers as attribute names. What's the point of
allowing attribute names you can't actually *use* as attribute names?

You could remove the validation, allowing users to pass invalid field
names, but that would be a lousy API. If you want field names that aren't
valid identifiers, the right solution is a dict, not attributes.

Here's a re-implementation using a metaclass:

http://pastebin.com/AkG1gbGC

and a diff from the Python bug tracker removing exec from namedtuple:

http://bugs.python.org/file11608/new_namedtuples.diff


You will notice both of them keep the field name validation.
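
To make the shape of an exec-free version concrete, here is a rough sketch. It is neither the pastebin metaclass nor the tracker patch; simple_namedtuple is a hypothetical helper that builds the class with type() and property(), and it uses the Python 3 str.isidentifier() check.

```python
from keyword import iskeyword
from operator import itemgetter

def simple_namedtuple(typename, field_names):
    """Hypothetical exec-free factory; still validates every name."""
    field_names = tuple(field_names)
    for name in (typename,) + field_names:
        if not name.isidentifier() or iskeyword(name):
            raise ValueError("not a valid identifier: %r" % name)
    if len(set(field_names)) != len(field_names):
        raise ValueError("duplicate field names")

    def __new__(cls, *args):
        if len(args) != len(field_names):
            raise TypeError("expected %d arguments" % len(field_names))
        return tuple.__new__(cls, args)

    namespace = {"__slots__": (), "_fields": field_names, "__new__": __new__}
    # One read-only property per field, each fetching a fixed tuple offset.
    for index, name in enumerate(field_names):
        namespace[name] = property(itemgetter(index))
    return type(typename, (tuple,), namespace)

Point = simple_namedtuple("Point", ["x", "y"])
p = Point(1, 2)
```

Even without exec, the validation loop stays, because a field that cannot be written as p.name defeats the purpose of the type.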
 
OKB (not okblacke)

Devin said:
The only reason valid python identifiers come into it at all is
because they get pasted into a string where identifiers would go, and
that string is passed to exec().

The whole point of named tuples is to be able to access the members
via attribute access as in "obj.attr". Things like "obj.--" are not
valid Python syntax, so you can't use "--" as the name of a namedtuple
field. Yes, you can do "getattr(obj, '--')" if you want, but it's quite
reasonable for namedtuple to refrain from catering to that sort of
perverse usage.

--
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is
no path, and leave a trail."
--author unknown
 
Steven D'Aprano

Then why doesn't Python do this anywhere else? e.g. why can I
setattr(obj, 'a#b') when obj is any other mutable type?

That is implementation-specific behaviour and not documented behaviour
for Python. If you try it in (say) IronPython or Jython, you may or may
not see the same behaviour.

The docs for getattr state:

getattr(x, 'foobar') is equivalent to x.foobar


which implies that getattr(x, 'a!b') should be equivalent to x.a!b which
will give a syntax error. The fact that CPython does less validation is
arguably a bug and not something that you should rely on: it is *not* a
promise of the language.

As Terry Reedy already mentioned, the namespaces used in classes and
instances are ordinary generic dicts, which don't perform any name
validation. That's done for speed. Other implementations may use
namespaces that enforce legal names for attributes, and __slots__ already
does:
...     __slots__ = ['a', 'b!']
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
    __slots__ must be identifiers
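
The contrast can be reproduced in a script, assuming CPython 3 (the exact wording of the error differs from the Python 2 traceback shown here). The class names Plain and Slotted are invented for the sketch.

```python
class Plain(object):
    pass

obj = Plain()
# Dict-backed instances accept the name without any identifier check.
setattr(obj, "a#b", 1)
via_getattr = getattr(obj, "a#b")

# __slots__ does validate: the class statement itself fails.
try:
    class Slotted(object):
        __slots__ = ["a", "b!"]
except TypeError as err:
    slots_error = str(err)
else:
    slots_error = None
```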


[...]
To go off on another tangent, though, I don't really understand how you
guys can think this is reasonable, though. I don't get this philosophy
of restricting inputs that would otherwise be perfectly valid

But they aren't perfectly valid. They are invalid inputs. Just because
getattr and setattr in CPython allow you to create attributes with
invalid names doesn't mean that everything else should be equally as
slack.
 
Devin Jeanpierre

which implies that getattr(x, 'a!b') should be equivalent to x.a!b

No, it does not. The documentation states equivalence for two
particular values, and there is no way to deduce truth for all cases
from that. In fact, if it _was_ trying to say it was true for any
attribute value, then your example would be proof that the
documentation is incorrect, since CPython breaks that equivalence.

Devin

 
alex23

No, it does not. The documentation states equivalence for two
particular values

It states equivalence for two values _based on the name_.

"If the string is the name of one of the object’s attributes, the
result is the value of that attribute. For example, getattr(x,
'foobar') is equivalent to x.foobar."

The string 'a!b' is the name of the attribute, ergo getattr(x, 'a!b')
_is_ x.a!b. If x.a!b isn't valid CPython, then etc.
CPython breaks that equivalence

So you're outright ignoring the comments that this behaviour is to
make CPython more performant?
 
