Usefulness of the "not in" operator

Steven D'Aprano

Tim said:
Yes, we do, because I'm always reading code from other people that didn't
follow that rule.

No no no, they *do* follow the rule. They just have a better memory for
operator precedence than you do :)
 
Nobody

Thanks - nice clear explanation. Appreciated. For an encore, can you
give an example of where this is actually useful? It seems a pretty
narrow utility.

It's useful insofar as it allows you to define "numbers" given nothing
other than abstraction and application, which are the only operations
available in the lambda calculus.

The particular formulation makes it easy to define addition, which is
just composition:

(f^(M+N))(x) = (f^M)((f^N)(x))

I.e.:

def church_add(a, b):
    return lambda f, x: a(f, b(f, x))
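
To make the formula concrete, here is a minimal sketch. The helpers `church` and `to_int` are names made up for this illustration and are not part of the original post; `church_add` is the definition given above.

```python
def church(n):
    """Build the Church numeral for n: a function applying f to x n times."""
    def numeral(f, x):
        for _ in range(n):
            x = f(x)
        return x
    return numeral


def church_add(a, b):
    # Apply f "b times", then "a times" on top of that result.
    return lambda f, x: a(f, b(f, x))


def to_int(numeral):
    """Hypothetical helper: decode a numeral by counting applications."""
    return numeral(lambda k: k + 1, 0)


five = church_add(church(2), church(3))
print(to_int(five))  # 5
```

Decoding with "add one, starting from zero" is just one convenient choice of f and x; any function and starting value would do.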
 
Alec Taylor

As you see, this way of writing constants gives you much more poetic
freedom than in other programming languages.

It's useful insofar as it allows you to define "numbers" given nothing
other than abstraction and application, which are the only operations
available in the lambda calculus.

Heh. This is why mathematicians ALWAYS make use of previously-defined
objects! In pure lambda calculus, constants are even more painful than
in SPL[1]...

ChrisA
[1] http://shakespearelang.sourceforge..../shakespeare.html#SECTION00045000000000000000
 
Jussi Piitulainen

Chris said:
But this translation implies looking at the result and ascertaining
the state, which is less appropriate to a programming language. It's
more like:

"If you found that you were able to start the car, the key must have
been in the ignition."

and is thus quite inappropriate to the imperative style. A
functional language MAY be able to use this style, but Python wants
to have the condition and then the action.

This is not in an imperative context. The context is (generalized)
Boolean expressions, where there should not be any action, just
expressions returning values that are combined to produce a
(generalized) Boolean value.

Defined order of evaluation and short-circuiting complicate the
picture, but as a matter of style, I think there should not be any
action part in such an expression. Usually.

And "not in" is fine as far as I am concerned.
 
Alexander Kapps

The Church numeral for N is a function of two arguments which applies its
first argument N times to its second, i.e. (f^N)(x) = f(f(...(f(x))...)).

[SNIP]

Thanks! That's a lot more understandable than Wikipedia. Some
brain-food for the winter. ;-)
 
DevPlayer

Sure, but note that you can also reformulate != using not and ==, <
using not and >=, etc. Operators like "not in" and "is not" should
really be considered single tokens, even though they seem to use "not".
And I think they are really convenient.

-- Alain.

1. I thought "x not in y" was later added as syntax sugar for "not x in y",
meaning they used the same set of tokens. (Too lazy to check the actual
tokens.)

2. "x not in y" ==>> (True if y.__call__(x) else False)

class Y(object):
    def __contains__(self, x):
        for item in self:
            if x == item:
                return True
        return False

And if you wanted "x not in y" to be a different token you'd have to ADD

class Y(object):
    def __not_contained__(self, x):
        for item in self:
            if x == item:
                return False
        return True

AND with __not_contained__() you'd always have to iterate the entire
sequence to make sure even the last item doesn't match.

SO with one token "x not in y" you DON'T have to iterate through the
entire sequence, thus it is more efficient.
 
Steven D'Aprano

1. I thought "x not in y" was later added as syntax sugar for "not x in y",
meaning they used the same set of tokens. (Too lazy to check the actual
tokens.)

Whether the compiler has a special token for "not in" is irrelevant.
Perhaps it uses one token, or two, or none at all because a pre-processor
changes "x not in y" to "not x in y". That's an implementation detail.
What's important is whether it is valid syntax or not, and how it is
implemented.

As it turns out, the Python compiler does not distinguish the two forms:
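
>>> from dis import dis
>>> dis(compile('x not in y', '', 'single'))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (y)
              6 COMPARE_OP               7 (not in)
              9 PRINT_EXPR
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
>>> dis(compile('not x in y', '', 'single'))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (y)
              6 COMPARE_OP               7 (not in)
              9 PRINT_EXPR
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE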
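
The listing above comes from an old interpreter. For anyone who wants to repeat the check on a current one, a rough sketch follows. `opnames` is a helper name invented here, and the opcodes CPython emits differ from version to version, so no particular bytecode is assumed; only the evaluated results are shown.

```python
import dis


def opnames(source):
    """Hypothetical helper: list the opcode names emitted for an expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


for source in ("x not in y", "not x in y"):
    print(source, "->", opnames(source))

# Whatever the bytecode looks like, both spellings evaluate identically.
namespace = {"x": 3, "y": [1, 2, 3]}
print(eval("x not in y", namespace), eval("not x in y", namespace))  # False False
```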


Also for what it is worth, "x not in y" goes back to at least Python 1.5,
and possibly even older. (I don't have any older versions available to
test.)


2. "x not in y" ==>> (True if y.__call__(x) else False)

y.__call__ is irrelevant. But for what it's worth:

(1) Instead of writing "y.__call__(x)", write "y(x)"

(2) Instead of writing "True if blah else False", write "bool(blah)".

class Y(object):
    def __contains__(self, x):
        for item in self:
            if x == item:
                return True
        return False

You don't have to define a __contains__ method if you just want to test
each item sequentially. All you need is to obey the sequence protocol and
define a __getitem__ that works in the conventional way:

>>> class Test:
...     def __init__(self, args):
...         self._data = list(args)
...     def __getitem__(self, i):
...         return self._data[i]
...
>>> t = Test("abcde")
[SNIP]
False

Defining a specialist __contains__ method is only necessary for non-
sequences, or if you have some fast method for testing whether an item is
in the object quickly. If all you do is test each element one at a time,
in numeric order, don't bother writing __contains__.
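
A self-contained sketch of that fallback, for the record. The class name `Seq` is invented for this example. With no `__contains__` and no `__iter__`, the `in` operator calls `__getitem__` with 0, 1, 2, ... and stops either at a match or when `IndexError` is raised.

```python
class Seq:
    """Minimal sequence: only __getitem__, no __contains__ or __iter__."""

    def __init__(self, args):
        self._data = list(args)

    def __getitem__(self, i):
        # list indexing raises IndexError past the end, which ends the scan.
        return self._data[i]


t = Seq("abcde")
print("c" in t)      # True
print("x" in t)      # False
print("x" not in t)  # True
```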

And if you wanted "x not in y" to be a different token you'd have to ADD

Tokens are irrelevant. "x not in y" is defined to be the same as "not x
in y" no matter what. You can't define "not in" to do something
completely different.
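
A small sketch of what that means in practice. `Evens` is a made-up class for illustration: it defines `__contains__` plus a `__not_contained__` method like the one proposed, and the interpreter never calls the latter.

```python
class Evens:
    """Illustrative container: "contains" every even integer."""

    def __init__(self):
        self.calls = 0

    def __contains__(self, x):
        self.calls += 1
        return x % 2 == 0

    def __not_contained__(self, x):
        # Python has no such hook, so "not in" never reaches this method.
        raise AssertionError("never called by the interpreter")


evens = Evens()
print(3 not in evens)  # True
print(not 3 in evens)  # True
print(evens.calls)     # 2
```

Both spellings go through `__contains__` once each and negate the result.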

class Y(object):
    def __not_contained__(self, x):
        for item in self:
            if x == item:
                return False
        return True

AND with __not_contained__() you'd always have to iterate the entire
sequence to make sure even the last item doesn't match.

SO with one token "x not in y" you DON'T have to iterate through the
entire sequence, thus it is more efficient.

That's not correct.
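
One way to see why, as a rough sketch: count the comparisons. `Counting` and `comparisons_for` are names invented for this example. Either test stops at the first match, and either has to scan everything when the item is absent, so neither spelling (nor a separate method) saves any work.

```python
class Counting:
    """Illustrative container that counts equality comparisons."""

    def __init__(self, items):
        self._items = list(items)
        self.comparisons = 0

    def __contains__(self, x):
        for item in self._items:
            self.comparisons += 1
            if x == item:
                return True  # early exit on the first match
        return False


def comparisons_for(test):
    """Run one membership test on a fresh container and report its cost."""
    c = Counting("abcde")
    test(c)
    return c.comparisons


print(comparisons_for(lambda c: "a" in c))      # 1
print(comparisons_for(lambda c: "a" not in c))  # 1
print(comparisons_for(lambda c: "z" in c))      # 5
print(comparisons_for(lambda c: "z" not in c))  # 5
```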
 
DevPlayer

Stated in response to the OP wanting a separate token for "not in" versus
"is not".

Whether the compiler has a special token for "not in" is irrelevant.

I don't know.

Perhaps it uses one token, or two, or none at all because a
pre-processor changes "x not in y" to "not x in y". That's
an implementation detail.

I agree.

What's important is whether it is valid syntax or not, and how it is
implemented.

I agree.

As it turns out, the Python compiler does not distinguish the two forms:

>>> from dis import dis
>>> dis(compile('x not in y', '', 'single'))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (y)
              6 COMPARE_OP               7 (not in)
              9 PRINT_EXPR
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
>>> dis(compile('not x in y', '', 'single'))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (y)
              6 COMPARE_OP               7 (not in)
              9 PRINT_EXPR
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

So cool! Thanks for showing how to do that.

I tried to say implementing a separate method was not efficient.

Also for what it is worth, "x not in y" goes back to at least Python 1.5,
and possibly even older. (I don't have any older versions available to
test.)

So "not in" was added as an alternative (just a long time ago).
I too am glad they added it.

(2) Instead of writing "True if blah else False", write "bool(blah)".

Good tip! I like.

You don't have to define a __contains__ method if you just want to test
each item sequentially. All you need is to obey the sequence protocol and
define a __getitem__ that works in the conventional way:

I didn't intend to show how to implement __contains__ using "==" and
__not_contains__ using "<>" in Python, but to show that Python doesn't
benefit from a not-in loop the way that, for example, assembly language
does from its loop instructions (x86 LOOPE/LOOPZ vs LOOPNZ/LOOPNE).

>>> class Test:
...     def __init__(self, args):
...         self._data = list(args)
...     def __getitem__(self, i):
...         return self._data[i]
...
>>> t = Test("abcde")
[SNIP]
False

Another new thing for me.

Defining a specialist __contains__ method is only necessary for non-
sequences, or if you have some fast method for testing whether an item is
in the object quickly. If all you do is test each element one at a time,
in numeric order, don't bother writing __contains__.


Tokens are irrelevant. "x not in y" is defined to be the same as "not x
in y" no matter what. You can't define "not in" to do something
completely different.

I agree they are not implemented differently.
I agree that they shouldn't be implemented differently.
I disagree that they cannot be implemented differently. I think they can.
But I see no reason to.

That's not correct.

Steven
I tried to prove my point and failed, and instead proved (to myself) that
you are correct. It is not more efficient. Also, I should have used "if
<> y: continue" to make the point better, but it wouldn't have mattered.
I still would have been wrong.

But I did walk away from this topic with some goodie tips. Thanks
Steven.
 
