Default scope of variables

Dave Angel · Jul 5, 2013

Anyway, none of the calculations that have been given takes into account
the fact that names can be /less/ than one million characters long.

Not in *my* code they don't!!!

*wink*

The actual number of non-empty strings of length at most 1000000
characters, that consist only of ASCII letters, digits or underscores,
and that don't start with a digit, is

sum(53*63**i for i in range(1000000)) == 53*(63**1000000 - 1)//62
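
As a quick sanity check of that closed form, here is a small sketch that compares the sum, the geometric-series formula, and a brute-force enumeration for very short names. The function names are made up for illustration, and the enumeration is only practical for tiny maximum lengths.

```python
import string
from itertools import product

# Characters allowed in an ASCII-only identifier.
FIRST_CHARS = string.ascii_letters + "_"     # 53 choices for the first character
LATER_CHARS = FIRST_CHARS + string.digits    # 63 choices for every later character


def count_by_sum(max_len):
    """Add up 53 * 63**(k-1) names for each length k = 1..max_len."""
    return sum(53 * 63**i for i in range(max_len))


def count_closed_form(max_len):
    """Geometric-series closed form of the same sum."""
    return 53 * (63**max_len - 1) // 62


def count_by_enumeration(max_len):
    """Brute force: build every candidate name and let Python judge it."""
    total = 0
    for length in range(1, max_len + 1):
        for first in FIRST_CHARS:
            for rest in product(LATER_CHARS, repeat=length - 1):
                if (first + "".join(rest)).isidentifier():
                    total += 1
    return total


print(count_by_sum(2), count_closed_form(2), count_by_enumeration(2))
```

Note that str.isidentifier() also accepts keywords such as "if", which matches the counting convention used in the formula (it counts strings, not usable variable names).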

I take my hat off to you sir, or possibly madam. That is truly an inspired
piece of pedantry.

It's perhaps worth mentioning that some non-ascii characters are allowed
in identifiers in Python 3, though I don't know which ones.

PEP 3131 describes the rules:

http://www.python.org/dev/peps/pep-3131/

For example:

py> import unicodedata as ud
py> for c in 'éæ¥µ¿μЖᚃ‰⇄∞':
...     print(c, ud.name(c), c.isidentifier(), ud.category(c))
...
é LATIN SMALL LETTER E WITH ACUTE True Ll
æ LATIN SMALL LETTER AE True Ll
¥ YEN SIGN False Sc
µ MICRO SIGN True Ll
¿ INVERTED QUESTION MARK False Po
μ GREEK SMALL LETTER MU True Ll
Ж CYRILLIC CAPITAL LETTER ZHE True Lu
ᚃ OGHAM LETTER FEARN True Lo
‰ PER MILLE SIGN False Po
⇄ RIGHTWARDS ARROW OVER LEFTWARDS ARROW False So
∞ INFINITY False Sm

The isidentifier() method will let you weed out the characters that
cannot start an identifier. But there are other groups of characters
that can appear after the starting "letter". So a more reasonable
sample might be something like:

py> import unicodedata as ud
py> for c in 'éæ¥µ¿μЖᚃ‰⇄∞':
...     xc = "X" + c
...     print(c, ud.name(c), xc.isidentifier(), ud.category(c))
...

In particular,
http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers

has a definition for id_continue that includes several interesting
categories. I expected the non-ASCII digits, but there's other stuff
there, like "nonspacing marks" that are surprising.

I'm pretty much speculating here, so please correct me if I'm way off.
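
To make the start/continue distinction concrete, here is a minimal sketch along the same lines. The helper name and the sample characters are my own choices for illustration; the "X" prefix trick is the one used above.

```python
import unicodedata as ud


def start_and_continue(c):
    """Return (can_start, can_continue) for a single character c."""
    can_start = c.isidentifier()
    # Prefix an ASCII letter so c is judged as a non-initial character.
    can_continue = ("X" + c).isidentifier()
    return can_start, can_continue


samples = [
    "\u00e9",  # LATIN SMALL LETTER E WITH ACUTE (Ll)
    "\u0663",  # ARABIC-INDIC DIGIT THREE (Nd)
    "\u0301",  # COMBINING ACUTE ACCENT (Mn)
    "\u203f",  # UNDERTIE (Pc)
    "\u00a5",  # YEN SIGN (Sc)
]
for c in samples:
    print(ud.name(c), ud.category(c), start_and_continue(c))
```

The non-ASCII digit, the combining accent and the connector punctuation can all follow a starting letter but cannot start an identifier themselves, while the currency sign is rejected in both positions.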

Joshua Landau · Jul 5, 2013

The isidentifier() method will let you weed out the characters that cannot
start an identifier. But there are other groups of characters that can
appear after the starting "letter". So a more reasonable sample might be
something like: ....
In particular,
http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers

has a definition for id_continue that includes several interesting
categories. I expected the non-ASCII digits, but there's other stuff there,
like "nonspacing marks" that are surprising.

I'm pretty much speculating here, so please correct me if I'm way off.

For my calculation above, I used this code I quickly mocked up:

import unicodedata as unidata
from sys import maxunicode
from collections import defaultdict
from itertools import chain

def get():
    xid_starts = set()
    xid_continues = set()

    id_start_categories = "Lu, Ll, Lt, Lm, Lo, Nl".split(", ")
    id_continue_categories = "Mn, Mc, Nd, Pc".split(", ")

    characters = (chr(n) for n in range(maxunicode + 1))

    print("Making normalized characters")

    normalized = (unidata.normalize("NFKC", character) for character in characters)
    normalized = set(chain.from_iterable(normalized))

    print("Assigning to categories")

    for character in normalized:
        category = unidata.category(character)

        if category in id_start_categories:
            xid_starts.add(character)
        elif category in id_continue_categories:
            xid_continues.add(character)

    return xid_starts, xid_continues

Please note that "xid_continues" actually represents "xid_continue - xid_start".
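
The NFKC step in that code matters because Python normalizes identifiers to NFKC while parsing (this is part of PEP 3131). A small sketch of the effect, using a sample name I picked for illustration:

```python
import unicodedata as ud

decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
composed = "caf\u00e9"      # precomposed LATIN SMALL LETTER E WITH ACUTE

# The combining accent is a nonspacing mark.
mark_category = ud.category("\u0301")

# NFKC folds the two-character sequence into the precomposed letter.
same_after_nfkc = ud.normalize("NFKC", decomposed) == composed

# Assigning to the decomposed spelling binds the composed (normalized) name.
namespace = {}
exec(decomposed + " = 1", namespace)
bound_as_composed = composed in namespace

print(mark_category, same_after_nfkc, bound_as_composed)
```

So two source spellings that look identical on screen end up as the same variable, which is one reason to count normalized characters rather than raw code points.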

Joshua Landau · Jul 5, 2013

In particular,
http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers

has a definition for id_continue that includes several interesting
categories. I expected the non-ASCII digits, but there's other stuff there,
like "nonspacing marks" that are surprising.

"nonspacing marks" are just accents, so it makes sense *á* mon avis.

Rotwang · Jul 5, 2013

Anyway, none of the calculations that have been given takes into account
the fact that names can be /less/ than one million characters long.

Not in *my* code they don't!!!

*wink*

The actual number of non-empty strings of length at most 1000000
characters, that consist only of ASCII letters, digits or underscores,
and that don't start with a digit, is

sum(53*63**i for i in range(1000000)) == 53*(63**1000000 - 1)//62

I take my hat off to you sir, or possibly madam. That is truly an inspired
piece of pedantry.

FWIW, I'm male.

PEP 3131 describes the rules:

http://www.python.org/dev/peps/pep-3131/

Thanks.

Neil Cerutti · Jul 5, 2013

On 07/04/2013 01:32 AM, Steven D'Aprano wrote:

Well, if I ever have more than 63,000,000 variables[1] in a
function, I'll keep that in mind.

[1] Based on empirical evidence that Python supports names
with length at least up to one million characters long, and
assuming that each character can be an ASCII letter, digit or
underscore.

Well, the number wouldn't be 63,000,000. Rather it'd be
63**1000000

You should really count only the ones somebody might actually
want to use. That's a much, much smaller number, though still
plenty big.

Inner scopes (I don't remember the official name) are a great
feature of C++. They're not the right feature for Python, though,
since Python doesn't have deterministic destruction. They wouldn't
buy much except for namespace tidiness.

for x in range(4):
    print(x)
print(x)  # Vader NOoooooOOOOOO!!!

Python provides deterministic destruction with a different
feature.

Chris Angelico · Jul 5, 2013

Python provides deterministic destruction with a different
feature.

You mean 'with'? That's not actually destruction, it just does one of
the same jobs that deterministic destruction is used for (RAII). It
doesn't, for instance, have any influence on memory usage, nor does it
ensure the destruction of the object's referents. But yes, it does
achieve (one of) the most important role(s) of destruction.

ChrisA

Neil Cerutti · Jul 5, 2013

You mean 'with'? That's not actually destruction, it just does
one of the same jobs that deterministic destruction is used for
(RAII). It doesn't, for instance, have any influence on memory
usage, nor does it ensure the destruction of the object's
referents. But yes, it does achieve (one of) the most important
role(s) of destruction.

Yes, thanks. I meant the ability to grab and release a
resource deterministically.
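
A minimal sketch of that distinction, using a made-up Resource class (not any real library): the `with` block releases the resource at a well-defined point, but the object and the name bound to it both survive the block.

```python
class Resource:
    """Hypothetical resource that records whether it is currently held."""

    def __init__(self):
        self.held = False

    def __enter__(self):
        self.held = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.held = False  # released exactly here, at the end of the block
        return False       # do not swallow exceptions


with Resource() as res:
    held_inside = res.held

# The name and the object both outlive the block; only the resource is released.
held_after = res.held
still_alive = isinstance(res, Resource)
print(held_inside, held_after, still_alive)
```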

Wayne Werner · Jul 7, 2013

Oh. Uhm... ahh... it would have helped to mention that it also has a
commit() method! But yes, that's correct; if the object expires (this
is C++, so it's guaranteed to call the destructor at that close brace
- none of the Python vagueness about when __del__ is called) without
commit() being called, then the transaction will be rolled back.

If one wants to duplicate this kind of behavior in Python, that's what
context managers are for combined with a `with` block, which does
guarantee that the __exit__ method will be called - in this case it could
be something as simple as:

from contextlib import contextmanager

@contextmanager
def new_transaction(conn):
    tran = conn.begin_transaction()
    try:
        yield tran
    finally:
        # Runs even if the body of the with block raises.
        if not tran.committed:
            tran.rollback()

Which you would then use like:

conn = create_conn()
with new_transaction(conn) as tran:
    rows_affected = do_query_stuff(tran)
    if rows_affected == 42:
        tran.commit()

And then you get the desired constructor/destructor behavior of having
guaranteed that code will be executed at the start and at the end. You can
wrap things in try/except for some error handling, or write your own
context manager class.
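
For the "write your own context manager class" route, a rough sketch might look like the following. FakeTransaction, FakeConnection and TransactionScope are all hypothetical names invented here so the example runs on its own; a real database library will have its own API.

```python
class FakeTransaction:
    """Hypothetical stand-in for a database transaction object."""

    def __init__(self):
        self.committed = False
        self.rolled_back = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True


class FakeConnection:
    """Hypothetical stand-in for the connection object."""

    def begin_transaction(self):
        return FakeTransaction()


class TransactionScope:
    """Context manager class: roll back on exit unless commit() was called."""

    def __init__(self, conn):
        self.conn = conn
        self.tran = None

    def __enter__(self):
        self.tran = self.conn.begin_transaction()
        return self.tran

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and when the body raises.
        if not self.tran.committed:
            self.tran.rollback()
        return False  # never swallow exceptions from the body


with TransactionScope(FakeConnection()) as committed_tran:
    committed_tran.commit()

try:
    with TransactionScope(FakeConnection()) as failed_tran:
        raise ValueError("query failed")
except ValueError:
    pass

print(committed_tran.rolled_back, failed_tran.rolled_back)
```

Because `__exit__` is called whether or not the body raises, the rollback happens in the failure case without any extra try/finally in the calling code.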

HTH,
Wayne

Chris Angelico · Jul 7, 2013

Which you would then use like:

conn = create_conn()
with new_transaction(conn) as tran:
    rows_affected = do_query_stuff(tran)
    if rows_affected == 42:
        tran.commit()

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this:

with new_transaction(conn) as tran:
    tran.query("blah")
    with tran.subtransaction() as tran:
        tran.query("blah")
        with tran.subtransaction() as tran:
            tran.query("blah")
            # roll this subtransaction back
        tran.query("blah")
        tran.commit()
    tran.query("blah")
    tran.commit()

The 'with' statement doesn't allow this. I would need to use some kind
of magic to rebind the old transaction to the name, or else use a list
that gets magically populated:

with new_transaction(conn) as tran:
    tran[-1].query("blah")
    with subtransaction(tran):
        tran[-1].query("blah")
        with subtransaction(tran):
            tran[-1].query("blah")
            # roll this subtransaction back
        tran[-1].query("blah")
        tran[-1].commit()
    tran[-1].query("blah")
    tran[-1].commit()

I don't like the look of this. It might work, but it's hardly ideal.
This is why I like to be able to nest usages of the same name.

ChrisA

Steven D'Aprano · Jul 7, 2013

Which you would then use like:

conn = create_conn()
with new_transaction(conn) as tran:
    rows_affected = do_query_stuff(tran)
    if rows_affected == 42:
        tran.commit()

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this: [snip hideous code]
I don't like the look of this. It might work, but it's hardly ideal.
This is why I like to be able to nest usages of the same name.

Yes, and the obvious way to nest usages of the same name is to use an
instance with a class attribute and instance attribute of the same name:

class Example:
    attr = 23

x = Example()
x.attr = 42
print(x.attr)
del x.attr
print(x.attr)

If you need more than two levels, you probably ought to re-design your
code to be less confusing, otherwise you may be able to use ChainMap to
emulate any number of nested scopes.
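
As a rough illustration of the ChainMap idea (the key name and values below are invented for the example), each new_child() call pushes a scope, and deleting a name in the innermost scope uncovers the binding in the enclosing one:

```python
from collections import ChainMap

outer = ChainMap({"tran": "outer transaction"})
middle = outer.new_child({"tran": "first subtransaction"})
inner = middle.new_child({"tran": "second subtransaction"})

seen_inner = inner["tran"]

# Deleting only touches the innermost mapping; the enclosing binding reappears.
del inner["tran"]
seen_after_delete = inner["tran"]

# .parents is the chain without the innermost mapping.
seen_in_parents = inner.parents["tran"]
seen_outer = outer["tran"]

print(seen_inner, seen_after_delete, seen_outer, sep=" | ")
```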

One interesting trick is you can use a ChainMap as function globals.
Here's a sketch for what you can do in Python 3.3:

import builtins
from types import FunctionType
from collections import ChainMap

class _ChainedDict(ChainMap, dict):
    # Function dicts must be instances of dict :-(
    pass

def chained_function(func, *dicts):
    """Return a new function, copied from func, using a
    ChainMap as dict.
    """
    # Fall back to the function's own globals, then the builtins.
    dicts = dicts + (func.__globals__, builtins.__dict__)
    d = _ChainedDict(*dicts)
    name = func.__name__
    newfunc = FunctionType(
        func.__code__, d, name, closure=func.__closure__)
    newfunc.__dict__.update(func.__dict__)
    newfunc.__defaults__ = func.__defaults__
    return newfunc

And in use:

py> f = chained_function(lambda x: x+y, {'y': 100})
py> f(1)
101
py> f.__globals__.maps.insert(0, {'y': 200})
py> f(1)
201
py> del f.__globals__.maps[0]['y']
py> f(1)
101

Steven D'Aprano · Jul 7, 2013

for x in range(4):
    print(x)
print(x)  # Vader NOoooooOOOOOO!!!

That loops do *not* introduce a new scope is a feature, not a bug. It is
*really* useful to be able to use the value of x after the loop has
finished. That's a much more common need than being able to have an x
inside the loop and an x outside the loop.
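
A typical case, sketched with a made-up helper: a search loop that breaks out early and then uses the loop variables afterwards, which only works because `for` does not open a new scope.

```python
def find_first_negative(values):
    """Return (index, value) of the first negative number, or None."""
    for index, value in enumerate(values):
        if value < 0:
            break
    else:
        # The loop finished without break: nothing negative was found.
        return None
    # index and value are still bound to the item that triggered the break.
    return index, value


print(find_first_negative([3, 1, -4, 1, -5]))
```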

Ethan Furman · Jul 7, 2013

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this:

Is there some reason you can't simply do this?

with new_transaction(conn) as tran1:
    tran1.query("blah")
    with tran1.subtransaction() as tran2:
        tran2.query("blah")
        with tran2.subtransaction() as tran3:
            tran3.query("blah")
            # roll this subtransaction back
        tran2.query("blah")
        tran2.commit()
    tran1.query("blah")
    tran1.commit()

Ethan Furman · Jul 7, 2013

The 'with' statement doesn't allow this. I would need to use some kind
of magic to rebind the old transaction to the name, or else use a list
that gets magically populated:

with new_transaction(conn) as tran:
    tran[-1].query("blah")
    with subtransaction(tran):
        tran[-1].query("blah")
        with subtransaction(tran):
            tran[-1].query("blah")
            # roll this subtransaction back
        tran[-1].query("blah")
        tran[-1].commit()
    tran[-1].query("blah")
    tran[-1].commit()

The other option is to build the magic into the new_transaction class, then your code will look like:

with new_transaction(conn) as tran:
    tran.query("blah")
    with tran.subtransaction():
        tran.query("blah")
        with tran.subtransaction():
            tran.query("blah")
            # roll this subtransaction back
        tran.query("blah")
        tran.commit()
    tran.query("blah")
    tran.commit()

This would definitely make more sense in a loop.
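
One way the "magic" could be built in, purely as a sketch: the Transaction class below, its log list and the per-level commit flags are all assumptions made up for illustration, not any real database API. The object keeps its own stack of nesting levels, so the calling code can use the single name `tran` at every depth.

```python
class Transaction:
    """Hypothetical transaction that tracks its own nesting depth."""

    def __init__(self, log):
        self._log = log      # records what would be sent to the database
        self._levels = []    # one "committed" flag per open level

    def __enter__(self):
        self._levels.append(False)
        self._log.append(("begin", len(self._levels)))
        return self

    def subtransaction(self):
        # Entering the same object again pushes a new nesting level.
        return self

    def query(self, sql):
        self._log.append(("query", len(self._levels), sql))

    def commit(self):
        # Marks only the innermost open level as committed.
        self._levels[-1] = True
        self._log.append(("commit", len(self._levels)))

    def __exit__(self, exc_type, exc_value, traceback):
        depth = len(self._levels)
        committed = self._levels.pop()
        if not committed:
            self._log.append(("rollback", depth))
        return False  # let exceptions propagate


log = []
with Transaction(log) as tran:
    tran.query("a")
    with tran.subtransaction():
        tran.query("b")
        with tran.subtransaction():
            tran.query("c")
            # no commit here, so this level is rolled back on exit
        tran.query("d")
        tran.commit()
    tran.query("e")
    tran.commit()

for entry in log:
    print(entry)
```

The trade-off is that the transaction object becomes stateful: a query always goes to whichever level is currently innermost, which is exactly the behavior wanted here but would be surprising if the object were shared between unrelated pieces of code.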

Chris Angelico · Jul 8, 2013

If you need more than two levels, you probably ought to re-design your
code to be less confusing, otherwise you may be able to use ChainMap to
emulate any number of nested scopes.

The subtransactions are primarily to represent the database equivalent
of a try/except block, so they need to be able to be nested
arbitrarily.

ChrisA

Chris Angelico · Jul 8, 2013

Is there some reason you can't simply do this?

with new_transaction(conn) as tran1:
    tran1.query("blah")
    with tran1.subtransaction() as tran2:
        tran2.query("blah")
        with tran2.subtransaction() as tran3:
            tran3.query("blah")
            # roll this subtransaction back
        tran2.query("blah")
        tran2.commit()
    tran1.query("blah")
    tran1.commit()

That means that I, as programmer, have to keep track of the nesting
level of subtransactions. Extremely ugly. A line of code can't be
moved around without first checking which transaction object to work
with.

ChrisA

Steven D'Aprano · Jul 8, 2013

That means that I, as programmer, have to keep track of the nesting
level of subtransactions. Extremely ugly. A line of code can't be moved
around without first checking which transaction object to work with.

I feel your pain, but I wonder why we sometimes accept "a line of code
can't be moved around" as an issue to be solved by the language. After
all, in general most lines of code can't be moved around.

Chris Angelico · Jul 8, 2013

That means that I, as programmer, have to keep track of the nesting
level of subtransactions. Extremely ugly. A line of code can't be moved
around without first checking which transaction object to work with.

I feel your pain, but I wonder why we sometimes accept "a line of code
can't be moved around" as an issue to be solved by the language. After
all, in general most lines of code can't be moved around.

It's not something to be solved by the language, but it's often
something to be solved by the program's design. Two lines of code that
achieve the same goal should normally look the same. This is why
Python's policy is "one obvious way to do something" rather than
"spell it five different ways in the same file to make a nightmare for
other people coming after you". Why should database queries be spelled
"trans1.query()" in one place, and "trans2.query()" in another?
Similarly, if I want to call another function and that function needs
to use the database, why should I pass it trans3 and have that come
out as trans1 on the other side? Unnecessarily confusing. Makes much
more sense to use the same name everywhere.

ChrisA

Steven D'Aprano · Jul 8, 2013

It's not something to be solved by the language, but it's often
something to be solved by the program's design. Two lines of code that
achieve the same goal should normally look the same. This is why
Python's policy is "one obvious way to do something" rather than "spell
it five different ways in the same file to make a nightmare for other
people coming after you". Why should database queries be spelled
"trans1.query()" in one place, and "trans2.query()" in another?

Is that a trick question? They probably shouldn't. But it's a big leap
from that to "...and therefore `for` and `while` should introduce their
own scope".

Similarly, if I want to call another function and that function needs to
use the database, why should I pass it trans3 and have that come out as
trans1 on the other side? Unnecessarily confusing. Makes much more sense
to use the same name everywhere.

"Is your name not Bruce? That's going to cause a little confusion."

Chris Angelico · Jul 8, 2013

Is that a trick question? They probably shouldn't. But it's a big leap
from that to "...and therefore `for` and `while` should introduce their
own scope".

No, it's not a trick question; I was responding to Ethan's suggestion
as well as yours, and he was saying pretty much that.

BruceA
(maybe that'll reduce the confusion?)

Neil Cerutti · Jul 8, 2013

That loops do *not* introduce a new scope is a feature, not a bug. It is
*really* useful to be able to use the value of x after the loop has
finished.

I don't necessarily buy that it's "*really*" useful, but I do
like introducing new names in (not really the scope of)
if/elif/else and for statement blocks.

z = record["Zip"]
if int(z) > 99999:
    zip_code = z[:-4].rjust(5, "0")
    zip4 = z[-4:]
else:
    zip_code = z.rjust(5, "0")
    zip4 = ""

As opposed to:

zip_code = None
zip4 = None
z = record["Zip"]
if int(z) > 99999:
    zip_code = z[:-4].rjust(5, "0")
    zip4 = z[-4:]
else:
    zip_code = z.rjust(5, "0")
    zip4 = ""
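
Wrapped in a function so it can be run on its own (the function name and the sample inputs are mine, not from the original record format), the first style looks like this. Both names are first bound inside the branches and are still visible at the return, because `if`/`else` does not create a scope.

```python
def split_zip(z):
    """Split a digits-only ZIP string into (five-digit code, plus-four part)."""
    if int(z) > 99999:
        # More than five digits: the last four are the ZIP+4 extension.
        zip_code = z[:-4].rjust(5, "0")
        zip4 = z[-4:]
    else:
        zip_code = z.rjust(5, "0")
        zip4 = ""
    # zip_code and zip4 were first bound inside the branches above.
    return zip_code, zip4


for sample in ["123456789", "12346789", "1234"]:
    print(sample, split_zip(sample))
```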
