Default scope of variables

  • Thread starter Steven D'Aprano

Joshua Landau

I don't necessarily buy that it's "*really*" useful

Just take "really" to mean "like, I'm totz not lying".
but I do
like introducing new names in (not really the scope of)
if/elif/else and for statement blocks.

z = record["Zip"]
if int(z) > 99999:
    zip_code = z[:-4].rjust(5, "0")
    zip4 = z[-4:]
else:
    zip_code = z.rjust(5, "0")
    zip4 = ""

I'd probably break down and cry if "if"s introduced a new scope in
Pythons before the "nonlocal" keyword (assuming current Python
semantics where "=" defaults to only the inner-most scope).
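To make the point concrete, here is a minimal sketch of the same pattern wrapped in a function. The name split_zip and the sample inputs are made up for illustration. Names first bound inside the if/else branches stay visible after the block, because an if statement does not open a new scope in Python.

```python
def split_zip(z):
    # Hypothetical helper wrapping the snippet above.
    if int(z) > 99999:
        zip_code = z[:-4].rjust(5, "0")
        zip4 = z[-4:]
    else:
        zip_code = z.rjust(5, "0")
        zip4 = ""
    # Both names were first bound inside a branch, yet they are
    # ordinary function locals and are still usable here.
    return zip_code, zip4


nine_digit = split_zip("123456789")
short_zip = split_zip("1234")
print(nine_digit, short_zip)
```

If each branch had its own scope, the return line would need the names declared beforehand, which is the inconvenience being described.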
 

Joshua Landau

def foo():
    for i in range(3):
        print("outer",i)
        def inner():
            for i in range(4):
                print("inner",i)
        inner()
        print("outer",i)

That works, but you then have to declare all your nonlocals, and it
hardly reads well.
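As an illustration of what "declare all your nonlocals" means in practice, here is a small sketch. The total counter is invented for the example; the rest follows the nested-function pattern above.

```python
def foo():
    total = 0
    for i in range(3):
        def inner():
            nonlocal total        # required to rebind the enclosing name
            for i in range(4):    # this i is local to inner()
                total += i
        inner()
        # inner()'s loop did not touch the outer i
        assert i in (0, 1, 2)
    return total


result = foo()
print(result)
```

Without the nonlocal line, "total += i" raises UnboundLocalError, since assignment makes total local to inner() by default.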

Stealing concepts shamelessly from
http://www.slideshare.net/r1chardj0n3s/dont-do-this-24000445, you can
do this:

import inspect
from contextlib import contextmanager

@contextmanager
def scope(namespace=None):
    old_names = inspect.currentframe().f_back.f_back.f_locals.copy()

    yield

    names = inspect.currentframe().f_back.f_back.f_locals

    if namespace is not None:
        new_names = {k: v for k, v in names.items()
                     if k not in old_names and v is not namespace}
        namespace.update(**new_names)

    names.clear()
    names.update(old_names)

So you *can* do:
>>> for i in range(3):
...     with scope():
...         for i in range(3):
...             print("Inner:", i)
...     print("Outer", i)
Inner: 0
Inner: 1
Inner: 2
Outer 0
Inner: 0
Inner: 1
Inner: 2
Outer 1
Inner: 0
Inner: 1
Inner: 2
Outer 2

:)

If you pass scope() a dictionary, all the new variables will get added to it.
 

alex23

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this:

with new_transaction(conn) as tran:
    tran.query("blah")
    with tran.subtransaction() as tran:
        tran.query("blah")
        with tran.subtransaction() as tran:
            tran.query("blah")
            # roll this subtransaction back
        tran.query("blah")
        tran.commit()
    tran.query("blah")
    tran.commit()

The 'with' statement doesn't allow this.

I'd take the 'explicit is better than implicit' approach to naming the
context managers here: rather than reusing 'tran', choose names that
reflect what each transaction is used for, i.e. summarising what's in
the "blah" of each query.

with new_transaction(conn) as folder_tran:
    folder_tran.query("blah")
    with folder_tran.subtransaction() as file_tran:
        file_tran.query("blah")
        with file_tran.subtransaction() as type_tran:
            type_tran.query("blah")

(for want of a better hierarchical example...)
 

Chris Angelico

with new_transaction(conn) as folder_tran:
    folder_tran.query("blah")
    with folder_tran.subtransaction() as file_tran:
        file_tran.query("blah")
        with file_tran.subtransaction() as type_tran:
            type_tran.query("blah")

Warp my code around a language limitation? Only if I absolutely have to.

The subtransactions are NOT conceived as separate transactions. They
are effectively the database equivalent of a try/except block. Would
you want to have to use a different name for a builtin just because
you're inside a try block?

a = int("123")
try:
    b = int1(user_input)
except ValueError:
    b = 0
c = int("234")

No. That assignment to b should be int(), same as a and c.

ChrisA
 

alex23

The subtransactions are NOT conceived as separate transactions. They
are effectively the database equivalent of a try/except block.

Sorry, I assumed each nested query was somehow related to the prior
one. In which case, I'd probably go with Ethan's suggestion of a
top-level transaction context manager with its own subtransaction
method.
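Ethan's suggestion is not quoted in this part of the thread, so the following is only a guess at its general shape: one transaction object whose subtransaction() method is itself a context manager, so the same name can be used at every nesting level. The Transaction class, its method names, and the choice of sqlite3 savepoints are all assumptions made for the sketch.

```python
import sqlite3
from contextlib import contextmanager


class Transaction:
    """Hypothetical top-level transaction with a subtransaction() method."""

    def __init__(self, conn):
        self.conn = conn
        self._depth = 0

    def __enter__(self):
        self.conn.execute("BEGIN")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Commit on a clean exit, roll everything back on an exception.
        self.conn.execute("COMMIT" if exc_type is None else "ROLLBACK")
        return False

    def query(self, sql):
        return self.conn.execute(sql)

    @contextmanager
    def subtransaction(self):
        # A savepoint plays the role of the try/except-like inner block.
        self._depth += 1
        name = f"sp{self._depth}"
        self.conn.execute(f"SAVEPOINT {name}")
        try:
            yield self
        except Exception:
            self.conn.execute(f"ROLLBACK TO {name}")
            self.conn.execute(f"RELEASE {name}")
            raise
        else:
            self.conn.execute(f"RELEASE {name}")
        finally:
            self._depth -= 1


# isolation_level=None leaves transaction control to the explicit SQL above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (x INTEGER)")

with Transaction(conn) as tran:
    tran.query("INSERT INTO t VALUES (1)")
    try:
        with tran.subtransaction():
            tran.query("INSERT INTO t VALUES (2)")
            raise ValueError("roll this subtransaction back")
    except ValueError:
        pass
    tran.query("INSERT INTO t VALUES (3)")

rows = conn.execute("SELECT x FROM t ORDER BY x").fetchall()
print(rows)
```

Only the work done inside the failed subtransaction is undone; the outer transaction carries on and commits the rest.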
 

Chris Angelico

Sorry, I assumed each nested query was somehow related to the prior
one. In which case, I'd probably go with Ethan's suggestion of a
top-level transaction context manager with its own subtransaction
method.

Yeah, that would probably be the best option in this particular
instance. Though I do still like the ability to have variables shadow
each other, even if there's a way around one particular piece of code
that uses the technique.

ChrisA
 

Chris Angelico

I have been following this sub-thread with interest, as it resonates with
what I am doing in my project.

Just FYI, none of my own code will help you as it's all using libpqxx,
but the docs for the library itself are around if you want them (it's
one of the standard ways for C++ programs to use PostgreSQL).
I came up with the following context manager -

class DbSession:
    def __exit__(self, type, exc, tb):
        if self.transaction_active:
            self.conn.commit()
            self.transaction_active = False

Hmm. So you automatically commit. I'd actually be inclined to _not_ do
this; make it really explicit in your code that you now will commit
this transaction (which might throw an exception if you have open
subtransactions). The way the code with libpqxx works is that any
transaction (including a subtransaction) must be explicitly committed,
or it will be rolled back. So there's one possible code path that
results in persistent changes to the database, and anything else
won't:

* If the object expires without being committed, it's rolled back.
* If an exception is thrown and unwinds the stack, roll back.
* If a Unix signal is sent that terminates the program, roll back.
* If the process gets killed -9, definitely roll back.
* If the computer the program's running on spontaneously combusts, roll back.
* If the hard drive is physically ripped from the server during
processing, roll back.

(Note though that most of these guarantees are from PostgreSQL, not
from libpqxx. I'm talking about the whole ecosystem here, not
something one single library can handle.)
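The "nothing persists unless explicitly committed" rule can be sketched in Python too. This is not libpqxx's API: the Transaction class below is hypothetical, and sqlite3 stands in for PostgreSQL only so the example is self-contained.

```python
import sqlite3


class Transaction:
    """Hypothetical wrapper: changes persist only after an explicit commit()."""

    def __init__(self, conn):
        self.conn = conn
        self.committed = False

    def __enter__(self):
        self.conn.execute("BEGIN")
        return self

    def query(self, sql, params=()):
        return self.conn.execute(sql, params)

    def commit(self):
        self.conn.execute("COMMIT")
        self.committed = True

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block without commit() always rolls back,
        # whether or not an exception was raised.
        if not self.committed:
            self.conn.execute("ROLLBACK")
        return False  # never swallow exceptions


# isolation_level=None leaves transaction control to the explicit SQL above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (x INTEGER)")

with Transaction(conn) as tran:
    tran.query("INSERT INTO t VALUES (1)")
    # no commit() here, so this insert is discarded

with Transaction(conn) as tran:
    tran.query("INSERT INTO t VALUES (2)")
    tran.commit()

rows = conn.execute("SELECT x FROM t").fetchall()
print(rows)
```

Commenting out the commit() call is then enough to turn any block into a dry run, which is the workflow described above.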

I have absolute 100% confidence that nothing can possibly affect the
database unless I explicitly commit it (aside from a few
non-transaction actions like advancing a sequence pointer, which
specifically don't matter (rolled back transactions can result in gaps
in a sequence of record IDs, nothing more)). It's an extremely
comfortable work environment - I can write whatever code I like, and
if I'm not sure if it'll work or not, I just comment out the commit
line and run. *Nothing* can get past that and quietly commit it behind
my back.

ChrisA
 

Ian Kelly

When any of them need any database access, whether for reading or for
updating, they execute the following -

with db_session as conn:
    conn.transaction_active = True  # this line must be added if updating
    conn.cur.execute(__whatever__)

I'd probably factor out the transaction_active line into a separate
DbSession method.

@contextmanager
def updating(self):
    with self as conn:
        conn.transaction_active = True
        yield conn

Then you can do "with db_session" if you're merely reading, or "with
db_session.updating()" if you're writing, and you don't need to repeat
the transaction_active line all over the place.

I would also probably make db_session a factory function instead of a global.
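For reference, here is a self-contained sketch of how an updating() method could sit on the session class. Only __exit__ appears in the thread, so __init__ and __enter__ are assumed, a plain sqlite3 connection stands in for the real one, and the flag is set on the session because that is where the quoted __exit__ reads it.

```python
import sqlite3
from contextlib import contextmanager


class DbSession:
    """Assumed shape of the class; only __exit__ is shown in the thread."""

    def __init__(self, conn):
        self.conn = conn
        self.transaction_active = False

    def __enter__(self):
        return self.conn

    def __exit__(self, exc_type, exc, tb):
        if self.transaction_active:
            self.conn.commit()
            self.transaction_active = False

    @contextmanager
    def updating(self):
        # Wraps the plain session and sets the flag in one place.
        with self as conn:
            self.transaction_active = True
            yield conn


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
db_session = DbSession(conn)

with db_session.updating() as c:      # writing: committed on exit
    c.execute("INSERT INTO t VALUES (1)")

with db_session as c:                 # reading: nothing to commit
    rows = c.execute("SELECT x FROM t").fetchall()

print(rows)
```

Callers never touch transaction_active directly, so the attribute name appears exactly twice in the code base.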
 

Frank Millman

Chris Angelico said:
Just FYI, none of my own code will help you as it's all using libpqxx,
but the docs for the library itself are around if you want them (it's
one of the standard ways for C++ programs to use PostgreSQL).

I support multiple databases (PostgreSQL, MS SQL Server, sqlite3 at this
stage) so I use generic Python as much as possible.
Hmm. So you automatically commit. I'd actually be inclined to _not_ do
this; make it really explicit in your code that you now will commit
this transaction (which might throw an exception if you have open
subtransactions).

I endeavour to keep all my database activity to the shortest time possible -
get a connection, execute a command, release the connection. So setting
'transaction_active = True' is my way of saying 'execute this command and
commit it straight away'. That is explicit enough for me. If there are
nested updates they all follow the same philosophy, so the transaction
should complete quickly.

Frank
 

Ian Kelly

You could also do it like this:

def updating(self):
    self.transaction_active = True
    return self

Yes, that would be simpler. I was all set to point out why this
doesn't work, and then I noticed that the location of the
"transaction_active" attribute is not consistent in the original code.
The DbSession class places it on self, and then the example usage
places it on the connection object (which I had based my version on).
Since that seems to be a source of confusion, it demonstrates another
reason why factoring this out is a good thing.
 

Ethan Furman

Yes, that would be simpler. I was all set to point out why this
doesn't work, and then I noticed that the location of the
"transaction_active" attribute is not consistent in the original code.
The DbSession class places it on self, and then the example usage
places it on the connection object

It looks like DbSession has a conn object, and in the example he has DbSession() named as conn -- ironic, considering
this is a variable scoping thread. ;)
 

Ethan Furman

The object returned by __enter__ is the conn object, not the
DbSession, so naming it "conn" is correct.


Huh. I didn't realize a different object could be returned by __enter__ without affecting which object's __exit__ gets
called. Thanks for the lesson! :)
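A tiny sketch of that behaviour, with made-up class names: the with statement looks up __exit__ on the object produced by the expression after "with", while the "as" name is bound to whatever __enter__ returns.

```python
class Resource:
    """Stand-in for the connection handed to the with block."""


class Manager:
    def __init__(self):
        self.resource = Resource()
        self.exited = False

    def __enter__(self):
        return self.resource      # bound by "as", but not the manager

    def __exit__(self, exc_type, exc, tb):
        self.exited = True        # still called on the Manager
        return False


manager = Manager()
with manager as res:
    pass

print(type(res).__name__, manager.exited)
```

Resource has no __exit__ at all, yet the block works, because only the Manager is ever asked to exit.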
 

Frank Millman

Ian Kelly said:
Yes, that would be simpler. I was all set to point out why this
doesn't work, and then I noticed that the location of the
"transaction_active" attribute is not consistent in the original code.
The DbSession class places it on self, and then the example usage
places it on the connection object (which I had based my version on).
Since that seems to be a source of confusion, it demonstrates another
reason why factoring this out is a good thing.

You had me worried there for a moment, as that is obviously an error.

Then I checked my actual code, and I find that I mis-transcribed it. It
actually looks like this -

with db_session as conn:
    db_session.transaction_active = True
    conn.cur.execute(...)

I am still not quite sure what your objection is to this. It feels
straightforward to me.

Here is one possible answer. Whenever I want to commit a transaction I have
to add the extra line. There is a danger that I could mis-spell
'transaction_active', in which case it would not raise an error, but would
not commit the transaction, which could be a hard-to-trace bug. Using your
approach, if I mis-spelled 'db_session.connect()', it would immediately
raise an error.

Is that your concern, or are there other issues?

Frank
 

Ethan Furman

You had me worried there for a moment, as that is obviously an error.

Then I checked my actual code, and I find that I mis-transcribed it. It
actually looks like this -

with db_session as conn:
    db_session.transaction_active = True
    conn.cur.execute(...)

I am still not quite sure what your objection is to this. It feels
straightforward to me.

Here is one possible answer. Whenever I want to commit a transaction I have
to add the extra line. There is a danger that I could mis-spell
'transaction_active', in which case it would not raise an error, but would
not commit the transaction, which could be a hard-to-trace bug. Using your
approach, if I mis-spelled 'db_session.connect()', it would immediately
raise an error.

Is that your concern, or are there other issues?

That concern is big enough. I've been bitten by that type of thing enough in my own code to want to avoid it where
possible. Plus the `with db_session.updating() as conn:` saves keystrokes. ;)
 

Ian Kelly

You had me worried there for a moment, as that is obviously an error.

Then I checked my actual code, and I find that I mis-transcribed it. It
actually looks like this -

with db_session as conn:
    db_session.transaction_active = True
    conn.cur.execute(...)

I am still not quite sure what your objection is to this. It feels
straightforward to me.

Here is one possible answer. Whenever I want to commit a transaction I have
to add the extra line. There is a danger that I could mis-spell
'transaction_active', in which case it would not raise an error, but would
not commit the transaction, which could be a hard-to-trace bug. Using your
approach, if I mis-spelled 'db_session.connect()', it would immediately
raise an error.

Is that your concern, or are there other issues?

Yes, that is one concern. Another is that since you mistakenly typed
"conn" instead of "db_session" once, you might make the same mistake
again in actual code, with the same effect (unless the conn object
doesn't allow arbitrary attributes, which is a possibility). Another
is that the code adheres better to the DRY principle if you don't need
to copy that line all over the place.
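The difference between the two failure modes is easy to demonstrate. The classes below are invented for the example; the second one uses __slots__ as one way an object can refuse arbitrary attributes.

```python
class Session:
    def __init__(self):
        self.transaction_active = False

    def updating(self):
        self.transaction_active = True
        return self


class StrictSession:
    __slots__ = ("transaction_active",)  # no arbitrary attributes allowed

    def __init__(self):
        self.transaction_active = False


s = Session()
s.transaction_actve = True        # typo: silently creates a new attribute
print(s.transaction_active)       # the real flag is unchanged

try:
    s.updatng()                   # typo in a method name fails at once
except AttributeError as e:
    method_error = e

strict = StrictSession()
try:
    strict.transaction_actve = True   # typo is rejected by __slots__
except AttributeError as e:
    slot_error = e
```

The misspelled assignment on Session is the hard-to-trace case: no error, and no commit later on.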
 

Frank Millman

Ian Kelly said:
Yes, that is one concern. Another is that since you mistakenly typed
"conn" instead of "db_session" once, you might make the same mistake
again in actual code, with the same effect (unless the conn object
doesn't allow arbitrary attributes, which is a possibility). Another
is that the code adheres better to the DRY principle if you don't need
to copy that line all over the place.

Thanks to you and Ethan - that does make sense.

I have reviewed my code base to see how many occurrences there are, and
there are just three.

All database objects inherit from a DbObject class. The class has a save()
method and a delete() method. Each of these requires a commit, so I use my
technique there. All database updates are activated by calling save() or
delete().

I have an init.py script to 'bootstrap' a brand new installation, which
requires populating an empty database with some basic structures up front.
DbObject cannot be used here as the required plumbing is not in place, so I
use my technique here as well.

However, this does not invalidate your general point, so I will keep it in
mind.

Thanks

Frank
 
