Class changes in circular imports when __name__ == '__main__'

  • Thread starter Spencer Pearson

Spencer Pearson

Hi! I'm writing a package with several files in it, and I've found
that "isinstance" doesn't work the way I expect under certain
circumstances.

Short example: here are two files.
# fileone.py
import filetwo

class AClass( object ):
    pass

if __name__ == '__main__':
    a = AClass()
    filetwo.is_aclass( a )

# filetwo.py

import fileone

def is_aclass( a ):
    print "The argument is", ("" if isinstance(a, fileone.AClass)
                              else "not"), "an instance of fileone.AClass"


If you run fileone.py, it will tell you that "The argument is not an
instance of fileone.AClass", which seems strange to me, given that the
fileone module is the one that CREATES the object with its own AClass
class. And if you replace "if __name__ == '__main__'" with "def
main()", start Python, import fileone, and call fileone.main(), it
tells you that the argument IS an instance of AClass.

So, the module's name changes to __main__ when you run it on its own...
well, it looks like it puts all of the things defined in fileone in
the __main__ namespace INSTEAD of in the fileone module's namespace,
and then when filetwo imports fileone, the class is created again,
this time as fileone.AClass, and though it's identical in function to
__main__.AClass, one "is not" the other.

Is this kind of doubled-back 'isinstance' inherently sinful? I mean, I
could solve this problem by giving all of my classes "classname"
attributes or something, but maybe it's just a sign that I shouldn't
have to do this in the first place.
 

Arnaud Delobelle

The behaviour is normal. I suppose you could do something like this
(untested):

# fileone.py

if __name__ == '__main__':
    from fileone import *

    a = AClass()
    filetwo.is_aclass( a )

    import sys; sys.exit()

import filetwo

class AClass( object ):
    pass
 

Carl Banks

So, the module's name changes to __main__ when you run it on its own...
well, it looks like it puts all of the things defined in fileone in
the __main__ namespace INSTEAD of in the fileone module's namespace,
and then when filetwo imports fileone, the class is created again,
this time as fileone.AClass, and though it's identical in function to
__main__.AClass, one "is not" the other.

Correct. Python always treats the main script as a module called
__main__. If you then try to import the main script file from another
module, Python will actually import it again with whatever its usual
name is.
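
A small way to see the double import for yourself. This is only a sketch: it uses Python 3 syntax (the code in this thread is Python 2), and the file names main_script.py and helper.py are made up for the demonstration. It writes both files to a temporary directory, runs the first one as a script, and captures what the helper reports.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical file contents, chosen only for this demonstration.
MAIN_SCRIPT = textwrap.dedent("""\
    import helper

    class AClass(object):
        pass

    if __name__ == '__main__':
        helper.report(AClass)
""")

HELPER = textwrap.dedent("""\
    import sys
    import main_script  # executes main_script.py a second time, as 'main_script'

    def report(cls_from_main):
        # Is the running script the same module object as the imported one?
        print(sys.modules['__main__'] is main_script)
        # Is the class defined by the script the same object as the imported one?
        print(cls_from_main is main_script.AClass)
        # Each copy of the class records the module name it was created under.
        print(cls_from_main.__module__, main_script.AClass.__module__)
""")

with tempfile.TemporaryDirectory() as tmp:
    for name, source in (("main_script.py", MAIN_SCRIPT), ("helper.py", HELPER)):
        with open(os.path.join(tmp, name), "w") as f:
            f.write(source)
    # Run main_script.py as a script, so that it becomes __main__.
    result = subprocess.run(
        [sys.executable, os.path.join(tmp, "main_script.py")],
        capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
print(output_lines)
```

The first two lines of the report are False: the file has been executed twice, producing two module objects and two distinct AClass objects, which is why isinstance fails in the original example.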

This is easily one of the most confusing and unfortunate aspects of
Python.

Is this kind of doubled-back 'isinstance' inherently sinful? I mean, I
could solve this problem by giving all of my classes "classname"
attributes or something, but maybe it's just a sign that I shouldn't
have to do this in the first place.

Even if there are better ways than isinstance, the weird behavior of
__main__ shouldn't be the reason not to use it.

My recommendation for most programmers is to treat Python files either
as scripts (which you start the Python interpreter with) or modules (which
you import from within Python); never both. Store most functionality
in modules and keep startup scripts small. If you do this, the weird
semantics of __main__ are a moot point.

If you want to be able to run a module as a script while avoiding side
effects due to it being named __main__, the easiest thing to do is to
put something like the following boilerplate at the top of the module
(this causes the module to rename itself).

import sys
if __name__ == '__main__':
    is_main = True  # since you're overwriting __name__ you'll need this later
    __name__ = 'foo'
    sys.modules['foo'] = sys.modules['__main__']
else:
    is_main = False
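
To check that the renaming trick behaves as described, here is a hedged sketch in Python 3 syntax. The module name foo comes from the boilerplate above; bar.py, AClass and check() are hypothetical additions for the test. The snippet writes both files to a temporary directory and runs foo.py as a script.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# foo.py begins with the self-renaming boilerplate, then uses is_main
# where it would normally test __name__ == '__main__'.
FOO = textwrap.dedent("""\
    import sys
    if __name__ == '__main__':
        is_main = True
        __name__ = 'foo'
        sys.modules['foo'] = sys.modules['__main__']
    else:
        is_main = False

    import bar

    class AClass(object):
        pass

    if is_main:
        bar.check(AClass())
""")

# bar.py (hypothetical) imports the script by its usual name.
BAR = textwrap.dedent("""\
    import sys
    import foo  # found in sys.modules, so foo.py is not executed again

    def check(obj):
        print(foo is sys.modules['__main__'])
        print(isinstance(obj, foo.AClass))
""")

with tempfile.TemporaryDirectory() as tmp:
    for name, source in (("foo.py", FOO), ("bar.py", BAR)):
        with open(os.path.join(tmp, name), "w") as f:
            f.write(source)
    result = subprocess.run(
        [sys.executable, os.path.join(tmp, "foo.py")],
        capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
print(output_lines)
```

Both lines come out True: because the script registered itself under the name foo before anything else imported it, bar receives the already-running module instead of a second copy.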


All of this gets a lot more complicated when packages are involved.


Carl Banks
 

Dave Angel

Perhaps a better answer would be to import __main__ from the second module.

But to my way of thinking, the answer should be to avoid ever having
circular imports. This is just the most blatant of the problems that
circular imports can cause.

I don't know of any cases where circular dependencies are really
necessary, but if one decides to use them, then two things should be done:

1) do almost nothing in top-level code in any module involved in such a
circular dependency. Top-level should have all of the imports, and none
of the executable code.
2) do not ever involve the startup script in the loop. If necessary,
make it two lines, importing, then calling the real mainline.
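
A minimal sketch of that layout, in Python 3 syntax. The file names run.py, app.py and checker.py are hypothetical. The startup script only imports the real module and calls its main(), so the circular import between app and checker never involves __main__.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# run.py: the two-line startup script. Nothing ever imports it.
RUN = textwrap.dedent("""\
    import app
    app.main()
""")

# app.py: holds the class and the real mainline; top level only defines things.
APP = textwrap.dedent("""\
    import checker

    class AClass(object):
        pass

    def main():
        print(checker.is_aclass(AClass()))
""")

# checker.py: imports app back, completing the cycle between the two modules.
CHECKER = textwrap.dedent("""\
    import app

    def is_aclass(obj):
        return isinstance(obj, app.AClass)
""")

with tempfile.TemporaryDirectory() as tmp:
    files = (("run.py", RUN), ("app.py", APP), ("checker.py", CHECKER))
    for name, source in files:
        with open(os.path.join(tmp, name), "w") as f:
            f.write(source)
    result = subprocess.run(
        [sys.executable, os.path.join(tmp, "run.py")],
        capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
print(output_lines)
```

Here app.py is executed only once, under the name app, so there is a single AClass and the isinstance check succeeds.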

DaveA
 

Carl Banks

Perhaps a better answer would be to import __main__ from the second module.

Then what if the module is imported from a different script? It'll
try to import __main__ but get a different script than expected.
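
To illustrate the objection, a short sketch in Python 3 syntax. All three file names and the function main_has_aclass() are hypothetical. The shared module imports __main__ and looks for AClass there; what it finds depends entirely on which script was started.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# shared.py: relies on whatever module happens to be __main__.
SHARED = textwrap.dedent("""\
    import __main__

    def main_has_aclass():
        return hasattr(__main__, 'AClass')
""")

# script_a.py: the script that shared.py was written for.
SCRIPT_A = textwrap.dedent("""\
    import shared

    class AClass(object):
        pass

    print(shared.main_has_aclass())
""")

# script_b.py: a different script that also uses shared.py.
SCRIPT_B = textwrap.dedent("""\
    import shared

    print(shared.main_has_aclass())
""")

def run_script(directory, name):
    """Run one script in a fresh interpreter and return its stripped output."""
    proc = subprocess.run(
        [sys.executable, os.path.join(directory, name)],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip()

with tempfile.TemporaryDirectory() as tmp:
    files = (("shared.py", SHARED), ("script_a.py", SCRIPT_A),
             ("script_b.py", SCRIPT_B))
    for name, source in files:
        with open(os.path.join(tmp, name), "w") as f:
            f.write(source)
    from_a = run_script(tmp, "script_a.py")
    from_b = run_script(tmp, "script_b.py")

print(from_a, from_b)
```

Started from script_a.py the lookup succeeds; started from script_b.py the same module finds no AClass in __main__.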

But to my way of thinking, the answer should be to avoid ever having
circular imports.  This is just the most blatant of the problems that
circular imports can cause.

I don't know of any cases where circular dependencies are really
necessary, but if one decides to use them, then two things should be done:

I don't think they're ever necessary but sometimes it's convenient.
This could be one of those cases. One of the less misguided reasons
to invoke a module as a script is to run tests on the module. When
you do that you might need to call an outside module to set up a test
environment, and that module might happen to import the calling
module. You could refactor the test to avoid the circular import, but
that kind of defeats the convenience of stowing the test in the
module.

1) do almost nothing in top-level code in any module involved in such a
circular dependency. Top-level should have all of the imports, and none
of the executable code.
2) do not ever involve the startup script in the loop. If necessary,
make it two lines, importing, then calling the real mainline.

All good advice for that situation. I would add that if you define a
base class in one module and subclass in another, you want to keep
those modules out of cycles. The problem with circular imports is
that you don't usually know what order the modules will be imported in,
but you need to be sure the base class is defined when you subclass.
(I learned that lesson the hard way, and I had to hack up an import
hook to enforce that imports occurred in the correct order.)
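
A sketch of how import order decides whether the subclass can be created, in Python 3 syntax. The module names shapes_base and shapes_derived are hypothetical. The two modules import each other; each is then imported first in a fresh interpreter.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# shapes_base.py: imports the subclass module before defining Base.
BASE = textwrap.dedent("""\
    import shapes_derived

    class Base(object):
        pass
""")

# shapes_derived.py: needs Base to exist at class-definition time.
DERIVED = textwrap.dedent("""\
    import shapes_base

    class Derived(shapes_base.Base):
        pass
""")

def try_import(directory, module_name):
    """Import module_name in a fresh interpreter; return (succeeded, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        cwd=directory, capture_output=True, text=True,
    )
    return proc.returncode == 0, proc.stderr

with tempfile.TemporaryDirectory() as tmp:
    for name, source in (("shapes_base.py", BASE), ("shapes_derived.py", DERIVED)):
        with open(os.path.join(tmp, name), "w") as f:
            f.write(source)
    derived_first_ok, _ = try_import(tmp, "shapes_derived")
    base_first_ok, base_first_err = try_import(tmp, "shapes_base")

print(derived_first_ok, base_first_ok)
```

Importing shapes_derived first works, because shapes_base finishes defining Base before Derived is created. Importing shapes_base first fails with an AttributeError, because shapes_derived runs while shapes_base is still only partially initialized.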


Carl Banks
 

Dave Angel

Then what if the module is imported from a different script? It'll
try to import __main__ but get a different script than expected.

Then the module needs to adjust its expectations. The point is it
should never try to import the script by name.

DaveA
 

Carl Banks

Then the module needs to adjust its expectations.

No, it shouldn't. It shouldn't have any expectations at all, because
importing __main__ and expecting to get a particular module is a
foolish thing to do. There are a bunch of reasons why __main__ might
not be the original script. Example: running the profiler on it.

The point is it
should never try to import the script by name.

Importing __main__ directly is worse than the problem it's trying to
solve.

And "never" is too strong a word. I already gave a solution in this
thread whereby the script can be imported by name safely, by
renaming itself and assigning itself an item in sys.modules. When you
do that, you can import the main script by name.


Carl Banks
 

Spencer Pearson

All right, thank you for helping! I'd had a little voice in the back
of my mind nagging me that it might not be logical to include a bunch
of classes and function definitions in my startup file, but I never
got around to splitting it up. The module/script distinction makes
sense, and it seems more elegant, too. Also, my program works now that
I've rearranged things, which is a plus. Thanks!
 
