Python Oddity - print a reserved name

Michael Foord

Here's a little oddity with 'print' being a reserved word...
>>> class thing:
...     pass
...
>>> something = thing()
>>> something.print = 3
SyntaxError: invalid syntax
>>> print something.__dict__
{}
>>> something.__dict__['print'] = 3
>>> print something.__dict__
{'print': 3}
>>> print something.print
SyntaxError: invalid syntax

See that I can't set the something.print attribute directly, but can
set it indirectly. Is this behaviour 'necessary' or just an anomaly of
the way IDLE detects Syntax Errors ?

Regards,

Fuzzy

http://www.voidspace.org.uk/atlantibots/pythonutils.html
 
Duncan Booth

Michael said:
> something.__dict__['print'] = 3

Or, slightly prettier, use:

setattr(something, 'print', 3)

> See that I can't set the something.print attribute directly, but can
> set it indirectly. Is this behaviour 'necessary' or just an anomaly of
> the way IDLE detects Syntax Errors ?

No, that is simply how Python works. You can only use the direct syntax to
set attributes whose names are valid Python identifiers, but indirectly you
can use any string at all as the attribute name. It doesn't do any harm and
sometimes it can be extremely useful.

You can do this pretty much anywhere that Python uses a dict internally.
For example you can call functions with arbitrary keyword arguments
provided you use the ** syntax.
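
As a rough illustration of both points (a sketch, not taken from the original posts): on current Python 3 `print` is no longer a keyword, so the example uses `class` as the reserved name instead. `Thing` and `show` are hypothetical names.

```python
class Thing:
    pass

something = Thing()

# "something.class = 3" would be a SyntaxError, but the indirect route works
setattr(something, 'class', 3)
assert getattr(something, 'class') == 3
assert something.__dict__ == {'class': 3}

def show(**kwargs):
    # Hand back the keyword arguments exactly as received
    return kwargs

# "show(class=1)" would not compile; unpacking a dict sidesteps the parser
result = show(**{'class': 1, 'not an identifier': 2})
assert result == {'class': 1, 'not an identifier': 2}
```

The names are only checked by the parser when they are spelled out in source; once they travel as dict keys, nothing looks at them again.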
 
Peter Otten

Michael said:
> See that I can't set the something.print attribute directly, but can
> set it indirectly. Is this behaviour 'necessary' or just an anomaly of
> the way IDLE detects Syntax Errors ?

I'd say checking at runtime whether attribute names are valid costs too much
performance, so it's not done. Another nice example (on the interactive
prompt, this has nothing to do with idle):
.... pass
....['!@~%', '__doc__', '__module__']
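
A comparable experiment can be sketched as follows. The class name `T` and the value 42 are made up here, and this is not necessarily the exact session Peter ran; it only shows that an arbitrary string is accepted as an attribute name and then turns up in `dir()`.

```python
class T:
    pass

# No validity check happens at runtime: the string is simply stored as a key
setattr(T, '!@~%', 42)

assert '!@~%' in dir(T)
assert getattr(T, '!@~%') == 42
```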

:)

Peter
 
Steve Holden

Michael said:
> Here's a little oddity with 'print' being a reserved word...
>
> >>> something.print = 3
> SyntaxError: invalid syntax
> >>> print something.__dict__
> {}
> >>> something.__dict__['print'] = 3
> >>> print something.__dict__
> {'print': 3}
> >>> print something.print
> SyntaxError: invalid syntax
>
> See that I can't set the something.print attribute directly, but can
> set it indirectly. Is this behaviour 'necessary' or just an anomaly of
> the way IDLE detects Syntax Errors ?

It's necessary. You will find that keywords (those having a specific
meaning as a language token, such as "def" and "print") can't be used as
attributes. The fact that you can set the attributes by manipulating the
__dict__ directly isn't relevant - you will find you can only access
such attributes indirectly, since any attempt to do so directly will
result in a syntax error no matter how the Python interpreter is invoked.
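
One way to see that the error comes from the compiler rather than from IDLE is to hand the offending source to `compile()` directly. This is only a sketch: it uses `def` as the keyword, since `print` stopped being reserved in Python 3, and the helper name `compiles` is invented for the example.

```python
import keyword

def compiles(source):
    """Return True if source compiles, False if it raises SyntaxError."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except SyntaxError:
        return False

assert keyword.iskeyword('def')
# A keyword after the dot is rejected at compile time, before anything runs
assert not compiles('something.def = 3')
# The indirect spelling is ordinary code as far as the compiler is concerned
assert compiles("setattr(something, 'def', 3)")
```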

regards
Steve
 
Alex Martelli

Steve Holden said:
> It's necessary. You will find that keywords (those having a specific
> meaning as a language token, such as "def" and "print") can't be used as
> attributes. The fact that you can set the attributes by manipulating the
> __dict__ directly isn't relevant - you will find you can only access
> such attributes indirectly, since any attempt to do so directly will
> result in a syntax error no matter how the Python interpreter is invoked.

I think that, while you're right for CPython, other Python
implementations are allowed to relax this restriction, and Jython does
so systematically. Since Jython regularly accesses attributes on
Java-implemented classes, I guess that forcing Jython users to jump
through hoops to access, say, foo.yield, bar.print, or baz.def, might
reduce usability too badly. Possibly make it completely unusable if you
had to implement an interface using such a name, too...

This kind of thing, however, is also true of CPython whenever it's
accessing "outside" objects through attributes; and for .NET
implementations I believe that CLR compliant languages are not allowed
to forbid certain method names along their interfaces to other
components. I'm not sure how CORBA's standard Python bindings address
the same problem, how it's met in various interfaces to XML-RPC, COM,
SOAP, and other distributed-objects or foreign-objects APIs.

Given how pervasive this problem is, I do recall some ruminations about
allowing arbitrary identifiers in the specific case in which they fall
right after a dot in a compound name. I don't recall that anything ever
came of these ruminations, though.


Alex
 
Michael Hudson

> This kind of thing, however, is also true of CPython whenever it's
> accessing "outside" objects through attributes; and for .NET
> implementations I believe that CLR compliant languages are not
> allowed to forbid certain method names along their interfaces to
> other components. I'm not sure how CORBA's standard Python bindings
> address the same problem, how it's met in various interfaces to
> XML-RPC, COM, SOAP, and other distributed-objects or foreign-objects
> APIs.

I'm fairly sure the approach taken by CORBA bindings is the good old
"append an underscore" hack. I don't know what happens if an
interface declares methods called both "print" and "print_", but
giving the author a good kick seems an appropriate response...

> Given how pervasive this problem is, I do recall some ruminations
> about allowing arbitrary identifiers in the specific case in which
> they fall right after a dot in a compound name. I don't recall that
> anything ever came of these ruminations, though.

I think the only problem is that no one has done the work yet.
Python's parser isn't the nicest thing ever. Two snippets spring to
mind:

/* This algorithm is from a book written before
the invention of structured programming... */

(Parser/pgen.c from the Python source).

<glyph> It's interesting that people often say "Hey, I'm looking for
something to work on!"
<glyph> then someone else says "Glyph's code needs a little help."
then the original asker says "SWEET MARY MOTHER OF GOD I'M NOT
TOUCHING THAT! I mean, uh, that's too much work or I'm not
good at it. Or something."

(from Twisted.Quotes).

It's not my itch, and I'm not that interested in learning how to
scratch it...

Cheers,
mwh
 
Michael Foord

Duncan Booth said:
> Michael said:
> > something.__dict__['print'] = 3
>
> Or, slightly prettier, use:
>
> setattr(something, 'print', 3)
>
> > See that I can't set the something.print attribute directly, but can
> > set it indirectly. Is this behaviour 'necessary' or just an anomaly of
> > the way IDLE detects Syntax Errors ?
>
> No, that is simply how Python works. You can only use the direct syntax to
> set attributes whose names are valid Python identifiers, but indirectly you
> can use any string at all as the attribute name. It doesn't do any harm and
> sometimes it can be extremely useful.
>
> You can do this pretty much anywhere that Python uses a dict internally.
> For example you can call functions with arbitrary keyword arguments
> provided you use the ** syntax.

Right - but although 'print' is a reserved word there is no *need* for
object.print to be reserved.. and as Alex has pointed out that could
actually be damned inconvenient..........

Oh well.....

Regards,


Fuzzy

http://www.voidspace.org.uk/atlantibots/pythonutils.html
 
Diez B. Roggisch

Michael said:
> Right - but although 'print' is a reserved word there is no *need* for
> object.print to be reserved.. and as Alex has pointed out that could
> actually be damned inconvenient..........

I tried to explain my views on that before:

http://groups.google.com/groups?hl=...to9%24ql8%2403%241%40news.t-online.com&rnum=4

The key issue is, that while
<function foo at 0x401eab1c>

is ok,

can't possibly be made to work without unclear context-driven hacks.

And if on "normal" function level this can't be allowed, IMHO for the sake
of consistency class methods should also not allow that - because then the
different behaviour causes confusion...
 
Bengt Richter

> I tried to explain my views on that before:
>
> http://groups.google.com/groups?hl=...to9%24ql8%2403%241%40news.t-online.com&rnum=4
>
> The key issue is, that while
>
> <function foo at 0x401eab1c>
>
> is ok,
>
> can't possibly be made to work without unclear context-driven hacks.

Why? If a print statement were just syntactic sugar for print((outfile,), ... rest, of, line)
(normally calling a builtin) then, giving your example args,

def print(*args):
    import sys
    sys.stdout.write( repr(args)+'\n')

print print, 1, 'two', 3

would output (faked ;-)

'((), <function print at 0x008FDE70> 1, "two", 3)'

instead of blowing up, and the _statement_

print(print)

would, de-sugared, act like print((), (print)) and thus output

'((), <function print at 0x008FDE70>)'

without the def print(...

print print

would de-sugar to

print((), print)

and output

<built-in function print>

and

type(print)

would not be a print statement, so would return

<type 'builtin_function_or_method'>

interactively.
Etc., etc.

> And if on "normal" function level this can't be allowed, IMHO for the sake
> of consistency class methods should also not allow that - because then the
> different behaviour causes confusion...

Maybe there's a problem with my suggestion, but I don't see it at the moment.

Regards,
Bengt Richter
 
Diez B. Roggisch

> Why? If a print statement were just syntactic sugar for print((outfile,),
> ... rest, of, line) (normally calling a builtin) then, giving your example
> args,
>
> def print(*args):
>     import sys
>     sys.stdout.write( repr(args)+'\n')
>
> print print, 1, 'two', 3
>
> would output (faked ;-)
>
> '((), <function print at 0x008FDE70> 1, "two", 3)'


How should that happen? As

print

produces a newline, one should expect that

print print

gets translated to

print(outfile, print(outfile,))

which would clearly produce two newlines and raises the question what the
inner print gets evaluated to so that the outer one wouldn't print
anything.

While that would be sort of acceptable, what you suggest makes print
ambiguous - depending on the context. And above that:

has no side effects whatsoever, where

clearly has.

As the BDFL has already said, he regrets having introduced print as a statement
- but as he did, there is no really elegant way around having print as a
reserved keyword. One _can_ think of taking context into account to either
interpret print as keyword or as identifier while parsing - the question
is: Is it worth it? IMHO no - but as Alex said: If it makes you happy,
implement it :) The same applies to all other keywords as well, btw.
 
Duncan Grisby

Michael Hudson said:
> I'm fairly sure the approach taken by CORBA bindings is the good old
> "append an underscore" hack. I don't know what happens if an
> interface declares methods called both "print" and "print_", but
> giving the author a good kick seems an appropriate response...

CORBA prepends an underscore, so it would use "_print". Identifiers in
CORBA IDL are not permitted to start with an underscore, so there is
no possibility of a clash with another IDL defined identifier. If the
Python mapping appended an underscore for clashes, it would be
susceptible to the issue you mention.
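
The prefixing rule described here could be sketched as below. The function name `idl_to_python_name` is invented for illustration and this is not code from any actual ORB; it only assumes the rule is "prepend an underscore when the IDL name is a Python keyword". On Python 3 `print` is no longer a keyword, so the checks use `class`.

```python
import keyword

def idl_to_python_name(name):
    """Map an IDL identifier to a Python name, escaping Python keywords."""
    # IDL identifiers cannot start with '_', so an escaped name cannot
    # collide with another IDL-defined identifier.
    if keyword.iskeyword(name):
        return '_' + name
    return name

assert idl_to_python_name('class') == '_class'
assert idl_to_python_name('balance') == 'balance'
```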

Cheers,

Duncan.
 
Bengt Richter

> How should that happen? As
>
> print
>
> produces a newline, one should expect that
>
> print print
>
> gets translated to
>
> print(outfile, print(outfile,))

No, the second print in print print is just a name.
The translation is
print((), print)
where after this translation both prints are ordinary names, and the print
statement per se is effectively forgotten as far as compilation is concerned.

That's why it finds <function print at 0x008FDE70> from the def print if there is one,
or otherwise the builtin print _function_. Either way, it's a function found in the
ordinary way by the name print.

The point is, a print statement is recognized _syntactically_ by 'print' being the
first name in the statement, but for the rest of the statement 'print' is an ordinary name.
Thus
print; foo = print; foo((), 'Hi there')
would print a newline, bind foo to the builtin print function, and invoke the latter via foo,
(using () to indicate default outfile). None could also be used to indicate default outfile.
I just picked () as more flexibly adaptable to what I hadn't thought of yet re specifying outfile ;-)
Maybe a keyword arg would be better yet. I.e., as in def print(*args, **kw) with
outfile=kw.get('outfile', sys.stdout). But that's an implementation detail.
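
A rough sketch of the function form being described, using the keyword-argument variant. The name `print_` is a stand-in chosen so the example does not shadow anything, and joining the `str()` of each argument with spaces is an assumption about what the print statement does in the simple case.

```python
import io
import sys

def print_(*args, **kw):
    # outfile defaults to sys.stdout, as suggested in the post
    outfile = kw.get('outfile', sys.stdout)
    outfile.write(' '.join(str(a) for a in args) + '\n')

buf = io.StringIO()
print_('Hi there', 3, outfile=buf)
print_(outfile=buf)          # the bare form just emits a newline
assert buf.getvalue() == 'Hi there 3\n\n'
```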

> which would clearly produce two newlines and raises the question what the
> inner print gets evaluated to so that the outer one wouldn't print
> anything.

Not applicable. See above for translation explanation.

> While that would be sort of acceptable, what you suggest makes print
> ambiguous - depending on the context. And above that:

No, you have misunderstood (I wasn't clear enough) my suggestion. No ambiguity,
but yes, a print name token as the leading token of a statement is interpreted
specially. You could interpret print(x) differently from print (x) analogously
to 123.__doc__ (illegal) vs 123 .__doc__ which gives you int docs.
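
The `123.__doc__` comparison can be checked with `compile()`. A small sketch (the helper `compiles` is a name made up here):

```python
def compiles(source):
    """Return True if source parses as an expression, False otherwise."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        return False

# "123." is read as the start of a float literal, so this fails to parse
assert not compiles('123.__doc__')
# With a space, "123" is a complete integer and ".__doc__" is attribute access
assert compiles('123 .__doc__')
assert (123 .__doc__) == int.__doc__
```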

> has no side effects whatsoever, where
>
> clearly has.

as does

sys.stdout.write('\n')

so I guess I am missing your point.

> As the BDFL has already said, he regrets having introduced print as a statement
> - but as he did, there is no really elegant way around having print as a
> reserved keyword. One _can_ think of taking context into account to either
> interpret print as keyword or as identifier while parsing - the question
> is: Is it worth it? IMHO no - but as Alex said: If it makes you happy,
> implement it :) The same applies to all other keywords as well, btw.

It doesn't seem that complex a context for print, but I haven't thought
about the other keywords yet (no time really for this even ;-/). I wonder
what a comprehensive writeup of python name semantics would reveal,
discussing all the different name spaces and ways that names are searched for
and used in various contexts.

Regards,
Bengt Richter
 
Diez B. Roggisch

> No, the second print in print print is just a name.
> The translation is
> print((), print)
> where after this translation both prints are ordinary names, and the print
> statement per se is effectively forgotten as far as compilation is
> concerned.

I understood that that was what you _wanted_ it to be - but how do you want
to parse that? It would involve saying "first encounter of print is a
keyword, then an identifier" - that condition is reset after the statement,
so to speak the next line or a semicolon.

That surely is possible, but has to be done _after_ lexical analysis - which
makes things more complicated than necessary, for a very questionable
benefit.

> The point is, a print statement is recognized _syntactically_ by 'print'
> being the first name in the statement, but for the rest of the statement
> 'print' is an ordinary name. Thus
> print; foo = print; foo((), 'Hi there')

How shall that work? Right now, print ; print produces two newlines. Do you
want to alter that behaviour? Do you want to complicate parsing even more
by saying that print on the right side of an expression is the name,
otherwise it's executed?
That would be inconsistent - every newbie would ask why

produces a nl.

> would print a newline, bind foo to the builtin print function, and invoke
> the latter via foo, (using () to indicate default outfile). None could
> also be used to indicate default outfile. I just picked () as more
> flexibly adaptable to what I hadn't thought of yet re specifying outfile
> ;-) Maybe a keyword arg would be better yet. I.e., as in def print(*args,
> **kw) with outfile=kw.get('outfile', sys.stdout). But that's an
> implementation detail.
>
> No, you have misunderstood (I wasn't clear enough) my suggestion. No
> ambiguity, but yes, a print name token as the leading token of a statement
> is interpreted specially. You could interpret print(x) differently from
> print (x) analogously to 123.__doc__ (illegal) vs 123 .__doc__ which gives
> you int docs.
>
> as does
>
> sys.stdout.write('\n')
>
> so I guess I am missing your point.


The point is that two things that look alike should behave the same for
reasons of orthogonality - having a special case for print here
produces behaviour that leads to confusion, as I said before. That is of
course also true right now - but because print is a reserved keyword,
nobody falls into any traps here - it's just forbidden.

> It doesn't seem that complex a context for print, but I haven't thought
> about the other keywords yet (no time really for this even ;-/). I wonder
> what a comprehensive writeup of python name semantics would reveal,
> discussing all the different name spaces and ways that names are searched
> for and used in various contexts.

Well, it's certainly more complicated than just altering the lexical phase -
it requires a post-reduction step - deep in the parser.
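
For what it is worth, the standard `tokenize` module gives a feel for where the keyword decision is made. In this sketch (run against Python 3, with `class` standing in for `print`), the tokenizer reports the reserved word as an ordinary NAME token, and it is only the later parsing step that rejects it after a dot. How the C tokenizer and parser of the time divided this work is not something the thread states, so treat this as an observation about the current module only.

```python
import io
import tokenize

source = 'something.class = 3\n'
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
names = [t.string for t in tokens if t.type == tokenize.NAME]

# The lexer does not distinguish keywords from identifiers
assert names == ['something', 'class']

# The parser is what refuses the keyword in attribute position
try:
    compile(source, '<example>', 'exec')
    rejected = False
except SyntaxError:
    rejected = True
assert rejected
```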
 
