Uniform Function Call Syntax (UFCS)

Ian Kelly

Same difference. It can't simply look for the name in globals(), it
has to figure out based on the caller's globals.

But that would all be done in getattr, so I don't think it affects
hasattr's implementation at all. Since hasattr doesn't push anything
onto the stack, getattr doesn't have to care whether it was called
directly from Python or indirectly via hasattr; either way the scope
it needs is just the top frame of the stack.

Could be a different matter in other implementations, though.
 
jongiddy

But that would all be done in getattr, so I don't think it affects
hasattr's implementation at all. Since hasattr doesn't push anything
onto the stack, getattr doesn't have to care whether it was called
directly from Python or indirectly via hasattr; either way the scope
it needs is just the top frame of the stack.

Could be a different matter in other implementations, though.

In CPython, the UFCS would not be done in PyObject_GetAttr() as that would affect hasattr() as well. Instead, it would be implemented in the bytecode for LOAD_ATTR. If LOAD_ATTR was about to raise an AttributeError, e.g. for [].len, it would perform the equivalent of a LOAD_NAME operation, with the difference that if the name is not found or is not callable, it raises AttributeError instead of NameError.

If the name is found, then it would return something: for [].len, it would return the len() function wrapped to know that its first argument was the list, which might be done by creating a fake Method object, as shown in Ian's code.

But getattr([], 'len') would still raise AttributeError, and hasattr([], 'len') would still return False.
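
For anyone who wants to play with this, the fallback described above can be roughly emulated in pure Python. This is only a sketch: ufcs_load_attr and its namespace parameter are hypothetical names, and a real implementation would sit in the interpreter's LOAD_ATTR handling rather than in a helper function.

```python
import builtins
import functools


def ufcs_load_attr(obj, name, namespace):
    """Emulate the proposed LOAD_ATTR fallback for obj.name (hypothetical helper)."""
    try:
        # Ordinary attribute lookup always wins.
        return getattr(obj, name)
    except AttributeError:
        pass
    # Fallback: resolve the name in the namespace supplied by the caller,
    # then in builtins, roughly as LOAD_NAME would.
    func = namespace.get(name, getattr(builtins, name, None))
    if not callable(func):
        # Not found, or not callable: report AttributeError, not NameError.
        raise AttributeError(
            f"{type(obj).__name__!r} object has no attribute {name!r}"
        )
    # Bind obj as the first argument, much like a bound method.
    return functools.partial(func, obj)


length = ufcs_load_attr([1, 2, 3], "len", globals())
print(length())            # 3
print(hasattr([], "len"))  # False: getattr and hasattr are untouched
```

Because the helper is separate from getattr(), hasattr() keeps its current behaviour, which is the point of doing the work at the LOAD_ATTR level.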

I'm beginning to think it is too un-Pythonic - too much implicitness, unless it can be spelt differently, something like [].len(_) or [].len(...) to explicitly indicate that it plans to call a function, but might call a method if one is available.
 
Steven D'Aprano

Actually, this is something that I've run into sometimes. I can't think
of any Python examples, partly because Python tends to avoid unnecessary
method chaining, but the notion of "data flow" is a very clean one -
look at shell piping, for instance. Only slightly contrived example:

cat foo*.txt | gzip | ssh other_server 'gunzip | foo_analyze'

The data flows from left to right, even though part of the data flow is
on a different computer.

A programming example might come from Pike's image library [...]

Stdio.write_file("foo.png",Image.PNG.encode(Image.JPEG.decode(
Stdio.read_file("foo.jpg")).autocrop().rotate(0.5).grey()));

With UFCS, that could become perfect data flow:

read_file("foo.jpg").JPEG_decode().autocrop().rotate(0.5).grey()
.PNG_encode().write_file("foo.png");

As far as I am concerned, the biggest problem with chained method calls
is that it encourages long one-liners. But I think chained calls are
quite natural to read, and rather similar to the postfix notation used by
Forth:

"foo.jpg" read_file JPEG_decode autocrop 0.5 rotate grey PNG_encode
"foo.png" write_file


Although Forth has a (justified) reputation for being hard to read,
postfix notation is not the cause. The above can be understood easily as
a chain of function calls: read the file, then decode, then autocrop,
then rotate, then grey, then encode, then write the file. You read and
write the calls in the same first-to-last order as you would perform them.

The equivalent prefix notation used by function calls is unnaturally
backwards and painful to read:

write_file(PNG_encode(grey(rotate(autocrop(JPEG_decode(
read_file("foo.jpg"))), 0.5))), "foo.png");

I had to solve the syntactic ambiguity here by importing all the
appropriate names

I'm not sure how this is *syntactic* ambiguity.

As I see it, the only syntactic ambiguity occurs when you have functions
of two arguments. Using shell notation:

plus(1, 2) | divide(2)

Assuming divide() takes two arguments, does that give 3/2 or 2/3? I would
expect that the argument being piped in is assigned to the first
argument. But I'm not sure how this sort of design ambiguity is fixed by
importing names into the current namespace.

(Note that Forth is brilliant here, as it exposes the argument stack and
gives you a rich set of stack manipulation commands.)

While we're talking about chaining method and function calls, I'll take
the opportunity to link to this, in case anyone feels like adapting it to
UFCS:

http://code.activestate.com/recipes/578770
 
Chris Angelico

Stdio.write_file("foo.png",Image.PNG.encode(Image.JPEG.decode(
Stdio.read_file("foo.jpg")).autocrop().rotate(0.5).grey()));

With UFCS, that could become perfect data flow:

read_file("foo.jpg").JPEG_decode().autocrop().rotate(0.5).grey()
.PNG_encode().write_file("foo.png");

I had to solve the syntactic ambiguity here by importing all the
appropriate names

I'm not sure how this is *syntactic* ambiguity.

The ambiguity I'm talking about here is with the dot. The original
version has "Stdio.read_file" as the first function called; for a
Python equivalent, imagine a string processing pipeline and having
"re.sub" in the middle of it. You can't take "re.sub" as the name of
an attribute on a string without some fiddling around that completely
destroys the point of data-flow syntax. So I cheated, and turned
everything into local (imported) names (adorning the ones that needed
it). This is a bad idea in Pike for the same reason it's a bad idea in
Python - you end up with a massively polluted global namespace.

This could be solved, though, by having a completely different symbol
that means "the thing on my left is actually the first positional
parameter in the function call on my right", such as in your example:
plus(1, 2) | divide(2)

This would be absolutely identical to:

divide(plus(1, 2), 2)

Maybe you could even make it so that:

plus(1, 2) x=| divide(y=2)

is equivalent to

divide(x=plus(1, 2), y=2)

for the sake of consistency, and to allow the pipeline to inject
something someplace other than the first argument.

I'm not sure whether it'd be as useful in practice, though. It would
depend partly on the exact syntax used. Obviously the pipe itself
can't be used as it already means bitwise or, and this needs to be
really REALLY clear about what's going on. But a data-flow notation
would be of value in theory, at least.
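
Something close to this can be faked today with operator overloading, without new syntax. The sketch below is an assumption-laden toy: into and as_keyword are hypothetical names, and it only works when the value on the left does not already handle | for arbitrary right-hand operands.

```python
class into:
    """Right-hand side of a pipeline step (hypothetical helper)."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs
        self.keyword = None  # None means "inject as first positional argument"

    def as_keyword(self, name):
        """Inject the piped value as the keyword argument `name` instead."""
        self.keyword = name
        return self

    def __ror__(self, value):
        # Called for `value | into(...)` once type(value).__or__ has
        # returned NotImplemented.
        if self.keyword is None:
            return self.func(value, *self.args, **self.kwargs)
        return self.func(*self.args, **{self.keyword: value}, **self.kwargs)


def plus(a, b):
    return a + b


def divide(x, y):
    return x / y


print(plus(1, 2) | into(divide, 2))                    # 1.5, same as divide(plus(1, 2), 2)
print(plus(1, 2) | into(divide, y=2).as_keyword("x"))  # 1.5, same as divide(x=plus(1, 2), y=2)
```

It reuses the bitwise-or operator, so it has exactly the clarity problem mentioned above; a dedicated symbol would avoid that.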

ChrisA
 
Roy Smith

Steven D'Aprano said:
(Note that Forth is brilliant here, as it exposes the argument stack and
gives you a rich set of stack manipulation commands.)

As does PostScript (which, despite its reputation as a printer format,
is really a full-fledged programming language). I suspect that people
who didn't grow up with RPN (i.e. H/P calculators) find it amazingly
obtuse. In much the same way I find Objective-C amazingly obtuse. Oh,
wait, that's the other thread.
 
Steven D'Aprano

class Circle:
    def squared(self):
        raise NotImplementedError("Proven impossible in 1882")

The trouble is that logically Circle does have a 'squared' attribute,
while 3 doesn't; and yet Python guarantees this:

foo.squared()
# is equivalent [1] to
func = foo.squared
func()

Which means that for (3).squared() to be 9, it has to be possible to
evaluate (3).squared,

Given UFCS, that ought to return the global squared function, curried
with 3 as its first (and only) argument.

UFCS would be a pretty big design change to Python, but I don't think it
would be a *problem* as such. It just means that x.y, hasattr(x, y) etc.
would mean something different to what they currently mean.

which means that hasattr (which is defined by
attempting to get the attribute and seeing if an exception is thrown)
has to return True.

Yes. And this is a problem why?

Obviously it would mean that the semantics of hasattr will be different
than they are now, but it's still a coherent set of semantics.

In fact, one can already give a class a __getattr__ method which provides
UFCS functionality. (Hmmm, you need a way to get the caller's globals.
You know, this keeps coming up. I think it's high-time Python offered
this as a supported function.) That's no more a problem than any other
dynamically generated attribute.

Stick that __getattr__ in object itself, and UFCS is now language wide.
That would make an awesome hack for anyone wanting to experiment with
this!
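
A minimal sketch of that experiment, with the caveat that a __getattr__ cannot be added to object itself from pure Python, so a mixin is used instead. UFCSMixin, Int and List are hypothetical names, and sys._getframe() is a CPython implementation detail standing in for the missing "caller's globals" function.

```python
import builtins
import functools
import sys


class UFCSMixin:
    """Hypothetical mixin: missing attributes fall back to functions in the caller's scope."""

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Leave private and dunder
        # names alone so protocols such as copying are not confused.
        if name.startswith("_"):
            raise AttributeError(name)
        caller = sys._getframe(1)  # CPython implementation detail
        for scope in (caller.f_locals, caller.f_globals, vars(builtins)):
            try:
                candidate = scope[name]
            except KeyError:
                continue
            if callable(candidate):
                # Curry the object in as the first argument.
                return functools.partial(candidate, self)
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )


class Int(UFCSMixin, int):
    pass


class List(UFCSMixin, list):
    pass


def squared(x):
    return x * x


nine = Int(3).squared()                   # squared(Int(3))
length = List("abc").len()                # builtin len(List("abc"))
has_squared = hasattr(Int(3), "squared")  # hasattr now sees the function too
print(nine, length, has_squared)          # 9 3 True
```

Since hasattr() is a builtin and pushes no Python frame, frame 1 is the code that wrote the attribute access in both the direct and the hasattr() case.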


Except that it's even more complicated than that, because hasattr wasn't
defined in your module, so it has a different set of globals.

hasattr doesn't care about globals, nor does it need to. hasattr behaves
like the equivalent to:

def hasattr(obj, name):
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True

give or take. And yes, if accessing your attribute has side effects,
using hasattr does too:

py> class Spam(object):
...     @property
...     def spam(self):
...         print("Spam spam spam spam LOVERLY SPAAAAM!!!!")
...         return "spam"
...
py> x = Spam()
py> hasattr(x, "spam")
Spam spam spam spam LOVERLY SPAAAAM!!!!
True

If that's a worry to you, you can try inspect.getattr_static.
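
A short illustration of the difference, using a property that records each time it runs (the calls list is just scaffolding for the demonstration):

```python
import inspect

calls = []  # records every time the property body actually runs


class Spam:
    @property
    def spam(self):
        calls.append("spam")
        return "spam"


x = Spam()

# getattr_static finds the attribute without triggering the descriptor.
static = inspect.getattr_static(x, "spam")
print(isinstance(static, property))  # True
print(calls)                         # []

# hasattr goes through normal lookup, so the property body runs.
hasattr(x, "spam")
print(calls)                         # ['spam']
```

Note that getattr_static returns the property object itself rather than its value, so it answers "is there such an attribute?" rather than "what is it?".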

In fact,
this would mean that hasattr would become quite useless. (Hmm, PEP 463
might become a prerequisite of your proposal...) It also means that
attribute lookup becomes extremely surprising any time the globals
change; currently, "x.y" means exactly the same thing for any given
object x and attribute y, no matter where you do it.

*cough*

class Example:
    def __getattr__(self, name):
        if name == 'module_name':
            if __name__ == '__main__':
                return "NOBODY expects the Spanish Inquisition!"
            else:
                return __name__
        raise AttributeError("no attribute %r" % name)


:)
 
Steven D'Aprano

In fact, what's the point of having the duality?

len(x) <==> x.__len__()

x < y <==> x.__lt__(y)

str(x) <==> x.__str__()


Interface on the left, implementation on the right. That's especially
obvious when you consider operators like < + - * etc.

Consider x + y. What happens?

#1 First, Python checks whether y is an instance of a *subclass* of x's
class. If so, y gets priority, otherwise x gets priority.

#2 If y gets priority, y.__radd__(x) is called, if it exists. If it
returns something other than NotImplemented, we are done.

#3 However if y.__radd__ doesn't exist, or it returns NotImplemented,
then Python continues as if x had priority.

#4 If x has priority, then x.__add__(y) is called, if it exists. If it
returns something other than NotImplemented, we are done.

#5 However if it doesn't exist, or it returns NotImplemented, then
y.__radd__(x) is called, provided it wasn't already tried in step #2.

#6 Finally, if neither object has __add__ or __radd__, or both return
NotImplemented, then Python raises TypeError.


That's a lot of boilerplate if you were required to implement it yourself
in every single operator method. Better, Python handles all the
boilerplate; all you have to do is handle the cases you care about, and
return NotImplemented for everything else.
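
The steps above can be approximated in Python. This is a simplified sketch, not CPython's actual algorithm: binary_add, Base and Child are hypothetical names, and details such as sequence concatenation slots are ignored.

```python
def binary_add(x, y):
    """Approximate how Python evaluates x + y (simplified sketch)."""
    tx, ty = type(x), type(y)
    # Special methods are looked up on the type, not the instance.
    add = getattr(tx, "__add__", None)
    # The reflected method is not tried when both operands share a type.
    radd = getattr(ty, "__radd__", None) if ty is not tx else None

    # y goes first only when its type is a subclass of x's type that
    # supplies its own __radd__.
    y_first = (
        radd is not None
        and issubclass(ty, tx)
        and radd is not getattr(tx, "__radd__", None)
    )
    if y_first:
        result = radd(y, x)
        if result is not NotImplemented:
            return result
    if add is not None:
        result = add(x, y)
        if result is not NotImplemented:
            return result
    if radd is not None and not y_first:
        result = radd(y, x)
        if result is not NotImplemented:
            return result
    raise TypeError(
        f"unsupported operand type(s) for +: {tx.__name__!r} and {ty.__name__!r}"
    )


class Base:
    def __add__(self, other):
        return "Base.__add__"


class Child(Base):
    def __radd__(self, other):
        return "Child.__radd__"


print(binary_add(1, 2.5))             # 3.5, via float.__radd__
print(binary_add(Base(), Child()))    # Child.__radd__
print(binary_add(Child(), Base()))    # Base.__add__
```

Each operator method only has to handle the cases it understands and return NotImplemented otherwise; the dispatch logic lives in one place.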
 
Chris Angelico

Yes. And this is a problem why?

Obviously it would mean that the semantics of hasattr will be different
than they are now, but it's still a coherent set of semantics.

Coherent perhaps, but in direct opposition to the OP's statement about
how hasattr should return False even if there's a global to be found.

A coherent meaning for this kind of thing would almost certainly not
be possible within the OP's requirements, although it's entirely
possible something sensible could be put together.

(By the way, would (3).squared return a curried reference to squared
as of when you looked it up, or would it return something that
late-binds to whatever 'squared' is in scope as of when you call it?
If the latter, then hasattr would have to always return True, and
getattr would have to return something that does the late-bind lookup
and turns NameError into AttributeError.)
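
To make the two options concrete, here is a small sketch. A plain dict named scope stands in for "whatever is in scope", and early_bound and late_bound are hypothetical helpers, so the missing-name case shows up as KeyError rather than NameError before being converted.

```python
import functools

scope = {"squared": lambda x: x * x}  # stand-in for the caller's namespace


def early_bound(obj, name):
    # Look the function up once, at attribute-access time.
    return functools.partial(scope[name], obj)


def late_bound(obj, name):
    # Defer the lookup until the call itself.
    def call(*args, **kwargs):
        try:
            func = scope[name]
        except KeyError:
            # A missing name surfaces as AttributeError at call time.
            raise AttributeError(name) from None
        return func(obj, *args, **kwargs)
    return call


early = early_bound(3, "squared")
late = late_bound(3, "squared")

scope["squared"] = lambda x: x ** 3  # rebind after the lookup

print(early())  # 9: still the function found at lookup time
print(late())   # 27: whatever 'squared' means at call time
```

Note that late_bound never fails at lookup time, even for a name that does not exist, which is why hasattr would have to return True for everything under that design.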

ChrisA
 
Marko Rauhamaa

Steven D'Aprano said:
In fact, what's the point of having the duality?
x < y <==> x.__lt__(y)

[...]

Consider x + y. What happens?

#1 First, Python checks whether y is an instance of a *subclass* of x's
class. If so, y gets priority, otherwise x gets priority.

#2 If y gets priority, y.__radd__(x) is called, if it exists. If it
returns something other than NotImplemented, we are done.

#3 However if y.__radd__ doesn't exist, or it returns NotImplemented,
then Python continues as if x had priority.

#4 If x has priority, then x.__add__(y) is called, if it exists. If it
returns something other than NotImplemented, we are done.

#5 However if it doesn't exist, or it returns NotImplemented, then
y.__radd__(x) is called, provided it wasn't already tried in step #2.

#6 Finally, if neither object has __add__ or __radd__, or both return
NotImplemented, then Python raises TypeError.

In a word, Python has predefined a handful of *generic
functions/methods*, which are general and standard in GOOPS (Guile's
object system):

(define-method (+ (x <string>) (y <string>)) ...)
(define-method (+ (x <matrix>) (y <matrix>)) ...)
(define-method (+ (f <fish>) (b <bicycle>)) ...)
(define-method (+ (a <foo>) (b <bar>) (c <baz>)) ...)

<URL: http://www.gnu.org/software/guile/manual/html_node/Methods-and-Generic-Functions.html>


Marko
 
jongiddy

This could be solved, though, by having a completely different symbol
that means "the thing on my left is actually the first positional
parameter in the function call on my right", such as in your example:

plus(1, 2) | divide(2)

This would be absolutely identical to:

divide(plus(1, 2), 2)

Maybe you could even make it so that:

plus(1, 2) x=| divide(y=2)

is equivalent to

divide(x=plus(1, 2), y=2)

for the sake of consistency, and to allow the pipeline to inject
something someplace other than the first argument.

I'm not sure whether it'd be as useful in practice, though. It would
depend partly on the exact syntax used. Obviously the pipe itself
can't be used as it already means bitwise or, and this needs to be
really REALLY clear about what's going on. But a data-flow notation
would be of value in theory, at least.

Perhaps a pipeline symbol plus an insertion marker would work better in Python:

plus(1, 2) ~ divide(x=^, y=2)

f.readlines() ~ map(int, ^) ~ min(^, key=lambda n: n % 10).str() ~ base64.b64encode(^, b'?-') ~ print(^)

Stdio.read_file("foo.jpg") ~ Image.JPEG_decode(^).autocrop().rotate(0.5).grey() ~ Image.PNG_encode(^) ~ Stdio.write_file("foo.png", ^)
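
The same idea can be tried in current Python with a sentinel object in place of ^ and a helper function in place of ~. This is only a sketch: HOLE, step and pipeline are hypothetical names, and the method call in the middle of the second example is replaced by a plain str() step.

```python
HOLE = object()  # hypothetical stand-in for the proposed ^ marker


def step(func, *args, **kwargs):
    """Build one pipeline stage; HOLE marks where the incoming value goes."""
    def run(value):
        filled_args = [value if arg is HOLE else arg for arg in args]
        filled_kwargs = {
            key: value if arg is HOLE else arg for key, arg in kwargs.items()
        }
        return func(*filled_args, **filled_kwargs)
    return run


def pipeline(value, *steps):
    """Feed value through each stage in left-to-right order."""
    for run in steps:
        value = run(value)
    return value


def divide(x, y):
    return x / y


print(pipeline(1 + 2, step(divide, x=HOLE, y=2)))  # 1.5

lines = ["13\n", "27\n", "42\n"]  # stands in for f.readlines()
smallest_last_digit = pipeline(
    lines,
    step(map, int, HOLE),
    step(min, HOLE, key=lambda n: n % 10),
    step(str, HOLE),
)
print(smallest_last_digit)  # 42
```

The explicit marker makes it obvious which argument receives the piped value, at the cost of wrapping every stage in step().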
 
Ian Kelly

On Jun 8, 2014 9:56 PM, "Steven D'Aprano" wrote:
Yes. And this is a problem why?

Earlier in this thread I pointed out that returning True creates problems
for duck typing. But I'm now convinced that's preferable to making getattr
and hasattr inconsistent.
 
Chris Angelico

In what language does "often" (\ˈȯ-fən, ÷ˈȯf-tən\) sound like
"orphan" (\ˈȯr-fən\)?

('oh-fan', 'ov-ten' or even 'off-ten' versus 'or-fen')

Language? English. :) Your point is more about accent, and if you
listen to some of the accents around the English countryside, you'll
know there are quite a few of them (which is a plot point in My Fair
Lady). Personally, I think the Midlands accents are rather delightful.
I've spent a month at a time in Buxton (one of the highest-altitude
towns in England) several times, always a pleasant time. But anyway.
The story is set in Penzance, in Cornwall, and in the typical Cornish
accent, the two words are fairly much alike. That said, though, some
performers (either through sloppiness or for deliberate comic effect)
do play around with that joke; but it's not hard to establish accents
that make the joke work.
No wonder Gilbert and Sullivan had difficulty seeking success after HMS
Pinafore...


Hmm, that's a good idea! I wonder whether we could use it... you know,
same as so many other people already have. (Starship Pinafore
productions are actually fairly common) Wonder what's in our
calendar... ooh look, later on this year!

http://gilbertandsullivan.org.au/index.php?option=com_content&view=article&id=670&Itemid=565

Anyone want to come and join us? I'll be up in the lighting box every night.

ChrisA
 
jongiddy

So, just to summarise the discussion:

There was some very mild support for readable pipelines, either using UFCS or an alternative syntax, but the Pythonic way to make combinations of function and method applications readable is to assign to variables over multiple lines. Make the code read down, not across.

The idea that a class method could override a function using UFCS didn't get much traction. From the Zen of Python, "explicit is better than implicit" means no differences in behaviour depending on context. The fact that x.y and x.__getattr__ may behave differently under UFCS is also a problem. Since hasattr testing and AttributeError catching are both commonly used now, this could cause real problems, so could probably not be changed until Python 4.

Finally, Gilbert & Sullivan are definitely due a revival.
 
