Multi-dimensional list initialization

Greg Ewing

That said, losing:
[0] * (2, 3) == [0] * [2, 3]
would mean losing duck-typing in general.

There are precedents for this kind of thing; the
string % operator treats tuples specially, for
example.

I don't think it's all that bad if you regard
the tuple as effectively part of the syntax.
 
Steven D'Aprano

Did you also miss MRAB's post above? It made sense to me.

You mean MRAB's post which I replied to?

Yes, I must have missed it :p

But seriously, no I didn't miss it. He doesn't give any evidence that
there is a difference between "call by ..." and "pass by ..." when
talking about binding arguments to formal parameters. His objection to
"call by ..." is that it doesn't make it clear that the evaluation rules
apply to simple binding/assignment as well as calling functions.
 
Steven D'Aprano

On 11/07/2012 01:01 PM, Ian Kelly wrote: [...]
Anyway, your point was to suggest that people would not be confused by
having list multiplication copy lists but not other objects, because
passing lists into functions as parameters works in basically the same
way.

Not quite, although I wasn't clear: the variable passed in is by
*value*, in contradistinction to the list, which is by reference. Python
does NOT always default to copy by reference *when it could*; that's the
point.

It isn't clear to me whether you are describing what you think Python
*actually* does, versus what you wish it *would* do, or what it *could*
do in some abstract hypothetical sense.

It certainly is not true that Python passes "the variable" by value, and
lists "by reference". Arguments are not passed to functions either by
value or by reference.

There is a trivial test for pass-by-value semantics: does the value get
copied? We can see that Python does not copy arguments:

py> def test(x):
...     print(id(x))
...
py> spam = []
py> print(id(spam)); test(spam)
3071264556
3071264556

The argument is not copied, therefore Python is not pass-by-value.

There is also an easy test for pass-by-reference semantics: can you write
a procedure which, given two variables, swaps the contents of the
variables? In Pascal, that is trivial.

procedure swap(var a: integer; var b: integer);
var
  tmp: integer;
begin
  tmp := a;
  a := b;
  b := tmp;
end;

swap(x, y);

(if I've remembered my Pascal syntax correctly).


In Python, you can swap two values like this:

a, b = b, a

but that's not sufficient. The test is to do the swap inside a function:

def swap(a, b):
    return b, a

b, a = swap(a, b)

But that fails too, since the assignment is still taking place outside
the function.

It turns out that there is no way in Python to write such a swap
function. Tricks such as passing the variable names as strings, then
using exec, are hacks and don't count. Python is not pass by reference
either.
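A minimal sketch of the failed swap test described above, assuming Python 3. The names x and y are only illustrative.

```python
def swap(a, b):
    # Rebinding the local names a and b does not touch the caller's names.
    a, b = b, a


x, y = 1, 2
swap(x, y)
# x and y still refer to the objects they referred to before the call.
print(x, y)
```

The function receives the same objects the caller holds, but it has no handle on the caller's variables, so there is nothing inside swap that can exchange them.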


Hence the programmer has to remember in foo( x,y ), the names x and y
when assigned to -- *DONT* affect the variables from which they came.
But any object internals do affect the objects everywhere.

Ummm.... yes?

The programmer has to remember Python's execution model in order to
correctly predict what Python will do. What's your point?
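The distinction being argued over can be sketched as follows. The function names rebind and mutate are hypothetical and chosen only for illustration.

```python
def rebind(items):
    # Assignment binds the local name to a new list; the caller is unaffected.
    items = [99]


def mutate(items):
    # Mutation changes the one object that both names refer to.
    items.append(99)


data = [1, 2]
rebind(data)
after_rebind = list(data)   # snapshot after the rebinding call
mutate(data)
after_mutate = list(data)   # snapshot after the mutating call
print(after_rebind, after_mutate)
```

Both calls use the same parameter-binding rule; only what the function body does with the name differs.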

A single exception exists;

There is no such exception in Python. Python always uses the same
argument passing (parameter binding) semantics.
 
Andrew Robinson

Andrew, it appears that your posts are being eaten or rejected by my
ISP's news server, because they aren't showing up for me. Possibly a
side-effect of your dates being in the distant past?

Date has been corrected since two days ago. It will remain until a
reboot....
Ignorance, though, might be bliss...

Every now and again I come across somebody who tries to distinguish
between "call by foo" and "pass by foo", but nobody has been able to
explain the difference (if any) to me.

I think the "Call by foo" came into vogue around the time of C++; e.g.
it's in books like C++ for C programmers. I never saw it used before
then, so I *really* don't know for sure...

I know "Pass by value" existed all the way back to the 1960's. I see
"pass by" in my professional books from those times and even most newer
ones; but I only find "Call by value" in popular programming books of
more recent times. (Just my experience) So -- I "guess" the reason is
that when invoking a subroutine, early hardware often had an assembler
mnemonic by the name "call".

See for example: Intel x86 hardware books from the 1970's;

Most early processors (like the MC6809E, and 8080) allow both direct and
indirect *references* to a function (C would call them function
pointers); So, occasionally early assembly programs comment things like:
"; dynamic VESA libraries are called by value in register D."; And they
meant that register D is storing a function call address from two or
more VESA cards. It had little to do with the function's parameters
(which might be globals anyway). (It's procedural dynamic binding!)

Today, I don't know for sure -- so I just don't use it.
"pass" indicates a parameter of the present call; but not the present
call itself.
 
Ian Kelly

OK, and is this a main use case? (I'm not saying it isn't; I'm asking.)

I have no idea what is a "main" use case.

There is a special keyword which signals the new type of comprehension. A
normal comprehension would say e.g. '[ foo for i in xrange ]'; but when the
'for i in' is reduced to a specific keyword such as 'ini' (instead of
problematic 'in') the caching form of list comprehension would start.

FYI, the Python devs are not very fond of adding new keywords. Any
time a new keyword is added, existing code that uses that word as a
name is broken. 'ini' is particularly bad, because 1) it's not a
word, and 2) it's the name of a common type of configuration file and
is probably frequently used as a variable name in relation to such
files.

So, then, just like a comprehension, the interpreter will begin to
evaluate the code from the opening bracket '['. But anything other than a
function/method will raise a type error (people might want to change that,
but it's safe).

The interpreter then caches all functions/initialiser methods it comes into
contact with.
Since every function/method has a parameter list (even if empty), the
interpreter would evaluate the parameter list on the first pass through the
comprehension, and cache each parameter list with its respective function.

When the 'ini' keyword is parsed a second time, Python would then evaluate
each cached function on its cached parameter list; and the result would be
stored in the created list.
This cached execution would be repeated as many times as is needed.

Now, for your example:

values = zip(samples, times * num_groups)
if len(values) < len(times) * num_groups:
    # raise an error

Might be done with:

values = zip( samples, [ lambda:times, ini xrange(num_groups) ] )

if len(values) < len(times) * num_groups

The comma after the lambda is questionable, and this construction would be
slower since lambda automatically invokes the interpreter; but it's correct.

How is this any better than the ordinary list comprehension I already
suggested as a replacement? For that matter, how is this any better
than list multiplication? Your basic complaint about list
multiplication as I understand it is that the non-copying semantics
are unintuitive. Well, the above is even less intuitive. It is
excessively complicated and almost completely opaque. If I were to
come across it outside the context of this thread, I would have no
idea what it is meant to be doing.

As an aside, how would you do the lambda inside a list comprehension?

As a general rule, I wouldn't. I would use map instead.

[lambda:6 for i in xrange(10) ] # Nope.

That constructs a list of 10 functions and never calls them. If you
want to actually call the lambda, then:

[(lambda: 6)() for i in range(10)]

or:

map(lambda i: 6, range(10))

But note that the former creates 10 equivalent functions and calls
each of them once, whereas the latter creates one function and calls
it ten times.
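The three forms discussed above, side by side, as a small sketch assuming Python 3 (so range rather than xrange, and map wrapped in list):

```python
# Ten function objects are created and stored; none of them is called.
funcs = [lambda: 6 for i in range(10)]

# Ten function objects are created, and each is called once.
called = [(lambda: 6)() for i in range(10)]

# One function object is created and called ten times.
mapped = list(map(lambda i: 6, range(10)))

print(len(funcs), called == mapped)
```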
Because it's an arbitrary rule which operates differently than the
traditional idea shown in python docs?

slice.indices() is *for* (QUOTE)"representing the set of indices specified
by range(start, stop, step)"
http://docs.python.org/2/library/functions.html#slice

slice.indices() has nothing to do with it. Indexing a sequence and
calling the .indices() method on a slice are entirely different
operations. The slice.indices method is a utility method meant to be
called by __getitem__ implementations when doing slicing, not an
implementation of indexing. When a sequence is indexed, there is no
slice. That method is not related in any way to the semantics of
indexing a sequence.
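To illustrate the role described above, here is a sketch of a __getitem__ that uses slice.indices() only on the slicing path. The Evens class is hypothetical and exists purely for this example.

```python
class Evens:
    """Hypothetical read-only sequence of the first n even numbers."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        if isinstance(key, slice):
            # slice.indices() normalises start/stop/step against the length.
            start, stop, step = key.indices(self.n)
            return [2 * i for i in range(start, stop, step)]
        # Plain integer indexing: no slice object is involved at all.
        if not -self.n <= key < self.n:
            raise IndexError("index out of range")
        return 2 * (key % self.n)


evens = Evens(5)
print(evens[1], evens[-1], evens[1:4], evens[::-2])
```

The integer branch never touches a slice, which is the point being made: indexing does not go through slice.indices().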
 
Ian Kelly

OK: Then copy by reference using map....:

values = zip( map( lambda:times, xrange(num_groups) ) )
if len(values) < len(times) * num_groups ...

Done. It's clearer than a list comprehension and you still really don't
need a list multiply.

That is not equivalent to the original. Even had you not omitted some parts:

values = zip(samples, map(lambda i: times, range(num_groups)))

This still has the problem that map returns a list of num_groups
elements, each of which is times. The desired value to be passed into
zip is a *single* sequence containing len(times) * num_groups
elements. This is easily handled by list multiplication, but not so
easily by map or by a single list comprehension. Looking back at the
'ini' solution you proposed before, I see that this also would be a
problem there. Fixing the above, it would have to be something like:

values = zip(samples, reduce(operator.add,
                             map(lambda i: times, range(num_groups)), []))

Or from how I understand the 'ini' syntax to work:

values = zip(samples, reduce(operator.add,
                             [lambda: times, ini xrange(num_groups)], []))

Which brings to mind another point that I want to get to in a moment.
But when I said that I would use map instead, I meant that *if* the
body of the list comprehension is just a function application, then I
would prefer to use map over the list comprehension. But in the above
I see no benefit in using a lambda in the first place.
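A runnable sketch of the comparison, assuming Python 3 (where reduce lives in functools and zip is lazy). The sample data is made up, and the itertools.chain variant is an extra alternative that does not appear in the thread.

```python
import operator
from functools import reduce
from itertools import chain

times = [0.0, 0.5, 1.0]
num_groups = 2
samples = list(range(6))  # hypothetical sample data

# Plain list multiplication: one flat sequence of len(times) * num_groups.
by_multiply = times * num_groups

# The map/reduce construction from the discussion, flattened by concatenation.
by_reduce = reduce(operator.add, map(lambda i: times, range(num_groups)), [])

# Assumed alternative: flatten lazily with itertools.chain.
by_chain = list(chain.from_iterable(times for _ in range(num_groups)))

values = list(zip(samples, by_multiply))
print(by_multiply == by_reduce == by_chain, len(values))
```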

Getting back to that other point, notice what we ended up doing in
both of those constructions above: repeated list concatenation as a
substitute for multiplication. In fact, when we multiply (aList * 5),
this should be the equivalent of (aList + aList + aList + aList
+aList), should it not? Clearly, however, there should be no implicit
copying involved in mere list concatenation. For one thing, if the
user wants to concatenate copies, that is quite easily done
explicitly: (aList[:] + aList[:]) instead of (aList + aList). For
another, list concatenation is less likely to be used for an
initialization process. If list multiplication were to copy nested
lists, then, this would break the intuitive notion that list
multiplication is equivalent to repeated list concatenation.
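The equivalence can be checked directly. In this sketch the names inner and a_list are illustrative only.

```python
inner = [0]
a_list = [inner, 1]

multiplied = a_list * 3
concatenated = a_list + a_list + a_list

# Both results hold references to the same nested list; nothing is copied.
same_value = multiplied == concatenated
shared = multiplied[0] is multiplied[2] is concatenated[0] is inner

# Mutating the nested list is therefore visible through every reference.
inner.append(7)
print(same_value, shared, multiplied[2])
```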
Yes, but you're very blind to history and code examples implementing the
slice operation.
slice usually depends on index; index does not depend on slice.
Slice is suggested to be implemented by multiple calls to single indexes in
traditional usage and documentation.

....and then by composing the elements located at those indexes into a
subsequence.

The xrange(,,)[:] implementation breaks the tradition, because it doesn't
call index multiple times; nor does it return a result equivalent or identical
to doing that.

Whether any given __getitem__ slicing implementation recursively calls
__getitem__ with a series of indexes or not is an implementation
detail. If it were possible to index a range object multiple times
and then stuff the results into another range object, then the slicing
result would be equivalent. The only reason it is not is that you
cannot construct a range object in that fashion.

I think that what you're expecting is that range(5)[:] should return a
list in Python 3 because it returns a list in Python 2. This does not
represent a change in slicing behavior -- in fact, all you got by
slicing an xrange object in Python 2 was a TypeError. This represents
an intentional break in backward compatibility between Python 2 and
Python 3, which was the purpose of Python 3 -- to fix a lot of
existing warts in Python by breaking them all at once, rather than
progressively over a long string of versions. Users porting their
scripts from Python 2 to Python 3 are advised to replace "range(...)"
with "list(range(...))" if what they actually want is a list, and I
believe the 2to3 tool does this automatically. Once the range object
is converted to a list, there is no further break with Python 2 --
slicing a list gives you a list, just as it always has.

In a nutshell, yes: range(...)[:] produces a different result in
Python 3 than in Python 2, just as it does without the slicing
operation tacked on. It was never intended that scripts written for
Python 2 should be able to run in Python 3 unchanged without careful
attention to detail.
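In Python 3 the behaviour described above looks like this (a short sketch; the variable names are arbitrary):

```python
r = range(5)

# Slicing a range gives another range, not a list.
sliced = r[1:4]

# Converting to a list first restores the Python 2 style result.
as_list = list(r)[1:4]

print(type(sliced).__name__, list(sliced), as_list)
```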
 
Chris Angelico

But I've seen this scattered through code:

x := x - x - x

Can you enlighten us as to how this is better than either:
x := -x
or
x := 0 - x
? I'm not seeing it. And I'm not seeing any nonnumeric that would
benefit from being subtracted from itself twice (strings, arrays,
sets, you can subtract them from one another but not usefully more
than once).

ChrisA
 
Steven D'Aprano

Can you enlighten us as to how this is better than either:
x := -x
or
x := 0 - x
? I'm not seeing it.

I'm hoping that Mark intended it as an example of crappy code he has
spotted in some other language rather than a counter-example of something
you would do.

To be pedantic... there may very well be some (rare) cases where you
actually do want x -= x rather than just x = 0. Consider the case where x
could be an INF or NAN. Then x -= x should give x = NAN rather than zero.
That may be desirable in some cases.

At the very least, the compiler should NOT optimize away x = x - x to
x = 0 if x could be a float, complex or Decimal.
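A quick check of the floating-point case, assuming IEEE 754 floats as used by CPython:

```python
import math

inf = float("inf")
nan = float("nan")

finite_result = 5.0 - 5.0   # an ordinary finite value minus itself
inf_result = inf - inf      # infinity minus itself
nan_result = nan - nan      # NaN minus itself
z = complex(inf, 0.0)
complex_result = z - z      # the real part is inf - inf

print(finite_result, inf_result, nan_result, complex_result)
```

Only the finite case yields zero, which is why rewriting x - x as 0 is not a safe transformation for these types.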

And I'm not seeing any nonnumeric that would
benefit from being subtracted from itself twice (strings, arrays, sets,
you can subtract them from one another but not usefully more than once).

How do you subtract strings?
 
Chris Angelico

I'm hoping that Mark intended it as an example of crappy code he has
spotted in some other language rather than a counter-example of something
you would do.

Ohh. Yeah, that figures. Huh.

To be pedantic... there may very well be some (rare) cases where you
actually do want x -= x rather than just x = 0. Consider the case where x
could be an INF or NAN. Then x -= x should give x = NAN rather than zero.
That may be desirable in some cases.

At the very least, the compiler should NOT optimize away x = x - x to
x = 0 if x could be a float, complex or Decimal.

Yep. In the specific case of integers, though, and in the specific
instance of CPU registers in assembly language, it's reasonable to
optimize it the *other* way - MOV reg,0 is a one-byte opcode and 1, 2,
or 4 bytes of immediate data, while SUB reg,reg (or XOR reg,reg) is a
two-byte operation regardless of data size. But that's
microoptimization that makes, uhh, itself-subtracted-from-itself sense
in Python.

How do you subtract strings?

The same way you subtract sets. Same with arrays. Python doesn't do
either, but Python also doesn't do the ":=" operator that the example
code demonstrated, so I didn't assume Python.

Pike v7.8 release 700 running Hilfe v3.5 (Incremental Pike Frontend)
> "Hello, world!"-"l";
(1) Result: "Heo, word!"
> ({1,2,3,3,2,3,1,2,1})-({2});
(2) Result: ({ /* 6 elements */
1,
3,
3,
3,
1,
1
})

Python spells it differently:
>>> "Hello, world!".replace("l", "")
'Heo, word!'

Not sure how to do array subtraction other than with filter:
>>> list(filter(lambda x: x!=2,[1,2,3,3,2,3,1,2,1]))
[1, 3, 3, 3, 1, 1]
But there's probably a way (list.remove only takes out the first
occurrence, so it's not equivalent).

In any case, subtracting something from _itself_ is only going to give
you an empty string, array, set, or whatever, and doing so a second
time is going to achieve nothing. Hence my comment.
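For comparison, here are rough Python spellings of the three "subtractions", sketched with a list comprehension in place of filter. The variable names are illustrative.

```python
# String "subtraction": remove every occurrence of a substring.
text = "Hello, world!".replace("l", "")

# List "subtraction": keep every element that is not equal to 2.
items = [x for x in [1, 2, 3, 3, 2, 3, 1, 2, 1] if x != 2]

# Set subtraction is built in.
remaining = {1, 2, 3} - {2}

# Subtracting a set from itself leaves nothing, and doing so again changes nothing.
s = {1, 2, 3}
once = s - s
twice = once - s

print(text, items, remaining, once, twice)
```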

But poor code we will always have with us, to paraphrase the Gospel of Matthew.

ChrisA
 
Mark Lawrence

I'm hoping that Mark intended it as an example of crappy code he has
spotted in some other language rather than a counter-example of something
you would do.

Correct, CORAL 66 and pointed out to me by a colleague when another team
member had resigned.

To be pedantic... there may very well be some (rare) cases where you
actually do want x -= x rather than just x = 0. Consider the case where x
could be an INF or NAN. Then x -= x should give x = NAN rather than zero.
That may be desirable in some cases.

Interesting what comes up when we get chatting here. I hope we don't
get punished for going off topic :)

At the very least, the compiler should NOT optimize away x = x - x to
x = 0 if x could be a float, complex or Decimal.

X was an int so almost certainly optimised away by the SDL compiler on
VMS of 1986 or 1987.
 
rusi

I'm hoping that Mark intended it as an example of crappy code he has
spotted in some other language rather than a counter-example of something
you would do.

To be pedantic... there may very well be some (rare) cases where you
actually do want x -= x rather than just x = 0. Consider the case where x
could be an INF or NAN. Then x -= x should give x = NAN rather than zero.
That may be desirable in some cases.

In x86 assembler
mov ax, 0
is 4 bytes
sub ax, ax
is 2
and therefore better (at least for those brought up on Peter Norton);
the most common being
xor ax, ax
 
Chris Angelico

In x86 assembler
mov ax, 0
is 4 bytes

Three bytes actually, B8 00 00 if my memory hasn't failed me. BA for
DX, B9 ought to be CX and BB BX, I think. But yes, the xor or sub is
two bytes and one clock.

ChrisA
 
Dennis Lee Bieber

Can you enlighten us as to how this is better than either:
x := -x
or
x := 0 - x

Of course, if one has a language that, for some reason, evaluates
right-to-left (APL, anyone), then

x := x - x - x

becomes

x := x - 0

<G>
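The two groupings can be spelled out with explicit parentheses. This is a sketch in Python, which itself evaluates subtraction left to right; x = 5 is an arbitrary value.

```python
x = 5

# Left-to-right grouping, as in most languages: (x - x) - x
left_to_right = (x - x) - x

# Right-to-left grouping, as in APL: x - (x - x), i.e. x - 0
right_to_left = x - (x - x)

print(left_to_right, right_to_left)
```

With left-to-right grouping the expression negates x; with right-to-left grouping it leaves x unchanged.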
 
Prasad, Ramit

Dennis said:
Of course, if one has a language that, for some reason, evaluates
right-to-left (APL, anyone), then

x := x - x - x

becomes

x := x - 0

Is that not the same as x:=-x?


~Ramit


 
