Timsort in Cpython

Z

Zachary Ware

I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c

So of all the functions in there, could somebody point to me which one is timsort?

Thanks, if anyone can help.
Alphonse23

Actually, it looks to me like it's several of them, but the main
function is here:
http://hg.python.org/cpython/file/default/Objects/listobject.c#l1902.

HTH,

Zach
 
R

Robert Kern

I'm currently trying to make sense of Python's Timsort function. From the wikipedia page I was told the algorithm is located somewhere here: http://hg.python.org/cpython/file/default/Objects/listobject.c

So of all the functions in there, could somebody point to me which one is timsort?

listsort()

http://hg.python.org/cpython/file/default/Objects/listobject.c#l1896

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
A

alphonse23

Hey guys,
Thanks for the quick reply! So why did they decide to call it listsort in the source instead? Why didn't they keep it as Timsort?

Well. I'm going to have a ton of fun trying to make sense of this.
 
R

Robert Kern

Hey guys,
Thanks for the quick reply! So why did they decide to call it listsort in the source instead? Why didn't they keep it as Timsort?

This was the first implementation of the algorithm. The algorithm was only
colloquially named "Timsort" after it was used in Python.

This the naming convention for the C implementation of builtin types' methods in
the Python codebase. The C implementation listsort() corresponds with the Python
method list.sort(). Similarly, listappend() is list.append(), listpop() is
list.pop(), etc. C.f.

http://hg.python.org/cpython/file/default/Objects/listobject.c#l2362

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
A

alphonse23

Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is pretty good.

Though, if anyone would be interested in helping me out further -- though by all means, I'm not lazy, I can figure it myself. But, I wanted to pass invariables into listsort and watch timsort work line by line in gdb.

listsort(PyListObject *self, PyObject *args, PyObject *kwds)

I've never worked with Cpython source before, but it looks like PyObject isjust some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.
 
M

mm0fmf

Answer: The lost context.

Question: What makes top-posted replies harder to read than bottom-posted?
 
A

alphonse23

sorry about that. I'm new to google groups. I'm trying to make sense of python's implementation of timsort through cpython: http://hg.python.org/cpython/file/default/Objects/listobject.c

I was replying to Terry Jan Reedy
http://hg.python.org/cpython/file/default/Objects/listsort.txt

is pretty clear (to me) for most of the basics.

Though, if anyone would be interested in helping me out further -- thoughby all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb.

listsort(PyListObject *self, PyObject *args, PyObject *kwds)

I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.



Answer: The lost context.



Question: What makes top-posted replies harder to read than bottom-posted?
variables into listsort and watch timsort work line by line in gdb.
 
D

Dennis Lee Bieber

sorry about that. I'm new to google groups. I'm trying to make sense of python's implementation of timsort through cpython: http://hg.python.org/cpython/file/default/Objects/listobject.c
Since you are new to GoogleGroups, if you can, run away from it as fast
as possible. While material may look okay on their system, it is
practically trashed when getting out into the real world. (Paragraphs
either come in as long lines with no wrapping [Usenet/Email convention is
for 80 character lines, and to allow for > quotes original text should wrap
around 72-75], or they end up double-spacing stuff that comes in following
the 80 character line length (GG is treating hard end-of-line as a
paragraph marker, and on quoting, adding a blank line between these
"paragraphs")

Either subscribe to the mailing list (and use a real mail client rather
than web-mail) or use a news reader; if your ISP doesn't provide an NNTP
server carrying comp.lang.python, it is available from Gmane as
gmane.comp.python.general (the mailing list and comp.lang.python are cross
linked, and Gmane shows the mailing list as if it were a Usenet news group)

And just another comment: preference on the group is "trim quoted
material and comment below the quote (or interspersed with the quotes)"...
The style created with M$ Outlook (Outlook goes out of its way to make it
impossible to trim/intersperse -- it considers quoted material as a
photocopy attached to the back of a new letter) in which one comments at
the top of the quoted material, and never trims to relevant material is
frowned upon.
 
I

Ian Kelly

Yes I've read it. Very interesting read. There are other resources too online that make it very clear, for instance the wikipedia articles is prettygood.

Though, if anyone would be interested in helping me out further -- thoughby all means, I'm not lazy, I can figure it myself. But, I wanted to pass in variables into listsort and watch timsort work line by line in gdb.

listsort(PyListObject *self, PyObject *args, PyObject *kwds)

I've never worked with Cpython source before, but it looks like PyObject is just some type of general strut.. I think anyway. How does python represent a list of ints in source? and what are the two second arguments for, assuming the first is the list strut.

A PyObject* generally references any Python object. The subtype
PyListObject* more specifically references a Python list. The above
signature corresponds to this Python function signature:

def listsort(self, *args, **kwds):

The first argument self is the list object to be operated on. The
second argument args is a Python tuple containing any other positional
arguments that were passed into the method. The third argument kwds
is a Python dict containing any keyword arguments that were passed
into the method.
 
S

sean.westfall

sorry about that. I'm new to google groups. I'm trying to make sense of python's implementation of timsort through cpython: http://hg.python.org/cpython/file/default/Objects/listobject.c

Since you are new to GoogleGroups, if you can, run away from it as fast

as possible. While material may look okay on their system, it is

practically trashed when getting out into the real world. (Paragraphs

either come in as long lines with no wrapping [Usenet/Email convention is

for 80 character lines, and to allow for > quotes original text should wrap

around 72-75], or they end up double-spacing stuff that comes in following

the 80 character line length (GG is treating hard end-of-line as a

paragraph marker, and on quoting, adding a blank line between these

"paragraphs")



Either subscribe to the mailing list (and use a real mail client rather

than web-mail) or use a news reader; if your ISP doesn't provide an NNTP

server carrying comp.lang.python, it is available from Gmane as

gmane.comp.python.general (the mailing list and comp.lang.python are cross

linked, and Gmane shows the mailing list as if it were a Usenet news group)



And just another comment: preference on the group is "trim quoted

material and comment below the quote (or interspersed with the quotes)"...

The style created with M$ Outlook (Outlook goes out of its way to make it

impossible to trim/intersperse -- it considers quoted material as a

photocopy attached to the back of a new letter) in which one comments at

the top of the quoted material, and never trims to relevant material is

frowned upon.

Understood. I just subscribed to the news group, though I'm using gmail. hopefully it will work better than google groups.

I'll keep all the advice here in mind whenever I post.
 
S

sean.westfall

A PyObject* generally references any Python object. The subtype

PyListObject* more specifically references a Python list. The above

signature corresponds to this Python function signature:



def listsort(self, *args, **kwds):



The first argument self is the list object to be operated on. The

second argument args is a Python tuple containing any other positional

arguments that were passed into the method. The third argument kwds

is a Python dict containing any keyword arguments that were passed

into the method.

The second argument takes the tuple which determines which varialble(key) to use the comparator on. And the third determines whether to return the list in ascending or descending order. But how do these PyObject* look in C?

How does a PyListObject* look declared in CPython. How would something likethis list = [2, 1, 5, 6, 10] look in CPython. Or what about something more complicated -- mlist = [('A',1),('B',2),('C',3)].

Thanks for the help.
 
I

Ian Kelly

The second argument takes the tuple which determines which varialble(key) to use the comparator on. And the third determines whether to return the list in ascending or descending order.

That's not exactly correct. The arguments are listed in that order,
but in fact the arguments to list.sort are keyword-only and cannot be
supplied positionally. So the "args" argument is expected to be an
empty tuple, and the "kwds" argument is a dict that contains both the
"key" and "reverse" arguments, if they were supplied.
But how do these PyObject* look in C?

It's a pointer to a struct that contains information like the class
and reference count of the object.
How does a PyListObject* look declared in CPython.

That's a pointer to a larger struct that shares the same header as the
PyObject* struct (which is basically how you do inheritance in C). It
adds information like the length and capacity of the list, plus a
pointer to an array of PyObject* that stores the contents of the list.
How would something like this list = [2, 1, 5, 6, 10] look in CPython. Or what about something more complicated -- mlist = [('A',1),('B',2),('C',3)].

To answer that question, you should really delve into the source and
see what these structs actually look like. But the first is going to
contain an array of five PyObject* values, each of which references an
int, while the second is going to contain an array of three PyObject*
values, each of which references a tuple.

I also recommend that you read the sections of the Python docs that
cover the C API, as those should help you understand how these structs
are normally handled.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,997
Messages
2,570,241
Members
46,831
Latest member
RusselWill

Latest Threads

Top