fast kdtree implementation for python 3?

_wolf

does anyone have a suggestion for a ready-to-go, fast kdtree
implementation for python 3.1 and up, for nearest-neighbor searches? i
used to use the one from numpy/scipy, but find it a pain to install
for python 3. also, i'm trying to wrap the code from http://code.google.com/p/kdtree/
using cython, but i'm still getting errors.

i wish stuff like kdtree, levenshtein edit distance and similar things
were available in the standard library.
 
Stefan Behnel

_wolf, 11.09.2010 20:15:
does anyone have a suggestion for a ready-to-go, fast kdtree
implementation for python 3.1 and up, for nearest-neighbor searches? i
used to use the one from numpy/scipy, but find it a pain to install
for python 3.

The latest release is supposed to work with Py3.

also, i'm trying to wrap the code from http://code.google.com/p/kdtree/
using cython, but i'm still getting errors.

If you subscribe to the cython-users mailing list, you can ask for help there.

i wish stuff like kdtree, levenshtein edit distance and similar things
were available in the standard library.

Since you're looking for an implementation, I guess you won't be the one
volunteering to maintain such code in the stdlib, would you?

Stefan
 
Marco Nawijn

does anyone have a suggestion for a ready-to-go, fast kdtree
implementation for python 3.1 and up, for nearest-neighbor searches? i
used to use the one from numpy/scipy, but find it a pain to install
for python 3. also, i'm trying to wrap the code from http://code.google.com/p/kdtree/
using cython, but i'm still getting errors.

i wish stuff like kdtree, levenshtein edit distance and similar things
were available in the standard library.

Do you know about the kdtree implementation in biopython? I don't know
if it is already available for Python 3, but for me it worked fine in
Python 2.X.

Marco
 
_wolf

Since you're looking for an implementation, I guess you won't be the one
volunteering to maintain such code in the stdlib, would you?

this is indeed a problem. i am probably not the right one for this
kind of task.

however, i do sometimes feel like the standard library carries too
much cruft from yesteryear. things like decent image and sound
manipulation, fuzzy string comparison, fast asynchronous HTTP serving
and requesting are definitely things i believe a 2010 programming
language with batteries included should strive to provide.

one avenue to realize this goal could be to prioritize the packages in
pypi. pypi is basically a very good idea and has made things like
finding and installing packages much easier. however, it is also
organized like a dump pile: there are ancient packages in there that
few people ever use.

i suggest adding aging, popularity, and community prioritization
(where people vote for essential and relevant solutions) to the
standard library as well as to pypi; in other words, unifying the two.
aging works in both directions: many old packages are good ones,
though they often display a crude form of inner organization;
conversely, a library not updated for a long time is unlikely to be a
good answer to your problem. batteries included is a very good idea,
but there are definitely some old and leaky batteries in there. sadly,
since the standard library modules are always included in each
installation, there are no figures on how much they are actually
needed. one would guess that, were such figures available, the aifc
library would come near the end of a ranked listing.

if the community manages, by download figures and voting, to classify
packages, a much clearer picture could emerge about the importance of
packages.

one could put python packages into:

* Class A: all those packages without which python would not run (such
as sys and site);

* Class B ('basics'): officially maintained packages;

* Class C ('community'): packages that are deemed important or
desirable and which are open for community contributions (to make it
likely they get updated soon enough whenever needed);

* Class D ('debut'): all packages submitted to pypi and favorably
tested, reviewed and found relevant by a certain number of people;

* Class E ('entry'): all packages submitted or found elsewhere on the
web, but not approved by the community;

* Class F ('failure'): all packages that were proposed but never
produced code, and all packages known to be not a good idea to use
(see discussion going on at http://pypi.python.org/pypi/python-cjson).
Class F can help people to avoid going down the wrong path when
choosing software.

well this goes far beyond the kdtree question. maybe i'll make it a
proposal for a PEP.
 
_wolf

Do you know about the kdtree implementation in biopython? I don't know
if it is already available for Python 3, but for me it worked fine in
Python 2.X.

i heard they use a brute-force approach and it's slow. those are just
rumors, admittedly. also, judging from the class list on
http://www.biopython.org/DIST/docs/api/module-tree.html, you can
probably tune in to the latest radio moscow news using it. way too
much for my needs, i just want to find the nearest neighbor on a
2D-plane. but thanks for the suggestion.
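
for the 2D nearest-neighbor case described here, a kd-tree is small enough to write in pure python with no third-party install. the following is only a minimal sketch, not code from any of the libraries mentioned in this thread: the names build_kdtree and nearest, the nested-tuple node layout and the sample points are all made up for illustration, and it makes no claim to match the speed of a compiled implementation.

```python
def build_kdtree(points, depth=0):
    """Build a 2D kd-tree as nested tuples (point, left, right).

    An empty subtree is represented by None. (Hypothetical layout,
    chosen only to keep the sketch short.)
    """
    if not points:
        return None
    axis = depth % 2  # alternate between x (0) and y (1)
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # the median point becomes the splitting node
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, point) for the stored point closest
    to target, or None if the tree is empty."""
    if node is None:
        return best
    point, left, right = node
    d2 = (point[0] - target[0]) ** 2 + (point[1] - target[1]) ** 2
    if best is None or d2 < best[0]:
        best = (d2, point)
    axis = depth % 2
    diff = target[axis] - point[axis]
    # search the side of the splitting line that contains the target first
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # the far side can only hold a closer point if the splitting line
    # itself is closer than the best match found so far
    if diff * diff < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best


# made-up sample data
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
d2, p = nearest(tree, (9, 2))
```

the search returns the squared distance to avoid a square root on every comparison; take the square root of d2 if the actual distance is needed. building the tree by re-sorting at every level is simple rather than optimal, which should be fine for a few thousand points but is an assumption about the workload.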
 
David Cournapeau

_wolf, 11.09.2010 20:15:

The latest release is supposed to work with Py3.

The latest release (1.5) of numpy does, but scipy does not support
python 3 yet, unless you use unreleased code. I think it is fair to
say it is still rough around the edges.

cheers,

David
 
