String similarity in Python

Achim Domma

Hi,

I have a list of, let's say, 100-1000 strings and want to know which one is
most similar to a reference string. Does somebody know of such a library for
Python? I don't need complicated scientific stuff; I think the simplest
ones will do for my data.

regards,
Achim
 
Peter Otten

Achim said:
I have a list of, let's say, 100-1000 strings and want to know which one is
most similar to a reference string. Does somebody know of such a library for
Python? I don't need complicated scientific stuff; I think the simplest
ones will do for my data.

I remembered an algorithm called Levenshtein distance, and Google came up
with something that looks promising:

http://trific.ath.cx/resources/python/levenshtein/

HTH,
Peter
 
anton muhin

Achim said:
Hi,

I have a list of, let's say, 100-1000 strings and want to know which one is
most similar to a reference string. Does somebody know of such a library for
Python? I don't need complicated scientific stuff; I think the simplest
ones will do for my data.

regards,
Achim

The get_close_matches function in the standard difflib module might be what
you are looking for.
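
A minimal sketch of how that suggestion could be applied, using only the standard library. The helper name most_similar and the sample strings are made up for illustration; cutoff=0.0 is an assumption chosen so that the best candidate is returned even when it is a poor match (the default cutoff is 0.6).

```python
from difflib import get_close_matches


def most_similar(reference, candidates):
    """Return the candidate closest to reference, or None if there are none.

    Hypothetical helper: n=1 keeps only the best match, and cutoff=0.0
    disables the similarity threshold so weak matches are still returned.
    """
    matches = get_close_matches(reference, candidates, n=1, cutoff=0.0)
    return matches[0] if matches else None


if __name__ == "__main__":
    words = ["ape", "apple", "peach", "puppy"]
    print(most_similar("appel", words))
```

For a few hundred to a thousand strings this should be fast enough; raising cutoff filters out candidates that are not really similar at all.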

regards,
anton.
 
vincent wehren

| Hi,
|
| I have a list of, let's say, 100-1000 strings and want to know which one is
| most similar to a reference string. Does somebody know of such a library for
| Python? I don't need complicated scientific stuff; I think the simplest
| ones will do for my data.
|
| regards,
| Achim
|
|

http://trific.ath.cx/resources/python/levenshtein/

It lets you calculate the Levenshtein distance as well as a similarity ratio
based on it, allowing you to "tweak" your results. You can use the source
either as a C app or as a C/Python extension module.

Getting it to do what you want probably won't take you more than a few minutes...
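
For reference, here is a rough pure-Python sketch of the Levenshtein distance itself, i.e. the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. It does not use the C extension linked above, whose API is not shown in this thread; the function names levenshtein and closest are made up for illustration.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insert, delete, substitute cost 1 each)."""
    # Keep b as the shorter string so each row stays as small as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest(reference, candidates):
    """Hypothetical helper: the candidate with the smallest distance."""
    return min(candidates, key=lambda s: levenshtein(reference, s))


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(closest("python", ["pythn", "java", "perl"]))
```

A compiled extension will be much faster, but for 100-1000 short strings a pure-Python version like this is likely to be adequate.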

Regards

Vincent Wehren
 
Luca Montecchiani
