Martin Marcher
Hello,
Is there something like a standard full-text search engine for Python?
I'm thinking of the Python equivalent of what Lucene is for Java or
Ferret is for Rails. Preferably something that isn't just a clone of
one of those, but rather something with a Python-friendly API.
Things I'd like to have:
* support for different languages (it seems most FTS engines only
handle English)
* the ability to provide my own identifier (the filename if I index
files in the filesystem, an ID if the content lives in a database, or
whatever applies)
* the ability to pass in just some (user-defined) keywords along with
the actual content (as a string, a list of strings, or whatever) and to
retrieve results by searching by keyword
* a priority that can be assigned to different fields
(like field: title(priority=10, content="My Draft"),
keywords(priority=50, list_of_keywords))
Unnecessary:
* built-in parsing of different file formats
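The wishlist above could look roughly like this in code. This is only a toy sketch to show the shape of the API, not an existing library: TinyIndex, add() and search() are made-up names, the scoring is a plain sum of field priorities, and the only "language support" is Unicode-aware tokenizing (no stemming or stop words).

```python
from collections import defaultdict
import re


class TinyIndex:
    """Toy in-memory index; hypothetical API, not a real library."""

    def __init__(self, field_weights):
        # Maps a field name to its priority, e.g. {"title": 10}
        self.field_weights = dict(field_weights)
        # term -> {doc_id: accumulated weight}
        self.postings = defaultdict(lambda: defaultdict(float))

    @staticmethod
    def tokenize(text):
        # \w is Unicode-aware in Python 3, so non-English words are kept
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, **fields):
        # doc_id is any hashable identifier: a filename, a database ID, ...
        for name, value in fields.items():
            weight = self.field_weights.get(name, 1)
            # Accept a single string or a list of strings
            texts = [value] if isinstance(value, str) else value
            for text in texts:
                for term in self.tokenize(text):
                    self.postings[term][doc_id] += weight

    def search(self, query):
        # Sum the weights of every matching term, best score first
        scores = defaultdict(float)
        for term in self.tokenize(query):
            for doc_id, weight in self.postings.get(term, {}).items():
                scores[doc_id] += weight
        return sorted(scores, key=lambda d: (-scores[d], str(d)))


index = TinyIndex({"title": 10, "keywords": 50})
index.add("/tmp/draft.txt", title="My Draft", keywords=["python", "search"])
index.add(42, title="Python notes", keywords=["draft"])
print(index.search("draft"))
```

Here a hit in keywords counts for more than a hit in title, because keywords was given the higher priority.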
The "standard" I'm referring to would be something with a large and
active user base. Just as WSGI is _the_ thing to refer to when doing
web apps, $FTS-Engine should be _the_ engine to refer to for full-text
search.
Any hints?