Search Function with XML Results

rick.huby

This may not be the right place to post this - but if anyone can help
that would be great.

I want to add an Accessible search function to our website - the only
way I can do this is to find a search system that can return its
results as an XML document.

I am currently using FreeFind.com which is a great search system but
only returns the data in HTML (really old HTML too).

Another one I have found is FusionBot - this DOES return XML, but I have
no way of excluding parts of the page (e.g. the navigation -
I don't want every page to come back as a match for 'Contact' just because
it is a word in the nav).

Again - FusionBot offers a solution to this (by using index and
noindex tags) but these would make the page fail Accessibility checks.

I have been looking into this for weeks now - does anyone have any
alternatives or suggestions? We want the search function to be a part
of our site - not an external thing like Google Site Search.

Thanks,

Rick Huby
www.e-connected.com
 
Malte

Sounds as if FusionBot is your best bet. Can't you run the XML file
through some XSL that filters out what you want?
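
A rough sketch of that kind of post-filtering, done here in Python rather than XSL. The result format (`<results>`, `<result>`, `<url>`, `<title>`, `<snippet>`) and the function name `filter_results` are made up for illustration; FusionBot's real XML schema is not shown in this thread, so treat this as a pattern only. It keeps a result only when the search term appears in its title or snippet.

```python
import xml.etree.ElementTree as ET

# Hypothetical result document; the real feed will use different element names.
SAMPLE = """<results>
  <result><url>/contact.html</url><title>Contact us</title>
    <snippet>Contact the team by phone or email.</snippet></result>
  <result><url>/about.html</url><title>About</title>
    <snippet>We are an accessible design agency.</snippet></result>
</results>"""


def filter_results(xml_text, term):
    """Return (url, title) pairs whose title or snippet mentions the term."""
    root = ET.fromstring(xml_text)
    term = term.lower()
    kept = []
    for result in root.iter("result"):
        title = result.findtext("title", default="")
        snippet = result.findtext("snippet", default="")
        if term in title.lower() or term in snippet.lower():
            kept.append((result.findtext("url", default=""), title))
    return kept


print(filter_results(SAMPLE, "contact"))
```

This only helps if the snippet is drawn from the page body rather than the navigation, which is an assumption about the search service.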
 
Nick Kew

rick.huby said:
I have been looking into this for weeks now - does anyone have any
alternatives or suggestions? We want the search function to be a part
of our site - not an external thing like Google Site Search.

Take whatever works best for the search function. If it produces
grotty markup, then postprocess that through a suitable filter.
mod_publisher gives you the maximum flexibility with both HTML
and XML, while mod_accessibility is specifically geared to
empowering your users in an HTML4 or XHTML context.
 
thehuby

The problem was that with FusionBot we need to use index and noindex
tags to exclude certain parts of the page from the search (e.g. the
'Contact' link in the navigation would have brought up every page on the
site for the search term 'contact' - when really we need only the contacts
page to come up).

As we are an accessible design agency we cannot use non-standard HTML
in the front-end code (which is what the FusionBot spider would have
been reading).

I have spoken to them, however, and they have adapted their system to use
<!--BEGIN NO INDEX --> BLAH BLAH BLAH <!--END NO INDEX --> or similar -
looks like we have sorted it anyway.

Thanks for your feedback though.
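
For anyone curious how comment markers like these can be honoured on the indexing side, here is a minimal sketch in Python. It is not FusionBot's code; the marker spelling follows the post above ("or similar"), and `indexable_text` is a hypothetical helper name. It simply removes everything between the two comments before the text is indexed.

```python
import re

# Marker spelling taken from the post above; the service's real syntax may differ.
NO_INDEX = re.compile(
    r"<!--\s*BEGIN NO INDEX\s*-->.*?<!--\s*END NO INDEX\s*-->",
    re.DOTALL | re.IGNORECASE,
)


def indexable_text(html):
    """Return the page source with every no-index region removed."""
    return NO_INDEX.sub("", html)


page = (
    "<!--BEGIN NO INDEX --><ul><li>Contact</li></ul><!--END NO INDEX -->"
    "<p>About our agency</p>"
)
print(indexable_text(page))
```

Because the markers are ordinary comments, the page itself stays valid HTML, which is the point of the change.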
 
Nick Kew

thehuby said:
The problem was that with FusionBot we need to use index and noindex
tags to exclude certain parts of the page from the search (e.g. the
'Contact' link in the navigation would have brought up every page on the
site for the search term 'contact' - when really we need only the contacts
page to come up).

As we are an accessible design agency we cannot use non-standard HTML
in the front-end code (which is what the FusionBot spider would have
been reading).

If you want to serve different markup to a selected client (FusionBot),
you could do that easily with the Apache XML Namespace framework, or
with a more ad-hoc filter if your source isn't well-formed as XML.
That's putting it through a SAX filter, so the system overhead is in
the same ballpark as running old-fashioned server-side includes rather
than heavy-duty processing like XSLT.
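
To illustrate the SAX-filter idea, here is a small sketch using Python's standard `xml.sax` rather than the Apache modules mentioned above. The names `DropElementFilter` and `page_for_bot`, and the assumption that the navigation lives in an element with `id="nav"`, are hypothetical. The filter streams events straight through and skips one subtree, so nothing is built in memory. Deciding when to apply it (for example by User-Agent) is left out.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElementFilter(XMLFilterBase):
    """Pass SAX events through, skipping one element (by id) and its contents."""

    def __init__(self, parent, drop_id):
        super().__init__(parent)
        self.drop_id = drop_id
        self.depth = 0  # greater than 0 while inside the dropped subtree

    def startElement(self, name, attrs):
        if self.depth or attrs.get("id") == self.drop_id:
            self.depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self.depth:
            self.depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self.depth:
            super().characters(content)


def page_for_bot(xhtml, drop_id="nav"):
    """Return well-formed XHTML with the element carrying drop_id removed."""
    out = io.StringIO()
    flt = DropElementFilter(xml.sax.make_parser(), drop_id)
    flt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    flt.parse(io.StringIO(xhtml))
    return out.getvalue()


source = (
    '<html><body><div id="nav"><a href="/contact">Contact</a></div>'
    "<p>About us</p></body></html>"
)
print(page_for_bot(source))
```

This only works on well-formed input; tag-soup HTML would need the more ad-hoc filter mentioned above.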

Still simpler, if you have the choice, base the search on metadata,
and put it in <meta> elements in the head. You can still serve it
in a different form (even as RDF) to a bot, but it's valid XHTML
even as stored on disc.
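
As a sketch of the metadata approach, the snippet below pulls `<meta name="..." content="...">` pairs out of a page using Python's standard `html.parser`. The names `MetaCollector` and `page_metadata` are hypothetical, and which meta names a given search service actually reads is an assumption to check against its documentation.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here via handle_startendtag.
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content


def page_metadata(html):
    """Return a dict of the page's named meta elements."""
    collector = MetaCollector()
    collector.feed(html)
    return collector.meta


page = (
    "<html><head><title>Contact</title>"
    '<meta name="keywords" content="contact, phone, email" />'
    '<meta name="description" content="How to reach us" />'
    "</head><body><p>Hi</p></body></html>"
)
print(page_metadata(page))
```

An index built only from these values never sees the navigation text, so the 'Contact' problem does not arise.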

thehuby said:
I have spoken to them, however, and they have adapted their system to use
<!--BEGIN NO INDEX --> BLAH BLAH BLAH <!--END NO INDEX --> or similar -
looks like we have sorted it anyway.

Hmmm, someone is parsing it as text, not as markup. How 1995 :)
 
