Search (part 2) - content/html

R

Rob Meade

Lo all,

Ok - just spotted another problem I'm going to have.

All of the rows for the page content in the database contain formatting ie,

<b>some bold text</b>

This will cause me a problem initially as if I wanted to search for <company
name> the SQL statement
might be something like this :

WHERE PageContent Like '% company name %'

The spaces obviously prevent the word being contained within another word
(gene/generic etc) however, if there was a page that had something like this
:

<b>My Company</b>

then the above would not find it.

How would I best go about ignoring the html tags within the page content in
the database, is it possible to add this to my code in my view so that the
view already has this stripped out - thus not having to do it each time the
search is carried out?

Any info appreciated..

Regards

Rob
 
B

Bob Barrows

Rob said:
Lo all,

Ok - just spotted another problem I'm going to have.

All of the rows for the page content in the database contain
formatting ie,

<b>some bold text</b>

This will cause me a problem initially as if I wanted to search for
<company name> the SQL statement
might be something like this :

WHERE PageContent Like '% company name %'

The spaces obviously prevent the word being contained within another
word (gene/generic etc) however, if there was a page that had
something like this

<b>My Company</b>

then the above would not find it.

How would I best go about ignoring the html tags within the page
content in the database, is it possible to add this to my code in my
view so that the view already has this stripped out - thus not having
to do it each time the search is carried out?

Any info appreciated..

Regards

Rob

This is one of the pitfalls of mixing data and display code (not that I
haven't done that, mind you).

One solution would be to add an OR condition:

WHERE PageContent Like '% company name %' OR
PageContent Like '%>company name<%'

But you're already using a non-sargable
(http://www.sql-server-performance.com/sql_server_performance_audit8.asp)
search condition here that is impairing your query's performance by forcing
a table scan. Adding an OR condition will simply compound that decrease in
performance.

I suspect that you will benefit from full-text indexing. I do not have any
experience with this subject, but there is a lot of information about it in
BOL. Additionally, there is a public newsgroup devoted to the topic:
..sqlserver.fulltext, which you can browse using Google, etc.

HTH,
Bob Barrows
 
R

Rob Meade

...
This is one of the pitfalls of mixing data and display code (not that I
haven't done that, mind you).

Is there another way using a web based content management system to apply
html codes to specific words without doing this though Bob? If so, let me
know as I've not previously thought of other ways etc, any info on that
appreciated.
One solution would be to add an OR condition:

WHERE PageContent Like '% company name %' OR
PageContent Like '%>company name<%'

Hadn't thought of that, I already am using OR where they have an any word
search etc, so that could be added I guess, obviously be needed per word
that they enter as criteria.

cheers, just having a read now - still dont know what sargable means, but I
get the idea about the indexes (none being used at this time).
I suspect that you will benefit from full-text indexing. I do not have any
experience with this subject, but there is a lot of information about it in
BOL.

Cheers, will check it out..

What are your thoughts to this idea....

If I were to add a seperate table for non-formatted data, ie, the user
updates a pages content, it writes the info once to a table with no
formatting, and another with the formatting, the one with is used for
display etc, the one without for searching - removing the < > problem....I
appreciate this probably sounds some what goofy, and would obviously
increase the size of the database, it might not be as slick as other
solutions, but I'm guessing it would work....

Downside would be getting the existing 100+ pages into the new table without
any formatting!

Regards

Rob
 
C

Chris Hohmann

Rob Meade said:
...
(http://www.sql-server-performance.com/sql_server_performance_audit8.asp
)

cheers, just having a read now - still dont know what sargable means, but I
get the idea about the indexes (none being used at this time).

Here's a definition for sargable:
http://www.sql-server-performance.com/glossary.asp#Sargable

The word is derived from the phrase Search-ARGument-ABLE.
http://manuals.sybase.com/onlinebooks/group-aw/awg0800e/dbugen8/@Generic__BookTextView/26460

HTH
-Chris Hohmann
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,145
Messages
2,570,825
Members
47,371
Latest member
Brkaa

Latest Threads

Top