[Search Engine - Internal site] DB or not DB?

rs

Hello,

I have a site with more than 15,000 pages.
Each page is almost entirely text.
Each page is about 10-25 KB.

I need to build an internal search engine
using ASP.NET code.


Which is the better approach?

1)


Create a DB (I have SQL Server 2005 Express)
with a table containing 5 columns:
Id, page link, page title, keywords, and the full text content of the page

Example row:
05
/Einstein.htm
Einstein life
birth, death
Einstein was born in... and won the Nobel prize... and died in
Berlin.

then query the DB using SELECT
with CONTAINS (on the 5th column)
and output the results with
Me.Response.Write(WhatIFound)
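
[Editor's sketch] To make option 1 concrete, here is a minimal, language-neutral illustration in Python with an in-memory SQLite database rather than ASP.NET and SQL Server. The table and column names (pages, link, title, keywords, body) and the second sample row are assumptions made up for the example. LIKE is used as a plain substring stand-in for CONTAINS; on SQL Server, CONTAINS needs a full-text index on the column, so it is worth checking whether your Express edition includes full-text search.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory table with the five columns described in the post."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        " id INTEGER PRIMARY KEY,"
        " link TEXT NOT NULL,"
        " title TEXT NOT NULL,"
        " keywords TEXT,"
        " body TEXT)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def search(conn, term):
    """Return (link, title) for pages whose body contains term.

    LIKE is a substring match (case-insensitive for ASCII in SQLite),
    standing in for CONTAINS. Note that % and _ in term act as wildcards.
    """
    pattern = "%" + term + "%"
    cur = conn.execute(
        "SELECT link, title FROM pages WHERE body LIKE ? ORDER BY id",
        (pattern,),
    )
    return cur.fetchall()


# Hypothetical sample rows; the first mirrors the example row in the post.
conn = build_db([
    (5, "/Einstein.htm", "Einstein life", "birth, death",
     "Einstein was born in... and won the Nobel prize..."),
    (6, "/Newton.htm", "Newton life", "birth, death",
     "Newton was born in... and wrote the Principia..."),
])
print(search(conn, "Nobel"))
```

The search term is passed as a query parameter instead of being concatenated into the SQL string, which is the habit to carry over to the ASP.NET version whatever search operator you end up using.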



or



2)


Use no DB
and search the page tags (Title, Keywords, Body) directly,
presumably using regular expressions and a StringBuilder,
and output the results with
Me.Response.Write(WhatIFound)
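
[Editor's sketch] For comparison, option 2 might look roughly like this, again in Python rather than ASP.NET. The pages are held in a dictionary here so the example is self-contained; on the real site each one would be read from disk. The function names and sample HTML are assumptions, only Title and Body are handled (a Keywords meta tag would need one more pattern), and regular expressions are a fragile way to parse HTML in general.

```python
import re

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def page_text(html):
    """Return (title, body text) with markup stripped from the body."""
    title = TITLE_RE.search(html)
    body = BODY_RE.search(html)
    title_text = title.group(1).strip() if title else ""
    body_text = TAG_RE.sub(" ", body.group(1)) if body else ""
    return title_text, body_text


def scan(pages, term):
    """Check every page on each query; pages maps link -> raw HTML."""
    needle = re.compile(re.escape(term), re.IGNORECASE)
    hits = []
    for link, html in sorted(pages.items()):
        title, body = page_text(html)
        if needle.search(title) or needle.search(body):
            hits.append((link, title))
    return hits


# Hypothetical sample pages.
pages = {
    "/Einstein.htm": "<html><head><title>Einstein life</title></head>"
                     "<body><p>Einstein won the <b>Nobel</b> prize.</p></body></html>",
    "/Newton.htm": "<html><head><title>Newton life</title></head>"
                   "<body><p>Newton wrote the Principia.</p></body></html>",
}
print(scan(pages, "nobel"))
```

The cost to notice is that every query touches every page. With 15,000 pages of 10-25 KB each, that is roughly 150-375 MB of raw HTML to read and parse per search unless the text is cached in memory.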



-----------------

Which of the two methods is better?

Also, any suggestions, optimizations, or advice about
either method are welcome.
 
sdbillsfan

Ask yourself, WWGD (what would Google do)? You definitely need to
create some sort of indexing tool here to spider the pages in case
content changes, and then store the indexed results in a DB. All that
being said, I wouldn't reinvent the wheel here. There are plenty of
third-party tools that do exactly what you want. Just search Google for
"intranet search engine".
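
[Editor's sketch] One simple way to handle the "in case content changes" part is to store a hash of each page alongside the indexed text and re-index only pages whose hash differs on the next crawl. This is a sketch in Python under that assumption, not something the poster specified; the function names and sample data are hypothetical.

```python
import hashlib


def content_hash(text):
    """Fingerprint of a page's text, stored next to its index entry."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pages_to_reindex(current_pages, stored_hashes):
    """Return links that are new or whose content changed since the last run.

    current_pages: link -> page text as crawled now
    stored_hashes: link -> hash saved by the previous run
    """
    changed = []
    for link, text in sorted(current_pages.items()):
        if stored_hashes.get(link) != content_hash(text):
            changed.append(link)
    return changed


# Hypothetical state: one page indexed earlier, one page added since.
stored = {"/Einstein.htm": content_hash("Einstein was born in...")}
current = {
    "/Einstein.htm": "Einstein was born in...",
    "/Newton.htm": "Newton was born in...",
}
print(pages_to_reindex(current, stored))
```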
 
rs

I will not be adding many pages (5-10 pages a year),
so indexing is not a problem.

I'm a new programmer and want to learn.

I'd like to get technical information
about sizes, speed, queries, caching...
so that I can finally decide which of the two methods is better.
 
sdbillsfan

You want to automate the indexing here: the flexibility it gives you
is well worth the effort it takes to build.
Store your collection/indexing results in a database, and querying,
caching, speed and sizes will largely be handled for you (this is also
a chance to learn about database tuning, a piece of knowledge almost
all programmers should have). You can use a built-in text searching
mechanism (every RDBMS that I know of has one) or write (or reuse) an
implementation of any of the string searching algorithms out there.
Make sure you abstract whatever implementation you choose for each part
(collection, indexing, searching, etc.) as much as possible so you can
modify things as needed (i.e. plug in a different search algorithm,
database, etc.).
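
[Editor's sketch] The abstraction advice could look something like the following, shown in Python rather than ASP.NET. Two interchangeable searchers expose the same add/search methods: one builds a small inverted index (word -> set of pages), the other checks every page on each query. All class names, function names and sample pages are assumptions for illustration, and the matching is whole-word only.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return WORD_RE.findall(text.lower())


class InvertedIndex:
    """Maps each word to the set of page links that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, link, text):
        for word in tokenize(text):
            self._postings[word].add(link)

    def search(self, query):
        """Return sorted links of pages containing every query word."""
        words = tokenize(query)
        if not words:
            return []
        result = None
        for word in words:
            links = self._postings.get(word, set())
            result = set(links) if result is None else result & links
        return sorted(result)


class LinearScan:
    """Same interface without an index: checks every page on each query."""

    def __init__(self):
        self._pages = {}

    def add(self, link, text):
        self._pages[link] = set(tokenize(text))

    def search(self, query):
        words = tokenize(query)
        if not words:
            return []
        return sorted(link for link, page_words in self._pages.items()
                      if all(w in page_words for w in words))


def build(searcher, pages):
    """Feed every page to any object that has add() and search()."""
    for link, text in pages.items():
        searcher.add(link, text)
    return searcher


# Hypothetical sample pages (already reduced to plain text).
pages = {
    "/Einstein.htm": "Einstein won the Nobel prize",
    "/Newton.htm": "Newton wrote the Principia",
}
for impl in (InvertedIndex(), LinearScan()):
    print(type(impl).__name__, build(impl, pages).search("nobel prize"))
```

Because the calling code only relies on add() and search(), swapping the in-memory index for a database-backed one later would not change the page that displays the results.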
 
