What is the best searching algorithm for URLs?

sandeep · Jun 1, 2006

Our team is developing a proxy server (in VC++) that can handle 5000
clients. I have to implement the cache part: whenever a new request comes
from a client, I have to check whether the content for the requested URL
is in the proxy's cache and, if it is, send it to the client. If it is
not there, the proxy has to get the data from the web server and store it
in its cache.

So I am thinking of using a binary search tree (or an AVL tree) to look
up the requested URL in the cache and to insert it if it is not there.
Is this a good idea for making insertion and searching fast?
I have also used a hash table, with the key chosen according to the first
character of the URL. Each bucket currently contains a doubly linked
list, which I then have to search; for that search I am thinking of
using a binary tree.

Vladimir Oka · Jun 1, 2006

Your question is best answered in comp.programming, feel free to come
back here if you encounter C specific problems while implementing your
solution. Note: you say you're using VC++ -- if you're writing C++,
comp.lang.c++ is the place to go (but still ask algorithm questions in
comp.programming).

Malcolm · Jun 1, 2006

C has no built-in hash tables or binary trees, so you will have to
implement them yourself. This is fiddly, though Chuck Falconer has a hash
library on his website which he is happy for people to use.
Anything else is comp.programming, as it is the algorithm which is your
problem. However, either will probably provide perfectly acceptable
performance.
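
As an illustration of what "implement them yourself" can involve, here is a minimal sketch in C of a chained hash table that maps a URL to its cached content. It is not hashlib and not anyone's code from this thread: the names (struct cache, cache_get, cache_put, NBUCKETS) are made up for the example, and FNV-1a is simply one common string hash chosen as an assumption. The sketch ignores eviction, freeing the table, and the locking that a server with many concurrent clients would need.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024u   /* hypothetical size; tune for the real workload */

/* One cached entry: the URL is the key, body stands in for the response. */
struct entry {
    char *url;
    char *body;
    struct entry *next;  /* collision chain within one bucket */
};

struct cache {
    struct entry *bucket[NBUCKETS];
};

/* 32-bit FNV-1a over every byte of the URL (assumed hash, not from the thread). */
static unsigned long hash_url(const char *s)
{
    unsigned long h = 2166136261ul;
    while (*s) {
        h ^= (unsigned char)*s++;
        h = (h * 16777619ul) & 0xfffffffful;
    }
    return h;
}

/* Portable replacement for the non-standard strdup. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Return the cached body for url, or NULL on a miss. */
const char *cache_get(const struct cache *c, const char *url)
{
    const struct entry *e = c->bucket[hash_url(url) % NBUCKETS];
    for (; e != NULL; e = e->next)
        if (strcmp(e->url, url) == 0)
            return e->body;
    return NULL;
}

/* Store a copy of body under url, replacing any existing entry.
   Returns 0 on success, -1 if memory runs out. */
int cache_put(struct cache *c, const char *url, const char *body)
{
    unsigned long i = hash_url(url) % NBUCKETS;
    struct entry *e;
    char *copy = dup_string(body);

    if (copy == NULL)
        return -1;
    for (e = c->bucket[i]; e != NULL; e = e->next) {
        if (strcmp(e->url, url) == 0) {
            free(e->body);
            e->body = copy;
            return 0;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL) {
        free(copy);
        return -1;
    }
    e->url = dup_string(url);
    if (e->url == NULL) {
        free(copy);
        free(e);
        return -1;
    }
    e->body = copy;
    e->next = c->bucket[i];   /* push onto the front of the chain */
    c->bucket[i] = e;
    return 0;
}
```

With a zero-initialised struct cache, a request handler would call cache_get first and, on a NULL result, fetch the page from the web server and store it with cache_put.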

Walter Roberson · Jun 1, 2006

sandeep said:
Our team is developing a proxy server (in VC++) which can handle 5000
clients. I have to implement the cache part
.... snip ...
binary tree search (or AVL tree) to search
.... snip ...
hash table, with the key chosen according to the first
character of the URL

Implement something clean and maintainable first, and *then*
measure to see if the performance is acceptable.
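
One possible reading of "clean first, then measure", sketched in C under stated assumptions: keep the URLs in a sorted array, look them up with the standard library's bsearch, and time a batch of lookups with clock before deciding whether anything fancier is needed. The function names (index_urls, have_url, time_lookups) are invented for this example, and a real measurement would use a realistic set of URLs and request pattern rather than looking up every stored key in turn.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Comparison callback for qsort/bsearch over an array of string pointers. */
static int cmp_url(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Sort the URL table once so that bsearch can be used on it. */
void index_urls(const char **urls, size_t n)
{
    qsort(urls, n, sizeof *urls, cmp_url);
}

/* Return 1 if url is present in the sorted table, 0 otherwise. */
int have_url(const char *const *urls, size_t n, const char *url)
{
    return bsearch(&url, urls, n, sizeof *urls, cmp_url) != NULL;
}

/* Look up every stored URL `rounds` times. Stores the number of
   successful lookups in *hits and returns the CPU time used, in seconds. */
double time_lookups(const char *const *urls, size_t n, unsigned rounds,
                    unsigned long *hits)
{
    clock_t start = clock();
    unsigned r;
    size_t i;

    *hits = 0;
    for (r = 0; r < rounds; r++)
        for (i = 0; i < n; i++)
            *hits += (unsigned long)have_url(urls, n, urls[i]);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

If the time reported for a representative workload is already acceptable, the simple version can stay; if not, the same timing function gives a baseline against which to compare a tree or hash table.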

If you need unusually high performance searching, you need a lot
more information about your architecture -- points such as the
amount of primary cache you have; the size of the cache line
fetched from secondary cache; the interprocess communications
mechanisms to notify that something has entered or gone out of cache;
effective shared-memory mechanisms; locks and semaphores to
ensure cache coherency; details about the filesystem, about block
allocation strategies within the filesystem, details about the
seek times (i.e., you always want to go to a -nearby- disk block, but
"nearby" will depend upon in-track seek times, track-to-track seek
times, head-switch times, intelligence of the controller, level
at which the RAID is happening...)

And you shouldn't be getting too far into any of these until you
first read the "Sorting and Searching" volume (Volume 3) of Knuth's
"The Art of Computer Programming".

CBFalconer · Jun 1, 2006

Malcolm said:
.... snip ...

C has no built-in hash tables or binary trees, so you will have
to implement them yourself. This is fiddly, though Chuck Falconer
has a hash library on his website which he is happy for people to
use. Anything else is comp.programming, as it is the algorithm
which is your problem. However, either will probably provide
perfectly acceptable performance.

At <http://cbfalconer.home.att.net/download/>. The nmalloc package
is licensed under GPL (not LGPL), but other licenses can be
negotiated.

--
Some informative links:
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

CBFalconer · Jun 1, 2006

CBFalconer said:
At <http://cbfalconer.home.att.net/download/>. The nmalloc package
is licensed under GPL (not LGPL), but other licenses can be
negotiated.

Correction - the hashlib package is ... nmalloc is unrestricted.

tedu · Jun 2, 2006

sandeep said:
I have also used a hash table, with the key chosen according to the
first character of the URL

The first character of a URL is likely to be a spectacularly bad choice
of key.
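
To make the point concrete: almost every URL a proxy sees begins with "h" (from "http"), so a key based on the first character sends nearly everything to the same bucket. The C sketch below compares that scheme with a hash over the whole string. The names (bucket_first_char, bucket_fnv1a, buckets_used, DEMO_BUCKETS) are hypothetical, and FNV-1a is used only as one example of a whole-string hash, not as something recommended in this thread.

```c
#include <stddef.h>

#define DEMO_BUCKETS 64u   /* hypothetical table size for the comparison */

typedef unsigned (*bucket_fn)(const char *url);

/* The scheme from the original post: only the first character matters. */
unsigned bucket_first_char(const char *url)
{
    return (unsigned char)url[0] % DEMO_BUCKETS;
}

/* 32-bit FNV-1a over every byte of the URL, reduced to a bucket index. */
unsigned bucket_fnv1a(const char *url)
{
    unsigned long h = 2166136261ul;
    while (*url) {
        h ^= (unsigned char)*url++;
        h = (h * 16777619ul) & 0xfffffffful;
    }
    return (unsigned)(h % DEMO_BUCKETS);
}

/* Count how many distinct buckets a set of URLs lands in under f. */
unsigned buckets_used(const char *const *urls, size_t n, bucket_fn f)
{
    unsigned char seen[DEMO_BUCKETS] = {0};
    unsigned used = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned b = f(urls[i]);
        if (!seen[b]) {
            seen[b] = 1;
            used++;
        }
    }
    return used;
}
```

For any set of URLs that all start with "http://", buckets_used with bucket_first_char returns 1, which means every lookup degenerates into a walk of one long list. The whole-string hash spreads the same URLs over more than one bucket.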

Malcolm · Jun 3, 2006

CBFalconer said:
Correction - the hashlib package is ... nmalloc is unrestricted.

Why did you not LGPL it?
Not all commercial programmers work for evil huge corporations, you know.
Some are kids in bedrooms trying to make a useful living.

CBFalconer · Jun 3, 2006

Malcolm said:
Why did you not LGPL it?
Not all commercial programmers work for evil huge corporations, you know.
Some are kids in bedrooms trying to make a useful living.

Because it is a one-way street. This way I at least expect
prospective licensees to contact me.
