stripping words from querystring

David

Hi,

I'm trying to pass a querystring with certain common words removed (and,
the, if, of etc). The code below replaces the keywords with "" or whatever I
choose, but what I'd like to do is remove the words completely from the
querystring. How do I remove rather than replace?

Thanks

*********
tempstr=searchstring

removewords = array ("and","the","if","of","a")
temp=trim(tempstr)

for x=0 to ubound(removewords)
temp=replace(temp,removewords(x),"")
next

newstring=Trim(temp)
 
Bob Barrows

I don't understand. Replacing characters with an empty string is exactly the
same thing as removing them. Could you rephrase your question?


Bob Barrows
 
Evertjan.

Bob Barrows wrote on 15 feb 2004 in
microsoft.public.inetserver.asp.general:
> I don't understand. Replacing characters with an empty string is
> exactly the same thing as removing them. Could you rephrase your
> question?

Bob, when words are removed this way, double spaces are left behind.

Also another problem:
removewords = array ("and","the","if","of","a")
temp=trim(tempstr)

for x=0 to ubound(removewords)
temp=replace(temp,removewords(x),"")
next

"the cat has offended their wife"

will be changed to

" ct hs fended ir we"

I would suggest this:

==========================

removewords = array ("and","the","if","of","a")
temp= " " & trim(tempstr) & " "

for x=0 to ubound(removewords)-1
temp=replace(temp, " " & removewords(x) & " "," ")
next

result = trim(temp)


==========================
 
Roland Hall

:
: > I would suggest this:
: <snip>
:
: Thanks - works a treat!

Not if you want to remove all words case-insensitive. And, you're missing
the last array element.

lbound(array) is always 0
ubound(array) is always count - 1

So, these lines:

for x=0 to ubound(removewords)-1
temp=replace(temp, " " & removewords(x) & " "," ")

Should be:

for x=0 to ubound(removewords)
temp=replace(lcase(temp), " " & removewords(x) & " "," ")

Test:

<%@ Language=VBScript %>
<%
Option Explicit
Response.Buffer = True

dim removewords, tempstr, x, result, temp

tempstr = "If a cat and a dog have a battle of wits, who will win the battle?"
removewords = array ("and","the","if","of","a")
temp= " " & trim(tempstr) & " "

for x=0 to ubound(removewords)
temp=replace(lcase(temp), " " & removewords(x) & " "," ")
Response.Write(temp & ": removing " & removewords(x) & "<br />" & vbCrLf)
next

result = trim(temp)
Response.Write("old: " & tempstr & "<br />" & vbCrLf)
Response.Write("new: " & result & "<br />" & vbCrLf)
%>

http://kiddanger.com/lab/stripcommon.asp
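
A side note on the lcase() call in the test above: it lowercases the whole string, so the result loses its original capitalisation. One possible alternative, offered only as a sketch, is to pass vbTextCompare as the optional compare argument of the built-in Replace function. StripCommonWords is a hypothetical helper name, not something from this thread.

```vbscript
Option Explicit

' Hypothetical helper (name is an assumption): removes whole words,
' ignoring case, without lowercasing the rest of the string.
Function StripCommonWords(source, removeWords)
    Dim temp, i
    ' Pad with spaces so words at either end are matched as whole words
    temp = " " & Trim(source) & " "
    For i = 0 To UBound(removeWords)
        ' start = 1, count = -1 (replace all), vbTextCompare = ignore case
        temp = Replace(temp, " " & removeWords(i) & " ", " ", 1, -1, vbTextCompare)
    Next
    StripCommonWords = Trim(temp)
End Function

Dim result
result = StripCommonWords("If a cat and a dog have a battle of wits", _
                          Array("and", "the", "if", "of", "a"))
' result is now "cat dog have battle wits"
```

Like the original loop, this only matches words that have a space on both sides, so "and," with a trailing comma is left alone.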

HTH...

--
Roland Hall
/* This information is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of merchantability
or fitness for a particular purpose. */
Technet Script Center - http://www.microsoft.com/technet/scriptcenter/
WSH 5.6 Documentation - http://msdn.microsoft.com/downloads/list/webdev.asp
MSDN Library - http://msdn.microsoft.com/library/default.asp
 
Evertjan.

Roland Hall wrote on 16 feb 2004 in
microsoft.public.inetserver.asp.general:
> And, you're missing the last array element.
>
> lbound(array) is always 0
> ubound(array) is always count - 1

Yes, my mistake

> Not if you want to remove all words case-insensitive.

I would hesitate to remove the "A" or even the "a" for that matter.

The logical choice should be to remove all single letter words as search
words.

And for that regular expressions should take over,
or, if you are uncomfortable with that, a split-join sequence.
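
The split-join route mentioned above could look roughly like this. It is only a sketch under stated assumptions: words are separated by spaces and carry no punctuation, and StripBySplit is a made-up name.

```vbscript
' Hypothetical helper (name is an assumption): drops stop words and
' all single-character words using Split and Join only.
Function StripBySplit(source, removeWords)
    Dim words, kept, stopList, i
    ' Join the stop words into one space-padded string so that
    ' InStr can only hit whole words
    stopList = " " & LCase(Join(removeWords, " ")) & " "
    words = Split(Trim(source), " ")
    kept = ""
    For i = 0 To UBound(words)
        ' Len > 1 skips empty items (from double spaces) and
        ' single-character words; InStr = 0 means "not a stop word"
        If Len(words(i)) > 1 And InStr(stopList, " " & LCase(words(i)) & " ") = 0 Then
            kept = kept & " " & words(i)
        End If
    Next
    StripBySplit = Trim(kept)
End Function
```

Called as StripBySplit("If a cat and a dog have a battle of wits", Array("and", "the", "if", "of")), this should return "cat dog have battle wits". The single-letter rule makes a separate "a" entry in the array unnecessary.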
 
Roland Hall

:
: Roland Hall wrote on 16 feb 2004 in
: microsoft.public.inetserver.asp.general:
:
: > And, you're missing the last array element.
: >
: > lbound(array) is always 0
: > ubound(array) is always count - 1
:
: Yes, my mistake

I'm not error free either.

: > Not if you want to remove all words case-insensitive.
:
: I would hesitate to remove the "A" or even the "a" for that matter.
:
: The logical choice should be to remove all single letter words as search
: words.

Agreed, unless enclosed within quotes.

: And for that regular expressions should take over,
: or, if you are uncomfortable with that, a split-join sequence.

I prefer regular expressions. Not sure about others.

 
Roland Hall

:
: "Evertjan." wrote:
: : And for that regular expressions should take over,
: : or, if you are uncomfortable with that, a split-join sequence.
:
: I prefer regular expressions. Not sure about others.

RegExp version:
It still needs work. This doesn't appear to be the most efficient use.

<%@ Language=VBScript %>
<%
Option Explicit
Response.Buffer = True

dim temp, tempstr, result, re, Match, Matches

tempstr = "If a cat and, and a dog have a battle of wits, who will win the " & _
          "battle? I, and I do mean 'I', think the dog would win."
Response.Write("old: " & tempstr & "<br />" & vbCrLf & "<hr>")
temp = " " & trim(tempstr) & " "

set re = new RegExp
with re
.Global = True
.IgnoreCase = True
.Pattern = "(( and | and, )|( the | the, )|( if | if, | if\. | if\? )|" & _
           "( of | of, | of\. | of\? )|( \w | \w, | \w\. | \w\? ))"
end with
set Matches = re.Execute(temp)
for each Match in Matches
temp = re.Replace(" " & temp & " "," ")
next

result = trim(temp)
Response.Write("RegExp: " & re.Pattern & "<br />" & vbCrLf & "<hr>")
Response.Write("new: " & result & "<br />" & vbCrLf)
%>

http://kiddanger.com/lab/stripcommonre.asp
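
On the efficiency point, one possible direction is a single global Replace that uses a lookahead instead of looping over the matches. This is a sketch, not a finished routine: it assumes the VBScript 5.5 RegExp object (which supports lookahead and $1 in the replacement text), and StripWithRegExp is a hypothetical name.

```vbscript
' Hypothetical helper (name is an assumption): one-pass stop word removal
Function StripWithRegExp(source)
    Dim re, temp
    Set re = New RegExp
    re.Global = True
    re.IgnoreCase = True

    ' A stop word or a single word character, preceded by start-of-string
    ' or whitespace, optionally trailed by , . ? or !, and followed by
    ' whitespace or end-of-string. The lookahead leaves the trailing
    ' whitespace unconsumed, so adjacent stop words are both matched.
    re.Pattern = "(^|\s)(and|the|if|of|\w)[,.?!]*(?=\s|$)"
    temp = re.Replace(source, "$1")

    ' Collapse the runs of whitespace left behind
    re.Pattern = "\s+"
    temp = re.Replace(temp, " ")

    StripWithRegExp = Trim(temp)
End Function
```

With "If a cat and, and a dog have a battle of wits" as input this should give "cat dog have battle wits". A word wrapped in single quotes, such as 'I', is left alone because the quote character sits between the whitespace and the word.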

 
Evertjan.

Roland Hall wrote on 16 feb 2004 in
microsoft.public.inetserver.asp.general:
> It still needs work. This doesn't appear to be the most efficient use.

<script language=vbs>

' prepare IE vbs pseudo ASP test environment
set response = document

temp = "If a cat and, and a dog have a battle of wits, who will win " & _
       "the battle? I, and I do mean 'I', think the dog would win."


Response.Write "old:<br>" & temp & "<br><hr>" & vbCrLf

temp = " " & temp & " "

set re = new RegExp
re.Global = True
re.IgnoreCase = True

' lose commas, question marks, etc.
re.Pattern = ",|\?|!"
temp = re.replace(temp," ")

' lose beginning and ending single quotes, not O'Brian
re.Pattern = "' | '"
temp = re.replace(temp," ")

' lose single letter or number words
re.Pattern = "\s\S\s"
temp = re.replace(temp," ")

' lose these 4
re.Pattern = " and | the | if | of "
temp = re.replace(temp," ")

' lose multiple white space = inner trim
re.Pattern = "\s+"
temp = re.replace(temp," ")

' outer trim, regex for the fun of it
re.Pattern = "(^\s+)|(\s+$)"
temp = re.replace(temp,"")


Response.Write "new:<br>" & temp & "<br>" & vbCrLf

</script>
 
Evertjan.

Roland Hall wrote on 16 feb 2004 in
microsoft.public.inetserver.asp.general:
> Why would you want to lose beginning and ending single quotes? This
> is a search routine to remove common words but if someone wanted to
> use a common word as part of their query, as you can do with Google,
> you will need to be able to use quotes. Also, there is nothing
> regarding double quotes. We're using this static but I'm almost sure
> this is designed to work with user input.

This is not about the exact implementation but about the systematic "how"
of regex. I suppose by showing you my code, that is the main purpose.
You can always adapt it to your needs.
 
Roland Hall

Why would you want to lose beginning and ending single quotes? This is a
search routine to remove common words but if someone wanted to use a common
word as part of their query, as you can do with Google, you will need to be
able to use quotes. Also, there is nothing regarding double quotes. We're
using this static but I'm almost sure this is designed to work with user
input.
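
To make the double-quote point concrete, here is a rough sketch of one way to leave quoted phrases untouched: split the input on the double quote and strip only the segments that fall outside quotes. It assumes the quotes are balanced and ignores punctuation; StripOutsideQuotes is a hypothetical name.

```vbscript
' Hypothetical helper (name is an assumption): strips stop words from
' unquoted text only, keeping "quoted phrases" exactly as typed.
Function StripOutsideQuotes(source)
    Dim parts, i, re
    Set re = New RegExp
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "(^|\s)(and|the|if|of|\w)(?=\s|$)"

    ' Splitting on the double quote puts unquoted text at even indexes
    ' and quoted phrases at odd indexes (assumes balanced quotes)
    parts = Split(source, """")
    For i = 0 To UBound(parts)
        If i Mod 2 = 0 Then
            parts(i) = re.Replace(parts(i), "$1")
        Else
            ' Put the quotes back around the untouched phrase
            parts(i) = """" & parts(i) & """"
        End If
    Next

    ' Rejoin, then collapse leftover runs of whitespace
    re.Pattern = "\s+"
    StripOutsideQuotes = Trim(re.Replace(Join(parts, ""), " "))
End Function
```

For the input the cat "of the hat" and a dog, this should return cat "of the hat" dog, with the common words inside the quotes preserved.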


 
Roland Hall

:
: Roland Hall wrote on 16 feb 2004 in
: microsoft.public.inetserver.asp.general:
:
: > Why would you want to lose beginning and ending single quotes? This
: > is a search routine to remove common words but if someone wanted to
: > use a common word as part of their query, as you can do with Google,
: > you will need to be able to use quotes. Also, there is nothing
: > regarding double quotes. We're using this static but I'm almost sure
: > this is designed to work with user input.
:
: This is not about the exact implementation but about the systematic "how"
: of regex. I suppose by showing you my code, that is the main purpose.
: You can always adapt it to your needs.

Ok, fair enough.
 
