How to stop users from wrecking a web page

lawrence

I'm worried about a user typing something like this:

"Hi, my name is <b>Super Guy and I'm so smart"

That is, I'm worried about the unclosed <b> tag. This is a constant
risk on a site like www.monkeyclaus.org, which has 5 to 20 new
comments posted each day.

My first thought was to simply add in a lot of closing tags after
every comment. After all, the software running the site could very
easily put in something like this:

</i></b></ul></ol>

plus a few more as I think of them.

Sadly, the graphic designers I work with tell me that this will not
validate.

So then I attempted to write code that would automatically close open
tags. This is really not as easy as it seems. It isn't enough to look
for this:

<i>

Because someone might write:

<i id="linkToMe">

But I can't just do a simple search for:

<i

because that might also be

<iframe>

So I've given up trying to write an automatic tag closer. I'm leaning
towards the excessive close tags, even though they don't validate, but
I was hoping someone on this group might have suggestions about better
ways to handle this.
 
Dylan Parry

lawrence said:
I'm worried about a user typing something like this:

"Hi, my name is <b>Super Guy and I'm so smart"

How about simply not allowing HTML tags?
 
SeeSchloss

lawrence said:
So then I attempted to write code that would automatically close
open tags. This is really not as easy as it seems. It isn't
enough to look for this:

<i>

Because someone might write:

<i id="linkToMe">

But I can't just do a simple search for :

<i

because that might also be

<iframe>

But if you search for "<i " (with a space) or "<i>" instead,
shouldn't that work?
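
The "<i " or "<i>" idea can be written as a single regular expression. This is only a minimal sketch in Python (the thread never says what language the site runs on), and the name OPEN_I and the sample strings are made up for illustration. It assumes attribute values never contain a ">" character.

```python
import re

# Matches "<i>" or "<i" followed by whitespace and attributes,
# so "<iframe>" and "<img ...>" are not mistaken for an <i> tag.
# Assumption: no ">" appears inside an attribute value.
OPEN_I = re.compile(r"<i(?:\s[^>]*)?>", re.IGNORECASE)

samples = ['<i>', '<i id="linkToMe">', '<iframe>', '<I class="x">', '<img src="a.png">']
matches = [s for s in samples if OPEN_I.fullmatch(s)]
print(matches)
```

Only the three real <i> tags end up in matches; "<iframe>" and "<img ...>" are skipped.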
 
Matthias Gutfeldt

lawrence said:
I'm worried about a user typing something like this:

"Hi, my name is <b>Super Guy and I'm so smart"

That is, I'm worried about the unclosed <b> tag. This is a constant
risk on a site like www.monkeyclaus.org, which has 5 to 20 new
comments posted each day.

So I've given up trying to write an automatic tag closer. I'm leaning
towards the excessive close tags, even though they don't validate, but
I was hoping someone on this group might have suggestions about better
ways to handle this.

First strip out all illegal HTML elements (perhaps you're doing this
already). Then run the remaining content through a validator. Either
strip out non-closed elements, or give the user the option to edit his
posting. Repeat until all is well.


Matthias
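
The first step (stripping out elements that are not allowed) might look roughly like the sketch below. It uses Python's standard html.parser module. The ALLOWED set, the TagFilter class and strip_disallowed() are hypothetical names, and dropping all attributes is an assumption made here for simplicity, not something suggested in the thread. The validation step is not covered.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; pick whatever elements the site wants to permit.
ALLOWED = {"b", "i", "em", "strong", "cite"}

class TagFilter(HTMLParser):
    """Rebuilds the comment, keeping only whitelisted tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            self.out.append(f"<{tag}>")  # attributes are dropped (assumption)

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so stray "&", "<" and ">" stay harmless.
        self.out.append(escape(data, quote=False))

def strip_disallowed(comment):
    parser = TagFilter()
    parser.feed(comment)
    parser.close()
    return "".join(parser.out)

print(strip_disallowed('Hi <b>there</b> <font color="red">friend</font>'))
```

Note that this only removes disallowed tags. An unclosed <b> still passes through and has to be caught separately.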
 
Toby A Inkster

lawrence said:
I'm worried about a user typing something like this:
"Hi, my name is <b>Super Guy and I'm so smart"

Simply have your script convert all the "&" to "&amp;", all the "<" to
"&lt;" and all the ">" to "&gt;". Problem solved.
 
SeeSchloss

Toby A Inkster said:
Simply have your script convert all the "&" to "&amp;", all the
"<" to "&lt;" and all the ">" to "&gt;". Problem solved.

Unless he wants people to be able to use HTML tags.
 
lawrence

Well, this was all very good advice, though just slightly
disappointing. I was hoping someone knew a magic trick that I hadn't
thought of yet that would save me from the need to make a tough
decision. It seems I must make a decision.

The best option, when it comes to the comments, seems to be to reduce
the number of HTML tags that are allowed. If I only allow three, I can
write a script to catch unclosed tags.

There is the other "user" that I was worried about - the people who
actually own the monkeyclaus.org website. There are at least 20
artists who have permission to log in and post new entries to the
weblogs. They have to be allowed the option of adding HTML because the
software by itself doesn't recreate the full range of options that,
say, Dreamweaver allows.

I guess a "Preview" page is the best way to go there.

Thanks for all the help.
 
Daniel R. Tobias

lawrence said:
The best option, when it comes to the comments, seems to be to reduce
the number of html tags that are allowed. If I only allow 3 I can
write script to catch unclosed tags.

Web forums and opinion sites that limit which HTML tags you can use
often leave out ones I like... they tend to allow only the
"presentationalist" tags like <B> and <I>, and ban more "structuralist"
tags like <EM>, <STRONG>, and <CITE>, which I prefer to use out of a
desire for logical structure rather than just visual gimmickry.
 
Steven

lawrence said:
Well, this was all very good advice, though just slightly
disappointing. I was hoping someone knew a magic trick that I hadn't
thought of yet that would save me from the need to make a tough
decision. It seems I must make a decision.

The best option, when it comes to the comments, seems to be to reduce
the number of html tags that are allowed. If I only allow 3 I can
write script to catch unclosed tags.

There is the other "user" that I was worried about - the people who
actually own the monkeyclaus.org website. There are at least 20
artists who have permission to log in and post new entries to the
weblogs. They have to be allowed the option of adding HTML because the
software by itself doesn't recreate the full range of options that,
say, Dreamweaver allows.

I guess a "Preview" page is the best way to go there.

Thanks for all the help.


If and when you have located the unclosed tags, where would you put the closing
tag? In your example:

"Hi, my name is <b>Super Guy and I'm so smart"

"Hi, my name is <b>Super Guy and I'm so smart</b>", or
"Hi, my name is <b>Super Guy</b> and I'm so smart", or
"Hi, my name is <b>Super</b> Guy and I'm so smart", or
"Hi, my name is <b>S</b>uper Guy and I'm so smart", or
....?
 
Matthias Gutfeldt

lawrence said:
Well, this was all very good advice, though just slightly
disappointing. I was hoping someone knew a magic trick that I hadn't
thought of yet that would save me from the need to make a tough
decision. It seems I must make a decision.

I guess a "Preview" page is the best way to go there.

A "Preview" page that shows up validation errors is certainly useful.


Matthias
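
One possible way for a preview page to report such errors is to track open tags on a stack. The sketch below is not a real validator. It only checks that tags are closed in the right order, and the names VOID, UnclosedFinder and find_problems() are invented for this example.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, for illustration).
VOID = {"br", "hr", "img"}

class UnclosedFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tag names currently open
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")

def find_problems(comment):
    finder = UnclosedFinder()
    finder.feed(comment)
    finder.close()
    # Anything still on the stack was never closed.
    return finder.errors + [f"unclosed <{t}>" for t in finder.open_tags]

print(find_problems("Hi, my name is <b>Super Guy and I'm so smart"))
# ['unclosed <b>']
```

The preview page could show this list next to the rendered comment and ask the user to fix the post before submitting.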
 
Daniel R. Tobias

Steven said:
If and when you have located the unclosed tags, where would you put the closing
tag? In your example:

"Hi, my name is <b>Super Guy and I'm so smart"

"Hi, my name is <b>Super Guy and I'm so smart</b>", or
"Hi, my name is <b>Super Guy</b> and I'm so smart", or
"Hi, my name is <b>Super</b> Guy and I'm so smart", or
"Hi, my name is <b>S</b>uper Guy and I'm so smart", or
...?

I would put the closing tag at the latest possible position where it's
syntactically legal. In the above example, it would be at the very end,
but if there are other tags, it might have to be earlier to prevent
illegal nesting:

"Hi, my name is <i>Super Guy and I'm <b>so</i> smart"

Putting </b> at the end would cause invalid overlapping of the <b> and
<i> elements, so the valid place to put </b> would be right before the </i>.

This is, of course, error correction, and has no guarantee of matching
what the author actually intended, but if the author wants the output to
match what he/she intended, he/she should try to type it correctly in
the first place.
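
The "latest legal position" rule can be sketched with a stack of open tags: when a closing tag arrives, close everything opened inside that element first, and close whatever is still open at the end of the comment. This is an illustration in Python, not code from the site. The names TAG, VOID and close_tags() are hypothetical, stray closing tags are simply dropped (an assumption), and the regex assumes no ">" inside attribute values.

```python
import re

# Opening or closing tag; group 1 is "/" for a closing tag, group 2 the name.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)(?:\s[^>]*)?>")
VOID = {"br", "hr", "img"}  # elements that never take a closing tag

def close_tags(comment):
    out = []
    stack = []  # names of elements currently open
    pos = 0
    for m in TAG.finditer(comment):
        out.append(comment[pos:m.start()])  # text before this tag
        pos = m.end()
        closing, name = m.group(1), m.group(2).lower()
        if not closing:
            out.append(m.group(0))
            if name not in VOID:
                stack.append(name)
        elif name in stack:
            # Close anything opened inside this element first.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            stack.pop()
            out.append(m.group(0))
        # Otherwise: a stray closing tag with no opener; drop it.
    out.append(comment[pos:])
    # Close whatever is still open, innermost first.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)

print(close_tags("Hi, my name is <i>Super Guy and I'm <b>so</i> smart"))
# Hi, my name is <i>Super Guy and I'm <b>so</b></i> smart
```

For the simpler example with a single unclosed <b>, the same function puts </b> at the very end of the comment.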
 
