Steve B.
Hi,
I'm building a web site that renders HTML from various user input.
The problem is that the HTML cannot be trusted, so I need to ensure it does
not contain script injection attacks.
That's why I'd like to provide a set of allowed tags and remove all the
other ones.
I'm thinking about using regular expressions. I was able to find some regex
samples that remove a set of untrusted tags (script, iframe, etc.), but I'd
rather allow only a set of tags, because those regexes can only remove "well
formed" tags: a <script> without a </script> won't be removed.
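To illustrate the limitation, here is a minimal Python sketch. The pattern and the helper name strip_script_blocks are hypothetical, written only to resemble the kind of blacklist regex described above; they are not taken from any particular sample.

```python
import re

# Hypothetical blacklist pattern: it only matches a <script> element
# that has both an opening and a closing tag.
SCRIPT_BLOCK = re.compile(
    r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL
)

def strip_script_blocks(html: str) -> str:
    """Remove only well-formed <script>...</script> pairs."""
    return SCRIPT_BLOCK.sub("", html)

well_formed = "<p>hi</p><script>attack()</script>"
unclosed = "<p>hi</p><script>attack()"

print(strip_script_blocks(well_formed))  # the script element is removed
print(strip_script_blocks(unclosed))     # the unclosed <script> is left in place
```

Browsers are generally tolerant of malformed markup, so the unclosed tag that slips through here may still be executed.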
So does anyone have a regex that removes any content between tags that are
not in a safe list?
And is it also possible to remove any attribute that is potentially
dangerous? (<span onload="javascript:attack(...)">)
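To make the request concrete, here is a rough sketch of the allow-list behaviour I have in mind. It uses a parser (Python's standard html.parser module) rather than a regex, which is a swapped-in technique and not something I have settled on. The tag list, attribute list, URL prefixes, and the names AllowListSanitizer and sanitize are all assumptions for illustration, and the code has not been audited against real attacks.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allow-lists for illustration only; a real site would choose its own.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br", "span", "a"}
ALLOWED_ATTRS = {"a": {"href", "title"}, "span": {"title"}}
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:")
# Elements whose text content is dropped together with the tag.
DROP_CONTENT = {"script", "style"}

class AllowListSanitizer(HTMLParser):
    """Re-emits only allowed tags and attributes; everything else is dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onload, onclick, style, ...
            value = value or ""
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unexpected schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it can never be read back as markup.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

def sanitize(html: str) -> str:
    parser = AllowListSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

print(sanitize('<span onload="javascript:attack(...)">x</span>'))
print(sanitize("<p>hi</p><script>attack()"))
```

In this sketch the text inside script and style elements is discarded, while text inside other disallowed tags is kept and only the tags themselves are removed. Because the output is rebuilt from parser events instead of being cut out of the original string, an unclosed <script> does not get through.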
Thanks in advance