Now that I've decided which library to use, I'll describe the actual code.

We already know that HTML, especially in Internet Explorer, provides many attack vectors, and a new browser version could add another tag or attribute that can execute code. So we need to use a whitelist, not a blacklist.

Next, there are many more legitimate users than attackers. When dangerous content is detected, it needs to be removed; we can't just blow up and tell the user not to hack us. The number of false positives could actually be rather high, since some people are going to paste from Word and end up with a lot of extra tags and who knows what else. Users could also accidentally paste something that's potentially dangerous. Yelling at them, or even telling them to fix their code, isn't going to work, since they may not even be aware that HTML exists.

So, here's the code: SafeHtml.cs.txt (3.28 KB). It's very short and easy, thanks to the HtmlAgilityPack. The processing of style tags is pretty weak (simple replacements), but it should do the trick. Enjoy!

Update 2004-Mar-04: I forgot to handle <A href="scriptType:code...">. Be sure to add that if you use this code in production.
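To make the whitelist idea concrete, here is a minimal sketch of the approach described above. It is not the contents of SafeHtml.cs.txt; the class name `SafeHtmlSketch`, the tag and attribute lists, and the allowed URL schemes are all assumptions chosen for illustration. It silently drops anything not on the list instead of rejecting the input, and it checks `href` values against a scheme whitelist, which is one possible way to cover the case mentioned in the update. It skips style handling entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SafeHtmlSketch
{
    // Hypothetical whitelist: tag name -> attributes allowed on that tag.
    // A real list would be longer; these entries are illustrative only.
    private static readonly Dictionary<string, string[]> AllowedTags =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "b", new string[0] },
            { "i", new string[0] },
            { "div", new string[0] },
            { "ul", new string[0] },
            { "li", new string[0] },
            { "a", new[] { "href" } },
        };

    // Assumed set of safe URL prefixes; everything else is dropped.
    private static readonly string[] AllowedSchemes =
        { "http://", "https://", "mailto:" };

    public static string Sanitize(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);
        SanitizeChildren(doc.DocumentNode);
        return doc.DocumentNode.InnerHtml;
    }

    private static void SanitizeChildren(HtmlNode parent)
    {
        // Iterate over a snapshot because nodes are removed while walking.
        foreach (var node in parent.ChildNodes.ToList())
        {
            if (node.NodeType == HtmlNodeType.Text)
                continue; // plain text is kept as-is

            // Comments and any element not on the whitelist are removed
            // together with their contents (a simplification: Word wrapper
            // tags lose their text too).
            if (node.NodeType != HtmlNodeType.Element ||
                !AllowedTags.ContainsKey(node.Name))
            {
                node.Remove();
                continue;
            }

            var allowedAttrs = AllowedTags[node.Name];
            foreach (var attr in node.Attributes.ToList())
            {
                bool ok = allowedAttrs.Contains(attr.Name, StringComparer.OrdinalIgnoreCase);
                if (ok && attr.Name.Equals("href", StringComparison.OrdinalIgnoreCase))
                    ok = IsSafeUrl(attr.Value);
                if (!ok)
                    attr.Remove();
            }

            SanitizeChildren(node);
        }
    }

    private static bool IsSafeUrl(string value)
    {
        // Decode entities first so an encoded scheme is judged by what
        // the browser would actually see.
        var url = HtmlEntity.DeEntitize(value ?? string.Empty).Trim();
        return AllowedSchemes.Any(
            s => url.StartsWith(s, StringComparison.OrdinalIgnoreCase));
    }
}
```

Calling `SafeHtmlSketch.Sanitize(userInput)` before storing or rendering the text should strip script blocks, event-handler attributes, and links with unexpected schemes without ever showing the user an error. Removing unknown elements along with their contents is the bluntest option; a friendlier variant would keep the inner text of harmless wrapper tags.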