A client is in the process of migrating a large website from an old platform to a brand new EPiServer 7.5 site running on an IIS web server. We created a small redirection engine that handles all incoming legacy URLs and redirects them to the proper pages on the new website, so as not to lose search engine ranking, visitor bookmarks and so on.
However, the legacy system allowed dangerous characters, such as ampersands (&), in its URLs, which turned out to be a bit of a hassle. Our redirection engine was put out of play as soon as the EPiServer site received a request to one of these addresses (for instance http://somehost.com/somepath/something-&-somethingelse/?key=value&foo=bar). The built-in System.Web.HttpRequest.ValidateInputIfRequiredByConfig() was of course doing its best to protect us from malicious attempts (as well as legacy URLs) by throwing HttpExceptions.
Exception Details: System.Web.HttpException: A potentially dangerous Request.Path value was detected from the client (&).
Stack Trace:
[HttpException (0x80004005): A potentially dangerous Request.Path value was detected from the client (&).]
   System.Web.HttpRequest.ValidateInputIfRequiredByConfig() +12715107
   System.Web.PipelineStepManager.ValidateHelper(HttpContext context) +166
Googling the problem mostly turned up potentially stupid suggestions like allowing dangerous characters in URLs, turning off request validation, allowing double escaping, fiddling with dark magic, and other things that would definitely make things more challenging down the road.
The solution for this is not as complicated as one might think; it requires only a couple of lines of configuration for your site.
web.config
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Rewrite URL to remove &amp; from path.">
        <match url="^(.*)&amp;(.*)$" />
        <conditions logicalGrouping="MatchAny" />
        <action type="Rewrite" url="{R:1}and{R:2}" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
This adds a rule to the IIS URL Rewrite Module that runs before validation is performed on the incoming request and replaces the ampersand character in the path with the word "and". Note that the ampersand has to be written as &amp; inside web.config, since the file is XML. In other words, the previously mentioned URL http://somehost.com/somepath/something-&-somethingelse/?key=value&foo=bar becomes http://somehost.com/somepath/something-and-somethingelse/?key=value&foo=bar before being validated and passed on to the website. The query string is left untouched, because the match pattern is only applied to the path.
After this, it was only a small matter of tweaking the generated legacy URLs so that the redirection engine responds to "and" instead of "&" wherever this occurred. Should you need to remove other illegal characters as well, just update the match regex, for instance:
<match url="^(.*)[&lt;&gt;&amp;%*:?]+(.*)$" />
And update your rewrite action accordingly.
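As a rough sketch of what that could look like, here is a complete rule that replaces a run of such characters with a single hyphen. The rule name and the choice of "-" as the replacement are assumptions made for illustration, so pick whatever matches the URLs your redirection engine expects.

```xml
<system.webServer>
  <rewrite>
    <rules>
      <!-- Hypothetical rule name; any unique name will do. -->
      <rule name="Replace illegal characters in path">
        <!-- The special characters are XML-escaped because web.config is XML.
             Inside a character class none of them need a regex escape. -->
        <match url="^(.*)[&lt;&gt;&amp;%*:?]+(.*)$" />
        <!-- Assumption: substitute a hyphen for the matched characters. -->
        <action type="Rewrite" url="{R:1}-{R:2}" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```

One thing to keep in mind is that the first (.*) is greedy, so the pattern only captures the last run of illegal characters in the path. If a legacy URL can contain several of them in different places, a single pass of this rule will most likely leave the earlier ones in place, and you may need additional rules or a different pattern to cover them.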
Thanks for the rule/regex. Those exceptions (both elmah and eventviewer) were also bothering me!
Hi, which web.config should be updated, and where is it located?
Thanks!
The one in your website’s root directory would do just fine, and it’s in the system.webServer section.