Rewriting the DotNetNuke Url Rewriter Module

Ref: http://www.ifinity.com.au/Blog/Technical_Blog/EntryID/19/

As part of my ongoing interest in making DotNetNuke websites friendlier to both people and search engines, I started hunting around again to see what was available. While the latest DotNetNuke releases handle site rewrite rules well (such as an automatic home.aspx redirection), the crucial problem for me is that the site doesn't generate simple Urls (instead it still outputs pagename/tabid/nn/default.aspx).

I had previously looked at the Inventua Http module, and while it is a good product, it didn't quite suit my needs. I guess it came down to not being able to get the source code either. It also didn't work with the Catalook module, which I have done some work with and still support on some sites.

Further searching brought me to Scott McCulloch's 'Friendly Urls' work on his site at http://www.ventrian.com/Resources/Projects/FriendlyUrls.aspx (it's even a friendly url!). This showed a bit more promise, because he was automatically rewriting the Urls based on the path of the page. But Scott left 'Human Readable Urls for Pages With Multiple Parameters' listed under 'items for discussion'.

How the Url Rewriting in DotNetNuke works

Without going into the explicitly technical details, here's a quick primer. There are two facets to Url Rewriting in DotNetNuke:

  • Generating Friendly Urls for Hyperlinks, menu items and other outputs.
  • Interpreting those Friendly Urls when requested, and working out which page to show.

Generating Friendly Urls for Hyperlinks

This is really the easy part, because theoretically you can generate the friendliest Urls in the world to show - it's only when they have to be re-interpreted that it becomes a problem! Having said that, this is how it works:

  • Some code, somewhere, calls DotNetNuke.Common.Globals.NavigateUrl(). This call requires the desired Tab (page), the Portal, and any Url parameters. There are a few different versions of it, but they all distill down to the same call.
  • NavigateUrl determines the Http Alias of your Portal (ie www.ifinity.com.au), determines the TabId, and adds it to any query parameters. This results in the familiar Url of www.ifinity.com.au/default.aspx?tabid=38
  • NavigateUrl then checks if the Host setting of 'UseFriendlyUrls' is set to 'true' and, if so, calls the DNNFriendlyUrl provider as specified in the web.config. By default this is 'DotNetNuke.HttpModules.UrlRewrite.dll', which is part of the base code.
  • The DNNFriendlyUrl provider must have an implementation of a function called 'FriendlyUrl', which reformats the generated Url into a 'friendly' format. In the case of the standard provider, the url of www.ifinity.com.au/default.aspx?tabid=38 would come back as www.ifinity.com.au/Home/Tabid/38/default.aspx

The 'standard' FriendlyUrl provider does this by performing a simple reshuffling of the various parts of the querystring to achieve the necessary output. It's fast and works pretty well.
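To make that reshuffling concrete, here is a minimal sketch in Python. It is not the provider's own .NET code; the function name `friendly_url` and the idea of passing the page name in as an argument are my assumptions for illustration. It simply moves each querystring key and value into the path, in order, ahead of the page file name.

```python
from urllib.parse import urlsplit, parse_qsl


def friendly_url(url: str, page_name: str) -> str:
    """Turn .../default.aspx?tabid=38&k=v into .../PageName/tabid/38/k/v/default.aspx.

    Hypothetical helper: the real provider looks the page name up from the tab,
    here it is simply passed in. The url must include a scheme (http://...).
    """
    parts = urlsplit(url)
    # Keep the querystring pairs in their original order.
    pairs = parse_qsl(parts.query, keep_blank_values=True)

    segments = [page_name]
    for key, value in pairs:
        segments.extend([key, value])

    # Split "/default.aspx" into the folder part and the page file name.
    folder, _, page_file = parts.path.rpartition("/")
    return f"{parts.scheme}://{parts.netloc}{folder}/" + "/".join(segments) + f"/{page_file}"


print(friendly_url("http://www.ifinity.com.au/default.aspx?tabid=38", "Home"))
```

Because nothing is looked up or computed beyond moving the parts around, this kind of transformation stays cheap, which matters given how many links a single page can contain.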

Interpreting Friendly Urls

The trickier part is determining what /home/tabid/38/default.aspx means when someone clicks on a hyperlink with this address. To do this, DotNetNuke uses the 'UrlRewrite' HttpModule. Now, in the standard DotNetNuke build, the UrlRewrite module and the FriendlyUrl provider are in the same assembly for ease of packaging and coding, but they don't necessarily have to be like that. When a Http Request comes into the DotNetNuke code (ie, you click on a hyperlink to load a new page) the DNN base throws the incoming Url to the UrlRewrite module, with an implicit request 'turn this into something I can understand, will you?'

You should remember that the DNN base is just a page called default.aspx, and it only knows how to interpret a classic style query string - ie ?tabid=38&key1=value1&key2=value2 etc.

This is the sequence of events:

  • The incoming System.Web.HttpContext.Current.Request (Request for short) is directed to the UrlRewrite HttpModule, in the 'OnBeginRequest' event.
  • The request is inspected for certain exception cases, and then put through a regular expression to determine the TabId and (sometimes) the PortalId.
  • Assuming the TabId is found in the incoming Url, the Request.Url is rewritten in the style expected, putting the TabId into the output parameters.
  • The request passes back to the DotNetNuke framework, and the requested page is loaded.

All this happens for each and every request made to a DNN portal, so performance and scalability are paramount in any changes made in this area. If you are so inclined, I suggest stepping through the code one day just to see how often it gets called, and how much work goes on in the background to provide friendly Urls. It's when you do this that you start to understand some of the issues surrounding a fully-dynamic website and fully-dynamic Urls.
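The interpreting step can be sketched in a few lines of Python. This is only an illustration of the idea, not the module's .NET code: the pattern `FRIENDLY` and the function `rewrite` are names I have made up, and the real module's regular expression and exception cases are more involved.

```python
import re
from typing import Optional

# Assumed pattern: anything, then /tabid/<number>, then zero or more
# /key/value pairs, then /default.aspx at the end of the path.
FRIENDLY = re.compile(
    r"^(?P<prefix>.*?)/tabid/(?P<tabid>\d+)(?P<rest>(?:/[^/]+/[^/]+)*)/default\.aspx$",
    re.IGNORECASE,
)


def rewrite(path: str) -> Optional[str]:
    """Return the classic querystring form of a friendly path, or None if it doesn't match."""
    match = FRIENDLY.match(path)
    if match is None:
        return None  # not a friendly Url; leave the request alone

    query = [f"tabid={match['tabid']}"]
    parts = [p for p in match["rest"].split("/") if p]
    # Pair the remaining segments up as key/value.
    for key, value in zip(parts[0::2], parts[1::2]):
        query.append(f"{key}={value}")
    return "/default.aspx?" + "&".join(query)


print(rewrite("/home/tabid/38/default.aspx"))
print(rewrite("/MyPage/TabId/38/Key1/Value1/Key2/Value2/default.aspx"))
```

Note that the TabId has to be present in the incoming path for this to work at all, which is exactly why the standard friendly Urls still carry it.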

How I set out to Improve it for my own purposes

Please note : the changes I am discussing here were changes I made to suit my own needs, and those needs are probably not aligned with many people in the DotNetNuke user group. So none of this is a criticism of the base code, or other people's work. It's just a discussion on getting it to work the way I wanted.

My requirements were the same as what Scott outlined in his article: better Urls for human and search engine purposes. However, most work I do with DNN modules tends to involve a lot of parameters in the query string, and so his open problem of what to do with multiple querystring parameters was the same as my own.

Specifically, I am developing two new modules: a Tagging module for tagging DNN content, and a Directory Module, for storage of Organization Directories. One of the principal aims of these two modules was nice-looking Urls with high keyword ratios in the Url. But I soon struck the same problem - what do you do with multiple parameters?

My Answer

My solution to this question was to re-arrange the order of the parameters, and use the key name of the first parameter as the page name in the Url. Confused? I bet! Here's what I mean:

  • Original, standard DotNetNuke friendly Url: /MyPage/TabId/38/Key1/Value1/Key2/Value2/default.aspx
  • My rewritten Url: /MyPage/Value1/Key2/Value2/Key1.aspx
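As a rough sketch of that re-arrangement in Python (not the module's actual .NET code): the function name `reorder_friendly_url` is hypothetical, and I am assuming a single-segment page name and that the TabId pair is simply dropped, which is how I read the two example Urls above.

```python
def reorder_friendly_url(path: str) -> str:
    """Move the first parameter's key to the end as the page file name.

    /MyPage/TabId/38/Key1/Value1/Key2/Value2/default.aspx
        -> /MyPage/Value1/Key2/Value2/Key1.aspx
    """
    segments = path.strip("/").split("/")
    # First segment is the page name, last is default.aspx, the rest are key/value pairs.
    page_name, flat = segments[0], segments[1:-1]
    pairs = list(zip(flat[0::2], flat[1::2]))
    # Assumption: the TabId pair does not appear in the rewritten Url.
    pairs = [(k, v) for k, v in pairs if k.lower() != "tabid"]

    if not pairs:
        return f"/{page_name}.aspx"

    (first_key, first_value), rest = pairs[0], pairs[1:]
    out = [page_name, first_value]
    for key, value in rest:
        out.extend([key, value])
    out.append(f"{first_key}.aspx")
    return "/" + "/".join(out)


print(reorder_friendly_url("/MyPage/TabId/38/Key1/Value1/Key2/Value2/default.aspx"))
```

The value of the first parameter ends up right next to the page name, which is where the keyword-rich part of the Url is most useful.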

While that may look a little strange and have you scratching your head, I can assure you there's method to my madness. The first site I have implemented this on is a List of Auctioneers in Australia. It has the implementation of my new Tagging module and my new Directory module. Here are two Urls in that site, in 'original' and rewritten form:

Note: the rewriting has nothing to do with the module code, it is all in the HttpModule.URLRewrite Assembly.

I consider this a 'first draft' because I'm still not totally happy with it. I actually want to implement the Tag miniformat, and to do that I need to have the tag url looking like this : /TagList/Tag/Auctioneer - with no parameters on the end. It's certainly possible given the code I've written, but I'll leave it for a bit further down the track.

What about the old code?

That all works great for a new site such as www.auctionlink.com.au because, in combination with my DotNetNuke Google Sitemap Provider, the googlebot has hoovered up all the friendly Urls and integrated them into the index nicely. So if you search for AuctionLink.com.au on Google, you should see the friendly Urls rather than the tabid/nn/default.aspx-style Urls.

But what about sites that have been in the index for a while? Indeed, this is the problem with the ifinity.com.au site - it is in the Google index with the standard friendly Urls. I'd like to implement my version of the UrlRewrite module for this site as well, and get some friendlier Urls happening. But Google and others have the old Urls already in the index, and, because the DNN framework will still respond to the older style Urls, I could even get penalised by search engines for duplicate content - which is when the search engines deem you to have two distinct Urls pointing to the same content. Even if that's not a problem, I'd still like to have the best Urls showing in the index.

Enter the 301 Redirect

Reading Matt Cutts' blog on Google about 301 redirects got me thinking - I could issue a 301 redirect for every request that came into the site for an older style Url. That way search engines which take notice of the Http standards should eventually update their indexes to include the new content.

If you don't know what a 301 redirect is, it's a Http status code of '301 - Moved Permanently'. It's basically saying, yes I have the content for you, but now it's over here: please update your references. Some agents (certain browsers) will actually call the new Url instead, whereas others will just ignore it and show the content if it's available. Google have stated (via Matt Cutts' blog) that they do read and obey 301, and you can use it to advise the Googlebot of new locations.

Implementing 301 Redirects in UrlRewrite

The code does this by detecting if a Url came into the site as an old style friendly Url - ie tabid/38/default.aspx. Then it works out what the 'friendly' Url should be : home.aspx. If the incoming Url and the friendly Url are different, then a 301 status is returned and the new friendly Url is given as the new location. There are exceptions to this, and there is a switch in the web.config to disable it once there are few 'old style' Urls coming into the site (presumably once the search indexes are updated).

There is also a section in the web.config for excluding certain pages from rewrites. For instance, in the auctionLINK site, I have implemented a search function that deliberately uses a non-friendly query string. This page is excluded from 301 redirects because I don't want the query string to be friendly.

Here is the result, as shown by Fiddler, the free http monitoring tool distributed by Microsoft:

  • Original Request (raw):
    GET /Home/tabid/37/Default.aspx HTTP/1.1
    Accept: */*
    Accept-Language: en-au
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; InfoPath.1)
    Host: www.auctionlink.com.au
    Proxy-Connection: Keep-Alive
    Pragma: no-cache
  • Response from Website (raw):
    HTTP/1.1 301 Moved Permanently
    Date: Thu, 30 Aug 2007 07:26:32 GMT
    Server: Microsoft-IIS/6.0
    X-AspNet-Version: 2.0.50727
    Location: http://www.auctionlink.com.au/Home.aspx
    Cache-Control: private
    Content-Length: 0
    Connection: close
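The redirect decision described in this section boils down to a handful of checks. Here is a hedged sketch in Python rather than the module's .NET code; the function `redirect_for`, the `OLD_STYLE` pattern and the argument names are all my own illustrative choices, with `redirect_unfriendly` and the semi-colon delimited exclusion list standing in for the web.config settings.

```python
import re
from typing import Optional, Tuple

# Assumed test for an "old style" friendly Url: it still carries /tabid/<number>/.
OLD_STYLE = re.compile(r"/tabid/\d+/", re.IGNORECASE)


def redirect_for(
    request_path: str,
    friendly_path: str,
    tab_path: str,
    redirect_unfriendly: bool = True,
    excluded_tab_paths: str = "",
) -> Optional[Tuple[int, str]]:
    """Return (301, new location) if the request should be redirected, else None."""
    if not redirect_unfriendly:
        return None  # the switch is off: never redirect

    excluded = {p.strip().lower() for p in excluded_tab_paths.split(";") if p.strip()}
    if tab_path.lower() in excluded:
        return None  # this page is explicitly excluded from redirects

    if not OLD_STYLE.search(request_path):
        return None  # not an old style Url, nothing to do

    if request_path.lower() == friendly_path.lower():
        return None  # already the friendly Url; redirecting would loop

    return 301, friendly_path


print(redirect_for("/Home/tabid/37/Default.aspx", "/Home.aspx", "home"))
```

How the friendly path is worked out is deliberately left outside this function; the sketch only covers the decision of whether to answer with a 301.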

Installing and Using the Code

I have provided the source code for the HttpModule.UrlRewrite version in the Free Downloads page of this site. Steps to use it are:

  • Download the code from the Free Downloads Page.
  • Copy the 'DotNetNuke.HttpModule.UrlRewrite' binary into your website's \bin directory (it won't do anything until you change the web.config)
  • Backup your web.config, then modify the 'DNNFriendlyUrl' provider section so that it looks like the following:
  • If you would prefer not to use the 301 redirects, set redirectUnfriendly="false"
  • If you would like to use the 301 redirects, but not for a certain page/pages, put the 'TabPath' value for those pages in a semi-colon delimited list - ie home;enquires;products/myproduct;  (you should only use one forward slash, not two like the way it is stored in the database)  Note in the example above, I have excluded 'SearchResults' from redirects.
  • If you are using Redirects, it's a good idea to use a tool like Fiddler to check that your site is working as expected. Check every page, because unfortunately it is possible for the redirect code to get stuck in an endless loop (if anyone has an idea how to detect/stop this, I'm all ears).
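One way to check for redirect loops from outside the site is to follow each 301 yourself and remember where you have already been. The sketch below is only that, a sketch: `follow_redirects` is a made-up name, and `fetch` is a hypothetical function you would supply that requests a Url without following redirects automatically and returns the status code and the Location header (or None).

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: fetch(url) -> (status code, Location header or None)
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def follow_redirects(url: str, fetch: Fetch, max_hops: int = 10) -> List[str]:
    """Follow 301 responses from url and return the chain of Urls visited.

    Raises RuntimeError if a Url repeats (a loop) or the chain exceeds max_hops.
    """
    chain = [url]
    seen = {url.lower()}
    while True:
        status, location = fetch(chain[-1])
        if status != 301 or not location:
            return chain  # reached a page that is not redirecting
        if location.lower() in seen:
            raise RuntimeError("redirect loop: " + " -> ".join(chain + [location]))
        if len(chain) >= max_hops:
            raise RuntimeError("too many redirects starting from " + url)
        seen.add(location.lower())
        chain.append(location)


# Usage with a canned table of responses instead of a live site.
responses = {"/Home/tabid/37/Default.aspx": (301, "/Home.aspx"), "/Home.aspx": (200, None)}
print(follow_redirects("/Home/tabid/37/Default.aspx", lambda u: responses[u]))
```

Run over every Url in a sitemap, a check like this would flag looping pages before a search engine or visitor finds them. It does not stop the loop inside the module itself.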

The Disclaimer

This code is really in a BETA state.  It is pretty fresh off the code production line and hasn't been totally tested in anger yet.  If you do install it, please promise me that you will test it in a non-critical environment first, and that you will check every Url on your site to make sure that it works as you expect.  I'm still developing it myself and may post updates if I think they are worthwhile.

This code is a branch of Scott McCulloch's work, so full credit to him for getting me started in the right direction. And his work is a branch off the original work for the DotNetNuke base by Charles Nurse, so full credit to him as well. I don't claim credit for much of it at all, but by downloading it you are subjecting yourself to the license of the DotNetNuke framework, so play nicely and attribute credit where credit is due.

Copyright Bruce Chapman 2007

