Colin Cochrane

Colin Cochrane is a Software Developer based in Victoria, BC specializing in C#, PowerShell, Web Development and DevOps.

Search Friendly Development On The Microsoft Stack Presentation

As promised, I have made my presentation from SMX Advanced available for download for those who are interested.  

Search Friendly Development On The Microsoft Stack (599.00 kb).

Thanks again to all those who attended, and special thanks to the organizers of SMX Advanced who put on a fantastic conference, and to Vanessa Fox for inviting me to come and speak.  Developer Day was a great success, and I can't wait to see what happens next year. 

 

IIS 7 Site Won't Start After Upgrading to Vista Service Pack 1

After letting Service Pack 1 install overnight, I logged in to my machine this morning looking forward to exploring some of the new features added to IIS 7.0.  Unfortunately, there was a small problem with one of the local web applications that I host from my machine.  Simply put, the application refused to start in IIS, and each attempt to start it resulted in a modal pop-up informing me that the process was in use.  A quick peek at the error log showed the following:

[Screenshots of the related error log entries]

After a quick search I found a KB article over at Microsoft that addressed the problem.  As directed by the article, I popped open the command prompt and ran netstat -ano to get a list of which processes were listening for network traffic on which ports.  The first entry on the list identified the problem process.

[Screenshot of the netstat -ano output]

(The screencap was taken after the problem was fixed, so the PID is not the same.)  I opened up Task Manager to find out which process was at fault, and it turned out that Skype was listening for incoming traffic on port 80 for some silly reason.  I closed Skype and attempted to start the website and, lo and behold, it worked.  I started Skype again, and everything was back to normal.
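
If you run into the same symptom, the check only takes two commands: filter the netstat output for port 80, then look up the owning process by its PID (the 1234 below is just a placeholder for whatever PID netstat reports):

netstat -ano | findstr ":80"
tasklist /FI "PID eq 1234"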

I thought this may be of use to anyone who encounters a similar problem.

ASP.NET Custom Errors: Preventing 302 Redirects To Custom Error Pages

 
You can download the HttpModule here.
 
Defining custom error pages is a convenient way to show users a friendly page when they encounter an HTTP error such as a 404 Not Found or a 500 Server Error.  Unfortunately, ASP.NET handles custom error pages by responding with a 302 temporary redirect to the error page that was defined. For example, consider an application that has IIS configured to map all requests to it and has the following customErrors element defined in its web.config:
 
<customErrors mode="RemoteOnly" defaultRedirect="~/error.aspx">
  <error statusCode="404" redirect="~/404.aspx" />
</customErrors>

If a user requested a page that didn't exist, then the HTTP response would look something like:

http://www.domain.com/non-existant-page.aspx --> 302 Found
http://www.domain.com/404.aspx  --> 404 Not Found
Date: Sat, 26 Jan 2008 03:08:21 GMT
Server: Microsoft-IIS/6.0
Content-Length: 24753
Content-Type: text/html; charset=utf-8
X-Powered-By: ASP.NET
 
As you can see, there is a 302 redirect that occurs to send the user to the custom error page.  This is not ideal for two reasons:

1) It's bad for SEO

When a search engine spider crawls your site and comes across a page that doesn't exist, you want to make sure you respond with an HTTP status of 404 and send it on its way.  Otherwise you may end up with duplicate content issues or indexing problems, depending on the spider and search engine.

2) It can lead to more incorrect HTTP status responses

This ties in with the first point, but can be significantly more serious.  If the custom error page is not configured to respond with the correct status code, then the HTTP response could end up looking like:

http://www.domain.com/non-existant-page.aspx --> 302 Found
http://www.domain.com/404.aspx  --> 200 OK
Date: Sat, 26 Jan 2008 03:08:21 GMT
Server: Microsoft-IIS/6.0
Content-Length: 24753
Content-Type: text/html; charset=utf-8
X-Powered-By: ASP.NET
 
This would almost guarantee duplicate content issues for the site with the search engines, as the search spiders will simply assume that the error page is a normal page like any other. Furthermore, it will probably cause some website and server administration headaches, as HTTP errors won't be accurately logged, making them harder to track and identify.
I tried to find a solution to this problem, but I didn't have any luck finding anything other than other people who were also looking for a way around it.  So I did what I usually do, and created my own solution.
 
The solution comes in the form of a small HTTP module that hooks onto the HttpApplication.Error event.  When an error occurs, the module checks whether the error is an HttpException.  If it is, then the following process takes place:
  1. The response headers are cleared (context.Response.ClearHeaders() )
  2. The response status code is set to match the actual HttpException.GetHttpCode() value (context.Response.StatusCode = HttpException.GetHttpCode())
  3. The customErrorsSection from the web.config is checked to see if the HTTP status code (HttpException.GetHttpCode() ) is defined.
  4. If the statusCode is defined in the customErrorsSection then the request is transferred, server-side, to the custom error page. (context.Server.Transfer(customErrorsCollection.Get(statusCode.ToString).Redirect) )
  5. If the statusCode is not defined in the customErrorsSection, then the response is flushed, immediately sending the response to the client.(context.Response.Flush() )

Here is the source code for the module.

Imports System.Web
Imports System.Web.Configuration

Public Class HttpErrorModule
  Implements IHttpModule

  Public Sub Dispose() Implements System.Web.IHttpModule.Dispose
    'Nothing to dispose.
  End Sub

  Public Sub Init(ByVal context As System.Web.HttpApplication) Implements System.Web.IHttpModule.Init
    AddHandler context.Error, New EventHandler(AddressOf Context_Error)
  End Sub

  Private Sub Context_Error(ByVal sender As Object, ByVal e As EventArgs)
    Dim context As HttpContext = CType(sender, HttpApplication).Context
    If (context.Error.GetType Is GetType(HttpException)) Then
      ' Get the Web application configuration.
      Dim configuration As System.Configuration.Configuration = WebConfigurationManager.OpenWebConfiguration("~/web.config")

      ' Get the section.
      Dim customErrorsSection As CustomErrorsSection = CType(configuration.GetSection("system.web/customErrors"), CustomErrorsSection)

      ' Get the collection.
      Dim customErrorsCollection As CustomErrorCollection = customErrorsSection.Errors

      Dim statusCode As Integer = CType(context.Error, HttpException).GetHttpCode

      'Clears existing response headers and sets the desired ones.
      context.Response.ClearHeaders()
      context.Response.StatusCode = statusCode
      If (customErrorsCollection.Item(statusCode.ToString) IsNot Nothing) Then
        context.Server.Transfer(customErrorsCollection.Get(statusCode.ToString).Redirect)
      Else
        context.Response.Flush()
      End If

    End If

  End Sub

End Class

The following element also needs to be added to the httpModules element in your web.config (replace the attribute values if you aren't using the downloaded binary):

<httpModules>
<add name="HttpErrorModule" type="ColinCochrane.HttpErrorModule, ColinCochrane" />
</httpModules>
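
If the application is running under IIS 7's integrated pipeline, the equivalent registration belongs in the <modules> element under <system.webServer> rather than <httpModules>; a minimal sketch using the same attribute values:

<system.webServer>
  <modules>
    <add name="HttpErrorModule" type="ColinCochrane.HttpErrorModule, ColinCochrane" />
  </modules>
</system.webServer>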

And there you go! No more 302 redirects to your custom error pages.
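
With the module in place, the earlier request trace should instead come back along the lines of:

http://www.domain.com/non-existant-page.aspx --> 404 Not Found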

Catching Unwanted Spiders And Content Scraping Bots In ASP.NET


If you have a blog that is even moderately popular then you have likely fallen victim to some form of content scraping.  Ever since it became possible to earn money through ads on a website there have been people trying to find ways to cheat the system.  The most widespread example of this comes in the form of splogs and similar spam-based websites, which consist only of ads from Google AdSense and duplicated content that is scraped from other sites.  In this post I will share a method you can use to identify "evil" spiders and content scraping bots that are wasting your website's resources.

I'll start off by defining what is considered an "evil" spider/bot.  For our purposes here, we'll be looking at spiders and bots that ignore robots.txt and nofollow when crawling a site.  These are spiders and bots that offer no value to you in allowing them to crawl your site, as the major search engines use spiders and bots that respect these rules (with the unique exception of MSN who employs a certain bot that presents itself as a regular user in order to identify sites that present different content to search engine spiders than users).  

Of these valueless spiders, some are almost certainly going to be some form of content scraping bot, which is sent to literally copy the content of your site for use elsewhere.  It is in your best interest to limit how much of your content gets scraped because you want visitors coming to your site, not some spam-filled facsimile.

This method of identifying unwanted spiders involves creating a trap, which can be set up as follows:

1) Create a Hidden Page

To identify these undesired visitors you need to isolate them.  Create a page on your site, but do not link to it from anywhere just yet.  For the purposes of my examples, I'll call our example page "trap.aspx".

[Screenshot: the trap page added to the site]

Now you want to disallow this page in your robots.txt.

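
The robots.txt entry for the trap page would look something like this:

User-agent: *
Disallow: /trap.aspx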

Disallowing the trap page in robots.txt prevents good spiders from crawling it.  What is needed now is a link to the trap page with the rel="nofollow" attribute, which should be placed on your home page for maximum effect.  The link must be invisible to users, otherwise you might mistake an unwitting visitor for a bad spider.

<a rel="nofollow" href="/trap.aspx" style="display:none;" />

This creates a situation in which the only requests for "/trap.aspx" will be from a spider or bot that ignores both robots.txt and nofollow, which is exactly the kind of bots we want to identify.

2) Create a Log File

Create an XML document and name it "trap.xml" (or whatever you want) and place it in the App_Data folder of your application (or wherever you want, as long as the application has write-access to the directory).  Open the new XML document and create an empty root-element "<trapRequests>" and ensure it has a complete closing tag.

<?xml version="1.0" encoding="utf-8"?>
<trapRequests>
</trapRequests>
 
You can use whatever method works best for you to log the requests; you do not need to use an XML document.  I am using XML for the purposes of this example.

3) Log What Gets Caught In The Trap

With the trap in place, you now want to keep track of the requests being made for "trap.aspx".  This can be accomplished quite easily using LINQ, as illustrated in the following example:

Imports System.Xml.Linq

Partial Class trap_aspx
  Inherits System.Web.UI.Page

  Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    LogRequest(Request.UserHostAddress, Request.UserAgent)
  End Sub

  Private Sub LogRequest(ByVal ipAddress As String, ByVal userAgent As String)
    Dim logFile As XDocument

    Try
      logFile = XDocument.Load(Server.MapPath("~/App_Data/trap.xml"))
      logFile.Root.AddFirst(<request>
                              <date><%= Now.ToString %></date>
                              <ip><%= ipAddress %></ip>
                              <userAgent><%= userAgent %></userAgent>
                            </request>)
      logFile.Save(Server.MapPath("~/App_Data/trap.xml"))
    Catch ex As Exception
      My.Log.WriteException(ex)
    End Try
  End Sub

End Class

This code sets it up so every request for this page is logged with:

  1. The Date and Time of the request.
  2. The IP address of the requesting agent.
  3. The User Agent of the requesting agent.

You can, of course, customize what information is logged to your preference.  The code will need to be adjusted if you are using a different storage method.  Once done, you will end up with an XML log file (or your custom store) with every request to "trap.aspx" that will look like:

<?xml version="1.0" encoding="utf-8"?>
<trapRequests>
<request>
<date>12/30/2007 12:54:20 PM</date>
<ip>1.2.3.4</ip>
<userAgent>ISCRAPECONTENT/1.2</userAgent>
</request>
<request>
<date>12/30/2007 2:31:51 PM</date>
<ip>2.3.4.5</ip>
<userAgent>BADSPIDER/0.5</userAgent>
</request>
</trapRequests>
 
Now you've set your trap, and any unwanted bots and spiders that find it will be logged.  You are then free to use the logged data to deny access by offending IP, User Agent, or whatever criteria you decide is appropriate for your site.
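
As a rough illustration of putting that log to use, here is a minimal Global.asax sketch that refuses requests from any IP address found in the trap log.  Re-reading the XML on every request is wasteful, so in practice you would cache the list, and the 403 response is just one way you might choose to deny access:

Sub Application_BeginRequest(ByVal sender As Object, ByVal e As EventArgs)
    Dim log As System.Xml.Linq.XDocument = System.Xml.Linq.XDocument.Load(Server.MapPath("~/App_Data/trap.xml"))

    'Deny the request if its IP address has previously hit the trap page.
    For Each entry As System.Xml.Linq.XElement In log.Root.<request>
        If entry.<ip>.Value.Trim() = Request.UserHostAddress Then
            Response.StatusCode = 403
            Response.End()
        End If
    Next
End Sub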

Visual Studio 2008 Initial Impressions - Part One

The release of Visual Studio 2008 and the .NET Framework 3.5 this past Monday has created a considerable buzz in the .NET community.  With language enhancements such as LINQ (Language Integrated Query) and lambda expressions, as well as a plethora of refinements to the IDE itself, there are a lot of new tools at our disposal.  I was very eager to get acquainted with these new tools, so I installed a copy of Team Edition and spent almost every free moment this week familiarizing myself with them.  Here are some of my initial impressions.

1) LINQ To SQL Classes

I work with a lot of applications that depend heavily on a backend database, so I've coded my fair share of business logic layers which can be quite tedious.  LINQ To SQL Classes take a lot of the grunt work out of that process by providing a convenient visual designer that performs automatic object-relational mapping.  All you have to do is drag a table or stored procedure from the Server Explorer to the design window and the designer automatically creates a strongly-typed object or method that is ready for use in your application.
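
To give a rough idea of what you get back from the designer, here is a quick sketch of querying one of the generated classes (NorthwindDataContext, Products and ProductName are hypothetical names standing in for whatever the designer creates from your own database):

' Hypothetical designer-generated context and table; the names will match your own database.
Dim db As New NorthwindDataContext()

Dim widgets = From p In db.Products
              Where p.ProductName.StartsWith("Widget")
              Order By p.ProductName
              Select p

For Each product In widgets
    Response.Write(product.ProductName & "<br />")
Next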




2) Intellisense Enhancements

There were a couple of really nice usability enhancements to Intellisense in Visual Studio 2008.  Now, as you type, the Intellisense list automatically filters down based on what you've entered so far.  For instance, if you have typed "MyObject.ToS" the list is filtered to only show the items that start with "ToS", which does a nice job of speeding things up.  The other enhancement addresses the issue that many people had with previous versions of Visual Studio, where the Intellisense list would often obscure chunks of your code, forcing you to close the window if you had to check something underneath it.  Now you just have to hold "Ctrl" while the list is open and it becomes semi-transparent, allowing you to see the code underneath.

 



3) Improved IDE Performance

Not a "feature", necessarily, but a welcome improvement to Visual Studio.  You'll notice this as soon as you load the environment for the first time and discover how quickly the environment loads.  The performance improvements don't stop there either, as the IDE is a lot faster and responsive throughout.


Stay tuned for Part Two, where I'll go into some more features of LINQ as well as some of the language upgrades given to Visual Basic.

Reducing Code Bloat - Or How To Cut Your HTML Size In Half

When it comes to designing and developing a web site, load time is one consideration that is often ignored, or is an afterthought once the majority of the design and structure is in place.  While high-speed internet connections are becoming increasingly common, there is still a significant portion of web users out there with 56k connections, and even those with broadband connections aren't guaranteed to have a fast connection to your particular server.  Every second that a user has to wait for your content to download increases the chance of that user deciding to move on.

Attempting to reduce the size of a web page is usually restricted to compressing larger images and optimizing them for web use.  This is a necessary step in managing page size, but there is another important factor that can significantly reduce the size of a page and improve download times: code bloat, or more specifically, (x)HTML code bloat.  Getting rid of that code bloat means fewer bytes for clients to download, as well as capitalizing on what the client has already downloaded.  Unfortunately this is an option that tends to be ignored due to the perceived loss of time spent combing through markup to cut out the chaff, despite the fact that clean, efficient markup with well-planned style definitions will save countless hours when it comes to upkeep and maintenance.

To demonstrate the difference between a bloated page and one with efficient markup I created two basic pages.  One uses tables, font tags, HTML style attributes and so forth to control the structure and look of the page, while the other uses minimal markup with an external stylesheet.

1) Bloated Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>This Tag Soup Is Really Thick</title>
<meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
<meta name="keywords" content="tag soup, messy html, bloated html">
</head>
<body>
<center>
  <table width=700 height=800 bgcolor="gainsboro" border=0 cellpadding=0 cellspacing=0>
    <tr>
      <td valign=top width=150><table align=center border=0 cellpadding=0 cellspacing=0>
          <tr>
            <td align="left"><a href="#home" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Home Page</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#about" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">About Me</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#links" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Links</font></a></td>
          </tr>
        </table></td>
      <td valign=top><table>
          <tr>
            <td align="center" height=64><h1><font color="red" face="Geneva, Arial, Helvetica, sans-serif">Welcome to My Site!</font></h1></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Isn&acute;t it surprisingly ugly and bland?</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est. Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Maecenas augue diam, sagittis eget, cursus at, vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc. Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor.</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec, ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit, rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum. Phasellus erat. Sed eget turpis tristique eros cursus gravida. Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Praesent tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Etiam tristique mauris at nibh sodales pretium. In lorem eros, laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper vitae, placerat sed, orci. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
        </table></td>
    </tr>
  </table>
</center>
</body>
</html>

[/code]

2) Cleaned Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Less Markup, More Content</title>
    <meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
    <meta name="keywords" content="tag soup, messy html, bloated html">
    <link href="style.css" rel="stylesheet" type="text/css">
  </head>
  <body>
    <div class="content">
      <ul class="menu">
        <li><a href="#home" title="Not a real link">Home Page</a></li>
        <li><a href="#home" title="Not a real link">About Me</a></li>
        <li><a href="#home" title="Not a real link">Links</a></li>
      </ul>
      <h1>Welcome To My Site!</h1>
      <p>Isn&acute;t it surprisingly ugly and bland?</p>
      <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est.  Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis  odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce  vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem.  Vestibulum ante ipsum primis in faucibus orci luctus et ultrices  posuere cubiliaCurae; Maecenas augue diam, sagittis eget, cursus at,  vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare  dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh  feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio  justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc.  Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor. </p>
      <p>Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec,  ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit,  rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra  ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum.  Phasellus erat. Sed eget turpis tristique eros cursus gravida.  Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan  enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci  luctus et ultrices posuere cubilia Curae; </p>
      <p>Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a  mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh  accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad  litora torquent per conubia nostra, per inceptos hymenaeos. Praesent  tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam  mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante. </p>
      <p>Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce  elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla  id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum  quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in  turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi  tristique senectus et netus et malesuada fames ac turpis egestas.  Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer  adipiscing elit. </p>
      <p>Etiam tristique mauris at nibh sodales pretium. In lorem eros,  laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque  vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer  ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin  sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed  risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper  vitae, placerat sed, orci. Pellentesque habitant morbi tristique  senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante  ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; </p>
    </div>
  </body>
</html>

[/code]

External Style Sheet

[code:css]

@charset "utf-8";
body{font:13pt Geneva, Arial, Helvetica, sans-serif;}
.menu{float:left;height:800px;list-style-type:none;width:150px;}
.menu a, .menu a:visited{color:#4FB322;}
.content{background:gainsboro;color:white;margin:auto;position:relative;width:700px;}
h1{color:red;line-height:64px;text-align:center;}
p{margin:4px;}

[/code]

Even in this basic example you can see a fairly dramatic improvement when the excess HTML is trimmed and CSS is used to control style.  The original page is 51 lines, whereas the cleaned page is only 26 lines, plus 7 lines in the style sheet.  The cleaned page is a third the size of the original (counting the style sheet), and more realistically is half the size, because the style sheet would be cached by most client browsers and wouldn't be downloaded on every page request.  In raw kilobytes it's a difference of 6KB versus 4KB, which isn't a particularly exciting difference in this case, but one that is quickly magnified as the length of the page increases.  This is especially true with dynamic applications that pull content from a database, particularly content such as product listings that repeat the same markup multiple times. Fortunately, in the case of dynamic pages involving looping procedures that output the same markup with different content, cutting down the bloat can be as easy as a few modifications to those procedures.

Furthermore, if you wanted to change, for instance, the font color or the line-height in the original, you would have to modify every font tag and table cell to accomplish it.  Implementing those changes in the second example requires a single modification to the style sheet.  The time saved here is once again significantly amplified when you consider a situation with multiple pages (in many cases hundreds or even thousands).

When all is said and done, this isn't meant to be a be-all, end-all guide for optimizing your markup, because I could write a book and still not cover it all.  Rather, it is meant to highlight an aspect of web page performance and optimization that is usually swept under the rug in favour of things that are more directly appreciable, such as eye candy and new features.  While clean markup might not be as "glamourous" as other aspects of web development, it is important for keeping load times in check and a crucial factor in reducing the amount of time spent maintaining and updating.

Disabling Configuration Inheritance For ASP.NET Child Applications

Configuration inheritance is a very robust feature of ASP.NET that allows you to set configuration settings in the web.config of a parent application and have them automatically applied to all of its child applications.  There are certain situations, however, where there are configuration settings that you don't want applied to a child application.  The usual course of action is to override the setting in question in the child application's web.config file, which is ideal if there are only a handful of settings to deal with.  It is less than ideal when there are a significant number of settings that need to be overridden, or when you simply want the child application to be largely independent of its parent.

The solution is the configuration <location> element, which allows you to selectively exclude portions of an application's web.config from being inherited by child applications.  This made my job a heck of a lot easier yesterday during the installation of BlogEngine.NET for a client that had a CMS.  Naturally the CMS had a rather lengthy web.config packed with a significant number of settings that were specific to the CMS application and would generally break any child application that someone would install. The installation of the blog application was no exception, as I quickly found out: there were at least 20 different settings that were preventing it from running.  I had the blog application configured as a separate application through a virtual directory that was a subdirectory of the main website application (which included the CMS), so by default ASP.NET was applying all the settings of the main application to the blog application.

The prospect of digging through the CMS's large web.config and identifying every setting I would need to override was not one I was particularly thrilled about, so I turned to the <location> element to save me some time.  Since the blog application was essentially independent of the main application, there were no inherited settings it depended on, so I modified the parent web application's web.config as follows (edited to protect the client).

[code:xml]

<location path="." inheritInChildApplications="false"> 
  <system.web>
  ...
  </system.web>
</location> 

[/code]

By wrapping the <system.web> element with the <location> element, setting the path attribute to "." and the inheritInChildApplications attribute to "false", I prevented every child application of the main web application from inheriting the settings in the <system.web> element.  After making this modification I tried accessing the blog, and just like that there were no more errors and it loaded perfectly.  I could also have set the <location> element to apply specifically to the blog by way of the path attribute if I didn't want to disable inheritance across every child application.

Aside from the convenience that configuration <location> settings provided in this case, they are also a tremendously powerful way to manage configuration inheritance in your ASP.NET applications.  If you haven't read the MSDN documentation on this topic, I really encourage you to do so, because there are so many scenarios in which configuration <location> settings can be used that you will quickly discover it is an incredibly valuable tool to have at your disposal.

Using the ASP.NET Web.Sitemap to Manage Meta Data


In an ASP.NET application the web.sitemap is a very convenient tool for managing the site's structure, especially when used with an asp:Menu control.  One aspect of the web.sitemap that is often overlooked is the ability to add custom attributes to the <siteMapNode> elements, which provides very useful leverage for managing a site's meta-data.  I think the best way to explain is through a simple example.

Consider a small site with three pages: /default.aspx, /products.aspx, and /services.aspx. The web.sitemap for this site would look like this:

<?xml version="1.0" encoding="utf-8" ?>
<siteMap xmlns="http://schemas.microsoft.com/AspNet/SiteMap-File-1.0" >
  <siteMapNode url="~/" title="Home">
    <siteMapNode url="~/products.aspx" title="Products" />
    <siteMapNode url="~/services.aspx" title="Services" />
  </siteMapNode>
</siteMap>

 

Now let's add some custom attributes where we can set the Page Title (because the title attribute is where the asp:Menu control looks for the name of a menu item, so it's probably best to leave that as is), the Meta Description, and the Meta Keywords elements.

 

<?xml version="1.0" encoding="utf-8" ?>
<siteMap xmlns="http://schemas.microsoft.com/AspNet/SiteMap-File-1.0" >
  <siteMapNode url="~/" title="Home" pageTitle="Homepage" metaDescription="This is my homepage!" metaKeywords="homepage, keywords, etc">
    <siteMapNode url="~/products.aspx" title="Products" pageTitle="Our Products" metaDescription="These are our fine products" metaKeywords="products, widgets"/>
    <siteMapNode url="~/services.aspx" title="Services" pageTitle="Our Services" metaDescription="Services we offer" metaKeywords="services, widget cleaning"/>
  </siteMapNode>
</siteMap>

 

Now with that in place, all we need is a way to access these new attributes and use them to set the elements on the pages.  This can be accomplished by adding a module to your project, which we'll call "MetaDataFunctions" for this example.  In this module, add the following procedure.

 

Public Sub GenerateMetaTags(ByVal TargetPage As Page)
  Dim head As HtmlHead = TargetPage.Master.Page.Header
  Dim meta As New HtmlMeta

  If SiteMap.CurrentNode IsNot Nothing Then
    meta.Name = "keywords"
    meta.Content = SiteMap.CurrentNode("metaKeywords")
    head.Controls.Add(meta)

    meta = New HtmlMeta
    meta.Name = "description"
    meta.Content = SiteMap.CurrentNode("metaDescription")
    head.Controls.Add(meta)

    TargetPage.Title = SiteMap.CurrentNode("pageTitle")
  Else
    meta.Name = "keywords"
    meta.Content = "default keywords"
    head.Controls.Add(meta)

    meta = New HtmlMeta
    meta.Name = "description"
    meta.Content = "default description"
    head.Controls.Add(meta)

    TargetPage.Title = "default page title"
  End If
End Sub

 

Then all you have to do is call this procedure on the Page_Load event like so...

 

Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load

    GenerateMetaTags(Me)    

End Sub

 

...and you'll be up and running.  Now you have a convenient, central location where you can see and manage your site's meta-data.
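
To give a rough idea of the end result with the sitemap above, a request for /products.aspx would produce a <head> containing something along these lines:

<title>Our Products</title>
<meta name="keywords" content="products, widgets" />
<meta name="description" content="These are our fine products" />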

SEO Best Practices - Dynamic Pages in ASP.NET


One of the greatest time-savers in web development is the use of dynamic pages to serve up database-driven content.  The most common examples are content management systems and product information pages.  More often than not these pages hinge on a querystring parameter such as /page.aspx?id=12345 to determine which record needs to be retrieved from the database and output to the page.  What is surprising is how many sites don't adequately validate that crucial parameter.

Any parameter that can be tampered with by a user, such as a querystring, must be validated as a matter of basic security.  That said, this validation must also adequately deal with the situation where the parameter is not valid.  Whether the parameter refers to a non-existent record, or contains letters where it should only contain numbers, the end result is the same: the expected page does not exist.  As simple as this sounds, there are countless applications out there that seem to completely ignore any sort of error handling, and are content to have Server Error in "/" Application be the extent of it.  Somewhere in the development cycle the developers of these applications decided that the default ASP.NET error page would be the best thing to show to the site's visitors, and that a 500 SERVER ERROR was the ideal response to send to any search engine spiders that might have the misfortune of coming across a link with a bad parameter in it.

With a dynamic page that depends on querystring parameters to generate its content, the following basic measures should be taken:

Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load
    'Ensure that the requested URI actually has any querystring keys
    If Request.QueryString.HasKeys() Then

        'Ensure that the requested URI has the expected parameter, and that the parameter isn't empty
        If Request.QueryString("id") IsNot Nothing Then

            'Perform any additional type validation to ensure that the string value can be cast to the required type.

        Else
            Response.StatusCode = 404
            Response.Redirect("/404.aspx", True)
        End If

    Else
        Response.StatusCode = 404
        Response.Redirect("/404.aspx", True)
    End If

End Sub


This is a basic example, but it demonstrates how to perform simple validation against the querystring that will properly redirect anyone who reaches the page with a bad querystring in the request URL.  A similar approach should be taken when attempting to retrieve the data, in case the record is not found.
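
To sketch the type-validation step from the comment in the example above (GetProduct here is a hypothetical data-access call, standing in for however you retrieve the record):

Dim productId As Integer

'Make sure the id parameter is actually a number before touching the database.
If Integer.TryParse(Request.QueryString("id"), productId) Then
    'Hypothetical data-access call; treat a missing record the same way as a bad parameter.
    If GetProduct(productId) Is Nothing Then
        Response.StatusCode = 404
        Response.Redirect("/404.aspx", True)
    End If
Else
    Response.StatusCode = 404
    Response.Redirect("/404.aspx", True)
End If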

Another useful trick is to define the default error redirect in the web.config file (<customErrors mode="RemoteOnly" defaultRedirect="/error.aspx">), and use that page to respond to the error appropriately by using the Server.GetLastError() method to get the most recent server exception and handling that exception as required.
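
A rough sketch of what that error page's code-behind might do, assuming the exception information is still available when the error page executes (how much of it survives depends on how the error page is reached):

Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load
    Dim lastError As Exception = Server.GetLastError()

    'If the original exception is available and was an HttpException, echo back its real status code.
    If TypeOf lastError Is HttpException Then
        Response.StatusCode = CType(lastError, HttpException).GetHttpCode()
    Else
        Response.StatusCode = 500
    End If
End Sub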

There are many other ways to manage server responses when there is an error in your ASP.NET application.  What is most important is knowing that you need to handle these errors properly, up to and including an appropriate response to the request.   

ASP.NET's Answer to WordPress

It's an exciting time to be an ASP.NET developer.  As the ASP.NET community continues to grow, we find ourselves with an ever-increasing arsenal of tools, controls and frameworks at our disposal.  Unfortunately this can make the decision on a component a little more difficult, as very few of the components out there have reached the point where they are largely considered the "standard".  ASP.NET blogging engines certainly fall into this category, as many of you have probably noticed.

When I decided to create this blog I had some definite requirements in mind when it came to choosing a blog engine.  First and foremost, it had to render completely valid XHTML, because I practice what I preach with respect to W3C compliance.  Control over SEO-important aspects such as canonical URLs and meta description elements was a necessity.  It also had to have a URL strategy that didn't rely on a "/page.aspx?id=123456789"-style mess of a querystring. Finally, the underlying code had to be well-organized and lean.  Without a de facto "standard" for ASP.NET blog engines, I started hunting around and researching the options that were out there.

I tried some different ASP.NET blog engines such as dasBlog and Subtext, but found the markup they rendered was not acceptable.  Then I came across BlogEngine.NET, which, I was thrilled to discover, met all of my requirements.  It's lightweight, very easy to set up, and very well designed and organized under the hood.  A nice feature is that, by default, it doesn't require a SQL database to run, instead storing all posts, comments and settings in local XML files.  Of course SQL database integration is available, and is also easy to get up and running.  Aesthetically it comes with a nice collection of themes, which are quite easy to modify, and creating your own themes is a straightforward process.

The above factors have led me to believe that BlogEngine.NET is in a position to become to ASP.NET what WordPress is for PHP.  If any of you are currently trying to decide on a solid ASP.NET blog system you should definitely try BlogEngine.NET.  You won't be disappointed.