Colin Cochrane

Colin Cochrane is a Software Developer based in Victoria, BC, specializing in C#, PowerShell, Web Development and DevOps.

ASP.NET Custom Errors: Preventing 302 Redirects To Custom Error Pages

 
You can download the HttpModule here.
 
Defining custom error pages is a convenient way to show users a friendly page when they encounter an HTTP error such as a 404 Not Found or a 500 Server Error.  Unfortunately, ASP.NET handles custom error pages by responding with a 302 temporary redirect to the error page that was defined. For example, consider an application that has IIS configured to map all requests to it, and that has the following customErrors element defined in its web.config:
 
<customErrors mode="RemoteOnly" defaultRedirect="~/error.aspx">
  <error statusCode="404" redirect="~/404.aspx" />
</customErrors>

If a user requested a page that didn't exist, the sequence of HTTP responses would look something like this:

http://www.domain.com/non-existant-page.aspx --> 302 Found
http://www.domain.com/404.aspx  --> 404 Not Found
Date: Sat, 26 Jan 2008 03:08:21 GMT
Server: Microsoft-IIS/6.0
Content-Length: 24753
Content-Type: text/html; charset=utf-8
X-Powered-By: ASP.NET
 
As you can see, a 302 redirect occurs to send the user to the custom error page.  This is not ideal for two reasons:

1) It's bad for SEO

When a search engine spider crawls your site and comes across a page that doesn't exist, you want to make sure you respond with an HTTP status of 404 and send it on its way.  Otherwise you may end up with duplicate content issues or indexing problems, depending on the spider and search engine.

2) It can lead to more incorrect HTTP status responses

This ties in with the first point, but can be significantly more serious.  If the custom error page is not configured to respond with the correct status code, the HTTP response could end up looking like:

http://www.domain.com/non-existant-page.aspx --> 302 Found
http://www.domain.com/404.aspx  --> 200 OK
Date: Sat, 26 Jan 2008 03:08:21 GMT
Server: Microsoft-IIS/6.0
Content-Length: 24753
Content-Type: text/html; charset=utf-8
X-Powered-By: ASP.NET
 
This would almost guarantee duplicate content issues for the site, as search spiders will simply assume that the error page is a normal page like any other.  Furthermore, it will probably cause some website and server administration headaches, because HTTP errors won't be accurately logged, making them harder to track and identify.
I tried to find a solution to this problem, but I didn't have any luck finding anything, other than people who were also looking for a way to get around it.  So I did what I usually do, and created my own solution.
 
The solution comes in the form of a small HTTP module that hooks onto the HttpApplication.Error event.  When an error occurs, the module checks whether the error is an HttpException.  If it is, the following process takes place:
  1. The response headers are cleared (context.Response.ClearHeaders()).
  2. The response status code is set to match the actual HttpException.GetHttpCode() value (context.Response.StatusCode = HttpException.GetHttpCode()).
  3. The customErrors section from the web.config is checked to see if the HTTP status code (HttpException.GetHttpCode()) is defined.
  4. If the status code is defined in the customErrors section, the request is transferred, server-side, to the custom error page (context.Server.Transfer(customErrorsCollection.Get(statusCode.ToString).Redirect)).
  5. If the status code is not defined in the customErrors section, the response is flushed, immediately sending the response to the client (context.Response.Flush()).

Here is the source code for the module.

Imports System.Web
Imports System.Web.Configuration

Public Class HttpErrorModule
  Implements IHttpModule

  Public Sub Dispose() Implements System.Web.IHttpModule.Dispose
    'Nothing to dispose.
  End Sub

  Public Sub Init(ByVal context As System.Web.HttpApplication) Implements System.Web.IHttpModule.Init
    AddHandler context.Error, New EventHandler(AddressOf Context_Error)
  End Sub

  Private Sub Context_Error(ByVal sender As Object, ByVal e As EventArgs)
    Dim context As HttpContext = CType(sender, HttpApplication).Context
    'TypeOf also catches subclasses of HttpException, unlike an exact GetType comparison.
    If TypeOf context.Error Is HttpException Then
      'Get the Web application configuration.
      Dim configuration As System.Configuration.Configuration = WebConfigurationManager.OpenWebConfiguration("~/web.config")

      'Get the <customErrors> section.
      Dim customErrorsSection As CustomErrorsSection = CType(configuration.GetSection("system.web/customErrors"), CustomErrorsSection)

      'Get the collection of defined error pages.
      Dim customErrorsCollection As CustomErrorCollection = customErrorsSection.Errors

      Dim statusCode As Integer = CType(context.Error, HttpException).GetHttpCode

      'Clear the existing response headers and set the correct status code.
      context.Response.ClearHeaders()
      context.Response.StatusCode = statusCode

      'If a custom page is defined for this status code, transfer server-side; otherwise flush as-is.
      If customErrorsCollection.Item(statusCode.ToString) IsNot Nothing Then
        context.Server.Transfer(customErrorsCollection.Get(statusCode.ToString).Redirect)
      Else
        context.Response.Flush()
      End If
    End If
  End Sub

End Class

The following element also needs to be added to the httpModules element in your web.config (replace the attribute values if you aren't using the downloaded binary):

<httpModules>
<add name="HttpErrorModule" type="ColinCochrane.HttpErrorModule, ColinCochrane" />
</httpModules>
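
If your application runs under IIS 7's integrated pipeline, the registration would instead go in the system.webServer section of the web.config; a sketch, assuming the same attribute values as above:

<system.webServer>
<modules>
<add name="HttpErrorModule" type="ColinCochrane.HttpErrorModule, ColinCochrane" />
</modules>
</system.webServer>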

And there you go! No more 302 redirects to your custom error pages.

Catching Unwanted Spiders And Content Scraping Bots In ASP.NET


If you have a blog that is even moderately popular then you have likely fallen victim to some form of content scraping.  Ever since it became possible to earn money through ads on a website there have been people trying to find ways to cheat the system.  The most widespread example of this comes in the form of splogs and similar spam-based websites, which consist only of ads from Google AdSense and duplicated content that is scraped from other sites.  In this post I will share a method you can use to identify "evil" spiders and content scraping bots that are wasting your website's resources.

I'll start off by defining what is considered an "evil" spider/bot.  For our purposes here, we'll be looking at spiders and bots that ignore robots.txt and nofollow when crawling a site.  These spiders and bots offer no value to you in allowing them to crawl your site, as the major search engines use spiders and bots that respect these rules (with the unique exception of MSN, which employs a certain bot that presents itself as a regular user in order to identify sites that show search engine spiders different content than they show users).

Of these valueless spiders, some are almost certainly going to be some form of content scraping bot, which is sent to literally copy the content of your site for use elsewhere.  It is in your best interest to limit how much of your content gets scraped because you want visitors coming to your site, not some spam-filled facsimile.

This method of identifying unwanted spiders involves setting a trap, which can be done as follows:

1) Create a Hidden Page

To identify these undesired visitors you need to isolate them.  Create a page on your site, but do not link to it from anywhere just yet.  For the purposes of my examples, I'll call our example page "trap.aspx".


Now you want to disallow this page in your robots.txt.
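
The robots.txt entry for this is a single Disallow rule (standard robots.txt syntax):

User-agent: *
Disallow: /trap.aspx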


With this trap page disallowed in the robots.txt, good spiders will be prevented from crawling it.  What is needed now is a link to the trap page with the rel="nofollow" attribute, which should be placed on your home page for maximum effect.  The link must be invisible to users, otherwise you might mistake an unwitting visitor for a bad spider.

<a rel="nofollow" href="/trap.aspx" style="display:none;"></a>

This creates a situation in which the only requests for "/trap.aspx" will be from a spider or bot that ignores both robots.txt and nofollow, which is exactly the kind of bots we want to identify.

2) Create a Log File

Create an XML document, name it "trap.xml" (or whatever you want), and place it in the App_Data folder of your application (or wherever you want, as long as the application has write access to the directory).  Open the new XML document and create an empty root element, "<trapRequests>", making sure it has a complete closing tag.

<?xml version="1.0" encoding="utf-8"?>
<trapRequests>
</trapRequests>
 
You can use whatever logging method works best for you; you do not need to use an XML document.  I am using XML for the purposes of this example.

3) Log What Gets Caught In The Trap

With the trap in place, you now want to keep track of the requests being made for "trap.aspx".  This can be accomplished quite easily using LINQ, as illustrated in the following example:

Imports System.Xml.Linq

Partial Class trap_aspx
  Inherits System.Web.UI.Page

  Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    LogRequest(Request.UserHostAddress, Request.UserAgent)
  End Sub

  Private Sub LogRequest(ByVal ipAddress As String, ByVal userAgent As String)
    Dim logFile As XDocument
    Try
      logFile = XDocument.Load(Server.MapPath("~/App_Data/trap.xml"))
      'Prepend the new entry so the most recent requests appear first.
      logFile.Root.AddFirst(<request>
                              <date><%= Now.ToString %></date>
                              <ip><%= ipAddress %></ip>
                              <userAgent><%= userAgent %></userAgent>
                            </request>)
      logFile.Save(Server.MapPath("~/App_Data/trap.xml"))
    Catch ex As Exception
      My.Log.WriteException(ex)
    End Try
  End Sub

End Class

This code sets it up so every request for this page is logged with:

  1. The Date and Time of the request.
  2. The IP address of the requesting agent.
  3. The User Agent of the requesting agent.

You can, of course, customize what information is logged to your preference.  The code will need to be adjusted if you are using a different storage method.  Once done, you will end up with an XML log file (or your custom store) recording every request to "trap.aspx", which will look like:

<?xml version="1.0" encoding="utf-8"?>
<trapRequests>
<request>
<date>12/30/2007 12:54:20 PM</date>
<ip>1.2.3.4</ip>
<userAgent>ISCRAPECONTENT/1.2</userAgent>
</request>
<request>
<date>12/30/2007 2:31:51 PM</date>
<ip>2.3.4.5</ip>
<userAgent>BADSPIDER/0.5</userAgent>
</request>
</trapRequests>
 
Now you've set your trap, and any unwanted bots and spiders that find it will be logged.  You are then free to use the logged data to deny access by offending IP, User Agent, or whatever criteria you decide is appropriate for your site.
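
As a minimal sketch of that last step (assuming you distill the logged addresses into a plain-text blacklist at ~/App_Data/banned-ips.txt, one IP per line), you could deny offenders from Global.asax:

Private Sub Application_BeginRequest(ByVal sender As Object, ByVal e As EventArgs)
  Dim bannedFile As String = Server.MapPath("~/App_Data/banned-ips.txt")
  If IO.File.Exists(bannedFile) Then
    'Reading the file on every request keeps the example short; cache the list in production.
    Dim bannedIps As String() = IO.File.ReadAllLines(bannedFile)
    If Array.IndexOf(bannedIps, Request.UserHostAddress) >= 0 Then
      Response.StatusCode = 403
      Response.End()
    End If
  End If
End Sub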

How Being An SEO Analyst Made Me A Better Web Developer

Being a successful web developer requires constant effort to refine your existing abilities while expanding your skill-set to include the new technologies that are continually released, so when I first started my job as a search engine optimization analyst I was fully expecting my web development skills to dull.  I could not have been more wrong.

Before I get to the list of reasons why being an SEO analyst made me a better web developer, I'm going to give a quick overview of how I got into search engine optimization in the first place. Search engine optimization first captured my interest when I wanted to start driving more search traffic to a website I developed for the BCDA (for which I continue to volunteer my services as webmaster). Due to some dissatisfaction with our hosting provider, I decided that we would switch providers as soon as our existing contract expired and go with one of my preferred hosts. However, as a not-for-profit organization the budget for the website was limited, and the new hosting was going to cost a little more, so I decided to set up an AdSense account to bring in some added income.

The expectations weren't high; I was hoping to bring in enough revenue through the website to cover hosting costs.  At that point I did not have much experience with SEO, so I started researching and looking for strategies I could use on the site.  As I read more and more articles, blogs and whitepapers I became increasingly fascinated with the industry while applying all of that newfound knowledge to the site.  Soon after, I applied to a job posting at an SEO firm and shortly thereafter started my new career as an SEO analyst.

My first few weeks at the job were spent learning procedures, familiarizing myself with the various tools that we use, and, most importantly, honing my SEO skills.  I spent the majority of my time auditing and reporting on client sites, which exposed me to a lot of different websites, programming and scripting languages, and tens of thousands of lines of code.  During this process I realized that my web development skills weren't getting worse; they were actually getting better.  The following list examines the reasons for this improvement.

1) Coding Diversity

To properly analyze a site, identify problems, and offer the right solutions, I often have to go deeper than just the HTML on a website.  This means I have to be proficient in a variety of different languages, because I don't believe in pointing out problems with a site unless I can recommend how best to fix them.  Learning the different languages came quickly from the sheer volume of code I was faced with every day, and got easier with each language I learned.

2) Web Standards and Semantic Markup

In a recent post, Reducing Code Bloat Part Two: Semantic HTML, I discussed the importance of semantic HTML to lean, tidy markup for your web documents.  While I have always been a proponent of web standards and semantic markup, my experience in SEO has served to solidify my beliefs.  After you have pored through 12,000 lines of markup that should have been 1,000, or spent two hours implementing style modifications that should have taken five minutes, you quickly come to appreciate semantic markup and web standards.

3) Usability and Accessibility

Once I've optimized a site to draw more search traffic, I need to help make sure that traffic actually sticks around.  A big part of this is the usability and accessibility of a site.  There are a lot of other websites out there for people to visit, and they are not going to waste time trying to figure out how to navigate a meandering quagmire of a design.  This aspect of my job forces me to step into the shoes of the average user, which is something a lot of developers need to do more often.  It has also made me more mindful of accessibility when using features and technologies such as AJAX, ensuring that the site remains accessible when a feature is disabled or unsupported.

4) The Value of Content

Before getting into SEO, I was among the many web developers guilty of thinking that a website's success can be ensured by implementing enough features, and that enough cool features could make up for a lack of simple, quality content.  Search engine optimization taught me the value of content, and that the right balance of innovative features and content will greatly enhance the effectiveness of both.

That covers some of the bigger reasons that working as an SEO analyst made me a better web developer.  Chances are I will follow up on this post in the future with more reasons that I am sure to realize as I continue my career in SEO.  In fact, one of the biggest reasons I love working in search engine optimization and marketing is that the industry is constantly changing and evolving, and there is always something new to learn.

Reducing Code Bloat Part Two - Semantic HTML

In my first post on this subject, Reducing Code Bloat - Or How To Cut Your HTML Size In Half, I demonstrated how you can significantly reduce the size of a web document by simply moving style definitions externally and getting rid of a table-based layout.  In this installment I will look at the practice of semantic HTML and how effective it can be at keeping your markup tidy and lean.

What is Semantic HTML?

Semantic HTML, in a nutshell, is the practice of creating HTML documents that contain only the author's "meaning" and not how that meaning is presented.  This boils down to using only structural elements, not presentational ones.  A common difficulty with semantic HTML is identifying the often subtle difference between an element that represents the author's meaning and one that is merely presentational.  Consider the following examples:

<p>
I really felt the need to <i>emphasize</i> the point.
</p>

 

<p>
I really felt the need to <em>emphasize</em> the point.
</p>

Many people would consider these two snippets of code to be essentially the same: a paragraph, with the word "emphasize" in italics.  When considering the semantic value of the elements, however, there is one significant difference.  The <i> element is purely presentational (or visual); it has no meaning in the semantic structure of the document.  The <em> element, on the other hand, has meaningful semantic value in the document's structure because it defines its contents as being emphasized.  Visually, we usually see the contents of both <i> and <em> elements rendered as italicized text, which is why the difference often seems like nitpicking, but italics for <em> elements is simply the rendering choice made by most web browsers; the element has no inherent visual style.

Making Your Markup Work for You

Some of you might be thinking, "some HTML tags have a different meaning than others? So what?"  That's certainly a reasonable response, because when it comes down to it, HTML semantics can seem like pointless nitpicking.  That being said, this nitpicking drives home what HTML is really supposed to do: define the structure and meaning of a document.  Think of it like being a writer for a magazine.  You write your article, including paragraphs and indicating words or sentences that should be emphasized, or considered more strongly, in relation to the rest, and submit it.  You don't decide what font is used, the colour of the text, or how much space should be between each line; that's the editor's job.  It is exactly the same concept with semantic HTML: the structure and meaning of the content is the job of HTML, and the presentation is the job of the browser and CSS.

Without the burden of presentational elements you only have to worry about a core group of elements with meaningful structure value.  For instance...

<p>
This is a chunk of text <span class="red14pxitalic">where</span> random words 
are <span class="green16pxbold">styled</span> differently,
<span class="red14pxitalic">with</span> some words 
<span class="green16pxbold">being</span> red, italicized and at 14px, and others
being green, bold, and 16px.
</p>
 
with an external style-sheet definition...
span.red14pxitalic{color:red;font-size:14px;font-style:italic;}
span.green16pxbold{color:green;font-size:16px;font-weight:bold;}

...turns into this...

<p>
This is a chunk of text <em>where</em> random words 
are <strong>styled</strong> differently, <em>with</em> some words 
<strong>being</strong> red, italicized and at 14px, and others being green, 
bold, and 16px.
</p>

with an external style-sheet definition...

p em{color:red;font-size:14px;font-style:italic;}
p strong{color:green;font-size:16px;font-weight:bold;}

The second example accomplishes the same visual result as the first, but contains actual semantic worth in the document's structure.  It also illustrates how much simpler it is to create a new document because you don't have to worry about style.  You create the document and use meaningful semantic elements to identify whatever parts of the content are necessary, and let the style-sheet take care of the rest (assuming of course that the style-sheet was properly defined with the necessary styles).  By using a more complete range of HTML elements you will find yourself needing <span class="whatever"> tags less and less and find your markup becoming cleaner, easier to read, and smaller.

 

Code Size Comparison


              HTML Characters (Including Spaces)   CSS Characters (Including Spaces)
Example One   302                                  127
Example Two   216                                  103


Part Three will continue looking at semantic HTML, as well as strategies you can use when defining your style framework.

Using CSS To Create Two Common HTML Border Effects

Separating the style from the markup of a web document is generally a painless, if sometimes time-consuming, task.  In many cases, however, the process has some added speed bumps, most notably when the original HTML uses an infamous table-based layout.  The two most common speed bumps when dealing with table-based layouts and styling are recreating the classic borderless table and keeping the default table border appearance.

The appearance of these two kinds of tables is as follows:

Default Border

1 2
3 4

Borderless

1 2
3 4

The markup for these two tables looks like:

[code:html]

<!--Default Border -->
<table border="1">
<tbody>
<tr>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>4</td>
</tr>
</tbody>
</table>
<!-- Borderless -->
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>4</td>
</tr>
</tbody>
</table>

[/code]

If you want to get the same effects while losing the HTML attributes, you can use the following CSS:

Default Border

[code:html]

table{border-spacing:0px;border:solid 1px #8D8D8D;width:130px;}
table td{
border:solid 1px #C0C0C0;
border-bottom:solid 1px #8D8D8D;
border-left:solid 1px #8D8D8D;
display:table-cell;
margin:0;
padding:0;}

[/code]


Borderless

[code:html]

table{border:none;border-collapse:collapse;}
table td{padding:0;margin:0;}

[/code]

Duplicating the default table border look requires extra rules in its style definition because the default border contains two shades, so the border-color values must be set accordingly.

That is the basic method of replicating, with CSS, the HTML table effects that are usually created with HTML attributes.

7 Firefox Add-Ons For Web Developers

Firefox add-ons can be one of the best sets of tools a web developer can have, but with so many out there it can be hard to decide which ones are best for you.  I was reluctant to post this at first, given the countless other blogs out there that have posted their own add-on lists, but felt this would be of value for those looking for recommendations that weren't posted 2-3 years ago.  I avoided going for the "Top 841 Add-Ons Every Web Developer Must Have!!!!" angle, as this isn't a top list.  It's simply a list of my personal favourites.  As a guy who's known around the office for having too many Firefox add-ons installed, I've definitely used a lot of them, some of which were fantastic and some of which left something to be desired.  Over time I managed to pare my pile of add-ons down to a core set that all save me a considerable amount of time and share the same basic qualities:

  1. Easy to Use
  2. Not Obtrusive
  3. Light-Weight
  4. Accurate

Ease of use is important to saving time (an add-on isn't going to increase productivity if it takes just as long to figure out how to use or configure it).  Not being obtrusive is important because I want my add-ons to be unnoticeable, aside from a toolbar or icon, unless I actively want to use them.  This ties in with being light-weight, as a light-weight add-on is not going to cause any decrease in performance.  Finally, accuracy is imperative because I have to be able to trust the information the add-on provides me with.

With those qualities explained, I'll get on to the list.

1) Firebug

For web development and/or design, if you are going to install one add-on, this is the one.  Firebug lets you view, edit, debug and monitor HTML, CSS and JavaScript live on any web page.  I use Firebug extensively every day: to test layout changes before I make them, to debug script errors, to navigate through a page's markup with ease, to monitor all the HTTP requests made for a given page, and more.  It also has some incredibly useful features, like a diagram that shows all the measurements (dimensions, margin, padding, border, etc.) for a selected element, a list of applied styles for a selected element, and the "inspect" feature that lets you simply click on the part of a web page you want to analyze and have Firebug automatically pull up all the details on it for you.

2) Web Developer

Web Developer is an add-on that provides a toolbar and menu giving you access to a host of different development tools; any one of its twelve toolbar buttons exposes a long list of options on its own.

There are tons of other useful tools included with Web Developer, including some of my personal favourites:

  • Outline different HTML elements (with the ability to display the element type and a custom outline colour)
  • Show/hide/outline images with oversized dimensions, missing alt attributes, adjusted dimensions, missing dimensions, etc.
  • Show image paths, alt and title attributes, and sizes
  • Disable JavaScript
  • Disable CSS styles (external, internal, inline, etc.)

3) Greasemonkey

One of the most popular add-ons out there, Greasemonkey simply lets you run custom JavaScript against any page you want.  There are thousands of Greasemonkey scripts out there for those who don't want to create their own, offering enhancements such as numbering search results in Google, MSN and Yahoo, killing pop-up ads, and pretty much anything else you could think of.
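
For illustration, a Greasemonkey script is just ordinary JavaScript topped with a metadata block; this minimal, hypothetical example outlines every rel="nofollow" link on the pages you visit:

// ==UserScript==
// @name          Outline Nofollow Links
// @namespace     http://www.example.com/scripts
// @description   Outlines links marked rel="nofollow" (example only)
// @include       *
// ==/UserScript==

var links = document.getElementsByTagName("a");
for (var i = 0; i < links.length; i++) {
  // rel can hold several space-separated values, so match the word
  if (/\bnofollow\b/.test(links[i].rel)) {
    links[i].style.outline = "1px solid red";
  }
}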

4) HTML Validator

A must-have if you care about W3C validation.  HTML Validator displays an icon at the bottom of your browser indicating whether the current page validates against its DOCTYPE, as well as the number of errors and warnings.  You can configure the validation method it uses (HTML Tidy, or the SGML parser used by the W3C Markup Validation Service), and it also displays a list of errors and highlights them when you view the page's source.

5) MeasureIt

Simple and easy to use, MeasureIt is an add-on that, when activated, displays an overlay with a draggable ruler so you can measure dimensions on a web page.

6) Session Manager

Session Manager is an incredibly useful add-on that tracks your Firefox sessions and allows you to reload older sessions or recover a crashed session.  If you're like me and usually have the same websites opened in different tabs, this add-on will make your life a lot easier.

7) QuickRestart

QuickRestart is a simple and convenient add-on that adds a button to your toolbar to restart Firefox, just like the restart that happens when you install an add-on or upgrade.

Hopefully you will find these add-ons as useful as I have.  I'll also be posting a list of useful add-ons for Internet Explorer later this week.  I encourage everyone to comment with your own recommendations for Firefox add-ons that you find useful for web development.

 

Reducing Code Bloat - Or How To Cut Your HTML Size In Half

When it comes to designing and developing a web site, load time is one consideration that is often ignored, or is an afterthought once the majority of the design and structure is in place.  While high-speed internet connections are becoming increasingly common, there is still a significant portion of web users out there on 56k connections, and even those with broadband aren't guaranteed to have a fast connection to your particular server.  Every second a user has to wait for your content increases the chance of that user deciding to move on.

Attempts to reduce the size of a web page are usually restricted to compressing larger images and optimizing them for web use.  This is a necessary step in managing page size, but there is another important factor that can significantly reduce the size of a page and improve download times: code bloat, or more specifically, (X)HTML code bloat.  Getting rid of that code bloat means fewer bytes for clients to download, as well as capitalizing on what the client has already downloaded.  Unfortunately this option tends to be ignored due to the perceived loss of time spent combing through markup to cut out the chaff, despite the fact that clean, efficient markup with well-planned style definitions will save countless hours when it comes to upkeep and maintenance.

To demonstrate the difference between a bloated page and one with efficient markup I created two basic pages.  One uses tables, font tags, HTML style attributes and so forth to control the structure and look of the page, while the other uses minimal markup with an external stylesheet.

1) Bloated Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>This Tag Soup Is Really Thick</title>
<meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
<meta name="keywords" content="tag soup, messy html, bloated html">
</head>
<body>
<center>
  <table width=700 height=800 bgcolor="gainsboro" border=0 cellpadding=0 cellspacing=0>
    <tr>
      <td valign=top width=150><table align=center border=0 cellpadding=0 cellspacing=0>
          <tr>
            <td align="left"><a href="#home" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Home Page</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#about" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">About Me</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#links" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Links</font></a></td>
          </tr>
        </table></td>
      <td valign=top><table>
          <tr>
            <td align="center" height=64><h1><font color="red" face="Geneva, Arial, Helvetica, sans-serif">Welcome to My Site!</font></h1></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Isn&acute;t it surprisingly ugly and bland?</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est. Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Maecenas augue diam, sagittis eget, cursus at, vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc. Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor.</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec, ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit, rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum. Phasellus erat. Sed eget turpis tristique eros cursus gravida. Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Praesent tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Etiam tristique mauris at nibh sodales pretium. In lorem eros, laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper vitae, placerat sed, orci. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
        </table></td>
    </tr>
  </table>
</center>
</body>
</html>

[/code]

2) Cleaned Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Less Markup, More Content</title>
    <meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
    <meta name="keywords" content="tag soup, messy html, bloated html">
    <link href="style.css" rel="stylesheet" type="text/css">
  </head>
  <body>
    <div class="content">
      <ul class="menu">
        <li><a href="#home" title="Not a real link">Home Page</a></li>
        <li><a href="#home" title="Not a real link">About Me</a></li>
        <li><a href="#home" title="Not a real link">Links</a></li>
      </ul>
      <h1>Welcome To My Site!</h1>
      <p>Isn&acute;t it suprisingly ugly and bland?</p>
      <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est.  Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis  odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce  vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem.  Vestibulum ante ipsum primis in faucibus orci luctus et ultrices  posuere cubiliaCurae; Maecenas augue diam, sagittis eget, cursus at,  vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare  dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh  feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio  justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc.  Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor. </p>
      <p>Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec,  ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit,  rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra  ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum.  Phasellus erat. Sed eget turpis tristique eros cursus gravida.  Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan  enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci  luctus et ultrices posuere cubilia Curae; </p>
      <p>Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a  mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh  accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad  litora torquent per conubia nostra, per inceptos hymenaeos. Praesent  tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam  mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante. </p>
      <p>Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce  elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla  id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum  quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in  turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi  tristique senectus et netus et malesuada fames ac turpis egestas.  Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer  adipiscing elit. </p>
      <p>Etiam tristique mauris at nibh sodales pretium. In lorem eros,  laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque  vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer  ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin  sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed  risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper  vitae, placerat sed, orci. Pellentesque habitant morbi tristique  senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante  ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; </p>
    </div>
  </body>
</html>

[/code]

External Style Sheet

[code:html]

@charset "utf-8";
body{font:13pt Geneva, Arial, Helvetica, sans-serif;}
.menu{float:left;height:800px;list-style-type:none;width:150px;}
.menu a, .menu a:visited{color:#4FB322;}
.content{background:gainsboro;color:white;margin:auto;position:relative;width:700px;}
h1{color:red;line-height:64px;text-align:center;}
p{margin:4px;}

[/code]

Even in this basic example you can see a fairly dramatic improvement when the excess HTML is trimmed and CSS is used to control style.  The original page is 51 lines, where the cleaned page is only 26 lines, plus 7 lines in the style sheet.  The cleaned page is a third the size of the original (counting the style sheet), and more realistically is actually half the size because the style sheet would be cached by most client browsers and wouldn't be downloaded for every page request.  As far as raw kilobytes it's a difference of 6KB to 4KB, which isn't a particularly exciting difference in this case, but one that is quickly magnified as the length of the page increases.  This is especially true with dynamic applications that pull content from a database, most importantly content such as product listings that utilize the same markup and are repeated multiple times. Fortunately in the case of dynamic pages involving looping procedures that output the same markup with different content, cutting down the bloat can be as easy as a few modifications to those procedures.
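
For instance, if a product-listing loop emits a fragment like this for every product (a hypothetical example based on the bloated page above):

[code:html]

<tr><td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Product name</font></td></tr>

[/code]

then changing that one line of the loop to output

[code:html]

<p>Product name</p>

[/code]

(with the styling moved to the external style sheet) slims every repetition of the fragment at once.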

Furthermore if you wanted to change, for instance, the font color or the line-height in the original, you would have to modify every font tag and table cell to accomplish that.  Implementing those changes in the second example requires a single modification to the style-sheet.  The time-saved here is once again significantly amplified when considered in a situation with multiple pages (in many cases this can be hundreds or even thousands).

When all is said and done, this isn't meant to be a be-all, end-all guide to optimizing your markup; I could write a book and still not cover it all.  Rather, it is meant to highlight an aspect of web page performance and optimization that is usually swept under the rug in favour of those that are more directly appreciable, such as eye-candy and new features.  While clean markup might not be as "glamorous" as other aspects of web development, it is important to keeping load times in check and a crucial factor in reducing the amount of time spent maintaining and updating.

Three CSS Roll Over Techniques You Might Not Know About

When it comes to rollover effects in web design, the most common way to accomplish the effect has traditionally been with JavaScript:


JavaScript in the HEAD section

[code:html]

<script language="JavaScript">
<!--
// preload images
if (document.images) {
img_on =new Image(); img_on.src ="../images/1.gif";
img_off=new Image(); img_off.src="../images/2.gif";
}
function handleOver() {
if (document.images) document.imgName.src=img_on.src;
}
function handleOut() {
if (document.images) document.imgName.src=img_off.src;
}
//-->
</script>

[/code]


And in the element with the rollover effect

[code:html]


<a href="http://www.domain.com" onMouseOver="handleOver();return true;" onMouseOut="handleOut();return true;"><img name="imgName" alt="Rollover!" src="/images/1.gif"/></a>

[/code]

The reason this method is used so commonly is that it is simple to implement and, more importantly, avoids the "lag" on the first mouseover that occurs when using a CSS background-image switch on a selector:hover rule, caused by the delay in downloading the rollover image. One thing that a lot of people don't realize is that there are methods to accomplish this effect in CSS without the initial rollover lag.

Method One - CSS Preloading

This is the quick and dirty way to force browsers to download rollover images when they initially load the page. Let's say you have the following document:

[code:html]


<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#rollover{background:url(/images/1.gif);}
#rollover:hover{background:url(/images/2.gif);}
</style>
</head>
<body>
<div>
<a id="rollover" href="http://www.domain.com">My Rollover Link</a>
</div>
</body>
</html>

[/code]

In this page there would be a noticeable delay when a user first mouses over the "rollover" anchor. The CSS preloading method uses an invisible dummy element, set to visibility:hidden, that has the "active" version of the rollover image set as its background.

[code:html]

<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#preload{position:absolute;visibility:hidden;}
#image2{background:url(/images/2.gif);}
#rollover{background:url(/images/1.gif);}
#rollover:hover{background:url(/images/2.gif);}
</style>
</head>
<body>
<div id="preload">
<div id="image2"></div>
</div>
<div>
<a id="rollover" href="http://www.domain.com">My Rollover Link</a>
</div>
</body> 
</html>
 

[/code]

Method Two - Image Visibility Swap

This method accomplishes the same goal of forcing the browser to load both of the rollover images, but attacks it in a different way. Using the same example as above, we basically set the background of the containing anchor element to the "active" state of the rollover, and set the contained image to be the "inactive" state. Then it's just a matter of hiding the image element on hover.

[code:html]

<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#rollover{background:url(/images/2.gif");display:block;height:50px;width:50px;}
#rollover:hover img{visibility:hidden;}
</style>
</head>
<body>
<div>
<a id="rollover" href="http://www.domain.com"><img src="/images/1.gif" alt="My Rollover's Inactive Image" /></a>
</div>
</body> 
</html>  

[/code]

This is the method that this site uses for the ColinCochrane.com logo in the header.

Method Three - Multistate Image

This method avoids the preloading problem altogether by using only one image that contains both the inactive and active states. This is accomplished by creating an image with the inactive and active versions stacked on top of each other (active state on top, inactive state on the bottom).

Then all you do is set the element's height to half of that of the image and use the background-position property to shift the states on hover:

[code:html]

<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#rollover{background:url(/images/multi.gif") bottom;display:block;height:20px;width:100px;}
#rollover:hover{background-position:top;}
</style>
</head>
<body>
<div>
<a id="rollover" href="http://www.domain.com"></a>
</div>
</body> 
</html>

[/code]


Now you have some different techniques to consider when implementing rollover effects on your website.

What A Doctype Really Says About Your Markup

I have combed through thousands upon thousands of clients' HTML documents since I began working in web development, and even more in my career as an SEO.  Much of this time is spent fixing invalid markup, shaving off unneeded code, and generally doing what the original developer should have done in the first place.  One thing I quickly realized was that a disturbingly large majority of the "professional" developers and firms behind the sites I came across seemed to have absolutely no idea what a doctype really is.  This is especially true when I see these developers slapping an XHTML doctype on their pages, somehow thinking that since it is newer it will make them (the developers) look better.  As a developer who has actually taken the time to pore through the W3C specifications for the different revisions of HTML and XHTML, I find that practice rather irritating.

That being said, as a developer who actually knows the difference between the various doctypes, it is easy to spot markup from lazy and/or ignorant developers.  I should clarify that I don't expect all markup out there to pass the W3C validator 100%, nor do I expect perfect separation of structure and presentation.  However, when you place that doctype declaration at the top of your document, you are essentially saying "these are the rules that this document is going to abide by", and when those rules are obviously ignored, I believe that says a lot about the developer who created the page.

Those tell-tale signs are easy to spot: self-closing ("/>") elements under an HTML doctype, or the lack thereof in an XHTML document; capitalized elements and attributes in an XHTML document (XML is case-sensitive, remember!); undefined elements such as <font>, which has a cockroach-like ability to stick around, or attributes that aren't defined for the element they are declared on.  All are indicators that the developer doesn't really understand the language they are working with.
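
A contrived example makes the mismatch obvious; here an XHTML 1.0 Strict doctype sits on top of decidedly non-XHTML markup:

[code:html]

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
...
<BR>                                 <!-- capitalized and not self-closed: invalid in XHTML -->
<font color="red">Look at me!</font> <!-- <font> is not defined in XHTML 1.0 Strict -->

[/code]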

HTML really isn't that complicated.  Someone who had never seen a piece of code in their life could pick it up within a week and have a working knowledge of the language.  Unfortunately, that also seems to be the extent of the average professional's understanding, because it is apparently too much to expect someone who makes a living using a markup language to take a few hours and actually learn how to use it properly.

Using the ASP.NET Web.Sitemap to Manage Meta Data


In an ASP.NET application, the web.sitemap is a very convenient tool for managing the site's structure, especially when used with an asp:Menu control.  One aspect of the web.sitemap that is often overlooked is the ability to add custom attributes to the <siteMapNode> elements, which provides very useful leverage for managing a site's meta-data.  I think the best way to explain is through a simple example.

Consider a small site with three pages: /default.aspx, /products.aspx, and /services.aspx. The web.sitemap for this site would look like this:

<?xml version="1.0" encoding="utf-8" ?>
<siteMap xmlns="http://schemas.microsoft.com/AspNet/SiteMap-File-1.0" >
  <siteMapNode url="~/" title="Home">
    <siteMapNode url="~/products.aspx" title="Products" />
    <siteMapNode url="~/services.aspx" title="Services" />
  </siteMapNode>
</siteMap>

 

Now let's add some custom attributes where we can set the Page Title (because the title attribute is where the asp:Menu control looks for the name of a menu item and it's probably best to leave that as is), the Meta Description, and the Meta Keywords elements. 

 

<?xml version="1.0" encoding="utf-8" ?>
<siteMap xmlns="http://schemas.microsoft.com/AspNet/SiteMap-File-1.0" >
  <siteMapNode url="~/" title="Home" pageTitle="Homepage" metaDescription="This is my homepage!" metaKeywords="homepage, keywords, etc">
    <siteMapNode url="~/products.aspx" title="Products" pageTitle="Our Products" metaDescription="These are our fine products" metaKeywords="products, widgets"/>
    <siteMapNode url="~/services.aspx" title="Services" pageTitle="Our Services" metaDescription="Services we offer" metaKeywords="services, widget cleaning"/>
  </siteMapNode>
</siteMap>

 

Now, with that in place, all we need is a way to access these new attributes and use them to set the corresponding elements on our pages.  This can be accomplished by adding a module to your project; we'll call it "MetaDataFunctions" for this example.  In this module, add the following procedure:

 

Public Sub GenerateMetaTags(ByVal TargetPage As Page)
  'Requires a <head runat="server"> in the page (or its master page).
  Dim head As HtmlHead = TargetPage.Header
  Dim meta As New HtmlMeta

  If SiteMap.CurrentNode IsNot Nothing Then
    meta.Name = "keywords"
    meta.Content = SiteMap.CurrentNode("metaKeywords")
    head.Controls.Add(meta)

    meta = New HtmlMeta
    meta.Name = "description"
    meta.Content = SiteMap.CurrentNode("metaDescription")
    head.Controls.Add(meta)

    'Pull the page title from the custom pageTitle attribute defined in the web.sitemap.
    TargetPage.Title = SiteMap.CurrentNode("pageTitle")
  Else
    'No matching node: fall back to defaults.
    meta.Name = "keywords"
    meta.Content = "default keywords"
    head.Controls.Add(meta)

    meta = New HtmlMeta
    meta.Name = "description"
    meta.Content = "default description"
    head.Controls.Add(meta)

    TargetPage.Title = "default page title"
  End If
End Sub

 

Then all you have to do is call this procedure on the Page_Load event like so...

 

Protected Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) Handles Me.Load

    GenerateMetaTags(Me)    

End Sub

 

...and you'll be up and running.  Now you have a convenient, central location where you can see and manage your site's meta-data.