Colin Cochrane

Colin Cochrane is a Software Developer based in Victoria, BC specializing in C#, PowerShell, Web Development and DevOps.

Reducing Code Bloat - Or How To Cut Your HTML Size In Half

When it comes to designing and developing a web site, load time is one consideration that is often ignored, or treated as an afterthought once the majority of the design and structure is in place.  While high-speed internet connections are becoming increasingly common, there is still a significant portion of web users with 56k connections, and even those with broadband aren't guaranteed a fast connection to your particular server.  Every second that a user has to wait to download your content increases the chance of that user deciding to move on.

Attempts to reduce the size of a web page are usually restricted to compressing larger images and optimizing them for web use.  This is a necessary step in managing page size, but there is another important factor that can significantly reduce the size of a page and improve download times: code bloat, or more specifically, (x)HTML code bloat.  Getting rid of that code bloat means fewer bytes for clients to download, as well as capitalizing on what the client has already downloaded.  Unfortunately this option tends to be ignored because of the perceived loss of time spent combing through markup to cut out the chaff, despite the fact that clean, efficient markup with well-planned style definitions will save countless hours when it comes to upkeep and maintenance.

To demonstrate the difference between a bloated page and one with efficient markup I created two basic pages.  One uses tables, font tags, HTML style attributes and so forth to control the structure and look of the page, while the other uses minimal markup with an external stylesheet.

1) Bloated Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>This Tag Soup Is Really Thick</title>
<meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
<meta name="keywords" content="tag soup, messy html, bloated html">
</head>
<body>
<center>
  <table width=700 height=800 bgcolor="gainsboro" border=0 cellpadding=0 cellspacing=0>
    <tr>
      <td valign=top width=150><table align=center border=0 cellpadding=0 cellspacing=0>
          <tr>
            <td align="left"><a href="#home" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Home Page</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#about" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">About Me</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#links" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Links</font></a></td>
          </tr>
        </table></td>
      <td valign=top><table>
          <tr>
            <td align="center" height=64><h1><font color="red" face="Geneva, Arial, Helvetica, sans-serif">Welcome to My Site!</font></h1></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Isn&acute;t it surprisingly ugly and bland?</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est. Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Maecenas augue diam, sagittis eget, cursus at, vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc. Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor.</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec, ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit, rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum. Phasellus erat. Sed eget turpis tristique eros cursus gravida. Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Praesent tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Etiam tristique mauris at nibh sodales pretium. In lorem eros, laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper vitae, placerat sed, orci. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
        </table></td>
    </tr>
  </table>
</center>
</body>
</html>

[/code]

2) Cleaned Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Less Markup, More Content</title>
    <meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
    <meta name="keywords" content="tag soup, messy html, bloated html">
    <link href="style.css" rel="stylesheet" type="text/css">
  </head>
  <body>
    <div class="content">
      <ul class="menu">
        <li><a href="#home" title="Not a real link">Home Page</a></li>
        <li><a href="#about" title="Not a real link">About Me</a></li>
        <li><a href="#links" title="Not a real link">Links</a></li>
      </ul>
      <h1>Welcome To My Site!</h1>
      <p>Isn&acute;t it surprisingly ugly and bland?</p>
      <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est.  Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis  odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce  vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem.  Vestibulum ante ipsum primis in faucibus orci luctus et ultrices  posuere cubilia Curae; Maecenas augue diam, sagittis eget, cursus at,  vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare  dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh  feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio  justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc.  Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor. </p>
      <p>Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec,  ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit,  rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra  ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum.  Phasellus erat. Sed eget turpis tristique eros cursus gravida.  Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan  enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci  luctus et ultrices posuere cubilia Curae; </p>
      <p>Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a  mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh  accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad  litora torquent per conubia nostra, per inceptos hymenaeos. Praesent  tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam  mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante. </p>
      <p>Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce  elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla  id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum  quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in  turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi  tristique senectus et netus et malesuada fames ac turpis egestas.  Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer  adipiscing elit. </p>
      <p>Etiam tristique mauris at nibh sodales pretium. In lorem eros,  laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque  vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer  ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin  sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed  risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper  vitae, placerat sed, orci. Pellentesque habitant morbi tristique  senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante  ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; </p>
    </div>
  </body>
</html>

[/code]

External Style Sheet

[code:html]

@charset "utf-8";
body{font:13pt Geneva, Arial, Helvetica, sans-serif;}
.menu{float:left;height:800px;list-style-type:none;width:150px;}
.menu a, .menu a:visited{color:#4FB322;}
.content{background:gainsboro;color:white;margin:auto;position:relative;width:700px;}
h1{color:red;line-height:64px;text-align:center;}
p{margin:4px;}

[/code]

Even in this basic example you can see a fairly dramatic improvement when the excess HTML is trimmed and CSS is used to control style.  The original page is 51 lines, whereas the cleaned page is only 26 lines, plus 7 lines in the style sheet.  Counting the style sheet, the cleaned version is about a third smaller than the original (33 lines against 51), and in practice it is closer to half the size, because most client browsers cache the style sheet rather than downloading it with every page request.  In raw kilobytes the difference is 6KB versus 4KB, which isn't particularly exciting in this case, but it is quickly magnified as the length of the page increases.  This is especially true of dynamic applications that pull content from a database, most importantly content such as product listings, where the same markup is repeated many times.  Fortunately, for dynamic pages whose looping procedures output the same markup with different content, cutting down the bloat can be as easy as making a few modifications to those procedures.
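
To make the point about repeated markup concrete, here is a rough sketch of what a single item in a product listing might look like in each style.  The product name, the price and the class names (products, name, price) are all made up for illustration and don't come from a real site; the bloated row simply reuses the font tags from the first example.

[code:html]

<!-- Bloated: the presentation is written out again for every product -->
<tr>
  <td align="left"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Blue Widget</font></td>
  <td align="right"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">$9.99</font></td>
</tr>

<!-- Clean: each item carries only content and two (hypothetical) class hooks -->
<ul class="products">
  <li><span class="name">Blue Widget</span> <span class="price">$9.99</span></li>
</ul>

[/code]

In the clean version the colours and fonts would be defined once in the style sheet (for example under .products .name and .products .price), so every extra product output by the loop adds one short line of markup rather than two long ones.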

Furthermore, if you wanted to change, for instance, the font color or the line-height in the original, you would have to modify every font tag and table cell to accomplish that.  Implementing those changes in the second example requires a single modification to the style sheet.  The time saved here is once again significantly amplified in a situation with multiple pages (in many cases hundreds or even thousands).
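
As a rough sketch of what that modification could look like, suppose you wanted dark grey body text and a taller line-height (both values are picked purely for illustration).  Each change touches exactly one rule in the style sheet from the cleaned example, and the HTML itself stays untouched:

[code:html]

.content{background:gainsboro;color:#333333;margin:auto;position:relative;width:700px;}
p{line-height:1.5;margin:4px;}

[/code]

The paragraphs pick up the new color because they inherit it from the .content container, while the heading and menu links keep the colors set by their own rules.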

When all is said and done, this isn't meant to be a be-all end-all guide for optimizing your markup, because I could write a book and still not cover it all.  Rather, it is meant to highlight an aspect of web page performance and optimization that is usually swept under the rug in favour of those that are more directly appreciable, such as eye-candy and new features.  While clean markup might not be as "glamorous" as other aspects of web development, it is important for keeping load times in check and a crucial factor in reducing the amount of time spent maintaining and updating.