Colin Cochrane

Colin Cochrane is a Software Developer based in Victoria, BC specializing in C#, PowerShell, Web Development and DevOps.

Reducing Code Bloat Part Two - Semantic HTML

In my first post on this subject, Reducing Code Bloat - Or How To Cut Your HTML Size In Half, I demonstrated how you can significantly reduce the size of a web document by simply moving style definitions externally and getting rid of a table-based layout.  In this installment I will look at the practice of semantic HTML and how effective it can be at keeping your markup tidy and lean.

What is Semantic HTML?

Semantic HTML, in a nutshell, is the practice of creating HTML documents that only contain the author's "meaning" and not how that meaning is presented.  This boils down to using only structural elements and not presentational elements.  A common issue with semantic HTML is identifying the often subtle differences between elements that represent the author's meaning versus a presentational element.  Consider the following examples:

<p>
I really felt the need to <i>emphasize</i> the point.
</p>

 

<p>
I really felt the need to <em>emphasize</em> the point.
</p>

Many people would consider these two snips of code to be essentially the same; a paragraph, with the word "emphasize" in italics.  When considering the semantic value of the elements, however, there is one significant difference.  The element <i> is a purely presentational (or visual), and it has no meaning to the semantic structure of the document.  The <em> element, on the other hand, has a meaningful semantic value in the document's structure because it defining its contents as being emphasized.  Visually we usually see the contents both <i> and <em> elements rendered as italicized text which is why the difference often seems like nitpicking, but the standard italicizing for <em> elements is simply the choice made by most web browsers, it has no inherent visual style.

Making Your Markup Work for You

Some of you might be thinking to yourself "some HTML tags have a different meaning than others? So what?".  Certainly a reasonable response, because when it comes down to it, HTML semantics can seem like pointless nitpicking.  That being said, this nitpicking drives home what HTML is really supposed to do: define markup.  Think of it like being a writer for a magazine.  You write your article, including paragraphs and indicating words or sentences that should be emphasized, or considered more strongly, in relation to the rest, and submit it.  You don't decide what font is used, the colour of the text, or how much space should be between each line, that's the editor's job.  It is exactly the same concept with semantic HTML: the structure and meaning of the content is the job of HTML, and the presentation is the job of the browser and CSS.

Without the burden of presentational elements you only have to worry about a core group of elements with meaningful structure value.  For instance...

<p>
This is a chunk of text <span class="red14pxitalic">where</span> random words 
are <span class="green16pxbold">styled</span> differently,
<span class="red14pxitalic">with</span> some words 
<span class="green16pxbold">being</span> red, italicized and at 14px, and others
being green, bold, and 16px.
</p>
 
with an external style-sheet definition...
span.red14pxitalic{color:red;font-size:14px;font-style:italic;}
span.green16pxbold{color:green;font-size:16px;font-weight:bold;}

...turns into this...

<p>
This is a chunk of text <em>where</em> random words 
are <strong>styled</strong> differently, <em>with</em> some words 
<strong>being</strong> red, italicized and at 14px, and others being green, 
bold, and 16px.
</p>

with an external style-sheet definition...

p em{color:red;font-size:14px;font-style:italic;}
p strong{color:green;font-size:16px;font-weight:bold;}

The second example accomplishes the same visual result as the first, but contains actual semantic worth in the document's structure.  It also illustrates how much simpler it is to create a new document because you don't have to worry about style.  You create the document and use meaningful semantic elements to identify whatever parts of the content are necessary, and let the style-sheet take care of the rest (assuming of course that the style-sheet was properly defined with the necessary styles).  By using a more complete range of HTML elements you will find yourself needing <span class="whatever"> tags less and less and find your markup becoming cleaner, easier to read, and smaller.

 

Code Size Comparison

 

  HTML Characters (Including Spaces) CSS Characters (Including Spaces)
Example One 302 127
Example Two 216 103

 

Part Three will continue looking at semantic HTML, as well as strategies you can use when defining your style framework.

Using CSS To Create Two Common HTML Border Effects

Seperating the style from the markup of a web document is generally a painless, if sometimes time-consuming, task.  In many cases however, the process can have some added speed-bumps; most notably when the original HTML is using an infamous table-based layout.  The two most common speedbumps when dealing with table-based layouts and styling are recreating the classic borderless table and keeping the default table border appearance.

The appearance of these two kinds of table are as follows 

Default Border

1 2
3 4

Borderless

1 2
3 4

The markup for these two tables looks like:

[code:html]

<!--Default Border -->
<table border="1">
<tbody>
<tr>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>4</td>
</tr>
</tbody>
</table>
<!-- Borderless -->
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>4</td>
</tr>
</tbody>
</table>

[/code]

If you want to get the same effects while losing the HTML attributes you can use the folllow CSS:

Default Border

[code:html]

table{border-spacing:0px;border:solid 1px #8D8D8D;width:130px;}
table td{
border:solid 1px #C0C0C0;
border-bottom:solid 1px #8D8D8D;
border-left:solid 1px #8D8D8D;
display:table-cell;
margin:0;
padding:0;}

[/code]


Borderless

[code:html]

table{border:none;border-collapse:collapse;}
table td{padding:0;margin:0;}

[/code]

Duplicating the default table border look requires extra rules in its style definition because the default border contains two shades so the border-color values must be set accordingly. 

That is the basic method to replicating HTML table effects with CSS that are usually created with HTML attributes.

Reducing Code Bloat - Or How To Cut Your HTML Size In Half

When it comes to designing and developing a web site the load time is one consideration that is often ignored, or is an afterthought once the majority of the design and structure is in place.  While high-speed internet connections are becoming increasingly common there are still a significant portion of web users out there with 56k connections, and even those with broadband connections aren't guaranteed to have a fast connection to your particular server.  Every second that a user has to wait to download your content is increasing the chance of that user deciding to move on.

Attempting to reduce the size of a web page is usually restricted to compressing larger images and optimizing them for web use.  This is a necessary step to managing page size, but there is another important factor that can significantly reduce the size of a page to improve download times: code bloat, or more specifically, (x)HTML code bloat.  Getting rid of that code bloat means less actual bytes to be downloaded by clients, as well as captilizing on what the client has already downloaded.  Unfortunately this is an option that tends to be ignored due to the perceived loss of time spent combing through markup to cut out the chaff, despite the fact that clean, efficient markup with well-planned style definitions will save countless hours when it comes to upkeep and maintenance.

To demonstrate the difference between a bloated page and one with efficient markup I created two basic pages.  One uses tables, font tags, HTML style attributes and so forth to control the structure and look of the page, while the other uses minimal markup with an external stylesheet.

1) Bloated Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>This Tag Soup Is Really Thick</title>
<meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
<meta name="keywords" content="tag soup, messy html, bloated html">
</head>
<body>
<center>
  <table width=700 height=800 bgcolor="gainsboro" border=0 cellpadding=0 cellspacing=0>
    <tr>
      <td valign=top width=150><table align=center border=0 cellpadding=0 cellspacing=0>
          <tr>
            <td align="left"><a href="#home" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Home Page</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#about" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">About Me</font></a></td>
          </tr>
          <tr>
            <td align="left"><a href="#links" title="Not a real link"><font color="#4FB322" size="3" face="Geneva, Arial, Helvetica, sans-serif">Links</font></a></td>
          </tr>
        </table></td>
      <td valign=top><table>
          <tr>
            <td align="center" height=64><h1><font color="red" face="Geneva, Arial, Helvetica, sans-serif">Welcome to My Site!</font></h1></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Isn&acute;t it surprisingly ugly and bland?</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est. Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Maecenas augue diam, sagittis eget, cursus at, vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc. Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor.</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec, ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit, rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum. Phasellus erat. Sed eget turpis tristique eros cursus gravida. Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
          <tr>
            <td align="left"><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos hymenaeos. Praesent tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</font></td>
          </tr>
          <tr align="left">
            <td><font color="#FFFFFF" size="3" face="Geneva, Arial, Helvetica, sans-serif">Etiam tristique mauris at nibh sodales pretium. In lorem eros, laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper vitae, placerat sed, orci. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;</font></td>
          </tr>
        </table></td>
    </tr>
  </table>
</center>
</body>
</html>

[/code]

2) Cleaned Page

[code:html]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Less Markup, More Content</title>
    <meta name="description" content="The end result of lazy web page design resulting in a bloated mess of HTML.">
    <meta name="keywords" content="tag soup, messy html, bloated html">
    <link href="style.css" rel="stylesheet" type="text/css">
  </head>
  <body>
    <div class="content">
      <ul class="menu">
        <li><a href="#home" title="Not a real link">Home Page</a></li>
        <li><a href="#home" title="Not a real link">About Me</a></li>
        <li><a href="#home" title="Not a real link">Links</a></li>
      </ul>
      <h1>Welcome To My Site!</h1>
      <p>Isn&acute;t it suprisingly ugly and bland?</p>
      <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fusce est.  Maecenas pharetra nibh vel turpis molestie gravida. Integer convallis  odio eu nulla. Vivamus eget turpis eu neque dignissim dignissim. Fusce  vel erat ut turpis pharetra molestie. Cras sollicitudin consequat sem.  Vestibulum ante ipsum primis in faucibus orci luctus et ultrices  posuere cubiliaCurae; Maecenas augue diam, sagittis eget, cursus at,  vulputate at, nisl. Etiam scelerisque molestie nibh. Suspendisse ornare  dignissim enim. Sed posuere nunc a lectus. Vestibulum luctus, nibh  feugiat convallis ornare, lorem neque volutpat risus, a dapibus odio  justo at erat. Donec vel lacus id urna luctus tincidunt. Morbi nunc.  Donec fringilla sapien nec lectus. Duis at felis a leo porta tempor. </p>
      <p>Maecenas malesuada felis id mauris. Ut nibh eros, vestibulum nec,  ornare sollicitudin, hendrerit et, ligula. Suspendisse tellus elit,  rutrum ut, tempor eget, porta bibendum, magna. Nunc sem dolor, pharetra  ut, fermentum in, consequat vitae, velit. Vestibulum in ipsum.  Phasellus erat. Sed eget turpis tristique eros cursus gravida.  Vestibulum quis pede a libero elementum varius. Nullam feugiat accumsan  enim. Aenean nec mi. Vestibulum ante ipsum primis in faucibus orci  luctus et ultrices posuere cubilia Curae; </p>
      <p>Aenean vel neque ac orci sagittis tristique. Phasellus cursus quam a  mauris. Donec posuere pede a nisl. Curabitur nec ligula eu nibh  accumsan sagittis. Integer lacinia. Class aptent taciti sociosqu ad  litora torquent per conubia nostra, per inceptos hymenaeos. Praesent  tortor dolor, pellentesque eget, fermentum vel, mollis ut, erat. Nullam  mollis. Cras rhoncus tellus ut neque. Pellentesque sed ante. </p>
      <p>Donec at nunc. Nulla elementum porta elit. Donec bibendum. Fusce  elit ligula, gravida et, tincidunt et, aliquam sit amet, metus. Nulla  id magna. Fusce quis eros. Sed eget justo. Vivamus dictum interdum  quam. Curabitur malesuada. Proin id metus. Curabitur feugiat. Nunc in  turpis. Cras lobortis lobortis felis. Pellentesque habitant morbi  tristique senectus et netus et malesuada fames ac turpis egestas.  Mauris imperdiet aliquet ante. Lorem ipsum dolor sit amet, consectetuer  adipiscing elit. </p>
      <p>Etiam tristique mauris at nibh sodales pretium. In lorem eros,  laoreet eget, rhoncus et, lacinia nec, pede. Fusce a quam. Pellentesque  vitae lacus. Vivamus commodo. Morbi euismod, ipsum id consectetuer  ornare, nisi sem suscipit pede, vel dictum purus mauris eu leo. Proin  sodales. Aliquam in pede nec eros aliquet adipiscing. Nulla a purus sed  risus ullamcorper tempus. Nunc neque magna, fringilla quis, ullamcorper  vitae, placerat sed, orci. Pellentesque habitant morbi tristique  senectus et netus et malesuada fames ac turpis egestas. Vestibulum ante  ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; </p>
    </div>
  </body>
</html>

[/code]

External Style Sheet

[code:html]

@charset "utf-8";
body{font:13pt Geneva, Arial, Helvetica, sans-serif;}
.menu{float:left;height:800px;list-style-type:none;width:150px;}
.menu a, .menu a:visited{color:#4FB322;}
.content{background:gainsboro;color:white;margin:auto;position:relative;width:700px;}
h1{color:red;line-height:64px;text-align:center;}
p{margin:4px;}

[/code]

Even in this basic example you can see a fairly dramatic improvement when the excess HTML is trimmed and CSS is used to control style.  The original page is 51 lines, where the cleaned page is only 26 lines, plus 7 lines in the style sheet.  The cleaned page is a third the size of the original (counting the style sheet), and more realistically is actually half the size because the style sheet would be cached by most client browsers and wouldn't be downloaded for every page request.  As far as raw kilobytes it's a difference of 6KB to 4KB, which isn't a particularly exciting difference in this case, but one that is quickly magnified as the length of the page increases.  This is especially true with dynamic applications that pull content from a database, most importantly content such as product listings that utilize the same markup and are repeated multiple times. Fortunately in the case of dynamic pages involving looping procedures that output the same markup with different content, cutting down the bloat can be as easy as a few modifications to those procedures.

Furthermore if you wanted to change, for instance, the font color or the line-height in the original, you would have to modify every font tag and table cell to accomplish that.  Implementing those changes in the second example requires a single modification to the style-sheet.  The time-saved here is once again significantly amplified when considered in a situation with multiple pages (in many cases this can be hundreds or even thousands).

When all is said and done, this isn't meant to be a be-all end-all guide for optimizing your markup because I could write a book and still not cover it all.  Rather it was meant to highlight an aspect of web page performance and optimization that is usually swept under the rug in favour of those that are more directly appreciable such as eye-candy and new features.  While clean markup might not be as "glamourous" as other aspects of web development, it is an important aspect to keeping load time in check and a crucial factor in reducing them amount of time spent maintaining and updating.

Three CSS Roll Over Techniques You Might Not Know About

When it comes to rollover effects in web design the most common way to accomplish the effect has traditionally been with JavaScript:


JavaScript in the HEAD section

[code:html]

<script language="JavaScript">
<!--
// preload images
if (document.images) {
img_on =new Image(); img_on.src ="../images/1.gif";
img_off=new Image(); img_off.src="../images/2.gif";
}
function handleOver() {
if (document.images) document.imgName.src=img_on.src;
}
function handleOut() {
if (document.images) document.imgName.src=img_off.src;
}
//-->
</script>

[/code]


And in the element with the rollover effect

[code:html]


<a href="http://www.domain.com" onMouseOver="handleOver();return true;" onMouseOut="handleOut();return true;"><img name="imgName" alt="Rollover!" src="/images/1.gif"/></a>

[/code]

The reason this method is used so commonly is because it is simple to implement and, more importantly, avoids the "lag" on the first mouseover that comes when using a CSS background-image switch on an selector:hover rule due to the delay required to download the rollover image. One thing that a lot of people don't realize is that there are methods to accomplish this effect in CSS without the initial rollover lag.

Method One - CSS Preloading

This is the quick and dirty way to force browsers to download rollover images when they initially load the page. Let's say you have the following document:

[code:html]


<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#rollover{background:url(/images/1.gif);}
#rollover:hover{background:url(/images/2.gif);}
</style>
</head>
<body>
<div>
<a id="rollover" href="http://www.domain.com">My Rollover Link</a>
</div>
</body>
</html>

[/code]

In this page there would be a noticible delay when a user first mouses over the "rollover" anchor. The CSS Preloading method uses an invisible dummy element set to visibility:hidden, and has the "active" version of the rollover image set as its background.

[code:html]

<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#preload{position:absolute;visibility:hidden;}
#image2{background:url(/images/2.gif);}
#rollover{background:url(/images/1.gif);}
#rollover:hover{background:url(/images/2.gif);}
</style>
</head>
<body>
<div id="preload">
<div id="image2"></div>
</div>
<div>
<a id="rollover" href="http://www.domain.com">My Rollover Link</a>
</div>
</body> 
</html>
 

[/code]

Method Two - Image Visibility Swap

This method accomplishes the same goal of forcing the browser to load both of the rollover images, but attacks it in a different way. Using the same example as above, we basically set the background of the containing anchor element to the "active" state of the rollover, and set the contained image to be the "inactive" state. Then it's just a matter of hiding the image element on hover.

[code:html]

<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#rollover{background:url(/images/2.gif");display:block;height:50px;width:50px;}
#rollover:hover img{visibility:hidden;}
</style>
</head>
<body>
<div>
<a id="rollover" href="http://www.domain.com"><img src="/images/1.gif" alt="My Rollover's Inactive Image" /></a>
</div>
</body> 
</html>  

[/code]

This is the method that this site uses for the ColinCochrane.com logo in the header.

Method Three - Multistate Image

This method avoids the preloading problem altogether by using only one image that contains the inactive and active states. This is accomplished by creating an image that has the inactive and active versions stacked on top of eachother, like so:



Then all you do is set the element's height to half of that of the image and use the background-position property to shift the states on hover:

[code:html]

<html>
<head>
<title>My Rollover Page</title>
<style type="text/css">
#rollover{background:url(/images/multi.gif") bottom;display:block;height:20px;width:100px;}
#rollover:hover{background-position:top;}
</style>
</head>
<body>
<div>
<a id="rollover" href="http://www.domain.com"></a>
</div>
</body> 
</html>

[/code]


Now you have some different techniques to consider when implementing rollover effects on your website.