Pro HTML5 Accessibility: Building an Inclusive Web (2012)

CHAPTER 5

HTML5: The New Semantics and New Approaches to Document Markup

In this chapter, you’ll start to look at the HTML5 specification in more detail, especially the aspects of it that most relate to the development of accessible interfaces. There are many new APIs that do background client/server processing and data storage that can be leveraged for rich, responsive applications, but you’ll be seeing mostly the aspects of HTML5 that impact accessibility for users.

You’ll also see the new HTML5 elements and the semantics used to define document outlines and new structural forms.

HTML5: What’s New?

Early versions of HTML were relatively simple document markup languages. They allowed documents to be linked and referenced, and they provided semantics for structuring the content to support a small range of browsers and platforms. This variety of content ranged from the likes of data tables to content providing interaction via simple links (which is really the heart of the Web) to form controls, as well as some very simple media—implemented a la embedded graphics.

But that was really it. These items formed the basis of the abilities of earlier HTML markup languages. Clever people then worked out ways of delivering compressed audio and video through plugins such as Flash, Silverlight, QuickTime for the Maccies, or RealPlayer for PC heads (and indeed what plugin fun all that was). These technologies enabled the Web to be used as a platform for rich media like animated content and video.

Note Video is now really big business, with Netflix in the US accounting for between 22 and 35 percent of all Web traffic (depending on who you ask). So the nature of the delivery platform is pretty important. And yes, Netflix was one of the first companies to adopt HTML5 to serve video to its clients.

These early iterations of HTML were expected to go way above the call of duty, as the Web moved far beyond pages being mere documents into the application space and so on.

OK, so what’s new in HTML5? Quite a lot. So much that HTML5 totally breaks the model of being merely a declarative document markup language. HTML5 has a host of new features that cover a broad range of new exciting functions. In many ways, much of it might not seem very HTML like, and in truth the language goes way beyond its predecessors. HTML5 really is a disruptive technology.

Caution With any new or game-changing technology, you have to take care and adopt a flexible trial-and-error approach. The way HTML5 adoption works (in part, anyway) is that the specification makes parsing rules that give browser manufacturers a standard they can build their browsers around. Support within the browser often is provided in a piecemeal way. As the spec evolves, a browser vendor might implement certain parts of it only.

From an accessibility perspective, the net result is that what is or isn’t supported by the browser or assistive technology (AT) vendor can seem to be a little arbitrary. For example, it took absolutely years for CSS2 to be widely supported in the browser. And when it finally was, it was often only partially supported, and this led to bugs and glitches (such as “box-model rendering inconsistencies”). There are the same peculiarities in the world of AT. For example, the scope attribute of the <th> element made authoring more accessible data tables in HTML4 much quicker than using the better supported headers and id method. However, it was so poorly supported that this easier authoring method, unfortunately, just couldn’t be relied upon.
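To make the difference between the two table-authoring methods concrete, here is a minimal sketch of the same small table marked up both ways. The table data and the id values are invented for illustration only.

```html
<!-- Approach 1: scope on the header cells (less markup to write) -->
<table>
  <caption>Opening hours (hypothetical data)</caption>
  <tr>
    <th scope="col">Day</th>
    <th scope="col">Opens</th>
    <th scope="col">Closes</th>
  </tr>
  <tr>
    <th scope="row">Monday</th>
    <td>09:00</td>
    <td>17:00</td>
  </tr>
</table>

<!-- Approach 2: headers/id (more markup, but every data cell
     names its header cells explicitly) -->
<table>
  <caption>Opening hours (hypothetical data)</caption>
  <tr>
    <th id="day">Day</th>
    <th id="opens">Opens</th>
    <th id="closes">Closes</th>
  </tr>
  <tr>
    <th id="mon">Monday</th>
    <td headers="mon opens">09:00</td>
    <td headers="mon closes">17:00</td>
  </tr>
</table>
```

The first version is clearly quicker to write, which is why patchy support for it was so frustrating.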

New HTML5 Semantics

HTML is a language that is largely associated with meaning. There are controls that you are familiar with already that have an inherent behavior (such as links being clickable or activated via the keyboard, and so on), but these behaviors are handled by a user agent, such as the browser. The underlying code provides a basic structure that allows the core content (which is usually and mostly text) to be consumed by a broad range of devices. This content layer provides a practical basis for the “author once, and publish to many devices” model that gives a huge edge to electronic communication, enabling interoperability without a great deal of customization between a plethora of devices.

Note The HTML5 spec is quite strict when it comes to elements being used only for their intended purpose.

The preceding observations really make sense when you think of the importance that semantics have on interoperability and, indeed, accessibility. Think of what we have covered already regarding how important structured content is in supporting an accessibility architecture and passing usable information via the accessibility API to the AT. My apologies if your head still hurts, but all this should be a little clearer now.

However, this rule of “the right tool for the right job” that the HTML5 spec is quite explicit about really does break down in the wild. Developers pretty much just do what they want, as long as it “works” for their purposes. It’s also worth noting that the concept of “it seems OK, it’s working, just don’t look too close under the hood” really breaks down when it comes to accessibility.

This is because the stuff that is broken under the hood might only come to light when you do some kind of expert accessibility auditing or user testing with real people. We’ll look at this topic in more detail in Chapter 9.

It suffices to say that the advice the new spec gives does focus on software interoperability and not its impact on the end user. In fact, the spec tries to not define the user experience, and there is a double standard at play to some degree. Why? Because certain aspects of the specification do actually outline what the user experience should be in some detail. However, when it comes to accessibility—more often than not—this isn’t clearly delineated.

Semantic Ninjas

It is also worth noting that while I was being a little blasé about some developers’ attitudes to their code (“It seems OK to me, so it’s OK”), this would actually be fine with me if the stuff that developers threw together did work for people with disabilities. I guess I have a particular prejudice against web content that doesn’t work for people with disabilities and older people. I’m mentioning this because well-formed code, strict validation, or even semantic correctness take second, third, and fourth place behind a positive user experience for people with disabilities who might be using ATs.

Ideally, these should support accessibility, and in many ways they do—but not all the time and in all situations. If well formedness, validation, and semantic correctness were cast-iron guarantees of accessibility, it would be happy days for all, but they’re not. So don’t think in absolutes; there are many relative considerations when it comes to HTML5 and accessibility.

Just so it’s clear when you read this: I’m all for following the rules, but only when they work in a way that doesn’t break the user experience for people with disabilities—or without disabilities, actually. I’m less concerned with the rules of well-coded, well formedness, or conformance to normative specification outlines or whatever you want to call it. Having said that, I don’t condone sloppy code or designing code in any old way; I am merely acknowledging that there will be times when you have to follow your own intuition or insightful empirical judgment rather than merely what the specification dictates.

There are some people who call themselves knowledgeable about accessibility when actually they are specification zealots, and possibly validation vampires. Conformance to either approach is no guarantee of accessibility, good usability, or a positive user experience. All these development tools really can do is indicate the developer knew what she was doing when it came to writing code that conforms to a formal published grammar, but they don’t necessarily factor in the user experience in terms of user-agent support and exactly what works with either the browser or the AT. The only thing that will help developers understand the practical reality is the experience of user testing or becoming proficient with using a screen reader. At that point, the developer’s own testing might well help signal when something is or isn’t working. This kind of knowledge is very important because it is based on the subjective reality of the user experience.

Note OK, actually well-formed code is important. Good semantics are really important, but real-world accessibility is very nuanced, particularly for new content and element types. However, bear in mind that an un-encoded ampersand (which will throw a validation error) never resulted in an inaccessible webpage. Having some heading structure, rather than none, on a web document is far more important than merely ensuring the headings are ordered in the correct way.

Semantic No-Gos

The introductory part of the HTML5 specification that deals with new elements warns against both the use of nonconforming content and the use of conforming elements in nonconforming ways. You are advised not to use the nonconforming attribute value ("carpet", in the example) and the nonconforming attribute ("texture" in the example), which are not allowed in HTML5. Use of these is shown here:

<label>Carpet: <input type="carpet" name="c" texture="deep pile" /></label>

This is certainly good to know—not very useful, but good to know. The spec suggests, however, that the following code example makes more semantic sense and could therefore be valid HTML5:

<label>Carpet: <input type="text" class="carpet" name="c" data-texture="deep pile" /></label>

Note I’m not joking about the carpet example! You can check it on http://dev.w3.org/html5/spec/Overview.html#elements.

OK, so while the example may seem initially a little silly, it isn’t without purpose. You can see from it how the semantics do make more sense. The <input type="text" /> segment is standard input that is given an arbitrary class value (class="carpet"). The data-texture="deep pile" segment will be ignored by user agents that don’t understand it, which is all of them!
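Although user agents ignore a data-* attribute, your own scripts can still read it. The following is a minimal sketch of that idea; the id="carpet" hook is an assumption added for the illustration and is not part of the spec's example.

```html
<label>Carpet:
  <input type="text" class="carpet" name="c" id="carpet" data-texture="deep pile">
</label>

<script>
  // The id "carpet" is a hypothetical hook added for this sketch.
  var input = document.getElementById("carpet");

  // data-texture is exposed as dataset.texture where the dataset API
  // is supported; getAttribute() is the fallback for older browsers.
  var texture = input.dataset
    ? input.dataset.texture
    : input.getAttribute("data-texture");

  console.log(texture);
</script>
```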

On a more serious note, if you did have a document fragment that could be used to represent the heading of a corporate site, this would also be nonconforming because the second line is not intended to be a heading of a subsection, merely a subheading or subtitle. You might think the semantics are a bit iffy, and you are right. However, there was no way in earlier versions of HTML to define the relationship between grouped headings. You can see that the following is nonconforming:

<body>
<h1>ABC Company</h1>
<h2>Leading the way in widget design since 1432</h2>
[…]

HTML5 brings an <hgroup> element to help define where a heading can have a direct subheading, such as what you find here:

<body>
<hgroup>
<h1>ABC Company</h1>
<h2>Leading the way in widget design since 1432</h2>
</hgroup>
[…]

This kind of code is useful when you have a strap line that provides a cool tagline for your site or project. For example, if you have a web site for an animal shelter, you could have the following code:

<hgroup>
<h1>Animal Sanctuary</h1>
<h2>A lifeline for all creatures great and small</h2>
</hgroup>
...

Figure 5-1 shows an example of the code in Listing 5-1 in a page, as well as how it might look with a little Cascading Style Sheets (CSS) styling thrown in.

Listing 5-1. Using an <hgroup>

/* Tiled background image for the whole page */
body
{
    background-image: url(../Images/gray_white_tile2.png);
}

/* Dark, rounded box around the heading group (an element with id="hg") */
#hg
{
    border-radius: 15px;
    background-color: #333;
    height: 80px;
    margin: 20px;
    padding: 15px;
}

/* Main heading: gold text on the dark box */
h1
{
    font-family: "Lucida Sans Unicode", "Lucida Grande", sans-serif;
    color: #C90;
    background-color: #333;
    margin: 0;
    padding: 0;
    border-radius: 15px;
}

/* Strap line: smaller white text */
h2
{
    font-family: "Century Gothic", sans-serif;
    color: #FFF;
    font-size: 50%;
}

Figure 5-1. Use of new <hgroup> example
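Listing 5-1 contains only the style rules. Here is a minimal sketch of a page those rules could apply to. The id="hg" on the <hgroup> is an assumption inferred from the #hg selector, and the style sheet file name is hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Animal Sanctuary</title>
  <!-- Hypothetical file name holding the rules from Listing 5-1 -->
  <link rel="stylesheet" href="sanctuary.css">
</head>
<body>
  <!-- id="hg" is assumed so that the #hg rules match this element -->
  <hgroup id="hg">
    <h1>Animal Sanctuary</h1>
    <h2>A lifeline for all creatures great and small</h2>
  </hgroup>
</body>
</html>
```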

Note The preceding example is the beginning of a new approach to document layout. It’s a more sophisticated way of defining content and outlines that uses new elements such as <section>, <article>, <header>, <footer>, and so on with HTML5. The HTML5 spec does warn that the use of scripting will change the values of many attributes, text, and indeed the entire structure of the document. You can easily make this happen dynamically so that a user agent must update these semantics of a document in order to represent the current state of the document correctly. It is very important to maintain the semantic integrity of the document so that interoperability and accessibility can also be supported properly. The spec does advise authors “to use declarative alternatives to scripting where possible, as declarative mechanisms are often more maintainable, and many users disable scripting.” That is good advice.

Global Attributes in HTML5

The following are common attributes that can be added to any of the native elements:

· accesskey

· class

· contenteditable

· contextmenu

· dir

· draggable

· dropzone

· hidden

· id

· lang

· spellcheck

· style

· tabindex

· title

You’ve seen some of these global attributes before, such as id (a unique identifier for an element that you can use as a hook for your CSS or JavaScript), class (which is the same as id, except it’s reusable on many elements), title (advisory information for an element), accesskey (a way of providing author-defined shortcut keys), tabindex (for giving sequential keyboard focus to elements), and style (adding CSS), to name a few.
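Here is a short sketch showing a few of these global attributes in use on ordinary elements. The id and class values are invented for the example.

```html
<!-- id, class, and lang are global, so they can go on any element -->
<section id="intro" class="panel" lang="en">

  <!-- title: advisory information, often shown as a tooltip -->
  <p title="Advisory text for this paragraph">Hover over me.</p>

  <!-- tabindex="0" puts a normally unfocusable element in the tab order -->
  <div class="panel" tabindex="0">I can receive keyboard focus.</div>

  <!-- hidden: the content is not yet, or no longer, relevant -->
  <p hidden>This paragraph is not rendered.</p>

  <!-- contenteditable and spellcheck: edit in place, with spell checking -->
  <p contenteditable="true" spellcheck="true">Edit this text.</p>

</section>
```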

Note With the accesskey attribute, the idea was that authors would be able to add quick, shortcut keys to their web content. It seems like a good idea, unless you use AT—because if you do, those keystrokes already do something! As an author, I strongly suggest you avoid using access keys, unless you can find some reserved keys that are not already used by the lion’s share of AT out there. Good luck with that, by the way.

There are a load of event-handler attributes that can also be added to any HTML element:

· onabort

· onblur*

· oncanplay

· oncanplaythrough

· onchange

· onclick

· oncontextmenu

· oncuechange

· ondblclick

· ondrag

· ondragend

· ondragenter

· ondragleave

· ondragover

· ondragstart

· ondrop

· ondurationchange

· onemptied

· onended

· onerror*

· onfocus*

· oninput

· oninvalid

· onkeydown

· onkeypress

· onkeyup

· onload*

· onloadeddata

· onloadedmetadata

· onloadstart

· onmousedown

· onmousemove

· onmouseout

· onmouseover

· onmouseup

· onmousewheel

· onpause

· onplay

· onplaying

· onprogress

· onratechange

· onreset

· onscroll*

· onseeked

· onseeking

· onselect

· onshow

· onstalled

· onsubmit

· onsuspend

· ontimeupdate

· onvolumechange

· onwaiting

Note The preceding event handlers that are marked with asterisks might change meanings, depending on the context they’re used in—in this case, if they are used on a <body> element or window object.
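As a rough illustration of how context changes the meaning, the following sketch uses onblur in two places. The report() function and the id="log" element are hypothetical helpers written just for this example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Event handler context</title>
  <script>
    // Hypothetical helper that writes a message into the page.
    function report(message) {
      document.getElementById("log").textContent = message;
    }
  </script>
</head>
<!-- On <body>, onblur is a window handler: it runs when the window loses focus -->
<body onblur="report('The window lost focus')">
  <!-- On a form control, onblur runs when that control loses focus -->
  <label>Name:
    <input type="text" onblur="report('The name field lost focus')">
  </label>
  <p id="log"></p>
</body>
</html>
```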

Some More ARIA, Sir?

The specification also recommends that for assistive technology products that might need more detail than the current HTML5 spec can provide, a “set of annotations for assistive technology products can be specified (the ARIA role and aria-* attributes).”

We covered a lot of ground on ARIA in an earlier chapter, but it is worth outlining here how ARIA and HTML5 play together because they both have native semantics.

Here’s an enlightening quote taken directly from the HTML5 spec:

“The following table defines the strong native semantics and corresponding default implicit ARIA semantics that apply to HTML elements. Each HTML language feature (element or attribute) in the first column implies the ARIA semantics (role, states, and/or properties) given in the cell in the second column of the same row. When multiple rows apply to an element, the role from the last row to define a role must be applied, and the states and properties from all the rows must be combined. The following is a list of how this works in detail.”¹

Table 5-1 outlines strong native semantics and their corresponding default, implicit ARIA semantics.

__________

¹ http://dev.w3.org/html5/spec/Overview.html#wai-aria

Note Where ARIA is added to a native HTML5 element, in general, the added ARIA semantics will trump the HTML and override the default semantics. However, in some cases this doesn’t happen and there are some restrictions that apply. Also note that any element can be given the presentation role, regardless of the restrictions shown in Table 5-2.
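A small sketch might help here. It shows an element relying on its implicit semantics, an element whose default semantics are overridden by an added role, and the presentation role. The URLs, the id, and the image file name are invented for the example, and how each case is announced will vary with the browser and AT in use.

```html
<!-- Native semantics only: <nav> has an implicit navigation role -->
<nav>
  <a href="/overview">Overview</a>
</nav>

<!-- Added ARIA overrides the default: where ARIA is supported,
     this link is exposed as a button rather than as a link -->
<a href="#" role="button" id="save">Save</a>

<!-- The presentation role removes the element's own semantics -->
<img src="divider.png" alt="" role="presentation">
```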

Some of this must seem rather complex and gnarly, and at first glance it kind of is. However, it’s best to be aware of how the semantic interplay between HTML5 and added languages like WAI-ARIA works. As you saw earlier, there are many similarities between the two.

Content Models

The new HTML5 elements are defined in a way that includes information about the following:

· The Category that an element belongs in

· The Context that the element can be used in

· The Content Model that outlines the children of the elements that should be included

· The DOM interface that the element should implement

Note Attributes can have any string value, including an empty string. There are some restrictions, but these general rules apply.

The Content Model outlines what an element is expected to contain. This makes sense when you think about it, because it helps clearly outline how content should behave and what the browser should do when it encounters certain items. This is also made more relevant when you consider how nested items should behave. In general, usage of an element must follow its Content Model.
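The idea of following a Content Model can be sketched with a couple of conforming and nonconforming fragments. The text content is invented for the example.

```html
<!-- Conforming: <p> expects phrasing content, and <em> is phrasing content -->
<p>Adopt, <em>don't</em> shop.</p>

<!-- Nonconforming: <div> is flow content, which <p> may not contain.
     An HTML5 parser closes the <p> when it reaches the <div>, so the
     resulting DOM is not the nesting the author wrote. -->
<p>Opening hours <div>9 to 5</div></p>

<!-- Conforming: the children of <ul> are <li> elements -->
<ul>
  <li>Dogs</li>
  <li>Cats</li>
</ul>
```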

Note This can get a little murky when dealing with strong vs. weak semantics, or “what trumps what” determinations in certain contexts, such as those you can see in Table 5-1 and Table 5-2. I mentioned categories of HTML content that group elements, and here is a list of them:

· Metadata content

· Flow content

· Sectioning content

· Heading content

· Phrasing content

· Embedded content

· Interactive content

There is a nice interactive SVG diagram in the spec that visually illustrates how these categories relate to each other. For more information, go to http://dev.w3.org/html5/spec/Overview.html#kinds-of-content.

Metadata Content

Metadata content is content that sets up the behavior or presentation of page content. You will already be used to typical metadata content such as JavaScript and/or CSS, which uses the following elements:

· <base>

· <command>

· <link>

· <meta>

· <noscript>

· <script>

· <style>

· <title>

Metadata content can also provide information about how documents relate to each other.

Note noscript is still valid HTML5 content. It is a less than elegant way of presenting content to a user who doesn’t have JavaScript available for whatever reason.
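Here is a minimal sketch of <noscript> in use. The id, the message text, and the linked file name are invented for the example.

```html
<p id="map-status"></p>

<script>
  // Runs only when scripting is available.
  document.getElementById("map-status").textContent = "Interactive map loading";
</script>

<noscript>
  <!-- Rendered only when scripting is disabled or unsupported -->
  <p>The interactive map needs JavaScript. You can use the
     <a href="directions.html">text directions</a> instead.</p>
</noscript>
```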

Flow Content

These are the main elements used in the body of a HTML document. There are rather a lot, and you will recognize most of them:

· <a>

· <abbr>

· <address>

· <area> (if it is a descendant of a map element)

· <article>

· <aside>

· <audio>

· <b>

· <bdi>

· <bdo>

· <blockquote>

· <br>

· <button>

· <canvas>

· <cite>

· <code>

· <command>

· <data>

· <datalist>

· <del>

· <details>

· <dfn>

· <div>

· <dl>

· <em>

· <embed>

· <fieldset>

· <figure>

· <footer>

· <form>

· <h1>

· <h2>

· <h3>

· <h4>

· <h5>

· <h6>

· <header>

· <hgroup>

· <hr>

· <i>

· <iframe>

· <img>

· <input>

· <ins>

· <kbd>

· <keygen>

· <label>

· <map>

· <mark>

· <math>

· <menu>

· <meter>

· <nav>

· <noscript>

· <object>

· <ol>

· <output>

· <p>

· <pre>

· <progress>

· <q>

· <ruby>

· <s>

· <samp>

· <script>

· <section>

· <select>

· <small>

· <span>

· <strong>

· <style> (if the scoped attribute is present)

· <sub>

· <sup>

· <svg>

· <table>

· <textarea>

· <u>

· <ul>

· <var>

· <video>

· <wbr>

· Text (plain text, which is not an element)

Sectioning Content

This is content that is related and can be grouped together. The following new elements can be used to define new sections of thematically grouped content:

· <article>

· <aside>

· <nav>

· <section>
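The four sectioning elements might be combined as in the following sketch. The page content is invented, and the heading levels follow the conventional h1 to h3 ranking rather than relying on any outline calculation by the browser.

```html
<body>
  <h1>Animal Sanctuary</h1>

  <!-- nav: a section of links to other pages or parts of the page -->
  <nav>
    <h2>Site navigation</h2>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/adopt">Adopt</a></li>
    </ul>
  </nav>

  <!-- article: a self-contained piece of content -->
  <article>
    <h2>Meet Rex</h2>

    <!-- section: a thematic grouping inside the article -->
    <section>
      <h3>Temperament</h3>
      <p>Rex is gentle and good with children.</p>
    </section>

    <!-- aside: content only loosely related to what surrounds it -->
    <aside>
      <h3>Related</h3>
      <p><a href="/dogs">See all of our dogs</a></p>
    </aside>
  </article>
</body>
```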

Heading Content

This kind of content is used when defining the headings of a document that are traditionally used to structure page content. There are some usual suspects here that we’ll revisit later in the chapter:

· <h1>

· <h2>

· <h3>

· <h4>

· <h5>

· <h6>

· <hgroup>

You will be familiar with these headings and their usage from your previous web projects. They are invaluable in creating accessible content. The <hgroup> element is the new kid on the block.

Phrasing Content

This is the main body of text in a document and the inline elements used to mark up that content:

· <a> (if it contains only phrasing content)

· <abbr>

· <area> (if it is a descendant of a map element)

· <audio>

· <b>

· <bdi>

· <bdo>

· <br>

· <button>

· <canvas>

· <cite>

· <code>

· <command>

· <data>

· <datalist>

· <del> (if it contains only phrasing content)

· <dfn>

· <em>

· <embed>

· <i>

· <iframe>

· <img>

· <input>

· <ins> (if it contains only phrasing content)

· <kbd>

· <keygen>

· <label>

· <map> (if it contains only phrasing content)

· <mark>

· <math>

· <meter>

· <noscript>

· <object>

· <output>

· <progress>

· <q>

· <ruby>

· <s>

· <samp>

· <script>

· <select>

· <small>

· <span>

· <strong>

· <sub>

· <sup>

· <svg>

· <textarea>

· <u>

· <var>

· <video>

· <wbr>

· Text (plain text, which is not an element)

Generally, the specification states that elements that allow phrasing content need to have some kind of embedded content, or what is known as interelement whitespace. The term “interelement whitespace” sounds like it belongs in quantum physics or string theory, but all it means is “empty space” or “empty text nodes.”

In general, what is classed by the HTML5 spec as phrasing content should contain only other phrasing content. This defines largely how valid content can be created.

Note You might have noticed that some of the phrasing content is also flow content (quite a lot are actually). This overlap should make validation errors a little easier to avoid as you mix and match.

Embedded Content

This is content that is, well, embedded or that imports another resource into the document:

· <audio>

· <canvas>

· <embed>

· <iframe>

· <img>

· <math>

· <object>

· <svg>

· <video>

Some of these elements are designed to have fallback content—content that is a functional replacement or equivalent when any of the preceding elements aren’t supported by the browser or other user agent, such as a screen reader. The accessibility, at the time of this writing, of <canvas>, for example, means that you really have to consider how users of screen readers will experience your content. (Note that <canvas> is also currently problematic for screen-magnification users because there is currently no way to expose where the focus of the canvas is to a screen magnifier.)

The following snippet illustrates fallback content for the <canvas> element:

<html>
<canvas id="Groovy_anim_with_fallback" width="300" height="150">
<p>Some fallback instruction for the user, or links to other more accessible resources</p>
</canvas>
</html>

An older browser or user agent will ignore the <canvas> because it won’t understand it; however, it will be able to parse the markup that is contained within. User agents that do understand the <canvas> element will just render that and ignore the embedded content.

Note You can wrap your <canvas> drawing methods in a conditional check, such as an if statement that tests whether <canvas> is supported. If it is not supported, an image can be shown instead.
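A minimal sketch of such a check follows. It tests for the getContext method before drawing anything; the rectangle that gets drawn is just a placeholder for your real drawing code.

```html
<canvas id="Groovy_anim_with_fallback" width="300" height="150">
  <p>Some fallback instruction for the user</p>
</canvas>

<script>
  var canvas = document.getElementById("Groovy_anim_with_fallback");

  // getContext exists only where <canvas> is supported.
  if (canvas.getContext) {
    var ctx = canvas.getContext("2d");
    ctx.fillStyle = "#C90";
    ctx.fillRect(10, 10, 100, 50); // placeholder drawing
  }
  // Otherwise nothing is drawn, and the browser renders the
  // fallback content inside the <canvas> element instead.
</script>
```

The fallback content itself can be anything useful, including an image, as the next example shows.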

<html>
<canvas id="Groovy_anim_with_fallback" width="300" height="150">
<h1>Oops, your browser won’t show our groovy Canvas Animation</h1>
<img src="myserver/usful_andgroovy_image.png" alt="Visit the ‘overview’ section of the website, which has more accessible content">
</canvas>
</html>

In the preceding example, rather than describe a groovy graphic to a screen-reader user, I gave the image alternate text that provides some kind of useful instruction. This might not be totally valid HTML, but it provides something useful, which to me is more important.

If the image is purely decorative and the embedded content heading is sufficient to inform the user of what she can do to access a more accessible version of the content, or whether it’s just a heads-up that she can ignore it, then giving the image a null alt value (alt="") will result in the image being treated as presentational and ignored by most screen readers.

You can also give the canvas the ARIA role of presentation, as shown here:

<html>
<canvas id="Groovy_anim_with_fallback" width="300" height="150" role="presentation">
<h1>Oops, your browser won’t show our groovy Canvas Animation</h1>
<img src="myserver/usful_andgroovy_image.png" alt="Visit the ‘overview’ section of the website, which has more accessible content">
</canvas>
</html>

Using role="presentation" should hide the canvas from the newer screen readers that support ARIA.

We will discuss <canvas> in a later chapter. How fallback content should be handled is supposed to be outlined in an element’s definition. However, I feel this is underspecified in terms of what the fallback should be. Ideally, fallback content is a functional replacement or sometimes just a heads-up for a user that something just might not work. In reality, fallback implementations can also be disruptive to the user experience. No one wants to go back to the days of “you don’t have a groovy browser—how dare you!” type of messages, but without a clear idea of what the fallback should be, what is the developer to do? The spec is clear enough that <canvas> content, for example, should not be used where there is a more suitable HTML5 design pattern that has inherent semantics. However, we all know the world doesn’t work like that and that the street finds its own uses for things.

In an ideal world, something like <canvas> would be ready for prime time, with a fully accessible architecture—it ain’t.

Note What <canvas> is ready for is mostly visual users, but it will still present challenges for screen-magnification users because there is currently no way for the screen magnifier to follow critical changes in <canvas> content.

Interactive Content

This is content that is for user interaction:

· <a>

· <audio> (if the controls attribute is present)

· <button>

· <details>

· <embed>

· <iframe>

· <img> (if the usemap attribute is present)

· <input> (if the type attribute is not in the hidden state)

· <keygen>

· <label>

· <menu> (if the type attribute is in the toolbar state)

· <object> (if the usemap attribute is present)

· <select>

· <textarea>

· <video> (if the controls attribute is present)

The preceding list of HTML elements represents the framework for building established user-interaction design patterns. The controls have inherent activation behaviors that fire particular events.

Note The spec also outlines that flow content and phrasing content should contain palpable content. This really means “stuff that the end user can perceive and isn’t hidden.”

Paragraphs

The <p> element is one that will be very familiar to you. It represents a paragraph, and its content model is phrasing content. A paragraph is “a block of text with one or more sentences that discuss a particular topic.”

Some older elements such as <ins> and <del> are still a part of HTML5. You might not have used them much, but they are useful for showing content that has been inserted into a paragraph or deleted from a paragraph, respectively. The following example details their use:

<section>
<h1>Example of paragraphs using <code>ins</code> and <code>del</code></h1>
<p>This is a chunk of content that was <del>deleted</del> <ins>and then updated</ins>.</p>
<p>This is another paragraph where nothing was inserted or deleted.</p>
</section>

What is new is how the <p> element can be used with some of the new HTML5 sectioning elements, such as <aside>, <section>, and so on.

HTML Document Metadata

HTML5 has the following elements that provide document metadata, usually in the head of the HTML file.

The <head> Element

This provides some simple metadata for the document. It can take the following general form:

<!DOCTYPE HTML>
<html>
<head>
<title>What kind of page am I?</title>
</head>
<body>

OK, so far so good. Nothing really new there. The following element is really important for screen-reader accessibility, so pay attention!

The <title> Element

There is nothing new about the <title> element in HTML5, and its purpose is to reveal the document’s title or name to the user agent. The spec advises this:

“Authors should use titles that identify their documents even when they are used out of context, for example in a user's history or bookmarks, or in search results.”²

This is really good advice and not just because of its potential use as a way to identify the contents of a page in your bookmarks menu (though that is great). It’s also because the <title> element is the first item read by a screen reader when an HTML document loads. This makes it a very important piece of information to help a screen-reader user know where they are within a web site.

You can see from Figure 5-2 how the <title> element shows the user the identity of the site.

__________

² http://dev.w3.org/html5/spec/Overview.html#the-title-element

Figure 5-2. Use of the Animal Sanctuary <title> element

Some More Elemental Cleverness

Clever use of the <title> element can help to guide users through a process, like buying something using an online shopping function. For example, say the entire buying process has three or four stages. In this case, you should use the <title> element to reinforce to the user what stage he is in at any given time. You can do this with prompts such as “Select your product,” “Enter a Shipping Address,” and “Enter your Credit Card Information.”

For each of the pages, the <head> might look something like the code example in Listing 5-2.

Listing 5-2. Using the <title> Element

Stage 1:
<!DOCTYPE HTML>
<html>
<head>
<title>Select your product</title>
</head>
<body>

Stage 2:
<!DOCTYPE HTML>
<html>
<head>
<title>Enter a Shipping Address</title>
</head>
<body>

Stage 3:
<!DOCTYPE HTML>
<html>
<head>
<title>Enter your Credit Card Information</title>
</head>
<body>

The page content that follows supports the user in each of the steps, and it obviously allows the user to complete the transaction. So the <title> element is really useful for all users, because it appears in the title bar of your browser as well as in the tab name (in Safari). See Figure 5-3.

Figure 5-3. How the <title> element is displayed in Safari

The <base> Element

This is an element used to define what’s called a document base URL. “Why would you want to do this?” I hear you ask. “Are URLs not defined relative to the HTML document root or index.html file?” Well, good question, and yes they are. You might want to provide more of a context for the current document, for better SEO perhaps, if you have a subsection of a web site that contains collections of related documents. It is a void element, so it doesn’t take any kind of content, and it sits in the head of the document. Also, in HTML5 you can add a target attribute to the <base> element, which the strict flavor of HTML 4.01 did not allow. Listing 5-3 details use of the <base> element.

Listing 5-3. Use of the <base> Element

<!DOCTYPE HTML>
<html>
<head>
<title>&lt;base&gt; element sample</title>
<base href="http://www.mysiteandsubsite.com/repository/index.html" />
</head>
<body>
<p>Visit the <a href="collectionoflinks.html">collection of interesting
documents</a>.</p>
<p>Visit the <a href="collectionofworddocuments.html">collection of
interesting Word documents</a>.</p>
<p>Visit the <a href="collectionofimages.html">collection of interesting
images</a>.</p>
</body>
</html> ³

The <link> Element

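The <link> element is that useful piece of metadata in the <head> that points to the external resources your document relies on, most commonly your CSS. (Scripts are pulled in with the <script> element instead.) It is also used to indicate other related resources, such as icons or alternate versions of the page. A <link> element must have a rel attribute, which names the relationship between the document and the linked resource.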
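
A minimal sketch of a <link> element carrying a rel attribute might look like this (the file name main.css is only an assumed example):

```html
<head>
<title>Linking a style sheet</title>
<!-- rel names the relationship; href points to the linked resource -->
<link rel="stylesheet" href="main.css" />
</head>
```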

Note On a <link> element, the rel attribute describes a relationship that applies to the whole of the document. However, when rel is used on an <a> or <area> element, it describes a link whose context is given by its location within the document.

There are various content attributes associated with the <link> element:

· href

· rel

· media

· hreflang

· type

· sizes

The title attribute is also a member of this list. The href attribute is one you are familiar with, and we just talked about the rel attribute. The icon keyword, used as a value of rel on a <link> element, creates an external resource link to an icon. The icons can be auditory icons, visual icons, and so on. The sizes attribute gives the sizes of visual icons. The hreflang attribute can be used to provide the language of a linked resource.
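
To make a few of these attributes concrete, here is a hedged sketch. The file names favicon-32.png and index-fr.html are hypothetical, as is the title text; only the attribute names and keywords come from the discussion above.

```html
<head>
<title>Icons and language hints</title>
<!-- An icon link; sizes says the icon is 32 by 32 pixels, type gives its MIME type -->
<link rel="icon" href="favicon-32.png" sizes="32x32" type="image/png" />
<!-- A French version of this page; hreflang gives the language of the linked resource -->
<link rel="alternate" href="index-fr.html" hreflang="fr" title="Version française" />
</head>
```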

__________

³ All of these links would resolve relative to “http://www.mysiteandsubsite.com/repository/”.

Note If you want a screen reader to pick up on a change of language within a document, you are better off using the lang attribute. In general, the language of the document can be provided by lang="en" for English, lang="fr" for French, or lang="de" for German. I found the example in Listing 5-4 on Roger Johansson’s web site (the excellent www.456bereastreet.com), and he got it from Wikipedia. The listing uses lang attributes and attaches them to <div> elements. When a screen reader comes across these attributes, it switches synthesis modules in order to output the language correctly.

Listing 5-4. Using the Lang Attribute

<div lang="sv">
<h2>Svenska</h2>
<p>Välkommen till Wikipedia, den fria encyklopedin som alla kan redigera.</p>
</div>

<div lang="de">
<h2>Deutsch</h2>
<p>Wikipedia ist ein Projekt zum Aufbau einer Enzyklopädie aus freien Inhalten in allen
Sprachen der Welt.</p>
</div>

<div lang="fr">
<h2>Français</h2>
<p>Bienvenue sur Wikipédia, le projet d’encyclopédie libre que vous pouvez améliorer.</p>
</div>

<div lang="es">
<h2>Español</h2>
<p>Bienvenidos a Wikipedia, la enciclopedia de contenido libre que todos pueden
editar.</p>
</div>

The heads-up that the lang attribute gives to the screen reader enables it to shift into an appropriate mode and output the language in a way that suits its natural prosody and so on, which certainly helps the intelligibility of the speech. It also means the screen reader doesn’t talk like a tourist.

The media attribute is usually left out, meaning that the linked resource applies to all types of media. The type attribute is important because it can be used to indicate the MIME or content type of the linked resource, whether that is a style sheet, rich media, or an HTML document. It acts as a hint that helps the browser decide whether and how to handle the content. The title attribute gives extra advisory info to the user agent, except on style sheet links, where it names the style sheet and is the way alternative style sheets are defined.
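
As an illustrative sketch of the media and type attributes, assuming two hypothetical style sheets named main.css and print.css:

```html
<head>
<title>Media-specific style sheets</title>
<!-- No media attribute, so this style sheet applies to all media -->
<link rel="stylesheet" href="main.css" type="text/css" />
<!-- Applies only when the page is printed -->
<link rel="stylesheet" href="print.css" type="text/css" media="print" />
</head>
```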

To define alternative style sheets, you might code something like what’s shown in Listing 5-5.

Listing 5-5. Use of Alternate Style Sheets

<!-- if you need to define a persistent style sheet -->
<link rel="stylesheet" href="main.css" />

<!-- if you have a preferred style sheet -->
<link rel="stylesheet" href="main_pref.css" title="More accessible styles" />

<!-- some alternate style sheets -->
<link rel="alternate stylesheet" href="b_w_y.css" title="Black White and Yellow Layout" />
<link rel="alternate stylesheet" href="large.css" title="16 Point Layout" />
<link rel="alternate stylesheet" href="fluid.css" title="Fluid Layout" />

The <meta> Element

The <meta> element, shown in Listing 5-6, is used in the <head> of the HTML document to express metadata that can’t be expressed using the previously mentioned head elements. It has name, http-equiv, content, and charset attributes. It can be used to specify a document’s character encoding, to specify an application name (if the webpage represents a particular application), and so on. Originally, the description and keywords were designed to be served to search engines, but they have been largely abused by authors over the years. However, in principle they should probably still be used, even if your favorite search engine’s page-ranking algorithm chooses to ignore them. There are other technologies that might find them useful, such as large CMS-powered web sites or Semantic Web applications.

Listing 5-6. Example of Meta Content

<!DOCTYPE HTML>
<html>
<head>
<meta charset="UTF-8" />
<meta name="description" content="This site outlines how the smartest guys in the room are
amoral strawmen who would sell their grandmothers into slavery to enhance their profit and
prestige. They may also gain tenure at Columbia for doing so." />
<meta name="keywords" content="Money, Gold, Economics, Greed, Folly, House of Cards, Pyramid
Scam, Ponzi Scheme" />
<title>A website about Economics</title>
</head>
<body>
[…]

Note Maybe ignore the last few keywords…at your own peril! (lol)

More examples of <meta> usage can be found at http://dev.w3.org/html5/spec-author-view/the-meta-element.html#the-meta-element.
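
As a further hedged sketch, the name and http-equiv attributes might be used along the following lines. The application name is an assumption for illustration, and the default-style value simply reuses the style sheet title from Listing 5-5.

```html
<head>
<meta charset="UTF-8" />
<!-- name/content pair: the name of the web application this page represents -->
<meta name="application-name" content="Groovy Times" />
<!-- http-equiv pragma: names the style sheet set the page prefers by default -->
<meta http-equiv="default-style" content="More accessible styles" />
<title>More meta examples</title>
</head>
```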

The <style> Element and scoped Attribute

The <style> element is nothing new in HTML5, but the scoped attribute is. The scoped attribute allows you to embed styles that apply to just one section of your webpage, so it lets you define the range of elements that the styles are applied to. If the scoped attribute is present, the CSS declarations affect only the parent element of that <style> element and the parent’s descendants. This could be useful for content syndication in order to maintain a particular brand or design style of an article.

Existing document styles can be defined as usual within the <head> of the document, as shown in Listing 5-7 (an embedded style sheet is used here for illustration).

Listing 5-7. Sample CSS

<!DOCTYPE HTML>
<html>
<head>
<meta charset="UTF-8" />

<!-- document css styles declared in the normal way -->

<style type="text/css">

h1{
color: #ff0;
background-color: #999;
border-radius: 15px;
padding: 15px;
}

p{
color: #fff;
background-color: #999;
border-radius: 15px;
padding: 15px;
}

</style>
</head>

<body>
<article>
<h1>Really interesting news article that you’ll soon find in other online Papers</h1>
<p>Here is some really interesting news that just has to go around the world</p>
</article>

</body>
</html>

These document-wide styles can then be combined with scoped styles. Say you had a page with several articles, the first article is syndicated content, and you want its styles to be preserved: you can place a <style> element with the scoped attribute inside that article. (See Listing 5-8.)

Listing 5-8. Sample CSS and Use of the Scoped Attribute

<!DOCTYPE HTML>
<html>
<head>
<meta charset="UTF-8" />



<style type="text/css">
h1{
color:#ff0;
background-color:#999;
border-radius: 15px;
padding: 15px;
}

p{
color:#fff;
background-color:#999;
border-radius: 15px;
padding: 15px;
}

</style>
</head>

<body>
<article>
<style scoped>

h1{
color: #ff0;
background-color: #999;
border-radius: 15px;
padding: 15px;
}

p{
color: #fff;
background-color: #999;
border-radius: 15px;
padding: 15px;
}

</style>

<h1>Really interesting document that contains nicely styled news articles that you'll
find in other online Papers</h1>
<p>Here is some really interesting news that just has to go around the world</p>
</article>

<article>
<h2>Second really interesting article that just has to go around the world, that you'll
also find in other online Papers</h2>
<p>Here is some really interesting news that just has to go around the world</p>
</article>

<article>
<h2>Third really interesting article that just has to go around the world, that you'll
also find in other online Papers</h2>
<p>Here is some really interesting news that just has to go around the world</p>
</article>

</body>
</html>

Note If all the articles were to be syndicated, you could add a scoped <style> element inside each of the preceding articles with whatever CSS declarations were appropriate.

New HTML5 Sectioning Elements

You met some of the new HTML5 sectioning elements in some previous examples, such as when the article element was used. These new sectioning elements are just that, a way of providing useful semantics for describing frequently used parts of a web document.

· body

· section

· nav

· article

· aside

· h1 to h6

· hgroup

· header

· footer

· address

You saw the new <hgroup> element earlier, and you are familiar with the <body>, <h1>, <h2>, <h3>, <h4>, <h5>, and <h6> elements.

Before we look at some of the newer section elements in more detail, we’ll take another look at headings—what they are and how to use them to make your web content more accessible.

Don’t just increase the font size and bold it! Headings in HTML have a special significance, particularly for people using assistive technology (AT). There is more going on than just the visual presentation of the heading. So it isn’t enough for you to increase the font size or just change the typeface to convey structure. Yes, for a sighted person this might be enough, but semantics are more than just skin deep. To make your content more accessible, the use of heading elements is vital. So here is a brief “Headings 101” tutorial.

A sighted user can pretty much just glance at a page and quickly be able to visually distinguish what the various section headings are. This ability, in turn, allows the reader to quickly internalize and grasp the document structure. The sighted person can also tell where each section of the page’s content begins and ends, and what section a group of paragraphs actually belongs in. She can then quickly make some judgment call on what parts of the document are of interest to her and what parts she can ignore. She can then visually jump to read the sections of interest.

Note While the use of HTML is largely about structuring content to let users access it, it is just as important for the end user to sometimes ignore and bypass content or functionality that isn’t of interest.

I did mention in an earlier chapter how a screen-reader user might navigate web content using only headings, but it’s important and worth going over again. Screen-reader users often browse a webpage by using their AT to pick different types of content, depending on what’s available in the page. It might be links or, in this case, headings. This can be a little difficult to understand if you have never seen a screen reader being used.

Note There are many online demos of screen readers in action. I recommend you head over to YouTube after you’ve read this chapter and have a look.

This is how it works.

The screen-reader user uses a particular keystroke combination for whatever application he is using. JAWS, for example, uses Insert + F6 to bring up a list of headings, or the user can press the H key to jump to the next heading. For Window-Eyes, the user presses the number keys and H. With a headings list, a dialog window appears that contains all of the headings available to it from the document. If he is using VoiceOver, the user can open up the rotor and choose headings to navigate with. When he swipes forward or back, the screen reader jumps to the next or previous heading in the document. The screen reader will also often tell the user how many headings there are.

The screen-reader user can quickly bounce through the document, getting an overview of it and then choosing the heading that sounds like his topic of interest. Good descriptive headings are therefore really useful.

As mentioned, this is vital as an aid to comprehension and navigation for a nonsighted person. Why? Because a sighted user can very quickly visually scan a long or complex document and see what parts are of interest to them. Being able to visually scan acts as a natural navigation mechanism allowing the user to jump over unrelated or uninteresting content. A blind user cannot do this. So that person’s screen reader provides the navigational mechanism.

If there are no structured HTML headings in the document at all, the blind user must go through line after line of content until she gets what she wants. If the document is very long, having to do this can be very annoying and tedious, and it can get old pretty fast.

Note As previously mentioned, with any project you need to think about its information architecture and how it is structured. However, it isn’t so important that the headings are even strictly in the right semantic order (though it’s advisable to structure them that way). What is more important is that they are actually there in the first place, because they are vital for accessible navigation, as well as overall document structure.

Screen readers are linear output devices. This means they output the items that have focus, as speech, one at a time. So you do have to consider things like source order for elements within the DOM, the order in which headings appear, and the quality of the descriptive text you use in your headings. Make it useful.

A Quick Recap on How and Why to Use Heading Elements

Note There are lots of ways of doing the same thing with HTML. My advice on headings that follows is just that, my advice. There might be better ways of doing things, and the spec might also differ from me on some matters, but as far as the latter goes, this wouldn’t be the first time.

HTML5 headings still have six levels of semantic importance, so there’s not much different from the way things were done in earlier versions of the markup language. Table 5-3 is an example of how these various headings might be applied to a webpage about musical instruments—in this case guitars—and it would be equally applicable to HTML4 or earlier.

Table 5-4 gives an example of a possible way to use headings in a web document.

In the Guitars overview, I didn’t really use the lower level headings, but then I felt I didn’t need to. For example, if you have a very long article with detailed subheadings that are buried deep down within it, you might need to call up <h4> and <h5> or <h6> and use them as required.

As in the preceding example, when those subheadings are finished and you want to mark up the start of a new section, you can again start with your <h2> headings. In the preceding example, note that I bolded the text of the <h2> headings to show you where the higher-level headings come back into use after a chunk of related content is finished.

Note It’s not advisable to bounce from headings too dramatically for no good reason, because they are used to show how content relates to other content. So try to follow a logical order within a document. However, it is true that HTML5 allows blocks of content to be structured in such a way that it can seem modular and get plugged into different webpages a la syndication.

Some general HTML5 headings rules (mine) would be

· There should only ever be one <h1> (think of Highlander).

· The rest of the headings can be used as often as needed to provide a document outline.

· Headings should ideally follow a logical order.
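
As a rough sketch of these rules in practice, a page about guitars might be outlined as follows. The specific heading text is assumed for illustration only.

```html
<body>
<!-- The one and only h1 -->
<h1>Guitars</h1>

<h2>Electric guitars</h2>
<h3>Solid body</h3>
<h3>Hollow body</h3>

<!-- A new top-level section starts again at h2 -->
<h2>Acoustic guitars</h2>
<h3>Nylon string</h3>
<h3>Steel string</h3>
</body>
```

Each level steps down by one, and nothing jumps from an <h2> straight to an <h4>, so a screen-reader user browsing by heading gets a sensible overview.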

As I mentioned earlier, from an accessibility perspective, don’t worry too much about whether the page structure is absolutely correct all the time. In my experience, most screen-reader users will be happy that the headings are there at all and will not quibble about the order.

Note If you get the “Highlander“ reference, I’m showing my age. If not, go see the movie. You’ll like it, I promise.

Meet the New divs on the Block

OK, so that was the old school, and that was about as far as it went in terms of being able to describe content or sections of content. You are used to wrapping all of your blocks of web content (I nearly said “block level elements…”) in <div> elements and then just getting on with marking up your content. Well, that’s going to change somewhat because the new HTML5 elements provide a new semantic scaffold. You will still mark up your content in much the same way and provide a much-needed accessibility structure for your documents, as outlined previously. Doing this the old-school way will help to provide a level of backward compatibility with older browsers and AT.

HTML5 provides these new sectioning elements as a way of providing a markup structure for the layout of your web content. This is new, and it makes great sense. A common page format would be something like the following Groovy Times site that we looked at earlier.

Let’s remind ourselves, visually. See Figure 5-4.

Figure 5-4. The glorious Groovy Times web site

And we can say that has the general semantic layout we looked at earlier, which goes along the lines you see in Figure 5-5.

Figure 5-5. Semantic grooviness

OK, so the eagle-eyed among you—and I am sure that there are many—will notice that I changed the semantic outline that you saw the last time you looked at Groovy Times. There was a <banner> at the top. I removed this because that is an ARIA role. What is outlined is a general document overview of how you might structure a generic page by adding these ids to the <div> element.

So a webpage pre-HTML5 might have had a structure like the one shown in Listing 5-9.

Listing 5-9. Pre-HTML5 Structure

The […] represents any content and so on, and I’ll leave it to your imagination as to how you would use CSS to style it.
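
As a rough sketch of the kind of div-based structure being described, assuming id values such as header, nav, content, and footer (hypothetical names chosen for illustration), the body of such a page might be laid out like this:

```html
<body>
<!-- Each div is a generic container; only the id hints at its purpose -->
<div id="header">[…]</div>
<div id="nav">[…]</div>
<div id="content">[…]</div>
<div id="footer">[…]</div>
</body>
```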

Now, in HTML5, you no longer have to reach for the semantically useless <div> all the time, giving each one a common-sense id just so you can remember what hooks to use in your CSS declarations. Instead, you can flip the code around and use elements that actually describe a section’s role or purpose. So we have something like that shown in Listing 5-10.

Listing 5-10. Groovy Times HTML Example

<!DOCTYPE HTML>
<html>
<head>
<meta charset="UTF-8" />
<title> Groovy HTML5 outline example</title>
</head>
<body>

<header>
<h1> Groovy Times</h1>
</header>

<nav>
<ul>
<li>About us</li>
<li>Services</li>
<li>Contact</li>
<li>Location</li>
<li>Why Groovy?</li>
</ul>
</nav>

<div id="main-content">
<h2>Not feeling that groovy right now?</h2>
<p>Be optimistic: in the 1970s, […] </p>
</div>

<footer>
[…]
</footer>
</body>
</html>

Note There is no content or main content element in HTML5. So you can see in the preceding example how I just used a regular <div> and gave it an id name that described it.

The preceding example is a very basic one that uses some of the new sectioning elements to provide a semantic outline of the document. In practice, your webpages will have a lot more content. This is where the likes of section, article, and aside come in.

Note There is a peculiar bug in earlier versions of JAWS 10/11, where it doesn’t like the <header> element and using it can be problematic (it basically won’t work with very early versions of Firefox, such as Firefox 4 and earlier). Terrill Thompson wrote about it on his blog at http://terrillthompson.com/blog/37. It affects only JAWS and Firefox, not VoiceOver, NVDA, Window-Eyes, and others, and it seems to be related to the use of head in <header>.

Getting Sectioned

The <section> element is just that—an element for sections of content. It represents the section of a document that is around a general theme and often comes with a specific heading. A common use would be a chapter of a book, but online you probably won’t be writing that much! It will more likely be used to divide page content into related chunks, like an Introduction, followed by some background info on the topic and so on.

It is not to be thought of as “here is a section I want to style visually, so I’ll mark it up as a section.” If you want to do that, just use a <div>. You will probably use all of these elements in conjunction with generic divs for visual styling and presentation. However, elements like <section> are to be used for structuring your webpage content, as shown in Listing 5-11.

Listing 5-11. Structuring Your HTML5 Page

<article>
<hgroup>
<h1>The Guitar Gallery</h1>
<h2>Lots of groovy guitars</h2>
</hgroup>

<section>
<h2> Fenders</h2>
<p>Are you a Fender guy or a Gibson gal? Well if it’s good enough for Jimi, it’s good enough
for me!</p>
<p>[…]</p>
<h3> The first Fender Guitars</h3>
<p>[…]</p>

</section>

<section>
<h2>Gibson</h2>
<p>I want an SG but don’t want to take out a mortgage, Dear Anne.. got a problem</p>
<p> More about my feelings of deprivation due to lack of antique Gibson guitars[…]</p>
</section>

<section>
<h2>Acoustic Dreams</h2>
<p>For the softer moments we have nylon acoustic guitars</p>
<p> Well, I really like John Fahey and Leo Kottke, […]</p>
<h3>What kind of guitar did Robbie Basho play??</h3>
</section>
</article>

Note The <p> […] </p> bits just represent where you would put your content.

In this example, we have several sections that are about different kinds of guitars. I hope you can see that it’s pretty straightforward using it to define related, well, sections! You can see that they are wrapped up in an <article> element, which we will look at now.

Self-Contained Article

The new <article> element is a more self-contained, independent type creature. It is used to outline a self-contained composition that can be spread around the net if required, and it won’t need to pack a spare pair of pajamas because it has everything it needs already.

Note The <article> element is not just for a blog entry or comment. It can also represent a widget, or any other self-contained composition.

So if a blog post was added to our Guitar Gallery example using the <article> element, we might find it looking something like Listing 5-12.

Listing 5-12. Using <article> to Add a Blog Post

<article id="MyGuitarBlog">
<header>
<h1>Your first high end Guitar</h1>
<p> Taking the plunge, and why you should do it </p>
</header>
<p>Buying your first really high end axe is up there with moving house, doing a driving test or getting married in terms of being really important but the good news is it’s nothing like as stressful as any of those are!</p>
<p>[…] </p>
<footer>
<p>Some comments from interested humans </p>
</footer>
</article>

If some comments were actually added, they might look like Listing 5-13.

Listing 5-13. Comments in Blog Post Wrapped in <article> Elements

<article id="MyGuitarBlog">
<header>
<h1>Your first high end Guitar</h1>
<p> Taking the plunge, and why you should do it </p>
</header>
<p>Buying your first really high end axe is up there with moving house, doing a driving test
or getting married in terms of being really important but the good news is it’s nothing like
as stressful as any of those are!</p>
<p>[…] </p>

<section>
<h1>My Guitar Blog Comments</h1>
<article id="comment_1">
<footer>
<p>Comment by: <span class="name">Not George Harrison</span>
(<a href="#comment_1">link to this comment</a>)
</p>
</footer>
<p>My guitar doesn’t Gently Weep anymore.. can anyone help?</p>
</article>

<article id="comment_2">
<footer>
<p>Comment by: <span class="name">Jimi Jimi</span>
(<a href="#comment_2">link to this comment</a>)
</p>
</footer>
<p>I don’t know George, but I just hurt my Little Wing, when I was walking along the
Watchtower, these days everything is just a haze, a kinda purpley one weird</p>
</article>

<article id="comment_3">
<footer>
<p>Comment by: <span class="name">Mini JPage</span>
(<a href="#comment_3">link to this comment</a>)
</p>
</footer>
<p>Sounds weird Not George Harrison, does the song not remain the same?</p>
</article>
</section>
</article>

The idea is that they are kind of self-contained so that they can be referenced or syndicated elsewhere if needed. You can also think of an article as a small unit.

The Sectioning Bug

There is a sectioning bug that was outlined very well by Jason Kiss of Accessible Culture. If you use nested headings in HTML5 sectioning elements, as opposed to using exclusively <h1> elements, the hierarchy might be misrepresented to the user for browsers and AT that don’t recognize the outline algorithm. Thanks to Jason for letting me illustrate the bug here.

First, HTML5 defines what’s called an outline algorithm that allows you to nest multiple <h1> headings in sectioning content (to support content syndication, for example). Usually, however, there can only be one (remember Highlander!), right?

Note You can read more about the outline algorithm at http://dev.w3.org/html5/spec/sections.html#outlines.

So according to this algorithm, Listing 5-14 and Listing 5-15 are practically identical. The first uses only <h1> heading elements.

Listing 5-14. HTML5 Example Outlining Headers Algorithm

<h1>Level 1</h1>
<nav>
<h1>Level 2</h1>
</nav>
<section>
<h1>Level 2</h1>
<article>
<h1>Level 3</h1>
<aside>
<h1>Level 4</h1>
</aside>
</article>
</section>

The second uses an explicit semantic to show the different heading levels.

Listing 5-15. HTML5 Example Outlining Headers Algorithm with Explicit Semantics

<h1>Level 1</h1>
<nav>
<h2>Level 2</h2>
</nav>
<section>
<h2>Level 2</h2>
<article>
<h3>Level 3</h3>
<aside>
<h4>Level 4</h4>
</aside>
</article>
</section>

At the time of this writing, support for the new HTML5 outline algorithm isn’t great. So a user agent that doesn’t support the algorithm will output the content as multiple <h1> elements. Depending on the context of use, this might mean you are better off sticking with the second example, if content isn’t to be syndicated. If your page does only use <h1> elements, while it won’t really give the user an overview of the correct hierarchical structure, it’s not a show-stopper and won’t result in totally inaccessible content. Having said that, it’s not ideal. However, if content is to be syndicated, use method one, because every nested article might likely end up in another site; if content is more static or to be consumed on page, stick with the second method for now.

Note For more on the Sectioning Bug, visit AccessibleCulture.org at www.accessibleculture.org/articles/2011/10/jaws-ie-and-headings-in-html5.

As an Aside, Did You Hear the One About the Vicar and the […]

HTML5 also has a new <aside> element. It is also a sectioning element and is used to provide content that is tangentially related to the main article or a parent section. You might be wondering what to do with it? This is natural. An aside is normal in human speech and interaction, but it might be unusual in a markup language. There are some things that I think it could be good for, such as marking up a pull quote, which could be used on a web site where there were client testimonials. For example, take a look at Listing 5-16.

Listing 5-16. HTML5 <aside> Element Example

<p>Some of our services include taking animals into our shelter that are mistreated and
abused. Many people also come and visit us to look for an animal to care for, as this
testimonial from a happy family who brought home a puppy shows.</p>

<aside>
<q> We are so glad we visited the Animal Sanctuary, and brought home little Puddles, he is
such a good puppy! </q>
</aside>

<p>Without the help and support of people like that we would never be able to do what we
do</p>

Other uses for it include marking up content from other sources, such as Twitter feeds or Facebook updates.
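
A hedged sketch of that second kind of use might look like the following, where the heading text and the surrounding article are assumptions made for illustration and the […] again stands in for real content:

```html
<article>
<h1>Animal Sanctuary news</h1>
<p>[…]</p>

<!-- Tangentially related content pulled in from another source -->
<aside>
<h2>Latest from our Twitter feed</h2>
<ul>
<li>[…]</li>
<li>[…]</li>
</ul>
</aside>
</article>
```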

Conclusion

You had a good look at the new sectioning elements. In HTML5, many of the grouping content elements are the same as before, with some additions, and the text-level semantics are pretty similar. We’ll cover them when relevant. In the next chapter, we will take a departure and look at some of the new rich media elements, such as <video>, <audio>, and the infamous <canvas>.
