More HTML5 Semantics - HTML5 & CSS3 FOR THE REAL WORLD (2015)

Chapter 3 More HTML5 Semantics

Our sample site is coming along nicely. We’ve given it some basic structure, along the way learning more about marking up content using HTML5’s new elements.

In this chapter, we’ll discuss even more new elements, along with some changes and improvements to familiar elements. We’ll also add some headings and basic text to our project, and we’ll discuss the potential impact of HTML5 on accessibility.

Before we dive into that, though, let’s take a step back and examine a few new—and a little tricky—concepts that HTML5 brings to the table.

A New Perspective on Content Types

For layout and styling purposes, developers have become accustomed to thinking of elements in an HTML page as belonging to one of two categories: block and inline. Although elements are still rendered as either block or inline by browsers, the HTML5 spec takes the categorization of content a step further. The specification now defines a set of more granular content models. These are broad definitions about the kind of content that should be found inside a given element. Most of the time they’ll have little impact on the way you write your markup, but it’s worth having a passing familiarity with them, so let’s have a quick look:

· Metadata content: This category is what it sounds like—data that’s not present on the page itself, but affects the page’s presentation or includes other information about the page. This includes elements such as title, link, meta, and style.

· Flow content: This includes just about every element that’s used in the body of an HTML document, including elements such as header, footer, and even p. The only elements excluded are those that have no effect on the document’s flow: script, link, and meta elements in the page’s head, for example.

· Sectioning content: This is the most interesting—and for our purposes, most relevant—type of content in HTML5. In the last chapter, we often found ourselves using the generic term “section” to refer to a block of content that could contain a heading, footer, or aside. In fact, what we were actually referring to was sectioning content. In HTML5, this includes article, aside, nav, and section. Shortly, we’ll talk in more detail about sectioning content and how it can affect the way you write your markup.

· Heading content: This type of content defines the header of a given section, and includes the various levels of heading (h1, h2, and so on).

· Phrasing content: This category is roughly the equivalent to what you’re used to thinking of as inline content; it includes elements such as em, strong, cite, and the like.

· Embedded content: This one’s fairly straightforward, and includes elements that are, well, embedded into a page, such as img, object, embed, video, and canvas.

· Interactive content: This category includes any content with which users can interact. It consists mainly of form elements, as well as links and other elements that are interactive only when certain attributes are present. Two examples include the audio element when the controls attribute is present, and the input element with a type attribute set to anything but "hidden".

As you might gather from reading the list, some elements can belong to more than one category. There are also some elements that fail to fit into any category (for example, the head and html elements). Don’t worry if any of this seems confusing. The truth is, as a developer, you won't need to think about these categories in order to decide which element to use in which circumstance. More than anything, they're simply a way to encapsulate the different kinds of HTML tags available.

The Document Outline

In the previous edition of this book, we described in detail a new feature called the “document outline.” The purpose of this feature is to allow browsers to create page hierarchy by means of sectioning content elements instead of headings (h1 through to h6) as is done now; however, the spec gives the following warning regarding the document outline:

“There are currently no known implementations of the outline algorithm in graphical browsers or assistive technology user agents, although the algorithm is implemented in other software such as conformance checkers. Therefore the outline algorithm cannot be relied upon to convey document structure to users. Authors are advised to use heading rank (h1-h6) to convey document structure.”

If you'd like to research the document outline algorithm on your own, you can visit the W3C's website. But because there is no practical use for the outline algorithm as of this writing, we'll avoid delving into it here.

No More hgroup

Now that we have a solid handle on HTML5’s content types and document outlines, it’s time to dive back into The HTML5 Herald and add some headings to our newspaper's articles.

For brevity, we’ll deal with each part individually. Let’s add a title and subtitle to our header, just above the navigation:

<header>

<hgroup>

<h1>The HTML5 Herald</h1>

<h2>Produced With That Good Ol’ Timey HTML5 & CSS3</h2>

</hgroup>

<nav>

</nav>

</header>

But wait! This is the wrong markup. In fact, this is the markup we used for our title and tagline in the previous edition of this book. But things have changed.

You’ll notice we introduced three elements into our markup: the title of the website, which is marked up with the customary h1 element; a tagline immediately below the primary page title, marked up with h2; and a new element that wraps our title and tagline, the hgroup element.

The hgroup element was originally introduced in HTML5 to help prevent problems occurring in the document outline. Unfortunately, although some people liked the element, browser makers and screen readers stopped short of implementing it in any beneficial way, so it has been officially dropped from the W3C's version of the HTML5 specification.

Oddly, the WHATWG's version of the specification still includes hgroup, so you might still consider using it if you wish. In our case, we're going to favor the W3C's take on this element and refrain from using it to group our headings like we did in the previous code snippet. Instead, we'll do this:

<h1>HTML5 Herald

<span class="tagline">Produced With That Good Ol’ Timey HTML5 & CSS3</span>

</h1>

That's how the W3C now recommends you group a heading with its subheading or tagline, in the absence of hgroup. The goal is to ensure that the structure you use doesn't adversely affect the document outline.

More New Elements

In addition to the structural elements we saw in Chapter 2 and the now defunct hgroup, HTML5 includes a number of other semantic elements. Let’s examine some of the more useful ones.

The figure and figcaption Elements

The figure and figcaption elements are another pair of new HTML5 elements that contribute to the improved semantics in HTML5. The figure element is explained in the spec as follows:

The figure element can […] be used to annotate illustrations, diagrams, photos, code listings, etc. […] A figure element's contents are part of the surrounding flow.

Think of charts, graphs, images to accompany text, or example code. All those types of content might be good places to use figure and potentially figcaption.

The figcaption element is simply a way to mark up a caption for a piece of content that appears inside of a figure.

In order to use the figure element, the content being placed inside it must have some relation to the main content in which the figure appears. If you can completely remove it from a document, and the document’s content can still be fully understood, you probably shouldn’t be using figure; you might, however, need to use aside or an alternative.

Let’s look at how we’d mark up a figure inside an article:

<article>

<h1>Accessible Web Apps</h1>

<p>Lorem ipsum dolor … </p>

<p>As you can see in <a href="#fig1">Figure 1</a>, … </p>

<figure id="fig1">

<figcaption>Screen Reader Support for WAI-ARIA</figcaption>

<img src="figure1.png" alt="JAWS: Landmarks 1/1, Forms 4/5 … ">

</figure>

<p>Lorem ipsum dolor … </p>

</article>

Using figcaption is fairly straightforward. It has to be inside a figure element and it can be placed either before or after the figure content. In the example here, we've placed it before the image.

The mark Element

The mark element “represents a run of text in one document marked or highlighted for reference purposes, due to its relevance in another context.” Admittedly, there are very few uses we can think of for the mark element. The most common is in the context of a search, where the keywords that were searched for are highlighted in the results.

The spec also mentions using mark to draw attention to text inside a quote. In any case, you want to use it to indicate "a part of the document that has been highlighted due to its likely relevance to the user's current activity".

Avoid confusing mark with em or strong; those elements add contextual importance, whereas mark separates the targeted content based on a user’s current browsing or search activity.

To use the search example, if a user has arrived at an article on your site from a Google search for the word “HTML5,” you might highlight words in the article using the mark element like this:

<h1>Yes, You Can Use <mark>HTML5</mark> Today!</h1>

The mark element can be added to the document either using server-side code, or on the client side with JavaScript after the page has loaded. The search term can be derived from the URL, for example search.php?query=html5. In that case, your server-side code might grab the content of the variable in the query string, and then use mark tags to indicate where the word is found on the page.
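As a rough client-side sketch of the same idea, the following page reads the search term from the URL and wraps the first match in a mark element. The id value, the query parameter name, and the choice to highlight only the first occurrence are assumptions made for illustration; URLSearchParams is also unavailable in older browsers.

```html
<p id="intro">Yes, you can use HTML5 today!</p>

<script>
  // Assumption: the page was requested as something like article.html?query=html5
  var term = new URLSearchParams(location.search).get('query');
  var target = document.getElementById('intro');

  if (term) {
    var text = target.textContent;
    var index = text.toLowerCase().indexOf(term.toLowerCase());

    if (index !== -1) {
      // Build the mark element from text nodes so the search term
      // is never interpreted as markup.
      var mark = document.createElement('mark');
      mark.textContent = text.slice(index, index + term.length);

      target.textContent = '';
      target.appendChild(document.createTextNode(text.slice(0, index)));
      target.appendChild(mark);
      target.appendChild(document.createTextNode(text.slice(index + term.length)));
    }
  }
</script>
```

Assembling the result from text nodes, rather than concatenating strings into innerHTML, keeps whatever the user typed into the search box from being parsed as HTML.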

The progress and meter Elements

Two new elements added in HTML5 allow for marking up of data that’s being measured or gauged in some way. The difference between them is fairly subtle: progress is used to describe the current status of a changing process that’s headed for completion, regardless of whether the completion state is defined. The traditional progress bar indicating download progress is a perfect example of this.

The meter element, meanwhile, represents an element whose range is known, meaning it has definite minimum and maximum values. The spec gives the examples of disk usage, or a fraction of a voting population—both of which have a definite maximum value. Therefore, it’s likely you would avoid using meter to indicate an age, height, or weight—all of which normally have unknown maximum values.

Let’s look in more detail at progress. The progress element can have a max attribute to indicate the point at which the task will be complete, and a value attribute to indicate the task’s status. Both of these attributes are optional. Here’s an example:

<h1>Your Task is in Progress</h1>

<p>Status: <progress max="100" value="0"><span>0</span>% </progress> </p>

This element would best be used with JavaScript to dynamically change the value of the percentage as the task progresses. You’ll notice that the code includes span tags, isolating the number value; this facilitates targeting the number directly from your script when you need to update it.
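A minimal sketch of such a script might look like the following. The id value and the updateProgress function are hypothetical names introduced here; in a real page the percentage would come from whatever task you're tracking.

```html
<p>Status: <progress id="task" max="100" value="0"><span>0</span>% </progress> </p>

<script>
  // Hypothetical helper: reflect a new percentage in both the value
  // attribute and the fallback text inside the progress element.
  function updateProgress(percent) {
    var bar = document.getElementById('task');
    bar.value = percent;
    bar.querySelector('span').textContent = percent;
  }

  // Example call: the task is 40% complete.
  updateProgress(40);
</script>
```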

The meter element has six associated attributes. In addition to max and value, it also allows use of min, high, low, and optimum.

The min and max attributes reference the lower and upper boundaries of the range, while value indicates the current specified measurement. The high and low attributes indicate thresholds for what is considered “high” or “low” in the context. For example, your grade on a test can range from 0% (min) to 100% (max), but anything below 60% is considered low and anything above 85% is considered high. The optimum attribute refers to the ideal value. In the case of a test score, the value of optimum would be 100.
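The test-score scenario just described could be marked up roughly as follows. The score of 72 is an illustrative value only.

```html
<!-- Scores range from 0 to 100; below 60 is low, above 85 is high, 100 is ideal -->
<p>Your score: <meter min="0" max="100" low="60" high="85" optimum="100" value="72">72%</meter></p>
```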

Here’s an example of meter, using the premise of disk usage:

<p>Total current disk usage: <meter value="63" min="0" max="320" low="10" high="300" title="gigabytes">63 GB</meter></p>

In Figure 3.1, you can see how the meter element looks by default in Chrome and Firefox.

Figure 3.1. The meter element in Chrome and Firefox

For better accessibility, when using either progress or meter, you're encouraged to include the value as text content inside the element. So if you're using JavaScript to adjust the current state of the value attribute, you should change the text content to match.

The time Element

Dates and times are invaluable components of web pages. Search engines are able to filter results based on time, and in some cases, a specific search result can receive more or less weight by a search algorithm depending on when it was first published.

The time element has been specifically designed to deal with the problem of humans reading dates and times differently from machines. Take the following example:

<p>We'll be getting together for our next developer conference on 12 October of this year.</p>

While humans reading this paragraph would likely understand when the event will take place, it would be less clear to a machine attempting to parse the information.

Here’s the same paragraph with the time element introduced:

<p>We’ll be getting together for our next developer conference on <time datetime="2015-10-12">12 October of this year</time>.</p>

The time element also allows you to express dates and times in whichever format you like while retaining an unambiguous representation of the date and time behind the scenes, in the datetime attribute. This value could then be converted into a localized or preferred form using JavaScript, or by the browser itself (although no browsers at the time of writing support this behavior).
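One way the JavaScript conversion might work is sketched below. The id value is an assumption, and the script only handles datetime values that the Date constructor can parse (a plain date, in this case).

```html
<p>We'll be getting together for our next developer conference on
  <time id="conference" datetime="2015-10-12">12 October of this year</time>.</p>

<script>
  var timeEl = document.getElementById('conference');

  // The dateTime property reflects the element's datetime attribute.
  var date = new Date(timeEl.dateTime);

  if (!isNaN(date.getTime())) {
    // A date-only string is parsed as midnight UTC, so format in UTC
    // to avoid showing the previous day in some time zones.
    timeEl.textContent = date.toLocaleDateString(undefined, {
      year: 'numeric',
      month: 'long',
      day: 'numeric',
      timeZone: 'UTC'
    });
  }
</script>
```

Passing undefined as the locale asks the browser to use the visitor's own locale, so the same markup can render in whichever date style each reader expects.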

In earlier versions of the spec, the time element allowed use of the pubdate attribute. This was a Boolean attribute, indicating that the content within the closest ancestor article element was published on the specified date. If there was no article element, the pubdate attribute would apply to the entire document. But this attribute has been removed from the spec, even though it did seem to be useful. In his in-depth article on the time element, Aurelio De Rosa provides an alternative for the now dropped pubdate attribute, if you want to look at another method for achieving this.

The time element has some associated rules and guidelines:

· You should not use time to encode unspecified dates or times (for example, “during the ice age” or “last winter”), because the time element does not allow for ranges.

· The date represented cannot be “BC” or “BCE” (before the common era); it must be a date on the Gregorian Calendar.

· If the time element lacks a valid datetime attribute, the element’s text content (whatever appears between the opening and closing time tags) needs to be a valid datetime value.

Here's a chunk of HTML that includes many of the different ways to write a datetime value according to the spec:

<!-- month -->

<time>2015-11</time>

<!-- date -->

<time>2015-11-12</time>

<!-- yearless date -->

<time>11-12</time>

<!-- time -->

<time>14:54:39</time>

<!-- floating date and time -->

<time>2015-11-12T14:54:39</time>

<!-- time-zone offset -->

<time>-0800</time>

<!-- global date and time -->

<time>2015-11-12T06:54:39.929-0800</time>

<!-- week -->

<time>2015-W46</time>

<!-- duration -->

<time>4h 18m 3s</time>

The uses for the time element are endless: calendar events, publication dates (for blog posts, videos, press releases, and so forth), historic dates, transaction records, article or content updates, and much more.

Changes to Existing Features

While new elements and APIs have been the primary focus of HTML5, this latest iteration of web markup has also brought with it changes to existing elements. For the most part, any changes made have been done with backwards-compatibility in mind, to ensure that the markup of existing content is still usable.

We’ve already considered some of the changes (the doctype declaration, character encoding, and content types, for example). Let’s look at other significant changes introduced in the HTML5 spec.

The Word “Deprecated” is Deprecated

In previous versions of HTML and XHTML, elements that were no longer recommended for use (and so removed from the spec) were considered “deprecated.” In HTML5, there is no longer any such thing as a deprecated element; the term now used is “obsolete.”

Obsolete elements fall into two basic categories: “conforming” obsolete features and “non-conforming” obsolete features. Conforming features will produce warnings in the validator, but will still be supported by browsers. You are permitted to use them, but they're best avoided.

Non-conforming features, on the other hand, are considered fully obsolete and should not be used. They will produce errors in the validator.

The W3C has a description of these features, with examples.

Block Elements Inside Links

Although most browsers handled this situation well in the past, it was never actually valid to place a block-level element (such as a div) inside an a element. Instead, to produce valid HTML, you’d have to use multiple a elements and style the group to appear as a single block.

In HTML5, you’re now permitted to wrap almost anything in an a element without having to worry about validation errors. The only content you're unable to wrap with an a element is other interactive content, such as form elements, buttons, and other a elements.
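For instance, a whole article teaser can now sit inside a single link. The URL and content below are placeholders chosen for illustration.

```html
<!-- Valid in HTML5: block-level content wrapped in one a element -->
<a href="/articles/accessible-web-apps">
  <article>
    <h2>Accessible Web Apps</h2>
    <p>Lorem ipsum dolor …</p>
  </article>
</a>
```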

Bold Text

A few changes have been made in the way that bold text is semantically defined in HTML5. There are essentially two ways to make text bold in most browsers: by using the b element, or the strong element.

Although the b element was never deprecated, before HTML5 it was discouraged in favor of strong. The b element previously was a way of saying “make this text appear in boldface.” Since HTML is supposed to be all about the meaning of the content, leaving the presentation to CSS, this was unsatisfactory.

According to the spec, in HTML5, the b element has been redefined to represent a section of text “to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood.” Examples given are key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.

The strong element, meanwhile, still conveys more or less the same meaning. In HTML5, it represents “strong importance, seriousness, or urgency for its contents.” Interestingly, the HTML5 spec allows for nesting of strong elements. So, if an entire sentence consisted of an important warning, but certain words were of even greater significance, the sentence could be wrapped in one strong element, and each important word could be wrapped in its own nested strong.
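A nested warning of that kind might look like this. The wording of the warning is invented for the example.

```html
<!-- The whole sentence is important; the nested phrase is more important still -->
<p><strong>Warning: this cleaning product is <strong>highly flammable</strong>. Keep it away from open flames.</strong></p>
```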

Italicized Text

Along with modifications to the b and strong elements, changes have been made in the way the i element is defined in HTML5.

Previously, the i element was used to simply render italicized text. As with b, this definition was unsatisfactory. In HTML5, the definition has been updated to “a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text.” So the appearance of the text has nothing to do with the semantic meaning, although it may very well still be italic—that’s up to you.

An example of content that can be offset using i tags might be an idiomatic phrase from another language, such as reductio ad absurdum, a Latin phrase meaning “reduction to the point of absurdity.” Other examples could be text representing a dream sequence in a piece of fiction, or the scientific name of a species in a journal article.
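The foreign-phrase case could be marked up as in this small sketch. The surrounding sentence is made up, and the lang attribute is an optional addition that tells user agents the phrase is in Latin.

```html
<p>The argument amounts to a <i lang="la">reductio ad absurdum</i>.</p>
```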

The em element is unchanged, but its definition has been expanded to clarify its use. It still refers to text that’s emphasized, as would be the case colloquially. For example, the following two phrases have the exact same wording, but their meanings change because of the different use of em:

<p>Harry’s Grill is the best <em>burger</em> joint in town.</p>

<p>Harry’s Grill <em>is</em> the best burger joint in town.</p>

In the first sentence, because the word “burger” is emphasized, the meaning of the sentence focuses on the type of “joint” being discussed. In the second sentence, the emphasis is on the word “is,” thus moving the sentence focus to the question of whether Harry’s Grill really is the best of all burger joints in town.

Neither i nor em should be used to mark up a publication title; instead, you should use cite.

Of the four elements discussed here (b, i, em, and strong), the only one that gives contextual importance to its content is the strong element.

Big and Small Text

The big element was previously used to represent text displayed in a large font. The big element is now a non-conforming obsolete feature and should not be used. The small element, however, is still valid but has a different meaning.

Previously, small was intended to describe “text in a small font.” In HTML5, it represents “side comments such as small print.” Some examples where small might be used include information in footer text, fine print, and terms and conditions. The small element should only be used for short runs of text. So you wouldn't use small to mark up the body of an entire “terms of use” page.

Although the presentational implications of small have been removed from the definition, text inside small tags will more than likely still appear in a smaller font than the rest of the document.

For example, the footer of The HTML5 Herald includes a copyright notice. Since this is essentially legal fine print, it’s perfect for the small element:

<small>© SitePoint Pty. Ltd.</small>

A cite for Sore Eyes

The cite element was initially redefined in HTML5 accompanied by some controversy. In HTML4, the cite element represented “a citation or a reference to other sources.” Within the scope of that definition, the spec permitted a person’s name to be marked up with cite (in the case of a quotation attributed to an individual, for example).

The earlier versions of the HTML5 spec forbade the use of cite for a person’s name, seemingly going against the principle of backwards compatibility. Now the spec has returned to a definition closer to the original one, defining cite as “a reference to a creative work. It must include the title of the work or the name of the author (person, people, or organization) or a URL reference, or a reference in abbreviated form.”

Here's an example, taken from the spec:

<p>In the words of <cite>Charles Bukowski</cite> -

<q>An intellectual says a simple thing in a hard way. An artist says a hard thing in a simple way.</q></p>

Description (not Definition) Lists

The existing dl (definition list) element, along with its associated dt (term) and dd (description) children, has been redefined in the HTML5 spec. Previously, in addition to terms and definitions, the spec allowed the dl element to mark up dialogue, but the spec now prohibits this.

In HTML5, these lists are no longer called “definition lists”; they’re now the more generic-sounding “description lists” or “association lists.” They should be used to mark up any kind of name-value pairs, including terms and definitions, metadata topics and values, and questions and answers.

Here's an example using CSS terms and their definitions:

<dl>

<dt>Selector:</dt>

<dd>The element(s) targeted.</dd>

<dt>Property:</dt>

<dd>The feature used to add styling to the targeted element, defined before a colon.</dd>

<dt>Value:</dt>

<dd>The value given to the specified property, declared after the colon.</dd>

</dl>

Other New Elements and Features

We’ve introduced you to and expounded on some of the more practical new elements and features. In this section, let's touch on lesser-known elements, attributes, and features that have been added to the HTML5 spec.

The details Element

This new element helps mark up a part of the document that’s hidden, but can be expanded to reveal additional information. The aim of the element is to provide native support for a feature common on the Web—a collapsible box that has a title, and more info or functionality hidden away.

Normally this kind of widget is created using a combination of HTML and JavaScript. The inclusion of it in HTML5 removes the scripting requirements and simplifies its implementation for web authors, thus contributing to decreased page load times.

Here’s how it might look when marked up:

<details>

<summary>Some Magazines of Note</summary>

<ul>

<li><cite>Bird Watcher's Digest</cite></li>

<li><cite>Rower's Weekly</cite></li>

<li><cite>Fishing Monthly</cite></li>

</ul>

</details>

In the example, the contents of the summary element will appear to the user, but the rest of the content will be hidden. Upon clicking summary, the hidden content appears.

If details lacks a defined summary, the browser will define a default summary (for example, “Details”). If you want the hidden content to be visible by default, you can use the Boolean open attribute on the details element.

The summary element can be used only as a child of details, and it must be the first child if used.
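To illustrate the open attribute, here is the same widget in its expanded initial state, with summary kept as the first child. This is only a variation on the earlier markup, shortened to two list items.

```html
<!-- The open attribute makes the hidden content visible on page load -->
<details open>
  <summary>Some Magazines of Note</summary>
  <ul>
    <li><cite>Bird Watcher's Digest</cite></li>
    <li><cite>Rower's Weekly</cite></li>
  </ul>
</details>
```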

As of this writing, details lacks complete browser support (IE and Firefox don't support it), but it's improving. To fill the gaps, a couple of JavaScript-based polyfills are available, including a jQuery version by Mathias Bynens and a vanilla JavaScript version by Maksim Chemerisuk.

Customized Ordered Lists

Ordered lists, marked up using the ol element, are quite common in web pages. HTML5 introduces a new Boolean attribute called reversed; when present, it reverses the numbers on the list items, allowing you to display lists in descending order. Additionally, HTML5 has brought back the start attribute, deprecated in HTML4. The start attribute lets you specify the number at which your list should begin.
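The two attributes can be combined, as in this short sketch of the top of a countdown list. The list items are placeholders.

```html
<!-- In supporting browsers, the items are numbered 10, 9, 8 -->
<ol reversed start="10">
  <li>Tenth place</li>
  <li>Ninth place</li>
  <li>Eighth place</li>
</ol>
```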

Support is good for both reversed and start. As of this writing, Internet Explorer is the only browser without support for reverse-ordered lists. If you want a polyfill, you can use a script by one of the book's authors.

Scoped Styles

In HTML5, the style element, used for embedding styles directly in your pages (as opposed to referencing a linked stylesheet), allows use of a Boolean attribute called scoped. Take the following code example:

<h1>Page Title</h1>

<article>

<style scoped>

h1 {

color: blue;

}

</style>

<h1>Article Title</h1>

<p>Article content.</p>

</article>

Because the scoped attribute is present, the styles declared inside the style element will apply only to the parent element and its children (if cascading rules permit), instead of the entire document. This allows specific sections inside documents (such as the article element in this example) to be easily portable along with their associated styles.

This is certainly a handy new feature, but it's likely going to take some time for it to be implemented in all browsers. The only browser that currently supports scoped styles is Firefox. Chrome previously supported it, but it was removed due to “high code complexity.” And at the time of writing, the IE team has no immediate plans to add this feature.

The async Attribute for Scripts

The script element now allows the use of the async attribute, which is similar to the existing defer attribute. Using defer specifies that the browser should wait until the page’s markup is parsed before executing the script. The new async attribute allows you to specify that a script should load asynchronously. This means it should be fetched in the background and run as soon as it’s available, without causing other elements on the page to delay while it loads. Both defer and async are Boolean attributes.

These attributes must only be used when the script element defines an external file. For legacy browsers, you can include both async and defer to ensure that one or the other is used, if necessary. In practice, both attributes will have the effect of not pausing the browser’s rendering of the page while scripts are downloaded; however, async can often be more advantageous, as it will load the script in the background while other rendering tasks are taking place, and execute the script as soon as it’s available.

The async attribute is particularly useful if the script you’re loading has no other dependencies, and if it benefits the user experience for the script to be loaded as soon as possible, rather than after the page loads. It should also be noted, however, that if you have a page that loads multiple scripts, the defer attribute ensures that they're executed in the order in which they appear, while there's no guaranteeing the order with async.
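The difference might be put to use as in the following sketch. The three file names are hypothetical.

```html
<!-- No dependencies: fetched in the background and run as soon as it arrives -->
<script src="analytics.js" async></script>

<!-- app.js depends on library.js: both are fetched in the background,
     then run in document order once the markup has been parsed -->
<script src="library.js" defer></script>
<script src="app.js" defer></script>
```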

The picture Element

One of the most recent additions to the HTML5 spec is the picture element, which is intended to help with responsive web design, specifically responsive images. picture lets you define multiple image sources. This allows users on mobile browsers to download a low-res version of the image, while offering a larger version for tablets and desktops.

The picture element has its accompanying source elements (which are also used for video and audio elements, as described in Chapter 5), in addition to some new attributes such as srcset and sizes. These two attributes can be used on img and source elements.
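As a brief sketch of how these pieces fit together, the first example below offers different files at different viewport widths, and the second lets the browser choose a file by width from a single img element. All file names, breakpoints, and alt text are assumptions made for illustration.

```html
<!-- The browser uses the first source whose media query matches,
     falling back to the img element otherwise -->
<picture>
  <source media="(min-width: 800px)" srcset="masthead-large.jpg">
  <source media="(min-width: 480px)" srcset="masthead-medium.jpg">
  <img src="masthead-small.jpg" alt="The HTML5 Herald masthead">
</picture>

<!-- srcset lists candidate files with their widths; sizes describes
     how wide the image will be displayed -->
<img src="photo-480.jpg"
     srcset="photo-480.jpg 480w, photo-960.jpg 960w"
     sizes="(min-width: 800px) 50vw, 100vw"
     alt="A printing press">
```

Browsers without support for picture ignore the source elements and simply load the img, which is why the img element is required inside picture.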

For a good discussion of the way these new features are used in responsive image implementations, see this excellent article by Eric Portis on A List Apart.

Other Notables

Here are some further new HTML5 features you'll want to look at using, each with varying levels of browser support:

· The dialog element, which represents “a part of an application that a user interacts with to perform a task; for example, a dialog box, inspector, or window.”

· The download attribute for a elements, used to indicate that the targeted resource should be downloaded rather than navigated to (useful for PDFs, for example).

· The sandbox and seamless attributes for iframe elements. sandbox lets you run an external page with restrictions and the seamless attribute integrates the iframe content more closely with the parent document, adopting its styles more seamlessly.

· The menu element and its menuitem child elements, which allow you to create a list of interactive commands. For example, you can mark up an Edit menu with options for Copy, Cut, and Paste, adding scripting functionality as needed.

· The address element, which lets you mark up contact information applying to the nearest article or body element.
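Two of the attributes in this list can be sketched in a few lines. The file names and the title text are hypothetical.

```html
<!-- download: save the PDF instead of navigating to it -->
<a href="herald-issue-1.pdf" download>Download this issue (PDF)</a>

<!-- sandbox: an empty value applies every restriction; listing a token
     such as allow-scripts lifts that one restriction -->
<iframe src="widget.html" sandbox="allow-scripts" title="Embedded widget"></iframe>
```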

There are other new elements not discussed here, simply because of lack of space. Be sure to check out the specs from time to time to see if anything new has been added or changed.

The Future of Markup — Web Components?

In the last year or so, a new specification called "Web Components", initiated by engineers working on Google's Chrome browser, has gained a lot of traction in the industry and already has some significant browser support. In brief, Web Components are divided into four main sections, summarized here.

Custom Elements

Custom elements allow developers to define their own DOM elements with a custom API. These elements and their associated scripts and styling are meant to be easily portable and reusable as encapsulated components.

Shadow DOM

Shadow DOM allows you to define a sort of hidden sub-tree of DOM nodes that exists in its own namespace, inside a custom element. This encapsulates the sub-tree to prevent naming collisions, allowing the entire node tree to be portable along with the custom element.

HTML Imports

The HTML Imports feature is a way to include and reuse HTML documents inside of other HTML documents, similar to how you might use "include" files in PHP. Imports are included by means of HTML's <link> tag, which is commonly used to embed external CSS.

HTML Templates

Finally, there's the new template tag. This new element is part of an answer to a popular trend in front-end development called client-side templating. The template element itself does nothing, but it's used in conjunction with some scripting to allow predefined document fragments to be inserted into the document whenever they're needed.
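Here's a minimal sketch of how that might look in practice. The ids, class names, and the addComment() helper are all invented for this example.

```html
<!DOCTYPE html>
<title>Template sketch</title>

<!-- Nothing inside the template is rendered until a script clones it. -->
<template id="comment-template">
  <li class="comment">
    <strong class="author"></strong>: <span class="text"></span>
  </li>
</template>

<ul id="comments"></ul>

<script>
  const template = document.getElementById("comment-template");
  const list = document.getElementById("comments");

  // Hypothetical helper: clone the template, fill it in, and insert it.
  function addComment(author, text) {
    // template.content is a DocumentFragment; "true" requests a deep copy.
    const fragment = document.importNode(template.content, true);
    fragment.querySelector(".author").textContent = author;
    fragment.querySelector(".text").textContent = text;
    list.appendChild(fragment);
  }

  addComment("Alice", "Great article!");
  addComment("Bob", "Thanks for sharing.");
</script>
```

Each call stamps out a fresh copy of the same fragment, so the markup lives in one place in the HTML rather than being assembled from strings in the script.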

Many expect that Web Components — in particular, Custom Elements — are the future of web markup and scripting. But time will tell. Web Components go pretty deep; we could probably write an entire book on the topic! If you want to read more, check out the spec links referenced above or the sources listed below:

· WebComponents.org

· Polymer (A Custom Elements polyfill)

· An Introduction to Web Components and Polymer by Pankaj Parashar

· Intro to Shadow DOM by Agraj Mangal

· HTML's New Template Tag by Eric Bidelman

· An Introduction to HTML Imports by Armando Roggio

Validating HTML5 Documents

In Chapter 2, we introduced you to a number of syntax changes in HTML5, and touched on some issues related to validation. Let’s expand upon those concepts a little more so that you can better understand how validating pages has changed.

The HTML5 validator is no longer concerned with code style. You can use uppercase or lowercase, omit quotes from attributes, exclude optional closing tags, and be as inconsistent as you like, and your page will still be valid.

So, you ask, what does count as an error for the HTML5 validator? It will alert you to the incorrect use of elements, elements included where they shouldn’t be, missing required attributes, incorrect attribute values, and the like. In short, the validator will let you know if your markup conflicts with the specification, so it’s still a valuable tool when developing your pages.

To give you a good idea of how HTML5 differs from the overly strict XHTML, let’s go through some specifics. This way, you can understand what is considered valid in HTML5:

· Some elements that were required in XHTML-based syntax are no longer required for a document to pass HTML5 validation; examples include the html and body elements. This happens because even if you exclude them, the browser will automatically include them in the document for you.

· Void elements (that is, elements without a corresponding closing tag or without any content) aren't required to be closed using a closing slash; examples include meta and br.

· Elements and attributes can be in uppercase, lowercase, or mixed case.

· Quotes are unnecessary around attribute values. The exceptions are when multiple space-delimited values are used, or a URL appears as a value and contains a query string with an equals (=) character in it.

· Some attributes that were required in XHTML-based syntax are no longer required in HTML5. Examples include the type attribute for script elements, and the xmlns attribute for the html element.

· Some elements that were deprecated and thus invalid in XHTML are now valid; one example is the embed element.

· Stray text that doesn’t appear inside any element but is placed directly inside the body element would invalidate an XHTML document; this is not the case in HTML5.

· Some elements that had to be closed in XHTML can be left open without causing validation errors in HTML5; examples include p, li, and dt.

· The form element isn’t required to have an action attribute.

· Form elements, such as input, can be placed as direct children of the form element; in XHTML, another element (such as fieldset or div) was required to wrap form elements.

· textarea elements are not required to have rows and cols attributes.

· The target attribute for links was previously deprecated in XHTML. It's now valid in HTML5.

· As discussed earlier in this chapter, block-level elements can be placed inside link (a) elements.

· The ampersand character (&) doesn’t need to be encoded as &amp;amp; if it appears as text on the page.

That’s a fairly comprehensive, though hardly exhaustive, list of differences between XHTML strict and HTML5 validation. Some are style choices, so you’re encouraged to choose a style and be consistent. We outlined some preferred style choices in the previous chapter, and you’re welcome to incorporate those suggestions in your own HTML5 projects.
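To make a few of these differences concrete, here's a deliberately inconsistent page that, as far as we can tell from the rules listed above, should still pass HTML5 validation. The file names and text are made up, and we wouldn't recommend writing markup this way.

```html
<!DOCTYPE html>
<!-- No html, head, or body tags: the browser supplies them. -->
<meta charset=utf-8>
<title>Loose but valid</title>

Stray text placed directly in the body.

<!-- Mixed-case tags, unclosed p and li elements, and an unencoded ampersand. -->
<P>Fish & chips
<p class="intro lead">Multiple space-delimited values still need quotes
<UL>
  <li>One
  <li>Two
</UL>

<!-- Void element without a closing slash. -->
<br>

<!-- No action attribute; form controls are direct children of the form;
     the textarea has no rows or cols attributes. -->
<form>
  <input type=text name=query>
  <textarea name=notes></textarea>
</form>

<!-- target is valid again, and a link can wrap a block-level element. -->
<a href=about.html target=_blank><h2>About this page</h2></a>

<!-- No type attribute on the script element. -->
<script>console.log("Still valid");</script>
```

Running a page like this through the validator is a quick way to see which of your own habits are requirements and which are simply style.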

Note: Stricter Validation Tools

If you want to validate your markup’s syntax style using stricter guidelines, there are tools available that can help you. One such tool is Philip Walton's HTML Inspector. To use it, you can include the script in your pages during the development phase, then open your browser's JavaScript console in the developer tools and run the command HTMLInspector.inspect(). This will display a number of warnings and recommendations right inside the console on how to improve your markup. HTML Inspector also lets you change the configuration to customize the tool to your own needs.
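As a sketch of that setup, the markup below loads the tool and runs it once the page has finished loading, which has the same effect as typing the command into the console. The script path is an assumption; adjust it to wherever you've saved your copy of HTML Inspector.

```html
<!-- Development only: remove these lines before deploying.
     The path "js/html-inspector.js" is a placeholder. -->
<script src="js/html-inspector.js"></script>
<script>
  // Wait for the page to load, then log warnings to the JavaScript console.
  window.addEventListener("load", function () {
    HTMLInspector.inspect();
  });
</script>
```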

Summary

By now, we’ve gotten our heads around just about all the new semantic and syntactic changes in HTML5. Some of this information may be a little hard to digest straight away, but don’t worry! The best way to become familiar with HTML5 is to use it—start with your next project. Try using some of the structural elements we covered in the last chapter, or some of the text-level semantics we saw in this chapter. If you’re unsure about how an element is meant to be used, go back and read the section about it, or better yet, read the specification itself. While the language is certainly drier than the text in this book (at least, we hope it is!), the specs can provide a more complete picture of how a given element is intended to be used. Remember that the HTML5 specification is still in development, so some of what we’ve covered is still subject to change in the new HTML5.1 version (or in the HTML5 “living standard,” if you go by the WHATWG's definition). The specifications will always contain the most up-to-date information.

In the next chapter, we’ll look at a crucial segment of new functionality introduced in HTML5: forms and form-related features.