Getting a Web Development Job For Dummies (2015)
Part II. Core Technologies for Web Development
Chapter 8. Introducing HTML
In This Chapter
Discovering how HTML was created
Creating HTML headers
Looking at the structural elements of HTML
Exploring text and image elements
Using HTML tables
The web today is much different from the web before, say, the year 2000 or so. Through the 1990s, as the web emerged, web pages were based on HTML, with limited use of JavaScript. Such web pages are called static web pages, and the web of that time is now referred to as the static web.
Late in the 1990s, a livelier web came into place, much more responsive to users. Now, when a user visits a web page, the web page is created on the spot through a series of database calls. The ads come from some databases, the header and footer from others, and the main content from others entirely.
All this content is poured into a page layout defined in CSS, with liberal use of JavaScript to make the page lively, dynamic, and responsive. Languages like Python are used for purposes such as interfacing with databases.
This new, livelier web was called Web 2.0 when it appeared, but is better known today as the dynamic web.
In this chapter, we introduce HTML, the web’s core technology — but a technology that is, as we describe here, only part of a suite of technologies in the web pages of today.
Discovering How the Web Became What It Is
In the past, web pages were hand-crafted. Initially, every page was its own beast, hand-coded in HTML. The HTML standard was changing rapidly, and so were the browsers that displayed web pages — mainly Netscape Navigator and Microsoft Internet Explorer, with Firefox and Google’s Chrome coming along later.
There were even cultural aspects to your development and web browser choices. If you used a Windows PC and Microsoft tools for development, and the Internet Explorer browser for web surfing, you were a hopeless square. (That’s because it was mostly large corporations that were under “account control” by Microsoft that would do such a thing.) Most developers used Macs and Netscape standards and tools. Pages optimized for Netscape Navigator were considered cooler.
In the mid-1990s, style sheets became important. Cascading Style Sheets (CSS) was the standard that was chosen from among several competitors. CSS changed a lot in its early years, as did its implementation in browsers.
The final element of the core troika of early web development technologies is JavaScript. It was originally developed as LiveScript at Netscape, also in the mid-1990s. The name was changed to JavaScript shortly before widespread adoption, even though JavaScript has nothing to do with the Java runtime environment and programming language.
Web pages became quite complex. Each page was a somewhat volatile and hand-crafted mixture of HTML, CSS, and JavaScript. (These three standards are called “the basic building blocks of the web” on the W3C site, w3c.org.) Because the functionality of these three technologies can overlap somewhat, every different web page was a new adventure.
Websites today are largely driven by databases. HTML and CSS and JavaScript still matter, of course. But they are used as much to create frameworks for database-generated content as for hand-crafted pages. (And yes, as a web professional today, you very much need to be able to do both.)
This chapter describes the first of these three core technical standards for web pages, HTML. Today’s web pages are just as likely to be crafted in Dreamweaver and programmed with PHP as created directly from HTML, CSS, and JavaScript.
But knowing the basics of HTML, and being able to explain them to colleagues who want to know what a web page can and can’t do, is vital to an understanding of how the web works. With this knowledge in your hip pocket, you’ll be better able to carry out your role as a member of your web development team.
If you already know HTML, review this chapter to make sure you know the basics as well as you think you do, and then use this information to bring colleagues up to speed.
As with other original web technologies and approaches, there is also both a cultural and a credibility aspect to knowing these technologies well. You want to be able to add new HTML, tweak existing CSS, and sling JavaScript with the old hands as well as write clean PHP code with the new.
The W3C website has a course on JavaScript that also relates its use to HTML and CSS, shown in Figure 8-1. For details, visit www.w3.org/community/webed/wiki/Category:Tutorials.
Figure 8-1: Visit W3C for basics on JavaScript programming and interaction with HTML and CSS.
The HTML tags described in this chapter are representative of what you may use in your own web pages. Showing and describing them gives us the opportunity to comment on some parts of how HTML and CSS work together, for instance. Use a current HTML reference, such as the one at w3c.org, for your actual web development work.
Exploring the Creation of HTML
Just about everyone reading this book knows what HTML is. Still, it’s worthwhile to describe its creation and evolution because they’re still relevant to how the web as a whole, and web pages, are developed today.
The core of HTML is the use of “tags” — little pieces of code — to “mark up,” or put formatting or descriptive elements, into text.
Here’s an example of simple HTML:
I like to use <b>bold</b> sometimes and <i>italic</i> sometimes. And other times I like to add <a href="https://www.w3.org/">links</a>.
How does this show up? Like so:
I like to use bold sometimes and italic sometimes. And other times I like to add links.
The <a> and </a> tag pair is called the anchor tag. Surrounding a piece of text with <a> and </a> makes the text into a hyperlink. The URL of the link’s destination goes in the href attribute of the opening <a> tag.
Now, there has been a battle between two approaches to HTML since the early days. These are basically whether HTML is used for formatting or description.
If you believe HTML is used for formatting, you’re very happy with the bold (<b>) and italic (<i>) tags. However, some web developers preferred to use tags such as <strong> and <em>, for emphasis. The idea was that it was up to the browser, or other display software, to decide what strong and emphasis meant.
With the clarity given by hindsight, it’s clear that this latter idea is ridiculous – although some people still swear by it. Writers and editors are used to using bold and italics and underlining for certain purposes in print — and they’re happy to continue doing that in the online medium. But they weren’t ready to stop worrying about whether words were emphasized by bolding, italics, or some other convention.
So HTML continues to be used for formatting. And it will probably be used that way forever.
The following sections describe core elements of HTML. It’s worth reviewing them because every web page has to have these elements — even if, in some cases, they’re now being implemented more often in CSS or JavaScript rather than HTML.
Even if you don’t work directly with HTML, you should familiarize yourself with what it can do. Nearly everyone in the web development field understands the basics of HTML. Don’t be the only one on your project team who doesn’t.
On the web, you shouldn’t use underlining for any purpose except hyperlinks. People are used to clicking on underlined text, and it’s just too confusing if you try to do it any other way. (And also, always underline linked text — unless you come up with some other way of signaling “this is a link” that’s extremely clear and obvious.)
Discovering Header Elements
The top part of an HTML document is called the header, and is surrounded by the <head> and </head> tags. The header usually contains mostly header-specific tags, described here. These tags define elements that apply to the page as a whole.
XHTML versus HTML
XHTML is a blend of HTML, which is very widely known and used, and XML, a separate language with much stricter rules for how it’s written and interpreted. In practice, XHTML is basically HTML written under those stricter rules.
In HTML, it doesn’t matter if you use uppercase or lowercase; <h1> is the same as <H1>. You can use the <p> tag to indicate a break between paragraphs. And you don’t have to put attributes, such as the URL in an anchor tag, in quotes.
In XHTML, you always use lowercase for tags. The paragraph tag is now a pair of tags, <p> and </p>, that you use to surround each paragraph. And attributes are always enclosed in quotes, such as <a href="https://www.w3.org/">Visit W3C today!</a>.
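To make the contrast concrete, here is a small sketch of the same content written both ways. The heading and paragraph text are made up for illustration, and both versions are fragments meant for the body of a page rather than complete documents.

```html
<!-- Loose HTML: uppercase tags, paragraphs never closed, attribute value unquoted -->
<H1>Welcome</H1>
<P>This is the first paragraph.
<P>Visit <A HREF=https://www.w3.org>W3C</A> today!

<!-- XHTML style: lowercase tags, every paragraph closed, attribute value quoted -->
<h1>Welcome</h1>
<p>This is the first paragraph.</p>
<p>Visit <a href="https://www.w3.org">W3C</a> today!</p>
```

Browsers generally display both versions the same way; the second is simply easier for people and tools to read predictably.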
Many web teams use XHTML styling as a matter of course, so get used to taking this extra level of care when you write your HTML code. You might also want to change pages to XHTML when you modify them for consistency and predictability across your web pages.
The body of the web page, by contrast, is surrounded by the <body> and </body> tags. It includes the actual web page content and is where the rest of the tags are used.
Header tags include
· <title> and </title>. The name of the web page. Most browsers show it in the title bar at the top of the window, above the page itself, or in the tab if you’re using a tabbed browser.
· <meta>. The <meta> tag contains general information about the page that’s used in various ways. The contents aren’t really specified officially, but a few conventions have grown up. You can use the meta tag to specify a description of the site and various keywords that you want search engines to use, but most search engines today don’t take much notice of the <meta> tag.
· <link>. This tag is mostly used to link to stylesheets written using CSS. It stands alone, with no closing tag.
· <style> and </style>. This is a place to put CSS code that applies to this specific page. If you also link to one or more CSS stylesheets, a rule inside the <style> tag generally wins over an equally specific rule in the linked stylesheet, provided the <style> tag comes after the <link> tag in the header.
· <script> and </script>. The JavaScript for a page commonly goes between the <script> and </script> tags.
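Put together, the header tags in this list might look something like the following. This is a minimal sketch, not a template to copy: the page title, description text, stylesheet name (styles.css), and script contents are all hypothetical.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Shown in the browser's title bar or tab -->
  <title>Pet Supplies: Home</title>

  <!-- General information about the page; this description text is made up -->
  <meta name="description" content="Food and supplies for dogs and cats.">

  <!-- Link to an external stylesheet; styles.css is a hypothetical file name -->
  <link rel="stylesheet" href="styles.css">

  <!-- Page-specific CSS, placed after the link so it wins over equally specific rules -->
  <style>
    h1 { color: navy; }
  </style>

  <!-- Page-specific JavaScript -->
  <script>
    console.log("Header loaded");
  </script>
</head>
<body>
  <h1>Pet Supplies</h1>
  <p>The actual page content goes in the body.</p>
</body>
</html>
```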
Making Use of Core Structural Elements
HTML has several core structural elements. These elements describe the overall layout of a web page.
Search engines vary their algorithms — the rules they use — over time. But a few key elements tend to be used over and over to analyze a web page and what’s important in it. The core structural elements of a web page are a big part of this.
Headers are perhaps the most important ongoing element for search engine success. Any web page worth bothering with is going to put core topical keywords in its headers.
The dot-com boom
The early years of the web featured what was widely called the dot-com boom. The dot-com boom was a stock market boom that featured a slew of new companies with websites at their core. Because all the highly valued companies used the .com top-level domain, the boom was called the dot-com boom, and the subsequent crash, in 1999-2001, was called the dot-com crash.
The dot-com boom happened because investors believed that there would be very valuable companies coming out of the rise of the Internet, and they were right. However, it was extremely unclear which companies would benefit — and whether the beneficiaries would ultimately be companies that were being traded at the time, companies that had yet to be invented, or companies that had gone into business, but had not yet gone public, and so were not available on the stock market.
During the dot-com boom, companies such as Yahoo! saw their stock prices rise to dizzying heights. In many cases, companies that had yet to make a profit, or even any sales, were worth billions of dollars in stock-market value.
One famous example was pets.com. The idea was that the market for pet food and pet supplies online would be gigantic, and pets.com would be the leader. What ultimately sank pets.com was that shipping 50-pound bags of dog food to people through the mail just didn’t make sense on a large scale. Although pet store chains today certainly have e-commerce websites, pets.com went under.
The dot-com crash happened in the year 2000. The NASDAQ stock index, which was and is heavy on technology stocks, peaked at over 5,000 in March, 2000 – then crashed, falling to 3,500 a couple of months later, rising again past 4,000, then slumping to a little higher than 1,000 in late 2002.
It turns out that the dot-com boom was really a broad technology boom, and that many of the winners only came onto the market later. Many companies disappeared. Others, like Yahoo!, never recovered their previous value. Some others, like Amazon, have gone on to new heights. But the three biggest stock stories since the web was invented included two companies, Facebook and Google, that only went public after the dot-com crash. The third big beneficiary, Apple, which became the world’s most valuable company, makes devices that are largely used to access Internet services, but is hardly a dot-com company at all. (Apple does use e-commerce to sell a lot of goods, but the goods are made by Apple itself, and sold in many other ways as well, so the e-commerce part is not the point of the company.)
The problem is that HTML and CSS can be used to create what look like second-level and third-level headers without actually using the HTML <H2> and <H3> tags to do it. Here are core structural elements of HTML:
· <h1> and </h1> through <h6> and </h6>. These are HTML headers. A lot of web developers seem to pride themselves on not using HTML header tags. Why? Stop tricking your users and search engines; use search-engine-recognizable header tags in your web page layouts.
· <p> and </p>. This is the HTML paragraph tag. People used to use the <P> tag as a way to separate one paragraph (thus the <P>) from the next. Or they would use the <br>, or “break” tag, which is only supposed to indicate a line break, not a meaningful separation of chunks of text.
· <blockquote> and </blockquote>. This HTML tag pair indicates a block of quoted text. It’s a great example of the different types of HTML tags, those that control formatting and those that indicate meaning. This tag pair displays text indented from the left edge of other text.
The HTML header tags, <H1> through <H6>, impose a specific appearance on the headers on a web page. The appearance is specified by the browser that displays the page. Web developers who want to control the look of headers (which means just about all web developers) for many years created their own header styles. The difficulty is that this undermined search engines that use the header tags, when they’re present, to determine what words are more important in a document. However, you can use HTML to specify headers and then use CSS to override their appearance, thus customizing the look and feel of your web pages.
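Here is a rough sketch of that approach: real header tags in the HTML, with CSS overriding how they look. The page content, fonts, sizes, and colors are all made up for illustration.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Dog Food Buying Guide</title>
  <style>
    /* Hypothetical styles that replace the browser's default header look */
    h1 { font-family: Georgia, serif; font-size: 28px; color: #333333; }
    h2 { font-family: Georgia, serif; font-size: 20px; color: #006699; }
  </style>
</head>
<body>
  <!-- Real header tags, so search engines can still see the page structure -->
  <h1>Dog Food Buying Guide</h1>
  <p>Choosing a dog food is easier once you know the main types.</p>

  <h2>Dry Food</h2>
  <p>Dry food keeps well and is easy to store.</p>

  <!-- A block of quoted text, indented by default in most browsers -->
  <blockquote>
    <p>My dog has never been happier.</p>
  </blockquote>
</body>
</html>
```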
Using List Elements
Lists are highly recommended for frequent use on your web pages. They’re easy for the reader to scan and quickly pick out key points.
Lists are also good for writers — they make the writer get to the point quickly. This is very important on the web, where people scan pages hurriedly, looking for a key fact or insight, then hurriedly move on.
The main types of lists that you’ll use are bulleted lists and numbered lists. Both are great for helping readers pick out key points. Numbered lists work when there are steps or some other process or procedure. Web pages tend to have a lot of bulleted lists, so use numbered lists where you sensibly can.
Here are the most commonly used list elements of HTML:
· <ul> and </ul>: Use these tags to surround an entire unordered (bulleted) list.
· <ol> and </ol>: Use these tags to surround an entire ordered (numbered) list.
· <li> and </li>: Use these tags to surround each item in an unordered (that is, bulleted) or ordered (numbered) list.
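As a quick illustration, here are a bulleted list and a numbered list built from these tags. The list contents are invented, and this is a fragment meant for the body of a page.

```html
<!-- Unordered (bulleted) list -->
<ul>
  <li>Dry food</li>
  <li>Canned food</li>
  <li>Treats</li>
</ul>

<!-- Ordered (numbered) list, suited to steps in a procedure -->
<ol>
  <li>Add items to your cart.</li>
  <li>Enter your shipping address.</li>
  <li>Confirm your order.</li>
</ol>
```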
You can also create a definition list. A definition list is like a bulleted list, but each bullet item is a definition term — the term that’s being defined, usually displayed in bold — followed by the definition itself. The definition list gives you another tool for breaking up your web page, avoiding long flows of paragraph text.
Here are the tags for definition lists:
· <dl> and </dl>: This is not a tag for putting information on the “down low” (keeping it secret); instead, use these tags to create a definition list.
· <dt> and </dt>; <dd> and </dd>: “dt” stands for “definition term” and “dd” for “definition description.” The contents of a definition list include the terms that you’ll define (<dt>) and the definitions themselves (<dd>).
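A definition list built from these tags might look like the following fragment. The terms and definitions are just examples drawn from this chapter.

```html
<!-- Each term (dt) is followed by its definition (dd) -->
<dl>
  <dt>HTML</dt>
  <dd>The markup language used to structure web pages.</dd>

  <dt>CSS</dt>
  <dd>The stylesheet language used to control how web pages look.</dd>
</dl>
```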
As with header tags, the appearance of lists created using HTML tags is often boring and ugly, and browsers always display them the same way. Many web developers do lists in their own way, not using HTML. However, as with header tags, recommended practice is to use the HTML tags, then override the look and feel of the HTML tags with CSS.
Usability and lists
This book doesn’t have room to go into usability much. Web usability is the practice of making web pages easy to use. Usability professionals can be, for example, web page designers with a usability bent; usability professionals who work with all sorts of other team members; and interaction designers, who focus specifically on intense user processes such as completing a purchase on an e-commerce site.
Lists are a great example of the kind of concerns that drive web usability. For your website’s users, reading from the screen, as on a web page, is harder than reading a printed magazine or book. That’s because the screen is generally low-resolution (although Apple’s Retina screens are leading the way in high-resolution screens), and always backlit, rather than lit by reflected light like a magazine or book. Staring at a computer screen has been described as “staring into a light bulb,” and it makes your eyes tired.
Because reading from a screen is hard, users get tired doing it. So they remember less information from text they read onscreen than in print. They also tend to rush through onscreen text, just grasping key facts, and scanning it rather than actually reading it.
This is where headers, lists, and other structural elements of a web page can help you. Text in a list tends to be shorter than narrative text in paragraphs. So shorten your text, then put it in a list. That will tend to shorten it further, and make it easier for the user’s tired eyes to pick out the key points that they’re looking for.
When you’ve completed a web page, try scanning through it quickly. You should be able to pick out the key points without reading the page closely. If not, make sure the key points are reflected in headers and lists.
One of the hardest tags to remember in HTML, until you get used to it, is the unordered list tag, <ul> and </ul>. That’s because what you’re trying to get is a bulleted list, and you would never normally think to call that “unordered”; it actually has an order, top to bottom, and you should put the most important points at the top. (The bottom of a list also tends to get extra attention, so you can put an important point there as well.) These names were given out of the old idea that HTML shouldn’t specify formatting (such as “bulleted list”), but should specify meaning (“please, Mr. Browser, do something appropriate with this unordered list”), even though from the beginning everyone’s used it for formatting. The tag you use when you want a numbered list is almost as obscure, with <ol> and </ol> representing the beginning and end of an ordered (numbered) list.
Working with Text Formatting and Image Elements
Text and images are the core elements of nearly all web pages. Only a few HTML attributes were available, in the early days, to affect how they were displayed onscreen, and CSS was not yet invented at that time. So these few tags received a lot of use, and even abuse, as web developers tried very hard to create sophisticated-looking web pages with the crude tools at their disposal.
<blink> and hostage-note web pages
HTML is a pretty blunt instrument for making your page look good, but before CSS was introduced and became widely used, it was all a web developer had. Pages looked too much the same, making websites boring. Developers tended to overuse formatting, such as bolding and colored text, in an attempt to give their pages some zing.
The <blink>/</blink> tag pair was the most famous example of this effort. Unlike many HTML tags, this one has an easy to remember and descriptive name. It does “just what it says on the tin” — it makes the text that’s surrounded by it blink.
Imagine a sweepstakes web page with the words “You may have already won!” in the center, rendered as a clickable link, in a big text size and a bright color, blinking in the center of the page. That’s what the <blink> tag was invented for.
The <blink> tag is not supported in all browsers, so that’s one very good reason not to use it. However, it’s also so annoying and frustrating to users that you should never use it for that reason as well.
The <blink> tag is still something of a standing joke among web developers, and a good thing to know about if you’re newer and want to appear (and be) knowledgeable. The <blink> tag is also a good reminder not to try too hard to be distinctive or get attention in your web page designs.
The happy move to HTML5
HTML5 (it’s usually written that way, with no space between HTML and 5) is a new, and welcome, addition to the existing body of HTML standards.
HTML5 starts by adding several new tags, such as <audio> and <video> for multimedia. It also standardizes tags and attributes in an XHTML-friendly manner. And HTML5 includes features for pages that work on mobile phones and other portable devices.
Among the many advantages of HTML5 is that it can easily be used to deliver video in a website without the need for Adobe’s Flash tool. Flash doesn’t work on mobile devices, and it’s often buggy and crash-prone on personal computers too. Moving beyond Flash is a good step for the web.
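As a rough sketch, delivering video with the HTML5 <video> tag can look like this. The file names (clip.mp4 and clip.webm) and the width are hypothetical; the text inside the tag pair is shown only by browsers that don’t understand <video>.

```html
<!-- The controls attribute asks the browser to show play/pause controls -->
<video controls width="640">
  <!-- Hypothetical files; the browser plays the first format it supports -->
  <source src="clip.mp4" type="video/mp4">
  <source src="clip.webm" type="video/webm">
  Your browser does not support HTML5 video.
</video>
```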
We won’t explain HTML5 in any detail here because it’s now the standard that you’ll find in any up-to-date HTML book or descriptive website. Just be aware that HTML is today still very widely used and relevant for the future, in no small part because of the changes introduced as part of HTML5. If you want to move to the top of your profession, learning the ins and outs of HTML5, and how to get the most out of it — especially on mobile devices — is very much worthwhile.
Formatting text directly is said to be somewhat against the spirit of HTML, but people care a great deal about how text appears onscreen. HTML was pushed to its limits to create page layout and text formatting, with much difficulty across different browsers, different browser versions, and different types of computers and screen resolutions. Now, browsers are far more standardized, and the same goal is reached more effectively with a combination of simpler HTML code and CSS.
On its own, HTML doesn’t let you specify the placement of an image onscreen; images just go into the flow of text before and after them. This is very much unlike, for instance, magazines, which users and designers alike are accustomed to. HTML and CSS go some way to providing the control which designers want, and users expect, but it takes a lot of skill with these tools to create consistent page designs across platforms.
Here are the main tags that format text directly:
· <b> and </b>. Tags that render the enclosed text bold.
· <i> and </i>. Tag pair that renders enclosed text in italics.
· <font> and </font>. Tag pair that renders enclosed text in a specified font.
· <span> and </span>. Designates enclosed text for formatting commands.
· <a> and </a>. Defines an anchor or hyperlink and formats the enclosed text in a way that designates it as a hyperlink — usually underlined and in blue. The hyperlink is defined by the href attribute, as follows: <a href = "url">, where “url” is a web page address.
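The following fragment uses each of these tags once. The text and the class name “price” are invented for illustration. Note that the <font> tag is treated as obsolete in HTML5, where the same effect is achieved with CSS, so it appears here only to show the older style.

```html
<!-- Fragment for the body of a page -->
<p>
  Our <b>bold</b> claim: this is the <i>only</i> dog food your pet needs.

  <!-- Old-style direct formatting with the font tag -->
  <font face="Arial" color="red">On sale this week.</font>

  <!-- span marks text so that CSS can format it; "price" is a hypothetical class -->
  Now just <span class="price">$19.99</span>.

  <!-- The href attribute holds the destination URL -->
  <a href="https://www.w3.org/">Visit W3C</a>
</p>
```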
The tags for putting images into a flow of text are
· <img>. The image tag specifies that an image will be placed in the flow of text before and after the tag. Unlike the anchor tag, which uses href, the image tag gives the image location in the src attribute.
· <embed> and <object>. Early ways of specifying a multimedia element, such as a video clip.
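Here is a small sketch of an image placed in a flow of text. The image location goes in the src attribute of the <img> tag; the file name, description, and dimensions below are hypothetical.

```html
<!-- Fragment for the body of a page -->
<p>
  Meet our mascot:
  <!-- src gives the image location; alt describes the image if it can't be shown -->
  <img src="images/mascot.jpg" alt="A golden retriever puppy" width="300" height="200">
  She tests every product we sell.
</p>
```

The image simply sits in the flow of the paragraph; controlling where it lands relative to the text is a job for CSS.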
Looking at Table Elements
Table tags were originally designed to be used for creating tables within a web page, with rows and columns and a caption describing the table. However, web developers badly wanted their pages to look better, and they didn’t have many tools to do it in HTML.
So web developers started making the entire web page a table, and putting text and graphics within the rows and columns. This did give a lot of control. However, it also made web page HTML very complicated and easy to “break,” in ways that were hard to find and fix.
Fairly quickly, web developers started using nested tables for page layout — perhaps one table for the top of the page, another for a left-hand column or “rail” with navigation, and a third for the main page content. This went within an overarching table that put the rails, main column, and so forth in place. If you wanted an actual table in the usual sense — a formatted set of rows and columns to organize some information — that just went within all the other tables.
A lot of the energy behind the creation and adoption of CSS was an attempt to get away from all these tables and the problems they created.
Commonly used table tags include <th> and </th> for a header cell, <thead> and </thead> for a group of header rows, <col> to define a column (it stands alone, with no closing tag), and <colgroup> and </colgroup> to group columns together. <tr> and </tr> define rows, and <td> and </td> define a single cell in the table. The <caption> and </caption> tags give the table’s caption. As usual with HTML, it is formatted and placed according to each web browser’s interpretation of the tag, beyond the control of the web developer.
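To see how these tags fit together, here is a minimal sketch of a table used the way tables were originally intended: for rows and columns of information. The products and prices are made up, and the <tbody> tag pair, which groups the body rows, is an addition not listed above.

```html
<table>
  <caption>Dog food prices</caption>

  <!-- Column definitions; the background color is a hypothetical style -->
  <colgroup>
    <col style="background-color: #eeeeee">
    <col>
  </colgroup>

  <!-- Header row with header cells -->
  <thead>
    <tr>
      <th>Product</th>
      <th>Price</th>
    </tr>
  </thead>

  <!-- Body rows with ordinary cells -->
  <tbody>
    <tr>
      <td>Dry food, 5-pound bag</td>
      <td>$9.99</td>
    </tr>
    <tr>
      <td>Dry food, 50-pound bag</td>
      <td>$59.99</td>
    </tr>
  </tbody>
</table>
```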