Text Formatting

The First Rule of HTML

Open the template.html file that you created. Change its <title> element to contain the words Test Text.

Now type a few lines into the body of the document. You can type anything you want; it doesn’t have to be what you see below. Then use Save as... to name the file test_text.html. (To save space from now on, we’ll just show the relevant portion of the document, not the whole thing.) The important part is that you type several lines. Do not type just one long word-wrapped line!

<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<title>Test Text</title>
</head>
<body>
You can type anything you like in the body.
It can be serious:

   Life is like a river.

or just silly:

   The brick astonished the sunlight.
</body>

Now view the file in your browser. It looks like this:

Screenshot showing text only HTML file

All your newlines and careful spacing have disappeared because of the first rule of HTML: Any run of whitespace is replaced by a single blank. By whitespace, we mean a blank, a TAB, or a newline. The bad news of this rule is that it scrunches up all your text. The good news is that it means that the browser handles word wrap for you, and you don’t have to worry about where to break lines when you’re writing a large block of text.
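Here is a small sketch of the rule in action. The spacing below is deliberately messy (several blanks, a TAB, and some empty lines); the sentence itself is borrowed from the earlier example.

```html
<body>
Life      is
	like a


river.
</body>
```

Each run of blanks, TABs, and newlines between the words is treated as one blank, so the browser should display this as the single line "Life is like a river."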

For now, all of your text is inline; that is, treated as one long word-wrapped line. Later on, you’ll see how to tell the browser that you really do want new lines. First, let’s investigate some of the tags that structure inline text.

Structured Text

Let’s say that you have some word or words that you think need to be emphasized. You can mark them as emphasized by surrounding them with an opening <em> tag and closing </em> tag.

<body>
You can type <em>anything you like</em> in the body.
It can be serious:

Save the file, and reload it in the browser to see how it decides to emphasize text:

Screenshot showing use of <em> tag

You can strongly emphasize text by using an opening and closing <strong> tag. Put the tags around a word or words in your document, save the file, and reload it in the browser.

<body>
You can type <em>anything you like</em> in the body.
It can be <strong>serious</strong>:

Screenshot showing use of <strong> tag

The <em> and <strong> elements are called logical formatting elements, or logical styles. They tell what kind of effect you want, and leave it up to the browser to decide how that effect should look. While <em> and <strong> are the most widely used, there are other logical formatting elements that you can use. Try typing these examples into an HTML document and see what they produce. (Yes, type them in yourself; don’t just copy and paste. You will learn more that way. Trust me.)

<sup> - superscript
   e = mc<sup>2</sup>
<sub> - subscript
   Water is H<sub>2</sub>O.
<cite> - a citation or reference to some work (book, poem, etc.)
   Joyce Kilmer wrote the poem <cite>Trees</cite>.
<dfn> - the defining instance of a word
   A <dfn>quadruped</dfn> has four feet.
<samp> - sample output from programs, etc.
   The dialog box will say <samp>Click OK</samp>.
<kbd> - keyboard input from users
   Press the <kbd>Enter</kbd> key.
<code> - computer programming code
   <code>x = y + z * w;</code>
<small> - “fine print”
   <small>Not available in all colors.</small>
<q> - quoted text; this element adds quote marks for you
   <q>And I am Marie of Romania.</q> - Dorothy Parker
<abbr> - an abbreviation or acronym
   You are learning <abbr>HTML</abbr>.

By default, the <cite> and <dfn> elements both appear in italics; the <samp>, <kbd>, and <code> elements all appear in a monospace font, in which every letter is the same width, as on a teletype machine or a typewriter.

What is the Big Deal Here?

You might ask, “Why have all these different ways of saying ‘italic’ or ‘monospace’?” The purpose of these elements is to tell what the text is being used for; it indicates the document’s structure. In HTML5, the whole language is oriented toward structure, and you specify how it should look by using styles (you will learn about these in grand and glorious detail later).

When you use <cite> to mark up a citation, <dfn> to mark up a definition, and <em> to mark up emphasized text, they will all look italic, but a computer program will be able to go through your site and extract all the definitions (or all the citations), because you have marked them up accordingly. If you had marked them all up as <i> (the element for italic), only a human would be able to figure out which usage was which. Further, you will be able to use styles to display definitions with a yellow highlighted background and citations with a light green background, because you have marked up the content with different elements. This is really a Good Thing. Believe me.
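As a preview only (styles are covered later), here is a minimal sketch of how that highlighting might be written. The <style> element and the two rules inside it are assumptions for illustration, not something this section has taught yet; the sentences in the body come from the earlier examples.

```html
<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<title>Test Text</title>
	<style>
		/* Assumed preview: each rule applies to every element with that name. */
		dfn  { background-color: yellow; }
		cite { background-color: lightgreen; }
	</style>
</head>
<body>
A <dfn>quadruped</dfn> has four feet.
Joyce Kilmer wrote the poem <cite>Trees</cite>.
</body>
```

This only works because the definition and the citation are marked up with different elements. Had both been marked up as <i>, there would be no way to give them different backgrounds.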

Physical Formatting

“Well and good,” you might say, “but there are times when I really want just plain italics, with no extra meaning applied to them (for example, a foreign phrase like tabula rasa).” HTML does include elements that are tied to their presentation; these are called presentational elements.

Element    Means
<b>        Bold
<i>        Italic
<u>        Underline
<tt>       Teletype text
<strike>   Strike-through
<big>      Larger than normal

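The <tt>, <strike>, and <big> elements are obsolete; you will find them in HTML4 and XHTML, but they have been removed from HTML5. The <u> element was deprecated in HTML4; it remains in HTML5, but with a narrower meaning than plain underlining.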
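For instance, the foreign phrase mentioned above might be marked up like this. The sentence is made up for illustration.

```html
<body>
The committee decided to start over with a <i>tabula rasa</i>.
</body>
```

The phrase is neither emphasized nor a citation or a definition, so none of the logical elements fits; <i> simply asks for italics.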

Nesting Elements

What if you want to have two effects on a word? Say you’re defining a word that’s very important, so you want it to be both strongly emphasized and a definition? Just put the tags one inside the other. This is called nesting the elements.

This is called the
<strong><dfn>theory of relativity</dfn></strong>.

Screenshot showing use of nesting

There’s no law that the nested tags have to be right next to each other. Try it and find out. Go ahead. Type the example into an HTML document, and see what it looks like in the browser. We’ll wait for you to get back.

Get a <strong>discount price of <em>only</em> $5.95</strong> today!

Important: When you put tags inside one another, the last tag in must be the first tag out. Stated another way, you must close tags in the reverse order that you opened them. The examples above are correct; the ones below are wrong:

This is called the
<strong><dfn>theory of relativity</strong></dfn>.

Get a <strong>discount price of <em>only</strong> $5.95</em> today!

If you do type them in the wrong order, the browser will do its best to display your text accurately. However, using HTML this way is not valid. We’ll get to that topic in more detail later.
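The same rule holds no matter how deep the nesting goes. Here is a sketch with three levels; the sentence is made up, and the comment line is only there to spell out the order.

```html
<body>
<!-- Opened: strong, em, dfn. Closed in reverse: dfn, em, strong. -->
Today we study <strong>the <em>famous <dfn>theory of relativity</dfn></em></strong>.
</body>
```

One way to check your work is to read the closing tags from left to right; they should name the same elements as the opening tags read from right to left.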

Block Level Text Formatting

This covers many of the most important inline elements of HTML (they don’t start a new line). If you are ever to overcome the First Rule of HTML, though, you will need elements that let you start new lines. These elements are called block elements.

Start with a sample file. Open your template file, add this text, and save it with the name nash.html. Follow along and make the changes that are shown here; these notes will not show you the results at every step of the way.

<body>
The Termite
by Ogden Nash

A primal termite knocked on wood,
And tasted it, and found it good.
And that is why your Cousin May
Fell through the parlor floor today.

Ogden Nash was an American humorist,
best known for his short poems.

You may find out more about him by searching the web.
</body>

It looks great in the editor, but when you put it in the browser, the first rule of HTML takes over, and this is what it produces:

The Termite by Ogden Nash A primal termite knocked on wood, And tasted it, and found it good. And that is why your Cousin May Fell through the parlor floor today. Ogden Nash was an American humorist, best known for his short poems. You may find out more about him by searching the web.

Headings

First, let’s work on the title and author name. In most books, those would be a heading and a subheading. HTML has six elements, <h1>, <h2>, ... through <h6>, which give you six levels of headings. In typography, a level one heading is a main heading, a level two heading is a subheading, etc. Let’s see what happens when you add these headings. In proper HTML usage, each heading starts a new section of the document. Since these headings really belong together (the poem is just one section of the document), you must enclose them in an <hgroup> element.

<hgroup>
<h1>The Termite</h1>
<h2>by Ogden Nash</h2>
</hgroup>

The Termite

by Ogden Nash

A primal termite knocked on wood, And tasted it, and found it good. And that is why your Cousin May Fell through the parlor floor today. Ogden Nash was an American humorist, best known for his short poems. You may find out more about him by searching the web.

Well, that’s looking a little better. Headings are automatically set in boldface, start a new line, and have built-in line spacing. Here is what all six levels look like. As the heading level increases, the font size decreases.

Level one heading <h1>

Level two heading <h2>

Level three heading <h3>

Level four heading <h4>

Level five heading <h5>

Level six heading <h6>

There’s no law that says you have to start with <h1> and go in order, but you should do so. Some people use headings to get the visual structure they desire. Here’s the sort of trick they use:

<h4>MGM Presents...</h4>
<h2>Gone with the Wind</h2>

While this technique does give the “right sizes” for the text, it breaks the document’s structure. You will find a better way to keep the appropriate structure but still get the look that you want when you study styles.
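For comparison, here is a sketch of headings used in order, so that each level really is a subsection of the level above it. The section names are hypothetical.

```html
<body>
<h1>Gone with the Wind</h1>
<h2>Cast</h2>
<h3>Leading Roles</h3>
<h3>Supporting Roles</h3>
<h2>Production</h2>
</body>
```

Read from top to bottom, the heading levels form an outline of the document, which is exactly what the trick above breaks.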

Paragraphs

You can’t use headings for all the lines in the poem and the explanatory text that follows; you don’t want everything bold, and there’d be too much space between lines. Aside from that, the poem and the explanation aren’t headings, so it would be a misuse of HTML to structure the document that way.

Instead, we are going to use the <p> element, which stands for paragraph, to mark the beginning and end of paragraphs. Note that, in accordance with the rules for writing with the XHTML syntax, every opening tag must have a closing tag.

<hgroup>
<h1>The Termite</h1>
<h2>by Ogden Nash</h2>
</hgroup>

<p>
A primal termite knocked on wood,
And tasted it, and found it good.
And that is why your Cousin May
Fell through the parlor floor today.
</p>

<p>
Ogden Nash was an American humorist,
best known for his short poems.
</p>

<p>
You may find out more about him by searching the web.
</p>

When you enter this in your file and display it in the browser, you’ll see that things are really shaping up nicely. Try it and find out.

Line Breaks

The only remaining problem is the poem itself, which is still displayed as one word-wrapped line. You can’t use paragraphs for every line of the poem; if you do, the poem will appear to be double-spaced. What you need is an HTML element that says “go to the next line”, and the <br /> element does just that.

IMPORTANT: In XHTML, you must have a closing tag for every opening tag. Even though you can’t put text into a line break, you still need both tags. The technical term for an element that never has any content is an empty element.

Do not write a line break like this:

<br></br>

First, it is not valid in HTML5. Second, it may mislead you into thinking that you should put something between those tags. XHTML syntax (which we are using in this course) allows you to write empty elements with a shortcut. In the following example, the slash just before the greater than sign means “this is an opening and closing tag all wrapped up in one.” Notice the blank before the slash. It is necessary, for reasons described below.

<br />
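You have already been using this shortcut without knowing it. The <meta> element in the head of your template never has any content either, so it is written the same way. The two lines below are simply placed side by side for comparison; they do not make up a complete document.

```html
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<br />
```

In both cases, the blank and the slash come at the very end of the tag, just before the greater than sign.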

Let’s add line breaks to complete this demonstration:

<p>
A primal termite knocked on wood,<br />
And tasted it, and found it good.<br />
And that is why your Cousin May<br />
Fell through the parlor floor today.
</p>

<p>
<cite>Ogden Nash</cite> was an American humorist,
best known for his short poems.
</p>

Two things to note about line breaks:

  1. You don’t put one after the last line of the poem; that’s because the end of the paragraph and the start of the next paragraph already give you the line space you need.
  2. You do not need to put a line break after the word humorist; this is one place where you want HTML to do word wrap for you.
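Putting all the pieces together, the finished nash.html might look something like the sketch below. The <!DOCTYPE> line, the <html> element, and the title text are assumptions about what your template contains; the <head> follows the one shown at the start of this section.

```html
<!DOCTYPE html>
<html>
<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<title>The Termite</title>
</head>
<body>
<hgroup>
<h1>The Termite</h1>
<h2>by Ogden Nash</h2>
</hgroup>

<p>
A primal termite knocked on wood,<br />
And tasted it, and found it good.<br />
And that is why your Cousin May<br />
Fell through the parlor floor today.
</p>

<p>
<cite>Ogden Nash</cite> was an American humorist,
best known for his short poems.
</p>

<p>
You may find out more about him by searching the web.
</p>
</body>
</html>
```

If your file differs from this only in spacing or in where the lines break, don’t worry; thanks to the first rule of HTML, the browser will display it the same way.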