Text Formatting

The First Rule of XHTML

Open the template.html file that you created. Change its title to Test Text and type a few lines into the body of the document. You can type anything you want; it doesn’t have to be what you see below. Then use Save as... to name the file test_text.html. (To save space from now on, we’ll just show the relevant portion of the document, not the whole thing.) The important part is that you type several lines. Do not type just one long word-wrapped line!

<head>
	<title>Test Text</title>
</head>
<body>
You can type anything you like in the body.
It can be serious:

   Life is like a river.

or just silly:

   The brick astonished the sunlight.
</body>

Now view the file in your browser. It looks like this:

Screenshot showing text only XHTML file

All your newlines and careful spacing have disappeared because of the first rule of XHTML: Any run of whitespace is replaced by a single blank. By whitespace, we mean a blank, a TAB, or a new line. The bad news is that this rule scrunches up all your text. The good news is that the browser handles word wrap for you, so you don’t have to worry about where to break lines when you’re writing a large block of text.
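As a small sketch of the whitespace rule, here are two hypothetical versions of the same body. They are not part of the course files; the comments labeling them File A and File B are only for illustration. Because every run of blanks and new lines collapses to a single blank, a browser should display both versions identically.

```html
<!-- File A: one line, single blanks between words -->
<body>
Life is like a river.
</body>

<!-- File B: the same words with extra blanks and new lines -->
<body>
Life      is
   like

      a river.
</body>
```

If you save each version as its own file and view them side by side, the only place you should see a difference is in the editor.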

For now, all of our text is inline; that is, treated as one long word-wrapped line. Later on, we’ll see how to tell the browser that we really do want new lines. First, let’s investigate some of the tags that structure inline text.

Structured Text

Let’s say that you have some word or words that you think need to be emphasized. You can mark them as emphasized by surrounding them with an opening <em> tag and closing </em> tag.

<body>
You can type <em>anything you like</em> in the body.
It can be serious:

Save the file, and reload it in the browser to see how it decides to emphasize text:

Screenshot showing use of <em> tag

You can strongly emphasize text by using an opening and closing <strong> tag. Put the tags around a word or words in your document, save the file, and reload it in the browser.

<body>
You can type <em>anything you like</em> in the body.
It can be <strong>serious</strong>:

Screenshot showing use of <strong> tag

The <em> and <strong> elements are called logical formatting elements, or logical styles. They tell what kind of effect you want, and leave it up to the browser to decide how that effect should look. While <em> and <strong> are the most widely used, there are other logical formatting elements that you can use:

<sup> - Superscript
   e = mc<sup>2</sup>
<sub> - Subscript
   Water is H<sub>2</sub>O
<cite> - A citation or reference to other sources
   See the book by <cite>Elizabeth Castro</cite>.
<dfn> - The defining instance of a word
   A <dfn>quadruped</dfn> has four feet.
<samp> - Sample output from programs, etc.
   The dialog box will say <samp>Click OK</samp>
<kbd> - Keyboard input from users
   Press the <kbd>Enter</kbd> key.
<code> - Computer programming code
   <code>x = y + z * w;</code>

In most browsers, the <cite> and <dfn> elements both appear in italics; the <samp>, <kbd>, and <code> elements all appear in a monospace font, where all the letters are the same width, like you’d see on a teletype machine or a typewriter.

You might ask, “Why have all these different ways of saying ‘italic’ or ‘monospace’?” The purpose of these elements is to tell what the text is being used for; it indicates the document’s structure. In the original version of HTML, the whole language was oriented toward structure, and presentation was left entirely up to the browser.

Physical Formatting

However, many users (especially beginning authors) were more concerned with presentation, and said, “This structure is all well and good, but different browsers could make different decisions on how to display these logical styles. I’d like to be able to tell the browser that I want italic text, and not leave it up to the browser to decide what to use.” Thus, some physical formatting tags were introduced so that what you said would be what you got.

Element     Means
<b>         Bold
<i>         Italic
<u>         Underline
<tt>        Teletype text
<strike>    Strike-through
<big>       Larger than normal
<small>     Smaller than normal

The last two are actually half physical and half logical: <big> and <small> make the text bigger or smaller than normal, but the browser makes the decision as to the proportions.
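To make the contrast between logical and physical elements concrete, here is a short sketch that pairs each logical element with the physical element that most browsers display the same way. The pairings reflect typical browser defaults, not a guarantee; a browser is free to display the logical elements differently.

```html
<!-- Each pair usually looks the same on screen, but only the first
     element of each pair says what the text is for. -->
<em>emphasized</em> and <i>italic</i>
<strong>strongly emphasized</strong> and <b>bold</b>
<code>x = y + z * w;</code> and <tt>x = y + z * w;</tt>
```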

Here’s some XHTML that uses most of the physical elements, as well as <sup> and <sub>. Try it in your browser to see how each one is displayed.

<b>This is bold text</b>
<i>This is italic text</i>
<tt>This is teletype text</tt>
Ozone is O<sub>3</sub>.
One thousand is 10<sup>3</sup>
This is <strike>stricken</strike> text.
A <big>fantastic deal</big> always has <small>fine print</small>

Screenshot showing use of physical styles

Nesting Elements

What if you want to have two effects on a word? Say you’re defining a word that’s very important, so you want it to be both strongly emphasized and a definition? Just put the tags one inside the other. This is called nesting the elements.

This is called the
<strong><dfn>theory of relativity</dfn></strong>.

Screenshot showing use of nesting

There’s no law that the nested tags have to be right next to each other; other text can come between them:

Get a <big>discount price of <em>only</em> $5.95</big> today!

Important: When you put tags inside one another, the last tag in must be the first tag out. Stated another way, you must close tags in the reverse order that you opened them. The examples above are correct; the ones below are wrong:

This is called the
<strong><dfn>theory of relativity</strong></dfn>.

Get a <big>discount price of <em>only</big> $5.95</em> today!

If you do type them in the wrong order, the browser will do its best to display your text accurately. However, using XHTML this way is not valid. We’ll get to that topic in more detail later.
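The “last in, first out” rule extends to any depth. The sketch below uses a made-up sentence to show three levels of nesting, and then one possible way to repair the overlapping discount example: when two effects genuinely overlap, split one of them into two elements so that each pair of tags nests cleanly.

```html
<!-- Opened in the order strong, em, cite;
     closed in the order cite, em, strong. -->
<strong>Read <em>all of <cite>The Termite</cite></em> tonight!</strong>

<!-- The emphasis runs past the end of the big text, so it is written
     as two em elements: one inside big and one after it. -->
Get a <big>discount price of <em>only</em></big><em> $5.95</em> today!
```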

Block Level Text Formatting

That covers many of the most important inline elements of XHTML, the ones that don't start a new line. If we're ever to overcome the First Rule of XHTML, though, we'll need elements that let us start new lines. These elements are called block elements.

Let's start with a sample file. Open your template file, add this text, and save it with the name nash.html. Follow along and make the changes that we show here; these notes will not show you the results at every step of the way.

<body>
The Termite
by Ogden Nash

A primal termite knocked on wood,
And tasted it, and found it good.
And that is why your Cousin May
Fell through the parlor floor today.

<cite>Ogden Nash</cite> was an American humorist,
best known for his short poems.

You may find out more about him by searching the web.
</body>

It looks great in the editor, but when we put it in the browser, the first rule of XHTML takes over, and this is what it produces:

The Termite by Ogden Nash A primal termite knocked on wood, And tasted it, and found it good. And that is why your Cousin May Fell through the parlor floor today. Ogden Nash was an American humorist, best known for his short poems. You may find out more about him by searching the web.

Headings

First, let’s work on the title and author name. In most books, those would be a heading and a subheading. HTML has six elements, <h1>, <h2>, and so on through <h6>, which give you six levels of headings. In typography, a level one heading is a main heading, a level two heading is a subheading, etc. Let’s see what happens when we add these headings:

<h1>The Termite</h1>
<h2>by Ogden Nash</h2>

The Termite

by Ogden Nash

A primal termite knocked on wood, And tasted it, and found it good. And that is why your Cousin May Fell through the parlor floor today. Ogden Nash was an American humorist, best known for his short poems. You may find out more about him by searching the web.

Well, that’s looking a little better. Headings are automatically set in boldface, start a new line, and have built-in line spacing. Here is what all six levels look like. As the heading level increases, the font size decreases.

Level one heading <h1>

Level two heading <h2>

Level three heading <h3>

Level four heading <h4>

Level five heading <h5>

Level six heading <h6>

There’s no law that says you have to start with <h1> and go in order, but you should do so. Some people choose heading levels just to get the visual appearance they desire. Here’s the sort of trick they use; type it into the body of another HTML document and see what it produces:

<h4>MGM Presents...</h4>
<h2>Gone with the Wind</h2>

While this technique does give the “right sizes” for the text, it breaks the document’s structure. We will find a better way to keep the appropriate structure but still get the look that we want when we study styles.
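By contrast, here is a sketch of headings used purely for structure, starting at <h1> and going down one level at a time. The page title and section names are hypothetical and chosen only to show the outline.

```html
<!-- One main heading, with each subheading one level below its parent. -->
<h1>Poems by Ogden Nash</h1>
<h2>The Termite</h2>
<h3>About the author</h3>
<h2>A second poem</h2>
```

The second <h2> goes back up a level because it starts a new section at the same depth as “The Termite”; it does not skip any levels.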

Paragraphs

We can’t use headings for all the lines in the poem and the explanatory text that follows; we don’t want everything bold, and there’d be too much space. Aside from that, the poem and the explanation aren’t headings, so it would be a misuse of HTML to structure the document that way.

Instead, we are going to use the <p> element, which stands for paragraph, to mark the beginning and end of paragraphs. Note that, in accordance with the rules for writing XHTML, every opening tag must have a closing tag.

<h1>The Termite</h1>
<h2>by Ogden Nash</h2>

<p>
A primal termite knocked on wood,
And tasted it, and found it good.
And that is why your Cousin May
Fell through the parlor floor today.
</p>

<p>
<cite>Ogden Nash</cite> was an American humorist,
best known for his short poems.
</p>

<p>
You may find out more about him by searching the web.
</p>

When you enter this in your file and display it in the browser, you'll see that things are really shaping up nicely.

Line Breaks

Our only remaining problem is the poem itself, which is still displayed as one word-wrapped line. We can’t use paragraphs for every line of the poem; if we do, the poem will appear to be double-spaced. What we need is an HTML element that says “go to the next line”, and the <br /> element does just that.

IMPORTANT: In XHTML, you must have a closing tag for every opening tag. Even though you can’t put text into a line break, you still need both tags. The technical term for an element that never has any content is an empty element.

Thus, you may add a line break like this:

<br></br>

The only problem with this is that it may mislead you into thinking that you should put something between those tags. XHTML allows you to write empty elements with a shortcut. In the following example, the slash just before the greater-than sign means “this is an opening and closing tag all wrapped up in one.” Notice the blank before the slash. It is necessary, for reasons described below.

<br />

Let’s add line breaks to complete this demonstration:

<p>
A primal termite knocked on wood,<br />
And tasted it, and found it good.<br />
And that is why your Cousin May<br />
Fell through the parlor floor today.
</p>

<p>
<cite>Ogden Nash</cite> was an American humorist,
best known for his short poems.
</p>

Two things to note about line breaks:

  1. We didn’t put one after the last line of the poem; we didn’t need to, because the end of the paragraph and the start of the next paragraph already gave us the line space we needed.
  2. We did not put a line break after the word humorist; this is one place where we want HTML to do word wrap for us.
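The same two guidelines apply to any text whose line endings matter. As a sketch, here is a made-up postal address (the name and street are placeholders): every line except the last ends with a line break, and the closing </p> supplies the space after the final line.

```html
<p>
Send your comments to:<br />
Example Publishing<br />
123 Main Street<br />
Anytown
</p>
```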

More About the Empty Element Shortcut

Why do we insist on a blank before the slash in a shortcut form like <br />? Older browsers, when confronted with <br/>, will not realize that the tag name ends with the letter R, but will think that the slash is part of the tag name, and ignore the tag as meaningless.

Use the shortcut form for empty elements like <br /> only with elements that can never contain text. Let’s say you had a level two heading with no text in it; this might happen if you know you want a heading but haven’t figured out what to say yet. You should write it like this: <h2></h2>. Do not ever write it like this: <h2 />. If you do, older browsers will see only an opening <h2> without a closing tag, and the rest of your document will be displayed as one long level two heading, much larger and bolder than you want.
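To summarize the two cases side by side, here is a short sketch with placeholder text. It shows only the forms recommended above.

```html
<!-- br can never contain text, so the shortcut form is appropriate.
     Note the blank before the slash. -->
First line<br />
Second line

<!-- A heading can contain text, so even while it is empty it keeps
     separate opening and closing tags. -->
<h2></h2>
```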