Introduction

The Internet and the World Wide Web

The Internet is a collection of computers, all connected to one another. It got its start in 1966 as the ARPANET, a project of the Defense Department’s Advanced Research Projects Agency. It was later opened up to university researchers, who for many years used it primarily for transferring email and files and for exchanging information via newsgroups. Each computer that is connected to the network is called an Internet host. In 1986 there were 5,000 hosts and 241 newsgroups.

In 1990, Tim Berners-Lee came up with the idea of having a large number of documents, all of which could be linked together and refer to one another. He called this set of documents the World Wide Web (WWW). Each person or organization’s collection of documents was called a website. The idea caught on quickly. In 1993 there were 600 websites. (The Internet itself was still growing; there were now two million Internet hosts.) In 1995 there were 100,000 websites. Today there are literally millions of websites, and the number of Internet hosts has grown as well.

Clients and Servers

Each Internet host can act as a server, or a repository of files. Some hosts hold email files, some hold newsgroup postings, others hold web pages. A server’s purpose in life is to wait for requests to come in from some client, find the requested file or page, and send it back to the client. A client is usually an end user’s computer.

Think of it as a store. You’re the client; you go in and ask the server for a particular item. The server goes back to where all the products are stored, prepares the item, and brings it out to you.

Similarly, when you sit at your computer, you request a particular file. That request goes to the appropriate server, which finds the file and sends it back to you. The software that you use depends on the kind of files you want. If you are requesting web pages, you use a browser. If you are requesting email, you use a program designed for reading mail. If you’re reading newsgroup posts, you may use a special news reader program. For your convenience, the latest browsers have combined all these functions into various sections of a single program.

In the first assignment for this course, you will set up an account for yourself on a Windows machine, which will be the machine that you use as a client. You will also set up an account on a UNIX machine which will serve files to you.

How To Specify a File

See pages 15 and 16 of HTML and CSS. As an example, in http://www.evc.edu/calendar/index.html, the means of access, also called the scheme, is http (followed by the separator ://). The network location, also called the host, is www.evc.edu, and the path is /calendar/index.html.

Note that, when you’re in a browser, you don’t have to type the http://, or sometimes even the www.; the browser will automatically fill those in for you.

HTML

Before the WWW, people exchanged just plain text files or they exchanged files in some proprietary format. One of Berners-Lee’s objectives was to create a way to write files that would improve upon plain text, but be open to everyone. That method is HTML, which stands for HyperText Markup Language.

Markup

[Figure: typewritten data with red-pen markup]

Markup comes from the bad old days before word processors. If you needed a brochure, you’d type it on a typewriter, and then literally mark it up with a red pen to tell the typesetter what you wanted it to look like. The typesetter would follow your instructions and return a finished document to you:

How to Buy a Wrench

There are two kinds of wrenches: wrenches with fixed size, and adjustable wrenches.

In this instance, the markup is used not only to show how text should be presented (italic versus normal text), but also to tell how the document is structured: some of the words form a heading, the other words are just ordinary text.

Of course, we’re not going to use a red pen to mark up our text files; instead, we’re going to use HTML elements to tell the browser what our document’s structure and presentation should be.
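
As a rough sketch of where this is headed, the wrench brochure could be marked up with HTML elements along the following lines. The choice of the h1, p, and i elements, and the choice of which words are italic, are assumptions made for illustration; the original red-pen markup may have differed.

```html
<!-- Illustrative only: the element choices and italic words are assumptions -->
<h1>How to Buy a Wrench</h1>

<p>
    There are two kinds of wrenches: wrenches with
    <i>fixed</i> size, and <i>adjustable</i> wrenches.
</p>
```

The h1 and p elements describe structure (a heading and a paragraph), while the i element describes presentation (italic text).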

Hypertext

In addition to elements that let you specify how a document should be structured and presented, HTML has tags that let you tell how your document should be linked to other documents on the WWW. This ability to link documents together is called hypertext. The term was invented in 1965 by Ted Nelson, but the idea itself has been around since 1945.
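
A minimal sketch of a hypertext link, using the address from the earlier example, might look like this. The link text and the surrounding sentence are made up for illustration.

```html
<!-- The href attribute holds the address of the document being linked to -->
<p>
    See the <a href="http://www.evc.edu/calendar/index.html">calendar</a>
    for important dates.
</p>
```

In a browser, the word "calendar" becomes the clickable part of the sentence.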

HTML, XHTML, and HTML5

The rules for how you write proper HTML (before version 5) are defined by the Standard Generalized Markup Language (SGML) rulebook. In recent years, a new rulebook known as XML, the Extensible Markup Language, has emerged. It is not as powerful as the SGML rulebook, but it is far simpler. When we write HyperText Markup Language according to the rules of SGML, we call it HTML. When we write the same elements according to the rules of XML, we call it XHTML.

The newest version, HTML5, lets you write your markup in either HTML or XHTML syntax. In this course, we will write only HTML5 documents, and will use the XHTML syntax, because the rules are more consistent.
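
To make the difference concrete, here is one paragraph written both ways. This is only a sketch: the file name logo.png and the text are hypothetical, and the first version shows just a few of the shortcuts that the HTML syntax permits.

```html
<!-- HTML syntax: tag names may be upper case, some closing tags may be
     omitted, and simple attribute values may go unquoted -->
<P>First line<BR>second line <IMG SRC=logo.png ALT=Logo>

<!-- XHTML syntax: lower-case names, every element closed,
     every attribute value in quotes -->
<p>First line<br />second line <img src="logo.png" alt="Logo" /></p>
```

Both versions are displayed the same way; the XHTML version simply follows one uniform set of rules.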

Don’t get bothered when you see old HTML pages, read other books about HTML, or see pages written in HTML5 using the HTML syntax. They aren’t doing things wrong; they are just writing according to a different set of rules than the ones that we will use. Also, do not worry about the fact that we are using a “newer model”; we will write pages so that they will be accepted by older browsers that have never heard of XHTML or HTML5.

HTML Documents

HTML documents consist of text with tags. Tags are commands written between less than (<) and greater than (>) signs, also known as angle brackets. Here’s an example of the tag that tells the browser to use boldface text:

This is a <b>special message</b> for you!

The opening tag, closing tag, and the content in between are collectively called an HTML element. Technically, tags and elements are very different, but most people use the terms interchangeably.
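
The pieces of the boldface example can be labeled as follows. The second line, which nests an italic element inside the bold one, is an added illustration and not part of the original example.

```html
<!--
    <b>                        opening tag
    special message            content
    </b>                       closing tag
    <b>special message</b>     the whole element
-->
This is a <b>special message</b> for you!

<!-- Elements can contain other elements; close the inner one first -->
<p>This is a <b>very <i>special</i> message</b> for you!</p>
```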

Things to remember about tags:

The Document Template

What we’ll do now is build a template HTML5 file: a “framework” that can be copied and filled in. All our HTML5 files will be built on this framework.

  1. Open a new file, and type the opening and closing <html> tags. The <html> element is a structure element; it tells the browser that the entire document lies between these two tags. Leave some blank lines between the opening and closing tags; you’ll be putting things in between them later.
    <html>
    
    </html>
  2. Now add this to the opening tag; it lets the browser know that the primary language of this document is English:
    <html xml:lang="en" lang="en">
    
    </html>
  3. Business letters have a head and a body. The head gives identifying information (who it’s from, whom it’s to, the date, etc.). The body contains the actual content of the letter. Similarly, our HTML documents will also have a head and a body. Add the opening and closing <head> and <body> tags to your template:
    <html xml:lang="en" lang="en">
    <head>
    
    </head>
    
    <body>
    
    </body>
    </html>
  4. Put a title in the head of the document. This is identifying information. Some search engines use it to index your files. If you bookmark a page, the title of the document is used as the bookmark text. Add a <title> tag:
    <html xml:lang="en" lang="en">
    <head>
        <title>Put a Title Here</title>
    </head>
    
    <body>
    
    </body>
    </html>

You’ll notice that we’ve used indenting to make it very clear where tags are in relation to one another. The browser doesn’t care how nicely indented your HTML is, but your instructor does. We don’t indent the <head> and <body> elements; it makes the entire file look “cleaner.”

Since it is the World Wide Web, not all web pages are in English. You can specify, for example, that a page is written entirely with Russian or Chinese characters. In our case, we’ll add the <meta> element to specify that we are using a “universal” character set called Unicode.

<html xml:lang="en" lang="en">
<head>
    <title>Put a Title Here</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body>

</body>
</html>

Note: there is a simpler way to specify the character set, but it does not work on all browsers. The <meta> element given here works on all modern browsers.
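
The note does not say which simpler form it has in mind; a likely candidate is the short charset attribute that HTML5 introduced, sketched here for comparison only. The template in these notes keeps the longer form.

```html
<!-- Shorter HTML5 form; older browsers may not recognize it -->
<meta charset="utf-8" />
```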

To get the best of all possible worlds, a document that can be processed either as XHTML or as HTML5, you must add a namespace declaration; it lets XML processing tools know which elements are part of HTML. This is useful when you have a document that contains markup from several different markup languages.

<html xml:lang="en" lang="en"
    xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Put a Title Here</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body>

</body>
</html>
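
As a hedged illustration of why namespaces matter, the body of a page might mix HTML with SVG, a separate markup language for drawings. The drawing below is a made-up example, not part of the course template; it shows a second namespace declaration marking where the other language begins.

```html
<body>
    <p>A circle drawn with SVG, a different markup language:</p>

    <!-- This xmlns tells XML tools that svg and circle belong to SVG, not HTML -->
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
        <circle cx="50" cy="50" r="40" />
    </svg>
</body>
```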

The template is almost finished. You need to add two things at the beginning of the document. The first is a Document Type Declaration, usually called the DOCTYPE. (Do not confuse it with a Document Type Definition, or DTD, which is the set of rules that older declarations pointed to.) The DOCTYPE must be the very first line in your document; do not leave a blank line before it. The DOCTYPE is not a tag! It is a declaration, which tells the browser precisely which “flavor” of HTML the document uses. Modern browsers use the declaration to decide whether to display the page in standards mode or in a backward-compatible “quirks” mode. When using HTML5, the document type declaration is very simple.

<!DOCTYPE html>

<html xml:lang="en" lang="en"
    xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Put a Title Here</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body>

</body>
</html>

Finally, add an HTML comment to tell who wrote the file. You put comments in a document for the benefit of other humans who will be reading it. The browser ignores them completely. Your comments (and they can be more than one line long) go between the opening <!-- and the closing -->. A comment is not a tag, so it has its own rules.

<!DOCTYPE html>

<!-- Written by E.G. Valley, 4 Sep 2010 -->

<html xml:lang="en" lang="en"
    xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>Put a Title Here</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body>

</body>
</html>
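
To show how the template might be used, here is a sketch of a copy that has been filled in. The title and body content reuse the wrench example from the Markup section, the h1 and p elements are assumptions for illustration, and the comment has been stretched over several lines to show that this is allowed.

```html
<!DOCTYPE html>

<!--
    Written by E.G. Valley, 4 Sep 2010
    A comment may span several lines, like this one.
-->

<html xml:lang="en" lang="en"
    xmlns="http://www.w3.org/1999/xhtml">
<head>
    <title>How to Buy a Wrench</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>

<body>
    <h1>How to Buy a Wrench</h1>
    <p>There are two kinds of wrenches: wrenches with fixed size,
    and adjustable wrenches.</p>
</body>
</html>
```

Only the title and the contents of the body change from page to page; everything else is copied from the template as is.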