Introduction to HTML

From Juneday education

This chapter introduces the very basics of HTML - HyperText Markup Language.

HTML is a text markup language used on the web. It is typically viewed with a web browser, which is a program for navigating the web and parsing and displaying documents written in HTML. Using tags to mark up content, HTML will represent a "page" with structure and elements.

Most of a page written in HTML consists of text, but HTML can also include images, sound and even video. Using embedded content written in certain web programming languages, a web page can also contain applications with interactive capabilities, like games and user interfaces. Client-side logic can also be achieved by incorporating (or referencing) JavaScript, a scripting language originally developed for adding programmable logic to (otherwise static) web pages.
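As a rough sketch of how such content might be embedded, the fragment below uses the img, audio, video and script elements. The file names (logo.png, jingle.ogg, intro.webm) and the id "greeting" are made up for illustration and do not come from this chapter.

```html
<!-- Hypothetical file names; replace them with files that exist next to the page -->
<img src="logo.png" alt="Site logo">

<!-- The controls attribute asks the browser to show play/pause buttons -->
<audio src="jingle.ogg" controls></audio>
<video src="intro.webm" controls width="320"></video>

<p id="greeting">This text is replaced if JavaScript is enabled.</p>
<script>
  // Client-side logic: runs as soon as the browser reaches this script
  document.getElementById("greeting").textContent = "Hello from JavaScript!";
</script>
```

The script is placed after the paragraph so that the paragraph already exists when the script looks it up.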

Another key component of a web page is the hyperlink. A hyperlink is a clickable object (typically text, but an image can also serve as a link) that takes the reader to another document or to another place in the same document.
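The two kinds of link could look something like this. The address https://example.com/ and the file name button.png are placeholders chosen for this sketch, not addresses used by this book.

```html
<!-- A text link: the clickable part is the text between the tags -->
<a href="https://example.com/">Visit the example site</a>

<!-- An image link: the clickable part is the image (hypothetical file name) -->
<a href="https://example.com/"><img src="button.png" alt="Visit the example site"></a>
```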

Now, a web page is meant to be rendered by a browser (or some other application with HTML capabilities) and it has to be written using the HTML markup language. However, this is not a web design course (and the authors are certainly not web developers or designers!). We include this introduction to HTML in our Web basics book purely for orientation, since we believe that a basic understanding of the web also requires basic knowledge of HTML and related topics.

Structure of an HTML document

The document starts with a doctype declaration. In HTML5 it looks like this:

<!DOCTYPE html>

In HTML4, an example of a doctype declaration could be:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

HTML prior to HTML5 refers to a DTD in the doctype. A DTD is a document type definition, a formal grammar (set of rules) for the markup language version in question. The one above is for HTML 4.01 with strict rules.

After the doctype declaration follows a hierarchy of "tags". Tags are written using < and >. A tag usually comes as a pair of a "start tag" and an "end tag" with some content (or other tags) in between. The hierarchy comes from the fact that tags can be nested inside each other, like a tree structure. The root element is the root of the tree, and inside it you put branches and leaves. The root element in HTML is <html>. The end tag matching <html> is </html> (which is also the last tag in the document). Most tags work like this: you open with <sometag>, add some content, and close with </sometag>.
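Nesting means that an inner element must be closed before the element that contains it. The fragment below is only an illustration (the choice of the p and em elements is ours); browsers usually try to recover from overlapping tags, but the result is not something to rely on.

```html
<!-- Correct: em is opened and closed inside p -->
<p>Some <em>important</em> text.</p>

<!-- Incorrect: the tags overlap instead of nesting -->
<p>Some <em>important text.</p></em>
```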

Inside the root element, we find the <head> element (which can contain, for instance, a <title> element). After the <head> element follows the <body> element, which holds the content that is displayed in the browser window.

Here's a small but complete document:

<!DOCTYPE html>
<html>
  <head><title>This is the page title</title></head>
  <body>
    <h1>This is a level one header</h1>
    <p>This is a paragraph of text with a <a href="">link to Juneday</a>.</p>
  </body>
</html>

You can save the above markup in a file called example.html and open it in a browser to investigate how the browser will display the content. You can download the file here (use e.g. wget to download it).

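As you see in the example above, you may use indentation to make the structure more visible. White space is generally ignored in HTML: the browser collapses any run of spaces, tabs and line breaks into a single space when it displays the text.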
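A small experiment, assuming the browser's default styling, shows the effect. Both paragraphs below should be displayed as the same single line of text:

```html
<!-- These two paragraphs are displayed identically with default styling -->
<p>White space is collapsed.</p>

<p>
  White      space
  is collapsed.
</p>
```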

More about tags

Opening tags, closing tags and attributes

So, a tag usually comes in a pair with an opening tag, some content (which might be text or other tags) and a closing tag, like <title>This is the title</title>. Tags have a name (specified in the HTML specification) but can also have "attributes". Attributes are metadata about the tag; each one defines some property of it.

For instance, the so-called anchor tag for links has an attribute defining the target of the link (the destination for those clicking on the link):

<a href="">This is the link text</a>

The attribute has a name and a value. The value follows the equal sign after the name and should be placed inside double quotes.
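A tag can carry several attributes, and a few tags consist of attributes only. The sketch below uses a placeholder address and a hypothetical file name (photo.jpg); it is meant as an illustration rather than a recommended set of attributes.

```html
<!-- Two attributes on one tag, separated by white space -->
<a href="https://example.com/" title="Shown as a tooltip in many browsers">Example</a>

<!-- img has no content and no closing tag; everything is given as attributes -->
<img src="photo.jpg" alt="Text shown if the image cannot be loaded" width="200">
```

The img element is one of the exceptions to the "pair of tags" pattern: it has nothing to enclose, so it is written as a single tag.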

Structural elements

There are many tags for purely structuring the document. We'll list a few of them here.


Headers give your document a logical structure. They have levels, where level one headers (<h1>Top level header</h1>) are for the top level section headers of your text. A header for a subsection would then be <h2>Subsection header</h2>, followed by level 3, level 4 and so on down to level 6. The names of the header tags start with an "h" followed by the number of the level:

<h1>This is the top level header</h1>
<p>A paragraph of text for the top level section of your text</p>
<h2>This is a level 2 subsection header</h2>
<p>This is some text for the subsection</p>

The top level header will be rendered larger by the browser. Each following level is rendered progressively smaller (to some degree; rendering varies between browsers). The structural elements (tags) express logical structure, and how they affect the rendered page varies between browsers.


Paragraphs of text are created with the "p tag":

<p>This is a paragraph of text.</p>
<p>This is another paragraph of text.</p>


You can create structural lists using the ol tag (for ordered lists) or the ul tag (for unordered lists). The elements of a list are created using the li tag:

<ul>
  <li>This is element one of an unordered list</li>
  <li>This is element two of the list</li>
  <li>And this is element three</li>
</ul>

<ol>
  <li>This list is automatically ordered (numbered)</li>
  <li>This will be numbered 2 then</li>
  <li>And this will be numbered 3!</li>
</ol>

Figure: Lists as rendered by Google Chrome
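Since tags can be nested, a list element may itself contain another list. The following sketch (with list contents invented for the example) puts an ordered list inside the first element of an unordered list:

```html
<ul>
  <li>Fruit
    <!-- The inner list must be closed before the li that contains it -->
    <ol>
      <li>Apples</li>
      <li>Pears</li>
    </ol>
  </li>
  <li>Vegetables</li>
</ul>
```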


Source code with the examples

Further reading