Recommended HTML Usage
This part of the HTML specification
discusses recommended usage. These constructs should work even on
pretty broken implementations.
Structure of an HTML document
An HTML document should start with a TITLE
element.
Elements of the Body
Most text elements consist of a start tag, some content, and an end
tag.
Some elements are "empty" and consist of only a start tag. For
example, paragraphs are separated by the P element, which is
just a P tag.
Six levels of headings are supported:
Level three heading
Level four heading
five
six
Unordered lists:
- This is the first item of an unordered list.
- This is the second item. It's kinda long, and should wrap around
on most screens.
- This is the third item. It's only one paragraph, but it's got a
paragraph tag at the end.
- This is the fourth and final item.
Ordered lists:
1. This is the first item of an ordered list.
2. This is the second item. It's kinda long, and should wrap around
on most screens.
3. This is the third item -- you know, the one with the P element.
4. This is the fourth and final item.
- term
- definition
- another term
- and its definition
The address element indicates the author or source of the document.
DWC
connolly@convex.com
Other Elements
The TITLE element names the document. The content of the TITLE element
is raw character data. It should be less than 72 characters, and it
should contain no linebreaks, '<', '>', or '&' characters.
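The recommendations for TITLE content can be checked mechanically. The following is a minimal sketch in Python; the function name `is_recommended_title` is hypothetical and is not part of HTML or of this specification.

```python
def is_recommended_title(title: str) -> bool:
    """Return True if the text follows the recommendations for TITLE content."""
    # Recommended: fewer than 72 characters.
    if len(title) >= 72:
        return False
    # Recommended: no line breaks and none of the markup characters.
    forbidden = ("\n", "\r", "<", ">", "&")
    return not any(ch in title for ch in forbidden)


print(is_recommended_title("Recommended HTML Usage"))
print(is_recommended_title("Tags & Names"))
```

The first call passes the check; the second fails it because the title contains '&'.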
ISINDEX
Indicates the document is searchable.
Anchors
A span of text can be marked as an anchor, for example:
Fred Flintstone
Click here to view a neighbor
document.
Tags
A start tag is a name surrounded by angle brackets. An end tag has a
slash before the name.
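As an illustration of the tag syntax, the following Python sketch builds start and end tags from a name. The helper names `start_tag`, `end_tag`, and `element` are hypothetical; the sketch does not handle attributes or escaping of the content.

```python
def start_tag(name: str) -> str:
    # A start tag is the name surrounded by angle brackets.
    return "<" + name + ">"


def end_tag(name: str) -> str:
    # An end tag has a slash before the name.
    return "</" + name + ">"


def element(name: str, content: str) -> str:
    # A non-empty element: start tag, some content, end tag.
    return start_tag(name) + content + end_tag(name)


print(element("TITLE", "Recommended HTML Usage"))
```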
Names
A name should be a letter followed by letters and/or digits. Case is
not significant.
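The rule for names can be sketched as a pattern match. This Python example assumes that "letter" means an ASCII letter, which the text does not state explicitly; `is_recommended_name` and `same_name` are hypothetical helpers.

```python
import re

# Assumption: one ASCII letter, followed by any number of ASCII letters or digits.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_recommended_name(name: str) -> bool:
    # fullmatch requires the whole string to fit the pattern.
    return NAME_PATTERN.fullmatch(name) is not None


def same_name(a: str, b: str) -> bool:
    # Case is not significant, so compare case-insensitively.
    return a.lower() == b.lower()


print(is_recommended_name("H1"), is_recommended_name("1H"))
print(same_name("title", "TITLE"))
```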
Normal Text Content
Normal text in HTML is parsed for markup. The characters '<', '>',
and '&' should be treated as special
characters, lest they be interpreted as markup.
Lines should not exceed 72 characters. Line breaks have no
significance except to separate words.
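Because line breaks only separate words, normal text can be re-wrapped freely without changing its meaning. The following Python sketch shows one way to keep source lines within 72 characters; `rewrap` and `same_words` are hypothetical helpers, and the sketch leaves any single word longer than the width on a line of its own.

```python
import textwrap


def rewrap(text: str, width: int = 72) -> str:
    # Treat every run of spaces and line breaks as a single word separator.
    words = text.split()
    # Re-break the text without splitting words, so the word sequence is kept.
    return textwrap.fill(
        " ".join(words),
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )


def same_words(a: str, b: str) -> bool:
    # Two texts are equivalent here if they contain the same words in order.
    return a.split() == b.split()


source = "Line breaks have no\nsignificance   except to\nseparate words. " * 5
wrapped = rewrap(source)
print(wrapped)
print(same_words(source, wrapped))
```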
Literal Text Content
Sections of literal text are represented in HTML as replaceable
character data, RCDATA. Most markup is ignored in RCDATA.
Line breaks are significant, and characters are rendered in a
fixed-width font to preserve horizontal formatting.
This is literal text. THIS word
should line up under THIS word.
There should be exactly three blank lines between here
and here.
The characters '&', '<', and '>' should be treated as special characters.
SGML tags look like <start> and </end>.
The marked section close delimiter looks like ]]>.
But ]] is just two close square brackets, and
> is just a greater-than sign.
Characters that are used for markup can be represented by entity
references. Entity references are written:
&name;
The following special characters are used in HTML:
- <
- lt
- >
- gt
- &
- amp
- "
- quot
- '
- apos
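The table above can be applied mechanically when writing text that contains markup characters. The following Python sketch uses only the five entity names listed; the function names `escape` and `unescape` are hypothetical, and references to names outside the table are left unchanged.

```python
import re

# Character-to-entity-name table, taken from the list above.
ENTITIES = {"<": "lt", ">": "gt", "&": "amp", '"': "quot", "'": "apos"}
# Reverse table: entity name to character.
CHARACTERS = {name: ch for ch, name in ENTITIES.items()}


def escape(text: str) -> str:
    # Replace each special character with an entity reference, written &name;
    # Working one character at a time avoids escaping an '&' twice.
    return "".join(
        "&" + ENTITIES[ch] + ";" if ch in ENTITIES else ch for ch in text
    )


def unescape(text: str) -> str:
    # Replace each known &name; reference with its character in a single pass.
    def replace(match):
        return CHARACTERS.get(match.group(1), match.group(0))

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


print(escape("a < b & c"))
print(unescape("&lt;TITLE&gt;"))
```

Escaping and then unescaping returns the original text, since each special character maps to exactly one entity name.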