AGOCG logo
This report is also available as an Acrobat file.

4 HTML


About HTML

Native documents on the World-Wide Web are written in HTML, the HyperText Markup Language. HTML defines the structural elements in a document (such as headers, citations and addresses), layout information (bold and italics), the use of inline graphics, and hypertext links.

A simple HTML document is illustrated in Figure 4-1.


<TITLE>The World-Wide Web</TITLE>
<H1>About The World-Wide Web</H1>
<P>The World-Wide Web is a <EM>distributed multimedia
hypertext</EM> system.</P>
Figure 4-1 A Simple HTML Document.

Structural elements in the document are identified by start and end markup tags. For example, the <TITLE> and </TITLE> tags are used to specify the title of the document, which is often displayed by a client. The <H1> and </H1> tags are used to define a first level heading. Clients will normally display headers differently from the body text: for example, a graphical client could display the header using a larger or different font, whereas a text-based client could display a header as centred text or in all capitals.

Figure 4-1 also illustrates the <EM> container. Text held in the container (which is defined by the <EM> start tag and the </EM> end tag) will be emphasised in some way. A graphical browser could render the emphasised text by displaying it in italics, whereas a browser with audio capabilities for the visually impaired could render the emphasis by a change in the tone of the voice output.
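The way a text-based client might apply these rendering choices can be sketched in Python. This is only an illustration: the TextRenderer class, the render function and the use of asterisks to mark emphasis are assumptions made for the example, not the behaviour of any particular browser.

```python
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    """Render a small subset of HTML (TITLE, H1, P, EM) as plain text lines."""

    def __init__(self, width=60):
        super().__init__()
        self.width = width
        self.lines = []          # finished output lines
        self._buffer = []        # text collected for the current block
        self._in_heading = False
        self._in_title = False

    def _flush(self):
        # Collapse runs of white space, as HTML clients do, then emit the block.
        text = " ".join("".join(self._buffer).split())
        self._buffer = []
        if not text:
            return
        if self._in_heading:
            # A text-based client could show a header centred and in capitals.
            text = text.upper().center(self.width).rstrip()
        self.lines.append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("h1", "p"):
            self._flush()
            self._in_heading = (tag == "h1")
        elif tag == "em":
            self._buffer.append("*")   # assumed marker for emphasis

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("h1", "p"):
            self._flush()
            self._in_heading = False
        elif tag == "em":
            self._buffer.append("*")

    def handle_data(self, data):
        # The title is not part of the displayed body text.
        if not self._in_title:
            self._buffer.append(data)


def render(markup, width=60):
    renderer = TextRenderer(width)
    renderer.feed(markup)
    renderer.close()
    renderer._flush()
    return renderer.lines
```

Passing the markup of Figure 4-1 to render returns two lines: the heading in centred capitals and the paragraph with the emphasised phrase enclosed in asterisks.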

Figure 4-1 also shows the paragraph container. It is important to understand that the <P> tag is the start of a paragraph container and is no longer a paragraph separator (as many people mistakenly believe). If the </P> end tag is omitted, the next <P> tag implies it. In future versions of HTML it will be possible to specify paragraph attributes: for example <P ALIGN=CENTER>.
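The container behaviour of <P> can be illustrated with a short Python sketch. The ParagraphCollector class and the paragraphs function are hypothetical names introduced for this example; the sketch simply treats a new <P> start tag as ending any paragraph that is still open.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect paragraph text, treating <P> as a container with an optional end tag."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []    # text of each finished paragraph
        self._current = None    # pieces of the open paragraph, or None

    def _close_paragraph(self):
        if self._current is not None:
            self.paragraphs.append(" ".join("".join(self._current).split()))
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._close_paragraph()   # a new <P> implies </P> for the open one
            self._current = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._close_paragraph()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def close(self):
        super().close()
        self._close_paragraph()       # end of document also ends the paragraph


def paragraphs(markup):
    collector = ParagraphCollector()
    collector.feed(markup)
    collector.close()
    return collector.paragraphs
```

With this sketch, markup which omits the </P> end tags yields the same list of paragraphs as markup which includes them.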

Although browsers will display the HTML document shown in Figure 4-1, for reasons of performance and upwards compatibility it is strongly recommended that HTML documents contain additional elements including the <HTML>, <HEAD> and <BODY> tags, as shown in Figure 4-2.


<HTML>
<HEAD>
<TITLE>The World-Wide Web</TITLE>
</HEAD>
<BODY>
<H1>About The World-Wide Web</H1>
<P>Information about the World-Wide Web is available 
<A HREF="http://info.cern.ch/hypertext/WWW/TheProject.html"> at
CERN</A>.</P>
</BODY>
</HTML>
Figure 4-2 A Simple HTML Document.

The <HTML> container is used to define the extent of the HTML document. Within the HTML document there are two other containers: <HEAD> and <BODY>. The <HEAD> container provides information about the document itself. This can include the title of the document (as illustrated), copyright information, keywords and expiry dates (for use by caching software). It is important to make use of the <HEAD> tag since, for example, an automatic indexing program which wishes to index the title of HTML documents need parse only the information contained in the <HEAD> container. If the <HEAD> container is not present the entire document may have to be parsed, which will place unnecessary extra load on the server.
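A minimal Python sketch of such an indexing program is given below. It reads a document line by line and stops as soon as the </HEAD> end tag has been seen, so the body is never read. The names TitleIndexer and index_title are assumptions made for this illustration and do not refer to any existing indexing software.

```python
from html.parser import HTMLParser


class _HeadFinished(Exception):
    """Raised internally to stop parsing once the head has been read."""


class TitleIndexer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = None
        self._in_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            raise _HeadFinished       # no </HEAD> was given; stop at the body

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            self.title = "".join(self._parts).strip()
        elif tag == "head":
            raise _HeadFinished       # everything of interest has been seen

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)


def index_title(lines):
    """Return (title, lines_read) for an iterable of text lines."""
    indexer = TitleIndexer()
    lines_read = 0
    try:
        for line in lines:
            lines_read += 1
            indexer.feed(line)
    except _HeadFinished:
        pass
    return indexer.title, lines_read
```

Applied to the document in Figure 4-2, the sketch finds the title after reading only the first four lines.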

Figure 4-2 also illustrates the use of the anchor <A> container, which is used to provide hypertext links. In the example, the text "at CERN", which is contained between the <A> and </A> tags, will be highlighted in some way by the browser. Selecting this highlighted phrase will cause the client to send a request for http://info.cern.ch/hypertext/WWW/TheProject.html. This request will use the HTTP protocol and will be sent to the server running on the system at info.cern.ch.
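The way a client breaks the URL into a protocol, a server and a document can be sketched with Python's standard urllib.parse module. The describe_request function is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlsplit


def describe_request(url):
    """Split a URL into the parts a client needs in order to make a request."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,     # how to talk to the server
        "server": parts.hostname,     # which system to contact
        "document": parts.path,       # what to ask the server for
    }


request = describe_request("http://info.cern.ch/hypertext/WWW/TheProject.html")
```

For the link in Figure 4-2 the protocol is http, the server is info.cern.ch and the document is /hypertext/WWW/TheProject.html.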

HTML Authoring Tools

Initially information providers on the World-Wide Web used standard editors such as vi and emacs to create HTML documents. As the WWW grew in popularity, authoring tools were developed to assist information providers. This section describes the following authoring tools which are available for the Microsoft Windows environment: HTML Assistant, HTML Hyperedit, HTMLEd, InContext Spider and HotDog.

HTML Assistant

HTML Assistant is a simple authoring tool which can be used to create and edit HTML documents. A list of Frequently Asked Questions about HTML Assistant is available at the URL http://cs.dal.ca/ftp/htmlasst/htmlafaq.html. HTML Assistant is available at the URL ftp://ftp.cica.indiana.edu/pub/pc/win3/misc. In the UK it is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/editing/ms-windows/html-assistant


Figure 4-3 HTML Assistant.

HTML Hyperedit

HTML Hyperedit (which was developed using the Toolbook authoring system) not only provides an environment for producing HTML documents, but also contains a tutorial which gives an introduction to HTML. HTML Hyperedit is available at the URL ftp://info.curtin.edu.au/pub/internet/mswindows/hyperedit. In the UK it is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/editing/ms-windows/win-htmledit


Figure 4-4 HTML HyperEdit

HTMLEd

HTMLEd is a simple authoring tool which can be used to create HTML documents. In the UK it is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/editing/ms-windows/


Figure 4-5 HTMLEd.

InContext Spider

InContext Spider is a more sophisticated HTML authoring tool for Microsoft Windows which provides support for HTML 3 features, such as tables.


Figure 4-6 InContext Spider.

Further information about InContext Spider is available at the URL http://www.incontext.ca/

HotDog

HotDog is another sophisticated authoring tool for Microsoft Windows which provides support for HTML 3 features, such as tables and forms, as well as Netscape's HTML extensions.


Figure 4-7 HotDog

Word Processing Tools

HTML Assistant and HTML Hyperedit are self-contained authoring tools. Another approach is to develop authoring tools which work within a word processing environment. These tools are normally implemented as macros for popular word processing packages, such as Word For Windows or WordPerfect. This section describes four tools which have been developed for use within Word For Windows: the GT_HTML, CU_HTML and ANT_HTML macros, and Microsoft's Internet Assistant.

Word processing tools have the advantage that they provide a consistent environment for existing users of word processors. However they do have their disadvantages. Because they are normally implemented as macros, they can be very slow, especially when used with large or complicated documents. There is also a danger that HTML markup which is embedded as hidden text could cause conflicts with other word processing tools if, for example, the word processed document was used by other users.

GT_HTML

One of the first word processing macros which could be used to create HTML documents was the GT_HTML macro. This macro, written for Word For Windows, was developed at the Georgia Tech Research Institute. In the UK the software is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/editing/macros/ms-winword


Figure 4-8 The GT_HTML Macro.

CU_HTML

CU_HTML is a template designed to work within Word For Windows. The template was written by Anton Lam (mailto:anton-lam@cuhk.hk). The software is available at the URL ftp://ftp.cuhk.hk/pub/www/windows/util


Figure 4-9 The CU_HTML Macro.

ANT_HTML

ANT_HTML is a template designed to work within Word For Windows 6.0. The template was written by Jill Swift (mailto:jswift@freenet.fsu.edu). The software is available at the URL ftp://ftp.einet.net/einet/pc/ANT_HTML.ZIP


Figure 4-10 The ANT_HTML Macro.

Internet Assistant

Internet Assistant is the name of Microsoft's template designed to work within Word For Windows. The software is available at the URL http://www.microsoft.com/


Figure 4-11 Internet Assistant

Browser Editing Tools

Another approach to editing HTML documents is provided by browsers which are integrated with editing tools. The Arena browser enables an external editor to be invoked to edit the displayed HTML document. Figure 4-12 illustrates the Arena browser used in conjunction with the Emacs editor.


Figure 4-12 Editing A Document From Arena.

HTML Document Conversion Tools

Authoring tools are normally used to create new HTML documents. Document conversion tools, on the other hand, can be used to convert existing documents to HTML format.

LaTeX2html

One of the first sophisticated document conversion tools to be developed was the LaTeX2html conversion program. This program was written by Nikos Drakos, Computer Based Learning Unit, University of Leeds. It set the standard for document converters, providing a wide range of features including:

Figure 4-13 illustrates a document which has been converted by the LaTeX2html conversion program.


Figure 4-13 A Document Converted Using LaTeX2html.

LaTeX2html is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/translators/latex2html. Further information is available at the URL http://cbl.leeds.ac.uk/nikos/doc/www94/www94.html

RTFtohtml

The RTFtohtml conversion program enables RTF files (which can be produced by word processing packages such as Word For Windows) to be converted to HTML. The program was written by Chris Hector (Cray) based on RTF parsing software developed by Paul DuBois.

RTFtohtml is available as a command line tool for a number of Unix platforms. In addition an Apple Macintosh implementation is available. A beta version of an MS-DOS implementation was announced in November 1994.

An extension of the RTFtohtml program is known as RTFtoweb. This provides a number of additional features, including creation of hypertext links at user defined section breaks. Figure 4-14 illustrates a document entitled "Exploring The World-Wide Web Using Mosaic For Windows", which is available at the URL http://www.leeds.ac.uk/ucs/docs/tut50/tut50.html


Figure 4-14 Document Converted Using RTFtoweb.

In Figure 4-14 it should be noted that the document is automatically split into a number of files and that a hypertext table of contents is automatically generated. Navigation chevrons (Next and Back), which can be used to move to the next or previous section, are also generated automatically.
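The kind of markup a converter has to generate for this navigation can be sketched as follows. This is not RTFtoweb's actual output or algorithm: the function names, the file names and the exact markup are assumptions chosen for illustration.

```python
import html


def navigation_links(files, index):
    """Return the Back and Next anchors for the section at position index."""
    links = []
    if index > 0:
        links.append('<A HREF="%s">Back</A>' % files[index - 1])
    if index < len(files) - 1:
        links.append('<A HREF="%s">Next</A>' % files[index + 1])
    return " ".join(links)


def table_of_contents(sections):
    """Build a hypertext contents list from (file name, heading) pairs."""
    items = [
        '<LI><A HREF="%s">%s</A>' % (name, html.escape(heading))
        for name, heading in sections
    ]
    return "<UL>\n" + "\n".join(items) + "\n</UL>"
```

The first section receives only a Next link and the last section only a Back link; every section in between receives both.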

Further information about RTFtohtml is available at the URL ftp://ftp.cray.com/src/WWWstuff/RTF/rtftohtml_overview.html. The software is available at the URL ftp://ftp.cray.com/src/WWWstuff/RTF/latest/. In the UK it is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/translators/rtftohtml

RTFtoweb is available at the URL ftp://ftp.rrzn.uni-hannover.de/pub/unix-local/misc/rtftoweb/html/rtftoweb.html

HTML Quality Tools

The HTML specification states that "HTML parsers should be liberal except when verifying code. HTML generators should generate strictly conforming HTML." Put simply this means that browsers should be capable of displaying documents which contain invalid HTML, but HTML authoring tools and document converters should generate HTML which conforms strictly to the standard.
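The difference between the two roles can be sketched in Python. The liberal reader below extracts the text of a document whatever the state of its markup, whereas the strict checker reports end tags which are missing or unexpected. Both are simplified illustrations: the names are hypothetical, and the sets of empty elements and of elements with optional end tags are assumptions which a real validator would take from the HTML DTD.

```python
from html.parser import HTMLParser

OPTIONAL_END = {"p", "li", "dt", "dd"}   # end tag may be omitted (assumed subset)
EMPTY = {"br", "hr", "img"}              # elements with no content or end tag


class LiberalReader(HTMLParser):
    """Behaves like a browser: keeps the text and ignores markup errors."""

    def __init__(self):
        super().__init__()
        self.text = []

    def handle_data(self, data):
        self.text.append(data)


class StrictChecker(HTMLParser):
    """Behaves like a validator: records start and end tags that do not pair up."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in EMPTY:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in EMPTY:
            return
        if tag not in self.open_tags:
            self.errors.append("unexpected </%s>" % tag.upper())
            return
        # Pop back to the matching start tag, reporting anything left unclosed.
        while True:
            top = self.open_tags.pop()
            if top == tag:
                break
            if top not in OPTIONAL_END:
                self.errors.append("missing </%s>" % top.upper())


def display_text(markup):
    reader = LiberalReader()
    reader.feed(markup)
    reader.close()
    return " ".join("".join(reader.text).split())


def check(markup):
    checker = StrictChecker()
    checker.feed(markup)
    checker.close()
    leftover = [t for t in checker.open_tags if t not in OPTIONAL_END]
    return checker.errors + ["missing </%s>" % t.upper() for t in leftover]
```

Given the markup <H1>About <EM>the Web</H1>, display_text still returns the readable text, while check reports that the </EM> end tag is missing.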

A number of tools are available for validating HTML documents. Some popular ones are described below.

HoTMetaL

HoTMetaL is an HTML authoring tool and validator. It will provide feedback if it encounters invalid HTML, as illustrated in Figure 4-15.


Figure 4-15 HoTMetaL.

HoTMetaL is available for the X and Microsoft Windows platforms. Two versions of the software are available: a public domain version and a licensed version. HoTMetaL Pro, the licensed version, can be used to import and validate an existing document. The public domain version will give an error and refuse to load a document which contains invalid HTML.

HoTMetaL is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/Mosaic/html/hotmetal

Weblint

A tool called weblint can be used to check HTML documents for invalid markup. This software is available from the URL ftp://ftp.khoros.unm.edu/pub/perl/www/weblint-1.000.tar.gz. In the UK it is available at the URL ftp://src.doc.ic.ac.uk/packages/WWW/tools/weblint

SGMLS

sgmls is a tool which can be used to validate SGML documents. It is available at the URL ftp://sgml1.ex.ac.uk/pub/SGML/sgmls/. The sgmls software is used in a number of HTML validation services, such as those described below. Information on installing sgmls and also psgml (an SGML mode for emacs) is available at the URL http://web.nexor.co.uk/users/mak/doc/html/sgml-lib/html-sgml.html

HTML Validation Service

An HTML validation service is available at the URL http://www.hal.com/~markg/WebTechs/validation-form.html. This service makes use of HTML forms and a CGI script which runs an HTML validation program. The service can be used to check HTML syntax by entering the HTML markup to be checked. It can also be used to check an existing HTML document by entering the URL of the document.


Figure 4-16 HTML Validation Service.

A variation on this service is available at the URL http://www.cc.gatech.edu/grads/j/Kipp.Jones/HaLidation/validation-form.html

These services make use of the sgmls validation program.

The software can be installed on your local Unix system. It is available at the URL ftp://ftp.hal.com/pub/CGI/check-html.tar.Z

HTML Check Toolkit

The HTML Check Toolkit is another HTML validation program. The software can be installed using a WWW browser. The installation service, illustrated below, is based on the EIT Webmaster Starter's Kit. HTML Check Toolkit is available at the URL http://www.hal.com/~markg/HaLSoft/html-check/


Figure 4-17 Installing The Check_HTML Script.

Review of HTML Tools

Before choosing HTML authoring tools, document converters or quality tools for institutional use the following issues should be considered:

Support: Who wrote the software - an experienced software developer or a student as part of a computer project? Will the software continue to be developed and supported?

Quality: Does the software produce valid HTML?

Functionality: What facilities does the software provide?

Other Issues: If the software is based on a word processing package, what happens if the word processed document needs to be used by another word processor?

Writing Style

Writing styles for WWW documents are still developing. However there are a number of guidelines which can be provided:

Finding Out More About HTML

This document does not provide an in-depth tutorial on HTML. Many WWW resources are available which give details on writing HTML. Some of these are listed below:

