Guide to good practices for WWW authors
As in all other aspects of providing information on the Web, the choice of the format for documents should relate to the needs and resources of the target user group.
The normal format is HTML, the native language of the Web.
HTML is recognised by all Web browsers on all platforms; it allows authors to define the structure of their documents and to incorporate images, tables, forms and multimedia file types. The most effective way to disseminate information widely via the Web is to use standard HTML (http://www.w3.org/pub/WWW/MarkUp/html-spec/).
HTML is an application of SGML, the Standard Generalized Markup Language. SGML, an ISO standard, is a sophisticated system for developing markup languages, enabling the definition of different types of document which, once defined and recognised, can be automatically processed and displayed. In this way, SGML facilitates interchange of documents within a community.
In the SGML system, the formal definition of a document type is provided by the 'Document Type Definition' or 'DTD', which is normally stored separately from the document. The DTD provides a list of the element types and their relationships, as well as the entities and attributes allowed in the document. For instance, the HTML syntax is formally defined by the HTML DTD.
Within any SGML-compliant document, information on the document type should be given in a document type declaration right at the beginning of the document, e.g.
<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
In this example, the declaration indicates that this is an HTML document and that the HTML document type is publicly declared. Within the quoted public identifier, IETF (Internet Engineering Task Force) is the owner identifier of the standard, DTD is the public text class, HTML 2.0 is the public text description of the DTD, and EN indicates that the language of the DTD is English.
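The breakdown above can be illustrated with a short Python sketch that splits the example declaration into its parts. This is only an illustration, not an SGML parser: the helper name parse_doctype and the dictionary keys are assumptions made for this example, and the code handles only the simple PUBLIC form shown above.

```python
import re

# Hypothetical helper for illustration only; handles just the simple
# <!DOCTYPE name PUBLIC "..."> form used in the example above.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<name>\S+)\s+PUBLIC\s+"(?P<identifier>[^"]+)"\s*>',
    re.IGNORECASE,
)


def parse_doctype(declaration):
    """Split a DOCTYPE declaration into the parts of its public identifier."""
    match = DOCTYPE_RE.search(declaration)
    if match is None:
        return None
    # Expected shape: registration//owner//class description//language
    parts = match.group("identifier").split("//")
    if len(parts) != 4:
        return None
    registration, owner, public_text, language = parts
    # The public text starts with its class (here "DTD"), followed by
    # a free-text description (here "HTML 2.0").
    text_class, _, description = public_text.partition(" ")
    return {
        "document_type": match.group("name"),
        "registration": registration,  # "-" marks an unregistered owner
        "owner": owner,
        "text_class": text_class,
        "description": description,
        "language": language,
    }


info = parse_doctype('<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">')
for key, value in info.items():
    print(key, "=", value)
```

A declaration that does not follow this shape makes the helper return None rather than guess at the missing parts.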
A 'Doctype' declaration is recommended for SGML conformance.
The use of a DOCTYPE declaration enables SGML-compliant browsers to recognise the type of document and use the appropriate DTD, and thus to process and display the document correctly, provided that the standard referred to is stable. (Note that at the time of writing this document, the HTML 3.0 standard is not agreed upon or stable.)
Documents of any type can be served via the WWW and, even though HTML is the norm, other formats can be used where appropriate. For instance, where marking up in HTML is not practicable, documents can be offered in plain text format. At the other end of the scale, some documents may need to meet exacting standards of display which are not feasible with HTML. It is quite legitimate for information providers to ask themselves whether HTML is the right format for their users.
For instance, many users in the typography and print industry need high-quality display of fonts and graphics, and a high degree of page fidelity. In this case, a format such as Portable Document Format (PDF) from Adobe is likely to be more appropriate than HTML.
A browser can be manually configured to recognise a specific file type and to launch the application for viewing it, e.g. Acrobat Viewer for PDF. Similarly, word-processor or spreadsheet files may be delivered via the Web. The needs of the user group are the prime factor to consider.
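The kind of lookup a browser performs when it meets a non-HTML file can be sketched in Python. This is a rough illustration under stated assumptions: the HELPERS table, the BUILT_IN set, the viewer command names and the choose_handler function are all hypothetical, and real browsers normally act on the media type sent by the server rather than on the file name alone.

```python
import mimetypes

# Hypothetical helper table: media type -> viewer command. The command
# names are placeholders, not real configuration values.
HELPERS = {
    "application/pdf": "pdf-viewer",
    "application/msword": "word-processor",
}

# Types this imaginary browser can display without outside help.
BUILT_IN = {"text/html", "text/plain"}


def choose_handler(filename):
    """Return (media_type, action) for a file, judged by its extension."""
    media_type, _encoding = mimetypes.guess_type(filename)
    if media_type is None:
        return (None, "save to disk")
    if media_type in BUILT_IN:
        return (media_type, "display in browser")
    helper = HELPERS.get(media_type)
    if helper is None:
        return (media_type, "save to disk")
    return (media_type, "launch " + helper)


for name in ("index.html", "report.pdf"):
    print(name, "->", choose_handler(name))
```

Every user who wants to view such a file has to add the matching entry to the helper table locally, which is why this approach suits a known user group better than a wide audience.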
Consider whether HTML meets the requirements of your user group.
The decision is not a simple one, however. In opting for a format other than HTML, information providers should be aware of the access limitations implicit in the use of proprietary formats, and also that users may be wary of launching executables locally because of the associated security risks. See the chapter on Extending WWW in Brian Kelly's Running a World-Wide Web Service, September 1995.
Local configuration of browser preferences demonstrates the extensibility of the WWW, but in the long run this is not a real solution to the complex challenges generated by the increasing proliferation of file types on the Web. The standards need to provide a sound mechanism for coping with many file formats universally. We can expect to see proposals and work proceeding rapidly on this front, for instance in the area of content negotiation between server and browser.
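One way content negotiation can work is sketched below in Python, assuming the browser states its preferences in an HTTP Accept header with optional q (quality) values and the server picks the best format it has available. The function names and the sample header are assumptions made for this illustration; this is a simplified sketch, not a complete implementation of the HTTP rules.

```python
def parse_accept(header):
    """Return a list of (media_type, quality) pairs from an Accept header."""
    preferences = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((media_type, quality))
    return preferences


def quality_for(media_type, preferences):
    """Look up the client's quality for a type; most specific match wins."""
    main_type = media_type.split("/")[0]
    for pattern in (media_type, main_type + "/*", "*/*"):
        for candidate, quality in preferences:
            if candidate == pattern:
                return quality
    return 0.0


def negotiate(available, accept_header):
    """Pick the available media type the client prefers, or None."""
    preferences = parse_accept(accept_header)
    best_type, best_quality = None, 0.0
    for media_type in available:
        quality = quality_for(media_type, preferences)
        if quality > best_quality:  # ties keep the server's own ordering
            best_type, best_quality = media_type, quality
    return best_type


available = ["text/html", "application/pdf", "text/plain"]
print(negotiate(available, "application/pdf, text/html;q=0.8, */*;q=0.1"))
```

With the sample header, a browser that can display PDF receives the PDF version, while one that only lists text/html would be sent the HTML version of the same document.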