Introduction to the WWW
Introduction
The concept of 'hypertext' has been around a long time, and was first described by Vannevar Bush in 1945. A hypertext document allows the user to navigate through it in a non-linear or non-sequential fashion, by selecting parts of the text which are linked to other parts of the same document or other documents. Hypermedia is a hypertext system that is not restricted to text documents, but includes other media, effectively multimedia hypertext.
The WWW is a platform independent, distributed hypermedia system, allowing the user to access hypermedia documents stored on remote servers around the world using a range of different computing platforms.
History
The World Wide Web was developed by Tim Berners-Lee and Robert Cailliau of CERN Laboratories, Geneva, to allow particle physicists throughout Europe to share information. Initially a mainly text-based system, its cross-platform capability and ease of use ensured its uptake into the wider community and its continuing development. It has grown from 50 servers in January 1993 to the millions that now exist throughout the world, and is continuing to grow rapidly.
URLs
The URL (Uniform Resource Locator) specifies an object, such as a file, and how to access it. It is the 'address' of the file. It consists of two main parts, and takes the form:
access-method://address
where the access method is the protocol for retrieving the file. These include HTTP (Hypertext Transfer Protocol), used for HTML files, and FTP (File Transfer Protocol).
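As a minimal sketch of the two-part form described above, Python's standard `urllib.parse` module can split a URL into its access method and address. The host is the JISC address that appears in the HTML fragment in this document; the path `/index.html` is a hypothetical addition used only for illustration.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return (access_method, host, path) for a URL of the form access-method://address."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path

# The path "/index.html" is a hypothetical example, not a documented page.
method, host, path = split_url("http://www.jisc.ac.uk/index.html")
print(method)  # http
print(host)    # www.jisc.ac.uk
print(path)    # /index.html
```

The access method tells the client which protocol to speak; the rest of the URL says which machine to contact and which object to request from it.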
HTML
HTML (Hypertext Markup Language) is a simple markup language recognised by all WWW browsers. A standard HTML document is still the most effective way to widely disseminate information on the WWW. An HTML document consists of the normal text of the document, and markup tags that define elements of the document, such as title and paragraph.
<h1>Advisory Group on Computer Graphics</h1>
Advising UK Higher Education on Computer Graphics, Visualization, Multimedia and Virtual Environments.
<p>The Advisory Group on Computer Graphics (AGOCG) is an initiative of the <a href="http://www.jisc.ac.uk/">JISC</a> of the Higher Education Funding Councils and the Research Councils.</p>

The code fragment above shows a number of tags, including a heading <h1> and a paragraph <p>. All tags are enclosed within angle brackets, and most have a start and an end tag, e.g. <p>…</p>. The other tag shown in this fragment (<a href=...> ... </a>) creates a link to another page.
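To illustrate how software can pick out the tags in such a fragment, here is a small sketch using Python's standard `html.parser` module. The class name `TagLister` and the shortened fragment are illustrative assumptions; this is not how any particular browser is implemented.

```python
from html.parser import HTMLParser

class TagLister(HTMLParser):
    """Collect start tags and link targets found in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)  # the parser reports tag names in lower case
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# A shortened version of the fragment shown above.
fragment = (
    "<h1>Advisory Group on Computer Graphics</h1>"
    '<p>An initiative of the <a href="http://www.jisc.ac.uk/">JISC</a>.</p>'
)
lister = TagLister()
lister.feed(fragment)
lister.close()
print(lister.tags)   # ['h1', 'p', 'a']
print(lister.links)  # ['http://www.jisc.ac.uk/']
```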
Since web pages are just text files, they can be created with even the most basic text editor. However, a number of programs have been written to help with markup and to convert automatically between other formats and HTML.
HTML Editors
The simplest are text-based editors, such as HTML Notepad for the PC, which is an extension of Windows Notepad. Text is created or imported as in a standard text editor, then marked up using the additional pull-down menus. The display shows the text and the markup tags, and a separate browser is needed to view the finished HTML document. Increasingly, however, WYSIWYG HTML editors are becoming available, such as Microsoft's FrontPage and Adobe's PageMill. FrontPage also provides extensions that allow greater functionality to be incorporated into the web page, but only if the web server supports them. A number of other packages, e.g. Word 97, now allow you to save documents directly as HTML, providing an easy way to produce Web pages.
Conversion programs
There are many programs available to convert existing documents from various formats to HTML, including Word, WordPerfect, RTF (Rich Text Format) and FrameMaker. The success with which documents can be converted often depends on how they were written originally. For example, a word processor document in which headings were created using the heading styles feature is more likely to convert correctly than one in which the headings were generated by manually increasing the font size. The more 'structural' information of this nature in the original document, the better the conversion will be. A structured HTML document is also easier to maintain and more accessible to all users.
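The point about structural information can be sketched with a toy converter. It assumes a document is represented as a list of (style name, text) pairs and that the style names `Heading 1`, `Heading 2` and `Normal` exist; both the representation and the mapping are hypothetical and do not describe any real conversion program.

```python
from html import escape

# Hypothetical mapping from word-processor style names to HTML elements.
STYLE_TO_TAG = {"Heading 1": "h1", "Heading 2": "h2", "Normal": "p"}

def to_html(paragraphs):
    """Convert (style_name, text) pairs to HTML using only the style name."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")  # unknown styles fall back to a paragraph
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)

# Heading created with the heading style: the structure survives conversion.
styled = [("Heading 1", "Clients and Servers"),
          ("Normal", "The client is often called a browser.")]

# Heading created by enlarging the font of a Normal paragraph: the converter
# sees only "Normal", so the heading becomes an ordinary paragraph.
unstyled = [("Normal", "Clients and Servers"),
            ("Normal", "The client is often called a browser.")]

print(to_html(styled))
print(to_html(unstyled))
```

The first call emits an `<h1>` element for the heading; the second emits two `<p>` elements, because the font size carried no structural meaning the converter could use.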
It is always worth checking that your pages, however they are produced, display correctly in a range of browsers and contain only valid HTML. For a list of HTML editors and other tools, see the SIMA Report 'Software Tools for the World-Wide Web'.
Clients and Servers
Introduction
The WWW is based on a client-server architecture. The client, often called a browser, is the software that runs on the local machine allowing the user to view documents. The server is the software that delivers the information to the client. 'Server' is also used to mean the actual machine the server software runs on.
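The division of roles can be sketched with Python's standard library: a tiny server that delivers one fixed page, and a client that requests it. Everything here runs on the local machine for illustration; the page content and the names `PageHandler` and `fetch_once` are assumptions, and a real web server does far more than this.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<h1>Hello from the server</h1>"  # hypothetical page content

class PageHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET request with the same HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet

def fetch_once():
    """Start the server, act as a client for one request, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = response.status, response.read()
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()

status, body = fetch_once()
print(status, body.decode())
```

The client code knows nothing about how the page is stored; it only sends a request and receives a response, which is what allows clients and servers on different platforms to work together.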
Text browsers
Text-based browsers provide WWW access when a graphical interface is unavailable or impractical, for example for visually disabled users or on PDAs (Personal Digital Assistants) with small screens. However, they lack many of the functions of graphical browsers, and increasingly Web designers are making use of graphics as an integral part of their pages. One of the best known text-based browsers is Lynx, developed at the University of Kansas. Versions for Unix systems and DOS are available.
Graphical Browsers
These comprise the bulk of the clients currently in use. They can display not only text, but also handle images, and other file formats. The most popular browsers at the current time are Netscape and Internet Explorer, accounting for over 90% of the browsers in use, both of which are available on a range of platforms.
The WWW has evolved a great deal, and is continuing to develop. The latest browser versions provide many new features, including greater multimedia support and style sheets (for more details see the AGOCG briefing report 'Multimedia on the WWW'). However, it should be remembered that many users will still be using older browsers, and any HTML files should also be tested with these to make sure they are accessible to a wide audience.
Helper Applications and Plugins
Helper applications are external programs that display file types that the browser cannot handle. Browsers can be configured to automatically launch such applications when particular file types are encountered. Plugins are similar applications, but work within the browser, displaying the file within the browser window.
Increasingly, browsers are supporting a much wider range of file types internally, and the need for helper applications for common file types is decreasing. For example, Microsoft's Internet Explorer supports several image and sound formats, MPEG, AVI and QuickTime movie files, and VRML (Virtual Reality Modeling Language). Modern browsers also support Java, a platform-independent programming language.
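The choice between internal display, a plugin and a helper application amounts to a lookup on the file's content type. The sketch below assumes the type is already known (for instance from the server's response); the tables and program names are hypothetical and do not reflect the configuration of any real browser.

```python
# Hypothetical configuration tables.
INTERNAL_TYPES = {"text/html", "image/gif", "image/jpeg"}  # displayed by the browser itself
PLUGINS = {"model/vrml": "vrml_plugin"}                    # displayed inside the browser window
HELPERS = {"video/mpeg": "mpeg_player"}                    # launched as external programs

def choose_handler(content_type):
    """Return a short description of how a file of this type would be shown."""
    if content_type in INTERNAL_TYPES:
        return "browser"
    if content_type in PLUGINS:
        return "plugin: " + PLUGINS[content_type]
    if content_type in HELPERS:
        return "helper: " + HELPERS[content_type]
    return "save to disk"  # no handler configured for this type

print(choose_handler("text/html"))   # browser
print(choose_handler("model/vrml"))  # plugin: vrml_plugin
print(choose_handler("video/mpeg"))  # helper: mpeg_player
print(choose_handler("application/x-unknown"))  # save to disk
```

As browsers gain internal support for more formats, entries move from the helper table into the internal set, which is the trend described above.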
Accessing Information
The WWW continues to grow rapidly and has no central catalogue or cataloguing system, so finding information becomes harder as the amount available increases. There are a number of general directories, such as Yahoo (http://www.yahoo.co.uk/), which provide page listings by topic, but perhaps more useful are the search engines.
Search engines
Search engines are databases indexing the contents of large numbers of web pages, which are accessed over the WWW using a query form. The usefulness of an engine depends on how many pages it has indexed, what information it has indexed from those pages and how intelligent a query you can submit, e.g. whether it supports 'NEAR', 'AND', 'OR' and 'NOT'. Some of the most widely used search engines include Lycos (http://www.lycos.com/) and AltaVista (http://www.altavista.digital.com/), but it is always worth trying more than one, as their databases differ. Meta search engines, such as Metacrawler (http://www.metacrawler.com), allow you to search more than one engine with a single query. The query is submitted to the meta engine, which automatically passes it on to several search engines and collates the results.
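A rough sketch of how 'AND', 'OR' and 'NOT' queries can be answered: build an index from each word to the set of pages containing it, then combine the sets. The page names and contents are invented for illustration, and real search engines use far more elaborate indexing and ranking than this.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lower-cased word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

# Hypothetical pages and their text.
pages = {
    "a.html": "computer graphics and visualization",
    "b.html": "multimedia on the web",
    "c.html": "computer graphics for multimedia",
}
index = build_index(pages)

both = index["graphics"] & index["multimedia"]     # graphics AND multimedia
either = index["graphics"] | index["multimedia"]   # graphics OR multimedia
without = index["graphics"] - index["multimedia"]  # graphics NOT multimedia

print(sorted(both))     # ['c.html']
print(sorted(either))   # ['a.html', 'b.html', 'c.html']
print(sorted(without))  # ['a.html']
```

Because each engine indexes a different set of pages, the same query returns different sets from different engines, which is why trying more than one is worthwhile.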
Domain Names
Domain names are unique addresses on the Internet. Usually a company or organisation will have its own domain name, e.g. mcc.ac.uk (University of Manchester), and the addresses of specific machines within the organisation will end in that domain name. It is often possible to guess the correct URL given a limited amount of information: the first part of the domain name relates to the company or organisation name, and the last part indicates the type of organisation. For example, .ac.uk refers to a UK HE site and .co.uk to a UK commercial site. For a list of UK sub-domains see: http://www.nic.uk/domains/INDEX.HTMl
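The suffix rule described above can be sketched as a simple lookup. The table contains only the two suffixes mentioned in the text and is not a complete list; the host names `www.mcc.ac.uk` and `www.example.co.uk` are hypothetical machine names used for illustration.

```python
# Only the two suffixes named in the text; deliberately incomplete.
SUFFIXES = {
    ".ac.uk": "UK higher education site",
    ".co.uk": "UK commercial site",
}

def site_type(hostname):
    """Guess the type of organisation from the end of a host name."""
    host = hostname.lower().rstrip(".")
    for suffix, meaning in SUFFIXES.items():
        if host.endswith(suffix):
            return meaning
    return "unknown"

print(site_type("www.mcc.ac.uk"))      # UK higher education site
print(site_type("www.example.co.uk"))  # UK commercial site
print(site_type("www.example.org"))    # unknown
```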
Home Pages
Many departments now have their own web servers, and the departmental home page should name the web master, who you will need to contact to set up your own pages on that server. Once you have set up your own home page there are a number of points to bear in mind:
Good practice for the WWW
The following list is based on the SIMA report by Margaret Isaacs, 'Guide to good practices for WWW authors' (see Bibliography).
Codes of Practice
It is important to remember that intellectual property rights, defamation and data protection legislation all apply to the WWW as well as other forms of media. On top of this, most institutions will have their own codes of practice laying out acceptable practice, which you should read before setting up any web pages. Institutions are also bound by the JANET acceptable use policy, which governs the use of the links between the institutions. Usually, such codes of practice prohibit inclusion of (or direct links to):