Today multimedia facilities are provided in many different ways. These are often incompatible or at least require a significant effort to reconfigure them into a usable form. This leads to a mixture of presentation formats and qualities that often distract the user from the content of the material which they are studying. You have to have the right facilities to be able to use a particular set of material (which is likely to be unique so the costs are high).
The concern is that the current situation will continue or get worse. Lack of standards leads to a loss of existing material, which may then have to be written again because it is often too expensive to convert to more portable formats. There is a real danger that packages with multiple authors give an uncoordinated and muddled interface to the reader, since the skills to make a consistent interface are not always available.
Possible solutions are emerging as formats seem to be sorting themselves out. Standards such as GIF and JPEG for images, MPEG for moving pictures and AVI for interleaved audio and video are all helping the situation. The problem is often at the application level: we need standards for multimedia authoring.
It is critical that we address issues of portability across machine ranges and also the use of "open" formats which have some future-proofing. The community continues to use closed, proprietary standards, and this gives suppliers no encouragement to publish their specifications.
The formats chosen need to reflect users' needs. The industry decides what it thinks users need; perhaps we could influence this.
The objective of SIRSA is to establish a multi-technology shared image resource for the mass storage and retrieval of still frames and motion video sequences. The image resource will have fast access through SuperJANET for synchronous (real-time) applications, for example, lectures, surgical teaching and video-conferencing. Visual material will be contributed by the HE community.
Since SIRSA is a shared resource, the selection of formats for images and video is particularly important. Visual material will normally originate from HE sites and will be networked to the server in compressed digital formats. The aim will be to accept a range of input formats which are flexible and not prescriptive. This is a pragmatic approach based on the view that contribution of material is more likely from the HE community (and hence more beneficial to the community at large) if translation is not required at source.
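Accepting a range of input formats without requiring translation at source implies that the server can recognise what it has been sent. The following is a minimal sketch of one common way to do this, by inspecting the leading bytes of a file. It is not a description of the SIRSA implementation: the function names and the choice of JPEG and MPEG as the server's target formats are assumptions made purely for illustration.

```python
from typing import Optional

# Well-known leading byte signatures for some of the formats named in the text.
SIGNATURES = [
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),           # little-endian TIFF
    (b"MM\x00*", "TIFF"),           # big-endian TIFF
    (b"\x00\x00\x01\xba", "MPEG"),  # MPEG program stream pack header
    (b"\x00\x00\x01\xb3", "MPEG"),  # MPEG video sequence header
]


def sniff_format(header: bytes) -> Optional[str]:
    """Return the format whose signature starts the given bytes, or None."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def needs_translation(header: bytes, server_formats=("JPEG", "MPEG")) -> bool:
    """True if a contributed file is not already in one of the server formats.

    The default server_formats tuple is a hypothetical choice, not taken
    from the project.
    """
    return sniff_format(header) not in server_formats
```

With these assumptions, a contributed GIF would be accepted as it is and flagged for translation on the server, while a JPEG would be stored unchanged.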
Within SIRSA, translation (where necessary) will be targeted to a more limited set of formats for which very fast decoders are available (for example as low cost PC cards). This is to assist the primary real-time objective of the system. Non-time-critical access through browsers (e.g. WWW Mosaic) will also be implemented.
The more general issues being considered are not, of course, limited to the UK HE community. Digital image libraries are being established world-wide, usually based upon GIF, JPEG and MPEG formats. Other formats such as Photo-CD and Fractal Transform (e.g. MS Encarta CD-ROMs) are in use. With SuperJANET, the UK has a wideband digital network capable of delivering large image files in real-time. In choosing a set of file formats, we can consider trade-offs between image quality, file size and encoding/decoding time. There will also be significant application-related requirements: for example, the acceptable compression losses for X-ray images and for fine art pictures may be very different, and are also likely to be very subjective.
The project will provide the following:
The package being produced provides detailed reference information on a large number of image processing operators, together with example images for each operator showing typical input/output and illustrating typical problems. There are also student exercises, bibliographies and references to common image processing packages.
After some thought a two-level package was designed. For people who have the hardware to support it (and that should be most people) the reference comes in the form of a hypertext document written using HTML and displayed using NCSA's WWW browser, Mosaic. This approach allows lots of cross referencing and is ideal for a reference document. It also allows example images to be displayed using simple mouse clicks on the appropriate hypertext links. The Mosaic software is also freely available.
For people who want a hardcopy version of the document, a LaTeX version is produced which can be read separately. It is still intended though that the example images (which are stored in GIF format) will be displayed using a computer screen.
A converter tool (latex2html) was considered for turning LaTeX into HTML, but was rejected, mainly for portability reasons. Instead a simple, high-level and very specialised intermediate document markup language was designed, which is converted into both LaTeX and HTML versions by a Perl program.
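The single-source approach described above can be sketched as follows. This is an illustration only, written in Python rather than Perl: the `@section` and `@image` directives are a hypothetical toy markup invented for the example, not the project's actual intermediate language, and the output is far simpler than a real reference document would need.

```python
import html

# Characters that must be escaped in LaTeX running text.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
}


def latex_escape(text):
    """Escape LaTeX special characters one at a time (no double escaping)."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def parse(source):
    """Turn the toy markup into a list of (kind, text) pairs."""
    items = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("@section "):
            items.append(("section", line[len("@section "):]))
        elif line.startswith("@image "):
            items.append(("image", line[len("@image "):]))
        else:
            items.append(("para", line))
    return items


def to_html(items):
    """Render parsed items as HTML, linking to example images."""
    out = []
    for kind, text in items:
        if kind == "section":
            out.append(f"<h2>{html.escape(text)}</h2>")
        elif kind == "image":
            out.append(f'<a href="{html.escape(text)}">Example image</a>')
        else:
            out.append(f"<p>{html.escape(text)}</p>")
    return "\n".join(out)


def to_latex(items):
    """Render parsed items as LaTeX, naming images to be viewed on screen."""
    out = []
    for kind, text in items:
        if kind == "section":
            out.append(f"\\section{{{latex_escape(text)}}}")
        elif kind == "image":
            out.append(f"See example image \\texttt{{{latex_escape(text)}}} on screen.")
        else:
            out.append(latex_escape(text))
    return "\n\n".join(out)
```

Because both renderers work from the same parsed items, the hypertext and hardcopy versions cannot drift apart, and each renderer handles its own escaping rules.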
The main image format which is likely to be used is GIF, since that seems to be widely used, provides lossless compression, is fairly compact, and is accepted by a large number of image processing software packages. It is also the format required by Mosaic for inline images. However, the final choice has yet to be made. Other possibilities include TIFF and JPEG. Consideration is also being given to various image format conversion packages to be distributed with the final release.
One issue which came up in discussion was whether the available bandwidth should influence the format used for transmission. We can use different formats for archiving and for delivery across networks.
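The relationship between bandwidth and delivery format is simple arithmetic: transfer time is file size in bits divided by link speed in bits per second. The sketch below applies that to choosing a delivery variant while the archive copy stays untouched. The variant names, file sizes, link speeds and time limit are all hypothetical figures chosen for illustration.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s):
    """Time to send a file over a link, ignoring protocol overhead."""
    return size_bytes * 8 / bandwidth_bits_per_s


def pick_delivery_format(variants, bandwidth_bits_per_s, max_seconds):
    """Pick the best-quality variant that arrives within max_seconds.

    variants is a list of (name, size_in_bytes) ordered from highest to
    lowest quality. If nothing fits, the last (smallest) variant is used.
    """
    for name, size in variants:
        if transfer_seconds(size, bandwidth_bits_per_s) <= max_seconds:
            return name
    return variants[-1][0]


# Hypothetical variants of one image: a lossless archive copy and two
# lossy delivery copies.
VARIANTS = [
    ("lossless", 3_000_000),
    ("jpeg_high", 400_000),
    ("jpeg_low", 80_000),
]
```

Under these assumed figures, `pick_delivery_format(VARIANTS, 64_000, 15)` selects `jpeg_low` for a 64 kbit/s link, while the same call with a 10 Mbit/s link selects the `lossless` copy.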
As a national archive we have certain requirements for the properties of any stored information that we hold: these include longevity, ease of distribution and independence from platforms, devices or software. The latter would of course greatly facilitate the first two requirements.
The first question we have been asking ourselves is whether we should convert this textual information (including tabular elements) into images, or whether it should be converted at the character level, either by keyboarding or by OCR scanning. In either case, the second question is what format the conversion should produce.
More recently, documentation has been sent to us by the research community in word-processed form, mostly WordPerfect. This is unproblematic if the users of our holdings are also using WordPerfect, but will cause difficulties if we make these documents available on-line.
Another aspect of the archive's holdings is teaching data sets. These could become an important resource for courseware designers. For that to happen, archive data documentation would need to be in a format suitable for porting into a variety of courseware authoring packages.
Having recognized some of the problems, we have been considering the question of file formats in some detail. SGML was the first possibility we considered and it appeared to offer some solutions to our problems. Given its basis in ASCII, longevity is assured, and the idea behind it is of course software, device and platform independence. There are, however, drawbacks. As mentioned above, many of the documents we receive are in WordPerfect format, and the codes applied to these documents could be considered, in some instances, to be analogous to markup. For instance, one would expect headings in the printed document to look the same, that is, to have the same font and size properties. However, what becomes apparent from looking at the coding is that it is often very inconsistent: headings, subheadings, tables and other elements of the documents use different coding to arrive at the same appearance.
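The inconsistency problem can be made concrete with a small sketch. The formatting codes below (`BOLD`, `SIZE:n`, `STYLE:name`) are a hypothetical, much simplified stand-in for real word-processor codes, and the mapping from appearance to structural role is likewise an assumption; the point is only that differently coded runs must first be resolved to the appearance they produce before any structural markup can be inferred.

```python
# Hypothetical style sheet: what each named style resolves to.
STYLES = {"Heading1": {"bold": True, "size": 14}}

# Hypothetical mapping from resolved appearance to structural element.
ROLES = {(True, 14): "heading", (False, 10): "para"}


def appearance(codes):
    """Resolve a list of formatting codes to the (bold, size) a reader sees."""
    look = {"bold": False, "size": 10}  # assumed document defaults
    for code in codes:
        if code == "BOLD":
            look["bold"] = True
        elif code.startswith("SIZE:"):
            look["size"] = int(code.split(":", 1)[1])
        elif code.startswith("STYLE:"):
            look.update(STYLES[code.split(":", 1)[1]])
    return (look["bold"], look["size"])


def to_structure(runs):
    """Label each (text, codes) run with a structural role, or 'unknown'."""
    return [(ROLES.get(appearance(codes), "unknown"), text) for text, codes in runs]
```

Here a heading coded with a named style and another coded by hand with bold and a size change resolve to the same appearance and so receive the same structural label. Anything that matches no known appearance is reported as `unknown`, which is where manual checking would still be needed.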
A secondary problem with SGML is persuading people to use it. There would need to be a fundamental change in the way people perceive documents, from an appearance-oriented to a structure-oriented view. In many cases there would also need to be an increase in resources, since input speed would fall in order to ensure accuracy in marking up documents, given the tendency towards inconsistency noted above. Substantial investment would also need to be made where organizations have to mark up existing documents retrospectively.
Perhaps Adobe Acrobat is the answer: simply take any of your existing documents, output them as PostScript and distill them into PDF. The problem of inconsistency, however, just will not go away.
Issues which need to be addressed include: