Multimedia in the Teaching Space
2.7 DIGITAL VIDEO COMPRESSION
Digital video compression is one of the key issues in video coding, enabling efficient interchange and distribution of visual information. New applications in communication, multimedia and digital television broadcasting require highly efficient and robust digital video compression and encoding techniques. Integrating motion video into the multimedia environment is technologically one of the most demanding tasks because of the high data-rates and real-time constraints.
The rate required to encode a full-motion video signal at VHS quality has come down from 20Mbps to well below 1Mbps. For a typical head-and-shoulders videoconferencing application the data rates are substantially lower, e.g. 0.128Mbps.
The common video compression standards currently available are MPEG-1, MPEG-2, H.261, and H.263.
MPEG-1 refers to the delivery of video for a CD-ROM quality presentation.
MPEG-2 refers to broadcast quality compressed video, and would work with HDTV.
MPEG-3 was targeted at HDTV; however, it was discovered that MPEG-2 could handle HDTV.
H.261 is the most widely used international video compression standard, but as the bandwidth of communications links increases it is likely to be superseded. The standard covers bandwidths from 64Kbps to 2Mbps.
The H.263 coding standard is oriented to videoconferencing, and is a descendant of the motion-compensated DCT methodology used to support the existing standards H.261, MPEG-1 and MPEG-2.
H.263 has emerged as a high-compression standard for moving images, and does not focus exclusively on very low bit-rates. The improvements in H.263 compared with H.261 are mainly obtained by improvements to the motion compensation scheme.
Technology developments show an increase in interactivity in all kinds of application, especially audio-visual ones. Currently much interactivity is restricted to synthetic content, and natural audio-visual content is not supported by existing standards, nor is the combination of natural and synthetic content. Future multimedia applications will require the implementation of standards that cover these types of application.
2.7.2 Compression Algorithms:
There are a number of standards in common use in the multimedia and videoconferencing domain. These will be discussed in turn as follows:
H.261 is a video coding standard published by the ITU (International Telecommunication Union) in 1990. It was designed for data-rates which are multiples of 64Kbps. These data-rates are suitable for ISDN (Integrated Services Digital Network) lines. H.261 is the most widely used international video compression standard, but as the bandwidth of communications links increases it is likely to be superseded. The standard covers bandwidths from 64Kbps to 2Mbps. H.261 is used in conjunction with other control standards such as H.221, H.230 and H.242.
Picture Formats Supported
Uncompressed bit-rates are in Mbps.
Format  Lum. pixels  Lum. lines  H.261 support  H.263 support  10 f/s grey  10 f/s colour  30 f/s grey  30 f/s colour
QCIF    176          144         Yes            Yes            2.0          3.0            6.1          9.1
CIF     352          288         Optional       Optional       8.1          12.2           24.3         36.5
The encoder only works on non-interlaced pictures. The pictures are coded in terms of their luminance and two colour components. H.261 supports two image resolutions: QCIF, which is 176 x 144 pixels, and optionally CIF, which is 352 x 288 pixels.
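To give a feel for how much compression these channel rates imply, the following minimal sketch divides the uncompressed colour bit-rates from the table above by a few H.261 channel rates. The function names are hypothetical, and the upper limit of 30 multiples of 64Kbps is an assumption chosen to match the ceiling of roughly 2Mbps quoted in the text; 1Mbps is taken as 1000Kbps.

```python
# Uncompressed colour bit-rates (Mbps), copied from the picture format table.
UNCOMPRESSED_COLOUR_MBPS = {
    ("QCIF", 10): 3.0,
    ("QCIF", 30): 9.1,
    ("CIF", 10): 12.2,
    ("CIF", 30): 36.5,
}


def channel_rates_kbps(max_multiple=30):
    """Channel rates as multiples of 64 Kbps.

    max_multiple=30 is an assumption that gives a top rate of 1920 Kbps,
    close to the 2 Mbps ceiling mentioned in the text.
    """
    return [p * 64 for p in range(1, max_multiple + 1)]


def required_compression_ratio(uncompressed_mbps, channel_kbps):
    """How many times smaller the coded stream must be to fit the channel."""
    return uncompressed_mbps * 1000.0 / channel_kbps


rates = channel_rates_kbps()
for (fmt, fps), mbps in UNCOMPRESSED_COLOUR_MBPS.items():
    for kbps in (rates[0], rates[1], rates[-1]):
        ratio = required_compression_ratio(mbps, kbps)
        print(f"{fmt} at {fps} frames/s over {kbps} Kbps needs about {ratio:.0f}:1")
```

Even the smallest format at the lowest frame rate needs a reduction of several tens to one on a single 64Kbps channel, which is why the frame rate and picture size are usually kept low on ISDN links.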
H.263 is a video coding standard published around 1995-6. It was designed for low-bit-rate communications, and early drafts specified data-rates of less than 64Kbps. However, this limitation was removed, and it is expected that the standard will be used over a wide range of bit-rates and will eventually replace H.261.
H.263 differs from H.261 in the following ways:-
It uses half-pixel precision for motion compensation, where H.261 used full-pixel precision. Some parts of the hierarchical structure of the data-stream are now optional, so that the CODEC can be configured for a lower bit-rate or better error recovery.
There are four negotiable options to improve performance:
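The benefit of half-pixel precision can be illustrated with a small block-matching sketch. This is not the normative H.263 procedure: it assumes that samples at half-pixel positions are formed as rounded averages of the two or four neighbouring pixels, uses an exhaustive search with a sum of absolute differences (SAD) cost, and all function names and the test pattern are hypothetical.

```python
def sample_half_pel(frame, y2, x2):
    """Reference sample at position (y2 / 2, x2 / 2), given in half-pixel units.

    Whole-pixel positions return the pixel itself; half positions return the
    rounded average of the 2 or 4 surrounding pixels (an assumed simplification).
    """
    y, x = y2 // 2, x2 // 2
    frac_y, frac_x = y2 % 2, x2 % 2
    a = frame[y][x]
    if not frac_y and not frac_x:
        return a
    if not frac_y:
        return (a + frame[y][x + 1] + 1) // 2
    if not frac_x:
        return (a + frame[y + 1][x] + 1) // 2
    return (a + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1] + 2) // 4


def sad(cur, ref, by, bx, size, dy2, dx2):
    """SAD between a block of `cur` and the displaced block of `ref`."""
    total = 0
    for r in range(size):
        for c in range(size):
            predicted = sample_half_pel(ref, 2 * (by + r) + dy2, 2 * (bx + c) + dx2)
            total += abs(cur[by + r][bx + c] - predicted)
    return total


def best_vector(cur, ref, by, bx, size, search, step):
    """Exhaustive search over +/- `search` pixels.

    step=2 visits whole-pixel displacements only; step=1 adds half-pixel ones.
    Returns (cost, dy, dx) with the displacement in pixels.
    """
    best = None
    for dy2 in range(-2 * search, 2 * search + 1, step):
        for dx2 in range(-2 * search, 2 * search + 1, step):
            cost = sad(cur, ref, by, bx, size, dy2, dx2)
            if best is None or cost < best[0]:
                best = (cost, dy2 / 2, dx2 / 2)
    return best


# Synthetic reference frame (8 rows x 9 columns) and a current frame that is
# the reference displaced by half a pixel horizontally.
ref = [[10 * x + 3 * y * y for x in range(9)] for y in range(8)]
cur = [[sample_half_pel(ref, 2 * y, 2 * x + 1) for x in range(8)] for y in range(8)]

print("full-pixel search:", best_vector(cur, ref, 2, 2, 4, 1, step=2))
print("half-pixel search:", best_vector(cur, ref, 2, 2, 4, 1, step=1))
```

In this synthetic case the half-pixel search finds a displacement with zero residual, whereas the whole-pixel search is left with a prediction error that would have to be coded, which is the intuition behind the improved compression.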
1. Unrestricted Motion Vectors
2. Syntax-based arithmetic coding
3. Advanced prediction.
4. Forward and backward frame prediction similar to MPEG.
H.263 is supposed to be capable of providing the same quality as H.261 at half the bit-rate. H.263 also supports 5 resolutions, which enables it to compete with the MPEG standards:-
Picture Formats Supported
Uncompressed bit-rates are in Mbps; a dash means the format is not supported.
Format  Lum. pixels  Lum. lines  H.261 support  H.263 support  10 f/s grey  10 f/s colour  30 f/s grey  30 f/s colour
SQCIF   128          96          -              Yes            1.0          1.5            3.0          4.4
QCIF    176          144         Yes            Yes            2.0          3.0            6.1          9.1
CIF     352          288         Optional       Optional       8.1          12.2           24.3         36.5
4CIF    704          576         -              Optional       32.4         48.7           97.3         146.0
16CIF   1408         1152        -              Optional       129.8        194.6          389.3        583.9
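The uncompressed bit-rates in the table can be reproduced approximately with a few lines of arithmetic. The sketch below assumes 8 bits per sample and, for colour, two chrominance components each sampled at half the luminance resolution horizontally and vertically (so colour costs 1.5 times grey); neither assumption is stated in the table itself, and the function name is hypothetical.

```python
# Luminance dimensions (pixels per line, lines) for each picture format.
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def uncompressed_mbps(width, height, fps, colour=False, bits_per_sample=8):
    """Raw bit-rate in Mbps (1 Mbps = 1,000,000 bits/s).

    Assumes 8-bit samples and, for colour, two chrominance planes at half
    resolution in both directions.
    """
    samples = width * height
    if colour:
        samples += 2 * (width // 2) * (height // 2)
    return samples * bits_per_sample * fps / 1e6


for name, (w, h) in FORMATS.items():
    row = [
        uncompressed_mbps(w, h, fps, colour)
        for fps in (10, 30)
        for colour in (False, True)
    ]
    print(name, " ".join(f"{v:7.1f}" for v in row))
```

Under these assumptions the results agree with the tabulated figures to within rounding.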
MPEG-1 is currently in 5 parts:-
1. Systems - addresses the problem of combining one or more data streams from the video and audio parts of the MPEG-1 standard with timing information to form a single stream. Once combined into a single stream, the data are well suited to digital storage and transmission.
2. Video - specifies a coded representation that can be used for compressing video sequences to bit-rates around 1.5Mbps. This was developed to operate principally from storage media offering a continuous transfer rate of 1.5Mbps, but it can be used more widely because the approach is generic.
3. Audio - specifies a coded representation for compressing audio sequences (mono and stereo). A psycho-acoustic model creates a set of data to control the quantiser and coding.
4. Conformance testing - specifies how tests can be designed to verify whether bit-streams and decoders meet the requirements specified in parts 1, 2 and 3 of the MPEG-1 standard.
These tests can be used by manufacturers of encoders to verify whether the encoder produces valid bit-streams.
These tests can be used by manufacturers of decoders to verify whether the decoder meets the requirements set out in parts 1, 2 and 3.
These tests can be used by applications to verify whether the characteristics of a given bit-stream meet the application requirements.
5. Software simulation - is technically not a standard but a technical report, which gives a full software implementation of the first 3 parts of the MPEG-1 standard.
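The idea behind the Systems part, combining audio and video streams with timing information into one stream, can be sketched as a timestamp-ordered merge. This is only an illustration of the principle: the `Packet` structure, the millisecond timestamps and the packet intervals are hypothetical and do not reflect the actual MPEG-1 packet syntax.

```python
import heapq
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp_ms: int   # presentation time carried with the data (hypothetical field)
    stream: str         # "video" or "audio"
    payload: bytes


def multiplex(*streams):
    """Interleave already time-ordered streams into one time-ordered stream."""
    return list(heapq.merge(*streams, key=lambda p: p.timestamp_ms))


def demultiplex(muxed, stream):
    """Recover one elementary stream from the combined stream."""
    return [p for p in muxed if p.stream == stream]


# Illustrative intervals: one video packet every 40 ms, one audio packet every 24 ms.
video = [Packet(t, "video", b"V") for t in range(0, 200, 40)]
audio = [Packet(t, "audio", b"A") for t in range(0, 200, 24)]

muxed = multiplex(video, audio)
for p in muxed:
    print(f"{p.timestamp_ms:4d} ms  {p.stream}")
```

Because every packet carries its own timestamp, a receiver can separate the streams again and still present audio and video in step, whether the combined stream was read from storage or received over a link.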
MPEG-2 is currently available in 9 parts:-
1. Systems - Addresses the combining of one or more elementary streams of video and audio, as well as data into single and multiple streams suitable for storage or transmission. This is specified in two forms; the Programme Stream and the Transport Stream, each of which is optimised for a different set of applications.
2. Video - Builds on the powerful compression capabilities of MPEG-1 to offer a wide range of coding tools.
3. Audio - is a backwards compatible multi-channel extension of the MPEG-1 Audio standard.
4. Compliance testing - corresponds to Part 4 of MPEG-1.
5. Software simulation - corresponds to Part 5 of MPEG-1.
6. Extensions for DSM-CC (Digital Storage Media Command and Control) - the specification of a set of protocols which provide control functions and operations specific to managing MPEG-1 and MPEG-2 bit-streams. These protocols may be used to support both stand-alone and networked environments.
7. Multi-channel Audio coding - this is not constrained to be backwards-compatible with MPEG-1 Audio.
8. 10-bit Video - this was withdrawn when there was insufficient interest.
9. Extension for real-time interface for system decoders.
MPEG-4 has been developed in a period when the technology is changing rapidly, and hence it has been difficult to have a clear view of its scope. It is anticipated that MPEG-4 will :-
MPEG-4 can supply an answer to the emerging needs of applications ranging from interactive audio-visual services to remote monitoring and control. It is the first standard that is trying to take account of interactivity rather than simply determining how moving and still pictures should be transmitted and displayed.
MPEG-4 is essentially a multimedia standard integrating various communications media such as:-
This is a global approach to the use of audio and video, seeking to merge three worlds, namely the mobile communications domain, the digital television and film domain, and the interactive computer and human interaction domains. The existing standards do not cover these areas, because their bit-rates are too high or the audio-visual standards do not exist. This standard seeks to cover a number of new functionalities, which are divided into three groups:-
1. Content-based interactivity:
MPEG-4 should provide access and organisation tools for audio-visual content, which may include indexing, hyperlinking, querying, browsing, uploading, downloading and deleting.
MPEG-4 should provide a syntax and coding schemes to support bit-stream editing and content manipulation to select one specific object.
MPEG-4 should provide methods for combining synthetic and natural scenes, and is the first step towards the integration of all kinds of audio-visual information.
MPEG-4 should provide methods to randomly access, within a limited time and with high resolution, parts from an audio-visual sequence.
2. Compression:
MPEG-4 should provide subjectively better audio-visual quality compared with existing standards at comparable bit-rates.
MPEG-4 should provide the ability to code multiple views/soundtracks of a scene, as well as synchronisation between the elementary streams. This would handle stereoscopic and multiple views of the same scene, and take into account virtual reality and 3D requirements.
3. Universal Access:
MPEG-4 should provide an error robustness capability, particularly for low bit-rate applications.
MPEG-4 should achieve scalability with a fine granularity in content, spatial resolution, temporal resolution, quality and complexity.
If MPEG-4 can achieve this functionality it will provide better audio-visual quality at comparable bit-rates compared to existing standards. As mentioned above MPEG-4 seeks to take into account current and future audio-visual applications; these applications can be considered under three criteria:-
1. Timing Constraints
Applications are classified as real-time or non-real-time. In real-time applications the information is simultaneously acquired, processed, transmitted and used in the receiver.
2. Symmetry of Transmission facilities
Applications are classified as symmetric or asymmetric. Symmetric applications are those where equivalent transmission facilities are available at both sides of the communications link.
3. Interactivity
Applications are classified as interactive or non-interactive. Interactive applications are those where the user has individual presentation control, either controlling the digital storage media or also on the scheduling sequence of the information flow.
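Each of the three criteria above is a two-way choice, so together they give 2 x 2 x 2 = 8 possible combinations. The short sketch below simply enumerates them; the dictionary keys are hypothetical labels, and no attempt is made to reproduce the class numbering used in the text.

```python
from itertools import product

# The three binary criteria described above (key names are illustrative only).
CRITERIA = {
    "timing": ("real-time", "non-real-time"),
    "symmetry": ("symmetric", "asymmetric"),
    "interactivity": ("interactive", "non-interactive"),
}


def application_classes():
    """Return every combination of one option per criterion."""
    names = list(CRITERIA)
    return [dict(zip(names, combo)) for combo in product(*CRITERIA.values())]


classes = application_classes()
for number, combo in enumerate(classes, start=1):
    print(number, ", ".join(combo.values()))
```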
These criteria lead to eight classes of application as follows:-
Applications Class 1
The user has presentational control, either at the level of user control of digital storage, or also on scheduling and the sequence of information flow.
1. Video-telephony - person-to-person conversations without any limitation in the scene environment and the network used. It may include some data transfer capabilities, e.g. for auxiliary still pictures, documents etc.
2. Multi-point video-telephony - interpersonal communication between more than two people, each one in a different place; the control may be either in the network or in one of the terminals.
3. Videoconferencing - interpersonal communication involving more than one person in two or more connected places. Videoconferencing often takes place in a room environment or a private office setting.
4. Co-operative Working - involves simultaneous inter-personal interaction (video-telephony) and data communications, at least as important as the audio-visual information.
5. Symmetric Remote Classroom - this is very close to video-telephony or videoconferencing, but it differs in that the audio-visual content may include scenes without a speaker, and there is usually a fixed central control point which, in multi-point connections, sends its scene to all sites and selects one scene to receive from the other sites.
6. Symmetric Remote Expertise - experts are consulted from a remote location (e.g. telemedicine).
Applications Class 2
These applications are interactive. The interactivity may be user-to-user, user-to-machine or machine-to-user. The interactivity may also be conversational and user remote control through a return data channel.
1. Asymmetric remote expertise - experts are consulted from a remote location, where the links are asymmetric in that there is one audio-visual channel and one audio-only channel. Discussion can take place while the expert shows an illustration, or vice versa the expert is able to view conditions at the remote site and discussion takes place over the return audio channel. Both conditions would be applicable in telemedicine.
2. Remote Monitoring and Control - here audio-visual data is collected in real-time from a remote location, typically through a machine-user communication interface. There is typically an audio-visual channel from the remote location, and an audio or control channel in return. This would include remote control of cameras and/or microphones and would include traffic or building monitoring.
3. News gathering - news is collected from remote places where it is difficult to quickly establish a good-quality connection. The interaction level is low and typically limited to a return audio channel, e.g. a direct broadcast from a remote site.
4. Asymmetric Remote Classroom - this is the situation where there is a central site sending audio-visual information, and the receiving sites, usually less expensive to set up, have the possibility of participating through an audio channel.
Applications Class 3
These applications are interactive in that the user can control the data flow through a control channel.
1. Multimedia messaging - messages containing text, audio, video, graphics are sent through a network to a mailbox location. Typical applications are e-mail and video answering machines.
Applications Class 4
This includes asymmetric non-real-time applications which are interactive. The user has individual presentation control, either at the level of user control of digital storage media, or also on the scheduling and the sequence of the data flow, depending on user decisions and choices - e.g. WWW or CAL decision systems.
1. Information base retrieval - audio-visual information retrieved from a remote Knowledge base on an individual basis; e.g. in entertainment, teleshopping, encyclopaedias etc. and the users may interactively browse through the audio-visual content.
2. Games - interactive audio-visual games are played with a remote computer/server or with other people through a computer/server.
Applications Class 5
In these applications the user has no individual presentation control.
1. Multimedia Presentation - local or remote multimedia presentations where no interactivity exists, and the user has no control on the scheduling or sequence of information flow.
2. Multimedia broadcasting for portable and mobile receivers - broadcasting of multimedia programmes (low resolution) for portable and mobile receivers, e.g. game-boy or watch-like terminals.
The development of MPEG-4 demonstrates the need for low-bit-rate coding algorithms and standards, and associated with this is the handling of interactivity and robust, error-free systems. The development of the H.263 coding standard has shown that motion-compensated Discrete Cosine Transform coding schemes are still able to improve compression performance.