Next: Products

1. If visualization is the solution, what is the problem?

Present-day scientific and engineering investigators are confronted with research problems that depend on gaining insight into complex and voluminous data. Previous publications, particularly [McCormick 87], have referred to

firehoses of data, powerful computers and automatic experiments, which produce data at a greater rate than the mind can comprehend, resulting in
warehouses of data where much is left untouched, hiding unsuspected insights.

Scientific visualization is devoted to providing visual tools and methods (and some non-visual ones) to help a scientific or engineering investigator with analysing data.

Characterising the investigator's problem

There is no single visualization method or tool that can be applied successfully to all problems of data analysis. Therefore if visualization is intended to assist with demanding problems, it is worthwhile beginning by characterising in what way the investigator's problem is demanding:

Multidimensional - at the extremes there may be 1 or many independent variables. A common example is 3, where the independent variables are spatial dimensions, or 2, where the third dimension can be ignored. Often in the past, the third dimension has been ignored not only because the computation has been difficult, but because displaying the result has also been difficult.
Multivariate - there may be 1 or many dependent variables. Much hype from product brochures blurs the distinction between the independent and dependent variables, resulting in confusing claims such as "This system handles 4D data".
Compound data - data could exist as a number of scalars at each sampled point. However, many problems respond better if the internal structure of the data is respected. Thus data about flow and gradients can be represented as vectors. Data about strain can be represented as tensors. Electrical data can be represented as complex numbers.
Geometry - some systems assume that a Cartesian coordinate space is being used. Many problems are defined on a curved space. Some data can be defined on parameters such as (u,v) or (phi,theta), which are themselves used to define a curve or surface. Earth-based data is a common example of this.
How the data is structured - the simplest case is where the data is sampled on a regular grid. However for experimental reasons, data may only be accessible over a scattered grid, or for computational reasons, unstructured data may be used. In the latter case, the problem may be further complicated by the need to use non-linear interpolation functions.
Time-varying - there could be one or many timesteps - in other words one of the independent variables may be time. This is not necessarily the same as using time to present the result. A time-varying phenomenon could be presented as multiple displays on one frame, and sometimes this is preferred if the investigator wishes to make a controlled comparison. A static phenomenon can be presented as a time sequence if there is too much complexity to be presented on one frame - so a volume can be presented as a time-based sequence of slices. Often though - and not surprisingly - time is the preferred way to present a time-varying phenomenon, and it has been avoided by investigators until now only because of technological difficulties. Flow phenomena such as turbulence, eddies and shifting boundaries are perceived without conscious thought when presented using time.
Application control - the simplest case is no control of the application, where data is postprocessed offline, separately from the application. At the other extreme, the investigator needs to exercise full interactive control of the application, in response to events as they are visualized.
Size of data set - many problems become complex simply through the sheer size of the data set being examined. Effective use of present-day visualization systems often relies on being able to make partly processed copies of the data at various stages. Large data sets make this replication impossible. Vast data sets could be defined as those too large to be accommodated at all on the investigator's local processing facilities; they have their own special problems.
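
The distinction between independent and dependent variables drawn in the list above can be made concrete with a small sketch. The `FieldDescription` class and the variable names in it are hypothetical, introduced only for illustration. It shows two data sets that each have four columns, so a brochure counting columns might call both "4D data", although they are very different problems.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FieldDescription:
    """Hypothetical descriptor separating sampled axes from measured quantities."""
    independent: Tuple[str, ...]  # variables the data is sampled over
    dependent: Tuple[str, ...]    # quantities recorded at each sample point

    def summary(self) -> str:
        return (f"{len(self.independent)}D domain, "
                f"{len(self.dependent)} dependent variable(s)")


# One scalar sampled throughout a 3D space: multidimensional, not multivariate.
scalar_in_space = FieldDescription(
    independent=("x", "y", "z"),
    dependent=("temperature",),
)

# Three scalars sampled along a line: multivariate, not multidimensional.
three_scalars_on_a_line = FieldDescription(
    independent=("x",),
    dependent=("pressure", "temperature", "density"),
)

print(scalar_in_space.summary())
print(three_scalars_on_a_line.summary())
```

Stating the two counts separately, as `summary` does, avoids the ambiguity of a single "number of dimensions".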

For convenience, the characteristics are summarised in the following table.

*Table - Characteristics of Investigator's Data*
Characteristic	Simple	Hard
Independent variables	1	Multidimensional
Dependent variables	1	Multivariate
Data compounding	Scalars	Tensors
Geometry	Cartesian	Curved
Structure	Regular	Unstructured
Time	Static phenomenon	Time-varying
Application control	None - postprocess	Full interactive control (steering)
Size of data set	Small	Vast

Visualization could be said to encompass problems of all types, whether simple or hard.

In practice many traditional solutions (graphs, bar charts) exist where the characteristics of a problem are simple in all respects or where the problem is hard to a limited degree.

The purpose of much recent work in visualization is to investigate the hard problems and bring them into the realm of the possible. In practice the difficulties are interlinked. So, while it is possible to display a field of scalars in 3D space by some suitable volume rendering technique, it is much harder if the data are vectors, especially if there are many of them - it is easy to display them but hard to perceive them.

As might be expected, there is a gradual adoption of solutions into commercial systems.

Examples

Some examples may be useful at this point.

a simple case - temperature distribution across a flat surface, a single scalar variable defined in 2D
simple 2D flow problems - in the simple case, the data exists as a field of vectors at regularly spaced positions in 2D space
more complex 3D flow problems - the data is defined in 3D, an unstructured grid has been used for computational reasons, and the flow is time-varying. (These examples are not intended to imply that 2D problems always have a simple structure or that 3D problems are always more complex.)
multiple independent variables - chemical processes are a source of problems that are hard in most characteristics. The study may involve following the progress of a chemical reaction at various points in a mixture at various times, depending on several variables, such as pressure, temperature and initial fractions of the constituent substances. In addition, flow rates and the use of unstructured data may be involved. In its full complexity, such a problem is still extremely hard to visualize.

For convenience, these examples are summarised in the following table.

*Table - Examples*
Characteristic	Temperature	Simple 2D flow	Complex 3D flow	Chemical process
Independent variables	2	2	3	many
Dependent variables	1	1	1	many
Data compounding	scalar	vector	vector	vector
Geometry	Cartesian	Cartesian	Cartesian	Cartesian
Structure	regular	regular	unstructured	unstructured
Time	static	static	time-varying	time-varying

Some characteristics have not been presented in the table. For instance the data set size can be small or large in any of the problems just described.
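
As a rough illustration, the examples table can be written down as data and compared with the "simple" end of each characteristic. This is only a sketch: `ProblemProfile`, `SIMPLE_END` and `demanding_in` are hypothetical names, the variable counts are kept as text because the table uses "many", and only the four categorical rows are compared. Data set size and application control are left out, as they are in the table.

```python
from dataclasses import dataclass
from typing import Dict, List

# The "simple" end of each categorical characteristic, as in the summary table.
SIMPLE_END: Dict[str, str] = {
    "compounding": "scalar",
    "geometry": "Cartesian",
    "structure": "regular",
    "time": "static",
}


@dataclass(frozen=True)
class ProblemProfile:
    """Hypothetical record mirroring one column of the examples table."""
    name: str
    independent: str  # kept as text because the table uses "many"
    dependent: str
    compounding: str
    geometry: str
    structure: str
    time: str

    def demanding_in(self) -> List[str]:
        """Categorical characteristics that are not at the simple end."""
        return [key for key, simple in SIMPLE_END.items()
                if getattr(self, key) != simple]


EXAMPLES = [
    ProblemProfile("Temperature", "2", "1",
                   "scalar", "Cartesian", "regular", "static"),
    ProblemProfile("Simple 2D flow", "2", "1",
                   "vector", "Cartesian", "regular", "static"),
    ProblemProfile("Complex 3D flow", "3", "1",
                   "vector", "Cartesian", "unstructured", "time-varying"),
    ProblemProfile("Chemical process", "many", "many",
                   "vector", "Cartesian", "unstructured", "time-varying"),
]

for problem in EXAMPLES:
    demanding = problem.demanding_in()
    print(f"{problem.name}: {', '.join(demanding) if demanding else 'simple in all listed respects'}")
```

Note that the comparison alone does not separate the complex 3D flow from the chemical process; the difference there lies in the number of independent and dependent variables, which the sketch records but does not score.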
