Visualisation of historical events using Lexis pencils

4. Comparing pencils - the Lexis diagram

Most studies consist of event history information from more than one individual, and we thus need methods to allow us to display more than one individual in the same graphic. The simplest method of comparing pencils is to rank them by case order, and display them side by side. This is similar in concept to a set of pencils in a pencil box, but uses no further information in the dataset. A straightforward extension to this idea is to align the pencils according to the age of the individual or according to calendar year; this then allows comparisons to be made more easily between pencils.

However, these two simple displays are straightforward applications of the Lexis diagram (Lexis, 1875), the modern form of which is described in Pressat (1961). The Lexis diagram is used extensively in demography and in survival analysis and has many useful statistical properties, described in detail by Keiding (1990). A typical application would look at the survival experience of a group of patients entering a clinical trial. The x-axis represents the calendar date, and the y-axis represents the time spent in the study. Each patient is represented by a solid line sloping at 45 degrees; the line slopes because x and y both increase at the same rate as time passes. The line is anchored on the x-axis at the date on which the patient entered the trial. Figure 2 shows a typical Lexis diagram with eight individuals and varying survival times. One of the individuals enters the study at time T days and stays in the study until time T+A days. The time spent in the study (the y-value) is thus A days.
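The geometry of a single Lexis line can be written down directly from this description. The following is a minimal Python sketch, not taken from the original work; the names Patient, lexis_segment and slope_in_degrees, and the example values T = 100 and A = 30, are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Patient:
    entry_day: float      # T: calendar time of entry to the study, in days
    days_in_study: float  # A: time spent in the study, in days


def lexis_segment(patient):
    """Return the (x, y) start and end points of one patient's Lexis line."""
    # The line is anchored on the x-axis at the date of entry (y = 0).
    start = (patient.entry_day, 0.0)
    # Calendar time and time in study advance together, so both grow by A.
    end = (patient.entry_day + patient.days_in_study, patient.days_in_study)
    return start, end


def slope_in_degrees(segment):
    """Angle between the segment and the x-axis, assuming equal axis scales."""
    (x0, y0), (x1, y1) = segment
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# Hypothetical patient: enters on day T = 100 and stays for A = 30 days.
segment = lexis_segment(Patient(entry_day=100.0, days_in_study=30.0))
print(segment)                    # start (T, 0) and end (T + A, A)
print(slope_in_degrees(segment))  # approximately 45 degrees
```

The 45 degree slope appears on screen only if both axes use the same unit and the same scale; a plotting routine would need to enforce an equal aspect ratio.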

The Lexis diagram can be modified in various ways. First, information on events such as death can be added by placing an appropriate symbol at the end of the line. Second, changes of state can be introduced by using a different colour to represent each state. Finally, the definition of the x-axis can be modified. If we change the x-axis to represent 'date of entry to the study' rather than 'calendar date', then the solid lines will be vertical and will no longer slope at 45 degrees: as the patient proceeds in the study, y increases but x now stays constant. The x-axis can also represent other temporal variables (such as age at entry into the study) or any other variable; typically these will be ranks of temporal variables, such as 'rank order of age at entry into the study', but the x-axis might simply be a case ordering. Of course, if rank orderings are used to define the x-axis, the special statistical properties of the Lexis diagram are lost.
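The effect of redefining the x-axis can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the x_axis option strings and the example values are all hypothetical.

```python
def lexis_segment(entry, duration, x_axis="calendar"):
    """Start and end points of one Lexis line under two x-axis definitions.

    x_axis="calendar": x is calendar date, so the line slopes at 45 degrees.
    x_axis="entry":    x is date of entry, so the line is vertical.
    """
    if x_axis == "calendar":
        return (entry, 0.0), (entry + duration, duration)
    if x_axis == "entry":
        return (entry, 0.0), (entry, duration)
    raise ValueError(f"unknown x-axis definition: {x_axis!r}")


def rank_positions(values):
    """Rank order (1 = smallest) of each value; ties keep case order."""
    # sorted() is stable, so tied values are ranked in their original order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Hypothetical individual entering at time 100 and followed for 30 time units.
print(lexis_segment(100.0, 30.0, "calendar"))  # sloping line
print(lexis_segment(100.0, 30.0, "entry"))     # vertical line

# Hypothetical ages at entry, converted to rank positions for the x-axis.
print(rank_positions([41, 23, 35]))
```

Using rank_positions for the x-axis spaces the lines evenly, which is convenient for display, but the distances between lines no longer carry temporal meaning.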

Figure 2: A typical Lexis diagram. Each individual is represented by a 45 degree line.

We can see that the two simple ideas for comparing event history pencils discussed above are therefore special forms of the Lexis diagram, with differing definitions of the x- and y-axes, and with the lines replaced by pencils. In the first, the x-axis is the 'case order' or index of the individual and the y-axis is 'time since start event'. The second graphic redefines the y-axis to be 'age at start event' or 'calendar time at start event', again keeping the x-axis as the index of the individual.

Returning to the original concept of a Lexis diagram, and replacing the Lexis lines by pencils, we can see that a 'Lexis pencil' graphic would use a temporal variable such as 'age' or 'calendar date' along the x-axis, and use 'time in study' or 'calendar time' on the y-axis. We define this to be a two-dimensional (2-D) Lexis pencil display - the dimensions refer to the number of axes.

4.1 Case study 2 - A sample of bigamists

As an example, we consider the 42 bigamy offenders in England and Wales in 1973, and examine their criminal history (which was obtained from the UK Home Office Offenders Index) over a 32-year period from 1963 until the end of 1994. The 42 individuals consist of 39 men and 3 women, with ages ranging from 20 years to 53 years at the time of the 1973 conviction. We examine this dataset using a 2-D Lexis pencil display. The x-axis is defined to be the rank order of the age of the individual at the 1973 conviction, and the y-axis is defined to be the time since the 1973 conviction. We display a single pencil face which represents the principal offence at conviction. Whenever an individual is convicted, a band of colour represents the type of conviction; for the remaining time the pencil face is grey. Each criminal history is displayed from the individual's first conviction to their last conviction within the 32-year period. Code 1 represents violence offences and is assigned the colour medium-blue, code 2 sexual offences (brown), code 3 burglary (mid-green), code 4 robbery (yellow), code 5 theft (red) and code 6 fraud and deception (magenta).
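The layout just described can be sketched in Python as follows. This is a hedged illustration only: the record structure, the function pencil_layout, the use of whole years as the time unit and the two example offenders are assumptions made for the sketch and are not drawn from the Offenders Index data. Only the offence codes and colours come from the description above.

```python
# Offence codes and colours as listed in the text.
OFFENCE_COLOURS = {
    1: "medium-blue",  # violence
    2: "brown",        # sexual offences
    3: "mid-green",    # burglary
    4: "yellow",       # robbery
    5: "red",          # theft
    6: "magenta",      # fraud and deception
}
BACKGROUND = "grey"    # pencil face between convictions
TARGET_YEAR = 1973     # year of the target bigamy conviction


def pencil_layout(offenders):
    """Place each offender's pencil on a 2-D Lexis pencil display.

    x: rank order of age at the 1973 conviction (youngest = 1).
    y: time since the 1973 conviction (whole years, an assumption here).
    """
    order = sorted(range(len(offenders)), key=lambda i: offenders[i]["age_1973"])
    layout = []
    for rank, index in enumerate(order, start=1):
        convictions = sorted(offenders[index]["convictions"])
        years = [year for year, _ in convictions]
        layout.append({
            "x": rank,
            # The pencil runs from the first to the last conviction.
            "y_span": (years[0] - TARGET_YEAR, years[-1] - TARGET_YEAR),
            "background": BACKGROUND,
            # One band of colour per conviction, keyed by principal offence.
            "bands": [(year - TARGET_YEAR, OFFENCE_COLOURS[code])
                      for year, code in convictions],
        })
    return layout


# Two hypothetical offenders (invented for illustration, not real records).
example = [
    {"age_1973": 35, "convictions": [(1973, 2)]},
    {"age_1973": 22, "convictions": [(1968, 5), (1973, 2), (1980, 6)]},
]
for pencil in pencil_layout(example):
    print(pencil)
```

A plotting step would then draw, at each x position, a grey strip covering y_span with the coloured bands overlaid at their y-values.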

Figure 3: A 2-D Lexis pencil diagram showing the criminal histories of 42 bigamy offenders who were convicted in England and Wales in 1973. The offenders are ranked with the youngest on the left and the oldest on the right. The representation of the criminal career starts with the first conviction and ends with the latest conviction in the study period. Each criminal conviction is represented by a bar of colour, with different colours representing the principal offence at that conviction.

Figure 3 shows the resulting display. At y = 0, all individuals have an offence displayed - this is their target bigamy conviction. Mostly the principal offence displayed is a sexual offence (code 2), although for ten cases other concurrent convictions, mainly of violence (code 1) and fraud (code 6), were judged more serious than the target bigamy conviction. What stands out from this display is that there are very few cases with other sexual convictions - only 2 individuals were convicted of another sexual offence (and in neither case was this another bigamy offence). For 16 individuals, the 1973 bigamy conviction was their only conviction. However, surprisingly, of the 26 individuals with other criminal convictions, 11 had principal convictions for fraud and forgery (code 6) - 27% of the whole sample. Finally, there seems to be little effect of age, with no obvious change in offence specialisation when tracking from the left of the figure to the right (that is, from the youngest to the oldest bigamy offender). The initial analysis of these data raises questions as to whether bigamy should be considered a sexual offence (as the UK Home Office currently classify it) or whether it is better classified as a fraud offence (Soothill et al., 1997).

The 2-D Lexis pencil displays work well when the number of individuals is small, but with larger numbers of individuals, overlap of the pencils can easily occur. How can the display be improved?

