```
library(islay)
data(islay_lithics)
```

# Visualising relationships

In this tutorial we will continue to learn about visualisation as a tool for exploratory data analysis. We will look at ways of visualising the relationship between two or more variables using bar and column plots, scatterplots, additional aesthetics and facets.

## Objectives

By the end of this tutorial you should:

- Be able to generate questions about the relationship between two or more variables
- Know how to produce bar plots, multiple density plots, stacked bar plots, and scatter plots in R using ggplot2
- Be able to refine plots for communication and export them from R

## Prerequisites

- Edward Tufte, *The Visual Display of Quantitative Information* (2nd edition), pp. 91–138:
  - Chapter 4, “Data–Ink and Graphical Redesign”
  - Chapter 5, “Chartjunk: Vibrations, Grids, and Ducks”
  - Chapter 6, “Data–Ink Maximization and Graphical Design”

## Generating questions about relationships

Last week we looked at using visualisation to answer questions about the variation of a variable (its distribution). Such questions are essential for describing and understanding the nature of your dataset, but questions about a single variable have fundamentally limited explanatory value.

This week we will start looking at the covariation between two (or more) variables – in plain terms, the relationship between them. With this we can start to gain insights into causality. In statistics, we say that there is a correlation between two variables if one can measurably predict the other. This is *not* a statement about causality, merely practicality: if you know that two variables are correlated and you know the value of one, you can make a good guess about the value of the other.

This leads to the well-known adage, “correlation is not causation”. But equally, we should be aware that correlation can be a good hint about causation!
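As a minimal sketch of what "measurably predict" can mean in practice, base R's `cor()` function computes a correlation coefficient between two numeric vectors. The counts below are the flake and blade counts for the six sites shown in the `islay_lithics` preview later in this tutorial; six sites are far too few to draw conclusions from, so treat this purely as an illustration of the call.

```r
# Flake and blade counts for six sites, copied from the islay_lithics preview
flakes <- c(159, 125, 12, 128, 56, 29)
blades <- c(15, 6, 0, 18, 4, 1)

# Pearson correlation coefficient: values near 1 or -1 indicate a strong
# linear relationship, values near 0 a weak one
r <- cor(flakes, blades)
r

# Spearman's rank correlation only assumes a monotonic relationship
cor(flakes, blades, method = "spearman")
```

A positive coefficient here says only that sites with more flakes tend to have more blades. It says nothing about why.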

### Exercises

Given a dataset on a burial ground, with the following variables:

- sex of the individual
- age of the individual
- age of the burial (i.e. a radiocarbon date)
- number of grave goods
- number of metal objects amongst the grave goods

- What questions of covariation could we ask of the dataset?
- If there was a correlation between the age of the individual and the number of grave goods, could that imply causation?
- What about a correlation between the number of grave goods and the number of metal objects?

## Visualising relationships

Work through sections 2.5 and 2.6 of *R for Data Science* (2nd ed.).

You will then apply these techniques to an archaeological dataset.

## Lithic assemblages from Islay

Load the `islay_lithics` dataset from the islay package by calling `library(islay)` and then `data(islay_lithics)`.

We can use the `head()` function to get a quick preview of the data frame:

`head(islay_lithics)`

```
  site_code          region                         period   area flakes blades
1      LGM1 Loch Gorm South Mesolithic & Later Prehistoric 102450    159     15
2      LGM2 Loch Gorm South Mesolithic & Later Prehistoric  62497    125      6
3      LMG4 Loch Gorm South                           <NA>  37480     12      0
4      LGM5 Loch Gorm South                     Mesolithic  52473    128     18
5      LGM6 Loch Gorm South              Later Prehistoric  54971     56      4
6      LGM8 Loch Gorm South                           <NA>  49974     29      1
  chunks cores pebbles retouched total
1     16    24       0        15   229
2     11    20       4        16   182
3      1     1       6         3    23
4     17    27       7         5   202
5      8    18      12        10   108
6     20     3       0         5    58
```

Because this is an in-built dataset of the package, you can also enter `?islay_lithics` to open the help page for the dataset, which contains more information on what it describes.

As with the last dataset, it will be useful to turn the `period` column into a factor now, so that it will automatically be ordered in our subsequent plots:

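```
# Period labels in chronological order
periods <- c("Mesolithic", "Mesolithic & Later Prehistoric", "Later Prehistoric")
# Convert period to a factor so that plots follow this order
islay_lithics$period <- factor(islay_lithics$period, levels = periods)
```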
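To check that a conversion like this has worked, it can help to inspect the factor's levels and count the values in each. The sketch below does this on a stand-alone vector holding the period labels of the six sites from the preview above, so that it runs without the islay package; on the full dataset you would pass `islay_lithics$period` to the same functions.

```r
# Period labels for the six sites shown in the preview above
period <- c("Mesolithic & Later Prehistoric", "Mesolithic & Later Prehistoric",
            NA, "Mesolithic", "Later Prehistoric", NA)

periods <- c("Mesolithic", "Mesolithic & Later Prehistoric", "Later Prehistoric")
period <- factor(period, levels = periods)

# Levels follow the order given in `periods`, not alphabetical order
levels(period)

# Count sites per period; useNA = "ifany" also reports undated sites
table(period, useNA = "ifany")
```

Sites with no period assigned stay `NA` after the conversion, which is worth keeping in mind when reading plots grouped by period.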

### Exercises

- Generate a plot showing the relationship between period and the number of retouched pieces. Is there a correlation? What could explain this?
- Try with two other types of lithics. Does it change your answer?
- Generate a plot showing the relationship between the number of two types of lithics.
- Add an aesthetic showing a categorical variable.
- Export the plot.
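As a rough sketch of the overall shape of the last three exercises, the code below plots two numeric variables against each other, maps a categorical variable to colour, and writes the result to a file with ggplot2's `ggsave()`. It deliberately uses `area` and `total` rather than two lithic types, and only the six rows shown in the preview, typed in by hand so that it runs on its own. The file name `area_total.png` and the plot dimensions are arbitrary choices made for illustration.

```r
library(ggplot2)

# Six rows copied from the islay_lithics preview (selected columns only)
sites <- data.frame(
  site_code = c("LGM1", "LGM2", "LMG4", "LGM5", "LGM6", "LGM8"),
  period = c("Mesolithic & Later Prehistoric", "Mesolithic & Later Prehistoric",
             NA, "Mesolithic", "Later Prehistoric", NA),
  area = c(102450, 62497, 37480, 52473, 54971, 49974),
  total = c(229, 182, 23, 202, 108, 58)
)

# Two numeric variables on x and y, a categorical variable mapped to colour
p <- ggplot(sites, aes(x = area, y = total, colour = period)) +
  geom_point(size = 3) +
  labs(x = "Area", y = "Total lithics", colour = "Period")

# Write the plot to disk; "area_total.png" is a hypothetical file name
ggsave("area_total.png", plot = p, width = 6, height = 4, dpi = 300)
```

Passing the plot object explicitly through `plot = p` avoids relying on whichever plot happened to be displayed last.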