Think back… when listening to debates, how often have you heard people state that “x is linked to y”? ‘y’, for example, could be cancer or the economic slump, and ‘x’ anything from pollution levels to bacon consumption, low confidence to the weather. In saying that the two are linked, they are really only referring to an association, a statistical pattern, between them. But the implication, sometimes implicit, sometimes explicit, is that ‘x’ causes ‘y’.
But where is the evidence? The job of statistical tests is to tell us whether a correlation between two measurable things (what statisticians term ‘variables’) could plausibly be down to coincidence or is statistically significant. But even a strong, significant correlation still does not prove causation.
The well-worn phrase “correlation does not prove causation” is, in itself, correct, but when used as ammunition in debate it usually marks the end of informed discussion. Often you get a strong sense at this point that nobody is sure who is bluffing and who really knows more. That’s because, more often than not, the entire discussion is based on correlational data, and it’s a big leap from finding that there is an association between two things to knowing that one actually causes the other.
Bringing ‘correlation’ into a debate is pointless unless you know something about the strength (how close) and the direction (positive or negative) of the relationship between the two things you are discussing.
In short, you – and whoever you are debating with – need to know:
a) whether there is positive correlation (i.e. as one thing increases, so does the other) or negative correlation (as one increases, the other decreases)
b) the correlation coefficient (symbolised as ‘r’, a figure which will always be between -1.0 and +1.0), which tells you how close the association is between the two things. When there is a causal link, this measure can begin to help you predict outcomes and plan interventions. A very simple example would be an electricity plant planning for higher output during a cold spell, based on the correlation between cold weather and high demand for electricity
c) the regression coefficient. Again, providing there is a causal relationship between two things, this figure tells you how much one thing will change, on average, for each unit of change in the other
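To make (a), (b) and (c) concrete, here is a minimal sketch in Python that works out ‘r’ and the regression coefficient (the slope of the best-fit line) by hand. The temperature and demand figures are made up purely for illustration, and the function names are our own.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r: always between -1.0 and +1.0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def regression_slope(xs, ys):
    """Least-squares slope: average change in y per one-unit change in x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x

# Invented figures, for illustration only:
# daily temperature (degrees C) and electricity demand (GWh).
temperature = [-5, -2, 0, 3, 6, 10, 14]
demand = [52, 49, 47, 44, 40, 37, 33]

r = pearson_r(temperature, demand)
slope = regression_slope(temperature, demand)

direction = "negative" if r < 0 else "positive"
print(f"direction: {direction}")
print(f"correlation coefficient r = {r:.3f}")
print(f"regression coefficient = {slope:.3f} GWh per degree")
```

With these invented numbers the sign of r gives the direction (demand falls as temperature rises), its size gives the closeness of the association, and the slope says roughly how much demand shifts per degree. None of the three figures, on its own, says anything about cause.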
It’s also useful to know the effect size, e.g. there might be a strong correlation between two things but, in practice, the actual number of imports and exports, sick people, etc. involved (depending on the debate) might be very small. But then again, the numbers might be small and still be very important and carry huge impact.
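The gap between ‘strong correlation’ and ‘big effect’ is easy to see in a small sketch. The two data sets below are hypothetical and deliberately constructed so that they share the same correlation coefficient while the amount of change per unit differs a hundredfold.

```python
from math import sqrt

def r_and_slope(xs, ys):
    """Return (correlation coefficient, least-squares slope) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y), cov / var_x

# Hypothetical figures: the same pattern at two very different scales.
exposure = [1, 2, 3, 4, 5, 6]
small_effect = [100.0, 100.1, 100.3, 100.4, 100.6, 100.7]
large_effect = [100, 110, 130, 140, 160, 170]

r_small, slope_small = r_and_slope(exposure, small_effect)
r_large, slope_large = r_and_slope(exposure, large_effect)

print(f"small effect: r = {r_small:.3f}, change per unit = {slope_small:.3f}")
print(f"large effect: r = {r_large:.3f}, change per unit = {slope_large:.3f}")
```

Both series track ‘exposure’ equally closely, so r cannot tell them apart. It is the change per unit, read alongside the real-world numbers involved, that tells you how much is at stake.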
Our advice? Just as statisticians don’t do one-off ‘yes/no’ tests, it’s your duty (!) to keep probing, building up your evidence incrementally to get nearer and nearer to the truth of the matter. Above all, don’t crumble when you hear the terms ‘correlation’ and ‘causation’ in the same sentence; be ready to ask some or all of the following questions:
- how strong is the correlation between the two things being considered? i.e. what is (the measure of) the strength of that relationship? (if you are being technical… what is the correlation coefficient?)
- what more do we know about the interaction between the two things? i.e. how much change in one thing is required for a unit of change in the other, be it cancer or the economy? (what is the regression coefficient?)
- what do we know about ‘third factors’ which may have a confounding effect? e.g. if you were looking at suicides and unemployment, you might ask about the potential effects of divorce and partnership breakdown, which are also likely to increase as people experience longer-term unemployment…
- what do we know about the project/research/survey design? e.g. was it focused on looking for a specific effect, or intended to report whichever of a large number of correlations turned out to be significant? (test enough pairs of things and some will look significant by chance alone)
- do we know whether other projects/research/surveys have had similar results?
- is there other related evidence out there to support the case being made for cause and effect?
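The ‘third factor’ question can be illustrated with a small simulation. In the sketch below, which uses entirely simulated data and hypothetical names, two things never influence each other at all: both are simply driven by the same third factor. The adjustment used is the standard first-order partial correlation, one common way (not the only one) of allowing for a measured third factor.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r: always between -1.0 and +1.0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_r(xs, ys, zs):
    """Correlation between xs and ys after allowing for zs
    (first-order partial correlation)."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000

# Simulated data: the third factor drives both things;
# neither thing has any direct effect on the other.
third_factor = [random.gauss(0, 1) for _ in range(n)]
thing_a = [z + random.gauss(0, 0.5) for z in third_factor]
thing_b = [z + random.gauss(0, 0.5) for z in third_factor]

raw_r = pearson_r(thing_a, thing_b)
adjusted_r = partial_r(thing_a, thing_b, third_factor)

print(f"raw correlation between a and b: {raw_r:.2f}")
print(f"after allowing for the third factor: {adjusted_r:.2f}")
```

The raw correlation comes out strong even though neither thing causes the other, and it all but disappears once the third factor is allowed for. In real debates the difficulty is that the third factor has to be identified and measured before you can adjust for it, which is exactly why the question is worth asking.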