Silver questions whether an expansion in available information produces good ideas, at least in the short term. He looks at developments in weather forecasting to highlight the importance of uncertainty in predictions. He criticises the use of ‘frequentist’ statistics as developed by R.A. Fisher, and suggests that using flawed ideas to crunch numbers results in incorrect conclusions about the underlying factors affecting the distributions of those numbers.
His ideas suggest that, in searching for a signal within the noise of available data, frequent mistakes are made. He highlights the ideas which flow from Bayesian statistics, in which prior assumptions are understood to have a significant effect on our understanding of the world.
More information can sometimes mean less understanding
Silver begins with some musings on historic periods within which the availability of information increased dramatically. He looks, for example, at the effects of the newly invented printing press in the 15th century, when huge amounts of writing suddenly became, if not cheap, then certainly vastly more affordable. Whereas a typical book might have cost upwards of £70,000 in today’s prices, printing brought the cost down to closer to £70. The result was somewhat unexpected. Instead of a better understanding of a complex world, there was a huge increase in error and confusion:
“Errors could now be mass-produced, like the so-called Wicked Bible, which committed the most unfortunate typo in history to the page: thou shalt commit adultery. Meanwhile, exposure to so many new ideas was producing mass confusion. The amount of information was increasing much more rapidly than our understanding of what to do with it, or our ability to differentiate the useful information from mistruths.” p3.
Silver likens this explosion in information to the recent period in which computer technology has made vast amounts of writing readily available. He also notes that computer technology has meant that more and more numbers can be crunched, and that this is not always a good thing.
In particular, Silver looks at a controversial paper entitled “Why Most Published Research Findings Are False”, published by John P. A. Ioannidis in 2005. Ioannidis looked at findings published by a vast number of researchers and concluded that “most () findings were likely to fail when applied to the real world.” Silver goes on to note that, “Bayer Laboratories recently confirmed Ioannidis’s hypothesis. They could not replicate about two-thirds of the positive findings claimed in medical journals when they attempted the experiments themselves.” p11-12
So the huge amount of information released by modern technology has to be regarded sceptically, which can be difficult in a world where there are endless numbers of written sources to confirm our existing biases. Pick an opinion, any opinion, and you can easily find many, many people who not only share your opinion, but have researched, written up and published findings which reinforce your prior beliefs. Unless you actively look for a different view, you may simply end up sharing biases and beliefs with other true believers.
This problem is compounded by the human brain’s propensity to make sense of the world using patterns in prior information. Silver interviews Tomaso Poggio, an MIT neuroscientist who studies how our brains process information and finds that “the problem, Poggio says, is that () evolutionary instincts sometimes lead us to see patterns where there are none. ‘People have been doing that all the time,’ Poggio said. ‘Finding patterns in random noise.’” p12
Silver’s book is an attempt to redress the balance of bad ideas by asking his readers to consider their existing biases, particularly about the use of statistics. He is particularly interested in the way bad ideas spread, and the way in which fundamentally flawed assumptions lead many to fundamentally flawed conclusions. As he says, “Capitalism and the Internet, both of which are incredibly efficient at propagating information, create the potential for bad ideas as well as good ones to spread. The bad ideas may produce disproportionate effects.” p13
Why forecasters can’t be certain
To move beyond the widespread lack of awareness of the difficulties of using statistics to explain real-world phenomena, Silver introduces the complex world of weather forecasting. He notes that there has been an improvement in the accuracy of weather forecasts in the last forty years or so, a period which coincides with the spread of computing technology into general use. Here, he suggests that having more information has caused expert forecasters to revisit their fundamental assumptions, and to look at the prediction of weather in a new light. More information has helped. But, as importantly, expert weather forecasters have had to accept that weather is inherently uncertain and cannot be predicted with 100% accuracy.
Predicting the weather has always been a hit-and-miss affair. Many attempts have been made to produce useful forecasts, drawing on everything from folklore and historical records to climate science and gut feeling. Once expert forecasters began to accept that uncertainty was part of the process, they began to get better at predicting the weather.
So, for example, forecasters today can run a computer simulation to predict the next day's weather. They then repeat the simulation many more times. If 40% of the simulations suggest that it will rain tomorrow, then the forecaster might say that there is a 40% chance of rain – it still may or may not rain where you are, but now some measurement of uncertainty has been added to the forecast. The uncertainty inherent in weather forecasting is still present, but in a way which makes it a little easier to decide whether to carry an umbrella or not.
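The ensemble idea described above can be sketched in a few lines of code. This is a toy illustration with made-up numbers, not a real weather model: real ensembles integrate atmospheric physics from slightly perturbed initial conditions, whereas here a single hypothetical variable stands in for the whole simulation.

```python
import random

def ensemble_rain_forecast(n_runs=1000, seed=42):
    """Toy ensemble forecast: run many simulations with slightly
    perturbed starting conditions and report the fraction that
    produce rain. (Illustrative only -- a real model integrates
    physics, not a random threshold.)"""
    random.seed(seed)
    rain_runs = 0
    for _ in range(n_runs):
        # Hypothetical: perturb the initial humidity a little each run.
        humidity = random.gauss(0.55, 0.1)
        if humidity > 0.6:  # toy condition standing in for "it rains"
            rain_runs += 1
    return rain_runs / n_runs

print(f"Chance of rain: {ensemble_rain_forecast():.0%}")
```

The output is not “it will rain” or “it won’t”, but a fraction of runs, which is exactly the measurement of uncertainty Silver describes.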
For a weather forecast to be deemed to have merit, it must do better than ‘persistence’ (the assumption that tomorrow’s weather will be the same as the weather today) and it must be better than ‘climatology’ (the long-term historical average on a particular date in a particular area).
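The two baselines can be sketched directly. The temperatures below are hypothetical numbers chosen for illustration; a forecast only has merit if its error beats both of these naive benchmarks.

```python
def persistence_forecast(today_temp):
    """Persistence: tomorrow will be the same as today."""
    return today_temp

def climatology_forecast(historical_temps):
    """Climatology: the long-term average for this date and area."""
    return sum(historical_temps) / len(historical_temps)

def mean_abs_error(forecasts, actuals):
    """Average size of the forecast miss."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)

# Hypothetical daily temperatures (degrees C) for a week.
temps = [12.0, 14.0, 13.0, 11.0, 15.0, 16.0, 14.0]
actuals = temps[1:]           # each next day's outcome
persist = temps[:-1]          # yesterday's value used as the forecast
climo = [climatology_forecast(temps)] * len(actuals)

print("persistence MAE:", mean_abs_error(persist, actuals))
print("climatology MAE:", mean_abs_error(climo, actuals))
```

With these particular numbers climatology beats persistence; a forecaster would need to do better than both to add any value.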
Silver shows that weather forecasts made more than nine days in advance are worse than climatology. He also shows that most weather forecasters have a ‘wet bias’, in that they are “biased towards forecasting more precipitation than will actually occur. If it rains when it isn’t supposed to, (people) curse the weatherman for ruining their picnic, whereas an unexpectedly sunny day is taken as a serendipitous bonus.” p135 Even within the numbers, there is some human input: forecasters tweak what their simulations tell them based on their own prior knowledge of weather patterns.
For more serious weather forecasting, such as that for hurricanes which might do more than ruin a picnic, organisations try to remove any possible human bias. Prediction becomes a matter of life or death with huge weather events such as Hurricane Katrina, which devastated the area around New Orleans in 2005. But weather forecasters recognise that uncertainty is a fact of life. Max Mayfield of the US National Hurricane Center, who had to issue forecasts ahead of the arrival of Katrina, puts it this way: “Uncertainty is the fundamental component of weather prediction. No forecast is complete without some description of that uncertainty.” p138.
In education, uncertainty in data is almost completely denied. If a student is awarded a B grade, they get a B grade no matter what. They might be one mark away from a grade boundary, but the system allows for no uncertainty. Equally, a database such as RAISEonline brooks no dissent. A cohort is either ‘better’ than the national average or it is not. There is no acceptance of uncertainty in the measurements or the analysis. Silver’s view is that the dominant statistical theory, which is at the heart of English education statistics, is flawed.
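The grade-boundary point can be made concrete with a small simulation. Everything here is hypothetical – the boundary, the size of the marking error, and the pupil’s ‘true’ mark are invented for illustration – but it shows how a pupil one mark below a boundary is really a coin-toss, not a fixed grade.

```python
import random

def grade(mark, boundary=60):
    """Hypothetical two-grade system: B at or above the boundary,
    C below it."""
    return "B" if mark >= boundary else "C"

def chance_of_b(true_mark, marking_sd=3.0, n=10_000, seed=1):
    """Estimate how often a pupil would be awarded a B if each
    marking carries a little random error (illustrative numbers)."""
    random.seed(seed)
    b_count = sum(
        grade(round(random.gauss(true_mark, marking_sd))) == "B"
        for _ in range(n)
    )
    return b_count / n

# A pupil whose 'true' mark is 59 -- one mark below the boundary.
print(f"P(awarded a B): {chance_of_b(59):.0%}")
```

The awarded grade is certain; the measurement behind it is anything but.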
Frequentism gets a kicking
“Close to the root of the problem () is a flawed type of statistical thinking that () researchers are applying.” p250. Silver gives a brief history of statistician R.A. Fisher, who was “probably more responsible than any other individual for the statistical methods that remain in wide use today. He developed the terminology of the statistical significance test and much of the methodology behind it.” p251 The statistical methods which Fisher developed – based on normal distributions, and making predictions about samples of data – are usually referred to as ‘frequentism’ today.
“The idea behind frequentism is that uncertainty in a statistical problem results exclusively from collecting data among just a sample of the population rather than the whole population.” p252 Collecting information about a sample produces, in the eyes of frequentists, ‘sampling errors’, and these are the only errors which frequentists account for. Any underlying prior assumptions – ‘prior probabilities’, as Bayesians refer to them – are ignored.
This, unfortunately, means that frequentist statistics are often susceptible to bias, and as Silver says, “if you are using a biased instrument, it doesn’t matter how many measurements you take – you’re aiming at the wrong target.” p253
Silver criticises the frequentist approach because it ignores human error. He also questions the underlying assumptions, particularly the assumption that measurements follow a bell curve, or normal distribution. His biggest criticism is, however, that frequentism – in striving for “immaculate statistical procedures” – often ignores underlying context or simply whether a given hypothesis is reasonable.
In the case of most education statistics, human error could include the assumptions that a school’s pupils’ test results are somehow normally distributed, and that the children are independent of each other and identically distributed across schools – each of which is questionable. More important, by Silver’s thinking, is the simple suggestion that comparing one individual or group of individuals to a national ‘average’ completely ignores underlying context. A given hypothesis – say, that a group of disadvantaged pupils is making less progress than a group of advantaged pupils, given the myriad of factors beyond the control of teachers or schools which affect progress – may simply be unreasonable.
As Silver notes, ‘data is useless without context’.
There is an alternative way of thinking about statistics
As I have noted previously, mathematical statistics are an attempt to simplify complicated data. And there is more than one way to skin that particular cat. Silver is no frequentist. He prefers the analysis which flows from the ideas of Thomas Bayes, an 18th century thinker.
Bayes believed, in essence, that we get ‘closer and closer to the truth as we gather more evidence’ p242. He didn’t believe the world was uncertain – he lived in a pre-Darwinian age and believed that it was up to humans to make sense of what he saw as divine creation. His ideas have been developed by those interested in conditional probability in a more rational age.
“Bayes theorem begins and ends with a probabilistic expression of the likelihood of a real-world event. It does not require you to believe that the world is intrinsically uncertain. It does require you to accept, however, that your subjective perceptions of the world are approximations of the truth.” p448.
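The updating process Silver describes is mechanically simple. The sketch below uses invented numbers – a 10% prior that a research hypothesis is true, a study design that detects a real effect 80% of the time, and a 5% false-positive rate – to show both Bayes’ theorem itself and the ‘closer to the truth as we gather more evidence’ idea, by feeding the posterior back in as the next prior.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / p_evidence

# Hypothetical numbers: 10% prior, 80% detection rate, 5% false positives.
posterior = bayes_update(0.10, 0.80, 0.05)
print(f"Posterior after one positive result: {posterior:.0%}")   # 64%

# Gathering more evidence: the posterior becomes the new prior.
posterior2 = bayes_update(posterior, 0.80, 0.05)
print(f"After a second positive result: {posterior2:.0%}")
```

Note how a single ‘significant’ positive result only lifts a 10% prior to 64% – a useful intuition for the Ioannidis replication failures discussed earlier – while independent replication pushes the probability much closer to certainty.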
Silver, in dismissing frequentist statistics, sides with those who believe that our biases have to be confronted when looking at numbers, rather than simply assuming that the error comes in our measurements. “Science may have stumbled later when a different statistical paradigm, which de-emphasised the role of prediction and tried to recast uncertainty as resulting from errors of our measurements rather than the imperfections of our judgements, came to dominate in the twentieth century.” p243.
Ultimately, Silver believes that we improve our understanding of the world through the use of human judgement rather than simply crunching incorrectly measured numbers using assumptions which are necessarily flawed. He is hopeful that frequentist statistics will drop out of use. “It will take some time for textbooks and traditions to change. But Bayes theorem holds that we will converge towards the better approach. Bayes’ theorem predicts that Bayesians will win.” p261.
We need to stop finding patterns in random noise
The Signal and the Noise is fascinating reading for those who have an interest in the statistics used in English education. It shows that uncertainty is at the heart of prediction, and that the use of frequentist statistics is not universally accepted. For those who have not studied statistics, it explains Bayesian thinking clearly and accessibly.
Silver shows the difficulty of finding a signal – of being certain – in a noisy world full of uncertainty. In a field as noisy as education, it is virtually impossible to find a signal which is clear and unambiguous. Within a small, flawed data set – such as the results allocated to pupils in a school – the noise is deafening. It follows that trying to find signals within sub-groups in a school is a fool’s errand. Nate Silver’s book shows just how hard it is to find a signal, and how easy it is for us to find patterns in random noise.