Whilst I’m not a big fan of aphorisms in blogs, this one seems appropriate. Last year, I started writing about my many frustrations with the misunderstanding, and subsequent misuse, of the basic techniques used to shed light on the numbers collected in education. I've found a few others who share these frustrations, and it looks like the issues we’ve highlighted are beginning to be taken seriously, and that we’re starting to see some small steps on the way to sensible use of data in schools.
Some background, then. I wrote a piece called RAISEonline is contemptible rubbish in which I discussed one of the fairly simple techniques used to compare numbers, that of tests of statistical significance. I suggest you read it if you haven’t done so already.
Following my piece about RAISEonline, physicist Philip Moriarty wrote Lies, Damned Lies and Ofsted Pseudostatistics. We were then invited to meet statisticians from Ofsted and the DfE. I wrote this post (Hammering Nails in RAISEonline's Coffin), explaining in a little more detail why the tests of statistical significance used in RAISEonline should be abandoned, and then this post (Meeting Ofsted to discuss data) after we had met Sean Harford and the Nameless Ones.
In the post about meeting Ofsted I said the following:
“I can’t say too much, other than to say that I expect to see changes in the future. The weight of children in a class represents that class and nothing else, and can’t be said to be drawn from a wider population, because schools are located within their geographical region and admit children according to their own criteria.”
Ch-ch-ch-ch-Changes
On the 16th July, the DfE released some research on the forthcoming Reception Baseline Assessments. This is from page 4 of the Reception baseline research: results of a randomised controlled trial report:
A really simple explanation why
Once again, I’m going to patronise you a little, I'm afraid. If you understand the significance of the sentence I've highlighted above, good for you. Many people will not, however, so I’ll make it clear.
- The DfE research report above showed that, if you were to assume that a group of children in a particular school were a representative sample of the whole population of school children, you would have to conclude that the two groups they had developed and tested were different in a way which a standard test of statistical significance would suggest was not due to chance.
- This is what RAISEonline - the main data set used by Ofsted, schools and the DfE to assess schools' effectiveness - does when it colours data blue and green, and when it constructs confidence intervals to test whether test results are statistically significant. It suggests that items coloured blue or green are unlikely to be due to chance, and represent meaningful differences between the school and the national population. Essentially, green is ‘Yay!’ and blue is ‘Boo!’.
- The DfE research report above states categorically that children are “clustered within schools”. This means, as I’ve argued previously, that each school population represents itself only, and cannot be assumed to be drawn independently from a national population.
- The report then goes on to conclude:
- If the DfE has done this in this research report, it should apply the same thinking to all data which is clustered in schools. RAISEonline does not do this. It uses statistical significance incorrectly, in a way which this report (quite rightly) specifically rejects.
- Green and blue colouring should be removed from RAISEonline, and tests of statistical significance - which require independent and identically distributed data - should not be used unless the data come from random samples of a population.
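The argument in the list above can be demonstrated with a short simulation. This is a sketch, not RAISEonline's actual calculation, and all the parameters (school and pupil standard deviations, class size) are illustrative assumptions. It builds a national population in which no school is genuinely different from any other, but in which pupils are clustered within schools via a shared school-level effect. A naive significance test, which treats each class as an independent random sample of pupils, then flags far more than the nominal 5% of schools:

```python
# Sketch (not RAISEonline's actual method; all parameters are illustrative)
# of why clustering breaks naive significance tests. Pupils are clustered
# within schools: each school has its own random effect, so a class is not
# an independent random sample from the national distribution. We run a
# standard z-test on each class as if it were, and count the flags.
import random
import math

random.seed(1)

NATIONAL_MEAN = 100.0
SCHOOL_SD = 5.0    # between-school variation (the clustering)
PUPIL_SD = 10.0    # within-school variation between pupils
CLASS_SIZE = 30
N_SCHOOLS = 2000

false_positives = 0
for _ in range(N_SCHOOLS):
    # Every pupil in this school shares the same school-level effect.
    school_effect = random.gauss(0, SCHOOL_SD)
    scores = [NATIONAL_MEAN + school_effect + random.gauss(0, PUPIL_SD)
              for _ in range(CLASS_SIZE)]
    class_mean = sum(scores) / CLASS_SIZE

    # Naive z-test: treats the class as 30 independent draws from the
    # national distribution, so the standard error shrinks with class
    # size and ignores the shared school effect entirely.
    total_sd = math.sqrt(SCHOOL_SD**2 + PUPIL_SD**2)
    z = (class_mean - NATIONAL_MEAN) / (total_sd / math.sqrt(CLASS_SIZE))
    if abs(z) > 1.96:  # the usual two-sided 5% threshold
        false_positives += 1

# No school in this simulation is genuinely better or worse than the
# national distribution, yet far more than 5% of classes are flagged.
print(f"Flagged as 'significant': {false_positives / N_SCHOOLS:.0%}")
```

With these (assumed) parameters, the shared school effect dominates the naive standard error, so roughly half the schools are flagged as 'significantly' different even though, by construction, none of them are.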
James Pembroke, who has been a stalwart in raising the issues within RAISEonline, has also written about this in his blog The Gift. As he says, “This is something that Jack Marwood, myself and others have been trying to get across for a while - that there isn't a cohort of pupils in England (or maybe anywhere for that matter) that can be considered to be an independent random sample. Not one.”
I look forward to seeing what happens next.