What is the purpose of the current inspection regime?
When, in 1992, Ofsted began inspecting schools and publishing inspection reports, little was known about schools by those not working within them. That cannot be said today. Schools publish all manner of information about what goes on within their walls. Government websites publish performance tables, with enormous amounts of data on subjects from recent test scores to ‘Teaching staff and Education support staff expenditure’. Government-funded organisations such as the FFT provide governing bodies with mountains of information, and those overseeing schools are asked to be much more accountable for what happens in school. The accountability structures which are currently in place aim to hold schools to account in ways which would have been inconceivable in a previous era.
So why send in inspectors to look even more closely at what a school does?
There are a number of clear reasons to have an independent inspection of a school. Governing bodies may have been failing to hold those running their school to account. Head teachers/senior management teams may have made decisions which are bad for their schools and those within them. Teachers may be doing a poor job. Children and parents may not be satisfied with the school. Things may not be what they seem.
So what should inspectors be doing? What checks and balances do we need? What can be reasonably expected of a short inspection period such as we have currently?
I would argue that we need independent inspections to check:
1) Whether governors are holding head teachers to account.
2) Whether head teachers/senior management teams are working well.
3) Whether children, parents and staff are happy with their school.
Astute observers will notice that this is somewhat different to the current inspection regime which passes judgements on:
Overall Effectiveness
Achievement of Pupils
Quality of Teaching
Behaviour and Safety of Pupils
Leadership and Management
Those who have read my previous work will be unsurprised that I think that Ofsted’s current focus should be altered to reflect the dawning realisation that test score data cannot be used in the way it has been up until now. For as long as the inspection framework relies on assumptions about the reliability and validity of test score data, Inspection cannot be fit for purpose.
Making Inspection fit for purpose
I suggest three different ways in which we could go.
1) Carry on regardless
If things continue as they are, 20% of schools will be judged to be providing an education which is less than ‘Good’, as defined by Ofsted. Non-selective schools serving the disadvantaged will be more likely to be in this 20%, as ‘Achievement of Pupils’ in these schools will of necessity be compared unfavourably with schools serving the advantaged.
Regardless of whether Inspectors are conscious of their actions or not, judgements of ‘Achievement of Pupils’ are driven by their flawed understanding of test score data. Since this grade is highly correlated with the Quality of Teaching and Overall Effectiveness grades (over 95% in agreement), schools are effectively judged on test score data.
Schools which are badly run and inadequately held to account will not be identified by this system. Teachers are given no encouragement to identify problems in their schools, since a poor grading by Ofsted will directly impact on their working conditions. Inspectors will make erroneous judgements on the ‘Quality of Teaching’ based on a flawed understanding of data and superficial impressions of a school’s context.
2) Stop grading Achievement of Pupils
Achievement, as defined by Ofsted up to now, is a measure of the amount of progress children make within a given school.
This is hugely problematic, since it seems to be dawning on those in positions of influence within the education establishment that attainment data for a school – the raw test scores pupils are awarded – are a function of the children in a cohort, and they can only be said to represent that cohort and nothing else. There are no ‘trends’ from year to year, as each cohort is unique, and one may as well weigh the children and compare their mean weights.
Furthermore, since the children in a given school can’t be said to be drawn from a wider population, it does not make any sense to compare their test scores to those of the wider population as the school cohort is not an independent sample of those who have taken the same tests.
Some observers have accepted that small cohorts can often have wildly fluctuating mean results due to the variation that small samples represent. These observers tend to assume that larger samples do not suffer from similar problems, and we are starting to see ‘three year trends’ being reported as a solution to the small numbers problem. This does not tackle the core issue, however. If the cohort, or school, is representative only of itself and not of all those taking a test in a given year, any comparison based on the Central Limit Theorem is meaningless. All that can be said is that a cohort has a mean which is either above or below any national mean. There is no ‘significance’ test which has meaning.
Additionally, there is no way to ascertain what has contributed to any increase in relative test scores. It might be the school, but it might be shadow education, family involvement or lack thereof, or any number of other factors. So Ofsted are in no position to make any objective assessment of ‘achievement’ or ‘progress’ as it is currently defined.
Therefore, ‘Achievement of Pupils’ should be seen as a redundant measure, in the same way that graded lesson observations - which Ofsted agreed (with admirable speed, earlier this year) cannot be assessed objectively - are no longer used in inspections.
The government’s Performance Tables provide test score and other data for those who wish to judge schools using numbers. Allowing parents to draw their own conclusion seems preferable to the current situation, in which dubious conclusions are drawn based on numbers which are fuzzy, biased and often Not Even Wrong.
3) Stop grading Quality of Teaching
This judgement has been shown to reflect the ‘Achievement of Pupils’ judgement in the majority of Ofsted reports. If the Achievement of Pupils grade is removed, this judgement should also be discontinued.
It is simply not possible for Inspectors to make an objective assessment of the Quality of Teaching in a two-day inspection visit. In 2013, we were told that Inspectors did ‘lesson observations with senior leaders to agree the quality of teaching. And triangulate this with the progress that students are making. We look at books too, and check the quality of feedback.’ Now that lesson observations cannot be used to ‘agree the quality of teaching’, Ofsted is trying to limit the impact of the vacuum this has left.
Michael Tidd has written eloquently on this in his recent post, ‘Teaching today: not enough evidence; too much evidencing’. As Michael says, ‘All the time Ofsted are criticizing schools for failing to evidence things, or praising those schools who excel at producing evidence, other school leaders will feel compelled to continue to demand that work be evidenced.’
However this judgement grade is reworded, it will always be problematic. Ofsted must trust head teachers/senior management teams to run their schools, and governing bodies to hold head teachers/senior management teams to account, rather than try to micromanage what happens in classrooms.
Does test score data have a role in assessing schools?
It is clear that some people high up in government education still believe that test score data has some role in schools. Even within schools, there are those who think that test scores are useful.
The past twenty years have seen vast improvements in schools’ test results, which could be seen as a vindication of their use to ‘drive up standards’. I have some sympathy with this position. I can understand that those who don’t work in schools – and especially those who don’t work in ‘difficult’ schools – think that teachers need some kind of stick to encourage them to have high aspirations and to aim high for the children in their charge. There may indeed be some teachers who have ‘low expectations’. This is not my experience, and it is not what research into teacher aspirations suggests.
It is for governing bodies to put pressure on their senior management teams, and head teachers to work with their teaching teams, all with the goal of aiming as high as possible.
Nationally, a ‘floor target’ makes some kind of sense – as long as the grading structure of terminal exams doesn’t make the task impossible. But at school level, it simply doesn’t make sense to set ‘floor targets’ at, say, 85%, when a single child represents much more than one of those percentage points. 85% of 30 children – a typical Year 6 cohort – is 25 and a half children out of a class of thirty. That means that just five children have to have a bad day, or a difficult life, or any one of hundreds of problems which have nothing to do with their teachers or schools, for the school to be said to be failing. That’s wrong and makes no sense.
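For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes that a school meets the target when the percentage of pupils reaching the expected standard is at or above the target figure. That rule, the function name and the dictionary keys are my own illustration, not anything taken from official guidance.

```python
def floor_target_summary(cohort_size: int, target_percent: int) -> dict:
    """Illustrative only: how many pupils a percentage floor target allows to miss.

    Assumes the target is met when passes / cohort_size >= target_percent / 100.
    """
    # Smallest whole number of pupils that reaches the target (integer ceiling).
    pupils_needed = (cohort_size * target_percent + 99) // 100
    return {
        # How many percentage points a single child represents.
        "points_per_pupil": 100 / cohort_size,
        "pupils_needed": pupils_needed,
        # Pupils who can miss the standard while the school still meets the target.
        "misses_allowed": cohort_size - pupils_needed,
        # The number of misses at which the school falls below the target.
        "misses_to_fail": cohort_size - pupils_needed + 1,
    }


summary = floor_target_summary(cohort_size=30, target_percent=85)
print(summary)
```

With a cohort of 30 and a target of 85%, each child is worth a little over three percentage points, 26 children must reach the standard, and the fifth child to miss it takes the school below the line.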
Finally
My suggestions are therefore:
1) Carry on regardless
2) Stop grading Achievement of Pupils
3) Stop grading Quality of Teaching
As ever, comments are more than welcome.