Examining judgements: theory and practice of awarding public examination grades
This thesis reports a study of the processes by which public examination grades are awarded. Following a review of the purposes of public examinations, new theoretical analyses are given of the issues of norm- and criterion-referencing, the nature of public examination standards, the problems of defining comparable standards across widely disparate assessment domains, and the more technical matters of aggregating marks and examiners' judgements. The main empirical work investigated conventional public examination grade awarding using a combination of participant observation of examiners making judgements and statistical analysis of examination outcomes. Two additional experiments are also reported: one on grade, rather than mark, aggregation methods and one on the use of strong criterion-referencing to award grades.

The main conclusions of the study are as follows:

1. Examination standards are social constructs created by special groups of judges, known as awarders, who are empowered, through the examining boards as government-regulated social institutions, to evaluate the quality of students' attainment on behalf of society as a whole.
2. As a result, examination standards can be defined only in terms of human evaluative judgements and must be set initially on the basis of such judgements.
3. The process by which awarders judge candidates' work is one in which direct and immediate evaluations are formed and revised as the awarder reads through the work. At the conscious level, it is not a computational process and it cannot, therefore, be mechanised by the use of high-level rule-bound procedures and explicit criteria.
4. Awarders' judgements of candidates' work are inadequate, by themselves, as a basis for maintaining comparable standards in successive examinations on the same syllabus. The reasons for this are related both to the social psychology of awarding meetings and to the fundamental nature of awarders' judgements.
5. The use of statistical data alongside awarders' judgements greatly improves the maintenance of standards, and research should be carried out into the feasibility of using solely statistical approaches to maintain standards in successive examinations on the same syllabus.
6. A broadening of the range of interest groups explicitly represented among judges initially setting standards should also be considered.