Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.488645
Title: The effect of modelling rating severity on candidates' measurement in an English language essay examination
Author: Haynes, Anthony Bindley
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2008
Abstract:
Rater effects are of concern when different raters score candidates' responses. This study demonstrates how models can be used to evaluate the scores assigned by raters on a free-response English essay question from a high-stakes Examinations Board in the Caribbean Region. In addition, it seeks to use these models to assess the validity of the grades based on these scores. A new Classical test theory (CTT) model was created and compared with many-faceted Rasch measurement (MFRM) models of rater severity, comparing the effects of modelling individual rater severity and Table group severity, and the additional influence of the Table Leaders (considered by policy as 'standard bearers'). Models for the whole marking period were also compared with those for seven individual days of marking within the period. The models used exposed quite substantial differences in person measurement. In particular, the study shows how variations in severity within and across moderated (standardised) Table groups of raters can be significantly sustained and variable over time. Adjusting for rater severity changes the scores by only 0.5 marks on average, but the subsequent effect on the grades is that about 6% of the candidates' grades would change by at least one grade level. The statistical modelling of the Table group rather than the individual raters is also new to the literature and is used for an empirical investigation of the Table group and so, by implication, its Table Leader. The individual raters and Table groups significantly affect candidate scores and grades, but the variation between the Table groups is smaller than that between individual raters. Accounting for Table Leaders' severity has shown that the Table Leaders' input in general led to a depression of scores and grades, implying that the Table Leaders were on average more severe in their allocation of marks than the Table group and the individual raters they supervised.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.488645
DOI: Not available