Classical test theory (CTT) and item response theory (IRT) are the two standard families of test-theory models. Both are used to clarify the relationship between the scores that are actually observed and the unobserved proficiency in the domain being measured. The use of IRT models has grown over the past few decades, which signals their importance in developing and analyzing medical education assessments (Magno, 2009). IRT models apply to several evaluation tasks, including test form assembly, item analysis, and equating. However, despite their usefulness in many circumstances, they are mathematically more complicated than CTT models and make relatively strong assumptions in their analysis. It may therefore be appropriate to use classical test theory in some instances, such as in more localized settings or when the assumptions of IRT cannot be met.
When measuring an individual's self-esteem with classical test theory, the initial assumption is that the ability of interest is the sole cause of systematic differences between responses. An instrument that can be used to scale the level of self-esteem in individuals is the Rosenberg Self-Esteem Scale, which attempts to quantify an individual's sense of self-worth (Zanon, Hutz, Yoo & Hambleton, 2016). The scale consists of ten items, each answered on a scale of 1 - 4 from strongly agree to strongly disagree, with six items worded negatively and four worded positively.
When classical test theory is used for this test, significant restrictions are imposed on the scaling through its assumptions. These include the assumption that a response of '1' means the same thing on every item, and that moving from '1' to '2' conveys the same amount of information about the trait on every item. Item response theory, on the other hand, does not impose such restrictions in the analysis.
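The equal-weight, equal-interval assumption above is visible in how a CTT total score is computed: every item contributes the same plain sum, with negatively worded items reverse-coded first. The sketch below assumes a hypothetical reverse-coding key; the item numbers are illustrative, not the published scoring key for the Rosenberg scale.

```python
# Hypothetical CTT scoring of a 10-item, 4-point Likert scale such as the
# Rosenberg Self-Esteem Scale. The reverse-coded item numbers are assumed
# for illustration, not the published key.

def ctt_score(responses, reverse_items):
    """Sum-score a list of 1-4 responses, reverse-coding negatively worded items.

    Under CTT the total is a plain sum: every item carries equal weight, and
    a one-point step means the same amount of the trait on every item.
    """
    total = 0
    for i, r in enumerate(responses, start=1):
        if not 1 <= r <= 4:
            raise ValueError(f"response {r} outside the 1-4 scale")
        # Reverse-code so that a higher value always means higher self-esteem.
        total += (5 - r) if i in reverse_items else r
    return total

# Example: ten responses, with items 3, 5, 8, 9, and 10 reverse-coded
# (an assumed key for illustration).
responses = [4, 3, 2, 4, 1, 3, 4, 2, 1, 2]
print(ctt_score(responses, reverse_items={3, 5, 8, 9, 10}))  # prints 35
```

Note that nothing in this computation distinguishes one item from another: that uniformity is exactly the restriction that IRT relaxes.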
When the Rosenberg Self-Esteem Scale is scored with IRT, a latent trait is estimated that summarizes all the information in the responses (Zanon, Hutz, Yoo & Hambleton, 2016). The values generated are standardized and comparable across different rounds and cohorts. It is also possible to compute a standard error for each trait estimate, which gives the analyst guidance on the precision of the estimated latent trait. Within the framework of item response theory, the latent quality being measured, i.e., self-esteem, is assumed to be consistent and to follow a standard normal distribution.
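The ideas in this paragraph can be sketched numerically. The code below is a minimal illustration, not a production IRT analysis: it scores a few polytomous responses under Samejima's graded response model using made-up item parameters (a = discrimination, b = category thresholds), which a real analysis would estimate from data. It returns both the expected a posteriori (EAP) trait estimate and its posterior standard deviation, which serves as the standard error mentioned above, integrating against the standard normal prior the model assumes.

```python
import math

# Assumed item parameters for three 4-category items (illustrative only).
ITEMS = [
    {"a": 1.2, "b": [-1.5, -0.2, 1.0]},  # thresholds for responses 2, 3, 4
    {"a": 0.9, "b": [-1.0, 0.1, 1.3]},
    {"a": 1.5, "b": [-0.8, 0.4, 1.6]},
]

def category_prob(theta, item, k):
    """P(response == k | theta) under the graded response model."""
    def p_star(j):  # P(response >= j | theta), with boundary cases
        if j <= 1:
            return 1.0
        if j > len(item["b"]) + 1:
            return 0.0
        return 1.0 / (1.0 + math.exp(-item["a"] * (theta - item["b"][j - 2])))
    return p_star(k) - p_star(k + 1)

def eap(responses):
    """EAP estimate of the latent trait and its posterior SD (the IRT
    standard error), using a fixed quadrature grid over a N(0, 1) prior."""
    grid = [g / 10.0 for g in range(-40, 41)]  # theta from -4 to 4
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # standard normal prior (unnormalized)
        for item, r in zip(ITEMS, responses):
            w *= category_prob(theta, item, r)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

theta_hat, se = eap([4, 3, 4])
print(f"estimated trait = {theta_hat:.2f}, standard error = {se:.2f}")
```

Because the trait is estimated on the same standard normal metric regardless of which items were answered, estimates from different administrations remain comparable, which is the property the paragraph above describes.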
Though CTT has served well for years, its use in recent times has faced restrictions because its test statistics and item statistics are group dependent. An observed score in CTT is item dependent, and the item statistics, such as discrimination and difficulty, depend on the group of responders, which increases the theoretical difficulty of using CTT. Measurement situations that require identifying biased items, equating tests, constructing item banks, or running computerized adaptive testing render CTT inapplicable ("Comparison of classical test theory and item response theory in terms of item parameters", 2014). CTT is, however, more straightforward, as in this case of measuring self-esteem, and does not rely on complex theoretical models as IRT does. Its simplicity means it can easily be applied to a pool of examinees, with the success rate on the scores determined empirically. Item discrimination is computed from the responses provided; each item is scored dichotomously, and the estimates are combined into a weighted average. The test score comprises a true score and an error score. Rigorous standardization of the testing conditions, both external and internal, eliminates variations that would otherwise introduce random error into the test data.
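The classical item statistics discussed above can be sketched directly. The code below, with invented dichotomous data for illustration, computes item difficulty as the proportion answering correctly and item discrimination as the corrected item-total correlation; because both are computed from one particular group of examinees, they illustrate the group dependence that limits CTT.

```python
import math

# Invented response matrix for illustration: rows are examinees,
# columns are dichotomously scored items (1 = correct, 0 = incorrect).
DATA = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def item_stats(data):
    """Per-item CTT statistics: difficulty (p) and discrimination (r)."""
    stats = []
    for j in range(len(data[0])):
        item = [row[j] for row in data]
        # "Corrected" total excludes the item itself to avoid inflating r.
        rest = [sum(row) - row[j] for row in data]
        stats.append({"p": sum(item) / len(item), "r": pearson(item, rest)})
    return stats

for j, s in enumerate(item_stats(DATA), start=1):
    print(f"item {j}: difficulty={s['p']:.2f} discrimination={s['r']:.2f}")
```

Rescoring the same items with a different group of examinees would change both statistics, whereas IRT item parameters are, in principle, invariant across groups once placed on a common scale.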
References
Magno, C. (2009). Demonstrating the Difference between Classical Test Theory and Item Response Theory Using Derived Test Data. The International Journal of Educational and Psychological Assessment, 1(1), 1 - 11.
Zanon, C., Hutz, C., Yoo, H., & Hambleton, R. (2016). An application of item response theory to psychological test development. Psicologia: Reflexão e Crítica, 1 - 10. doi: 10.1186/s41155-016-0040-x
Comparison of classical test theory and item response theory in terms of item parameters. (2014). European Journal of Research on Education, 2(1), 1 - 6. Retrieved from http://iassr.org/journal