Wednesday, 06 August 2014

Norm-Referenced and Criterion-Referenced Testing

One major way in which test results can be interpreted from different perspectives involves the distinction between norm- and criterion-referenced testing, two different frames of reference that we can use to interpret test scores. As Thorndike and Hagen (1969) point out, a test score, especially just the number of questions answered correctly, “taken by itself, has no meaning. It gets meaning only by comparison with some reference” (Thorndike and Hagen 1969: 241). That comparison may be with other students, or it may be with some pre-established standard or criterion, and the difference between norm- and criterion-referenced tests derives from which of these types of reference is being used.
Norm-referenced tests (NRTs) are tests on which an examinee’s results are interpreted by comparing them to how well others did on the test. NRT scores are often reported in terms of test takers’ percentile scores, that is, the percentage of other examinees who scored below them. (Naturally, percentiles are most commonly used in large-scale testing; otherwise, it does not make much sense to divide test takers into 100 groups!) Those others may be all the other examinees who took the test, or, in the context of large-scale testing, they may be the norming sample: a representative group that took the test before it entered operational use, and whose scores were used for purposes such as estimating item (i.e., test question) difficulty and establishing the correspondence between test scores and percentiles. The norming sample needs to be large enough to ensure that the results are not due to chance. For example, if we administer a test to only 10 people, that is too few for us to make any kind of trustworthy generalizations about test difficulty. In practical terms, this means that most norm-referenced tests have norming samples of several hundred or even several thousand examinees; the number depends in part on how many people are likely to take the test after it becomes operational.
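The percentile idea described above can be made concrete with a small calculation. What follows is only a minimal sketch, assuming the simple definition given in the text (the percentage of other examinees who scored below a given test taker). The function name `percentile_rank` and the ten scores are hypothetical, chosen purely for illustration, and operational tests may define percentiles somewhat differently, for instance in how tied scores are handled.

```python
def percentile_rank(score, all_scores):
    """Return the percentage of the OTHER examinees who scored below `score`.

    Assumption: `all_scores` includes the examinee's own score exactly as
    recorded, so one copy of it is removed before comparing.
    """
    others = list(all_scores)
    others.remove(score)  # drop one copy of the examinee's own score
    if not others:
        raise ValueError("need at least one other examinee")
    below = sum(1 for s in others if s < score)
    return 100.0 * below / len(others)


# Hypothetical raw scores (number of items correct) for ten examinees.
scores = [12, 15, 15, 18, 20, 22, 25, 27, 28, 30]

for s in (12, 25, 30):
    print(f"raw score {s}: percentile rank {percentile_rank(s, scores):.1f}")
```

Note that the result says nothing about how many items there were or how much the examinee knows; it only locates the examinee relative to the group. With just ten scores, each examinee shifts the result by more than ten percentile points, which illustrates why the text calls for a large norming sample.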
The major drawback of norm-referenced tests is that they tell test users how a particular examinee performed with respect to other examinees, not how well that person did in absolute terms. In other words, we do not know how much ability or knowledge they demonstrated, except that it was more or less than a certain percentage of other test takers. That limitation is why criterion-referenced tests are so important, because we usually want to know more about students than that. “About average,” “a little below average,” and “better than most of the others” by themselves do not tell teachers much about a learner’s ability per se.
On the other hand, criterion-referenced tests (CRTs) assess language ability in terms of how much learners know in “absolute” terms, that is, in relation to one or more standards, objectives, or other criteria, and not with respect to how much other learners know. When students take a CRT, we are interested in how much ability or knowledge they are demonstrating with reference to an external standard of performance, rather than with reference to how anyone else performed. CRT scores are generally reported in terms of the percentage correct, not percentile. Thus, it is possible for all of the examinees taking a CRT to pass it; in fact, this is generally desirable in criterion-referenced achievement tests, since most teachers hope that all their students have mastered the course content.
Note also that besides being reported in terms of percentage correct, scores may also be reported in terms of a scoring rubric or a rating scale, particularly in the case of speaking or writing tests. When this is done with a CRT, however, the score bands are not defined in terms of below or above “average” or “most students,” but rather in terms of how well the student performed, that is, how much ability he or she demonstrated. A rubric that defined score bands in terms of the “average,” “usual,” or “most students,” for example, would be norm-referenced.
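Criterion-referenced reporting in terms of percentage correct can be sketched in the same spirit. This is a minimal illustration, not a recommended procedure: the 40-item test, the 80% passing standard, and the names `percent_correct` and `crt_decisions` are all assumptions introduced here, since the text does not specify any particular cut score.

```python
def percent_correct(num_correct, num_items):
    """Return the percentage of items answered correctly."""
    if num_items <= 0:
        raise ValueError("num_items must be positive")
    return 100.0 * num_correct / num_items


def crt_decisions(raw_scores, num_items, cut_percent):
    """Map each examinee to True (meets the standard) or False.

    Each examinee is compared only with the external standard
    `cut_percent`, never with the other examinees.
    """
    return {
        name: percent_correct(raw, num_items) >= cut_percent
        for name, raw in raw_scores.items()
    }


# Hypothetical class: three students, a 40-item test, an assumed 80% standard.
raw_scores = {"Student A": 36, "Student B": 34, "Student C": 39}
decisions = crt_decisions(raw_scores, num_items=40, cut_percent=80.0)

for name, passed in decisions.items():
    pct = percent_correct(raw_scores[name], 40)
    print(f"{name}: {pct:.1f}% correct, {'pass' if passed else 'fail'}")
```

Because each decision depends only on the standard, every student in this hypothetical class passes, which is the outcome the text describes as desirable for a criterion-referenced achievement test. On a percentile scale, by contrast, the same three students would necessarily be spread from lowest to highest.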
