Standardized testing confuses many people, especially in education, where K-12 students are required (at least in the State of Ohio) to sit for annual exams in subjects including language arts, mathematics, social studies, and science. The scores from these tests are standardized and normalized against all other scores in the population (in this case, all students in the State of Ohio). Many have mixed feelings about standardized testing, but most likely do not fully understand what a standard score actually is.
To compare assessment scores accurately across multiple distributions (in this case, scores on the language arts test with scores on the social studies test), it is necessary to standardize, and in some cases normalize, the scores before analysis. From Hinkle, Wiersma, & Jurs (2003) we know that a standard score, or z score, is the difference between the raw score and the mean of a selected distribution, divided by that distribution's standard deviation. “The z score indicates the number of standard deviations a corresponding raw score is above or below the mean” (Hinkle, Wiersma, & Jurs, 2003, p. 71); therefore, we can tell whether a score is above or below the mean, and by how far.
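As a minimal sketch of the z score calculation described above, the following Python example standardizes two sets of scores. The score lists and the helper name `z_scores` are hypothetical and chosen only for illustration; they are not data from any actual test. The sketch also assumes the scores are treated as a full population, so it uses the population standard deviation.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Return the z score of each raw score within its own distribution."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population SD; statistics.stdev() would treat the scores as a sample
    return [(x - mu) / sigma for x in raw_scores]

# Hypothetical raw scores for the same five students on two differently scaled tests
language_arts = [70, 80, 90, 100, 110]
social_studies = [40, 45, 50, 55, 60]

la_z = z_scores(language_arts)
ss_z = z_scores(social_studies)

for la, ss in zip(la_z, ss_z):
    print(f"language arts z = {la:+.3f}, social studies z = {ss:+.3f}")
```

Although the raw scores sit on different scales, each student ends up with the same z score on both tests in this made-up example, which is what makes the two distributions directly comparable.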
Normalizing a score relates it to the normal distribution itself. The normal distribution “is not determined by any specific event in nature, and it does not reflect a specific law of nature” (Hinkle, Wiersma, & Jurs, 2003, p. 80); rather, it is a model describing the normal, or usual, distribution of many sets of data. By normalizing standardized scores, one can compare distributions across many sets of data and, in so doing, better analyze trends, outliers, and central tendencies. Furthermore, in the standardized normal distribution model, it is possible to ascertain what proportion of scores falls within one or more standard deviations of the mean.
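The proportion of scores falling within a given number of standard deviations of the mean can be computed directly for the normal model. The sketch below uses the standard relationship between the normal distribution and the error function; the function name `fraction_within` is a hypothetical helper, not something taken from the cited text.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {fraction_within(k):.4f}")
# within 1 SD of the mean: 0.6827
# within 2 SD of the mean: 0.9545
# within 3 SD of the mean: 0.9973
```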
In manufacturing, for example, many organizations strive for defect rates in the six-sigma range, or 3.4 defects per one million units produced. Such rates are extremely small and sit at the very tail of the normal distribution, but they are helpful in making decisions about production, quality assurance, and interventions to fix defects.
In K-12 education, we would be lucky to reach levels far below the six-sigma range, given that 3 standard deviations on either side of the mean include about 99.7% of all scores in a normal distribution. In Six Sigma operations, the corresponding figure is 99.99966%. For example, if the metric were high school graduation and a school district graduated 99% of its students, it would still be performing at only about the three-sigma level.
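The figures in this comparison can be checked with a short calculation. The sketch below assumes the common Six Sigma convention that allows a 1.5 standard deviation shift in the process mean, so that "six sigma" corresponds to the one-sided tail beyond 4.5 standard deviations; that convention is an assumption added here and is not stated in the discussion above. The helper name `per_million_missed` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def per_million_missed(success_rate):
    """Convert a success rate (e.g. 0.99) into failures per one million cases."""
    return (1 - success_rate) * 1_000_000

# Share of a normal distribution within 3 standard deviations of the mean
within_three_sd = std_normal.cdf(3) - std_normal.cdf(-3)

# Assumed convention: six sigma with a 1.5 SD shift leaves the tail beyond 4.5 SD
six_sigma_dpmo = (1 - std_normal.cdf(4.5)) * 1_000_000

print(f"within 3 SD of the mean: {within_three_sd:.2%}")
print(f"six-sigma defects per million: {six_sigma_dpmo:.1f}")
print(f"99% graduation rate, non-graduates per million: {per_million_missed(0.99):.0f}")
```

Under these assumptions, a 99% graduation rate leaves about 10,000 non-graduates per million students, compared with roughly 3.4 defects per million at the six-sigma level.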
Understanding what a normal distribution is and what standardized data represent would serve any K-12 educator or practitioner well, and would better prepare them to discuss what scores actually mean with colleagues, parents, and community members.
Most educators (classroom teachers, etc.) use a simple percentage scale for scores, which often translates into grades. Simply comparing achievement in one subject area with achievement in another is not a fair comparison, however, since the scores are neither standardized nor normalized. Perhaps we should consider standardizing and normalizing all scores and grades within K-12 in order to better understand and compare student performance across subjects?
Hinkle, D. E., Wiersma, W., & Jurs, S. G. (2003). Applied Statistics for the Behavioral Sciences (5th ed.). Boston, MA: Houghton Mifflin.