Tuesday, July 31, 2012

Testing Results Rank Students But Don't Measure Learning

The New York Times covers a Texas testing controversy. The statewide tests rank students but don't measure learning effectively:
Now, in studies that threaten to shake the foundation of high-stakes test-based accountability, Mr. Stroup and two other researchers said they believe they have found the reason: a glitch embedded in the DNA of the state exams that, as a result of a statistical method used to assemble them, suggests they are virtually useless at measuring the effects of classroom instruction.
Pearson, which has a five-year, $468 million contract to create the state’s tests through 2015, uses “item response theory” to devise standardized exams, as other testing companies do. Using I.R.T., developers select questions based on a model that correlates students’ ability with the probability that they will get a question right.
That produces a test that Mr. Stroup said is more sensitive to how it ranks students than to measuring what they have learned. That design flaw also explains why Richardson students’ scores on the previous year’s TAKS test were a better predictor of performance on the next year’s TAKS test than the benchmark exams were, he said. The benchmark exams were developed by the district, the TAKS by the testing company.
South Dakota also uses Pearson. The company's name appears 82 times in the 2012 Dakota STEP Test Coordinator Manual. The Dakota STEP also uses "item response theory." According to the 2011 Technical report:
The process of calibration, linking, and scaling utilizes item response theory (IRT), which is based on the idea that the characteristics of individual items can be used in conjunction with student item responses to produce estimates of students’ levels of achievement. The particular model used in calibration and linking for the Dakota STEP assessment is the Rasch IRT model (Rasch, 1960). [Italics in original]
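To make the quoted passage concrete: the Rasch model estimates, for each item, the probability that a student answers it correctly as a function of the student's ability and the item's difficulty. The sketch below is illustrative only (the function names and parameter values are mine, not Pearson's or the technical report's); it shows the standard Rasch form, which orders students by ability rather than measuring any particular classroom content.

```python
import math

def rasch_probability(theta, b):
    """Probability that a student with ability `theta` answers an item
    of difficulty `b` correctly, under the Rasch model (Rasch, 1960)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student whose ability exactly matches the item's difficulty
# has a 50% chance of answering correctly.
print(rasch_probability(0.0, 0.0))  # 0.5

# The model is purely ordinal in ability: on any given item, a
# higher-ability student always has a higher probability of success,
# which is why such tests rank students so reliably.
print(rasch_probability(1.0, 0.0) > rasch_probability(-1.0, 0.0))  # True
```

Note that nothing in the model ties an item to what was taught in a given year; it only places students and items on a common difficulty scale, which is the gist of Stroup's critique.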
Earlier this month Josh Verges reported:
That will change soon as South Dakota gets on board with a national education reform movement that has the support of the Obama administration and Republican governors. By 2014-15, state law requires that half of every South Dakota public school teacher’s yearly evaluation be based on quantitative measures of student growth.
Verges may have buried the lede:
Schopp doesn’t know how the state will evaluate its teachers two years from now, but she’s confident the process will be an improvement over what schools are doing today.
“It’s not going to be a perfect system, but it’s not a perfect system right now,” she said. “In fact, it’s pretty dysfunctional.”
Tying evaluations to tests that don't measure learning has made the system more dysfunctional. I have a sinking feeling that the final response, like the tests themselves, will follow a Texas model.

1 comment:

Mike Larson said...

It sounds like ALEC will be bringing more to South Dakota laws. Lederman proudly displays that he has to go to them, not to educators, to come up with reform.