Everyone aspires to systems that make no errors, but in truth most systems we design embody a choice about which errors to tolerate. The most basic distinction is between Type I and Type II errors. A Type I error is a false positive – the test flags something that is not there, so it skews towards over-reporting the actual number; a Type II error is a false negative – the test misses something that is there, so it skews towards under-reporting the number.
Choosing which error you prefer is critical. In medicine, we prefer Type I errors: if you are testing for cancer, better to get a few initial diagnoses that are incorrect than to miss a case where cancer is present. In law, we have generally established that we prefer Type II errors: better to have some criminals go free than to jail innocent people (and DNA testing is helping to reduce wrongful convictions).
The DPS/DCTA teacher evaluation system is designed for Type II errors: the grievance process and collective bargaining agreement reduce the possibility that teachers are dismissed unfairly – with the consequence that some teachers who should be dismissed are not. All other things being equal, average teacher quality is lower with Type II errors.
A teacher evaluation system set up for Type I errors would dismiss more teachers who are poor performers – but with the consequence that some teachers might be dismissed unfairly. All other things being equal, average teacher quality is higher with Type I errors.
An evaluation system designed for Type II errors favors teachers (they are less likely to lose their jobs); a system designed for Type I errors favors students (they are less likely to have a bad teacher).
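To make the trade-off concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the scores, the "truly poor" labels, the cutoffs and the function name count_errors are assumptions, not data or rules from any real evaluation system. A teacher is flagged for dismissal when their score falls below the cutoff; a Type I error is a good teacher who gets flagged, and a Type II error is a poor teacher who does not.

```python
def count_errors(scores, truly_poor, cutoff):
    """Flag every score below `cutoff` and count both kinds of error.

    Returns (type_i, type_ii):
      type_i  = flagged although not truly poor (false positive)
      type_ii = not flagged although truly poor (false negative)
    """
    pairs = list(zip(scores, truly_poor))
    type_i = sum(1 for score, poor in pairs if score < cutoff and not poor)
    type_ii = sum(1 for score, poor in pairs if score >= cutoff and poor)
    return type_i, type_ii


# Hypothetical evaluation scores (0-100) and hypothetical ground truth.
scores = [35, 42, 48, 55, 58, 63, 70, 81]
truly_poor = [True, True, False, True, False, False, False, False]

# A low cutoff protects teachers; a high cutoff protects students.
for cutoff in (40, 50, 60):
    type_i, type_ii = count_errors(scores, truly_poor, cutoff)
    print(f"cutoff={cutoff}: Type I={type_i}, Type II={type_ii}")
```

With these made-up numbers, raising the cutoff from 40 to 60 moves the counts from no Type I errors and two Type II errors to two Type I errors and none of Type II. The cutoff does not remove error; it only decides who bears it.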
The results we have are exactly the ones that the system we decided to use is designed to produce.