Thursday, May 11, 2017

Beware If the Ohio D. O. E. Publishes Teacher Scores


First, let me establish my credentials. I am not a statistician, nor am I a math teacher, nor do I have access to all of the data required to draw conclusions about the matters that follow. I am a high-school English teacher, and since those are my qualifications, I share my concerns in a narrative. I will let you, fair audience, draw the conclusions.

The 2014-2015 school year was the first for which teachers in my PLC received value-added scores from the State of Ohio. We were told that these scores, by comparing longitudinal testing data for each student, measured how much each student grew in a given year compared to the State's expectation. Once the results for all of a teacher's students were tabulated, the State assigned each teacher a performance score (1-5). A score of 3 suggests that a teacher hit expectations; his or her students grew as expected--essentially a grade of C. As you might imagine, a score below 3 is bad news, and a score above 3 is praiseworthy.

So, I wanted to do what any thinking person would want to do--and what we were taught to do in our statistics classes. I wanted to work with the numbers. I opened a dialogue with my peers, and most of them were willing to share their scores. Interestingly, the data clearly demonstrated that, to a person, teachers in the honors classes had higher scores than teachers in the regular-ed classes, who in turn had higher scores than teachers who taught inclusion classes.

I remembered enough from my statistics professor’s lessons to know that there are two possible conclusions: either all of the honors teachers are significantly better teachers than all the other teachers, or something about the natural abilities of the children in the sample was affecting teacher scores. Thus, I was able to conclude that--since we were unable to determine whether teacher quality or student ability determined teacher scores--the data were statistically invalid in terms of measuring the quality of teachers. After all, to measure a single variable (teacher quality), all other variables must be controlled--and that had obviously not occurred. So, I made my case; we all talked about it for a while, and life went on.

This year, the data came in for the 2015-2016 school year, and I was curious whether we could demonstrate the same flaw in the State's data, so I initiated the same conversation as the previous year, but teachers were less willing to share their scores. So, thinking it important to get to the bottom of this matter, I asked the administration if, without any names being revealed, I might have access to the scores. To make a long story short, I was told that, because of teacher privacy, scores other than my own were none of my business. I replied that I respected teacher privacy--and that I had no interest in knowing which score belonged to which of my peers--but that a lack of transparency about the scores created a situation where, because the data were being hidden, teachers were helpless to point out errors or malpractice. My argument gained no traction. In short, there is no checks-and-balances tool built into the system. Teachers are simply supposed to passively accept their results.

That answer is not good enough for any thinking person. So, to demonstrate the cogency of my concern, I shared these two tables of hypothetical scenarios with administration and asked them (since they would not share the data with me) to please consider our data in these terms to test for validity.

Dear math teachers and statisticians, if I am off here, feel free to crush my hypothesis. I am, after all, an English teacher doing math--at best my qualifications are dubious...

Scenario 1:

Class       Teacher Score  Teacher Score  Teacher Score  Teacher Score  Teacher Score
Honors      5              4              4              5              5
Regular     4              3              4              3              3
Inclusion   3              2              3              3              2

Honors: Mean = 4.6; Median = 5
Regular: Mean = 3.4; Median = 3
Inclusion: Mean = 2.6; Median = 3

In this situation, the data suggest that it is possible (and perhaps likely) that student ability is affecting teacher scores.

Scenario 2:

Class       Teacher Score  Teacher Score  Teacher Score  Teacher Score  Teacher Score
Honors      2              3              5              5              3
Regular     5              3              3              2              4
Inclusion   5              4              3              3              4

Honors: Mean = 3.6; Median = 3
Regular: Mean = 3.4; Median = 3
Inclusion: Mean = 3.8; Median = 4

In this situation, the data suggest that it is less likely that student ability is affecting teacher scores.
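
For readers who would like to check the arithmetic themselves, here is a minimal Python sketch that recomputes the mean and median for each class type in both hypothetical scenarios. The function names (summarize, gap_between_means) are my own illustration, and the gap between the highest and lowest class mean is only a rough, informal indicator, not a formal statistical test.

```python
from statistics import mean, median

# Hypothetical teacher scores, copied from the two scenarios in this post.
scenario_1 = {
    "Honors": [5, 4, 4, 5, 5],
    "Regular": [4, 3, 4, 3, 3],
    "Inclusion": [3, 2, 3, 3, 2],
}
scenario_2 = {
    "Honors": [2, 3, 5, 5, 3],
    "Regular": [5, 3, 3, 2, 4],
    "Inclusion": [5, 4, 3, 3, 4],
}


def summarize(scenario):
    """Return {class type: (mean rounded to one decimal, median)}."""
    return {
        class_type: (round(mean(scores), 1), median(scores))
        for class_type, scores in scenario.items()
    }


def gap_between_means(scenario):
    """Difference between the highest and lowest class mean (rough indicator only)."""
    means = [mean(scores) for scores in scenario.values()]
    return round(max(means) - min(means), 1)


for name, scenario in (("Scenario 1", scenario_1), ("Scenario 2", scenario_2)):
    print(name)
    for class_type, (avg, mid) in summarize(scenario).items():
        print(f"  {class_type}: Mean = {avg}; Median = {mid}")
    print(f"  Gap between highest and lowest class mean: {gap_between_means(scenario)}")
```

With these made-up numbers, the gap between class means is 2.0 points in Scenario 1 and 0.4 points in Scenario 2. A large, consistently ordered gap (honors above regular above inclusion) is the pattern that would make me suspect student ability is leaking into teacher scores; a small, unordered gap would make that less likely.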
_______

Using these models, I explained to administration that until we answer the question of whether student ability is affecting teacher scores, our scores are invalid as a measure of teacher quality. Thus, there is no evidence to support the claim that a teacher who earned a 5 is superior, that a teacher who earned a 3 is average, or that a teacher who earned a 2 is subpar.

Sure, teachers can feel good or bad about themselves depending on their scores. And administrators can speculate about whether the results match their feelings about certain teachers. However, until we can prove that all variables other than teacher quality have been controlled, the data and any conclusions derived from those data are invalid. Thus, any self-loathing, self-congratulation, or speculation about teacher quality is dangerous.

To my knowledge, the administration ignored my suggestion to question the State's data.

Here my narrative concludes. Those past data to which I had access are clearly flawed. The new data that teachers need to protect their own and their students’ best interests are being withheld from stakeholders. And, as far as I know, administrators are unwilling to test the validity of those data in even the simplest of ways.

Now, to be balanced, I must hold myself to the same standard that I hold those people whom I am discussing.  From the data and the information that I have, I cannot claim that Ohio's teacher-ranking system is fundamentally flawed, and I cannot claim that there is any malpractice or attempt to deceive by anyone mentioned or alluded to in my piece.  I can only claim that the small data sample that I have is flawed and that other data are being withheld from me.  Any other conclusions are speculation.

And, as anyone who has been alive long enough knows, any system that fails to have a built-in checks-and-balances tool is fundamentally untrustworthy, especially if those being measured have no voice.

PS: To my knowledge, in all states that have attempted to use such a system, the results have been proven invalid and the systems collapsed under their own lack of integrity.  





