A few weeks ago I was invited by Matt Barnum to discuss various issues in education reform through a series of letters.  Matt is a TFA alum who is now in law school.  He has written several articles in various newspapers about the complexity of improving education.  Most recently he wrote something about how it is time for TFA to fold.

My first thought was that since we have so much in common, these discussions would not have enough conflict to make them very interesting reading.  Matt says that he is generally on the ‘reform’ side of these discussions, though, so I asked him to ‘start’ the exchange.  It is interesting to be on the receiving end of one of these ‘open letters’ as I’ve initiated so many of them.  I can see why so many people didn’t respond to mine.  By the nature of the process, the recipient is likely to feel and seem like he is on the defensive.

Here is his letter, followed by my response:


Dear Gary,

It was about two and a half years ago that I started reading your book, ‘Reluctant Disciplinarian’. I had just finished TFA’s institute and was a couple weeks away from my first day teaching English at a middle school in Colorado. I remember thinking your book was a bit cynical and a bit negative.

Fast forward a couple months. I wished I’d taken your book more seriously, and I realized the wisdom of many of your views. In retrospect, I had been inculcated, by Institute, in the TFA culture of ‘high expectations’ and I believed that I was an excellent classroom manager because I could control eight or nine summer-school students when multiple other adults were present. Not as easy in a real classroom, as I soon found out. TFA’s faulty training model is perhaps a discussion for another day.

Anyway, I discovered your blog several months ago, having recently finished up my stint as a TFA corps member, and beginning to get into education policy and then some education writing. I regularly visit your blog for what I consider to be the most clear-eyed, fair-minded traditionalist view. (A note on terminology: I tend to like ‘traditionalist’ rather than ‘anti-reformer’ for obvious reasons.) Unfortunately, though I respect much of what you write, I can’t say I agree with most of it.

So why am I writing you? Because I’m interested in dialogue rather than monologues. Because I do fear that each side of the education debate has become an echo chamber. Because I’m hoping to have a meaningful discussion between TFA alumni who have divergent views on education policy.

It might be useful to start with what seems increasingly the biggest fault line between reformers and traditionalists: testing. Diane Ravitch recently said that the “most damaging things happening today stem from high-stakes testing.” I’ve got to admit that I’m baffled by the belief that high-stakes testing is that destructive.

I taught in a school district that was very reform-minded: we were regularly evaluated by principals and half of our evaluation was based on student tests, of which there were four or five high-stakes standardized assessments per year for each core subject. (The district’s former superintendent even wrote a report for the Fordham Institute on the evaluation system used.) I’ve written about some issues with my district’s evaluation system, but the amount of testing was, by and large, a good thing in my view.

What’s surprised me most about the anti-testing backlash is how little research is cited to support traditionalists’ arguments. (As a point of comparison, I’m more sympathetic to the anti-charter movement because there’s a solid research base for opposing charters. Though I don’t – a topic for another day, certainly.) Gary, you’ve long been critical of standardized tests, saying that you don’t ‘put a lot of stake into standardized tests.’ But what is your evidence for this, beyond your intuition and (admittedly extensive) experience?

Take a look at the research. The well-known Chetty study found that increased standardized test scores correlated with better life outcomes. The SAT has been a powerful predictor [pdf] of students’ first-year college GPA, correlating about as well as high-school GPA, which is pretty impressive in my view. There’s also strong evidence that the SAT is a good predictor of a student’s likelihood of graduating college. Yes, it’s a reformer talking point but I happen to think it’s true: standardized tests aren’t perfect metrics, but they are useful ones.

Traditionalists sometimes act as if preparing for a standardized test is a useless activity. Not so. Whether you like it or not, to be successful in many professions students will have to be successful at taking standardized tests. I had to pass a test to become certified as a teacher in Colorado; I’m now in law school and had to take the LSAT and will have to take the bar. Sure, there’s an argument that test prep has gone too far – and I would guess that that’s true in some schools and districts – but there should also be an acknowledgement that the ability to take a test has many real-world uses.

Finally, I think Paul Bruno has made the point well over at the Scholastic Administrator blog: it’s bizarre that many teachers are so opposed to testing when in fact they give high-stakes tests (high-stakes for their students, at least) all the time in their own classrooms. Why are standardized tests so fundamentally different than classroom tests?

I saw Karen Lewis speak at the University of Chicago a couple months ago, and she argued against standardized tests, saying, ‘My students aren’t standardized!’ That got some applause and head nods, but does that really make sense? Presumably what she meant was that all students can’t be measured by the same test – but when she was a teacher did she not give all students the same tests? Don’t you, Gary?

Say you go next door to a colleague, and design a test to give to all ninth grade algebra students, with the intent to assess them all equally, judge your own performances as instructors, and find areas of weakness, both yours and your students’. You’ve just created a standardized test, and yet you’d probably agree that doing so is pedagogically sound. I see standardized tests as scaling this idea.

Yes, we need to be careful with incentives when linking pay to test scores, we need to make sure tests are fair and accurate, we need to avoid narrowing the curriculum, and we need to ensure that teachers have a part in designing the tests (which my old district did to its credit). Of course there’s a lot of work still to be done on these matters, but it is work that can be done. These are issues that can be solved, not ones that warrant trashing high-stakes, standardized tests altogether.

Looking forward to hearing your thoughts,


My response:


Dear Matt,

Thanks for taking the time to write to me.

No real educator, certainly not me, is opposed to tests.  As a math teacher I give tests all the time.  Over the years I’ve slowly increased the value of these tests, and now they are nearly 90% of my students’ grades for my eleventh graders (I also teach an elective called math research which is more project based and quizzes only count for 15% of that grade).  I take great pride in my ability to make a great test — one that is not too hard or too easy, one that has just the right amount of ‘skills’ but also challenges my students to prove that they understand the underlying ideas enough that they can apply them to an unfamiliar setting.

During my summer training for TFA in 1991 I remember reading somewhere that in an education utopia there would not be any grades.  Students would study and learn because they were intrinsically motivated and interested in the material.  I remember thinking, back then, that I agreed with this.  But I changed my mind pretty quickly once I became an actual teacher.  If you look through both of the books I wrote about teaching, you will see that tests and students succeeding on these tests are a big component of my philosophy of teaching.  Tests serve a lot of purposes.  For one thing, tests serve as a way to encourage students to study.  Also, tests are a way to reward students who have been diligently doing all their homework and punish, indirectly, students who have been cheating by copying homework from others.  In my own class it is not uncommon for me to say something like “Last year there was a ten point test question on this topic which a lot of people lost points on for making this mistake, so please be aware of this as you do your homework and study.”  I’d say that I refer to ‘the test’ in one way or another, on average, a few times each period.

At my school I’ve been the chairperson for the eleventh grade math final exam for the past three years.  In this capacity I have to collect questions from a team of teachers and compose a test by sorting out the ‘good’ questions from the ‘bad’ ones.  I teach at Stuyvesant High School which is, by many standards, one of the best schools in the country.  I compose a final which, if I did it right, should produce an average of a little under 90.  If it is way off of that, I have made the test too easy or too hard.  There was a time at my school when the teachers who taught the ‘honors’ eleventh graders didn’t want their students to have to take the ‘regular’ final.  They argued that it was too easy for them and would inflate their grades so they wanted to make their own test.  In the debate over this, in which I was on the winning side, it was agreed that the honors students should have to take the regular final.  The main argument that was sold was that it is OK if the honors students ace the final since getting a good grade in honors will help them get into college.  In the back of everyone’s minds, though, was the suspicion that the honors teachers were concerned that their students would not do so well on the ‘regular’ final which would make those teachers look like they were not covering the material they were supposed to.  So, in that sense, I do like ‘standardized’ tests for many purposes.

Like the two people you mentioned, Karen Lewis and Diane Ravitch, I am not opposed to tests — or even standardized tests.  The issue I have is that the scores on these tests are being misused in a way that, in my opinion, will ultimately make education in this country (and even achievement) worse.  This is why I scoff at ‘high performing’ charters who beef up their scores in various ways (and STILL can’t get them to be very good, in most cases).  This is why I oppose value-added being a component of teacher evaluation.

I suppose the biggest misuse of testing data is in teacher evaluation.  I do not believe that ‘bad teachers’ are the problem in this country.  So I really don’t think that fixing the ‘broken’ teacher evaluation system will raise ‘achievement’ in this country.  Instead what will happen is that a lot of money will be wasted (not to mention instructional time) in creating, preparing for, administering, grading, and interpreting the scores.  Despite what we hear from ‘reformers’ from time to time, money does matter and giving more of it to a school, provided they don’t waste it (which is another issue …) is a good thing.  So I don’t like to see billions of dollars wasted on educational alchemy, in this case turning test scores into golden teacher effectiveness metrics which, in theory, would result in teachers learning how to improve and, if that doesn’t work out, how to be fired.

Already we are seeing that even under these new evaluation systems most teachers are being rated ‘effective.’  This is very frustrating to reformers who think that this just means that the new evaluation systems are too lenient.  Never would they think that this might mean that they were wrong about how many bad teachers there were.  This is bad science.  You should conduct an experiment to test a hypothesis, and if the experiment disproves the hypothesis you don’t reject the experiment.

I do agree that if there are two teachers at the same school who have generally similar students for a bunch of years and for one of them the test scores on a uniform end of the year test are substantially higher than the other then it is probable that that person is a better teacher.  This is particularly true if that teacher was not ‘teaching to the test’ and, worse, choosing to not expose students to activities that promote critical thinking since that would take time away from preparing for the convergent thinking tests.

But when ‘high stakes’ are attached to these tests, a whole new dynamic opens up that, I think, hurts schools, teachers, and students.  Teachers would scan their class rosters in the beginning of each year wincing when they see that they have some students who are disruptive.  Maybe some teachers have an ‘in’ with an administrator who will transfer those students to another teacher.  I’ve read about schools that have a term called ‘bubble kids.’  This means that they have done a ‘triage’ on their students where some will pass the test with little help and some won’t pass the test, even with a lot of help.  Teachers are instructed to focus on the bubble kids who are on the border of passing and failing.  This is an unethical gaming of a system that puts too much stake in ‘percent proficient.’

Value-added based on test scores is too inaccurate for it to really be any part of a teacher’s evaluation.  How can we rely on a system which, I’ve learned, can rate the same teacher in the same year as a highly effective 7th grade teacher yet an ineffective 8th grade teacher?  It makes no sense.  The tests, which I have analyzed on my blog, are not good enough.  For enough money it might be possible to create different assessments for students that might be able to isolate teacher quality.  These tests would be quite expensive to make and to grade, so I don’t see why we should spend all those resources trying to identify the few ‘bad’ teachers whom every principal already knows about.

You mention the Chetty report that students with teachers who had high value-added had better lives.  But have you seen how easily that report has been refuted?  For one thing, the students with the ‘good’ teachers only made about $250 a year more, on average, and this was after they decided not to count the students who were making the most money, throwing them out as outliers.  Perhaps this is why they didn’t submit it for peer review.

I think the very worst misuse of testing, though, is when it is used to label schools, through a value-added type calculation, as ‘failing’ or ‘high achieving.’  In my research I’ve found that often there is not much of a difference between the two schools, and when there is a difference it is likely because the ‘high achieving’ school (often a charter school) has ‘better’ kids.  Standardized test scores are being used as a weapon to demolish schools across the country, most recently over 50 in Chicago.  Whatever benefit there might be from high-stakes use of standardized test scores (They motivate the minuscule fraction of teachers who aren’t otherwise motivated to try?  They make some charter operators very rich?) is far outweighed by the damage they have done.  I’d be very happy if schools stopped getting closed down over standardized test scores and if teacher pay and job security were not linked to them.

My belief is that the misuse of testing is beginning to backfire on the reformers.  With the new systems being rolled out around the country, as part of Race To The Top, it will soon be clear to everyone how far we are from being able to use these test scores as anything more than a very rough idea of where some students are in their knowledge.

Thanks again for writing.  I hope I’ve answered your questions and feel free to write back as often as you’d like.


To see the next part of this discussion, click here.

  1. Educator says:

    This is fascinating and I applaud both of you. Please keep this dialogue up.

    I want to raise a question about standardized tests, and testing in general. Yes, I do get that they are important to an extent, and yes, I do understand Gary’s argument that they’re being misused.

    But have we considered whether aiming to get more people to do better on standardized tests is better for society and for future careers that haven’t been invented yet? This professor seems to think that there is a trade-off between entrepreneurship and education standardization. (I’ve only watched a few lectures and browsed his site, but I haven’t read his books, so I can’t promise I am understanding him perfectly correctly.) Another way to think about it: why are Asian countries like South Korea trying to get their education system to look like ours, while here in America it feels like we’re trying to get our education system to look like South Korea’s (extreme focus on standardized tests and end results)?

    Anyhow, something to think about —

    Professor Yong Zhao

    • With all due respect to Dr. Zhao, I still don’t think that high test scores OR job prep should be the main focus of K-12 education, though of course any kid who wants to start career-planning early on in the game can likely find a way to do so, particularly if the career entails vocational education and s/he’s fortunate enough to have access to a school that offers a wide range of programs like auto-body, auto mechanics, welding, electronics, computer repair, etc. There are things that just can’t be outsourced to India, and learning one or more of them well makes people highly employable now and for the foreseeable future. We used to routinely offer that kind of education in this country, but it’s not too fashionable with either the education deform crowd or many “liberal” school improvement experts, for some reason. The new mantra is college for all. And so, we’re back to those high-stakes bubble tests.

      I think the entire focus of US education has long been wrong-headed, but it’s worked well-enough for those with money or at least a lot of affluent and near-affluent people to keep it going. Focusing on things like democratic core values, being a good citizen in a democracy, being a decent human being, etc., never seems too fashionable, and stressing any sort of critical thinking, analysis of rhetoric, etc., without a clear job-related goal doesn’t draw too much support.

      I’m still unclear what you mean by entrepreneurship, and I know it’s a word with connotations I don’t readily warm to. But I’m hoping that what you have in mind is something other than all the capitalist profiteering that immediately comes to mind, along with venture capital, and various ways to get rich without producing anything whatsoever of real value.

      Some day, we may wake up to realize that our entire education system is completely irrelevant to most children, or nearly so. But right now, there seems to be a death struggle between two unpalatable alternatives: the old tired dead forms of US education and the new repulsive education deform ideas. I’m not entirely sure which is worse, but I suspect that the profiteers are the more dangerous and ethically repugnant.

      • Educator says:

        I think Zhao argues that the students of today need to be able to adapt in the future. And our current system of bubble testing, and common core standardization, isn’t what’s needed.

        I think that’s what he means by entrepreneurship — the ability to create new ideas that lead to a better society. This may entail creating corporate type stuff, like creating the next companies that employ people, or creating new industries that we haven’t even thought of. It sounds like you might not be too fond of this based on your post, but I could be misinterpreting what you write.

        So in other words, do we even know today what people of tomorrow need to know/do? What best prepares them to adapt? NCLB? Bubble tests? Common Core? I think Zhao uses the phrase that pursuing common core in so many states is a big gamble.

      • I will have to read his stuff on entrepreneurship. I’m inclined to give him the benefit of the doubt based on his track record. But the word-choice is suspect for me.

        As for creating jobs, I’m all for it. What worries me is how the so-called job-creators of today and yesteryear seem to believe they are entitled to create, through lobbies and bribes, laws that allow them to become insanely rich at the expense of the rest of us. That, in the long haul, has led us to our repulsive and ethically-bankrupt messes of the last century, culminating in the biggest disparity between rich and poor of any industrialized nation in history (despite Matt’s intriguing word- and number-play). I doubt that’s what Zhao has in mind, but one wants to be sure.

  2. Steve M says:

    What happens when the common core-based math and science “standardized exams” hit the scene and we find that EVERYONE does poorly? Or, not everyone…mainly kids with low-SES backgrounds.

    These exams will look like the constructed-response portions of the NY Regents Exams and the old Golden State Exams that used to be given in California.

    Will the charters come out smelling like roses? Something tells me: not.

  3. Julie says:

    Thank you for drawing attention to the awful issue of ‘bubble’ kids. In my kindergarteners’ school in Chicago, all teachers were required to put up data walls in their classrooms of MAP test results. One poster for the class as a whole, then six kids who were on the bubble separated out with their scores charted separately. As though the teacher’s main goal was to bring up these six kids, thereby quickly and easily getting the most growth. I was saddened, and angered. That’s why we fight high-stakes testing.

  4. One of the main problems with standardized testing the way it has been done in my state is that it doesn’t measure learning or critical thinking. It measures a very narrow range of knowledge on two subjects–reading and math (and I guess now in science). I had plenty of students who could pass the standardized test on the first day of my class; and I had students who never would. Neither of these groups is measured in any meaningful way by these tests. For the kids in between, there is way too much allowance for manipulation by poor administration to hurt good teachers. Sadly, it is my experience that poor and unethical administration (including things like the misuse and misinterpretation of data) is more the cause of bad schools than poor teaching. Putting so much at stake on standardized tests is therefore nonsensical, to put it mildly.

  5. David Shulman says:

    Two comments:
    1) College admissions people tell me that the correlation between the SAT and ultimate degree acquisition is small. I don’t know if Chetty is an outlier.
    2) The chief purpose of a scaled up test, such as the NYS Regents Exam, used to be to ensure some degree of uniformity in covering a syllabus created and revised annually by a group of Teachers for the State. That’s all it was: a kind of herd keeping. Many NYS teachers going to other states noted, particularly in courses like math, that the class they were currently teaching had a mix of student skills due not only to students’ uniqueness, but to the freestyle nature of the content of the course the students had previously taken. Students new to the school from a neighboring school sometimes did not have any exposure to the preceding (or base) material. Thus for NYS the NYS Regents. Using those tests to evaluate teachers and schools (a fundamental no-no in the evaluation world: repurposing an instrument) is at the crux of Educators’ objections.

    Additionally, and years ago, an Assistant Principal new to my (then) school re-marked Regents papers and increased the number of passing papers. The marking committee approached the Principal, who instructed them to remove their initials from the papers. The State Ed Dept reviewed those papers, restored the original grades, and issued a disciplinary letter to the Superintendent, Principal, and Assistant Principal. He was gone at the next reorganization. IMAGINE THAT TODAY!

  6. Michael Fiorillo says:

    Two points about Matt Barnum’s letter:

    1. While he may be correct that higher standardized test scores correlate with better life outcomes, he neglects to mention that they also correlate with higher family incomes, which also have a bit to do with success later in life.

    KIPP and other charters claim their students’ higher test scores presage success later in life, but we know from their high attrition rates and lower enrollment of high-needs students that it’s not an honest comparison with the public schools.

    2. His letter does not adequately tease out the differences between tests, standardized tests and high stakes standardized tests, which teaching careers and schools now live and die by. Matt gives a pro-forma half-nod in that direction at the end of his letter, but his failure to adequately discuss it weakens his critique of Gary.

    Very few teachers would argue against testing/ assessing their students. Indeed, teachers are assessing their students daily, formally and informally. The controversy is over high stakes standardized testing, which in practice invariably narrows and homogenizes curriculum, and which is frequently used as a vehicle for implementing policies (school closings, privatization, de-unionization) that are fundamentally political in nature, but which the pseudo-scientific veneer of testing jargon and so-called education reform seek to mask.

  7. Ted Cook says:

    All the high school and college math and science tests I can remember were not multiple choice, they were show your work, big difference. Maybe that’s why colleges prefer to look at grades. How much do multiple choice tests measure the actual material versus the skill of being good at a process of elimination and playing the odds? Basically multiple choice mastery is a little harder than knowing the trick to succeed at tic tac toe or connect four, but probably easier than backgammon or chess. Bottom line, it is mastery of the tricks of a game, because by some proportion, a multiple choice test is a game and the other proportion is actual measurement of knowledge. What proportion is the game? A quarter or a third? Enough that the result is too skewed to base too much importance on, as colleges have realized over the years.

    Second question on testing and learning, what is the difference between testing on a computer screen and a piece of paper? How does that affect people and change the process?

  8. Jennie says:

    Well said, Gary…as usual.

    When I was a teacher (up until a little less than a year ago), my school did the “bubble kids” trick. It even showed up in our computer system (not on the roster in the online gradebook you see when inputting grades but another page that gave you info about the kids and their test scores, etc.) It took me a while to understand what the whole “bubble” thing was about, but once I did, I saw how disingenuous and even devious it was. The underlying premise is so rotten: don’t focus too much time/energy/resources on the kids who are already passing with flying colors, and don’t focus too much time/energy/resources on the kids who are so far from passing they probably never will; just focus on the kids who have a hope and a dream of passing (though often those “bubble kids” don’t really care all that much about passing, either).

    And what did focusing time/energy/resources on “bubble kids” look like in practice? At our school, it meant those kids were pulled out of classes for extra test prep. They were generally pulled out of electives, including, the first year it was done, my French classes–which may not be covered by a standardized test, but can be very hard to keep up with or catch up with when one is absent from class. And that’s for a GOOD student who WANTS to keep up/catch up. The students who were pulled from my class for extra test prep were NOT the “best and the brightest,” nor the most motivated…which is why they had not passed their state tests yet to begin with. However, several of them WERE passing my class prior to the test prep sessions–albeit just scraping by with low C’s. Once the test prep pull-out began, the kids in question missed at least one of my classes per week for several weeks before the state tests. I’m trying to remember if it was every single class or just once a week–but in any event it was definitely a LOT, way too much for those kids to stay caught up. Our principal made a big deal about how we were to let the students make up all the work they missed so they wouldn’t fail our class. What she conveniently forgot was that if they weren’t physically present in the class to get the new material, hear it explained and ask questions, they wouldn’t be able to make up the work they’d missed from being out of the class. I mentioned that one time and the response was that we were to help those kids, outside of class time if need be. Fine and dandy, and I would happily have done so–but helping a kid outside of class time requires said kid to show up to get help outside of class time. That can be hard enough when a kid is very motivated. When a kid is NOT motivated, it’s all but impossible.

    To make a short story long, all the kids who were routinely pulled out of my French class for test prep sessions failed my French class. Most of them still failed their state tests as well. Had they been in my class instead, perhaps they would have learned SOMETHING…none of them were really on the path to becoming fluent in French, granted, but in my class, we did a lot of etymology, a lot of grammar and vocabulary work that compared French to English and my students were essentially forced to learn grammar rules in English in order to understand French grammar. I think therefore that my class was beneficial to students who paid attention even if they never learned to produce accurate French, and I often had students tell me that the grammar/vocab they learned in my class helped them with their SATs, reading for other classes, etc.

    But did that matter? NO. All that mattered was more test prep, and only for the “bubble kids.” Fortunately (maybe because I whined a lot) the subsequent years I only had one or two kids pulled out of my class for test prep, and they were kids who were already failing so I didn’t care that much. They made more of an effort to only pull them out of gym or that type of class. But even that is problematic. Sometimes gym is the only class a kid enjoys, and could make the difference between that kid showing up at school or staying home. You take away the only classes that motivate them to show up, and you’re more likely to get a high absentee rate than a high test score.

  9. KrazyTA says:

    Gary: as always, you are restrained, direct, on task. And IMHO, correct.

    Two quick points and a modest suggestion.

    1), I am unconvinced when people assert that standardized testing [or almost anything] must be good for everyone else because it worked for them.

    2), There is a vast difference between the carefully calibrated diagnostic tests administered by a competent caring teacher to students s/he works with for months or years at a time and a high-stakes standardized test. We all need to be careful in distinguishing between very different sorts of tests and understand their very real strengths, weaknesses, and limitations.

    I would respectfully suggest that Mr. Barnum read the following: MAKING THE GRADES: MY MISADVENTURES IN THE STANDARDIZED TESTING INDUSTRY (2009), Todd Farley; MEASURING UP: WHAT EDUCATIONAL TESTING REALLY TELLS US (2008), Daniel Koretz; and THE MYTHS OF STANDARDIZED TESTS: WHY THEY DON’T TELL YOU WHAT YOU THINK THEY DO (2011), Phillip Harris, Bruce M. Smith, and Joan Harris.

    However, with all that said—without reservation I sincerely thank Mr. Barnum for engaging in dialogue with you.


    • Matt Barnum says:

      Thanks for your last line. Similarly, I’m sincerely thankful to Gary for engaging with me.

      Two quick points: I don’t think I said testing worked for me personally, though I did argue it worked at my school, and then cited research that it works in other schools – at least ‘works’ in the sense that it serves as a decent proxy for student learning. I feel like you could say the same to Gary – he largely draws from his own experience, and he in fact cites no research whatsoever to buttress his claims.

      Second, I haven’t read any of those books (I’d like to though!) but I have regularly read writing critical of standardized testing, including Diane Ravitch’s most recent book, her blog, Gary’s blog, and the Answer Sheet blog.

      • KrazyTA says:

        I inferred my first point from your ninth paragraph.

        I am not surprised that you dealt indirectly with my second because you admitted that you hadn’t read the books I mentioned. Thank you for your honesty [the web is a treacherous filter: I am not being sarcastic or demeaning]. Based on what you have written to Gary [at least Part 1] and your responses to other posters here, I will give you the benefit of the doubt, namely, that you seem genuinely at a loss to explain why “the research” backs up one side and “intuition” and “experience” buttress the other. [Quote marks based on your Part 1 comments]. I assume you do not realize how your [strongly implied] contrast is uncomfortably oversimplified: as if there were only two sides in the [variously defined] high-stakes testing debate! Life would be so so so much easier if people and discussions were just that simple…


        A simple suggestion. Start with MEASURING UP (2008) by Daniel Koretz. He’s a psychometrician, i.e., in layperson’s parlance a numbers/stats guy. It will demystify the whole high-stakes standardized testing business. Then go on to the other two books. I suspect you will come across lived experience and intuitive wisdom and hard-core research you didn’t expect. To riff off a famous line in THE TEMPEST [Act 5, Scene 1], you may then utter “O brave new world, that has such information I never knew existed!”


        With all due apologies to that most insanely Krazy Bard of Avon.


      • skepticnotcynic says:

        How do you know these reforms have worked in your school or district? Two years isn’t long enough to determine if the data-driven, high-stakes testing approach works. So, if your scores go up year-over-year, does that mean your students are more successful and are learning the skills necessary for them to become productive members of society? I’ve worked in schools where test scores went up year-over-year, but I don’t necessarily think they were successful schools, nor would I send my own kids to them (refer back to Jennie’s response for why this happens all the time in majority title-1 schools).

        If your school was such a fantastic place to work, why have you left after teaching only two years? If it was a successful school, I’m sure you would’ve put off law school for one more year. Your 3rd year is a great teaching year, and if your instructional leadership or students could not convince you to stay for a third, I doubt it was an enjoyable experience for you. The reason people have been leaving the profession in higher numbers is because the working conditions for teachers are becoming worse and worse, especially in majority title-1 schools.

        Having worked in both traditional public and high-performing charter schools, I can say the reform-minded approach is definitely not working. When you can’t retain professionals longer than 2 years, there is a serious problem with the system. You don’t see this type of turnover in the accounting, legal, or medical professions.

        Surprise, the best schools have stability and very little turnover. So, if the school serves an at-risk student population and is really challenging to work in, why make it an even more undesirable place to work for smart and innovative people? The last time I checked, highly productive people don’t want to work in a top-down, compliance driven work environment where they are questioned about their intentions or what’s best for kids.

        Why do you assume these types of reforms are going to make schools any better? If you believe in mediocrity, then yes, by all means, advocate for high-stakes testing. Last time I checked though, the elite do not send their kids to these types of schools. If you are truly championing the type of education that will give kids from low-income neighborhoods a fighting chance at a better life, then you would not be supporting these types of policies. Instead, you would support a well-rounded education that addresses the whole child and offers wraparound services.

        In response to the articles and research you keep citing: have you ever read through these studies and questioned the methodology? I have read through a lot of the research in education, and it’s mind-boggling how badly some of these studies have been conducted. I’m highly skeptical of any research where all variables cannot be controlled for and/or where the sample sizes are quite small. The longer you stay in the classroom or in schools, the sharper your bs detector gets.

      • Matt Barnum says:

        Skeptic, what I’ll say is that you make a lot of assumptions about me and my beliefs that are not true. Here are two that are wrong:

        1) That I think high teacher turnover is a good thing. I was recently very critical of TFA in part for that very reason:

        2) That my school was a ‘fantastic place to work.’ Never said it, not even close. All I said was that the testing in my district was, on balance, a good thing.

      • John Thompson says:

        How many times have you read Chetty? Personally, I don’t make comments on studies like that without multiple rereads. You don’t address anything in Chetty that supports your position on standardized testing. Neither, I would add, does Chetty’s methodology allow him to address the policy issues regarding high stakes testing. Its correlations are about averages, not individuals. They only claim to have, on the average, a reliability of 80% in regard to identifying a low performing teacher. Would you teach in a school where you had a 10%, 20%, or whatever percent chance PER YEAR for your career to be destroyed or damaged because of the incompetent way that high-stakes testing is used?

        I think you have the research backwards. There is a huge body of research showing the problems with standardized testing, and just a few “papers” (that may look on the surface like social science but are just position papers) that defend it. These papers tend to be based on correlations (a la Big Data), with no effort to tease out causation or link patterns to reality, and funded by pro-testing edu-philanthropists.

        And are all of us who see the mandates to commit educational malpractice just hallucinating? And, how many of your examples, where you don’t see damage done, does it take to offset the damage we see? How many children should we sacrifice to the harm done by bubble-in accountability to produce the benefits that you say aren’t impossible?

        Finally, how much test prep is constructive? Ask the policy question. Has standardized testing produced far more test prep than could ever be justified by any research you could conceive of?

      • Matt Barnum says:

        Some good questions – hopefully I’ll address some in my next letter to Gary. For now, though, I’ll just say that I think when stating that, “there is a huge body of research showing the problems with standardized testing,” you should perhaps cite some research showing the problems with standardized testing.

  10. skepticnotcynic says:

    I appreciate the dialogue, but I am more concerned that Mr. Barnum has left for law school to pursue a career in education policy without having spent enough time working in schools. Mr. Barnum’s response is fairly typical of a well-educated TFA corps member who has only spent a couple years in a classroom, and then jets off to pursue a consulting gig, grad school, or any other job outside of schools, since they believe they have it all figured out. I hope Mr. Barnum realizes that if he decides to continue working in education in some capacity he will garner little respect from the rank-and-file working in schools if he continues to promote “reform-minded” policies.

    From my experience, an educator’s thinking evolves the longer they stay working as a teacher or at least in some capacity within a school. However, I would prefer that a teacher has a minimum of 5 years in the classroom before entering into any sort of administrative or leadership position within a school. Even this is usually not enough for most people.

    Judging from his writing, Mr. Barnum appears to be quite thoughtful and reflective for a young teacher; however, his comment about “traditionalists” is a bit disconcerting and it demonstrates his immaturity as a young educator. There is a fundamental difference between “smart education reform” and “ed-reform.” In my opinion, the “ed-reform” arguments and buzz words are becoming more and more traditional, status-quo, and tiresome. It’s time to move beyond 2001. The carrot-and-stick approach to raising student achievement is “old-school.” New approaches are needed to get us back on track as a country, so we can make incremental progress again.

    • Matt Barnum says:

      I’ve tried to be very upfront about my limited experience in schools by acknowledging that I’m now in law school, acknowledging some of my failings as a teacher, and acknowledging how much more experience Gary has. My use of the word traditionalists is my best attempt to label the body of thought that argues against the current ed-reform movement. I hope that people address the quality of my arguments, not how many years I’ve taught for or whether traditionalist is the perfect word to use.

      • gkm001 says:

        Dear Mr. Barnum,

        I appreciate your willingness to engage in dialogue not only with Mr. Rubinstein, but with his readers. I don’t mean to pile on the questions about your own experience and background, but were you a math teacher? The reason I ask is that I suspect standardized multiple-choice tests are a better tool to “measure learning” (and we should be careful with that phrase: learning is a construct, not a physical property, and it cannot be “measured” as if it were some quantifiable property like mass or temperature) in math than in reading comprehension (another construct). As an adult — and one who has always done well on standardized tests — I have been bewildered to look at third-grade reading test questions and find that there are four inadequate answers to choose from. Or that one of the questions itself is flawed. These tests are not written by geniuses; but beyond that, there is something fundamentally wrong with the exercise of giving children four vaguely plausible answers to a question about a passage of text, and asking them to figure out which one the test-writers believe is the right answer. This bears little relationship to the real purposes of reading, writing, and learning, and certainly does nothing to further an enjoyment of reading among schoolchildren. In education, as in any social science, there are limits to what testing can tell you, because it takes place in an artificial situation, stripped of the meaningful social contexts of human behavior.

        My own children had the good fortune to attend a K-5 elementary in which there were no grades and no tests; assessment was a daily practice of the teachers, who held “book club” discussions, assigned research and writing projects, helped the children keep reading logs, and worked with the children on handwriting and spelling while the children wrote their own stories. The result is that my kids love to read and love to learn; the oldest is now in middle school, where her grades and test scores are very high — but where, due to the pressures of high-stakes testing, a lot of activity resembles the artificial situation of the standardized test rather than real-world situations of work in which people draw on their knowledge in order to solve problems or create products that have value.

        I realize that what I have just given is an anecdote, and you are looking for data; but where will you find the data that will tell you whether children are learning to enjoy reading, learning to hold a productive exchange of ideas, or learning to construct a cogent argument? What data will tell you whether and how children value their educational experiences, and what meaning school holds for them?

        If I could recommend one more book, it would be The Read-Aloud Handbook by Jim Trelease. And if I could ask one more question, it would be this: How will we know what effect our efforts to raise “student achievement” in the early grades may have on children’s growth and development as learners later on? Perhaps it is no wonder that our children, turned off to reading by a constant barrage of tests (not to mention the unappealing and irrelevant texts they often contain), begin to disengage from school as they get older. Perhaps the roots of low achievement in high school go back to a childhood deprived of play and wonder and meaningful experience.

        But I don’t know where you would find the data that would tell you that.

        Gloria Mitchell

      • Matt Barnum says:

        Wish I had the time to respond a bit more in-depth, but I did teach language arts, so I’m well aware of some of the limitations of multiple choice. That’s why I think essays and short responses are important to include on tests. (Though I do believe that it’s very possible to write a good multiple choice reading comprehension question – see the SAT for examples.)

        Fundamentally, I believe that the teachers who are the most innovative, inspirational, and who push critical thinking will also be the ones who produce the best test results (and produce better life outcomes for students – the Chetty study). If that’s true, as I believe, then I don’t have a problem using standardized tests to assess students and teachers.

      • gkm001 says:

        I’m familiar with the evidence that teachers who conduct inquiry-based and project-based learning, and who use authentic assessments and real-world standards of work, often raise students’ test scores more than other teachers, even though test results are not what they believe is the most important goal of schooling. I think of it as the “Chinese finger trap” effect, and you can see it at schools like those profiled in this PBS segment:

        But there is also evidence that schools and teachers can raise test scores by focusing on test-taking skills, narrowing the curriculum to the subjects tested, and taking students on a forced march through a year of worksheets, lectures, and textbook readings.

        If we must choose between these approaches to school, and the theories of education that underpin them, do we really want our preference for one over the other to rest solely on the fact that the first ekes out a slightly bigger test-score gain? Is that a _reason_ to prefer it? What if, in a particular district or school, the evidence comes down just slightly the other way?

        Or: what if it’s a tie? Imagine two teachers, one of whom teaches the first way and one of whom teaches the second, and both of whom get the same value-added score.

        In my view, the second teacher needs improvement, for reasons having nothing to do with the test scores and everything to do with the democratic values, dispositions toward work and learning, habits of mind, and social and critical-thinking skills she has failed to incorporate into her classroom or instill in her students.

        But the more weight we give to value-added scores, the less room there is to hold this teacher (or her school, if her teaching is merely a reflection of the school culture, which it well might be) accountable for her real and serious deficiencies.

        What do we value in education, and why?

        Is it, after all, the SAT scores themselves that lead to better life outcomes? Or is it the years of learning that these scores (approximately) represent that helps students to be more successful in college and the workplace?

        I have nothing against the SAT or GRE, by the way. I do question whether high-stakes testing of students early and often in their school careers offers them any educational benefit at all — even the scant benefit of becoming better test-takers. And I question whether raising the stakes of tests will help teachers to effectively ignore the standardized tests and teach in a way that is innovative, inspirational, and cultivates critical thinking. Taking the pressure off the test-score gains (and putting the pressure _on_ to produce evidence of the kind of teaching and learning you and I seem to agree is best) would, it seems to me, better serve that aim.

  11. meghank says:

    Thank you Gary. Like Mr. Barnum, I enjoyed taking standardized tests in school and was always good at them. But this was in the 90’s, when our elementary school teachers would tell us not to stress out on the test or worry about it. They gave us NO test prep prior to the test.

    If I had gone to elementary school in an environment like today’s, I’m sure I would have liked school less. Constant test prep takes all the fun out of learning. I wonder what his response to that statement would be.

    • Matt Barnum says:

      I don’t think I said I enjoyed taking standardized tests or that I was always good at them; I simply said that in the district that I taught in, which used a lot of high-stakes tests, I didn’t witness all the supposed deleterious effects of testing.

      I do not believe test prep should be overemphasized, and I actually don’t think that too much test prep will produce better results. There’s some evidence for this view (see here:

      If this is true, ironically, then evaluating teachers based on standardized testing will mean that teachers who spend too much time ‘drilling and killing’ will receive lower evaluations. Sounds good to me.

      • meghank says:

        But does it sound good to you that the teachers get lower evaluations if they have to spend too much time “drilling and killing” due to the dictates of their administration? My school is currently in “blitz month,” for example. That apparently means a month of district-dictated drill-and-kill.

      • Matt Barnum says:

        Agree – hopefully districts and schools will learn that taking practice tests over and over again is probably not the best way to get higher test results.

        (And I taught at a middle school.)

      • meghank says:

        I know you have the best intentions, but the fact remains that if the tests were gone, this perversion of children’s educations (constant test prep) would not be occurring.

        You hope the schools will voluntarily cease their over-emphasis on test prep.

        I think the costs of standardized tests (constant test prep for children in schools that don’t voluntarily eschew it) far outweigh the benefits you outline.

      • skepticnotcynic says:

        How is this not the best way to get results? I have gotten most of my students to pass state exams by teaching test-taking strategies, formulaic writing, process of elimination, and the list goes on. In fact, I have gotten so good at analyzing past test questions and data that I can coach a student who has no business passing the high-stakes test how to do so. I can also identify my bubble kids early in the year and tailor my instruction, so I can get them all to pass the test, even though this approach may not be best for their long-term education, especially if they have aspirations of attending college. Many of these students can barely read or write, and I know in good conscience that I would be a much better teacher if I could focus solely on actually teaching them how to read and write based on their actual reading level. However, with testing pressure from the top this is not a realistic scenario. The way the system is set up, we have to teach every child like they’re all the same. Remember, everyone takes the same test, even though my students vary from 3rd grade reading level to post college.

        I also have the skills and experience to boost pass rates with the challenging ELL population on the high school ELA exams. I’m not saying I can get every child to pass, since some are so far behind that it is a futile exercise, but I know how to set up most students for success on these exams. To say that taking and analyzing practice tests over and over again is not the best way to boost test scores is just not honest. The more familiar you are with the test and its questions, the better you will do. That’s just common sense.

      • meghank says:

        I agree with you, skeptic, and the link Mr. Barnum gave showing that test prep leads to lower test scores actually doesn’t show that at all. The study linked to from there doesn’t actually state that, either.

      • Matt Barnum says:

        Meghank, sorry should have linked to the primary source. Here it is (pdf):

        And here’s the relevant part:

        “In a comprehensive survey, CPS teachers were found to have devoted large amounts of time to prepping students for the ACT; typically, over a month of instructional time was devoted to test‐prep. The outcome? More test prep was associated with lower ACT scores.

        Research shows students who are tasked with intellectually demanding work that promotes disciplined inquiry and relevance to their lives score higher on standardized tests.”

      • meghank says:

        Hmm. So people who invest in courses like Kaplan are just wasting their money? Somehow, it seems like there must be other research out there contradicting this research. I don’t have my hands on it, however.

        I did teach a Kaplan course, and it appeared to me to be very effective at raising test scores. Again, I don’t have the research.

      • meghank says:

        I did find a fault with the study you cited. Here it is:

        “Regardless of the type of test preparation emphasized, schools where most teachers do intensive ACT preparation showed the same or lower ACT scores as schools where few teachers do intensive ACT preparation”

        The problem I see is, in schools where the students are economically disadvantaged, teachers are more likely to do intensive test prep.

        Since economic disadvantage is the biggest predictor of low standardized test scores, they should have controlled for this in their study. They did not, or not that I can see.

        Also, the scores (on page 43) are not that much lower (.3 points, on average). If they had been controlled for SES, they might have been much higher in the schools with test prep.

      • skepticnotcynic says:

        Test prep is mostly ineffective with really low-skilled students. I will agree with you on that Mr. Barnum. Unfortunately these kids are usually written off by big schools who need to focus on the bubble kids, because they know these kids have no chance at passing.

      • skepticnotcynic says:

        I am still concerned though that you are not skeptical of all the research you read, even the research that confirms your bias. My bias has formed based on my experience working for nearly a decade in the classroom. Who do you trust more, a bunch of researchers who have never taught kids, or someone who has taught over 1,000 kids and can detect bs intuitively because of pattern recognition and experience working with all different types of students? I thought exactly like you when I was a young teacher, believing everything I would read in education. As you mature, you will begin to question the efficacy of these studies and see their flaws. Especially if you actually read the research, instead of the articles in the media that link to them.

      • Matt Barnum says:

        Skeptic, I think it’s a really interesting and difficult philosophical question. Specifically: Who should we trust – researchers who look at numbers or practitioners who look at day-to-day reality?

        I surely don’t discount the insights of teachers, and I try to look critically at research (and not just read views that agree with my own – hence my desire to engage with Gary, you, and other commenters), but I am more inclined to trust data – good data (though that opens another can of worms) – over lived experience.

        Let me analogize this to baseball, a sport that I’m a fan of. (I know, I know – baseball isn’t teaching, etc etc. That’s why it’s an analogy. It only goes so far.) For decades practitioners – baseball players and coaches – had a vision about what it meant to be a good baseball player and how scouts could determine which prospects would be good and which would be duds. The views were almost unanimous. Then a group of stats nerds and general managers – like Bill James, Nate Silver, Billy Beane, and Theo Epstein – realized that the paradigm was wrong.

        Many argued that they were, to paraphrase you, ‘a bunch of researchers who had never played the game’ and therefore their opinions couldn’t be trusted. Today, they’ve been proven right.

        Again, I realize the limitations of the analogy. My point is that I have a general skepticism – which I hope you can appreciate – of those who want to discount certain views or certain research because it doesn’t jibe with their lived experience. The human brain is amazing, but incredibly imperfect – lived experience can be faulty and unreliable. Research can supplement, validate, or even challenge and debunk that lived experience.

      • meghank says:

        Well, not to be a thorn in your side, but since you mention your willingness to engage with commenters, I wish you would respond to the problem I found with this particular study. Perhaps in your next open letter, if you’d rather not do it here.

      • Matt Barnum says:

        Meghank, sure. From the study: ‘Improvements from the PLAN to the ACT are smaller the more time teachers spend on test preparation in their classes and the more they use test preparation materials.’

        This means the researchers (thankfully) didn’t simply compare schools that did a lot of test prep to schools that didn’t. They looked at improvements by students from the PLAN to the ACT based on the amount of test prep done. Since you’re comparing growth, I don’t think you can argue that these results are due to correlations with poverty and test-prep. (I couldn’t find the information specifically in the study, but the vast majority of schools in CPS are highly impoverished so I’m guessing that most schools in the non-test-prep condition still had a high poverty rate.)

      • skepticnotcynic says:

        I’ve read “Moneyball” like everyone else, and I know it’s fashionable to think that data trumps experience and wisdom. I do use data extensively in my own classroom as a tool to enhance, not supplant my intuition and experience working with all different types of students. At this point in my career, I trust my intuition over the data I encounter in education (it’s far too inaccurate). Reason being is that teaching and learning is incredibly complex, and I do not think at this point we know enough when it comes to measuring learning using faux-metrics and junk science like value-added. There are far too many variables that cannot be controlled. Most natural scientists scoff at research in sociology, psychology, and other social sciences. I’m not one to criticize the social sciences because I do see value in researching and studying it, but given my experience, I am highly skeptical of the research that I read. Data might help a younger teacher more than it would help me at this point. There are trade-offs when you obsess over data. I see it all the time in my schools with young teachers who have been indoctrinated with data being the end all be all, especially TFA’s. This prevents them from building better relationships with their students and at the end of the day, I believe this is far more effective in terms of getting results with your students.

        How have the Oakland A’s been doing these days? I know a lot of teams are using sabermetrics now, so the competitive advantage has worn off, but at the end of the day, if you use both data and have experienced, skilled and knowledgeable people, I will take that any day over a bunch of young data zealots.

      • meghank says:

        In the sentence I quoted from the article, they are not comparing growth; they are comparing test scores.

        But let’s change the subject slightly for a moment. Suppose test prep was proven in several studies you found valid to increase test scores. If that were the case, would you still be promoting test-based accountability?

        If you say no, then I know what the “traditionalists,” if you will, might want to focus on next (getting those studies done).

      • Matt Barnum says:

        I realize the part you quoted had to do with absolute scores; what I’m saying is that they looked both at growth and absolute results and found the same thing – too much test prep didn’t help, it hurt.

        To answer your question, yes, if I believed that teachers could consistently produce high test scores without producing meaningful learning, then I would very likely not favor test-based accountability. (Keep in mind that I do believe a small amount of test prep is appropriate and will lead to slight increases in test scores. I think that’s perfectly fine.)

      • meghank says:

        That’s not exactly what I said. I don’t believe test prep will consistently produce high test scores. I believe test prep will consistently produce high GAINS in test scores. Growth, as you put it.

        But your answer was good enough for me. I’ll work on finding the studies to point you towards (I’m sure they must already have been done. Kaplan, and other test prep companies, must have funded some of them).

      • Matt Barnum says:

        I meant gains, sorry.

        But be careful. I’m not saying that test prep will produce no gains or fewer gains than doing nothing (or not taking a test prep class, like Kaplan). What I’m saying is that content-based instruction will produce more gains than test-prep-based instruction (particularly on content-based tests, rather than aptitude-based tests).

        Moreover, gaining familiarity with a test like the SAT will likely lead to large gains from the starting point as students learn to manage their time, for example. But after that, I expect any gains to quickly level out. Again, this is why I support some small amount of test-prep instruction.

      • meghank says:

        I understand what you are saying. I want to thank you for engaging with me in this discussion. When I look for these studies, I am going to look for ones that focus on test prep for the state standardized tests, since that’s what I’m most concerned about. There is a teacher I know, a well-meaning, good teacher, who (after years of kindergarten) taught for the first time in a tested grade, and at the end of the year told me her reflections on that year. Her decisions? “It seemed like they all knew how to read and I could have sworn they would have made good scores on the test. But they didn’t. I don’t know why. Next year, I’m going to start working on the TCAP from day one. That was the mistake I made last year.” And, if her goal is higher scores (and that is what continuing her career is based on, so that must be a major goal), I do believe she is right. But if the goal is creating a life-long love of learning in children, she is dead wrong.

        These two goals are, I believe, opposed to one another for low-SES students. You don’t think they are.

        I’m currently trying to find a study you may find interesting. A researcher was trying a new mathematics curriculum with some students, and found that it greatly increased their learning in his study. However, he was surprised to see the students’ standardized test scores at the end of the year: they did not do well, or increase much. He concluded that standardized tests are not an effective measure of meaningful learning. I know this occurred in Texas, and was reported on in the blog . I’ll let you know when I find it.

        I remember something said on that documentary the teachers made to oppose the NY school closings. A teacher said (I’m paraphrasing), “They come and ask, ‘Why aren’t your test scores increasing? What’s wrong with the way you’re teaching?’ Well, I say, ‘We teach these kids to love learning in ways that are hands-on. There’s nothing wrong with the way we teach. What’s wrong with your TESTS?'”

        That about sums it up for me.

      • meghank says:

        I also think you may not have been teaching in an elementary school, to have not witnessed the deleterious effects of testing.

      • E. Rat says:

        I agree; the impact of testing on elementary schools is especially stark. The “increased rigor” testing is supposed to bring trickles down to PreK. My students who were able to attend a preschool program before starting Kindergarten had decent foundational skills and could fill in a mean worksheet with speed and independence. They also struggled to take turns, co-play, hop on one foot, and talk to each other.

        Now that they are at elementary school, they will face increased curricular narrowing: less science, fewer art activities, and so on. And this is in a district that receives city SLAM funding (sports, libraries, arts, and music).

        I would note also that while education reformers caution against too much test prep, they are generally fans of test readiness skills (how to take a test), and those in large quantity. On the ground, test prep and test readiness both translate to regular practice tests that eat up learning time.

        And then there’s testing across all subjects, as recounted here:

      • Matt Barnum says:

        Incidentally, the district featured in the Prospect article is where I taught. I think expanding testing beyond math and English is good to a large extent: it avoids the common problem cited that testing causes too much time to be focused on only math and reading.

      • E. Rat says:

        I don’t think the testing modules described in the article are going to support arts instruction – which is pretty clearly the opinion of the author and the teachers, too. If anything, they will lead to vague arts appreciation lessons with language arts standards as the key focus.

  12. Michael Paul Goldenberg says:

    There are so many misstatements about testing in Matt’s note that it is staggering to think he makes them without any doubts or caveats. I am on an iPad, so I will restrict myself to one glaring error for now, as this is a tedious entry method.

    Giving everyone in a class, several classes, several schools, etc., the same test does NOT make it a standardized test. Most so-called standardized tests are normalized so that the scores will fall in a normal distribution. That means that you get a bell curve, with about half the scores below the mean, and predictable score groupings within one, two, and three standard deviations of that mean. Classroom tests are rarely designed that way. If all students have mastered the material, studied, don’t suffer from undue test-anxiety, etc., it’s quite possible for no one to fail, and even for everyone to do extremely well. That won’t happen in the long haul with, say, an SAT or ACT, and these tests are designed so that scores are not skewed. If scores were skewed, the tests wouldn’t be doing what end users (colleges) want and test-makers guarantee: normally-distributed scores that allow for selections to be made based on performance.

    One other point: anyone comparing the LSAT with the various state bar exams as if they belong in the same category really doesn’t get testing and measurement. The former requires ZERO knowledge of law. ZERO. The latter is completely predicated on mastery of various areas of the law and legal practices. I can smoke the LSAT and have done so. I could not currently pass any bar exam. The latter requires actual knowledge that no amount of general intelligence, aptitude, or study will provide. Only specific factual legal knowledge will suffice.

    A lousy LSAT score may keep you out of law schools, by and large, but wouldn’t stop you from taking and passing the bar. But a top LSAT score will avail you nothing when it comes to passing the bar if you didn’t learn the law. Capisce?

  13. skepticnotcynic says:

    Promise this is my last post on this issue.

    There is more cost than benefit to value-added methods. The cost and unintended consequences list is a mile long, while the benefits are really only benefits for a small minority of people.

    Costs: teaching to a test, rote learning, cheating, wasted taxpayer dollars on assessment and evaluation that goes to corporations, punitive, decreases collaboration among colleagues, encourages administrators to manipulate data and remove low-performing students (e.g. special-ed, ESL, neglected children) from their schools where they can dump them on someone else’s plate, de-professionalizes the teaching profession, cookie-cutter methods, creativity in the classroom is not encouraged, kids hate school, teacher burnout, turnover, faux-achievement, encourages administrators to hire cogs who follow a recipe, the best teachers leave the lowest performing schools in inner-city and rural schools, public education is destroyed and handed to the private-sector, and we all become widgets in the industrial education complex.

    Benefits: enriches companies who create these assessments, politicians look like they are solving a problem they are only making worse, policy makers and ed-reformers get to champion flawed methods that enrich their side at the expense of poor children.

    Seems to me like a small number of benefits for a very small minority of people.

  14. gkm001 says:

    Aha, I knew when I saw “Moneyball” that someone out there was watching it and wondering how to apply it to education policy. It was Matt Barnum!

    Mr. Barnum, I understand the appeal. And I know you have already thought about flaws in the analogy, so I won’t dwell on the ways baseball is not like school, except to say that baseball produces winners and losers, and the game is a flawless instrument for telling the spectator who won and who lost. In school, we want every teacher to teach well and every child to grow up to be a responsible, contributing member of society. Standardized tests are at best a predictor of success, not success itself.

    Given the differences, I hope you are willing to at least entertain the possibility that in education, it is the data analysts who are missing what is vital and important. I think about what kinds of schools I would want my children to attend; then I consider what policies and practices would result in more schools that resemble that ideal (like this one:

    I see you are at the University of Chicago, so you will be familiar with Robert Maynard Hutchins: “The best education for the best is the best education for all.” And with John Dewey: “What the best and wisest parent wants for his child, that must the community want for all of its children. Any other aim for our schools is narrow and unlovely; acted upon, it destroys our democracy.” What kinds of citizens do we hope to produce? What kinds of schools and teachers would support all children to grow towards those ideals for them? What schools would you want to attend, or want your children to attend, and what are the policies and practices that help or hinder their flourishing?

    On another topic, I had a chance to read your Answer Sheet blog, which was terrific. Can it be that TFA spends more than $38,000 on each recruit? I am in a master’s/certification program that requires 2 years of coursework, 100 hours of field observations, and a semester of student teaching (which, I might note, involves teaching a full class all day during the regular school year), and it only costs $30,000. Of my own money.

    I hope I get a job so I can pay my loans back.


    • Matt Barnum says:

      Absolutely I’m open to the possibility that the analogy fails. But I’ll say again that my view is that the teachers who are great at producing quality citizens and promoting critical thinking are also the ones who will produce good test scores. This means that test-based accountability will promote teachers who everyone agrees are high quality and will dismiss teachers who teach to the test too much.

      It is stunning how much TFA spends. I’m hopeful that those on the ‘reform’ side begin to realize that this is problematic.

      • skepticnotcynic says:

        If it were only that easy. Apparently you have not taught special-ed.

      • gkm001 says:

        So you have, first, a premise: that all creative, thoughtful, ethical teachers who teach citizenship and critical-thinking skills will have students who perform well on standardized tests (in any grade, in any subject, and regardless of the other resources of the school). I trust that this is based on evidence and is not merely a hunch or an article of faith with you.

        But given that premise, does it follow that test-based accountability supports and enables more of this good teaching?

        Test-based accountability could go away tomorrow and the creative, ethical, thoughtful teachers I know would keep on teaching the same way they do now. Their teaching methods spring from deep beliefs and understandings about children and learning — they operate out of (you can hear the ‘reformers’ groan) a philosophy of education, not merely a set of classroom management tricks they have been taught.

        But what about the less imaginative teachers, those who teach to the test, or the principals and superintendents who insist that they do? Are they going to change their ways if we continue to threaten them with the loss of their jobs or the closure of their schools if test scores do not rise? If test-based accountability went away tomorrow, it would at least diminish the outsized importance that is placed on the scores as the one and only measure of success.

        Surely there is something perverse about raising the stakes of standardized tests in order to prevent teachers and schools from attaching too much importance to them.

        Instead, we could return to using standardized tests as we once did: as a source of information about teaching and learning. Superintendents and principals could use that information (where appropriate), along with classroom observations, students’ original work, and parent and student feedback, to help schools and teachers improve.

        It’s funny to me how the reformers are very certain that, in the right environment and with the right supports, all children can learn (I agree with them), but don’t seem to believe that, in the right environment and with the right supports, all teachers can teach.

      • skepticnotcynic says:


      • gkm001 says:

        I know, right?

        Over on the reform site Dropout Nation, there’s a prominently placed quote from John Taylor Gatto: “We need to start from the cold-blooded premise that almost everyone is a genius…not that almost everyone is worthless.”

        Imagine if they applied that same premise to teachers. We would be having a completely different conversation.

  15. Just fascinated with the fact that Mr. Barnum has pointedly ignored my comments about his clear ignorance or pointed dismissal of fundamental principles of psychometrics, his abuse of the term “standardized test,” and his apparent willingness to aver that demonstrably bad testing grounded in major violations of basic psychometric principles is better than taking the time to ensure that the metrics we’re using to assess students, schools, teachers, etc., actually reflect democratic core values that inform free public education for all. Who cares, after all, if countries like Finland rarely fire their teachers, but rather work with them to help them improve (see FINNISH LESSONS by Pasi Sahlberg)? It’s so much more fun to be Michelle Rhee – firing principals or teachers on camera and in other public, humiliating ways; having witch-hunts that ‘weed out’ the lowest 5% of teachers, administrators, schools (not to mention neighborhoods, except that we don’t readily replace those with charter schools and wage-slave, non-union teachers, unless a convenient disaster like Katrina comes along to drive all the poor and wrongly-colored folks out) – or someone with her media panache and much-applauded tough-gal attitude (“the bee-eater” is also, of course, the mouth-taper, the self-aggrandizing liar about her supposed miracle score-raising as a teacher, as well as about the alleged miracle score-raising in the district she bullied and mismanaged, then ran for the Left Coast, her pervert 2nd husband, and raising a billion or two with the support of her neo-liberal, neo-conservative, billionaire, mostly Republican backers).

    I posted previously here (but it apparently never made it into the comments) that basic psychometrics tells us to eschew the very practice that is at the core of the high-stakes testing racket: “Thou shalt not apply a test to a purpose for which it was not originally designed.” That would obviate the use of the SAT, ACT, or other student-assessing tests, particularly alleged to measure the likelihood of individual student success during the freshman year of college. . . PERIOD! – and which in fact are probably no better at making such predictions than looking at the students’ household incomes and overall socio-economic status – for evaluating teachers, administrators, schools, districts, or states. But what the heck? If you’re going to advocate for voodoo school reform, you might as well go all the way, whether due to heinous ideology or abject ignorance, if not both.

    Has Mr. Barnum ever taken a course in testing and measurement? Does he even know what the word “psychometrics” means, or has he spoken with a psychometrician about his own limited and distorted views of how testing works? If he had, he could never say something as ridiculously ignorant as that any teacher-generated test that is given to all the students in his/her class is a “standardized” test. But then, knowing what one is speaking about doesn’t seem important when there’s always True Belief to fall back upon.

    • Matt Barnum says:

      I’m unwilling to engage with someone who thinks it’s appropriate to name call (‘self-aggrandizing liar’ and ‘pervert 2nd husband’) and belittle (‘whether due to heinous ideology or abject ignorance’). Thanks, but no thanks.

      • But you didn’t have that excuse for ignoring my first post. And there is no question about either of the comments I made about Rhee and DJ, but regardless, they have nothing to do with you. So your “no thanks” reads exactly like what we can expect from someone who has nothing to back his ill-informed comments about testing and measurement.

        If you had something, you’d have offered it when I made the first post here. When you didn’t, while answering other comments, I knew I had you cold (I already knew it, given your laughable claim about all tests being standardized tests). I don’t blame you for not wanting to engage in a debate with me. You lost before you started. What a pathetic, ignorant little man.

      • Oh, and Matt: there’s nothing wrong with being ignorant: after all, we’re all ignorant of many things.

        What’s problematic is when you’re called on your ignorance, have the opportunity to revise your views and opinions, and hide behind the lame excuse that one of the people pointing out your errors is rude. Rudeness of others is not an excuse for ignorance and stubbornness on your part, is it?

      • Matt Barnum says:

        And we were doing so well civility-wise! This is why people don’t like internet commenters – because they act in a way online they never would in person. (Or would you call me a ‘pathetic little man’ in person? That’s even worse!)

        I didn’t respond to your original comment because I simply haven’t been able to respond to all comments, even some very thoughtful ones. (Also because I suspected you were more interested in attacking me than in engaging with me – and you know I feel proven right on that point!)

        But fair enough: your belligerence has nothing to do with the substance of your points, which I think is interesting. You point out what I see as two dividing lines between types of tests. First, between content-based tests (like many state grade-level exams or the bar exam) and aptitude (purportedly) based exams (like the LSAT or SAT). Second, between tests that are standardized on a curve/percentile (like the SAT) and those that are not standardized on a curve (like most, but not all, classroom-based exams).

        I believe that most state exams – meaning most of the exams that reformers want to use to assess teachers – are content-based and standardized on a curve, but that scores are considered on an absolute basis too. So for example, the Colorado state exam (now called TCAP) is content-based (reading, writing, math, and science for some grades), with students receiving a percentile score AND an absolute assessment of their learning (Unsatisfactory, Partially Proficient, Proficient, or Advanced).

        I’m open to the idea that different types of exams may be more ‘test-preppable’ and I am opposed to teacher evaluation systems that are ‘zero-sum games’ for teachers. However, I’m not sure that your point gets to the core of my argument: specifically that test-based accountability will promote high-quality teaching because excessive teaching to the test will not result in significant gains, but highly engaging, critical thinking-based pedagogy will.

      • I would have no hesitation to say precisely what I think of you, Matt, “to your face,” as you put it, because you show yourself to be just another guy who doesn’t understand statistics but never hesitates to misuse and abuse them to support your viewpoint.

        But let’s discuss for a moment “civility” and yours in particular when you think you’re among friends. Take a look at your contribution to Voice of A Dropout Nation:

        I don’t know if the intro I found there is also yours. If not, my apologies, but you should certainly have objected to it if you didn’t write the following:

        “The Poverty Myth in Education — that American public education can do little to improve the achievement of poor children in schools (especially if they are from minority households) – – remains one of the few weapons education traditionalists wield with some effectiveness. Thanks to the Ruby Payne, Betty Hart and Todd Risley (along with the equally debased rhetoric of once-respectable education historian Diane Ravitch), traditionalists have plenty of excuses for opposing systemic reform, arguing that education isn’t the long-term solution for stemming poverty, and letting themselves and the practices they defend off the hook for failing poor kids. Yet as Curt Dudly-Marling and others have shown, the underlying arguments are based on shoddy scholarship and impoverished, even racialist thinking. Given the success of KIPP and other charter, traditional, and private schools in helping poor kids achieve, the Poverty Myth just comes off as pure bunk. Not to say that poverty doesn’t complicate matters in providing all kids with high-quality education. But the issue of kids being poor has more to do with Zip Code Education policies and faulty thinking among those traditionalists working in education than with any natural conditions.”

        The comments about Diane Ravitch are nothing but the sort of cheap shots you decry here. Did you make them? Or did you merely tolerate their use on a page that offers your intriguing analysis of poverty statistics? The point that Ravitch and many others repeatedly make is that when it comes to the great industrial powers on this planet (of which Romania is not one, in case you didn’t realize that), we do a horrific job when it comes to maintaining a reasonable minimum effective standard of living for our poor, and that the percentage of our CHILDREN living below the poverty line is horrifically high (I keep seeing figures in the 20-25% range). RELATIVE poverty is significant because of something known as cost-of-living. Who cares if $50K would make you rich in Rwanda if you live in Manhattan, Matt? Try feeding a family of five in Westchester County on $50K or, say, under $20K. Oh, but wait. The majority of folks who live in Westchester aren’t scraping by on those kinds of incomes, are they? I find your “argument” more than a bit hard to swallow, but all-too-typical of the viewpoint that no matter how abject the poverty in many parts of this country (parts that I’ve worked and lived in more than once, particularly as my professional work the last 20 years has been almost without exception in high-needs inner-city districts and schools in SE Michigan and New York City), which I suspect you have little or no direct knowledge of. You really should come to Pontiac, Flint, and Detroit for a tour of some of the neighborhoods and schools I’ve been in regularly before talking about things not really being nearly as bad as Diane Ravitch, the late Gerald Bracey, and many others know full-well.

        No one I’ve read, by the way, has ever said that we have to FIX poverty before tackling the improvement of public schools. But many of us know how perverse the current deform movement is in its utter refusal to acknowledge the impact of poverty and its ill-effects on the children who attend schools in neighborhoods that you would not likely set foot in on a dare. I really don’t think you have any idea how bad much of Detroit is, how devastated by what’s gone on in the auto industry are Flint, Pontiac, Saginaw, and other big cities in Michigan, and just how much lead poisoning, malnutrition, lack of health care, abuse, crime, addiction, neglect, etc., the children in these cities are exposed to even before they begin school.

        I’ve spent enough time in elementary classrooms in some of these districts to know that it isn’t bad teaching that’s responsible for the most part for the poor test scores most of these children put up. And I know that the phony miracles from Michelle Rhee, Joel Klein, and outfits like TFA and KIPP are just that: phony. If you don’t know it, you’re just not looking or are blind to what is increasingly obvious as each new cheating scandal is exposed to the light of day. There are no miracle workers, no miracle schools, no miracles performed by well-meaning TFA folks or nifty chants. None of that makes a dent in the incredible mess that is urban and rural poverty.

        Of course, if the goal is to destroy US public schools to make way for private takeovers and that other dream of the deform crowd, vouchers, then of course it’s important to focus solely on our most poverty-stricken places and scream “No excuses!” Forget that these are the places the same monied deform crowd blithely ignored for decade upon decade. Now there’s money to be made, so the game is to forget about all the US schools that are performing quite well, thank you, by any reasonable standard, and have been doing so for decade upon decade. Forget that we’ve seen that our best kids more than hold their own against the best kids in the rest of the world (assuming, of course, that you accept the “competition” model of schooling, which I don’t). Forget that there is no reasonable proof that our economic woes are caused by our schools, any more, by the way, than any deformer has ever acknowledged how excellent our schools must have been, by analogy, just 15 years ago or so, before GWB, when we had an enormous surplus and the economy was booming. The same public education system that managed to win WWII and get us out of a Wall Street-caused depression, that gave us the prosperity of the 1950s, was suddenly blamed for our alleged falling behind when Sputnik was launched (even though there is documented evidence that we could readily have beaten the Russians into space and chose not to for political reasons, and even though in the decade that followed, we had no trouble getting men on the moon, while the Russians never came close).

        I’m glad that you feel proven right on the civility point, Matt. You should have SOMETHING to put in your win column from your visit here. Then, when you’re back on safe turf, you and your friends can claim that Diane Ravitch (who is extraordinarily civil), uses “shrill” rhetoric. I realize that Diane does cite a lot of inconvenient facts. I only wish Gerald Bracey were still with us to continue debunking the sort of statistical nonsense and bunkum that comes from the ed deform crowd. We lost a giant when he passed, but others have stood up to point out the nakedness of the deform emperors. And every time someone does (for example, Gene Glass), folks on your side are quick to bring out the long knives of name-calling, ad hominem, and utter disrespect. That Diane Ravitch came to her senses and puts up with non-stop savaging from her former friends and allies is fascinating to watch. She certainly has my admiration and respect. And she does it without resorting to my more street-oriented tactics. Nonetheless, her fighting for those who most need her voice earns her nothing but abuse from those well-heeled conservatives, neo-liberals, and libertarians who used to love her. She won’t be baited into actually BEING nasty, of course, but that doesn’t mean that all of us on the progressive side are quite so delicate. I see you don’t like that. I understand, Matt: it’s only rude when we do it, right?

      • peonteacher says:

        Matt must have some high-level connections – only afforded to those with privilege – to get published after having spent only 2 years in education. Unfortunately this is the sad state of education we live in. People who actually have expertise in education get a smaller or non-existent voice on the national stage when it comes to expressing their views, while people who don’t deserve a platform get published in national publications like the Washington Post.

        Today’s “ed-reform” movement is a commercial ploy to make money, simple as that. We have been reforming our educational system for the last 100 years; however, the current standards-based reform movement has been the most destructive in our history and will blow up in our faces just like the financial system did. Anyone notice any of the parallels going on?

        At the end of the day, poor kids still get the shaft while “cultural tourists” like Matt run as fast as they can from where the real work gets done to grad school and ultimately a job pushing paper and managing others from the top-down. Why wouldn’t you want a professional job like this? It’s so much easier.

        By the way, if Matt really wrote this piece, it is evident he doesn’t even know what he doesn’t know.

      • NewarkTFA says:

        I think that even if you are offended by Mr. Goldenberg’s language, his point about the very nature of standardized testing deserves a more thoughtful response. If I, as a classroom teacher, were to design a test INTENDED to spread out my students’ scores across a percentile-type range, I would be guilty of educational malpractice. Obviously, I should design tests that seek to demonstrate the extent to which my students have mastered the skills, concepts, and information I have been teaching. Ideally, all my students would do well on the test, and if they don’t, I should reflect on how I could do a better job of teaching them in the first place. Standardized testing, on the other hand, is designed to sort students rather than determine if all of them have achieved a certain level of mastery. Notwithstanding this obvious truth, many standardized tests which, by their very nature, will always flag 50% of all students (and now, by extension, their teachers) as having “failed” are being used as part of high-stakes “accountability” initiatives. Many of the self-styled reformers sympathetic with these initiatives are, like us, alumni of Teach for America, and have indeed enjoyed the benefits of an “excellent education.” For people who are obviously intelligent and well-educated enough to appreciate the underlying mathematics of standardized testing to refuse to address the intrinsically punitive and unjust nature of such practices–well–I can see how such a refusal might inspire a certain anger and vitriol.

      • You’ve got a point, NewarkTFA: conflating regular teacher-written tests with standardized tests does seem to be a problem. If I were putting myself forth as an expert on education reform, and if I had the unmitigated chutzpah to refer to those who dare question my views or disagree openly with them as “traditionalists,” I might just be opening myself up to some wrath. If I then demonstrably showed that I lacked basic understanding of one of the most important issues in the education deform debate – testing and measurement, its underlying theory and practice, and the precepts held to be essential by experts in the field – well, I would have to expect some mockery, and some anger when I couldn’t come back with either better arguments or an apology for my presumptuousness.

        Fortunately, I try to keep my mouth shut when I don’t know the shot, to paraphrase David Mamet. Would that education deformers and their pundits did the same.

      • Manuel says:

        Wow, there are other people out there that know that the emperor has no clothes!

        Indeed, when tests define, a priori, 50% of the takers as “non-proficient,” then it won’t matter what teachers or students do as the scores will never go up to the level demanded by NCLB (“100% proficient by 2014 or bust!”).

        The public expects our educrats to design a test that can eventually be “passed,” not one in which 50% are perpetually “not at grade level.” Since they haven’t done it, aren’t they and their alleged supervisors, the state legislators, committing fraud?

        Significantly, Mr. Barnum has not responded to your post. Maybe he ran out of interest after running into some outrage?

        (BTW, the fact that California, where I am based, “flunks” 50% of its students annually is something that has been studiously ignored by anyone I’ve ever shown the actual data. They all seem to think it is “not their job” to deal with it. I guess you must feel the same.)

  16. meghank says:

    Aha, I found it –

    The standardized tests do not measure meaningful learning. They measure test-prep.

    What do you think of that, Mr. Barnum?

    Pearson designs Texas’s tests, as well as just about everyone else’s, including my state’s, so don’t claim it’s just a problem with the Texas test.

    • Matt Barnum says:

      Thanks for finding this and passing it along. I remember reading this article when it came out, and it definitely gave me pause. (I couldn’t find much more information on the study beyond what was in the article. Have you been able to?)

      Anyway, I’ve got a couple thoughts:

      1) This would seem to contradict the Chetty study because if the tests aren’t meaningful metrics, then why would they correlate with real-world gains? (Of course I’m not saying that Chetty is right and this is wrong – I have no idea.)

      2) As I understand it, Stroup’s chief criticism of standardized testing is that instead of measuring actual student achievement, it simply rank-orders students. (Does that sound right?) But what doesn’t make sense, to me, is that actual student achievement wouldn’t correlate highly with rank order of students. What I mean is that, if it’s true that Stroup’s innovative math program produced huge gains in real student learning, then why wouldn’t those students move up dramatically compared to other students who didn’t make the same gains? If they don’t, presumably the questions are simply bad math questions. Perhaps I’m missing something – like I said, I couldn’t find the entire study.

      Thanks again for sending this and engaging with me.

      • meghank says:

        There is a dissertation linked to in the article.

        1. The Chetty study didn’t really show many real-world gains. Weren’t the gains only about $750 more a year? Also, and I go into more detail about this in 2, the nature of the tests has changed. I remember taking the tests and finding the questions very easy. I’m positive that if I were a student today I would find them to be absolute gibberish. Supposedly this has to do with raising standards, but I think the tests are of poorer quality, so that the test Stroup’s research subjects took was not indicative of actual learning, whereas the Chetty participants, taking the tests 15 or more years ago, were being measured on what they had learned in the school year.

        2. I believe the problem is with the test questions. Gary has done a great job of exposing the pointlessness of many of these questions on this blog. I’ve seen a test myself, and, although I suppose there is a threat of criminal action for saying anything at all about the nature of the questions (Pearson has a shocking amount of power, doesn’t it?), I think I’m going to risk it and say that I found the questions on the third grade test to be remarkably bad. They are far above grade-level, when compared to what was considered grade-level when I was in school. The math questions also required the student to be a proficient reader (they were almost entirely complicated word problems, which general ed students were not allowed to have read to them). The tests won’t be sensitive to gains in learning for students who started out well below grade level and made great gains, but who are now still slightly below grade level, or for students who made great gains in math skills, but whose reading skills did not come up to grade level.

        The faulty nature of the test is why test prep works better than what you and I consider to be good education. I still intend to find studies proving that it does work. But I just wanted to give you another example: Another teacher friend of mine teaches high school Algebra, where students come to him adding and subtracting on their fingers. What does he focus on in his classes, now that his job hinges on their scores on the state test? Teaching them how to use a graphing calculator to game the test. The vast majority will never again use a graphing calculator in their lives. They have not improved in basic arithmetic at the end of the year. Do you want to guess what his Value-added score was on a scale of 1 to 5 last year?

  17. A Texas Teacher says:

    I’m surprised that this has not already been mentioned. We already have evidence that ever-higher-stakes tests do not benefit students. Texas has been administering state assessments to students for over 20 years. The stakes get higher each year (and the tests get ‘more rigorous’ every six years) until this year, 2013, when high school students need to pass nearly 15 tests to graduate. What has all this (90 billion tax dollars) gotten for the students? Not much. SAT scores are still near the bottom of the national pack. There is the proof. 20 years’ worth of it.

