The Problem with "Show Me the Research" Thinking

Understanding the limitations of education research and accepting responsibility for contributing to moving it forward

Most studies in education are observational studies. This means that investigators pore over data previously collected by others. They seek correlations between different variables. This approach is far less expensive than other methodologies because it is easier and faster. With research budgets stretched thin, cost is a major consideration. The trouble is that observational studies are subject to biases that sometimes make the results unreliable. If results can’t be replicated by others, the conclusions lose credibility.
—Walt Gardner, Limitations of Education Studies, posted on July 20, 2012

Ask any of us why we teach the way we teach. Our honest response includes a combination of years of teaching experience, how we were taught as students, our personalities, what we gleaned from professional development, administrative policies, faculty culture, and whether or not we’re getting enough sleep that week. We stick with this fragile alchemy as we plan our lessons, sure that ours is the most effective instruction possible.

Then a colleague or school declares they'd like us to teach in a different way, and the first cries of "Show me the research, or I won't accept it" tumble into faculty back-channels as the first bricks of defensive walls are laid. We're so sure of our own sense of things, devoid of formal research protocols though it may be, yet we demand those same protocols before considering anything new. And for some, anything short of incontrovertible proof of a new strategy's provenance and direct impact on student learning is grounds for complete dismissal, and occasionally, indignation.

Critique of new ideas in education is often how many of us sort our thinking and evolve as teachers. It's actually quite healthy and should be invited with all new building initiatives. We want initial skepticism, as investigation and discussion create robust engagement with new ideas. We explored this more thoroughly in "The Grief of Accepting New Ideas," AMLE Magazine, April 2018.

Here, though, we're looking at practitioner paralysis and the ineffective instruction that result when educators do not understand or accept the limitations of research in the social sciences, and consequently base their decisions about whether to use teaching strategies on the perceived presence or absence of "hard science" evidence. Attending to the influencing factors of social science studies is key: Many of my own students over the years could have learned so much more if I had been more aware of the helpful insights—and clear limitations—of education research.

In an August 24, 2018 blog post, Dan Meyer, math teacher and chief academic officer at Desmos, discusses a New York Times op-ed piece on how math should be taught, along with its rebuttal. He pulls the pedagogical lens back at one point in the debate and observes the power of tightly held beliefs to affect our actions, for good or ill: "I'm absolutely convinced that a) we act ourselves into belief rather than believing our way into acting, and b) actions and beliefs will accumulate over a career like rust and either inhibit or enhance our potential as teachers." He ends the piece by asking educators to share moments when our beliefs were overturned by new evidence or perspectives and we were forced to change our teaching as a result. This is a scary thing for many of us, for we are not used to not knowing.

On Greg Ashman's July 1, 2018 "Filling the Pail" blog, math and physics teacher Lee McCulloch-James posted a reasoned plea in their spirited debate about how teachers accept or reject education research claims:

I am a … maths-physics teacher instinctively more partial to maths/physics research itself … than to the educational research (with its “Mastery” talk) focused on its dissemination… For me much of this learning theorising needs to be packaged more cogently… Between the extremes of quoting acronyms or vacuous phrases to that of the … padded academic research papers (which merely pitch to their own), there is a need for more writing in the spirit of this Blog for us time-poor teachers. Give me a list of the ongoing debated theories and map it … to their implementation techniques in the classroom… I will find out in time what is working for the students’ learning in my best efforts to churn those teaching methods deemed to be the latest best practice, while having scope for the hobbyist scientist in me to also model in my classes the process of science thinking as it is actually practised. (July 5, 2018)

The comments of assessment and teaching researcher Dylan Wiliam were significant factors on both sides of this debate. After a serious back and forth, however, Wiliam states:

…[The educator whose work they are discussing] does not want teachers to have to become “amateur psychologists”. As teachers, our main … job seems to me to be to get our students to learn stuff. The idea is that after some time in our classrooms, our students know, understand, and can do things that they couldn’t do before. As teachers, we are in the learning business. For someone professionally involved in education to be incurious about how this happens, and how to do it better, seems to me rather odd.

Then he follows with this humbling admission about his own learning curve with cognitive load theory:

…[F]or many years (most of my teaching career in fact) I taught mathematics in a very similar way… I used problem-solving, mathematical investigations, and extended projects, and my students seemed to enjoy mathematics. I was dismissive of cognitive load theory because I did not want it to be true. I did not want to believe that the way I had been teaching was in all likelihood less effective, especially for lower-achieving students. But then I looked at the evidence, and although I could quibble with details here and there, the overall evidence was so overwhelming that I was forced to change my mind. We still know relatively little about how to apply the lessons of cognitive load theory in real classrooms, but I remain convinced that it is the single most important thing for teachers to know; students can be happily, productively, and successfully engaged in mathematical activity and yet learn nothing as a result. I don’t like the fact that our brains work in this way, but it seems they do. (July 6, 2018)

Wiliam is one of the most research-discerning minds in education today, yet he struggles with the research just as we do. It's hard to do any kind of deep dive into the latest education studies when we have so much competing for our time and energy as educators. It would be crippling to have to research every move we make as teachers: What does the research say about greeting students at the door? What does it say about the number of practice problems I should give when assigning this particular topic? Should I teach adverbs before adjectives? How about how to set up my seating chart, incorporate Chromebooks into the lesson, or whether it's okay for a student to read a novel other than the one assigned? Can research answer these questions? Sometimes, yes, but only with these variables and not those; and, of course, what this study just declared effective contradicts the conclusions of that other study; and gosh, where do we even find reliable data? Yikes, what's a teacher to do?

Avoid Physics Envy

It is a misconception that the only research in education (a social science) that is acceptable for education reform is research that adheres to the proper-protocol, juried-journal, always-reproducible, randomly assigned, third-party-confirmed model that exists in physics and similar "hard" sciences. This is the "physics envy" referenced by Dylan Wiliam and others. In that envy, we seek models that include these steps:

  • Develop a theoretical model and hypothesis.
  • Test the hypothesis with a large sample size and randomly assigned subjects in multiple situations, controlling for variables as needed, using double-blind investigations.
  • Publish results.
  • Invite others in the field to reproduce the investigations with the same elements, controls, and conditions, and get the same results. Experience validation when a seemingly causal relationship is established: "When A is done, B occurs. If A is not done, B does not occur."
  • Publish those verifications.

Hard science investigations in physics and chemistry, for example, can often control for their variables, which helps us isolate the impact of a particular change in the experiment’s factors and outcomes. We all yearn for such assurance and clear connection between teacher decisions and student learning, but it is rarely achieved. There are often too many intersecting parts, each influencing the other, to make an absolute, unequivocal, it-always-works-like-this conclusion about one particular teaching factor in diverse students’ learning. Wiliam notes, “A recent review of one hundred research papers published in top psychology journals found that fewer than 40 percent of the studies gave similar results when the same experiments were run again but by a different team. Chasing the latest fads is likely to result in trying to implement ideas that turn out to be ineffective even in the laboratory, let alone in real school settings” (2018).

In reality, data investigations often do not align with classroom realities or allow for direct transfer of a study’s conclusions to successful implementation in a school. We can rarely replicate exact conditions or account for all confounding variables when repeating experiments to test theories in education. The results of any given study can be affected by: student maturation, readiness levels, cultural/family backgrounds, local politics, access to technology, English language proficiency, gender, attendance, class disruptions, community support for schools or lack thereof, presence/absence of school counselors/nurses, hunger, diet, sleep patterns, family dysfunction, parents’ education, socio-economic status, access to discretionary monies, transiency rates, community violence/gang membership, afterschool care, grading practices, presence/absence of libraries, curriculum, leadership, teacher training, emotional climate, teacher-student ratio, childcare services for students who become parents at a young age, opioid use—and the new variables introduced by the intersection of any two or more of these factors.

Develop a Critical Eye for Education Research

When reading the limitations sections of studies, we find that some do not follow sound research protocols, but if we're not reading with a critical eye, we don't see the issues. In some studies, for example, researchers or those interpreting their results confuse correlation with causation: just because two things are statistically correlated doesn't mean one element is the direct result of the other or even influences it. In such cases, the study's conclusion oversteps its data, making unfounded claims. Some studies, too, indicate great success in an early pilot with a small group, but the positive impacts disappear when scaled up for use in a large school or district. As Wiliam notes, "Everything works somewhere and nothing works everywhere" (2018).

Of particular concern is how unethical it would be in some cases to use a control group of students who are not provided with a given experimental factor in order to test that factor's effect on another group's learning. This is especially a concern if there's no time to go back and re-teach the control-group students effectively when they achieve less than the experimental group, such as when one group of students is taught mathematics without the benefit of manipulatives while another learns with them. And really, no parent wants to hear that teachers are experimenting on their children; it sounds sinister.

Just as important, though, is the fact that not all that is wise and wonderful in education has a robust research base; for some practices, the research simply doesn't exist. Where it does exist, it's usually correlational, relying more on qualitative than quantitative analysis, on studies of studies instead of true field studies, and on patterns and extrapolations over time, sometimes with limited data sets, or with data for a large population that loses its correlation when applied to an individual learner.

In his article on the research about teachers’ professional learning, professor Tom Guskey (2012) points to several universal cautions about educational research in general. First, he asks us to always begin with the outcomes: What is it we are seeking for our students and teachers, and how will we know those outcomes have been achieved? Second, Guskey says we should consider the perspectives of the stakeholders. He relates an impassioned story about how the outcome of a program’s use in a school carried more weight with the school board he was advising than all of his study’s empirical data and charts combined, concluding: “Even when planners agree on the student learning goals … different stakeholders may not agree on what evidence best reflects improvement in those outcomes.”

Education writers and reporters make every effort to get it right when it comes to research. Debra Viadero wrote a highly recommended guide for the Education Writers Association (EWA) called "Making Sense of Education Research." She cautions: "It can be comforting to think of research as the ultimate authority on a question of educational policy or practice, but the truth is that usually it is not. The best that research can do is to provide clues on what works, when, and for whom, because classrooms, schools, and communities inevitably vary."

Viadero urges education writers, and indirectly us, to ask the important questions about the research they're reporting:

  • “Who paid for the study? …[B]e suspicious of information generated by anyone with a stake in the results.
  • Where was the study published? In terms of trustworthiness, research published in a peer-reviewed journal almost always trumps research that is published without extensive review from other scholars in the same field.
  • How were participants selected for the study? Reporters should always be on the lookout for evidence of “creaming”–in other words, choosing the best and brightest students for the intervention group.
  • How were the results measured? It is not enough to state that students did better in reading, math, or another subject… Was it a standardized test or one that was developed by the researchers? Did the test measure what researchers were actually studying?
  • Was there a comparison group? Reporters should be wary of conclusions based on a simple pre- and post-test conducted with a single group of students.
  • What else happened during the study period that might explain the results? For example, were there any changes in the school’s enrollment, teaching staff or leadership?”

Accept a Little Professional Humility

Just as I finally accept some great truth in teaching, someone comes along and shows me that the Emperor has no clothes. Take a short trip into the world of today's education research, and you'll find that many teaching practices we hold dear are now suspect. For example, there is considerable evidence that a diet of only project-based and inquiry learning is not as effective as we intuitively think it is, and that guided and direct instruction have clear places in the modern classroom. In a study by Kirschner, Sweller, and Clark (Educational Psychologist, 2006), the authors write:

“After a half-century of advocacy associated with instruction using minimal guidance, it appears that there is no body of research supporting the technique. In so far as there is any evidence from controlled studies, it almost uniformly supports direct, strong instructional guidance rather than constructivist-based minimal guidance during the instruction of novice to intermediate learners…. Not only is unguided instruction normally less effective; there is also evidence that it may have negative results when students acquire misconceptions or incomplete or disorganized knowledge.”

For those of us so sure of the veracity of our inquiry and constructivist methods, this is a real head-scratcher, and our pulse quickens as we prepare rebuttals. There are other practices that current education research questions: use of rubrics, learning styles, single-gender classes, coed classes, grades as motivation, technology integration, 1:1 initiatives, charter schools, teaching coding to young students, individualized/personal learning, and grit-cultivation programs. [Nodding with readers, speaking in a raised-pitch, astonished, commiserating voice] “I know!”

If We Have No Time to Do the Research Ourselves, What Can We Do?

When it comes to using differentiated instruction, standards-based grading, teaching coding to young children, and similar initiatives, we often ask for proof that such an approach works before we embrace it. The fact is, however, that we don't have incontrovertible evidence about any of these in their entirety. What we have are focused studies within the larger category. For example, Benjamin Bloom's mastery learning research showed that providing time and additional lessons to reteach and reassess students who did not master the content in the same timeframe as their classmates resulted in higher achievement for those students. In Classroom Instruction That Works (ASCD, 2001), Robert Marzano reports a 20-percentile-point increase in outside-the-school test scores when students redo assessments until they achieve a satisfactory level of performance on the standard. Scientists, mathematicians, and engineers redo experiments and rework problems repeatedly until they solve challenging problems. From these and other studies, and from life itself, we see the value of re-dos and re-takes.

To my knowledge, there has never been a full-scale, amply sized, all-elements-inclusive, random-selection, double-blind, causal-relationship, official study of standards-based grading or differentiated instruction. And why is that? Because it's physically impossible to conduct either one: there are too many confounding variables and intersecting elements for which to control. It would be prohibitively expensive, require so much inference and extrapolation as to be functionally inconclusive, and, in some cases, be unethical to the control-group students. To demand such studies and full proof of the positive effects of either one before discussing their potential use in the classroom is toxic contrarianism for its own sake, and not helpful.

Education author and leader Todd Whitaker often reminds us that we didn't have fully proven research about going to the moon when we took that trip in 1969, but we did it anyway. There's a heck of a lot in teaching and learning that we do because of a "gut" sense that it will work, brave though that may be. No parent teaching a toddler to pull up and button his own pants for the first time stops everything to go read dissertations on "Learning to Put On Our Own Clothing" when the first attempt results in Spiderman underwear stretched from ear to ear. Common sense dictates that we coach the child and ask him to try again and again, providing feedback as needed, until he can fly solo with the task.

Properly conducted research in education is welcome. It catalyzes our next investigations and invites critical analysis from thoughtful educators. It informs our decisions, but it rarely identifies definitive action. Teacher experience, professionalism, testimony, context, and reasonable attempts to gather more information are also valued.

We can't paralyze our instructional efforts, however, by worshiping at a limited research altar, claiming we do only research-based practices, especially when the research isn't plentiful or clearly correlated. Declaring, "Show me the research that this works, or I will refuse to do it," is a form of professional cowardice disguised as prudence. It takes professional courage to remain open to new possibilities, especially those that threaten the status quo or our personal ways of doing things. We can be skeptical instead of cynical, and we can ask questions instead of dismissing ideas outright.

Let's read and respond to the research that is there–seriously, there's a lot out there that gets read only by other researchers, not classroom practitioners. Once we've read and discussed what's out there, let's get more invested in the research ourselves: conducting teacher action research, forming Critical Friends networks and professional learning communities, sharing what we find with each other, and inviting its critique.

Let’s be thoughtful about what we do on a daily basis, and ask the questions we never have time to ask: How do I know this works with each of my students? What am I missing in the teaching-learning dynamic? What assumptions am I making with this teaching practice, and how are they getting in the way of student learning? What biases do I need to shed? Am I comfortable with the agenda this practice perpetuates? Is this practice born of faculty politics or sound pedagogy?

Let’s put ourselves in places and experiences that are likely to connect with education research by mentoring and being mentored, reading professional journals and books, maintaining reflective journals, participating in online communities in our subject areas or educator forums, participating in Ed Camps, videotaping ourselves and analyzing our teaching with a colleague, attending workshops and conferences, watching webinars, video-conferencing with researchers and authors, and seeking National Board Certification from the National Board for Professional Teaching Standards.

We are imperfect, and our field is imperfect. We’ll shake apart, though, if we can’t accept the ambiguities and messy evolutions that form our enterprise, or if we lose interest in keeping up with an ever-changing profession. Too much is at stake to remain aloof or instructionally impotent. It’s unsettling to not have a clear view of the path ahead, but that’s an enticing challenge—to boldly go. The successful among us see the merits of informed discussion and the limits of argument from myth. Though we might lack the tools to get it right every time, we are attentive to others’ research while contributing research of our own. We make the most conscientious decision we can, given our growing expertise and the context of any given moment. For most of us, that’ll do.


Ashman, G. (2018, July 1). Filling the Pail – Mike Ollerton critiques Cognitive Load Theory [Web log post]. Retrieved from

Guskey, T.R. (2012). Focus on key points to develop the best strategy to evaluate professional learning: The rules of evidence. Journal of Staff Development, 33(4).

Kirschner, P.A., Sweller J., & Clark, R.E. (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41(2), 75-86. doi:10.1207/s15326985ep4102_1

Marzano, R. J., Pickering, D. J., & Pollock, J. E. (2001). Classroom instruction that works: Research-based strategies for increasing student achievement. Alexandria, VA: ASCD.

Meyer, D. (2018, August 24). Drill-based math instruction diminishes the math teacher as well [Web log post]. Retrieved from

Viadero, D. (n.d.). Making sense of education research. Retrieved from

Wiliam, D. (2018). Creating the schools our children need: Why what we’re doing now won’t help much (and what we can do instead). West Palm Beach, FL: Learning Sciences International.