Some Recent Essays on Schooling
Rui Ma, Austin Scholar, another essayist hosted by Scott Alexander, and The Zvi
A controversy has been kicked up by the essay on the Alpha School hosted by Scott Alexander. I will put in my two cents here.
Long-time readers will know that I emphasize what I call The Null Hypothesis. This is the view that when all is said and done, attempts at different ways of schooling make no difference. Controlled experiments find that interventions have little effect. When there are effects, they tend to “fade out,” so that when the experimental group outperforms the control group at one point (say, the end of 3rd grade), they fall back to the same level as the control group by some later point (say, the end of 6th grade). And if experimental effects do not fade out, the ability to replicate those effects in a different study fails, especially if the new study is much larger.
The Alpha School seems to have effects that are significant and do not fade out. But it is not a clean, controlled experiment, and replication is untried. So where does the Null Hypothesis stand?
Suggesting that the Alpha School has overcome the Null Hypothesis, Rui Ma writes,
What really stood out, though, and what the parent-reviewer said is the true engine behind Alpha, is the school’s internal virtual currency: Alpha Bucks. Students earn Alpha Bucks for completing tasks, reaching goals, and going above and beyond. They can then spend them on real things: physical products, school events, even internal auctions. It’s an economic system that shapes motivation and behavior.
Pointer from Tyler Cowen.
Similarly, Austin Scholar writes,
After four years at Alpha School and now at Stanford, I've come to believe that extrinsic motivation is not only more powerful than we give it credit for – it's actually the dominant force behind most high achievement. And rather than corrupting intrinsic motivation, thoughtful extrinsic rewards can enhance and channel it.
Here's why I think the conventional wisdom is wrong:
At Alpha, where students complete their academics in just two hours a day using AI-powered adaptive educational apps, we've seen something fascinating happen. Because academic learning becomes so efficient, our guides don't spend time on lesson-planning. Instead, they spend their time getting to know each individual student and figuring out what motivates them.
…The key insight: the guides spend enormous time personalizing these extrinsic motivators for each kid. They don't assume one size fits all. They observe, experiment, and adjust.
Roland Fryer did the original research on paying students to learn. ChatGPT summarizes,
Roland Fryer's research on paying students to learn holds up methodologically and was valuable for what it taught us — especially that incentives need to be immediate, clear, and focused on effort, not outcomes. But its practical impact is limited by cost, modest gains, and questions about sustainability.
I believe that in education there is a strong bias in favor of using interventions that feel good to educators. Most of the time, the result is the Null Hypothesis: the intervention does not have any replicable, long-term effect. But educators keep pushing such interventions anyway. The “whole language” approach to reading, which is now considered inferior to phonics, is a classic example.
Paying students to learn has the opposite characteristic. Educators want to see it fail. But that means it ought to be tried more and studied more.
The new essay hosted by Scott Alexander takes a view that is contrary to that of would-be educational revolutionaries. In my terms, it says that the Null Hypothesis holds because for the purpose of mass education, schooling as we do it today has evolved to be nearly optimal. The author writes,
School isn’t designed to maximize learning. School is designed to maximize motivation.
…most students learn to read and do arithmetic, some learn much more than that, and on average school seems to add to IQ. Revisiting Chesterton’s fence, those are the benefits of school we need to understand before we tear anything apart.
Note that in my opinion it is not well established that “on average school seems to add to IQ.”
The author posits three levels of students, based on what he calls their level of motivation. I suspect that he means a combination of motivation and aptitude.
At the highest level are “no-structure learners.” These are the autodidacts who will learn in any environment. How you teach autodidacts does not matter. The secret to being a good school relative to autodidacts is to select for them.
At the next level are “low-structure learners.” These are students who can sit still and do what they are told. This will get them through the high school curriculum, although by that point they will be on a lower track than the no-structure learners.
At the lowest level are “high-structure learners.” These are students who at most can get through middle school. But in high school, they are no longer able to keep up, and they have to be socially promoted in order to graduate.
The author can use this model to explain why the Null Hypothesis holds. It holds for the highest level students, because they already learn more than what schools are trying to teach. It holds for the middle level of students, because they thrive in any consistent environment.
low-structure learners thrive on collective routines. Conformity explains why personalized learning often fails. Most students need the social scaffolding of lockstep instruction, even when it’s inefficient. Conformity isn’t a perfect solution, but it’s the best one we have.
To put a fine point on it, the author argues that forcing students to work at a standard level is best for many of them.
Working at your own pace may seem like it makes sense, but it often undermines motivation. Grouping students by ability, whether within or across classrooms, has shown little benefit. Neither approach has delivered the transformation its advocates promised.
In the end,
The no-structure learners will always be bored, as long as we are committed to putting them into classrooms where everyone learns the same thing. And those classrooms where everyone learns the same thing are exactly what the low-structure learners need.
…No-structure learners thrive anywhere. Low-structure learners need any coherent system. High-structure learners are much more sensitive to the quality of teaching, but trying to meet each student where they are doesn’t work very well. Lumping everyone together and asking them all to learn the same curriculum seems to work better at scale than anything else we’ve tried. These are the core challenges of education.
Zvi Mowshowitz read this essay and reacted angrily.
These are no-structure learners, who by your own admission will always be bored in your classes, so don’t impose your god damned stupid class structure on them at all?
Or, if you can’t do that in context, and again hear me out… create different tracks, use tests as gates for them, and if the kid can’t hack the one moving quickly, move them out of the track into another one that they can handle?
The Zvi goes on,
Tracking is necessary in high school because students diverge too much (despite forcing them not to beforehand) but definitely fails earlier because of reasons, despite all the parents favoring it and everyone involved constantly saying it works (and my understanding of the research also saying that it very clearly works)?
I think that the author would answer that parents favor tracking in elementary school because of the Lake Wobegon effect. The parents all believe that their mid-level students are gifted and talented. Schools try to satisfy these parents by putting many more students into GT classes than really belong there. If that does not keep parents happy, the administrators get so fed up with the parents that they abolish GT altogether.
Unlike the Zvi, I found myself inclined to agree with the latest essay. Consider a Girardian theory of schooling: we want what other people want. When the other kids in elementary school want the approval of the teacher, I want the approval of the teacher. For most students, that is necessary and sufficient motivation. That is why ordinary classroom learning works, and nothing else works at scale.
Take the average 8-year-old out of the classroom and put him in front of a computer at home, where he no longer sees other kids wanting the approval of the teacher. His motivation drops to zero. Only if we can overcome that will any form of technology-driven education take off.
Failure to reject the null hypothesis, I at least was taught, shows merely that no relationship between two data sets was detected. Failure to reject the null hypothesis does not prove anything, and its use as evidence in favor of some other assertion is weak evidence at best. Thus, using the null hypothesis as a basis for the various dubious assertions made or implied here seems to do quite some violence to the ordinary limits of what is traditionally understood as “the null hypothesis.” If you want to wave around the null hypothesis, that would ordinarily imply you have no standing for making policy prescriptions, even if it is only to accept the status quo.
The assertions that appear to be made here that we are asked to accept based upon failure to reject the null include:
1. “for the purpose of mass education, schooling as we do it today has evolved to be nearly optimal”
2. “When the other kids in elementary school want the approval of the teacher, I want the approval of the teacher. For most students, that is necessary and sufficient motivation. That is why ordinary classroom learning works, and nothing else works at scale,” and
3. “Take the average 8-year-old out of the classroom and put him in front of a computer at home, where he no longer sees other kids wanting the approval of the teacher. His motivation drops to zero.”
Let’s take each in turn.
First, regarding the optimal US education system and its implied policy prescription of generalized complacency. The 2024 National Assessment of Educational Progress found:
“Thirty-one percent of fourth-grade students performed at or above the NAEP Proficient level on the reading assessment in 2024—2 percentage points lower compared to 2022 and not significantly different from 1992, the first reading assessment year.”
(https://www.nationsreportcard.gov/reports/reading/2024/g4_8/national-trends/?grade=4#achievement-level-trends )
The lack of progress would appear to be a perfect demonstration of the null hypothesis: nothing makes a difference so why bother? However, around the world there are a variety of education models that produce different levels of achievement:
“Results from the most recent Programme for International Student Assessment (PISA) in 2015, which is a test administered to 15-year-old students in participating economies every three years, showed that students in the United States lag far behind other high-income countries across different subjects, despite making the most progress in equity compared to other participating countries. …
Analysts determined that the nations and city-states at the top of the rankings had several things in common. For one, they had well-established standards for education with clear goals for all students. Although these countries have well-delineated standards, they do not necessarily outline similar goals. For example, one country may emphasize cooperation, another student growth, or yet another may focus on equality, as in Finland.[2] Another thing the high-performing nations had in common was a tendency to recruit teachers from the top 5 to 10 percent of university graduates each year, which is not the case for most countries (National Public Radio 2010).
Finally, there is the issue of social factors. One analyst from the Organization for Economic Cooperation and Development, the organization that created the test, attributed 20 percent of performance differences and the United States’ low rankings to differences in social background. Researchers noted that educational resources, including money and quality teachers, are not distributed equitably in the United States. In the top-ranking countries, limited access to resources did not necessarily predict low performance. Analysts also noted what they described as ‘resilient students,’ or those students who achieve at a higher level than one might expect given their social background. In Shanghai and Singapore, the proportion of resilient students is about 70 percent. In the United States, it is below 30 percent. These insights suggest that the United States’ educational system may be on a descending trajectory that could detrimentally affect the country’s economy and its social landscape (National Public Radio 2010).
Recent research has found that the United States’ low overall educational achievement is in large part due to an underperformance by the middle class. The poorest students in the United States, despite being among the most socioeconomically disadvantaged around the world, perform averagely relative to other poor students. The richest students from the U.S., despite being among the wealthiest, are also average when compared to other rich students – which is also an alarming finding. However, students in the middle of the SES distribution perform half a school year behind comparable middle-SES students, despite being among the wealthiest middle-SES groups in the world.”
(https://courses.lumenlearning.com/wm-introductiontosociology/chapter/education-around-the-world/ )
So even if the US lacks the intellectual capital to produce efficacious educational reform, there is no convincing evidence that education reform is necessarily doomed in all places and at all times. The interest of basic national survival alone would counsel continued efforts regardless of the track record of failure. What really needs to be addressed and overcome is the doctrine of establishment complacency (aka “institutionalism”). The threat of a moldering status quo is much worse than the threat of change.
On the second assertion, there is a vast literature on the varieties of learning motivation that is easily accessible. LLM prompt: What are the major theories of student learning motivation in schools?
On the final assertion, concerning the primacy of teacher-approval-centered mimetic motivation in learning:
“Research shows that children self-motivate information seeking behavior to solve problems. For example, 5–9-year-old children are more likely to persist in information-seeking behavior to effectively solve a problem when given ambiguous rather than conclusive evidence (Busch & Legare, 2019). Children also adjust self-motivated exploratory behavior under problem-solving conditions in response to the credibility of their resources (Gweon et al., 2014). Some researchers suggest children are actually better self-motivated problem-solvers than adults (Lucas et al., 2014), as children seemingly are more willing to expend effort exploring information and resources before exploiting findings and drawing conclusions (Liquin & Gopnik, 2022). On the other hand, children may be less efficient self-motivated learners since they may struggle to terminate information-seeking behavior, even after the answer has been discovered (Ruggeri et al., 2016).”
(https://www.sciencedirect.com/science/article/pii/S0001691822003316 )
The null hypothesis might tell us something specific, but it does not justify endorsement of any other policy prescription.
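To make the point about failing to reject the null concrete, here is a minimal sketch of a power calculation. It assumes a hypothetical intervention with a true effect of 0.2 standard deviations and a simple two-sided, two-sample z-test at the 5 percent level; the effect size, the sample sizes, and the function name `power_two_sample` are all illustrative assumptions, not figures from any study mentioned here.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the common
    standard deviation (an assumed value, chosen only for illustration).
    """
    std_normal = NormalDist()
    # Critical value for a two-sided test at significance level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true effect shifts the test statistic, in standard errors.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic lands in either rejection region.
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Hypothetical study sizes: same true effect, different numbers of students.
small_study = power_two_sample(effect_size=0.2, n_per_group=50)
large_study = power_two_sample(effect_size=0.2, n_per_group=1000)
print(f"chance of detecting the effect with 50 per group: {small_study:.2f}")
print(f"chance of detecting the effect with 1000 per group: {large_study:.2f}")
```

Under these assumptions, the small study would usually fail to reject the null even though the effect is real, which is why a single non-rejection is weak evidence that nothing works.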
I am confused about whether Arnold thinks Alpha School has beaten the Null Hypothesis. At the end he seems to say that he agrees with the essay critical of the reformers, but he compares a classroom setting to being alone with a computer, which is a straw man.
I think that most interventions are too small or short-lived to expect major changes. Alpha School's completely different structure, running from kindergarten through 12th grade, seems clearly large enough and long enough that one could expect it to beat the Null Hypothesis. I fully expect that they have found a better way to educate, and I hope it spreads.
The emphasis on practical skills and engagement with the real world would also tend to produce better outcomes for students than a singular focus on SAT performance, given that industry cares about more than just IQ. Some of the wealthiest people I know are in sales or own or run a business, and they are doing much better than I am, even though I was single-minded and successful in academics.