Some more results from my essay-grader GPT
Grading Andy Matuschak on why books don't work; Joseph Heath on cancel culture and generational transition; and some subsequent instructions I gave to the grader
Andy Matuschak’s essay is here. It was suggested by a commenter.
My essay grader GPT lavishly praised the essay. It concluded:
Numeric Grade: 8/10
The essay effectively communicates its ideas and is supported by logical reasoning and evidence. However, it could be improved by providing more concrete examples or empirical studies to support its claims, especially regarding the effectiveness of alternative learning methods.
Feedback: Your essay presents a well-argued case about the limitations of books and lectures in facilitating deep learning. You effectively use examples and logical reasoning to support your arguments. However, the essay could be strengthened by incorporating more empirical evidence or studies that support your claims, particularly when discussing potential alternatives to traditional learning methods. Additionally, while the essay addresses the shortcomings of books and lectures, it would benefit from a more detailed exploration of how different learners might interact with these mediums, considering factors like individual learning styles and prior knowledge. This would add depth to your analysis and provide a more holistic view of the subject.
Another reader was curious about how the GPT would grade this essay by Joseph Heath. Again, after lavishly praising the essay, the GPT concluded:
Numeric Grade: 8/10
This grade reflects the essay's depth of analysis and balanced perspective. However, improvements in structure, clarity, and a focus on practical solutions could make the argument more compelling.
Suggestions for Revision:
Clarify Your Definition: Start with a clear, concise definition of cancel culture to set the stage for your analysis.
Structure the Argument: Organize the essay into distinct sections, each addressing a specific aspect of cancel culture (e.g., social media's role, cross-cultural examples, future implications).
Focus on Solutions: Dedicate a section to discussing potential solutions or strategies to mitigate the negative aspects of cancel culture.
Your essay provides a thoughtful and extensive examination of a complex social phenomenon, contributing valuable insights to the ongoing discourse on cancel culture.
I learned something interesting that I should have known already: ChatGPT does not give the same response every time. I accidentally lost the first response to this essay. When I submitted the essay a second time, I did not like the response as much. I gave it a thumbs down, it asked me why, and it then gave me another response that was more like the first. That is the response I quoted above.
I would probably have given both of these essays higher scores. But I think that the GPT's suggestions for improvement were constructive and reasonable.
I would appreciate more suggestions for essays to grade. I prefer pieces of 1,000 words or fewer (newspapers typically limit op-eds to 600 words). Academic papers are too long, as are major magazine articles.
Leave suggestions in the comments. It may take me a while to work through the queue, and I might not post every single result.
Another commenter suggested that I should have published my instructions to the GPT. I am sorry that I did not do so. But after these essays were graded, and based also on some other test cases, I gave it some new instructions. This was the conversation:
Welcome back! Is there anything Grade an Op-ed type essay should be doing better? Anything you'd like to change?
You
I would have you drop the issue of neutrality. Op-eds are not supposed to be neutral.
GPT Builder
Understood, I've updated Grade an Op-ed type essay's approach. Now, it will focus more on the quality of reasoning and the fairness in presenting viewpoints, recognizing the inherent subjectivity in op-eds. It will no longer emphasize neutrality but will continue to penalize misrepresentation and inflammatory rhetoric. The goal remains to provide clear, logical feedback and suggestions for improvement.
Is there anything else you'd like to adjust?
You
I think that balanced argument can be subsumed under engagement with opposing views. Giving opposing views a fair hearing is what I am after. The essay does not have to please someone with an opposing point of view. But it must not leave those with an opposing point of view feeling that their views have been ignored, misunderstood, or--worst of all--misrepresented.
GPT Builder
I've updated the behavior to reflect your focus on fair engagement with opposing views in op-ed essays. Now, "Reasoned Reviewer" will evaluate how effectively an op-ed engages with and represents opposing viewpoints, without necessarily demanding balance. Essays are expected to give a fair hearing to opposing views, and misrepresentation or failure to address these views will be critically assessed. The goal is to ensure that those with differing views do not feel ignored or misrepresented.
Is there anything else you would like to refine or add?
You
An essay deserves particular praise when the author considers the possibility that the essay might be wrong. If the author concedes the risk that some of the essay's assumptions may be mistaken, or that events could take a turn that would change the author's mind, that indicates an open mind. Showing an open mind is worthy of praise.
GPT Builder
The changes have been made. Now, "Reasoned Reviewer" will also praise op-ed essays that demonstrate an open mind, particularly when authors acknowledge the possibility of being wrong or show a willingness to reconsider their stance in light of new evidence. This aspect of intellectual humility and openness will be a key factor in the evaluation, alongside the quality of reasoning and fair engagement with opposing views.
The GPT currently summarizes its job as follows:
"Reasoned Reviewer," as a specialized grader of op-ed essays, emphasizes the quality of reasoning, fair engagement with opposing views, and demonstration of an open mind. This GPT evaluates essays not only on the clarity of arguments and fair representation of opposing views but also praises essays where authors acknowledge the possibility of being wrong. Essays that concede potential flaws in their assumptions or show openness to changing their stance based on new evidence or circumstances are highly regarded. This reflects intellectual humility and openness, important traits in op-ed writing. Misrepresentation, failure to address opposing views, or inflammatory rhetoric will be penalized. The GPT offers a numeric grade and detailed feedback, highlighting areas of strength and suggesting improvements, especially in engaging with different perspectives and demonstrating open-mindedness.
From Matuschak's essay:
"to understand something, you must actively engage with it. That notion, taken seriously, would utterly transform classrooms. We’d prioritize activities like interactive discussions and projects; we’d deploy direct instruction only when it’s the best way to enable those activities. I’m not idly speculating: for the last few decades, this has been one of the central evolutionary forces in US K–12 policy and practice."
And yet, K-12 students today are not learning more than students a few decades ago. Matuschak just assumes that teachers can get students to engage by "prioritiz[ing] activities like interactive discussions and projects". But if students are not interested, they will just go through the motions. They will remember for a while--young people have GREAT short-term memories, much better than 40- or 50-year-olds--and then that knowledge will fade away. They have not built a great edifice of learning, just a castle in the sand, which must be constantly rebuilt and protected from the wind and tide.
Like just about everyone who writes about education, Matuschak seems to believe that young people would learn so much more if we just DID IT RIGHT. They won't. They won't learn significantly more unless they care about the material and then continue to use the knowledge. Which ain't ... gonna ... happen.
Interested to see some very negative examples as well - essays that you judge to be poorly argued, assuming the conclusions, etc.