I’ve been a giant fan of Ben Thompson for many years, since the start of his Stratechery newsletter. And this article about “Sydney” is getting a lot of attention. But I don’t quite get it. What’s so interesting about it? Is it that people had been thinking of AI as a replacement for search or a simple replacement for, say, lawyers, or writing high school essays, and that Thompson highlights that it might be something altogether different? I enjoyed the article, and I’m interested in what Thompson thinks, and anything that surprises him is notable. But why exactly is this article considered such a “must read”?
Some other related chatbot thoughts, under-covered so far as I can tell (which is not saying much: for a number of reasons I've reduced my internet commentary consumption and participation by approximately 99%).
1. "Corpus derived from the entire internet ... " - Lol, no way (to be fair, they are more accurate in the fine print.) But obviously the 'entire' internet (e.g., Twitter) mostly consists of low quality crud that doesn't look anything like a Wikipedia entry, high rubric-scoring standardized test essay, middle-brow magazine article, or prestigious paper opinion piece.
Excluding that crud is obviously a good thing to do, but the consequence is that the voice of that "particular persona" which early-generation chatbots project will not "sound like the internet", which will predictably produce the same depressing dynamics as the analogous complaints about institutional demographics not "looking like America".
Thompson says chatbot output will tend to reflect the style, viewpoints, biases, and agendas of those particular kinds of people who are overrepresented as authors of the corpus ('scribe privilege'), but he wisely ducks the matter of what we already know about who those people are, e.g., about 90% of actively contributing Wikipedia editors are male. And that's well into the progressively enlightened 21st century, but a lot of the non-crud corpus is old. The NYT looked at over 7,000 works of fiction from 1950 to 2018 in "Just How White Is the Book Industry?" and found 95% of the authors to be white.
So, one just knows without even having to look (I haven't and feel no need to) that someone has already combined chatbot with 'mansplain' to coin 'manbot' or 'botsplain' or whatever, and that if "chatbots so white" hasn't already hit critical mass to go viral then it will soon. Though in this case they aren't talking about 'European ancestry' but instead using 'white' in that completely idiotic and culturally corrosive way (e.g., you ain't non-white if you are into "the scientific method, rational linear thinking, cause and effect relationships ...") reflected in the NMAAHC "Aspects and Assumptions of Whiteness" chart (which was only a prominent graphical depiction of content from the progressive-vanguard consensus, which originated elsewhere, and with an unfortunately long history).
To be fair, chatbot output does indeed sound 'so white', but only if you are using """white""" in this idiotic way, because chatbots are currently aiming at something like the typical "Spinozan Ideal" professional tone of objective neutrality, and the idiots have declared the underlying assumption of the very possibility of objective neutrality to be """whiteness""".
For this and other reasons (e.g., anti-plagiarism-detector output tweakers) there will soon be many attempts to generate lots of different chatbot tones and styles, using a variety of techniques. One technique could be selective restrictions or expansions of the training corpus, and "everything before 1900" would be interesting. Another technique could be "tone and style translation", which reminds me of that classic Key & Peele gag, "Luther, Obama's Anger Translator."
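To make the "tone and style translation" idea concrete, here is a minimal sketch of how it could work as a post-processing pass over a draft answer. Everything here (the `complete` callable, the prompt wording) is a hypothetical placeholder of my own, not any particular vendor's API.

```python
# Hypothetical sketch of prompt-based "tone and style translation": take a draft
# answer in the default neutral register and ask a text-completion model to
# rewrite it in a target tone. `complete` is a stand-in for whatever LLM
# endpoint you actually use.

def translate_tone(draft: str, target_tone: str, complete) -> str:
    """Rewrite `draft` in `target_tone`, preserving its factual content."""
    prompt = (
        "Rewrite the following answer so that every factual claim is kept intact, "
        f"but the tone and style become: {target_tone}.\n\n"
        f"Answer:\n{draft}\n\nRewrite:"
    )
    return complete(prompt)

# Example usage (assuming `complete` wraps a real model):
# neutral = "The request cannot be completed because the account lacks permission."
# print(translate_tone(neutral, "warm, informal, second-person", complete))
```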
In our fallen age, in which large portions of the population have been encouraged and trained to deploy "tone filters" and close their minds to messages received in 'so white', 'mansplaining', or other problematic styles, one could see how chatbot developers might imagine there to be real merit in efforts to help chatbots generate tailored-and-translated-tone outputs. My advice would be to avoid doing so at all costs: run, run for your lives.
The trouble is that these efforts would be carried out by groups of people whose demographics do not "look like America". And even perfectly innocent attempts to deal with the 'chatbots so white' problem and imitate the tone of personae resembling those belonging to protected classes will produce perfectly innocent but inevitable errors. Armed with those demographics as ammunition, the usual suspects will use those instances as fodder for the national pastime of pretending to be outrageously offended in the most conspicuous and antisocial manner possible, by propagating an intentional misinterpretation of those errors as clear evidence of bigoted mockery baked into the code on purpose from the start. Best not to take even one step down that treacherous path.
2. Many commenters have pointed out that, impressive as some of the output of LLM weighted-autocompletion chatbots seems to be, a gigantic statistical analysis and 'forecasting' engine is not on its own a genuine example of, or an adequate substitute for, an 'intelligence' of any kind that is able to model concepts and follow internal logic to process and relate those concepts. There has been a class of responses arguing from the indistinguishability of output: "Does how it gets there matter if the output looks the same as what an intelligence would say?"
It matters, because, being humans, we are assuming humans to be "intelligent" and capable of thinking logically in a genuinely rigorous, objective, empirically-grounded manner about concepts and so forth.
And some humans are indeed capable of doing that; it's pretty impressive.
However, we are talking about writing here. Writings on the *internet*, people. Even with most of the crud filtered out, the part of the 'corpus' in which humans are communicating with each other at anything approaching 'logical engine' levels of rigor is a profoundly tiny portion of it.
Because that's not what writing, or talking, or human-to-human communication is for. And not to get insanely abstract or mathematical about it, but there exists a very deep problem deriving from the gap between various idealizations of logic on the one hand, and the implicit set of kinda-logical fuzzy 'rules' hard-wired by evolution into human brains that determines what kinds of arguments (made in which contexts by which kinds of people) we tend to find persuasive or be otherwise influenced by. This isn't just Aristotle's Rhetorical Triangle stuff. The problem goes deeper than that, because what humans often perceive to be logos still isn't.
Actual logic and the human-influencing kinda-logic are often in tension to the point of outright irreconcilable conflict, but most of the corpus is written in the framework of the latter, not the former. Training a system on that corpus is thus like eating the fruit of a poisoned tree when it comes to the question of trying to improve these things to make them actually logical and more 'intelligent' later on. Maybe trying to square this circle is why our successors in the Star Trek future can so easily defeat computer minds by causing them to halt fatally when given simple paradoxes, nonsense, syntax errors, and so forth.
But the trouble goes deeper than that, because to the extent that most modern writing is increasingly distorted by epistemic corruption of ideological origin, that corruption is also getting embedded in systems that we hope can be trusted to perform in completely non-corrupted ways in other contexts. Which, sad to say, by all appearances, is increasingly less likely for actual human 'intelligences'. Taking a close look at the kinds of firewalls and filters the engineers will develop to get around these corrupting influences would be quite fascinating indeed.
I wonder how these chatbots will interact with children, and whether that will be a good thing or not. Will chatbots exacerbate underlying psychopathology or ameliorate it? Maybe including a therapy-focused training set will help, but who will select the parameters? Will there be atheistic, Buddhist, Christian, Jewish, Muslim, etc. inclinations built into these bots, or will the bots evolve their own morality and metaphysics?
Chatbots are so new that it's going to take some time for people to figure out how to use them properly. For years text messaging was a thing, but it was rarely used. I remember people used to say, "Why did they text me instead of just calling me?" Now people say the opposite!
It will go both ways, though I suspect it will mostly be the other way around, because code can change faster than people; "old dogs, new tricks" and all that.
So it would be better to say it the other way: "Chatbots will take some time to figure out how to become easier to use."
Any company waiting for people to change so they can better use the product will thus be at a disadvantage relative to companies focused on changing the product to better serve people as they are.
As an analogy, reflect on the history of a related though more primitive stab at the more general problem.
About 30 years ago, Microsoft started including "autocorrect" functionality in Word.
They weren't exactly the first and hardly the only people trying to do stuff like that, but being Microsoft, their efforts, ideas, techniques, and so forth would inevitably have disproportionate influence on the many similar endeavors down the line.
For example, another instance of the more general problem is "autocomplete". Consider the way Google's search engine tries to guess the next words in your query and offers you a list of subsequent terms ranked by its assessment of their likelihood, given the algorithm's 'understanding' of your particular 'profile'. One way of looking at the chatbots is that they are just another example of this kind of software project, one which happens to operate at a few orders of magnitude greater 'sophistication' than its recent ancestors.
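For a sense of what those recent ancestors look like under the hood, here is a toy sketch (my own illustration, not Google's or anyone's actual code) of a bigram-frequency autocomplete: it ranks candidate next words by how often they followed the previous word in a small corpus, which is the same next-word-ranking idea the chatbots pursue at vastly greater scale.

```python
# Toy bigram autocomplete: rank candidate next words by observed frequency.
# Purely illustrative; real systems use far larger models, context, and profiles.

from collections import Counter, defaultdict

def build_bigram_model(corpus: str):
    """Count, for each word, how often each other word immediately follows it."""
    words = corpus.lower().split()
    followers = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        followers[prev][nxt] += 1
    return followers

def suggest(followers, prev_word: str, k: int = 3):
    """Return the k most frequent next words after `prev_word`."""
    return [w for w, _ in followers[prev_word.lower()].most_common(k)]

corpus = "the cat sat on the mat and the cat slept on the warm mat"
model = build_bigram_model(corpus)
print(suggest(model, "the"))  # ['cat', 'mat', 'warm'] for this tiny corpus
```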
An interesting thing to notice, by the way, is that whatever you want to call it, 'PC' or "hyper-risk-averse and preemptive non-offensiveness optimized for 99.99% of the expected user base weighted by each individual's capacity to cause us trouble" or whatever, such social considerations, which distort or put an inescapably arbitrary thumb on the scale of the underlying code, have characterized these projects from the very beginning. Embedded somewhere in the code, and probably going back to Word 95 at least, is an amusing list of obscenities which Word is programmed never to suggest or help spell.
They allowed 'power users' to tweak those settings, which made it possible to pull variations of that prank on under-powered users in which one adjusts their word processor to automatically replace certain common words, in a hard-to-notice manner, with similarly spelled obscenities. If you are clever, you can find evidence of times when this prank became more serious and consequential than the prankster originally intended, when those words ended up in the published versions of important documents.
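As a purely illustrative sketch of the mechanism being described (the entries below are invented; they are not Word's actual lists), the whole thing boils down to a user-editable replacement table plus a "never suggest" exclusion list, which is exactly what makes the prank possible:

```python
# Hypothetical dictionary-driven autocorrect pass: a user-editable replacement
# table plus an exclusion list of words the tool will never offer as a fix.
# Entries are made up for illustration only.

REPLACEMENTS = {"teh": "the", "recieve": "receive"}  # editable, hence prankable
NEVER_SUGGEST = {"someobscenity"}                    # stand-in for the real list

def autocorrect(text: str) -> str:
    corrected = []
    for word in text.split():
        fix = REPLACEMENTS.get(word.lower(), word)
        # Refuse to "help spell" anything on the exclusion list.
        corrected.append(word if fix.lower() in NEVER_SUGGEST else fix)
    return " ".join(corrected)

print(autocorrect("Did you recieve teh memo?"))  # -> "Did you receive the memo?"
```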
Anyway, for a long time anyone typing words on a smartphone has had the routine experience of being served autocorrections and autocomplete suggestions. These and many other related projects have been the source of laugh-out-loud hilarity on uncountably many occasions, with entire popular websites and social media accounts dedicated to sharing particularly funny examples.
Notice: these things are funny in exactly the same way that those "humorous mistranslations" are funny. Other humans can be just as amusingly 'alien' as an algorithm in this particular way.
When we laugh at this kind of machine output, it really isn't the output itself that's funny. Objectively, it is just some engineered thing failing to perform as intended, which usually isn't funny even when the results are 'harmless'.
What is happening when we laugh at these failed attempts to output strings of words "like we would do" is that we are reflexively reacting to something our brain inappropriately processes as matching a pattern that should only apply in a human-vs-human context, one useful for the important task of distinguishing insiders from outsiders.
That is, we are really just unwittingly indulging in the bonding-mirth we are hard-wired to experience when sharing instances of lower-status "them" not being able to 'get' higher-status us, as evidenced by their being pathetically unable to 'pass' and successfully imitate us for five whole minutes without making some egregious error (obvious and undeniable to any insider) no matter how hard they try.
The powerful instincts to react in this way and engage in these behaviors are not always maladaptive in the modern environment, and they are sometimes still useful and beneficial. But there are also scenarios in which these behaviors amount to indulgence in a vice. And lately such indulgence online has been rampant, accelerating, and increasingly manifesting in ways that are severely antisocial for a polity dedicated to any degree of pluralism.
In another reversal, I suspect that frequently laughing at the machines together is a kind of reinforcement training for the humans, and undermines to the point of neutralization our attempts at suppressing these instincts through the mechanism of cultural norms. Not that we were having much success anyway, but last nail in the coffin and all that.