21 Comments

> The easiest PC components for AI to replace are the mouse and the keyboard.

I can't agree at all here. AI will let us do some things with voice that were not possible before, but entirely replacing the mouse and keyboard (or the touchscreen and on-screen keyboard on mobile) is a much bigger task.

I think talking to an AI as the primary mode of communication is precisely the thing there is no actual demand for. Many people today prefer texting each other to talking on the phone, and I can't see interacting with a PC being any different.

I think you are absolutely right here, for a number of reasons.

- Many people can type faster than they can talk, and more smoothly meaning-wise. I do both very quickly, but seeing transcriptions of what comes out of my mouth is embarrassing. Dictating to a computer what I want to write is slower and requires more cleanup. Great if the computer can do the cleanup in my voice and catch my meaning, but at that point what is my voice and meaning? I hope I would notice the computer slightly changing things.

- Communication is more than just words; it also involves a lot of visual cues and motions, moving things around, etc. We use our hands to talk quite a bit, as well as handy objects for visual reference. The mouse lets us do the equivalent when communicating with the computer. And it isn't only communication: when creating visual things like slides, being able to move things around directly is very important. Trying to explain to another human "Move that picture over a little, then make it a bit smaller so we can see the chart. A little more. No, too far, go back... no, wrong way. The other way! Goddamnit, just let me do it!" is a common enough experience. Computers don't make it a lot easier without direct input.

There are times when talking is handy, such as those times when your hands are otherwise engaged or you know what you want but would need to search for it. My mom, for instance, uses Alexa to set timers and select music to play while she cooks, because the little box thingie can be over on a shelf well out of the way and she doesn't need to wash her hands before using it. I can imagine using something similar in the shop if I want to get some info on unit conversions or something without digging the phone out of my pocket. What those use cases have in common is "I am doing something else that requires my hands and most of my attention, so I just want to talk and have something else do the thing". Most extant computer assistants don't seem to really grasp that in a meaningful way, either being too intrusive or requiring too much hand-holding.

Now it is possible that this will get a lot better in the future, but it is worth noting that similar technology has existed for quite some time, in the form of a personal secretary or assistant. Even when those were popular, it doesn't seem that many people liked dictating letters or books as a mode of composition. I wonder how long AI will take to really match that level of functionality.

"Many people can type faster than they can talk, and more smoothly meaning wise."

But the goal is for AI to be able to do what you want faster than you can type.

That's useful for many things, yes, but it isn't useful for conveying information to the computer about what I want it to do. E.g., dictating an email is slower than typing it (for those of us who type faster than we speak, or who speak in a convoluted, difficult-to-parse way), and so is sending a quick text to someone. I use the Siri text system while driving and it is pretty good, but having to speak the content, wait while Siri reads it back to see if it was right, make corrections if needed, then confirm that it is ready to send is a much longer process than just tapping it out, and I am awful at using touch screens to tap out text. And Siri loves my voice; my sister can't get speech-to-text to work without specifying the punctuation.

The general point there is that many things are faster to do or convey via typing than by speaking. I agree that there are many use cases where speaking to a computer to get it to do stuff is desirable, however, especially as the machine gets better at understanding.

Comment deleted (May 24)

That's impressive, and yet I still find myself wasting tons of time filling out long hard-copy paper intake forms and questionnaires by hand in many doctors' offices, with the data then being expensively transcribed and entered into a database by staff, nearly a quarter of the way through this "digital century". Few areas have seen so little progress in the customer experience.

"But suppose that within the next three years ..."

Of course, if all that comes to be, then in three years anyone will be able to ask their AI to review all the sources you review and prepare a succinct Substack post in your style, according to your general worldview, on any subject of interest to them, or any subject the AI thinks you might have chosen to write about.

I don't agree.

What I want out of AI is a personal assistant. Given that I work in the knowledge economy and all my interactions happen via computer, that means an assistant that can summarize meetings, create to-dos from a meeting, remind me of deliverable timelines, track time more effectively, and help with menial tasks that need a CPU. I don't think this is Copilot per se.
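
For the "summarize meetings and create to-dos" piece, here is a minimal sketch of what that could look like today, assuming the OpenAI Python SDK, an API key in the environment, and a transcript sitting in a local file; the model name and file name are illustrative placeholders, not anything specific to Copilot or this comment:

```python
# Hedged sketch: turn a raw meeting transcript into a short summary plus a to-do list.
# Assumes the OpenAI Python SDK (`pip install openai`) and OPENAI_API_KEY set in the
# environment; "meeting_transcript.txt" and the model name are placeholders.
from openai import OpenAI

client = OpenAI()

def summarize_meeting(transcript: str) -> str:
    """Ask the model for a five-bullet summary followed by owner-tagged action items."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable chat model would do
        messages=[
            {"role": "system",
             "content": ("Summarize the meeting in five bullets, then list action items "
                         "as '- [owner] task (due date if stated)'.")},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    with open("meeting_transcript.txt") as f:
        print(summarize_meeting(f.read()))
```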

I think there will always be meetings where you personally need to be engaged, even though 85 percent of them are irrelevant and could be turned over to the AI.

I think of a wide receiver in the NFL who has had a bad game. He could spend hours watching film and directly engaging with every play, thinking about how he could have run that route better, read the coverage better, and actually gotten his hands up to catch the ball. He could think about all this and visualize it.

Or he could have AI watch the film and give him a “to-do” list:

1) get open, don’t signal next cut

2) catch ball, get hands up quicker and turn head

Note that the coach is likely to have already given the WR this summary before even watching any film.

Which do you think will result in better output by the WR for the next game?

Similarly, in a corporate project that actually requires your output, SME, and added value (and your building trusted relationships with colleagues), I don’t think delegating the meeting to the AI will be optimal. The solution is probably to cancel the 85 percent of meetings which are useless “info sharing”, “touching base”, or “regular updates”, but I suppose having an AI summarize these for everyone might be the next best thing. Maybe they’ll eventually die out on their own!

Where I see AI really helping the WR is in creating a VR environment where the same plays with different coverages can be run again and again, with immediate after-action critiques and replays.

100% this. It does seem to be the direction Microsoft is going, but whether they can get it right with Copilot is another question.

If the AI is doing this work for you remotely on Microsoft’s software and cloud, it could share this work with anyone, or replicate it and resell it as its corporate owner likes. Your activity, your iterative thought processes, and your output don’t belong exclusively to you any more than your web searches or email content belong to you. So far, the tech companies are only using your data to sell adverts and your consumption profile (and for espionage, and almost certainly insider trading by dishonest employees), but they’ll be happy to knock off any good output you give them, I’m sure. Ask Scarlett Johansson.

Any company or individual working on anything proprietary is going to want a ring-fenced AI tool on their own hardware or cloud, not on Microsoft’s et al. And talking is a fairly inefficient way of communicating, multitasking, and sharing information. I will not listen to a podcast, even at 2x speed. Hardware is certainly going to play a role for some people, and more than VR or a microphone and earbud. Whether it will be an AI PC, who knows?

I agree Microsoft's announcement is largely hype, but I think there is a use case where I want a local computer capable of implementing an LLM: a law firm with a large library of corporate formation documents, securities filings, contracts, etc. wants to put them into an LLM while maintaining their confidentiality. So it uses an LLM trained only on its documents and unconnected to the cloud (to keep the documents confidential). That would seem to need a computer that meets whatever technical specifications are necessary to accomplish this (probably more than Microsoft is using to label a Copilot PC). Other cases come to mind: medical practices applying AI to patient records protected by HIPAA, etc. Anyone with a need for maintaining confidentiality.
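
A minimal sketch of what that fully local setup might look like, assuming the llama-cpp-python bindings and an open-weights model file already downloaded to the firm's own disk (both are my assumptions, not part of the use case as stated); nothing here calls out to a cloud service, so document text never leaves the machine:

```python
# Local-only sketch: query a confidential document with a model running on this machine.
# Assumes `pip install llama-cpp-python` and a GGUF model file on local disk; the
# model path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="/models/local-model.gguf",  # hypothetical local model file
    n_ctx=8192,                             # context window large enough for a long excerpt
)

def ask_about_document(document_text: str, question: str) -> str:
    """Answer a question strictly from the supplied document, entirely offline."""
    result = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": "Answer only from the provided document."},
            {"role": "user", "content": f"Document:\n{document_text}\n\nQuestion: {question}"},
        ],
        max_tokens=512,
    )
    return result["choices"][0]["message"]["content"]
```

In practice a firm would likely layer retrieval over an index of its document library rather than pasting whole contracts into the prompt, and "trained only on its documents" would more often mean fine-tuning or retrieval than training from scratch; either way, the hardware question is simply whether the local machine can run a model good enough for that loop.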

This seems to be more of a cloud task, given the amount of compute you probably need for an LLM good enough to be used in the legal profession.

OpenAI already has trained models specifically for the legal profession (see: https://openai.com/index/harvey).

Just like with HIPAA, you don't see companies using local systems, but rather cloud systems with regulated and certified privacy standards.

Fair point, although I'm not sure I would trust a cloud-based solution if I were managing partner of a large law firm. Lawyers are a conservative bunch. Some I've talked with don't want any of their documents in the cloud.

I don’t disagree.

Yes, the LLM-system superpower is communication in language, or at least the ability to generate output in line with language inputs.

And, as impressive as these capabilities have rapidly become, that is only an ordinary cool power compared to the real superpower. The real superpower is the ability to extract extremely complicated "patterns" at the quasi-conceptual or thematic level when observing ("being trained on") a vast quantity of messy, weakly structured observations.

If I ask you to try to define in words a particular aesthetic style, you may "know it when you see it," but conveying that sense in text will give you a hard time. "A picture's worth a thousand words". But that's because something like an aesthetic style is best captured and communicated in some gigantic data structure of matrices and weights and network graphs and so forth, and the thing that stands for some kind of motif or style is not effable or legible to humans in its data-distilled form. But it is to the AIs. "A billion word-tagged pictures thoroughly analyzed" in this kind of incomprehensibly large factor analysis is worth ten trillion words.

What that means is that AIs can learn *anything* that humans do (or think), not by instruction or reading textbooks or surveying the literature or even by practice, but by carefully watching endless hours of info-dense recordings of skilled humans doing it well, then calculating that complex-pattern distillation, and then going into a refine-test-evaluate loop to hone it to perfection.

That's what the "AI PC" (and "Recall") is really all about. It is watching you so closely, purportedly to "help you" (which it probably will to some extent), but more to the point to learn how to BE YOU, that is, to substitute for you, and then scale you up at negligible marginal cost, crashing the price of your skill to a penny per hour. You are being 'helped' to unwittingly train your replacement.

I don't understand this take at all. Even with a theoretical AI that has achieved true general intelligence, the user will still need some way to communicate their desires to it. A mouse and a keyboard are in many cases just as efficient as using your voice to communicate, and in many others more efficient.

"The most difficult component to replace is the screen. But perhaps the attempts at creating glasses with Virtual Reality and/or Augmented Reality will pay off within a few years. If so, then the personal computer will definitely become a dinosaur.

If VR and AR glasses pan out, then the form factor of the future is a headset powered by AI. The AI could be local to the headset, or it could be in the cloud."

I don't know what you mean by personal computer. Glasses, smartphones, tablets, etc. are all personal computers.

https://en.wikipedia.org/wiki/Personal_computer

Have you ever tried writing by dictation, though? It's not very natural! Sometimes you will want to write with your own voice.

As long as we view a computer/AI as a minion assistant, using a combination of keyboard & voice is probably as far as we get. But if the computer/AI can be more than a gofer (taking notes, summarizing, scheduling, etc.), then voice can become a greater tool. Think of a Socratic device: a way of dialogue between two intelligences, asking and answering questions, a means of discovery. Some of my best learning experiences as a student (eons ago), and some of my best solutions to problems during my career, came from thoughtful conversations with others who were equally invested in finding answers or new ways to get something done. This whole business of voice vs. keyboard/mouse reminds me of the Star Trek movie where the crew travels back in time to find a way to transport whales: handed a mouse for the computer, Scotty holds it up and says into it, “Hello, computer.” When a worker hands him a keyboard instead, Scotty says, “How quaint.” The point here, it seems to me, is that we need to go beyond our current way of viewing the technology.

We're there. Siri on an Apple phone, with the device serving as a screen (visual interface) or a speaker (audio interface). Siri is the virtual intelligence that could, and maybe already does, use an LLM in the background.

Comment deleted (May 24)

It's more than that. For example, Recall is not just "Big Brother on Steroids" but the Galactus-scaled version. If you like Chinese Social Credit surveillance, you're gonna love 24/7 AI recording everything everyone does all the time, capturing and understanding the patterns of your behaviors and cognitive processes, also watching on cameras, also listening on microphones, even if you are trying your hardest to preserve your own privacy and are merely speaking within earshot of someone using the system without your knowledge. And of course communicating all that back to the Big Tech companies and anybody they want to sell it to, at as high a bandwidth as they can squeeze out of the connection. This is not Brin's Transparent Society: you are transparent to lots of wealthy and powerful entities, but nothing is transparent to you.
