In 1993, the National Center for Supercomputing Applications released Mosaic, the first graphical Web browser.
The Web protocol had debuted on the Internet two years earlier. All the Web had at that point was text and hyperlinks. Outside of universities, no one cared about it. The only businesses that had web sites were a few esoteric tech companies. Microsoft, the software behemoth, did not have an official corporate web site in 1993.
As of 1993, you could not use the Web for any of the things we take for granted today. You could not check email on the Web. You could not stream video. You could not shop online. You could not download a podcast. You could not look up a restaurant to browse its menu or get directions to it.
People at home did not have the software (on Windows, a program called a Winsock) to connect their PCs to the Internet. They did not have the bandwidth to download multimedia (a simple graphic might take over a minute to appear on your screen).
Visionaries and entrepreneurs had many ideas for applications to build on the Web. Most of these dreams never materialized. Others took much longer to arrive than companies hoped and planned for. Still others appeared unexpectedly.
There were many fads and false starts. I can recall many of my own errant predictions. I was confident in 1993 that the real estate industry would be disrupted, and yet 30 years later the process of buying a home is for the most part just as corrupt and inefficient as it was back then.
I did not think that portable, phone-based Web applications would go anywhere. I was clearly wrong about that.
AI is just getting started
I think that artificial intelligence is in the same chaotic, high-potential state in which the Web found itself in 1993. Yes, the field of artificial intelligence has been around a long time. But by the same token, as of 1993 the Internet had been around for a quarter century. The release of NCSA Mosaic did for the Internet what the release of ChatGPT did for AI.
The newest buzzword in AI is “multimodal,” meaning combining text, images, and possibly other sensory inputs and data. Today’s text-only chatbots may soon look as primitive as the text-only Web of 1992.
Tim B. Lee’s recent essay foresees powerful results from combining AIs with cameras.
Until recently, computers needed human help to understand what was in an image. Now computers can glean a ton of actionable information directly from images. And that data can then become an input to other software.
My advice is to set aside time to play with AIs now. And imagine the directions that the technology might take in the future. We’re in early days.
Multisensory AI is going to be far more disruptive than the Web. LLM will come to mean “large learning model” instead of “language model.” And this leap will prove to be monumental.
I think it's hard for people who live in a world of words to appreciate that bare human text, for all its power, is still a very unwieldy tool for communication, and especially for teaching and learning well enough to perform, its effectiveness diminishing as the topic becomes less abstract and symbolic in nature.
Most people don't (and often can't) really learn much from bare text. As with culture in "The Secret of Our Success," they instinctively and spontaneously observe, imitate, play, practice, and experiment under conditions of feedback, intuitively and subconsciously picking up on nonverbal and environmental cues and the subtle signals of social games, until they eventually reach their potential or an adequate level of performance.
And most masters and experts of many practical skills are terrible at teaching with words, or even at thinking about how they do what they do in a fully self-aware way, in a manner that can be captured with sufficient accuracy and precision in words. Anyone who has tried to follow a cookbook recipe and failed, then watched a video of the chef who wrote it doing it perfectly, has realized that the chef has lost awareness of certain techniques she is performing, and that applying and mastering those techniques is both essential to success and something novices wouldn't know they had to do. Even organic chemistry is full of such examples; occasionally, claims of non-replication themselves fail to replicate because the original authors did not realize that some subtle but indispensable details were not adequately communicated in the published experimental procedure.
But if the AIs can just watch everything we do with their eagle eyes, then apply their statistical pattern-finding magic to that omnicorpus, they will literally learn it all, immediately and perfectly, and then surpass us in everything, in a historical blink of an eye.
Multisensory Machine Learners (maybe MML is a good term, given the visual contrast with LLM) are going to take that human capacity to learn from other humans and raise it several levels higher. Think of face recognition, but with super sensors and an ultra-polygraph: the cameras see in UV and infrared too, know your temperature, see if you're sweaty or flushed, read your emotions, know if you're lying, can detect the faintest whisper at frequencies too low and too high for us to hear, can bounce X-rays and microwaves around to see beneath your clothes or under your skin, can spot the slightest flick of the wrist that whisks up the ultimate perfection of meringue every time, and the possibilities go on forever. The human-monitored panopticon is going to be trivial compared to the power of MML universal surveillance, and that is something not in the sci-fi far future but something that can be built today. And I'm sure it is being built. Have a nice day.
A small apropos anecdote: in 1996 or '97 I took an undergrad course in AI. We wrote some papers on the possibilities and limitations of various approaches then competing for academic interest. A friend of mine came up with what I thought was a very clever reductio ad absurdum task that an AI could not possibly perform: consider the Ogden Nash poem "The Ostrich" and (a) explain why it's funny, (b) translate it into French.
Yesterday I recalled that paper and decided to see how the LLMs would do on the task. ChatGPT failed, offering hazy generalities and an officious refusal to translate copyrighted material. Claude at first hallucinated a different poem entirely when I referred to the poem only by name and author.
But then when I copied the short text of the poem into the prompt and retried, Claude (though not ChatGPT) nailed both tasks, giving an extremely thorough and accurate anatomy of the poem's humor and doing as well as anyone could at the translation, with apropos commentary on why it was difficult to capture that sort of humor in another language.
1990s me would have thought "this is it, this is real strong AI, it's arrived, we're done." I don't think that, but boy are we in for some interesting times.