I wonder if a plateau or slowing of change could encourage applications / practical development. When things were progressing rapidly, there was probably a sense to some that any current application would be shortly left in the dust.
Too-rapid change has been a big disincentive for me; whatever I work on feels like it is already becoming obsolescent.
The robot surgery piece was indeed amazing. Certainly provides evidence to justify optimism.
Infected with that optimism, I entered “dividends from AI stocks” in my AI-equipped browser and got back an AI-generated response that included “other AI-focused stocks mentioned (Microsoft, Nvidia, and C3.ai) do not currently pay dividends.” Uh oh. While that may be true of C3.ai (https://www.marketwatch.com/investing/stock/ai ) it is not true of MSFT (https://www.marketwatch.com/investing/stock/msft?mod=search_symbol ) or NVDA (https://www.macrotrends.net/stocks/charts/NVDA/nvidia/dividend-yield-history ), even if some might not consider their dividend histories spectacular.
While at the Marketwatch site, I succumbed to a clickbait link about “A Once-in-a-Decade Investment Opportunity: 1 Little-Known Vanguard Index Fund to Buy for the Artificial Intelligence (AI) Boom” which turned out to be about a utilities ETF – more data centers need more electricity, you see. Probably not the worst play in the world, even if the most obvious, but not exactly a ringing endorsement of anticipated AI revenue streams.
One of the newsletters in the inbox this morning also featured an ad from Sapience, which has apparently moved beyond AI to “synthetic intelligence” (https://invest.newsapience.com/ ) and is offering crowdfunding opportunities, even as its founder and CEO is tweeting:
“That LLMs are vastly expensive, and from an energy standpoint even unsustainable, is now obvious. But can they deliver on the touted productivity gains? If not, it is not for lack of trying. Research by The Upwork Research Institute reveals that 39% of C-suite leaders are mandating the use of genAI tools, with an additional 46% encouraging their use.
But the results are not encouraging, the same study found that nearly half (47%) of employees using genAI say they have no idea how to achieve the productivity gains their employers expect, and 77% say these tools have actually decreased their productivity and added to their workload.
The Internet (eventually) actually did ‘replace very expensive solutions with very cheap solutions’ but the dot-com startup investment bubble was irrational because there were no barriers to entry. After the bust, most failed right out of the gate and it would take decades for the real winners to emerge.
LLMs startups also have no barriers to entry, the technology will always be vastly expensive, and in the end, it just doesn’t deliver. When this bubble bursts it could be ugly indeed. There may be no long-term winners, at least not big ones, this time around.”
(https://x.com/BryantCruse/status/1819087395683262852 )
On the other hand, Warren Buffett is apparently invested heavily in AI albeit mostly indirectly. (https://www.nasdaq.com/articles/46-warren-buffetts-410-billion-portfolio-invested-4-artificial-intelligence-ai-stocks )
So I don’t know; maybe a winning approach to AI is to bet on products that are actually producing revenue streams with positive net present values? Not sure.
Hallucinations remain a problem, even in code writing. Most users need accurate facts as answers from an AI. Imagine a textbook with one false fact every two pages, where only experts can tell which fact it is. (We may already be in nearly that situation with books without knowing it.)
My own expected model is that a firm’s internal data, already sitting in multiple databases, becomes more easily available: you tell the AI bot what you want, and the bot generates the requests to pull the needed data from the relevant databases and combines the results.
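To make that concrete, here is a minimal sketch of the pattern in Python. It is an illustration under assumed names only: the plan_queries stub stands in for the LLM call, the toy SQLite databases stand in for a firm’s existing systems, and none of it reflects any particular product’s API.

```python
# Minimal sketch of the "tell the bot what you want" pattern (illustrative only).
# A stubbed-out "LLM" turns a natural-language request into one SQL query per
# database; the bot runs each query and combines the results on a shared key.
import sqlite3

def plan_queries(request: str) -> dict[str, str]:
    """Stand-in for an LLM call that maps a request to per-database SQL.
    A real system would prompt a model with the schemas and the request."""
    if "revenue by region" in request.lower():
        return {
            "sales": "SELECT region, SUM(amount) FROM orders GROUP BY region",
            "crm":   "SELECT region, COUNT(*) FROM accounts GROUP BY region",
        }
    raise ValueError("request not understood by this toy planner")

def run(db: sqlite3.Connection, sql: str) -> dict[str, float]:
    return {region: value for region, value in db.execute(sql)}

# Two toy in-memory databases standing in for existing internal systems.
sales = sqlite3.connect(":memory:")
sales.executescript("CREATE TABLE orders (region TEXT, amount REAL);"
                    "INSERT INTO orders VALUES ('East', 100), ('East', 50), ('West', 75);")
crm = sqlite3.connect(":memory:")
crm.executescript("CREATE TABLE accounts (region TEXT);"
                  "INSERT INTO accounts VALUES ('East'), ('West'), ('West');")

plan = plan_queries("Show revenue by region and how many customers we have there")
revenue = run(sales, plan["sales"])
customers = run(crm, plan["crm"])

# Combine the per-database answers into the single view the user asked for.
combined = {r: {"revenue": revenue[r], "customers": customers[r]}
            for r in revenue.keys() & customers.keys()}
print(combined)  # {'East': {'revenue': 150.0, 'customers': 1}, 'West': {...}}
```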
Insofar as more highly paid women hold jobs with digital output, such as middle management, the employment effect might be more negative for them, as well as for the mostly male population of coders.
Yes, it does look like scaling is hitting a plateau, as the dread Gary Marcus (among others) has been asserting. I had something like that in mind when I wrote my working paper, GPT-3: Waterloo or Rubicon? Here be Dragons, Version 4.1 (https://www.academia.edu/43787279/GPT_3_Waterloo_or_Rubicon_Here_be_Dragons_Version_4_1). I do believe, however, that the transformer architecture has earned a permanent place in the AI toolkit.
But there are other tools: some we know about, but most, in all likelihood, have yet to be invented. We're in a radically new world. The future is radically uncertain.
As a crude analogy, when Columbus started sailing west across the Atlantic in 1492 he thought he was headed to the Indies, that is, East Asia. He made three more voyages before his death in 1506, yet even then he still maintained that he'd found parts of Asia. Meanwhile Amerigo Vespucci's 1501-02 voyage to South America forced him to realize that a new world had been discovered, one that received his name.
It took those old Europeans a while to figure out what was going on. So it is with us. I wouldn't be surprised if we had a generation or more's worth of discoveries ahead of us.
My impression is that it isn't quite right to call it "scaling" unless the additional training data is comparable in quality and in density of non-redundant information of the sort that can make positive contributions to pattern learning and reproduction. But they weren't stupid or random about which sources of info they sought to train on first; they tried to get as much bang for the buck as possible.
At some point the 90% of dumb tweets and other garbage-level material left to add wasn't worth as much, all put together, as the 10% they started with, except for learning to chat convincingly with people at that level. I think the theory behind the expected gains from genuine scaling still holds, but the optimism about scaling into much-lower-quality terrain proved badly misplaced. It's as if, after exhausting a diamond mine, one decided to walk the earth and hope to stumble upon gems at about the same rate. If you end up in Crater of Diamonds State Park in Arkansas you might find one, but 99.9999% of your travels will yield nothing.
The current trend appears to be to use LLMs to generate training tokens for bigger LLMs. I hardly need to point out the many possible failure modes of such an approach. AI engineers have to learn to use the available high-quality data much more efficiently. It's certainly possible; to give an example I am familiar with, in Go the large model components* of AlphaGo, AlphaGo Zero and AlphaZero learned via self-play, and it took them many millions of games to equal human professionals. No human professional plays, or could ever play, that much. Korean and Chinese professionals start studying at 4 years old and, if they are promising, make professional by 12-14, so roughly 10 years, or about 3,600 days, of training. Even if they play or review a hundred serious games each day (unrealistic, as it takes at least 10 minutes just to replay a game record on a board without thinking about the plays), that is still more than an order of magnitude below what the Alphas used to learn, and realistically 2-3 orders of magnitude.
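A quick back-of-the-envelope check of that gap, in Python; the self-play total and the games-per-day figures are my own illustrative assumptions rather than numbers from the AlphaGo papers:

```python
import math

# Rough comparison of human training volume vs. self-play volume.
# All figures are illustrative assumptions, not published numbers.
human_days = 10 * 360             # ~10 years of study, as above
self_play_games = 5_000_000       # "many millions" of self-play games, assumed

for per_day in (100, 10):         # generous vs. more plausible games per day
    human_games = human_days * per_day
    gap = math.log10(self_play_games / human_games)
    print(f"{per_day} games/day -> {human_games:,} games, gap ~ {gap:.1f} orders of magnitude")
# 100 games/day -> 360,000 games, gap ~ 1.1 orders of magnitude
# 10 games/day  ->  36,000 games, gap ~ 2.1 orders of magnitude
```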
* Unlike SOTA LLMs such as Claude, GPT o1 and so on, the Alphas are not pure large neural models. They include neural models as one element and harness them inside variants of Monte Carlo tree search, a powerful _algorithm_ (i.e. a program that can be completely understood by humans, unlike the model part, which is an opaque blob of numbers), at least during the learning phase. If memory serves, the models themselves - without the algorithmic tree search - are nowhere near human professional strength. There was a discussion about this at LessWrong: https://www.lesswrong.com/posts/HAMsX36kCbbeju6M7/is-alphazero-any-good-without-the-tree-search
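For anyone curious what that split between model and algorithm looks like in code, here is a heavily simplified sketch of model-guided tree search in the PUCT style used by the AlphaZero family, written in Python. The policy_value stub stands in for the neural network (here it just returns uniform priors), and the game is a toy Nim variant rather than Go; this is an illustration of the idea, not the actual AlphaZero implementation.

```python
import math

# Heavily simplified model-guided MCTS (PUCT-style), illustrating how an opaque
# "model" is wrapped inside a transparent, human-readable search algorithm.
# Toy game: remove 1-3 stones; whoever takes the last stone wins.
def legal_moves(stones):   return [m for m in (1, 2, 3) if m <= stones]
def apply_move(stones, m): return stones - m
def is_terminal(stones):   return stones == 0

def policy_value(stones):
    """Stand-in for the neural network: (move -> prior, value for player to move).
    A real system would query a trained policy/value net; this stub is uniform."""
    moves = legal_moves(stones)
    return {m: 1 / len(moves) for m in moves}, 0.0

class Node:
    def __init__(self, prior):
        self.prior, self.visits, self.value_sum, self.children = prior, 0, 0.0, {}
    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def puct_select(node, c_puct=1.5):
    """Core PUCT rule: pick the child maximizing Q plus an exploration bonus."""
    total = sum(ch.visits for ch in node.children.values())
    return max(node.children.items(),
               key=lambda kv: kv[1].q()
               + c_puct * kv[1].prior * math.sqrt(total + 1) / (1 + kv[1].visits))

def simulate(stones, node):
    """One simulation; returns the value from the current player's point of view."""
    if is_terminal(stones):
        return -1.0                          # previous player took the last stone and won
    if not node.children:                    # leaf: expand using the "model"
        priors, value = policy_value(stones)
        node.children = {m: Node(p) for m, p in priors.items()}
        return value
    move, child = puct_select(node)
    value = -simulate(apply_move(stones, move), child)  # negamax backup
    child.visits += 1
    child.value_sum += value
    return value

def best_move(stones, n_sims=2000):
    root = Node(prior=1.0)
    for _ in range(n_sims):
        simulate(stones, root)
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

print(best_move(5))  # the winning move is 1, leaving the opponent a multiple of 4
```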
Gwern's links page on AlphaGo includes recent papers on the surprising vulnerability of AGZ-like players to adversarial attacks involving certain whole-board patterns. This class of vulnerability was first published in 2022 and has so far proven impossible to eradicate by additional tuning (the model becomes resistant to the variants of the exploit pattern used for fine-tuning, but the resistance does not transfer to slightly different variants). The patterns are so simple and obvious that no mid-skill human amateur (say Go Elo 2000) would miss them, but AGZ falls to them fairly reliably even as its search depth is increased. I have a hard time squaring this fact in my mind with AGZ's superhuman performance in games with top human professionals.
That's very informative, thanks!
Thanks for reminder that I need to read William Gibson.
Trump needs to teach a lesson the establishment will never forget - that persecuting your political opponents under the color of law will never be tolerated.
https://shorturl.at/7QUjA
“Think of written works as the tip of the iceberg of human knowledge.” The other day I searched the word “bench” in my iPhone photos library. I was searching for written text, but the algorithm returned a bunch of photos with my kids sitting on benches. If it can do this now, will it be able to “watch” my videos and return search results of my kids “jump roping” and “playing tug-o-war”? Further, can it teach me what I’m doing wrong in my jump roping? Can it see where the rope is smacking the ground and inform me that I should use a longer or shorter jump rope? Or jump a little sooner or a little later? Can it count the number of times I’ve jumped? Can it plot histograms showing me how the frequency of my jumps changes over time? I’m pretty sure it’s just a matter of time. What more complicated tacit knowledge did you have in mind?
Yuval Noah Harari’s new one…Nexus…has a bit to say about it. Expanding on the idea, he discusses understanding information networks as a means for understanding AI. I’m about halfway through…so far, so mind blowing…