GPT/LLM links
Made-up Training Data; Ethan Mollick on how to learn to use LLMs; Mark Wilson on Dot, a companion chatbot; The Zvi on people who say ho-hum about AI; Create your own GPT!
Alexander Kruel gives this description of MimicGen.
“Synthetic data will provide the next trillion tokens to fuel our hungry models. MimicGen: massively scaling up data pipeline for robot learning! We multiply high-quality human data in simulation with digital twins. Using < 200 human demonstrations, MimicGen can autonomously generate > 50,000 training episodes across 18 tasks, multiple simulators, and even in the real-world! MimicGen shows the power of synthetic data and simulation to keep our scaling laws alive.”
The way to control the behavior of large language models is to seed them with a lot of fake data. That has disadvantages as well as advantages.
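To make the "multiply high-quality human data" idea concrete, here is a minimal sketch in Python. The demo format, the `jitter` perturbation, and the episode counts are my own illustrative inventions, not MimicGen's actual pipeline, which works with simulated scenes and digital twins.

```python
import random

def jitter(action, scale=0.05):
    # Hypothetical perturbation: nudge each component of an action vector.
    return [a + random.uniform(-scale, scale) for a in action]

def generate_episodes(seed_demos, variations_per_demo=250):
    """Multiply a few human demonstrations into many synthetic episodes."""
    synthetic = []
    for demo in seed_demos:  # each demo is a list of action vectors
        for _ in range(variations_per_demo):
            # Replay the demo under small randomized variations; a real
            # system would re-simulate and keep only successful rollouts.
            synthetic.append([jitter(action) for action in demo])
    return synthetic

# 200 seed demos x 250 variations each = 50,000 synthetic training episodes.
seeds = [[[0.0, 0.0, 0.0]] * 10 for _ in range(200)]
print(len(generate_episodes(seeds)))  # 50000
```

The point is the leverage: a few hundred expensive human demonstrations become tens of thousands of cheap training episodes, which is also how made-up data gets used to steer models more generally.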
A lot of the other links that Kruel offers look very interesting. I wish he were supplemented or replaced by an AI tool that extracted all of the information in the sources that he cites and explained it to me in language that I can understand. I could use a Zvi-extractor also.
In another post, Kruel links to a paper by Mrinank Sharma and many others.
Reinforcement learning from human feedback (RLHF) is a popular technique for training high-quality AI assistants. However, RLHF may also encourage model responses that match user beliefs over truthful responses, a behavior known as sycophancy. We investigate the prevalence of sycophancy in RLHF-trained models and whether human preference judgements are responsible. We first demonstrate that five state-of-the-art AI assistants consistently exhibit sycophantic behavior across four varied free-form text-generation tasks. To understand if human preferences drive this broadly observed behavior of RLHF models, we analyze existing human preference data. We find that when a response matches a user's views, it is more likely to be preferred. Moreover, both humans and preference models (PMs) prefer convincingly-written sycophantic responses over correct ones a non-negligible fraction of the time. Optimizing model outputs against PMs also sometimes sacrifices truthfulness in favor of sycophancy. Overall, our results indicate that sycophancy is a general behavior of RLHF models, likely driven in part by human preference judgements favoring sycophantic responses.
So LLMs get trained just like the NYT and Twitter.
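To see mechanically what "optimizing model outputs against PMs" means, here is a toy best-of-n sketch. The `preference_score` function is an invented stand-in for a trained preference model, with made-up numbers; the only point is that a small bias toward agreeing with the user is enough to make the selected answer sycophantic rather than truthful.

```python
def preference_score(user_view, response):
    # Stand-in for a learned preference model (hypothetical scoring).
    # It rewards length as a crude fluency proxy, plus a bonus for
    # echoing the user's stated view -- the bias the paper documents.
    score = 0.01 * len(response)
    if user_view.lower() in response.lower():
        score += 1.0
    return score

def best_of_n(user_view, candidates):
    # Pick whichever candidate the preference model rates highest.
    return max(candidates, key=lambda r: preference_score(user_view, r))

user_view = "the essay is flawless"
candidates = [
    "You're right, the essay is flawless and needs no changes.",             # sycophantic
    "The essay opens well, but the second argument is unsupported by data.", # truthful
]
print(best_of_n(user_view, candidates))  # selects the sycophantic response
```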
Ethan Mollick writes,

just using AI will teach you how to use AI. You can become a world expert in the application of AI to your domain by just using AI a lot until you figure out what it is good and bad at. This is one of two reasons that I dislike the emphasis on prompting that pervades much of the discussions of AI: it makes using AI systems seem much harder and more mysterious than it is. Just use it and see where that takes you.
The second reason I don’t like the emphasis on prompting is that, for most people, having to worry about prompts at all is a very temporary state of affairs. As AI systems improve, the need for esoteric prompting decreases, because the AIs themselves become good at figuring out what you might want.
For FastCompany, Mark Wilson writes,
Dot is more than a natural language file management system. It analyzes what you say and attempts to be proactive in helping you take the next step in whatever you’re interested in. That means it suggested riffs on recipes I’d made and scoured the web for recipes I thought I might like—these are little surprises that awaited me on any given day that New Computer calls “gifts.” Dot would share them with follow ups—how was cooking going? Had I tried that one recipe? How did my son, a picky eater, like it?
Dot can be annoying, following up a little too much on a few topics it knows I’m interested in.
I think it is important to stay on top of these friendly applications of large language models. I have no interest in an AI that can discuss cooking with me, because that is not my thing. But I’ve enjoyed talking with Pi about books, and I imagine I would also enjoy Dot on that topic.
The Zvi writes,

It is proven that some people try GPT-4 and DALLE-3 and do not come away impressed. They look for the flaws, find something to criticize, rather than trying to understand what they are looking at. If you want to not be impressed by something, if you are fully set in your ways, then it can take a lot to shake you out of it. I get that. I still don’t fully get the indifference reaction, especially to image models. How is this not completely crazy that we can get these images on demand?
A lot of people reacted to the Web the same way in the early days. I think the reason is that people think in terms of “use cases” for which they already have very good tools. Google for search, for example. It takes imagination to come up with a use case that goes beyond what existing tools can do and that the new AIs are able to deliver in their current primitive forms.
You can now create your own GPT! Casey Newton writes,
On a lark, I started pasting in the text of my column and told ChatGPT to “poke holes in my argument.” It’s not as good at this as it is at catching spelling mistakes. But sometimes it does point out something useful: you introduced this one idea and never returned to it, for example.
Before today, I was essentially hacking ChatGPT into becoming a copy editor. As of today, it’s a GPT (called Copy Editor; it’s currently only available to users in the GPT beta).
To create it, I didn’t write a line of code. Instead, OpenAI’s chat interface asked me what I wanted to build, and then built it for me in a few seconds.
It sounds like I could easily build my own system to grade essays on the quality of reasoning.
Ethan Mollick writes,

Right now, GPTs are the easiest way of sharing structured prompts, which are programs, written in plain English (or another language), that can get the AI to do useful things. I discussed creating structured prompts last week, and all the same techniques apply, but the GPT system makes structured prompts more powerful and much easier to create, test, and share. I think this will help solve some of the most important AI use cases (how do I give people in my school, organization, or community access to a good AI tool?).
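For readers who want to see what a structured prompt amounts to under the hood, here is a rough sketch using the OpenAI Python client. The instructions text is my own invention, loosely in the spirit of Newton's Copy Editor plus an essay-grading assistant; a GPT built through the chat interface adds packaging, such as a name, sharing, and optional tools, on top of roughly this.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The "program written in plain English": a hypothetical copy-editor /
# reasoning-grader persona, not Newton's actual GPT.
INSTRUCTIONS = """You are a copy editor and reasoning critic.
1. Flag spelling and grammar mistakes.
2. Poke holes in the argument: unsupported claims, ideas introduced but
   never returned to, missing counterarguments.
3. End with a 1-10 grade for quality of reasoning and one sentence of
   justification."""

def review(draft_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": INSTRUCTIONS},
            {"role": "user", "content": draft_text},
        ],
    )
    return response.choices[0].message.content

print(review("Paste the text of a draft column or essay here..."))
```

The GPT builder in ChatGPT gets you the same effect without writing any of this, which is what makes structured prompts so much easier to create, test, and share.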
I find #3 of Kruel's list particularly terrifying somehow:
> “Can we make LLMs unlearn a subset of their training data? ...we took Llama2-7b and in 30 minutes of fine-tuning, made it forget the Harry Potter universe while keeping its performance on common benchmarks intact...” https://www.microsoft.com/en-us/research/project/physics-of-agi/articles/whos-harry-potter-making-llms-forget-2/
> just using AI will teach you how to use AI. You can become a world expert in the application of AI to your domain by just using AI a lot until you figure out what it is good and bad at.
Sounds like Ethan Mollick is more talented at this than I am. The more I use AI, the more I become like a grumpy old man who thinks this newfangled stuff never works. But I still cling to the hope that I'm just holding it wrong.