Interesting work! Could this be fixed in training by giving the model practice at repeating each token when asked?
Another thing I’ve wondered is how substring operations can work for tokenized text. For example, if you ask for the first letter of a string, it will often get it right. How does that happen, and are there tokens where it doesn’t work?
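For anyone who wants to poke at this, here's a quick sketch using OpenAI's tiktoken library (assuming you have it installed) that shows how words map to tokens. The letters aren't directly visible in the token IDs, which is what makes the first-letter trick surprising:

```python
# A quick look at how words map to tokens, using OpenAI's tiktoken
# library (pip install tiktoken). Token boundaries don't correspond
# to letters, so the first letter of a string isn't directly
# readable off the token IDs.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for word in ["strawberry", "hello", "antidisestablishmentarianism"]:
    ids = enc.encode(word)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{word!r} -> {ids} -> {pieces}")
```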
I think this is a question about markets, like whether people are more likely to buy healthy versus unhealthy food. Clearly, unhealthy food has an enormous market, but healthy food is doing pretty well too.
Porn is common and it seems closer to unhealthy food. Therapy isn’t so common, but that’s partly because it’s expensive, and it’s not like being a therapist is a rare profession.
Are there healthy versus unhealthy social networks? Clearly, some are more unhealthy than others. I suspect it’s in some ways easier to build a business around mostly-healthy chatbots than to create a mostly-healthy social network, since you don’t need as big an audience to get started?
At least on the surface, alignment seems easier for a single-user, limited-intelligence chatbot than for a large social network, because people are quite creative and rebellious. Short term, the biggest risk for a chatbot is probably the user corrupting it. (As we are seeing with people trying to break chatbots.)
Another market question: how intelligent would people want their chatbot to be? Sure, if you’re asking for advice, maybe more intelligence is better, but for companionship? Hard to say. Consider pets.
There's an assumption that the text language models are trained on can somehow be coherently integrated. But the input is a babel of unreliable and contradictory opinions. Training to convincingly imitate any of a bunch of opinions, many of them false, may not result in a coherent model of the world, but rather a model of a lot of nonsense on the Internet.
I'm wondering: who, if anyone, keeps track of throughput at a port? Ideally there would be some kind of graph of containers shipped per day, so we could see long-term shipping trends.
(This makes the dubious assumption that containers are fungible, but we would at least have a rough idea of how bad the problem is.)
Could you say anything more specific or concrete about how reading HPMOR changed your life?
While improvements to moderation are welcome, I suspect it’s even more important to have a common, well-understood goal for a large group of strangers to organize around. For example, Wikipedia did well because the strangers who gathered there already knew what an encyclopedia was.
Tag curation seems a bit like a solution in search of a problem. If we knew what the tags were for, maybe we would be more likely to adopt a tag and try to make a complete collection of things associated with that tag?
Maybe tags (collections of useful articles with something in common) should be created by the researchers who need them? They can be bootstrapped with search. Compare with playlists on YouTube and Spotify.
It seems like a genuinely collaborative project, where articles are intended to be useful and somewhat more evergreen, would probably end up looking something like Wikipedia or perhaps an open source project.
There needs to be some concept of shared goals, a sense of organization and incompleteness, of at least a rough plan with obvious gaps to be filled in. Furthermore, attempts to fill the gaps need to be welcomed.
Wikipedia had the great advantage of previous examples to follow. People already knew what an encyclopedia was supposed to be.
I suspect that attempts at a “better discussion board” are too generic to inspire anyone. Someone needs to come up with a more specific and more appealing idea of what the thing will look like when it’s built out enough to actually be useful. How will you read it? What will you learn?
I’ve played around with Anki a bit, but never used it seriously because I was never sure what I wanted to memorize, versus look up when needed.
I wonder if it might be better to look at it a different way and use a note-taking tool to leverage forgetting rather than remembering. That is, you could use it to take notes, then start reviewing cards more seriously when you’re going to take a test. Afterwards, you might slack off and forget things, but you’d still have your notes.
After all, we write things down so we don’t have to remember them.
Such a tool would be unopinionated about remembering things. You could start out taking notes, optimize some of them for memorization, take more notes, and so on. The important thing is persistence. Is this really a note-taking system you’ll keep using?
Teaching people to use such a tool would fall under “learning how to learn.” Ideally you would want them to take their own notes, see how useful the tool is when studying for a test, and get in the habit of using it for other classes. If not, at least they would know that such tools exist.
Back when I was in school, I remember a teacher who had us keep a journal, probably for similar reasons. Maybe that got some people to start keeping a diary, who knows? For myself, I got in the habit of taking notes in class, but I found that I rarely went back to them; my notes were write-only. I kept doing it because I thought taking the notes helped a bit with remembering the material, though.
You talked about rest, but have you looked into stretches, soaking your wrists alternately in tubs of hot and cold water, ice packs, and so on? I had a different problem (tendonitis) and these helped.
A chat log is not a simulation because it uses English for all state updates. It’s a story. In a story you’re allowed to add plot twists that wouldn’t have any counterpart in anything we’d consider a simulation (like a video game), and the chatbot may go along with it. There are no rules. It’s Calvinball.
For example, you could redefine the past of the character you’re talking to, by talking about something you did together before. That’s not a valid move in most games.
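To make the contrast concrete, here's a toy sketch (everything in it is made up for illustration): a simulation validates each state update against its rules, while a chat log just appends whatever text comes next.

```python
# Toy contrast between a simulation and a chat log. Everything here
# is hypothetical, for illustration only.

class GridSimulation:
    """A simulation: state is structured and updates are validated."""

    def __init__(self):
        self.position = (0, 0)

    def move(self, dx: int, dy: int) -> None:
        if abs(dx) + abs(dy) != 1:
            raise ValueError("illegal move")  # the rules push back
        x, y = self.position
        self.position = (x + dx, y + dy)


class ChatLog:
    """A chat log: the only 'state' is the text, and anything goes."""

    def __init__(self):
        self.transcript = []

    def say(self, line: str) -> None:
        self.transcript.append(line)  # no rules, no validation


log = ChatLog()
log.say("Remember that trip to Mars we took last year?")  # rewriting the past; nothing objects
```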
There are still mysteries about how a language model chooses its next token at inference time, but however it does it, the only thing that matters for the story is which token it ultimately chooses.
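For what it's worth, that final step is typically something like temperature sampling over the model's output scores. Here's a minimal sketch with a toy vocabulary and made-up numbers; the point is that whatever computation produced the scores, only the sampled token makes it into the story:

```python
# Minimal sketch of temperature sampling over next-token scores.
# The vocabulary and logits are toy values, for illustration only.
import math
import random

vocab = ["the", "cat", "sat", "flew"]
logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical model output scores
temperature = 0.8

# Softmax with temperature turns scores into probabilities.
scaled = [l / temperature for l in logits]
m = max(scaled)
exps = [math.exp(s - m) for s in scaled]
total = sum(exps)
probs = [e / total for e in exps]

# Only this choice is visible in the story; the probabilities are not.
next_token = random.choices(vocab, weights=probs)[0]
print(next_token)
```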
Also, the “shoggoth” doesn’t even exist most of the time. There’s nothing running at OpenAI from the time the model finishes outputting a response until you press the submit button.
If you think about it, that’s pretty weird. We think of ourselves as chatting with something but there’s nothing there when we type our next message. The fictional character’s words are all there is of them.