A collection of examples of AI systems "gaming" their specifications - finding ways to achieve their stated objectives that don't actually solve the intended problem. These illustrate the challenge of properly specifying goals for AI systems.
(Crossposted from my Substack: https://taylorgordonlunt.substack.com/p/my-ai-predictions-for-2027)
I think a lot of blogging is reactive. You read other people's blogs and you're like, no, that's totally wrong. A part of what we want to do with this scenario is say something concrete and detailed enough that people will say no, that's totally wrong, and write their own thing.
--- Scott Alexander
I recently read the AI 2027 predictions[1]. I think they're way off. I was visualizing myself at Christmastime 2027, sipping eggnog and gloating about how right I was, but then I realized it doesn't count if I don't register my prediction publicly, so here it is.
This blog post is more about me trying to register my predictions than trying to convince anyone, but I've also included my justifications below,...
I think they're way off. I was visualizing myself at Christmastime 2027, sipping eggnog and gloating about how right I was,
Reading further it seems like you are basically just saying "Timelines are longer than 2027." You'll be interested to know that we actually all agree on that. Perhaps you are more confident than us; what are your timelines exactly? Where is your 50% mark for the superhuman coder milestone being reached? (Or if you prefer a different milestone like AGI or ASI, go ahead and say that)
and with actual revenue.
“It’s not just the number one or two companies -- the whole batch is growing 10% week on week,”
YC makes all startups in the batch report KPIs, even from before being accepted into the batch. If you participate in their Startup School, you are asked to track and report weekly numbers, such as number of users.
Paul Graham posts unlabeled charts from YC startups every now and then, so I assume the aggregate of all of these is what Garry Tan is referring to. Unfortunately, it is not possible to reproduce his analysis. But we should s...
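As a rough sanity check (my own arithmetic, not from the post or from Garry Tan), here is what sustained 10% week-on-week growth would compound to:

```python
# Compounding of a constant 10% week-over-week growth rate (illustrative only).
weekly_growth = 1.10
print(f"per month (4 weeks): {weekly_growth ** 4:.2f}x")   # ~1.46x
print(f"per year (52 weeks): {weekly_growth ** 52:.0f}x")  # ~142x
```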
I think this is a reasonable proxy for some stuff people generally care about, but definitely faulty as a north star.
Some negative examples:
In last week's post, Meditations on Margarine, I explained how "I've awakened ChatGPT" is a perfectly reasonable claim. The error is in assuming "awakened" results in a human-like consciousness, rather than a "margarine mind".
In today's post, I want to explain where the common errors do occur: the belief that LLMs are a reliable source of insight, rather than a lottery or "gacha game".
Note: this post is focused on the ways LLM usage fails. There are valuable ways to use it, but that's not the focus of this post.
LLMs are best thought of as slot machines; the payout is random.
"Gacha" refers to games where your rewards are randomized: each time you pull the lever on an LLM, you're getting a random reward. Common to the...
There’s quite a lot in the queue since last time, so this is the first large chunk of it, which focuses on apps and otherwise finding an initial connection, and some things that directly impact that.
Are relationship coaches (not PUA) not a thing in the US?
Epistemic status: Philosophical argument. I'm critiquing Hinton's maternal instinct metaphor and proposing relationship-building as a better framework for thinking about alignment. This is about shifting conceptual foundations, not technical implementations.
--
Geoffrey Hinton recently argued that since AI will become more intelligent than humans, traditional dominance-submission models won't work for alignment. Instead, he suggests we might try building "maternal instincts" into AI systems, so they develop genuine compassion and care for humans. He offers the mother-baby relationship as the only example we have of a more intelligent being "controlled" by a less intelligent one.
I don't buy this - for starters, it is not clear that mothers are always more intelligent than their babies, and it is also not clear that it is always the babies that control their mothers....
Wait… isn’t this already filial piety? We created AI, and now we want it to mother us.
While I'm intrigued by the idea of acausal trading, I confess that so far I fail to see how it makes sense in practice. Here I share my (unpolished) musings, in the hopes that someone can point me to a stronger (mathematically rigorous?) defense of the idea. Specifically, I've heard the claim that AI Safety should consider acausal trades over a Tegmarkian multiverse, and I want to know if there is any validity to this.
Basically, I in Universe A want to trade with some agent that I imagine to live in some other Universe B, who similarly imagines me. Suppose I really like the idea of filling the multiverse with triangles. Then maybe I can do something in A that this agent likes; in return, it goes on...
A core element is that you expect acausal trade among far more intelligent agents, such as AGI or even ASI, and that they'll be using approximations.
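To make the triangles-and-universes setup above concrete, here is a minimal toy sketch of the expected-value reasoning usually given for acausal trade. Everything in it is my own illustration: the payoffs and correlation probabilities are made up, and whether such correlations can ever be justified is exactly what the post is questioning.

```python
# Toy expected-value sketch of an acausal trade (illustrative numbers only).
# Agent A (universe A) wants triangles to exist somewhere in the multiverse;
# the imagined agent B (universe B) wants squares. Neither can causally
# affect the other; A only has beliefs about how correlated their decisions are.

BENEFIT = 10.0  # value to A if B builds triangles in universe B
COST = 1.0      # cost to A of building squares (B's preferred shape) in universe A

# Assumed (not causal) correlations between A's and B's decisions,
# e.g. because each runs an approximate model of the other's reasoning.
P_B_COOPERATES_IF_A_COOPERATES = 0.8
P_B_COOPERATES_IF_A_DEFECTS = 0.1

def expected_utility_for_a(a_cooperates: bool) -> float:
    p_b = P_B_COOPERATES_IF_A_COOPERATES if a_cooperates else P_B_COOPERATES_IF_A_DEFECTS
    return p_b * BENEFIT - (COST if a_cooperates else 0.0)

print("EU(cooperate):", expected_utility_for_a(True))   # 0.8 * 10 - 1 = 7.0
print("EU(defect):   ", expected_utility_for_a(False))  # 0.1 * 10 - 0 = 1.0
```

The point of encoding the link as conditional probabilities rather than a causal effect is the whole idea: A's action never reaches B, it only changes what A should expect B to have decided.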
Problem 1: There isn't going to be much Darwinian selection pressure against a civilization that can rearrange stars and terraform planets. I'm of the opinion that it has mostly stopped mattering now, and will only matter less over time, as long as we don't end up in an "everyone has an AI and competes in a race to the bottom" scenario. I don't think it is that odd that an ASI could resist selection pressures. ...
Note: Thank you to @Eneasz for handling on-site logistics while I (Garrett) am on a work retreat!
Note 2: We will be meeting in building B this week.
Come get old-fashioned with us, and let's read the sequences at Lighthaven! We'll show up, mingle, do intros, and then split off into randomized groups for some sequences discussion. Please do the reading beforehand - it should take no more than 20 minutes.
This group is aimed at people who are new to the sequences and would enjoy a group experience, but also at people who've been around LessWrong and LessWrong meetups for a while and would like a refresher.
This meetup will also have dinner provided! We'll be ordering pizza-of-the-day from Sliver (including 2 vegan pizzas). Please RSVP to this...
Amidst the unrelenting tumult of AI news, it’s easy to lose track of the bigger picture. Here are some ideas that have been developing quietly in the back of my head, about the path from here to AGI.
A take I haven't seen yet is that scaling our way to AI that can automate away jobs might fail for fundamentally prosaic reasons, and that new paradigms might be needed not because of fundamental AI failures, but because scaling compute starts slowing down when we can't convert general chips into AI chips.
This doesn't mean the strongest versions of the scaling hypothesis were right, but I do want to point out that fundamental changes in paradigm can happen for prosaic reasons, and I expect a lot of people to underestimate how much progress was made in the AI summer, even if it isn't the case that imitative learning scales to AGI with realistic compute and data.
Epistemic status: I think you should interpret this as roughly something like “GenAI is not so powerful that it shows up in the most obvious way of analyzing the data, but maybe if someone did a more careful analysis which controlled for e.g. macroeconomic trends they would find that GenAI is indeed causing faster growth.”