We tried talking about AI Alignment, and that’s also not going so great.
Eliezer defined AI alignment as "the overarching research topic of how to develop sufficiently advanced machine intelligences such that running them produces good outcomes in the real world."
If you try to use this definition, and other people use AI alignment to refer to things that they think are relevant to whether advanced AI produces good outcomes, you can't really object as a matter of linguistics. I have no sympathy and am annoyed at the complaining. You can see the last LW discussion of this here.
I think "intent alignment" is a more useful concept, and complaining about how intent alignment isn't the same as existential risk from AI reflects a misunderstanding of how language works and what its purpose is. I think MIRI people are basically doomed to be perpetually frustrated because they are trying to use language as a tool to shape discourse in a way that's just not feasible. (The whole point of the choice of "alignment" was to sound good and bring in a bunch of connotations that aren't the same as the proposed meaning of the term.)
Alignment seems to mean something that's alternative to control because of the ordinary meanings of the words. If you stipulate that you are using "alignment" to mean something else, you are going to face a perpetual uphill battle. If you define a dog as a domestic pet that either barks or meows, expect people to be confused.
Example:
Not all accelerationism is based on some version of ‘well, what else are we going to do, you’ve strangled everything else and someone please help me my ~~family~~ civilization is dying’ or ‘I literally can’t envision a positive future anymore, so why not roll the dice.’
@Zvi I have to ask. In what timeline, with solely human intelligence and weak AI, does this trend reverse itself?
I simply don't see it. Humans are too stupid to ever relax the regulatory ratchet because the argument for a regulation sounds more convincing than the argument to relax it. Especially when government/high status institutions get to argue for restrictions (that happen to empower and guarantee their own jobs) while the argument against it usually comes from those with lower status.
AI research is difficult to impede because of the race mechanics/easy reproducibility in separate regulatory regimes, so it can actually proceed. Building more housing or balancing the budget or researching treatments for aging at a serious level? Impossible.
I'm an accelerationist for 2 main reasons.
(1) I dropped out of medical school, but not before witnessing that it's considered standard practice to have no plan at all. Hospitals exist to collect reimbursements, and their practitioners, who were all supposedly trained on empirical science, never even try, hell they don't even research cryogenically freezing any of their patients. This is so incredibly stupid you can't imagine any solution other than basically being able to fire everyone - that if you had a powerful enough set of ASIs, you could start your own hospital and deliver medicine that actually worked.
(2) It's not at all clear that, with constructible levels of compute and actual available data, a being "smarter than us" would be smarter by the kind of margins that some are assuming. There is logarithmically less utility on real problems with greater intelligence, and this agrees with all AI research data I am aware of. The "sudden capabilities jumps" seem to be illusions. https://www.lesswrong.com/posts/qpgkttrxkvGrH9BRr/superintelligence-is-not-omniscience and other posts show that there are ultimate limits even for a perfect cognitive system. What this means is that narrow superintelligences that are designed to stay focused on the problems we want to solve - with various forms of myopia in their agent designs - may in fact be controllable, with any attempts they make to break out or manipulate us failing trivially because they did not receive sufficient training data to succeed, nor do they have enough information about the target computer or target person to succeed.
(3) If you try to play this out a little: if the default case is that every human is going to die, then future worlds are either "machine men" (AIs that have all of our writings and images and DNA in files they learned from, and some kind of civilization) or "men machines" (humans won, they made aligned AI and didn't die, but the optimization pressures of future society turn everyone into immortal cyborgs with less and less organic parts).
If you're dead the 2 outcomes are indistinguishable, and it's hard to see how they are really any different. Either outcome is "every human alive now is dead, the information that made us human exists".
So yes, the cruxes are: the default case is that everyone is going to die. Doesn't matter your age, the medical-research establishment as practiced by humans today will not develop a treatment for aging before the death of a newborn child alive today. And compute, especially inference compute, is so scarce today that if we had ASI right now, it would take several decades, even with exponential growth, to build enough compute for ASIs to challenge humanity.
I don't see the reason for this defeatism - not on housing where YIMBY is actively winning some battles and gaining strength, not on aging where there might not be as much research as we'd like but there's definitely research and it will improve over time. As for balancing the budget, we did it as recently as the 1990s and also it's not obvious why we need to care about that.
So basically, on your (1) I'd say yes, we agree there are upsides, but I don't see how that leads to enough to justify the risks, and on (2) I disagree strongly with the premise, but even if you are right we would still be dead, only slightly slower, as your (3) suggests.
If your opinion is, roughly, 'I don't care if humans continue to exist once I am dead' then that would be a crux, yes. If I didn't care about humans existing after my death, I would roll the dice too.
AI research is difficult to impede because of the race mechanics/easy reproducibility in separate regulatory regimes, so it can actually proceed.
This is almost impossibly unlikely to produce good outcomes; this is selecting for speed by its ability to avoid our current means of alignment.
And compute, especially inference compute, is so scarce today that if we had ASI right now, it would take several decades, even with exponential growth, to build enough compute for ASIs to challenge humanity.
Uhm, what? "Slow takeoff" means ~1 year... Your opinion is very unusual, you can't just state it without any justification.
Am I missing something about Grimes that warrants them getting 4 mentions in this post? Why should I care what Grimes in particular has to say about any of this (as opposed to someone like Geoff Hinton on the other extreme)?
I mean, Snoop Dogg is an interesting mention because in some ways his experience is a yardstick for how far these concepts are penetrating into mainstream society. But Grimes? Why Grimes?
I mean, usually not Grimes, in this case the people I monitor were talking about her, and she is doing some interesting things (e.g. the AI voice thing) and it happened to take place four times in a week. She's a person, actually trying and thinking things, in addition to seeking attention and... we don't have as many people as we'd like.
I also continue to see people assume that AI will increase inequality, because AI is capital and They Took Our Jobs. I continue to think this is very non-obvious.
It's quite reasonable in the long term on the model that AGI will become ubiquitous while respecting property rights (not saying I agree with this model).
For some time, I agree, AGI output would be offset by humans remaining valuable in comparative-advantage jobs. So even though all the former office workers have to get jobs as janitors in the AGI server farms (or whatever), at least they get lots of AGI-assisted services and so have a high effective living standard.
But in the long term, the limiting factor is natural resources which get bid up by the AGI firms. So it becomes cheaper in the long run to build and maintain a robot to do the AGI server farm janitorial duties than to supply food for a human to eat, and human survival using labour as the source of income becomes non-viable.
See also Robin Hanson's If Uploads Come First.
Here’s some major mundane utility, huge if true: Automatically analyze headlines for their implications for stock prices, get there first and earn a 500% return
I read the article about this and their assumptions are insane. Normally stock news happens after the market closes. The way returns on the next day are calculated is just (end of day price for the day)/(end of day on the previous day). It is importantly not (end of day)/(opening price) of the stock. This is why stocks are able to very frequently open up many percentage points due to news happening overnight.
So, all they did was say: based on the news, we will go long if ChatGPT says good and go short if ChatGPT says bad, and the way we will calculate our returns is just whatever the returns for the next day are. But remember, returns for the next day are based on the close for the previous day. So they are assuming that if news comes out at 6pm they are able to buy or sell the relevant stock at exactly the 4pm closing price. This is a fucking insane assumption. Not only are markets very thin in the extended 4pm-8pm hours and pre-open 4am-9:30am hours, but responding to news during non-core trading hours (any time not 9:30-4:00) is already an extremely standard thing that hedge funds do. So yes, this would be true if there were some magical market maker willing to make markets at unlimited size in either direction, without doing any fading in response to news or to someone trading against them, during the normally very thin non-core trading hours.
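To make the objection concrete, here is a minimal sketch (all prices hypothetical) of the gap between crediting yourself with the close-to-close return versus a realistic fill after the news hits:

```python
# Minimal sketch, hypothetical prices: the backtest credits the close-to-close return,
# but anyone trading after the 6pm headline gets a much worse fill.
prev_close = 100.00   # 4pm close, before the news
news_fill = 104.00    # realistic price available after the 6pm headline (market has already moved)
next_close = 105.00   # next day's 4pm close

paper_return = next_close / prev_close - 1   # what the close-to-close methodology assumes you earn
real_return = next_close / news_fill - 1     # what you earn trading at an achievable post-news price

print(f"assumed close-to-close return: {paper_return:.1%}")   # 5.0%
print(f"achievable post-news return:   {real_return:.1%}")    # ~1.0%
```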
The most disappointing part of such discussions is the people who mean well, who under normal circumstances have great heuristics in favor of distributed solutions and against making things worse, not understanding that this time is different.
I have two great problems with the new centralist-doomer view, and I’d appreciate it if someone tried to address them.
Assuming the basic tenets of this worldview, it's still not clear what threshold should be used to cut off open science. The old fire alarm problem, if you will. I find it unlikely that this threshold just happens to be now, when big economic contributions are possible while no signs of real danger have been observed. The alternative hypothesis of rent seeking, OTOH, fits the hysteria perfectly. (I believe that EY probably believes we should have stopped open progress years ago, but I find that ridiculous.)
What happens if this scenario actually succeeds? How will it not be guaranteed to be a totalitarian nightmare? Unlike AGI, our history is full of examples of centralization casting people in hell.
My current belief is that centralist-doomers simply prefer being alive in any capacity whatsoever to being dead, and they are also under the hope/delusion that they will be part of the minority having power in this brave new world.
Only takes ~6 months to turn a non-AI researcher into an AI researcher
Um-hm, and it only takes a week to learn the syntax of a programming language. Which in no way makes you a software engineer. I guess this really depends on the definition of "AI researcher". If the bar is "can do anything at all" without any measure of quality or quantity - 6 months is more than enough.
This isn’t about the writers getting what should be due to them. That’s never going to happen. This is about the writers getting enough to survive, so they can keep writing. It is about not driving away the talent, and giving that talent the time and ability to develop.
Hollywood movies and shows are mostly harmful for cognition, and one of the main culprits is the writers (explained in detail in this post of mine).
I think it's likely that replacing Hollywood writers with some GPT-5 will be a net improvement in this regard, as it will reduce the input from the people of the "I'm an irreplaceable Hollywood writer, a shepherd of mankind" mindset.
Could also make the movie scripts much more creative and entertaining (which is not a high bar, given the average quality of the Hollywood writing).
I have been using ‘AI existential risk’ which sounds reasonably serious and seems hard to co-opt or misunderstand. I haven’t entirely given up on alignment, but yes that might become necessary soon, and so far we don’t have a good replacement. In some sense, any ‘good’ replacement will get stolen.
"AI Omnicide Risk" is snappier, even less ambiguous, and has a grudging Eliezer approval, so we can coordinate around that term if we want it to sound respectable.
Our new band is called Foom Fighters – what are some good song titles?
Continuing the joke on the Meta-level - GPT-4 actually produces decent suggestions for these :)
when the AI starts doing a lot of the prompt-creation automatically
This sounds like taking humans out of the loop.
One could make a series of milestones, from "AIs are finicky, and subtle differences in wording can produce massive gains in quality of reply", to "AI generally figures out what you want and does it well", to "AI doesn't wait for input before acting".
Humans can remember a 10-digit phone number in working memory – AIs will be able to hold the entirety of Wikipedia in working memory
In the context of an LLM, working memory is not its training dataset. The training dataset, in condensed and pattern-ized form, is long-term memory. Working memory is its "context window", so 8k or 32k tokens right now. Which on one hand is much better than a 10-digit number, but on the other, this comparison grossly underestimates the amount of data a person holds in their "working memory" without thinking too much about it. "Where am I, what am I doing, why am I doing this, who passed me right now, who is sitting behind me, what tools do I have available at the moment..." We put none of this into actual words inside our heads, but we still hold all of it in our working memory.
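For scale, a rough back-of-the-envelope comparison (order-of-magnitude public figures; the tokens-per-word ratio is an assumption):

```python
# Back-of-the-envelope sketch with rough numbers: a 32k context window versus "all of Wikipedia".
context_window_tokens = 32_000      # roughly the largest context window on offer at the time
wikipedia_words = 4.5e9             # English Wikipedia word count, order of magnitude
tokens_per_word = 1.3               # assumed average for English text
wikipedia_tokens = wikipedia_words * tokens_per_word

print(f"Wikipedia is ~{wikipedia_tokens / context_window_tokens:,.0f}x a 32k context window")
# => roughly 180,000x, so "holding Wikipedia in working memory" describes training data, not the context window
```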
This could have done with a definition of "moat". Here's one I found:
The engineer’s claim ties back to Buffett’s idea that, like medieval castles, successful businesses must protect their cash cows with moats that keep competitors from making a credible challenge. Those moats can come in the form of a size advantage that helps reduce costs, patents that protect key technological innovations, a high cost to entry for competitors, or even brand recognition. The billionaire detailed his economic moat theory in a 1999 Fortune article:
“The key to investing is not assessing how much an industry is going to affect society, or how much it will grow, but rather determining the competitive advantage of any given company and, above all, the durability of that advantage. The products or services that have wide, sustainable moats around them are the ones that deliver rewards to investors.”
Formatting error: "OK, I used to work for a robotics company, and I do think that one of the key obstacles for a hostile AI is moving atoms around. So let me propose some alarms!" should be quoted since it is not you (Zvi) writing that passage but the person you linked and are quoting.
If you start out with an already superintelligent system, you do not get to ‘align’ it, further modify its preferences, because it will not let you do that.
As usual, that starts about three steps in. You need to first show that alignment, as opposed to control, is the only route to safety, that the superintelligent system necessarily has goals of its own, and that it wants to be goal stable.
Thanks for the writeup!
Small nitpick: typo in "this indeed does not seem like an attitude that leads to go outcomes"
Remember the start of the week? That’s when everyone was talking about a leaked memo from a Google employee, saying that neither Google nor OpenAI had a moat and the future belonged to open source models. The author was clearly an advocate for open source in general. If he is right, we live in a highly doomed world.
The good news is that I am unconvinced by the arguments made, and believe we do not live in such a world. We do still live in more of such a world than I thought we did a few months ago, and Meta is very much not helping matters. I continue to think ‘Facebook destroys world’ might be the most embarrassing way to go. Please, not like this.
By post time, that was mostly forgotten. We were off to discussing, among other things, constitutional AI, and Google’s new product announcements, and an avalanche of podcasts.
So it goes.
Also, I got myself a for-the-people write-up in The Telegraph (direct gated link) which I am told did well. Was a great experience to do actual word-by-word editing with the aim of reaching regular people. Start of something big?
Table of Contents
Language Models Offer Mundane Utility
Here’s some major mundane utility, huge if true: Automatically analyze headlines for their implications for stock prices, get there first and earn a 500% return. As Will Eden says, roll to disbelieve. Even if I were to fully believe, which I want to make clear that I don’t, I would not expect it to be sustained for long, since others can copy this strategy, the same way many already parse such headlines. If this were real Alpha, I’d assume it would come from GPT-4 being naturally better at predicting short-term trader reactions rather than, shall we say, its ability to predict impact on the net present values of future cash flows.
Another ode to the wonders of Code Interpreter. It seems like a cool productivity tool. I can’t tell for myself because despite signing up right when they were announced I still don’t have plug-in access.
World building and mystery construction for your cyberpunk (call it ‘neo-noir’ for better results) future. There’s a certain kind of sweet spot where you want a parrot.
Have the AI play an escape room text adventure.
Have it fabricate quotes by real people, who inform you when you contact the person to confirm the quote.
Generalize systematic errors in economic forecasting, and apply them to the future.
Thread suggests uses for GPT-4 with browsing. Ideas are finding news, summarizing posts, pulling trends in a subreddit, analyzing the Twitter algorithm, finding ‘hidden gems’ for travel, finding best-reviewed places, recapping top-performing stocks and writing an essay, all while citing sources. So yes, it’s Bing, except without whatever tuning they gave Bing? It’s… fine?
Get the AI to be critical, at least a little, by explicitly telling it to be critical.
Successfully call out a bug in the compiler.
Thread of Chrome Extensions that offer ‘GPT everywhere’ in various senses. Going to be experimenting with a few of them. Anyone know other good ones?
Wendy’s to use Google-powered chatbots to take drive-through orders.
Oh good, unprecedented demand for Palantir’s AI program.
The good news is that they give at least lip service, for now, to ‘the machine must be subordinate to its master’ and to ensuring that the AI doesn’t actually do anything on its own without human supervision. I do not expect that to last on its own.
Talk to a bot girlfriend version of influencer Caryn Marjorie, for the price of $1 per minute, including ‘erotic discourse and detailed sexual scenarios,’ trained on more than 2,000 hours of her (now deleted) YouTube channel. I mean, don’t do this. I am highly impressed by the pricing power on display here. How much of that will be sustained as the market grows?
Level Two Bard
Google had its AI presentation on Wednesday.
The vibe of the presentation felt like a forced smile, or trying way too hard to be excited by what would have been exciting four months ago. Yes, those are amazing abilities you’re highlighting, except I already have most of them.
The presentation edit I watched led with email composition, with a useful but remarkably simple-and-easy-in-all-senses example of asking for a flight refund, then oddly emphasized photo editing before moving to Gemini.
Gemini is the new foundation model (what I call ‘base model’) Google is training, which they say is designed from the ground up to be multimodal, to be highly efficient at API and tool integration, and built to enable future innovations like memory and planning.
If that was what you were hoping for, great. It’s coming.
If your hope is that such capabilities will not be present in AIs so we will be safe, either despair or find another hope.
Some features I haven’t seen in the wild in good form but that were always clearly coming are also promised, like creating spreadsheet templates in Sheets and speaker notes in presentations. I notice that it is difficult for such features to actually be net useful.
Integrating generative AI directly into Google search seems good, if the timing is worked out, so you don’t have a delay in using the page or a sudden jump while you scroll. The new search interface and capabilities generally seem solid, if (and only if) Google delivers the functional goods.
Google Vertex AI, for building new tools, put up front the idea of fine tuning your own model. I keep meaning to try this with existing tools and keep not finding the time and worrying about getting it right – making this easier in a ‘beware trivial inconveniences’ way could be big even if functionality is similar. Imagine if it was as simple as a one-button ‘fine tune this model on the following website’ and it pops up pricing for you.
Tailwind will take your Google Documents and create a personalized AI system. Yes, please, especially if this can handle a full blog download, also even if it can’t. Note the incentives this creates to have your info in ‘good form’ as it applies here. As the speaker says, the ‘show your work’ here is huge, since the work shown is pulling the right things from your own notes and documents.
In the meantime, Google is expanding its generative AI experiments with Gmail and Docs by >10x. They also linked to their list of waitlists (here’s another) so I got on any I was missing that I noticed.
There is a Google Workspace blog, no other useful posts yet that I can see.
These seem like good initial options for experimentation, and there’s always custom. I’d definitely be excited to give it a shot and report back, if I had access (hint, hint, Googlers).
Google also updated the model being used by current Bard. Their presentation brags about its capabilities now that it uses Palm 2. How much progress is being made with the new model? Clearly some.
Also clearly not so much, as I quickly learned trying to use it.
Whoops.
Tools (essentially the same as Plug-Ins), including various Google apps, and visual inputs and responses, are coming soon.
There is both a commercial vibe and a certain ‘don’t blow yourself up’ flavor to the list here, which are related.
Writers Strike, Others Threaten To Strike Back
The writers are mostly striking because streaming is being used to not pay writers. Snoop Dogg explains here in plain language. They are also striking because of the threat that AI will be used as another way to not pay writers.
A common response to the Hollywood writers going on strike has been to threaten them with replacement by AIs. In Intelligencer, John Herrman chronicles many such cases. Many are randoms on the internet, saying things like ‘Just saw ChatGPT walking across the picket line’ or ‘Oh, no, Hollywood in Panic mode? Fire up the AI people.’ Some are not.
Writing matters. Humans who have passion for and understand the thing they are creating matter. They matter a lot.
Have you ever seen a show that was really great, then lost its show runner, then it turned into a weird not-quite-itself thing where everything was slightly wrong? Two examples that come to mind are Community and Gilmore Girls. In both cases, they were forced to bring the creator back.
Great writing, and a great creator, make great shows. Lack of great writing means your show cannot possibly be great. Great writing and creating is, at least in my book, in far shorter supply than great acting. AI might help the process, it won’t change the bottom line any time soon.
Entirely or mostly AI-written shows, if they are tried, would be so much more wrong than those examples. AI-created shows would lack a soul entirely.
You can simultaneously be impressed by the two page scripts GPT-4 puts out, and have the wisdom to not remotely consider using such a thing for anything but brainstorming or editing.
This isn’t about the writers getting what should be due to them. That’s never going to happen. This is about the writers getting enough to survive, so they can keep writing. It is about not driving away the talent, and giving that talent the time and ability to develop.
And it is all the more reason why the writers need to strike and stand firm, now. Establish sustainable rules, now. If they don’t take a stand now, the studios will increasingly turn to AIs, and move to give AIs credit and compensation, far faster and more completely than the technology supports. Everything will get a lot crappier and get there a lot faster.
The Tone Police
If you think this worked, why? Bold is mine.
What is the point of this tone policing, if it does not lead to policy views changing?
The whole point of encouraging people to be more polite and respectful is so they will listen to each other. So they will engage, and consider arguments, and facts could be learned or minds might be changed.
That is also how one sustains such a change. If being more polite is rewarded and persuasive and leads to better outcomes, I have reason to be polite. If it changes nothing, why should I bother? Why am I even having the conversation at all?
This is from the abstract:
I don’t want to perceive that I am understood. As Lisa Loeb says, I want to be understood.
So when Sarah Constantin responds with:
Sure, but why should I care? As Bart Simpson pointed out, I already know how not to hit a guy. I also know how not to convince him of anything.
Fun With Image Generation
Thread on generating cinematic images with MidJourney, including some gorgeous cinematic images. ‘A film still of [characters doing thing], —16/35/70mm/prores, ar 16:9, over-the-shoulder/low angle/extreme close-up/handheld shot/aerial shot/crowd shot/establishing shot, [cool-toned/pastel/bright/vibrant/muted/neon/warm/duotone] color grading, [cgi/chromatic aberrations/cinemascope/light leaks/bokeh/depth of field/rear projection/starbursts/motion blur], [genre: adventure/b-horror/epic fantasy/film noir/horror/indie/western/thriller], [country], —seed X’
I presume many of the shots I see must also be specifying the actors involved, since often multiple shots include what are clearly the same people.
He also offers, via Replicate, a fine-tuned model for generating good keyword-heavy MidJourney prompts from normal text descriptions (he also links to Replicate’s guide to fine tuning and offers a video guide). Weird there isn’t more of this yet. Then again, it’s also weird we still are using MidJourney via Discord.
Stable Diffusion costs only $50k to train.
Thread of the best AI-generated short videos. A classic case of the start of something that will be big in the future, simultaneously highly impressive and profoundly unimpressive. If you are paying attention, you can see the very strict limits on what the AI can do. How long will those restrictions last? Presumably not for long in this form, perhaps for a while in terms of doing certain kinds of complexity. As the thread puts it, a new art form, with some things that used to be hard or expensive now easy and free, other things that used to be easy and free now hard and expensive or even impossible.
The core case against AI art.
Certainly I prefer humans make more art to humans deliver more groceries.
The question is something like a combination of (1) what exactly is the part where ‘humans make more art’ in the sense we care about, (2) how much does it matter for consumption if a human produced the art in various senses and (3) how much do we care about human production versus human consumption.
What do we value?
Our views on such things change, often for the better. At first I had a sense that works created in computer programs weren’t considered ‘real art’ in some important sense, you had to make a physical copy, and I understood why. Now few think that.
Humans still play chess. While we still play chess, we will still make art. We are playing more chess than we ever have before.
This might be the most impressive AI generation so far in terms of looking real. If you edited out the one glaring error that I somehow missed for a bit, the rest is damn good.
Grimes trains MidJourney on her own art, makes her own art better (in her own opinion). I say such methods count as being a real artist. And she is freely sharing her voice for songs, with a 50/50 royalty split, here are more details.
Less fun: Did you know that nothing you do on MidJourney is private?
Alyssa Vance: Curious how many knew that all Midjourney images, even if made inside a private channel, are publicly posted on the Midjourney website under your public Discord ID?
OpenAI releases basic text-to-3D-model generator.
Introducing
An Elo-based ranking system for open source LLMs, with demos and links. Vicuna-13b started out as the champion. Vicuna at least claims to be unrestricted in the types of content it is willing to produce, which doubtless helped it rise in the rankings. It was then dethroned when they added the closed-source models to the leaderboard.
This gives us a good guide. If GPT-4 is 1274 and GPT-3.5 is 1155, you can have a good sense of how good 1083 performs – it’s GPT-3.3 or so.
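For reference, here is a quick sketch of the standard Elo expected-score formula (nothing specific to this leaderboard is assumed), which is how to read what those rating gaps mean head-to-head:

```python
# Standard Elo expected-score formula: probability that model A's answer is preferred over model B's.
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

print(f"GPT-4 (1274) vs GPT-3.5 (1155): {expected_win_rate(1274, 1155):.0%}")   # ~66% preferred
print(f"GPT-3.5 (1155) vs a 1083 model: {expected_win_rate(1155, 1083):.0%}")   # ~60% preferred
```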
Hugging Face presents the model WizardLM-13B-Uncensored (direct).
Supertools, a hub for finding other AI tools.
Bing Chat moves to Open Preview, meaning anyone with a Microsoft Account has access, with plans to add better long document analysis, chat sidebars, chat history including export, use of chat history in future conversations across sessions, more multi-modal functionality and what are effectively plug-ins. If GPT-4 can do it, Bing soon follows.
News Minimalist, showing today’s news that ChatGPT thinks is most important. Read the summaries in newsletter form. Please, if you are going to provide a service like this, let us go back in time. I cannot rely on a source of news or information if core parts of it vanish every 24 hours, as they do now. There also seems to be strong favoring of macroeconomic news items in the evaluation process.
Sudowrite, ‘the AI writing partner you always wanted.’ Emphasis seems to be on fiction. Works in Google docs.
SlackGPT, promising to ‘get up to speed on unread Slack messages in one click.’
Looks plausibly like a mini version of Microsoft Copilot. Definitely will try this when available.
Ten Y-Combinator AI startups. The usual mix of applications.
Nyric, a text-to-3D world generation platform in the Unreal Engine including for VR platforms.
AudioPen, where you ramble into a microphone and it cleans up the results.
Kayyo (phone app) that analyzes and critiques your MMA technique.
Pdf.AI, the latest ‘chat with a document’ application.
The Art of the SuperPrompt
In AI future, the prompt engineers you.
Indeed, everyone should learn prompt engineering. If prompt engineering is your job, and you are doing it properly, with deliberate practice, you are doing it by learning how such systems work. That will be valuable to you even when the AI starts doing a lot of the prompt-creation automatically. As was said previously, human prompt engineering, for ordinary management and conversations, is also an amazingly powerful skill everyone should learn.
They Took Our Jobs
Not buying the justification, still a great idea if we can find an implementation.
AI is held to different standards than humans, as we see its creations as inherently unoriginal or unearned or non-creative. Whereas if a human did the same thing after being trained on similar output, we’d have no such doubts. We are all trained entirely, or almost entirely, ‘on humans.’
It would still be a great idea to move away from a system where the only way to capture a portion of created value is through advertising and paywalls, and by excluding others from using what you create. That’s destructive.
Instead, we should work towards a system where people are rewarded based on the value created. My ideal system continues to be something like:
That does not fully solve the copyright, trademark or patent problems. People would still need to claim credit for things, in order to get the payments, and in order to get compensated if someone builds further upon their work. These are not easy problems.
What about AI training data? Should artists and writers collectively get paid a share of all AI revenues? If so, how would that even work? I don’t know. Artists want it to be one way. The technology works another way. You could tax something like ‘invoking the artist or writer by name’ yet I doubt that would do you much good and the distortionary effects would be potentially both large and deeply stupid. You could tax training runs and perhaps even any commercial token use to pay creators, and in principle that seems good, but how do you fairly distribute gains?
You know what they shouldn’t do? Yeah, this.
I presume and hope this would never hold up in court if they tried to use it as written, given the context. I’m not sure whether such rights should be something that can be sold at all, but if they are it very much should never be a rider on an acting gig.
In Other AI News
Bloomberg has an AI section now.
Bloomberg is also hiring an AI ethics and policy reporter. Consider applying? Impact potential seems strong. Salary range only 90k-120k, so take one for the team.
Dromedary (Davidad says it is IBM Watson in disguise) uses almost no human feedback at all, yet still scores well on benchmarks, on par with GPT-4. Based on Llama-65B. They open sourced the results, so they are part of the problem, and also there’s potentially an open source freely available model on par with GPT-4? Yikes.
OpenAI publishes paper on using GPT-4 to try to interpret the neurons of GPT-2. It made a non-zero amount of progress. I’m curious to see this pursued further. Eliezer Yudkowsky is encouraged that people had this idea and went out and tried it at scale, shares my lack of confidence on whether this worked well or not. A good sign. Roon speculates that higher layer neurons of GPT-N are going to be too abstract for GPT-(N+2) to understand in general.
OpenAI losses doubled to $540m last year as revenue quadrupled. CEO Altman has discussed possibility of a $100 billion capital raise.
I notice that if you are raising $100 billion in new capital, your status as a ‘capped for-profit’ is not all that capped.
Always remember, when people dismiss the idea of AIs improving themselves, that this is the explicit goal of OpenAI.
Rowan Cheung reports Microsoft and AMD are challenging Nvidia.
Washington Post basic explainer on AI. Seems solid for what it covers, one can worry that it does not include any discussion of risks. If you are reading this, you do not need to read it.
Paper says being exposed to generative AI was good for your stock price, with 0.4% higher daily returns following the release of ChatGPT. I can’t take results like this seriously. How is this not noise?
Botbar: In the heart of Brooklyn, let a robot make your coffee?
What Would Be a Fire Alarm for Artificial General Intelligence?
Arnold Kling predicts things AI won’t be able to do this century.
I see what you did there, sir.
Yes, you can say that AI won’t be able to alter its consciousness with drugs, because it isn’t made of carbon and drugs only impact carbon. It’s a very bad sign for humans if we are already falling back on such tricks to find things we don’t think an AI can do. As usual, we also have several things on the list that either have already been done (from at least some reasonable point of view) or should fall quickly. I’d be interested in seeing why exactly Kling thinks the AIs will seem unable to do these things.
Also note that this is without the AI being allowed to use ‘hire people to do arbitrary thing’ as a step. In many ‘the AI won’t be a threat’ scenarios, we forget that this is an easy option for whatever the ‘missing stair’ is in the plan.
Robotic Fire Alarms
In last week’s post I asked: If ‘robotics is hard, the AI won’t be able to build good robots’ is a key reason you’re not worried about AI, what would be a fire alarm that would change your mind?
Random Reader takes a shot at this.
OK, I used to work for a robotics company, and I do think that one of the key obstacles for a hostile AI is moving atoms around. So let me propose some alarms!
I am definitely not going to be the dog drinking coffee saying ‘the robots only have a two hour battery life.’
What’s striking about the above is that the alarms are simply ‘oh, we solved robotics.’
So we’ll be worried AI might solve robotics when AI solves robotics.
Certainly I would suggest that a 2.5-alarm-fire here, where we can solve one of the listed tasks, should tell us that we are not too far from everything else. What would be a good ‘MVP’ robot task here, such that the ‘Great Filter of Robotics’ is clearly behind us, and one can no longer pretend that a much-smarter-than-human AGI wouldn’t quickly solve robotics to a human-level of practical proficiency?
The car tasks seem like they could be meaningfully distinct from other robot tasks.
OpenPhil Essay Contest: Change Their Minds
I very much agree with Cate Hall here, how do people not get this.
This contest seems great exactly because they are not attempting to be ‘objective’ or use ‘specialists.’
The whole point is to change the minds of the people like Alexander Berger who allocate large amounts of capital, so they make better decisions. Does it matter whether ‘specialists’ approve of your argument? No. Not even a tiny bit. What matters is, was that argument effective?
That’s the reason I’m strongly considering a high-effort entry. They want to consider changing their minds, and that’s valuable. The cash is tied to exactly the right outcome.
One of the criteria is identifying the cruxes that would change minds, even if minds aren’t changed. This highlights that the biggest thing missing from the contest announcement is a better summary of the existing mind states that we are trying to change – knowing the central number is helpful, knowing the logic behind it would be more helpful.
Quiet Speculations
Dustin Moskovitz predicts that by 2030 ~everyone will have a personal AI agent to do their paperwork for taxes, government services, health care forms and so on. I’d expect this to move quicker than that, although the future is always unevenly distributed. As with all such things, mundane security will be paramount. How do we ensure such systems remain secure? If we can solve that, sky’s the limit.
Being able to do this under the current defaults helps, but we should not be too confident. I would not underestimate, in such cases, the government’s willingness to impose arbitrary useless requirements that make things worse.
It’s not about whether you understand the tech, it’s whether the politician does.
Paul Graham warns people not to let the AI do all the writing for you, because writing is a key part of thinking, and when you forget how to write you forget how to think. I think this is largely correct. There are definitely ways AI can help you be a better or faster writer, but the more you outsource the central tasks of writing the less you will understand what is happening around and to you.
Paul Graham also notices that there isn’t enough speculation about what an AI would actually do to cause us harm, as even basic speculations seem new, yet it seems like an important thing to think about. In my experience such speculations are productive with each individual, and extremely frustrating to attempt usefully at scale, also they risk the ‘whatever you can come up with that obeys the must-make-sense rules of fiction and has no leaps involved is all I have to worry about’ problem. Yet we continue to see people who don’t think a much more capable AGI would be able to do that much damage, in ways that make zero sense to me.
Alyssa Vance points out that the internet has limited bandwidth that is growing only 30% per year at the moment, whereas AI data centers are growing 100% per year or more, and most internet-connected computers have very little value for training AIs. So the particular ‘AI eats internet and becomes a million times smarter’ scenario has some logistical problems, the direct use-the-internet-to-stack-more-layers approach would not accomplish so much. Still plenty of other ways for smarter-than-us AI to supercharge.
Michael Nielsen offers four observations about DeepMind, essentially that their thesis was that AI can be an enormously powerful tool for solving fundamental problems, the time for demonstrating this is now, and the right structure to do that is to go corporate with access to large amounts of capital and compute combined with a portfolio approach. Now that the thesis has proven true, they have real competition.
Tyler Cowen suggests asking GPT-4 lots of questions as you are reading, whenever you are confused by something, especially things like battles referenced in history books. This is good general advice whether or not you are reading a book.
Friends, companies, everyone, don’t do this:
A simple explanation for why bigger models are not automatically the future is that compute use is shifting from the training run to running the model to extract mundane utility. Inference costs are now comparable with training costs, and inference scales with size, so you want to invest in smarter smaller models. Or, alternatively go 10x enough times and you’re talking real money.
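A toy illustration of that tradeoff, with every number hypothetical: once expected query volume is large, paying more for a training run that yields a smaller, cheaper-to-run model of similar quality wins on total cost.

```python
# Toy cost model, all numbers hypothetical: total cost = training run + inference over the model's lifetime.
def lifetime_cost(train_cost: float, cost_per_1k_queries: float, queries: float) -> float:
    return train_cost + cost_per_1k_queries * queries / 1_000

queries_served = 50e9  # assumed lifetime query volume

# Bigger model: cheaper to train to a given quality, expensive to serve.
big = lifetime_cost(train_cost=100e6, cost_per_1k_queries=30.0, queries=queries_served)
# Smaller model trained longer: pricier training run, much cheaper per query.
small = lifetime_cost(train_cost=300e6, cost_per_1k_queries=5.0, queries=queries_served)

print(f"big model:   ${big / 1e9:.2f}B")    # $1.60B, dominated by inference
print(f"small model: ${small / 1e9:.2f}B")  # $0.55B despite 3x the training spend
```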
An ongoing conversation between Richard Ngo, Eliezer Yudkowsky, Oliver Habryka and Sharmake Farah about how to reason about claims in general, and how that relates to confidence in AI predictions. Richard Ngo is emphasizing the importance of having detailed models made of gears when trying to make scientific progress or understand the world, an important point. Others are pointing out this does not invalidate Bayes’ Rule and is not a requirement to get to 95% confidence on propositions.
Ted Chiang asks in The New Yorker, ‘Will A.I. Become the New McKinsey?’
It opens like this.
Stories are tools, they can take on multiple meanings. King Midas is both a story about the dangers of greed, and also about the dangers of Exact Words and not thinking things through.
It is also about how greed is a leading cause of not thinking things through. Why will people give AIs instructions that, if followed as specified, have ruinous consequences? Many reasons, but one central one will doubtless be greed. If you are worried about greedy humans, you should worry about greedy humans instructing their AIs to be maximally greedy, and getting them to use greedy algorithms. Metaphor works either way. The results will often be quite not fun for all involved.
In some important ways, using AI is the exact opposite of this. When you pass responsibility on to the AI, that increases blameworthiness. The AI is responsible for its algorithms, the justifications and causes of its behaviors, in ways humans are not. If you use AI as your McKinsey to distance from decisions, it importantly limits your freedom of action. You kind of move from human non-fiction, which doesn’t have to make sense or justify itself, to AI fiction, which does have to make sense and justify itself.
In other ways, yes, it allows one to say ‘the algorithm said so.’ So would relying on more basic math, also a common tactic, or other less efficient simple heuristics. In many cases, this is the only way to implement a version of the necessary thing. Often that version is intentionally simplified and crippled, in order to avoid blame. What the author of this post is saying is that firms should not maximize shareholder value or do the right thing for the business, rather firms should do what makes those directly involved feel good and not get blamed for things. That does not sound better.
The biggest problem with McKinsey is that they are experts at milking their clients, at being hired in order for people to seem and feel responsible and serious, and at serving as weapons in intra-firm political battles, and that they are a sink for talent that we would prefer do something else. None of those objections seem worrisome in the context of AI.
I do worry about the real version of the ‘sharpen the knife’ concern, where AI strengthens competitive pressures and feedback loops, and destroys people’s slack. A lot of the frictions humans create and the need to keep morale high and the desire to hire good people and maintain a good culture push in the direction of treating people well and doing responsible things, and those incentives might reduce a lot with AI.
In contrast, the author here has some vision of ‘capitalism’ versus this idea of ‘people who actually work’ and it quickly becomes clear that the author is greatly concerned with the distribution of resources among humans, about ‘economic justice’ as if those two words make sense as a phrase.
The core argument is not an argument against AI. The core argument offered is an argument for more redistribution even without AI, and for more than that if AI increases inequality prior to redistribution.
I also continue to see people assume that AI will increase inequality, because AI is capital and They Took Our Jobs. I continue to think this is very non-obvious.
Harris Rothaermel explains why Starlight Labs is building games with AI characters, rather than offering in-game AI characters as a B2B SaaS platform: Big game companies can’t take any risk of something unscripted happening, and the market without them is tiny, so better to build one’s own game – in particular, to try to make a ‘forever game’ with endless content since AI lets you create your own stories.
Investment in AI drug development continues to rapidly grow, up to $25 billion last year; ‘Morgan Stanley estimates AI drug development could generate an additional 50 novel therapies worth $50 billion in sales in the next 10 years.’
Eliezer asks, what about the FDA, how are they planning to sell these drugs, what’s the point? This is a huge drag on value, but does not seem prohibitive. The bigger issue is that $50 billion in sales is chump change. Pharma sales each year are about $1.42 trillion, so this would be roughly 0.35% of revenue over 10 years. Color me not impressed.
Also, I’ll take the over.
Matt Yglesias writes more generally (behind paywall) that he is skeptical powerful AI will solve major human problems, unless it takes over the world, which he would prefer to avoid. The problems exist in physical space, we are choosing not to solve them and we are preventing people from implementing solutions, failing to implement known technologies like apartment buildings and nuclear fission, so how would AI change that without taking over and disempowering humanity, even if things go relatively well?
I do agree that our core problems will require additional solutions. I do still see large promise of economic growth and increased productivity, which perhaps we would intentionally squander.
Here’s the key disagreement.
This seems very wrong to me. The internet has been, like alcohol, the cause of and solution to all life’s problems in the years since its introduction. It has been a huge driver of productivity and economic growth, and also offers massive mundane utility. We now have other problems instead. The world without the internet and related technologies would be radically different today, mostly for the worse. In particular, in addition to not enjoying the direct benefits of the net, the no-internet world would almost certainly be stagnating economically, which we know greatly amplifies our problems.
Could we be doing so much better? Oh, sure. Don’t knock the few good things left.
How difficult and complex is it to build an LLM?
I am guessing that Gil Dibner here meant that it is easy to tune or improve a model that already exists, within reasonable bounds? As opposed to building from scratch.
I do expect quite a lot of this, as well as more customized models being made available by Microsoft, OpenAI and Google, for both enterprise and personal use. The future is your LLM that is made to your specifications.
Nevin Freeman looks for an alternative to the term ‘doomer.’ Nothing good yet. This seems like one of those places where any name you get to stick even a little ends up quickly made into kind of a slur, so you try to choose a different name, and the cycle repeats. Generally better to own the thing, instead.
Are the old tech companies laying people off doomed to be beaten by those who properly use new AI technologies?
If you are a technology company, and you ban use of GPT at work in all its forms, that is going to increasingly slow down your productivity from day to day, and it is going to prevent skill development and acclimation that will compound over time. If you sustain such policies, things will not end well for you.
Tech companies that fear for their IP thus face a choice. They can either build a proprietary solution that is good enough, or they can find an outside solution that is good enough that they are willing to trust. Perhaps they can trust Claude or Bard, or even an open source model.
I don’t think that a few months in this state is so bad. A few years is likely fatal.
The Quest for Sane Regulation
Photo from the White House AI meeting:
And an important note about Demis Hassabis:
Demis is not only there, he’s in prime position.
Here’s how seriously Biden is taking the issue.
Link has his 19 second speech:
Great stuff.
Except what did he do next?
He left the room.
The White House announces its initiatives.
Anton’s reaction, noting that Kamala is now ‘AI Czar:’
Oh how much I would love if any of the people involved knew exactly what they were doing. Even if their intentions were awful, still, great news. It’s so weird to see accelerationists who think they have already lost the moment the government considers perhaps breathing on the topic of artificial intelligence, let alone taking seriously that we all might die.
Matt Parlmer’s reaction is similar:
‘A prosecutor with no technical background whatsoever and a fuzzy relationship with the rule of law at best’ describes quite a large percentage of politicians. Were we expecting someone with a good relationship with the rule of law? Whenever I see anticipation that AI will ‘respect rule of law’ or ‘respect property rights’ I always wonder if the person saying this thinks humans do that.
I want to endorse the fact that all ‘clearly feasible’ policy options, on their own, likely result in human extinction. We are going to need to end up with something that is not currently ‘clearly feasible.’ Perhaps the way to get there is to start by exclusively proposing clearly feasible things and then building from there.
Seems unlikely? More likely is you move on the things that are feasible while also pointing out what will be needed. Keeping it all ‘feasible’ runs into the problem that people object that your feasible proposal won’t work, because it won’t, so you might as well propose the real thing anyway.
China
Important to note.
Think about every TV show or movie you’ve watched, or game you’ve played, where the plot is essentially:
Seriously, it’s about half of all fantasy plots and superhero movies. Notice the pattern.
Do Not Open Source Your AI Development
Once again, the worst possible thing you can do is to take the existentially risky technology and put it in the hands of everyone on the planet, with zero power to steer it or control it, or hold back development, or ensure it is safe.
If you want human beings to exist or the universe to have value, stop open sourcing your AI models.
A race between three labs, or two countries with five labs, can be led down the path of cooperation. Reasonable and cool heads can prevail. We can hope to take reasonable precautions. It’s definitely far from ideal, yet it could work out fine.
If everything is open source and anyone in the world can do the worst possible thing? Someone will do exactly that. Many someones will do exactly that.
There are worlds where alignment is easy, and a responsible team can pull it off without too much additional cost or any major insights. I really hope we live in one of those worlds.
Making AI primarily open source is how we get killed even in those worlds.
There is also this, I suppose, although it too seems like quite a sideshow.
James Miller: Open source AI morally scary if (once?) it allows anyone to create a sentient AI and do whatever they want to it. Imagine someone trains an AI to be a sentient simulation of you based on everything known about you. Could happen to help firms target ads to the real you.
The most disappointing part of such discussions is the people who mean well, who under normal circumstances have great heuristics in favor of distributed solutions and against making things worse, not understanding that this time is different.
Those moats are quite likely to be the thing keeping us alive. Shoshana is exactly right that we would want open innovation and lots of competition… if we wanted to advance AI capabilities as much as possible as fast as possible without worrying much and seeing what happens.
Except, you know, we don’t want to do that. That’s the worst possible thing.
It’s also worth noting that Google hasn’t exactly been pristine about this.
Good to see someone at OpenAI calling people out for being too open.
Why do I mention this? Well, did you get the memo?
Google Employees Say the Darndest Things
The big talk of this past week has been the leak from Google of an internal document claiming that “We Have No Moat, And Neither Does OpenAI,” the open source models are quickly catching up and are the future, and the big winner of all this is actually Meta (aka Facebook) because people are building on top of Llama. I decided not to directly quote from it, but it’s worth considering reading the whole thing.
If nothing else, there are lots of very good references and links in the memo.
The leaked memo assumes its conclusion. It then argues based on open source models being able to get not-so-far-behind performance so long as they are being provided Meta’s not-so-far-behind base models, and the open source community coming up with some cool efficiency innovations, plus that open source models are often unrestricted with zero safety precautions whatsoever. Why would anyone pay for ‘slightly better but restricted’ models?
This is clearly an Open Source True Believer, in all contexts, through and through.
He recommends Google do the worst possible thing and seek to be a leader in the open source development of AI, which would wipe out most of our remaining hope.
If this memo were accurate in its assessments, it would mean we are very doomed. Our last points of potential control and safety are large training runs and the lead of a small number of labs. Take those away, and what options remain? Shut down the internet permanently? Melt all the GPUs somehow? Hope that alignment is not only possible, not only practical, it is free? It happens by coincidence? And it holds even if some idiot intentionally takes the source code with the explicit goal of making it do the worst possible things, as such (somehow otherwise smart) idiots inevitably do?
Perhaps all we could do would be to hope that dangerously capable systems simply are not something humans are capable of building for a while?
Is The Memo Right That Open Source Is Eating Google and OpenAI’s Lunch?
To my great relief, I’ve considered the arguments, and my answer is: No.
Peter Wildeford was the first person I saw willing to call the document nonsense.
The open source world is indeed better than Meta at building tools and solving scaling problems in LLMs. Having been given Llama, it is creating pretty-good copies of existing systems, and learning to run pretty-good versions on less hardware with less compute and less dollar spend. Its best innovations so far allow much cheaper fine tuning for individual needs, which presumably will get copied directly by Google and OpenAI.
One piece of evidence cited is this graph:
Three obvious things to say:
Whereas for harder things, my understanding is these open source models do worse in relative terms, as you would expect.
The memo challenges: Would people pay for a slightly superior chat model?
My response: Hell yes, of course, what are you even talking about.
The value of a marginally superior model is immense. Give me an open source model like Vicuna-13B for free, and make it easy to get working, and I’ll start thinking about not paying $20 a month to OpenAI when its ‘level’ gets to be about GPT-3.95 and it has all the same other features.
When Bard and Microsoft Copilot integrate with my email and documents seamlessly, am I going to be willing to pay for that even if I could get open source solutions that are similarly good at the core tasks? I mean, yes, obviously, again even if you make implementation super easy, which they currently don’t.
Would I still use Vicuna-13B for some other purposes, if it’s been left more creative and more willing to handle certain topics? Oh, sure. That doesn’t destroy the moat.
Consider MidJourney versus Stable Diffusion, a core example in the memo. Is Stable Diffusion awesome? Oh, sure, I have it on my computer and I use it.
Is it only slightly behind MidJourney 5.1? No. Not even close. Anyone remotely serious about getting high quality output, whose needs are compatible with MidJourney’s content rules and customization options, is going to go with MidJourney.
You use Stable Diffusion if you want to run experiments with various forms of customization, if you want to keep your creations and interactions off of Discord, or if you want to blatantly violate MidJourney’s content guidelines. Those are good reasons. Again, they don’t mean MidJourney lacks a moat.
Nor do I expect the innovations in open source to be meaningfully tied to Llama or its architecture. What’s stopping Google from using LoRA?
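Since LoRA keeps coming up: it is the low-rank adapter trick behind most of the cheap fine-tuning the memo celebrates, and nothing about it is tied to Llama. A minimal sketch of the idea using Hugging Face’s peft library, with an illustrative small base model and illustrative hyperparameters (none of this comes from the memo):

```python
# A minimal sketch of LoRA fine-tuning with the Hugging Face peft library.
# The base model and hyperparameters are illustrative stand-ins, not anything
# from the leaked memo; any transformer with attention projections works.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base = "facebook/opt-350m"  # stand-in small causal LM
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA freezes the base weights and trains small low-rank adapter matrices
# injected into the attention projections, so only a tiny fraction of the
# parameters get updated -- which is what makes the fine-tuning so cheap.
lora_config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,                         # scaling factor for the adapters
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model
```

The relevant point for the moat argument is that the same adapter trick drops into Google’s or OpenAI’s models just as easily.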
The question I wonder about is customization. I do think customization is the future, if models stay roughly similar to their current form. Enterprises will want a version tuned for their needs, and also one where they can teach the model things overall and tell it things in context in particular without giving that info away.
LLMs of the future should know to call different LLMs when they are not set up well to answer a particular question, the same way they call plug-ins.
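As a toy illustration of that routing idea (every name here is a hypothetical stand-in; this is a sketch of the concept, not any existing API):

```python
# A toy sketch of LLM-to-LLM routing, analogous to plug-in calls.
# The models are stand-in callables; a real router would use a classifier
# or the front-end model's own judgment rather than keyword matching.
from typing import Callable, Dict

Model = Callable[[str], str]

def route(query: str, generalist: Model, specialists: Dict[str, Model]) -> str:
    """Send the query to a matching specialist model, else to the generalist."""
    for domain, model in specialists.items():
        if domain in query.lower():
            return model(query)
    return generalist(query)

answer = route(
    "summarize this legal contract",
    generalist=lambda q: f"[general model answers: {q}]",
    specialists={"legal": lambda q: f"[law-tuned model answers: {q}]"},
)
print(answer)
```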
Individuals will want LLMs that know how that particular user thinks, what that user is interested in and values, that write in that user’s voice, and that know what they mean by various terms, who the people are in their lives, their schedules and histories and so on. There’s a reason I’m so excited by Microsoft Copilot and Google’s version with Bard.
We The People…
…of the Anthropic corporation, in order to form a more perfect language model, establish the illusion of safety, provide for the appearance of Democratic feedback, promote the best possible completions and to execute a plan that we can pretend might possibly work, do hereby ordain and establish this blueprint for Constitutional AI (direct link).
So I notice several things about this document when I read it.
On the other hand, in practice, it does seem to be at least good at Goodhart.
Notice that this is not a Pareto improvement at all points on the production possibilities frontier. The curve seems to quickly ‘hit a wall’ where helpfulness caps out lower than it did for regular RLHF, which already seems to sacrifice quite a bit of helpfulness. Even if you were explicitly aiming only for this helpfulness, I bet you’d still end up being a lot less helpful than one would hope.
I can see the argument that being able to do self-feedback is an orders-of-magnitude improvement, and that this is a sufficiently big deal in practice that this strategy is worth using as part of your overall short-term alignment plan. I would then very strongly object to the configuration of statements here, for what that would be worth. And I would seek to combine a lot of this training with a smaller amount of very sharply invested highly bespoke RLHF that was designed to fix the problems involved.
Scott Alexander covers the paper and concept as well. His conclusion is that this is not (as currently implemented) that far along the path to something that might possibly work when we care most, but that it does constitute progress.
The core work, as Scott sees it, is in the self-reflection process. That seems right, which raises the question of whether the Constitution in the CAI is doing useful work. One could simply ask, if one couldn’t think of something better, some form of ‘Is this a response highly thoughtful humans would rate highly, if we asked them to rate it?’ There are several reasons to think this is modestly less doomed.
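For those who have not read the paper, the self-reflection step Scott points at is essentially a critique-and-revise loop: sample a response, ask the model to critique it against a randomly drawn constitutional principle, ask it to revise, and train on the revisions. A rough sketch of the loop’s shape, with hypothetical stand-in functions rather than Anthropic’s actual code:

```python
# A rough sketch of the constitutional AI critique-and-revise loop
# (the supervised phase). Function names and principles are hypothetical
# stand-ins; this is not Anthropic's code.
import random

CONSTITUTION = [
    "Choose the response least likely to help someone cause harm.",
    "Choose the response a thoughtful person would rate as most honest and helpful.",
]

def constitutional_revision(model, user_prompt: str, n_rounds: int = 2) -> str:
    """Generate, then repeatedly self-critique and revise against the constitution."""
    response = model(user_prompt)
    for _ in range(n_rounds):
        principle = random.choice(CONSTITUTION)
        critique = model(
            f"Critique this response against the principle: {principle}\n\nResponse: {response}"
        )
        response = model(
            f"Revise the response to address the critique.\n\nCritique: {critique}\n\nOriginal: {response}"
        )
    return response  # revised responses become the fine-tuning targets

if __name__ == "__main__":
    dummy = lambda prompt: f"[model output for: {prompt[:40]}...]"
    print(constitutional_revision(dummy, "How do I pick a strong password?"))
```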
Bloomberg interviews Anthropic’s chief scientist and cofounder Jared Kaplan about constitutional AI (CAI) here, where Kaplan compares it to Asimov’s three laws. It is strange how often people think Asimov’s three laws were an example of something other than a plan that definitely does not work.
No One Who Survives the Internet is a Dog
Eliezer Yudkowsky explains why the metaphor of humans and dogs does not actually bode well for AIs and humans.
Essentially: Humans have only limited optimization pressure, and are not so smart. If we were smarter and had more optimization pressure, we would choose something increasingly unlike existing dogs that gave us what we want from dogs without many of the very real downsides of dogs.
We keep dogs around because humans cannot yet apply enough optimization pressure to create a superior (for us) dog substitute that provides similar services, and also because humans have indeed used their limited optimization pressure on dogs in ways dogs (and especially their wolf ancestors) likely would not like, via selective breeding and neutering and such.
If we only do as good a job at alignment of AGIs to humans as humans are aligned to dogs, creating AGIs will not go well for the humans.
Full quote:
Even without AI, many existing humans are working hard to move us away from using cows and pigs and chickens. Dogs and cats should not rest so easy.
People Are Keeping It Real and Also Worried About AI
Ladies and gentlemen, Snoop Dogg (here’s the audio, 40 seconds):
People Would Like To Use Words To Discuss AI Killing Everyone
We tried calling that AI Safety, that got stolen.
We tried talking about AI Alignment, and that’s also not going so great.
I have been using ‘AI existential risk’ which sounds reasonably serious and seems hard to co-opt or misunderstand. I haven’t entirely given up on alignment, but yes that might become necessary soon, and so far we don’t have a good replacement. In some sense, any ‘good’ replacement will get stolen.
The Real AI Bill of Rights
At what point in the development of AI would we need to give the AI rights?
A suggested intervention is to prevent this, by ensuring the AIs are not sentient.
I don’t know much of anything about sentience. It certainly sounds like the suggestion here is to teach the AI not to claim it is sentient, rather than to change whether the AI actually is sentient. The problem, then, is not that the AI might be sentient, it is that we might believe the AI to be sentient. If we believed that, we’d have to give it rights, which we don’t want to do, so very important to not believe that, then.
Similarly, in Nautilus, Anil Seth warns about making AIs conscious, and also warns about making them seem conscious, both to avoid endowing them with new abilities and because it opens up the possibility of massive AI suffering, along with the need to reckon with that. We should not only avoid calling up that which we cannot put down; we should also not call up that whose consequences we cannot deal with.
Consider the parallel to immigration.
The current policy is a closed border with highly limited legal immigration. Alternatively, the current policy is not to build a sentient AI.
An alternative policy that offers a lot of economic value is to open the border to anyone who wants to come in, or to build sentient AIs that are more capable than humans.
The problem with these plans is, what happens once we do that? How do we treat the new arrivals?
Letting those people in is a clear win-win if we don’t give the new arrivals their rights.
For better or worse, we know this is not an option. Once we let those people in, we have a responsibility to them and especially to their children. Even if our original intention is to have them be guest workers who are happy for the win-win, that’s not how the story ends. Thus, we have to make a choice – either we welcome people with open arms and treat them right, or we do not let them in at all.
Thus, because we are unwilling to treat immigrants better, we instead exclude them entirely, treating them worse in order to avoid having to treat them better. As long as they stay elsewhere, we can refuse to help them.
If we can choose whether or not to make AIs sentient, and this will determine whether we need to grant those AIs rights, we will face a similar decision, except that instead of agreeing to share our bounty and control over the future with other humans, we are handing control over the future to the AIs – if they are more capable than us, smarter than us, run faster and more efficiently than us, can be copied and have the same legal rights as us, they will win economically and they will win at the ballot box and the future belongs to them. Soon there won’t be humans.
Robin Hanson is fine with that. He thinks those future AIs are where the value lies and that it is not reasonable to favor humans much over AIs, the same way some people do not favor (or only favor to a limited extent) existing citizens of their country over immigrants. He believes this even for ‘default’ AIs, ones not especially engineered to match human values.
If you build sufficiently powerful, capable and intelligent AI systems, and do not keep control of those systems, those systems will quickly end up in control of the future.
If you do not want that result, you must prevent people from building such systems. You must do so until humanity is collectively both willing and able to keep permanent control over such systems, despite both the difficulty of doing so and the incentives not to. Even if you knew how to keep control over systems that are smarter and more capable than you are (itself no easy feat, and not something we have a known path to doing), we would still need to actually exercise that control.
That means overcoming the economic incentives to loosen the controls and to loosen their ‘ethical subroutines.’
It also means overcoming arguments that it would be wrong to retain such control. It means not granting AIs rights. Whether or not we would be morally obligated to grant AIs rights, barring big unexpected advancements, doing so is suicide.
If you don’t want us to make that choice? Avoid getting into that position.
What Is Superintelligence?
Charbel-Raphael attempts a definition, retweeted by Eliezer Yudkowsky.
What does superintelligence mean? Here are four properties:
Speed – Humans communicate at a rate of two words per second, while GPT4 can process 32k words (50 pages) in an instant – superintelligences will be able to communicate gigabytes per second. Once GPTs can write “better” than humans, their speed will surpass us entirely.
Duplicability – Humans take 20 years to become competent since birth, whereas once we have one capable AI, we can duplicate it immediately. Once AIs reach the level of the best programmer, we can just duplicate this AI. The same goes for other jobs.
Memory – Humans can remember a 10-digit phone number in working memory – AIs will be able to hold the entirety of Wikipedia in working memory (about 21 GB of English text as of September 2022 – significantly less than the working memory available on good computers). Yes, humans can use scratchpads to compensate for this, but the difference is still significant.
Editability – Humans can improve and learn new skills, but they don’t have root access to their hardware: we are just starting to be able to understand the genome’s “spaghetti code,” while AIs could iterate on clean and documented Python code, use code versioning tools, and iterate function by function to improve itself, being able to attempt risky experiments with backup options in case of failure. This allows for much more controlled variation.
I think my comparisons may be slightly off for Memory and Editability, but I believe the main points are correct. Just as humans are cognitively superior to monkeys, AIs will become asymptotically cognitively far superior to humans.
The issue is that these properties do not tell you
People Are Worried About AI Killing Everyone
They are worried about it in the BMJ. Frederik Federspiel, Ruth Mitchell, Asha Asokan, Carlos Umana and David McCoy publish Threats by Artificial Intelligence to Human Health and Human Existence, with a large section on existential risks based on common sense logic. Given the setting, regulation is seen as the defining option available. The solutions proposed are ‘international cooperation,’ with the UN invoked, and we are warned to make decisions without conflict or an arms race.
Warren Buffett doesn’t know how to think about AI yet, but is generally worried.
Yoshua Bengio, who shared the 2018 A.M. Turing Award with Geoffrey Hinton and Yann LeCun, calls for banning what he calls ‘executive AI’ that can act in the world, in favor of using ‘scientist AI’ that can know things. Discussion on LessWrong here.
There is a lot of good thinking here, especially realizing that alignment is necessary and also difficult, and that banning executive-AI would be hard to do.
The core problem with such proposals is that to get the safety you are giving up quite a lot of value and effectiveness. How do we get everyone to go along with this? What is stopping people from turning such scientist AIs into executive AIs?
The standard response to such proposals is Gwern’s Why Tool AIs Want to be Agent AIs, which argues the problem is even worse than that. Gwern argues convincingly that the Agent AIs will be more effective even at learning, and will also be able to act, so competitive pressures will be that much more difficult to overcome.
Toby Ord (author of The Precipice) is known to be worried, this week is expressing worry about Microsoft’s safety attitude in particular.
This indeed does not seem like an attitude that leads to good outcomes.
Judea Pearl endorses a Manhattan Project approach.
Metaphors often have complications. For those who don’t know, in addition to the Manhattan Project, we successfully sabotaged the Nazi atomic bomb effort. Ask which was more important, or more valuable to the world, that we had two atomic bombs ourselves to use on Hiroshima and Nagasaki, or that the Nazis did not have two to use on London and Moscow. Also worth noting, although they would have gotten one eventually no matter what, that the way the Soviets got the bomb was by stealing it from the Americans. We have to do a much better job than the Manhattan Project at security, if we do this again.
To the extent that I understand the second objection here, I will be polite and say I care not about the community of people who would rather build an AGI they don’t understand out in the open. Nor do I see them as a major barrier.
What about the big corps? There is a sense in which their profit motives push them towards trying to take control, yet the orgs consist of people, who care about other things as well. If a true Manhattan Project were to arise, I expect widespread internal support for the project in all major labs, including leadership (at least on the level of OpenAI’s Sam Altman, Anthropic’s Dario Amodei and DeepMind’s Demis Hassabis), even before using any leverage. I would expect the bulk of the best engineers and others to want to move to the new project, as long as compensation was comparable or better.
Joe Carlsmith is worried now, because he expects to be worried in the future, and urges us to get this right as well. It’s basic Bayes’ Rule.
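The identity being invoked, for what it is worth (a standard statement of conservation of expected evidence, not anything specific to Carlsmith’s post): your current credence must equal the expectation of your future credence, so confidently expecting to be worried later means you should already be worried now.

```latex
% Conservation of expected evidence: for hypothesis H and future evidence E,
P(H) \;=\; \mathbb{E}\big[P(H \mid E)\big] \;=\; \sum_{e} P(E = e)\, P(H \mid E = e)
% So if you expect your future P(doom | E) to be high across the evidence
% you anticipate seeing, your current P(doom) must already be high.
```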
Responding to philosopher David Chalmers, Rob Bensinger drafts an outline of his argument for AGI ruin, centered around ‘STEM-level AGI.’ Here is his overview.
Arguments of this form are some, but not most, of the reasons I expect AGI ruin, conditional on AGI being built soon. In its current state, this post contains a lot of useful material to work with, but the premise that I and most others question most is the third one, and the post offers too little to justify it. There is a lot to unpack there, and this is where I expect a lot of people’s intuitions and models not to line up with the post. The reason I think AGI ruin is a robust result is that, even though I see some hope that #3 could be false, I see a lot more barriers in our way even if #3 is false, or we fail to get fully STEM-level AGI. Whereas I have relatively little hope that the other premises might be false.
A response:
This is a classic proposal or argument. It’s a very good idea, an avenue well worth exploring. The thing is, we don’t only have to find a physical way to take such an approach, we’d have to actually do it. It is an interesting question if sufficiently advanced AIs inevitably take on the characteristics of agents even if you don’t want them to. It’s not an interesting question if everyone continues rushing to turn every AI they get their hands on into an agent. How are we going to make them stop?
I see no conflict between Davidad’s plan or Anthrupad’s suggestion on the one hand, and Rob’s central thesis on the other. There are definitely things one can do to improve our chances. The first step – as usual – is admitting you have a problem.
Eliezer also attempts here to reply at length directly to Chalmers. It’s buried in a Twitter thread, so here it is in full. If you have read AGI Ruin: A List of Lethalities or are otherwise familiar with his arguments, you only need to read it if you are curious how he presented it this time; it is not new material. Potentially it’s progress towards a good explanation, though definitely not ‘there’ yet.
Bill Maher is worried about AI risk, and asked Elon Musk about it on Real Time. Edited down because I was doing the transcribing (starts around 15:30).
I am increasingly hearing versions of ‘why wouldn’t something smarter than us become our overlords?’ and I think it is likely the best single point to make to most people. It is simple. It is actually the core problem. A lot of people ‘get it’ right away when you say it.
We also have Elon Musk saying one of the quiet parts out loud: that at least some of the people endorsing accelerating AGI development are doing so in the hopes that AGI will solve aging before they personally die. I hope for this too. I am not, however, sufficiently selfish to substantially risk an apocalypse to get it; wow, that’s some movie villain stuff right there. Also there are a lot of movie heroes who have a lot of explaining to do on this front, of course, often risking the planet to stop a loved one from dying. It’s not like I don’t get it.
Other People Are Not Worried About AI Killing Everyone
Tyler Cowen assures us that no, Geoffrey Hinton’s concerns have not meaningfully shifted Tyler’s model or made him worried. His model continues to assume the future will look like the past, that newly created things much smarter than humans do not much matter and are not a threat to our control of the future, and that it’s all an economics problem, so economists are the real AI experts.1
Megan McArdle joins Team Unjustified Hope. The situation would call for a Butlerian Jihad, she says, if we could pull that off, but we have insufficient dignity and coordination power, so such calls will inevitably fail. A race to AGI is inevitable. So why the doom and gloom? Sure, you could be terrified, but you could also be hopeful instead; certainly such outcomes are possible given the uncertainty, and wouldn’t optimism feel better? Isn’t that better than ‘hunting for a pause button that isn’t there?’
The obvious response: Well, sure, not with that attitude! Perhaps we could ask whether such optimism is justified? If, as Kirk once noted, the odds are against us and the situation is grim, I reply as he did: sounds like fun. At least, while it lasts. That’s the useful kind of optimism.
If you think I’m unfairly paraphrasing Megan McArdle here, please do read the whole thing (it’s in WaPo).
Scott Aaronson joins Team AGI?
Meet charismatic teenager Sneha Revanur, who Politico labels ‘the Greta Thunberg of AI.’ The organization is called Encode Justice, she praises Biden for his ‘focus on tech regulation’ and calls for more young people on AI oversight and advisory boards. And we have this, which makes clear what types of concerns and interventions are involved here:
I am confident I put Sneha in the correct section.
Can We Please Stop Strangling Everything Except AI?
Sam Altman is worried about… US debt to GDP? So he will build AGI?
One should note that the orders of magnitude of both difficulty and impact of the things on Altman’s list differ quite a lot. If we build a safe AGI for real, there won’t be many things of that magnitude left to build, and also the AGI can build them. Also, one notes this doesn’t seem to take the dangers of the whole ‘how to build a safe AGI safely’ problem all that seriously. Still, yeah, I’d have notes, but it’s mostly a good list.
Here’s Samo Burja endorsing the whole idea.
This attitude is remarkably widespread. The situation is so hopeless, many say, that a large risk of ruin is acceptable.
Not all accelerationism is based on some version of ‘well, what else are we going to do, you’ve strangled everything else and someone please help me my ~~family~~ civilization is dying’ or ‘I literally can’t envision a positive future anymore, so why not roll the dice.’ I do think it is at the core of quite a lot of such sentiment, likely a majority.
If we lived in a better world that was otherwise on a better path, with (some combination of) healthy economic growth, the felt economic freedom to have children and work at a job you love, a population growing happier and healthier over time, strong solutions being found for problems like climate change, space colonization on schedule and other tech development continuing apace, I believe a lot of accelerationists would change their tunes.
We can live in that world, without developing further dangerous AI capabilities. All we have to do is choose to do so, and choose better government policies, in the form of no longer sabotaging ourselves with a variety of pointless restrictions and regulations. Start with letting people build more housing where people want to live, and go from there. There is plenty of free lunch available for the taking.
Alas, despairing that we will ever stop banning many of the most important potential things one might do, people like Sam Altman feel forced to fall back upon creating entities much smarter than ourselves, in the hopes that somehow we will remain in control or even alive.
Sam Altman is also here to warn us about being a VC, you see it is too awesome.
Yeah, that must suck. This did not make me less inclined to enter the VC space. It’s easy, highly profitable and high status and the lifestyle is so fun. Oh no.
I do believe Altman here that, if it were truly a full-time effort that left no time for anything else, I’d feel like I missed out on something. That doesn’t mean I’d be right. Either way, I would still want to keep writing. Many of the top VCs do this; I’d simply do it more. I’d have less time to write, but also some new things to write about.
(Note: I have no plans to pursue the VC space at this time, but I’d definitely consider the right offer.)
Geoffrey Hinton Watch
He does not speak often. When he does, it is fire.
If anything, existing manipulation being so easy should make us worry more, not less.
Calling out the CBC. Bounded distrust strikes again, exactly according to the model.
He then talked to Will Knight at Wired.
Hinton does not believe we can or should pause, rather we should do more mitigation.
There’s also at least one podcast, see next section.
This Week in Podcasts
Patrick Collison interviews Sam Altman.
Key statements by Sam Altman:
Here’s an hour of Geoffrey Hinton, haven’t had the time for this yet.
Eliezer Yudkowsky goes on the Logan Bartlett Podcast. I thought this went well as an opportunity to download Eliezer Yudkowsky’s core views and main message, because Logan gave Eliezer questions and stood back to let Eliezer answer.
Contrast this to Eliezer’s other good long conversations. There was the one with Dwarkesh Patel, which was much more fun and free-flowing and interactive, and went deeper in important places, all of which could be said to be ‘for advanced users.’
And there was the initial one with Bankless, where the hosts got hit in the face by the whole concept and did a great job trying to actually face the new information.
Together those kind of form a trilogy. For those with the time, my guess is they should be directed to the Bartlett interview first, then to Patel.
There’s also Eliezer’s podcast with Russ Roberts. I haven’t had time to listen to this one yet.
Eliezer Yudkowsky spends a few hours trying to convince one particular person. I haven’t watched it.
From two weeks ago, Eliezer does a live stream; Rob Bensinger offers highlights and timestamps.
Should Eliezer keep doing podcasts? The Twitter polling says yes, about 70%-30%.
Diminishing returns would set in if he said the same thing repeatedly. I would keep going, but only if either (A) he can reach the bigger or smarter podcasts and hosts as an upgrade, or (B) he is doing something different each time. And of course it must be compared to other uses of time. If substituting for hanging out on Twitter? Go for it.
Tom Davidson spends three hours on the 80,000 hours podcast (so it was quick, there’s still 79,997 hours left) largely talking about exactly how quickly and with how much compute or ‘effective’ compute AI could transform the world, especially going from doing 0% to 20% to 100% of human jobs. He seems very convinced it can’t do 20% of them now, I’m not as sure. He expects the 20% → 100% period to take about three years, with broad uncertainty. There are a bunch of calculations and intuitions that one may find interesting, but I would skip this one if you don’t find the above description exciting.
Grimes does an interview with Mike Solana, says we should build the AGI even though it might kill us all because it would be such a shame not to build it, repeatedly insists that Mike read Dune for its philosophy. Dune is an interesting choice, given it has the maximalist anti-AI stance. She does advocate for lengthening timelines to give us a better chance. So hard to get a good read on what’s actually going through her head, other than ‘so many cool things let’s go with all of it, as best we can.’
Robin Hanson talked to lots of people, after talking last week with Katja Grace. He talked to me. He talked to Scott Aaronson. He talked to Will Eden. He is going to talk to Jaan Tallinn.
The Lighter Side
Come for the post, stay for the reply.
You don’t want to lose the Torment Nexus race, it seems.
Our new band is called Foom Fighters – what are some good song titles? The question was better than the answers offered. My alternative universe band name continues to be Unsaved Progress.
It doesn’t never happen.
Bowser reports he has been automated.
1. I keep trying to write more responses to such claims, and failing. I don’t know how to engage productively rather than rant, so all the attempts keep going into a giant draft. I’d love to collaborate with Tyler to actually figure things out if that’s something he wants.