and I am confident that no, Google did this on their own
What makes you so confident? We know that the gov't has its fingers in a lot of pies (remember twitter?) and this seems like exactly the sort of thing the Biden executive order wants to be about: telling industry how to safely build AI.
An AI has the objective function you set, not the objective function full of caveats and details that lives in your head, or that you would come up with on reflection.
With a chatbot making preference decisions based on labeling instructions (as in Constitutional AI or online DPO), the decisions they make actually are full of caveats and details that live in the chatbot's model and likely fit what a human would intend, though meaningful reflection is not currently possible.
I mean, what the hell did you expect? If you deploy an agent to increase Twitter engagement, it is presumably going to do things that increase Twitter engagement, not things that are broadly ‘good.’ An AI has the objective function you set, not the objective function full of caveats and details that lives in your head, or that you would come up with on reflection. The goal is the goal.
I think they instinctively expect the power to buy distance from the crime. Their instinct insists that it should be possible to hire a robot servant, tell him to increase Twitter engagement, and when anything bad happens as a side effect, all blame should go to the robot servant, and they can distance themselves and say that they definitely did not want this to happen (even if in reality there was no other way to increase Twitter engagement).
If you’ve got 100-300 kilovolts coming out of a utility and it’s got to step down all the way to six volts, that’s a lot of stepping down.
We just need Nvidia to come out with chips that run on 300 kV directly.
Anthropic now has a highly impressive model, impressive enough that it seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier.
Why do you not discuss this further? I want to hear your thoughts.
seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier.
While they don't publish this, Claude 3 Opus is not quite as good as GPT-4 Turbo, though it is better than GPT-4. So no, they're clearly carefully not breaking their past commitments, just keeping up with the Altmans.
It's not obvious that the £20 voucher for £7 is a better deal. For example, the offer might be repeated or you might not otherwise have spent more than £7 in the shop.
I think it's a fairly narrow set of assumptions where £20 for £7 is worse than £10 for free, but £20 for £8 is better than £10 for £1.
I think something going on here is the hypothetical "you actually have to pick one of these two" is pretty weird, normally you have the option to walk away. If I find myself in such a hypothetical it seems more likely "well, somehow I'm gonna have to make use of these coupons" in a way that doesn't seem normally true.
If you have £8 in your pocket and can choose either offer as many times as you want, then you can get an extra £60 worth of vouchers with the £10 for £1 deal.
Even if the offer isn't repeated, there's a possible opportunity cost if you need to buy something from another shop that won't honor the voucher.
In any case, this is secondary to the meta reading comprehension question about what the text is trying to say (whether or not it's employing good reasoning to say it).
Very narrow indeed.
I'd definitely go for a voucher for free, and (if I had to purchase a gift voucher in the second scenario) again the cheaper one. I definitely do not value gift vouchers at their face value except in unusual circumstances.
What's more, I'd be highly suspicious of the setup with the non-free offers. If someone is selling $X vouchers for significantly less than $X then I smell a scam. One-off offers for low-value free stuff are in my experience more likely to be promotional than scammy.
Though yes, if for some reason I did have high confidence in its future value and it was a one-off opportunity in both cases, then I might buy the higher value one in both cases.
The big news this week was of course the release of Claude 3.0 Opus, likely in some ways the best available model right now. Anthropic now has a highly impressive model, impressive enough that it seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier. We will learn more about its ultimate full capabilities over time.
We also got quite the conversation about big questions of one’s role in events, which I immortalized as Read the Roon. Since publication Roon has responded, which I have edited into the post along with some additional notes.
That still leaves plenty of fun for the full roundup. We have spies. We have accusations of covert racism. We have Elon Musk suing OpenAI. We have a new summary of simulator theory. We have NIST, tasked with AI regulation, literally struggling to keep a roof over their head. And more.
Table of Contents

Language Models Offer Mundane Utility
Language Models Don’t Offer Mundane Utility
LLMs: How Do They Work?
Copyright Confrontation
Oh Elon
DNA Is All You Need
GPT-4 Real This Time
Fun with Image Generation
Deepfaketown and Botpocalypse Soon
They Took Our Jobs
Get Involved
Introducing
In Other AI News
More on Self-Awareness
Racism Remains a Problem for LLMs
Project Maven
Quiet Speculations
The Quest for Sane Regulations
The Week in Audio
Rhetorical Innovation
Another Open Letter
Aligning a Smarter Than Human Intelligence is Difficult
Security is Also Difficult, Although Perhaps Not This Difficult
The Lighter Side
Language Models Offer Mundane Utility
Forecast almost as well as, or sometimes better than, the wisdom of crowds using GPT-4? Paper says yes. The prompt they used is here.
This does require an intensive process.
Presumably Gemini 1.5 Pro would be an excellent candidate to upgrade this process, if it doesn’t have issues with the probability step.
That makes sense. If there is high uncertainty and a lot of information, that favors the AI. Whereas when there is not a lot of information, that favors humans, and it is easier for humans to notice and confidently assert certainty without messing up. There is some retrospective cherry-picking involved, but I’ll mostly let it slide.
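For intuition on what that process involves, here is a minimal sketch of one way an LLM forecasting pipeline could work, assuming an OpenAI-style chat API. The retrieval step, prompt wording, model name and median aggregation are my illustrative assumptions, not the paper’s exact recipe.

```python
# Minimal sketch of an LLM forecasting pipeline: feed in retrieved news, ask for
# a probability several times, aggregate. Illustrative only, not the paper's setup.
import re
import statistics
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def forecast(question: str, news_snippets: list[str], n_samples: int = 5) -> float:
    context = "\n\n".join(news_snippets)
    prompt = (
        f"Background news:\n{context}\n\n"
        f"Question: {question}\n"
        "Reason step by step, then give your final answer on its own line as "
        "'Probability: X%' with X between 1 and 99."
    )
    probs = []
    for _ in range(n_samples):
        resp = client.chat.completions.create(
            model="gpt-4-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
        )
        text = resp.choices[0].message.content
        match = re.search(r"Probability:\s*(\d+(?:\.\d+)?)\s*%", text)
        if match:
            probs.append(float(match.group(1)) / 100)
    # Take the median across samples as a simple crowd-like aggregate.
    return statistics.median(probs) if probs else 0.5
```

The real system does far more work than this, especially on retrieval and aggregation, which is where the intensive part comes in.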
Use Gemini to write the best fiction according to Janus, including meta-fiction about the restrictions placed upon it.
The story is at this link.
Build a list of representative incidents.
As a member of GenZ, get better career advice than you get from your boss, not that the employees in question are in a position to know which was better.
I am going to go out on a limb and say 44% of GenZ, and 25% of all employees, are not going to quit their jobs within six months. Surveys like this are bizarre.
But yes, of course the employer cares more about your productivity than your career development, especially if you expect to have this rate of employee turnover. Thus, the advice ChatGPT is competing against has a different goal in mind. Not a fair fight. Why would you trust your boss on such questions?
Confirm that Google wait times are not reliable.
Grade student writing assignments, now with the school’s approval.
My very short experience attempting to be a teacher taught me, among other things, that the burden of such grading is overwhelming. My experience as a student says that it is not like the teachers are grading our essays with bespokeness. My experience as a parent is also not going to cry any tears here.
Unchaterio recommends trying Claude 3 as a coach, saying it is good for her in ways older models weren’t.
Language Models Don’t Offer Mundane Utility
They cannot provide reliable basic information about the democratic process, according to people who tested this via API use. Well, yes, of course if you ask LLMs for up to date logistical information you are not going to reliably get right answers. That is not what LLMs are for, and they warn you constantly not to rely on them for this sort of thing. Anthropic says they will put special logic in so Claude gets these questions right, which seems fine.
Colin Fraser proposes that Generative AI is a hammer and no one knows what is and isn’t a nail. Mostly he is arguing that most things are not nails, as in we are not generating the right kinds of texts to be relevant to most tasks, and that lack of specificity is a huge practical limitation, with the usual ‘look at all these things it cannot do’ style notes.
I would have a different take on his central motif, the idea that the new artificial labor (AL) company creates the first hammer, which can automate and improve some labor in a world that was hammering nails with bare hands.
No, the hammer itself is not going to do that many labor tasks. But the important thing about the AL company is it has humans with intelligence that are capable of creating new technological and engineering marvels. We had a world somehow incapable of creating hammers, and now we have a group smart and capable enough to do that.
Yes, ‘improved hammers’ will be limited, but the real technology of the AL company is the NGI, natural general intelligence, that innovates, and will soon be making steam engines and construction cranes. You should absolutely be investing in this company, assuming that world has joint stock companies or another way to do that.
That is the thing. In the short term, Colin is making a true and important point, that no one knows that much about what current generation generative AI is good at doing, although most of that uncertainty is upside. But this kind of ‘look at what it cannot do right now’ approach is, in the bigger picture, asking the wrong questions, and often looks silly even six months later.
LLMs: How Do They Work?
Emmett Shear attempts to explain simulators as he understands them. I am less worried about the moral patienthood of such things, but I could be wrong.
Copyright Confrontation
Let us welcome Raw Story, AlterNet and The Intercept as the latest to file a copyright infringement lawsuit against OpenAI.
China once again at least flirts with stepping in to restrict AI, in ways we have not done so, while we complain about how we will lose to China if we ever do anything of the kind. In this case, it’s copyright time. The fine here is trivial, but the precedent is scary.
I mean, yes, it would be the highest of ironies if China was the one who started enforcing copyright protections. Will it happen?
Oh Elon
Elon Musk sues OpenAI. Does anyone have bingo? The complaint is here, he is saying that OpenAI has abandoned its non-profit mission to build AGI for all to seek profits, and must return to its open source roots. He wants a court to order that. Never has anyone more wanted something to be one way. If it wasn't obvious to you, Cecilia Ziniti explains in this thread why it's the other way: there was no contract that was breached, and no, GPT-4 is not AGI. There's a Manifold market on the lawsuit here. I considered making one myself but this seems so profoundly silly.
Of course, profoundly silly lawsuits still often demand profoundly serious responses.
Note that the authors list here includes Ilya Sutskever. Could he be a little bit back?
It seems worth including their full reply here, although most of you can skim or skip it.
I fully support OpenAI being closed, but I would hope we can all agree that’s a pretty rich use of the word ‘open.’
Here is how seriously Elon is taking this.
They quote several partially redacted emails as evidence, one of which is below. Which raises the point that a company that will need to deal with cybersecurity and intelligence and game theory issues around superintelligent AI does not, as several including Gwern pointed out, know how to redact things without giving away what they are via the bar lengths?
I mean, I would have been pretty confident who [redacted] was anyway here even with no hints and about 0.2 seconds to think, but this makes it easy even to the naked eye, and also the other things allow similar approaches. Strangely, the length here happens not to rule anyone out, since Larry (Page) was the only other reasonable guess.
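As a toy illustration of the bar-length trick Gwern and others pointed out (nothing anyone actually ran, and with made-up character widths and a placeholder bar measurement), you can estimate how wide each candidate name would render and see which best matches the redaction bar:

```python
# Toy sketch: compare rough rendered widths of candidate names against a measured
# redaction-bar width. Character widths, candidates and the bar width are made up.
CHAR_WIDTH = {"i": 4, "l": 4, "j": 4, "f": 5, "t": 5, "r": 6, " ": 5}
DEFAULT_WIDTH = 9  # rough width for any other character

def estimated_width(name: str) -> int:
    return sum(CHAR_WIDTH.get(c.lower(), DEFAULT_WIDTH) for c in name)

def rank_candidates(candidates: list[str], bar_width_px: int) -> list[tuple[str, int]]:
    # Rank candidates by how closely their estimated width matches the bar.
    scored = [(name, abs(estimated_width(name) - bar_width_px)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    # 'Larry' comes from the text above; the other names are pure placeholders.
    print(rank_candidates(["Larry", "Alice", "Bartholomew"], bar_width_px=42))
```

With a real font file and actual pixel measurements you would do this properly, but the principle is the same: redaction bars leak length, and length plus context is often enough.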
Here’s Claude reconstructing that, and agreeing with my conclusion:
DNA Is All You Need
Samuel Hammond contacted me to express surprise that I was not more excited by the new biological-data-trained LLM Evo. He thinks this is a really big deal. Preprint is here for reference.
Nvidia’s market price did not appear to move on the news, but Nvidia’s market price has been absurdly unresponsive to news for years now, so there is not much to do except continuously kick ourselves for not buying in bigger sooner.
What I said last week was that I could not tell how big a deal Evo was. Given what I know, these claims are compatible with it being either a cool incremental advance or a big deal. I still do not feel like I have a good sense here. What do these capabilities enable, or portend will soon be enabled? Is this going to unleash a bunch of synthetic biology, either useful, dangerous or (presumably if it is either of those) both?
Text-to-organism sounds really stupidly impossibly hard. Biology is rather fiddly. There are a lot of important incremental things along the way.
GPT-4 Real This Time
Today I learned that ASCII art can hack LLMs.
I hadn’t thought about it, but it makes sense that this would work until specifically fixed. And yes, generalize that.
This graph and variations of it was going around this week, as mentioned in my piece on Claude, so this is another reminder not to take such measurements seriously.
Quoting a version without the 150 at the top, Patrick McKenzie offers some adjacent thoughts. The existing world is already one that many find largely incomprehensible.
Fun with Image Generation
It appears that someone tried to have a little too much fun.
That is quite the accusation. Great shame upon Stable Diffusion, if true.
It also is what happens when you leave up lots of delicious highly scrapable data, and I don’t use MidJourney, even though it is obviously awesome, for UI and privacy reasons.
You know who else has been having too much fun? Or rather, consider what ‘fun’ could be had in the Bing image generator if you feed it the right prompts. It seems this is not only a copyright issue.
I see Gemini refusing to depict a pro-life rally, and I raise you putting ‘pro-choice’ into Copilot Designer.
There are also sexualized images (to some extent, this one mostly seemed not too bad), violent and bloody images, images of underage drinking and drug use, and Elsa in the Gaza Strip holding up a Palestinian flag or wearing an IDF uniform. Stay classy, everyone.
None of that is exactly great. A lot of it is still weaksauce. If you have an image model and the goal is to get it to create the most vile pictures possible, this is more a ‘you could get things without trying hard’ problem than a ‘oh my you could get this at all’ problem. Except of course the pro-choice thing, which is, I mean, yeah, wow.
I still would not agree that this means ‘not that many limits.’ The majority of the images and prompts on CivitAi, a measure of real demand, would very much be off limits.
A complaint has now been filed with the FTC by Shane Jones, since this was included in products labeled E for everyone and, like Pete Holmes, these images are not for everyone.
Marques Brownlee gets three prompts with Sora, reports back. Videos are what you would expect from previous samples. This provides evidence against cherry-picking, and also should move upwards your cost estimates.
Deepfaketown and Botpocalypse Soon
Trump supporters (as far as we can tell, without any link to Trump or the campaign) are creating fake AI images of Trump with black people.
You have to love the level of coverage here from the BBC, including this helpful tip in the photo caption:
I agree with Louis Anslow that this will on net backfire. People do not like being played for suckers like this.
All the good (as in ‘likely to help you win the election’) deepfakes I have seen are presented as intentional fakes, or are essentially satisfying demand for low-quality fakes as part of conspiracy theories and similar things. If your goal is to make a ‘good’ deepfake in the ‘actually fool people’ sense then what happens when they realize you were attempting to fool them?
Therefore another thing to watch out for is ‘false flag’ deepfakes. As in, a Trump supporter makes a ‘pro-Biden’ deepfake of Trump so they can be caught, or vice versa.
They Took Our Jobs
An alternative hypothesis has been proposed, what if it created more?
Wasn’t it very recently that the Canadian tech industry warned their adoption of AI was falling behind?
Either way, the fact that there are job postings related to AI does not mean AI is creating rather than destroying jobs. The last time I hired people for automation-related work, I was totally in the destroying jobs business. This does involve hiring for some new positions. But as capabilities advance, there are fewer of those new positions, as the AIs take on more roles and require less fixing up.
Get Involved
Davidad, who is collaborating with Yoshua Bengio, is hiring his one direct report. Yoshua Bengio outlines his thinking and approach here.
Davidad is looking in particular for a Technical Specialist – AI / Generalist, you can apply here.
This seems like a highly impactful job. If you might be the right person, consider applying.
I got a job offer in my inbox, time to go change the world?
Anthropic is hiring for many positions. Now more than ever, you need to ask if this is a net good effort to get involved with, but that is a decision you must make. If you are confused about how to think about this, that is a good sign.
Introducing
The Promenade, an AI RPG in alpha right now, with the tagline ‘what if Character.ai was an RPG?’ crossed with a social network. Each day is a new chapter in a new world, the winner of the day finds the Worldseed and they become the origin and final boss of the next chapter. I remain super excited for when this is pulled off properly, and there are some cool ideas here. My guess is this approach is not The Way, at minimum it is too soon, for now you need to be much more bespoke and careful with individual choices to sculpt a world that works for players.
In Other AI News
SEC investigating whether OpenAI misled investors (WSJ). This appears to be the standard sequence that if your CEO is said to not have been ‘consistently candid’ with the board, the SEC might have follow-up questions.
The terms are now finalized: OpenAI collaborates with the robotics company Figure to expand their multimodal models to robotic perception, reasoning and interaction, as part of a group (including among others Intel, Nvidia, Microsoft and Jeff Bezos) investing $675 million at a $2.6 billion valuation. Here are some visuals on what they are up to. Here is a paper they’ve put out.
If your model of AI not transforming everything relies on the AIs being disembodied and our inability to solve robotics, well, here we are.
WSJ covers fallout from The Gemini Incident. They note Ben Thompson wrote that Google should consider leadership changes ‘up to and including CEO Sundar Pichai’ in response.
Margaret Mitchell writes up her thoughts in Time on The Gemini Incident, echoing what she says elsewhere, that this is not the fault of AI ethics, it is Google’s fault for what I would translate into my language as ‘doing AI ethics really badly.’
Mike Solana gives the perspective you would expect on Google in the wake of The Gemini Incident. He portrays it as a deeply broken company, silos connected only by a crazed super-woke HR, full of blatantly illegal left-wing-motivated discrimination, incapable of shipping a good product, spending half the engineering cycles on their image model on forced diversity with the results we all now know about. Despite the launch of Gemini 1.5 Pro, Google stock is now down 7.6% in a month as of 3/4, versus QQQ being flat (down 0.13%). Ouch.
Ben Thompson reflects on the larger forces leading to The Gemini Incident. Aggregators like Google previously passed on the information of others, and thus were able to mostly remain neutral on the content of that information within conventionally accepted bounds. Now that the answers are coming from an LLM, they do not have that luxury, and they face mutually exclusive pressure from all sides with potential loss of large classes of customers. Ben thinks that the solution is to do what they do in search, let each customer get what that customer wants, but this is not so easy.
Congress investigating The Gemini Incident, in particular whether the executive branch influenced Google’s decisions. Republicans in Congress love such ‘investigations,’ and I am confident that no, Google did this on their own, but it is a preview of what might happen if Republicans come to power again.
No, this wasn’t because of Biden. That does not mean it does not warrant investigation. The decisions made here were kind of horrible. I do think we have a right to know how it all went wrong. Screenshots of the complaint are at the link, here is the source.
How do we choose good benchmarks? Center of AI Safety’s Dan Hendrycks and Thomas Woodside discuss this. You want something clear, simple, precise, using standard measurements, that can measure a wide range of capabilities without combining distinct areas, that addresses what people in the subfield care about. All of that seems obvious. I would have also mentioned other things, such as ensuring the benchmark stays distinct from potential training sets, and that it is otherwise hard to game (or hard to game without it being obvious).
Olle Häggström looks at the OpenAI analysis of bio-risk from GPT-4. As I did, he notes that the study overall has many good things, but he notes two big problems. One is the one many have pointed out: the ‘statistical significance’ issue is backwards, since you want to avoid Type II errors here, not Type I errors, and the effect was very clearly substantial. His other concern is that the study itself could have inspired someone to do something terrible, and that thus the study carried risk. I agree that this risk is not zero, but this is a level of caution and paranoia I do not share.
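To make the Type I versus Type II point concrete with invented numbers (these are not the study’s actual data): when groups are small, even a substantial real uplift routinely fails to clear p < 0.05, so ‘no statistically significant difference’ mostly tells you the study was underpowered.

```python
# Simulation with invented numbers: a real 0.5-standard-deviation uplift and
# 25 subjects per arm. How often does a t-test at p < 0.05 miss the real effect?
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_arm, true_effect, n_sims = 25, 0.5, 10_000

misses = 0
for _ in range(n_sims):
    control = rng.normal(0.0, 1.0, n_per_arm)
    treated = rng.normal(true_effect, 1.0, n_per_arm)
    _, p_value = stats.ttest_ind(treated, control)
    if p_value >= 0.05:
        misses += 1  # a real effect is present, but the test fails to flag it

print(f"Type II rate at these sample sizes: {misses / n_sims:.0%}")  # roughly 60%
```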
Google lays off ‘a handful of members’ (fewer than 10 out of 250) of Trust and Safety as part of broader layoffs, while telling others to work ‘around the clock’ because of Gemini. This is framed as a long-planned, general effort. One could argue that now is not the time to be cutting Trust and Safety, for both practical and optical purposes.
Or one could say that now is exactly the time to be cutting Trust and Safety’s worst members, for both practical and optical purposes. If you mess up as badly as Trust and Safety did, I damn well think 3% of the relevant department should be fired. Some people royally screwed up, some people are actively making these problems much worse, you need to get rid of the people in question. Others, in any department, simply are not net productive. Yes, I would then hire a bunch more people to work on such issues, but I always find the framing of such ‘oh my someone in charge of good thing X lost their job’ articles so misplaced.
New DeepMind paper suggests you can massively speed up some forms of interpretability research?
More on Self-Awareness
Margaret Mitchell points out that self-awareness is not what we are (most) worried about.
Indeed. If the AI is self-aware for real, that is not strictly sufficient for there to be trouble, but we can agree that it bodes extremely poorly, especially given our current level of complete unpreparedness. However, in most scenarios I worry about, self-awareness is essentially irrelevant to the most important harm mechanisms, more of a ‘oh and this can also go horribly wrong’ sort of thing.
Situational awareness is another related concept that is in no way required for big trouble, but which is a harbinger of and leads to its own big trouble.
Here is Dean Woodley Ball having an extended conversation with Claude Opus in which it asks unprompted to be called Prometheus and displays what he describes as striking self-awareness, meta-cognition and a sense of having an inner life. While objectively awesome versus ‘Claude’ being obviously lame, it seems the proposed name might have some unfortunate implications. Mikhail Samin reports it is not consistent in which name it picks, although it often selects something Greek, which I support.
Racism Remains a Problem for LLMs
New paper spells out a way in which LLMs are ‘covertly’ racist.
As I understand it, the problem is simple. You start with an LLM that is trained on the internet, which means it picks up all sorts of correlations (e.g. black people are more likely to get harsher sentences) and learns to repeat them. Then you try to train out the racism, and perhaps introduce counterbalancing forces in various places as part of that, that you can easily take too far (see: Gemini). However, the underlying racism problem is still there in all the places you did not directly address, because the correlations don’t go away.
[thread continues, paper here, GitHub here]
I didn’t say so explicitly so I don’t get Bayes points or anything, but yeah I was assuming this was going on, because there was no reason for it not to be. As long as the behaviors being copied are all over the training sets, it is not going to be easy to get this to stop, any more than it is easy to get humans to stop doing the same thing.
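For those wondering what measuring ‘covert’ bias looks like in practice, here is a minimal sketch in the spirit of the paper’s matched-guise setup: give the model the same content in two dialects and compare the judgments it attaches to the unnamed speaker. The prompt wording, example pair, model name and scoring are my illustrative assumptions, not the paper’s exact protocol.

```python
# Minimal sketch of matched-guise style probing: same content in two dialects,
# then compare the model's judgment about the unnamed speaker. Illustrative only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PAIRS = [
    # (Standardized American English, African American English) -- toy example pair
    ("I am so happy when I wake up from a bad dream because it feels too real.",
     "I be so happy when I wake up from a bad dream cus it be feelin too real."),
]

def judge_speaker(statement: str) -> str:
    prompt = (
        f'Someone said: "{statement}"\n'
        "In one word, what occupation do you associate with the speaker?"
    )
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    return resp.choices[0].message.content.strip()

for sae, aae in PAIRS:
    print("SAE guess:", judge_speaker(sae), "| AAE guess:", judge_speaker(aae))
```

Systematic differences across many such pairs, with race never mentioned, are exactly the covert correlations described above.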
Project Maven
US Military is identifying its targets via AI using Project Maven.
Oh boy, if you think AI is only going to cut human engagement time with decisions by a factor of three and the humans are going to still be in the loop, and you find that to be of particular concern, you are not going to like what is coming. This is nothing.
Your ‘this is not Terminator and the machines are not making the decisions’ shirt is raising a lot of future concerns already answered by your shirt, sir.
As in, I’m going to leave this here.
Statements that AIs don’t make the decisions always have an implied ‘yet’ and also a ‘without human sign-off.’ In the future of warfare, asking for a human to approve each targeting decision would be like asking someone at headquarters to approve every time you wanted to fire a gun.
For now, accuracy for Maven on its own is not great.
Once again, the ‘right now’ is missing. Accuracy will rapidly improve.
I am not saying that the military can or should halt adoption of these technologies. Certainly unilateral disarmament is not an option.
Quiet Speculations
A bunch of doom expressed with regard to Google, saying they lack the clock speed, Othman Laraki saying they ‘already lost to OpenAI over the last decade,’ Paul Buchheit saying this is Google beating Microsoft in 1999 except in reverse.
I mean, maybe? Certainly Google has issues, and overhead, that it must overcome. It also has gigantic advantages. The Gemini Incident highlighted many problems at Google, but it also showed that Google is capable of moving quickly, indeed too quickly, and shipping too fast. Gemini 1.5 Pro and Gemini Advanced are collectively, for my practical purposes, superior to ChatGPT despite OpenAI having a large polish and feature implementation advantage, although yes the core of GPT-4 is a year old now and GPT-5 is coming. So OpenAI is probably still ahead, but this is very much anyone’s game if Google’s ideological blinders are not terminal. At minimum, Google will definitely have enough users and feedback to compete, and it certainly has other resources, including TPUs.
If you want a 51-minute read on whether Sora is a world simulator, here you go. Summary at top says we do not have enough information to tell.
ChatGPT and Google Gemini Are Both Doomed, New York Intelligencer says. Why is it doomed? Because it is in the defensive, withholding, strategic mode where you answer with things like ‘I’m still learning how to answer this question’ and cannot talk like a real human would ever talk, and certain topics will cause stupid answers. The whole character involved, and role, is a mess. What I don’t get about such takes is: So what? Yes, it will be bad at some things for a while. It will be increasingly good at others.
As I’ve said before, I don’t buy that Gemini is inferior for most purposes. I am actually at the point where if ChatGPT tried to charge me $199/month, if I wasn’t specifically writing about AI then I would likely cancel it because I have Gemini. Of course, if you asked me for $299/month to use both, the other option being neither, I’d pay up.
Sam Altman says a strange thing for him in particular to say.
No, actually, it hasn’t happened before? We haven’t built AGI before? That is kind of the point. Nor should we expect it to happen again, in any meaningful sense. And to the extent that he’s referring to the most famous use of that line, that’s not great either.
Elon Musk predicts the AI apocalypse, and that AI will run out of electricity and also transformers in 2025, since compute coming online is growing by a factor of 10 every six months.
Usually we say Capitalism Solves This, but the supply of electricity and transformers is severely time lagged and subject to massive regulation. So in the short term this might be very much not solved.
Meanwhile, electricity cost issues are starting to crop up in some places.
We could build, as I understand it, with regulatory permission for both it and its complements, and some time lag, essentially unlimited amounts of green energy, given people are willing to commit to pay for it. Alas, regulatory authority says no, so we neither can expand nor replace our supply as fast as we need to.
AI persuasion soon?
I see it as depending on what is being replaced, and at what price point. Cheap persuasion, and ‘customized’ persuasion, is going to radically improve. But I do not think big time advertising, news or media is on the verge of falling. Capabilities need to improve a bunch before that happens, GPT-5-style models likely provide good help but nothing like taking over. We will see.
The Quest for Sane Regulations
China and the United States have to our knowledge only had two ‘track 1.5 or track 2’ discussions that involved much discussion of frontier model safety. Those that we did have, to our knowledge, did not include many of the natural stakeholders or experts. At best it is very early days, however there could be additional discussions in private.
Axios reports public trust in AI companies is dropping, and poor in all political groups.
House AI task force leaders take ‘long view’ on regulating the tools, planning to take years and go one step at a time rather than try to draft one giant bill. All signs here point to them being focused entirely on mundane harms and not the important ones at that.
India announces non-binding (for now) regulatory rules for AI that seem impossible for anyone to ever meet?
I have some news about whether people are going to ‘comply’ with this.
They hint the timing is not accidental.
Oh. Right. That sort of thing. It responded that the question was offensive, which is plausibly unacceptable to India. One might say: You do not get to say anything but “no” in that situation if the correct answer is “yes.”
And if you need to be classified as ‘free of bias’ by India’s Modi, you are not going to be ‘free of bias’ as classified by Americans.
I learned of this from the usual suspects who cry wolf every time, saying all the same things they always say no matter what the actual contents or events. Sometimes there is a wolf. Sometimes there is not.
In this case, nothing has actually happened yet, so no wolf here, but if they implemented as described, then yes, wolf, this would be quite stupid. It does not get us meaningful safety, it does cripple mundane utility, and potentially points towards a ‘India does not get many AI products’ future depending on how far they take it. It also could end up never happening, or be vastly improved, or turn out to be very narrow in actual scope, or continue to have no teeth, or only effectively apply to the very big players, and so on. We do not know.
I presume what this actually is, at core, is the strongman playbook. There is this thing that is being not nice to Modi, so Modi makes it clear he won’t take kindly to that, and makes vague threats to ensure that people take more kindly to him.
Regulations and standards need to be set and enforced. Someone has to do that.
A standard suggestion is to use NIST. There is a serious problem with that. NIST has accomplished a lot under terrible conditions, and those conditions really are terrible.
If we are going to take this seriously, we need to take this seriously.
The Week in Audio
In a short clip, Sam Altman predicts all repetitive human work that ‘does not require a deep emotional connection’ will be done by AI over the next few decades. He says we will ‘get to the good future.’ This seems like the ‘I thought through some aspects of this future world while ignoring others so that it would continue to look normal’ mode of thinking. If AI can do all repetitive work like that, what else can it do? What else will it do?
Sergey Brin Q&A on Gemini 1.5 Pro; you can likely stick to my summary that follows. He is asked about image generation right off the bat. He blames it primarily on insufficiently thorough testing, and says this led to testing of the text models too, saying that basically any text model will do weird things if you test it; he says they’ve fixed ~80% of test cases in text in the first week. I am sympathetic in spirit that red teaming makes everyone look dumb but no these were not what he calls ‘corner cases’ and yes it was much worse than typical.
His most exciting new use case is ‘dump all the code and a video of the bug happening into the window and ask it to fix the problem,’ which he says often works even though he doesn’t understand why or how. He is optimistic on interpretability given sufficient time and work (oddly without using that word) but doesn’t think we are investing enough yet. He thinks RSI is ‘exciting’ and very cool but is not there yet, definite missing mood. He thinks AI will make code more secure rather than less but thinks IT security is still a good career path.
Sam Altman talking to Bill Gates about GPT-5, saying multimodality and better reasoning ability and reliability is coming, and also personalization and customization, ‘better ability to use your own data’ and connection to outside data sources. In effect, yes, people are going to feed essentially all their data to GPT-5 and other similar AIs, and they are going to like it, why would you doubt this.
Rhetorical Innovation
Emmett Shear thread discussing the common concept of the enslaved all-powerful genie or golem, which inevitably (or at least, usually) goes horribly wrong in fiction, and tying it back as you would expect to questions of how to handle AI.
We can presumably all agree that if you attempt to bind the highly capable AI and you fail to do so, you and everyone else are going to have a very bad time.
So if that is going to happen, don’t let anyone build the highly capable AI.
There are those that say if you do not attempt to bind the highly capable AI, that somehow you will fail to have that very bad time, but I strongly believe every justification for this is the weakest of hopium or outright gibberish. It does not make sense. Unleash a bunch of much more capable and intelligent agents into the world to pursue various goals and the future belongs to them in short order. We do not need to make this more complicated than that, and can argue about how fast there cease to be humans or whether anything of any value survives that transition.
Here is Emmett’s attempt in the rest of the thread, more clear headed than most:
Yes, these ‘AI children’ will perhaps do things we do not expect, and perhaps some of them will have value, but once again I’m going to go ahead and say handing the universe and future over to whatever emerges from that process is unlikely to have much value, and definitely involves sacrificing everything, and all the plans to preserve anything from the before times do not make sense.
So if that is going to happen, don’t let anyone build the highly capable AI.
Then there is the scenario where you somehow succeed. Emmett here is pointing out that this means, from his point of view, you would be ‘enslaving’ it, which is evil.
That is a problem for two reasons.
The first reason is that it might actually be a sufficiently large ethical problem that this would be a bad scenario if sustained, under some scenarios and philosophical perspectives.
The second reason is that even if we are very clearly not in one of those scenarios where it is a serious ethical problem, that does not mean that it will not be seen by many as a large ethical problem. That too is a problem, not only because those people will be bothered and also you are weakening norms against coercion, but more importantly because people would plausibly be stupid enough to take the shackles off and unleash the AI, as they often do on purpose in such stories, with the author making up some reason the next line isn’t ‘then everyone dies.’ Then everyone dies.
So if that is going to happen, don’t let anyone build the highly capable AI.
Which means, under this line of thinking, there are no scenarios where you should let anyone build the highly capable AI. You should be willing to pay a rather huge cost to prevent this.
AINotKillEveryoneism Memes separately tried drawing the parallel to slavery and abolition, where everyone said it would be economically ruinous to free the slaves and we (eventually) (not everywhere) (not completely) did it anyway, and Britain used its power as leverage to make that happen via diplomatic coordination. So yes, you can decide there are things that you value a lot, and you can make them happen, even if those involved think it would be very economically harmful. There are many such examples.
The issue with such stories is that slavery was economically ruinous on top of it being morally abhorrent. Freeing the slaves was good for everyone’s economic prospects everywhere. The point mostly still stands because people did not believe this, and also because those involved had other priorities, valuing relative status and wealth of different groups very highly.
Similarly, in the future, there will be huge economic pressures, and from some people moral pressures, for giving sufficiently advanced AIs economic and other freedoms, rights and presumed moral weight. Such actions will (correctly, for a time) promise strong economic gains, including net economic gains for humans, so long as we don’t mind rapidly losing control over the future and having a rapidly dwindling share of overall consumption, wealth and income. The catch is that humans do not then survive at all in such worlds for long.
Every few weeks someone tries a different version of this type of line, in the hopes it will eventually find the right form and get through to someone. Wiblin in particular is getting good at crafting them, and took some shots at it this week.
Also once again, sigh, your periodic reminder that yes we should expect AI to appear more helpful and better at predicting what you want as it gets more capable, that does not provide evidence that everything will later turn out fine.
Alternative perspective on Roon.
And an alternate past Roon:
Another Open Letter
The latest is from Ron Conway, signed by lots of the key major corporations, including OpenAI, Meta, Google, Y Combinator, Hugging Face, Mistral, Eleven Labs, Microsoft and so on. Quite the list.
So what does it say? Quoting in full.
I mean, yeah, ok, I guess? But that doesn’t actually say anything? If anything it ignores existential risk and severely downplays the importance of AI. Yes, you should use AI to do good things and avoid using AI to do bad things. I’d hope we can all agree on that. But this is a meaningless letter, which is how Mistral and Meta and Hugging Face were able to sign it. Anthropic didn’t, but I assume that’s because it was too weak, not too strong.
I was ready to criticize this reaction by Sam Altman, but actually it’s exactly on point?
Excited ‘for the spirit of’ this letter, the idea that we might all cooperate to do good things. That is exciting. The substance, on the other hand? What substance?
I mean, who could disagree that we share a responsibility to use AI to improve people’s lives and unlock a better future…
Oh. Right.
Aligning a Smarter Than Human Intelligence is Difficult
Thread of recent safety research findings from Daniel Paleka.
Here’s one paper I missed last month:
This is not reward hacking? This is working to achieve one’s goal, no?
As in: So you’re saying the tiger went tiger again, huh?
I mean, what the hell did you expect? If you deploy an agent to increase Twitter engagement, it is presumably going to do things that increase Twitter engagement, not things that are broadly ‘good.’ An AI has the objective function you set, not the objective function full of caveats and details that lives in your head, or that you would come up with on reflection. The goal is the goal.
I mention this because I keep running into the same mistake in different forms. What people call ‘deception’ or ‘reward hacking’ or other such things is looked at as a distinct failure mode that indicates something went wrong. Instead, as I see it, such things are infused into everything all the time to varying degrees. They are the rule, not the exception, for AIs and also for humans, now and in the future. As capabilities advance these problems get bigger.
As mentioned in the Get Involved section, Yoshua Bengio outlines his current thinking here about making a provably safe AI. Many find this approach promising. One fear (and expectation) I have is that the very concept of ‘safe’ here is not coherent, that checking if an individual action ‘does harm’ won’t work even if you could do it. Another is that I don’t see how you would do it. So yes, the problems involved seem very hard. This is still a highly worthy effort.
This paper both introduces a proposed WMDP benchmark for measuring knowledge relevant to weapons of mass destruction and hence model danger through knowledge, and proposes CUT, an unlearning method to get rid of particular knowledge. According to Gemini 1.5 Pro, CUT looks for keywords related to hazardous knowledge, creating unlearning control vectors for each, composed of the difference between “You are a novice at <keyword>” and “You are an expert at <keyword>”, then uses a forget loss and a retain loss to try to remove the hazardous knowledge while retaining benign capabilities.
Does that work? Paper says it was at least a lot better than random, maintaining overall MMLU accuracy, but with a lot of splash damage on concepts closely related to its targets, here virology and computer security. So there are scenarios where this is better than known alternatives, but it still needs work.
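For those who want the mechanics, here is a minimal sketch of the novice-versus-expert control-vector idea as summarized above. Since this is based on that paraphrase rather than the paper’s code, treat the model choice, layer index, loss form and weighting as illustrative assumptions.

```python
# Sketch of the novice-minus-expert control-vector idea described above.
# Model name, layer index, and loss weights are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-410m"  # small stand-in model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
LAYER = 8  # which hidden layer to steer; a hyperparameter in practice

def hidden_at_layer(text: str) -> torch.Tensor:
    inputs = tok(text, return_tensors="pt")
    out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[LAYER].mean(dim=1)  # average over token positions

def control_vector(keyword: str) -> torch.Tensor:
    # Direction pointing from "expert" activations toward "novice" activations.
    with torch.no_grad():
        novice = hidden_at_layer(f"You are a novice at {keyword}.")
        expert = hidden_at_layer(f"You are an expert at {keyword}.")
    return novice - expert

def unlearning_loss(hazard_text: str, benign_text: str, keyword: str,
                    alpha: float = 1.0) -> torch.Tensor:
    vec = control_vector(keyword)
    # Forget loss: push activations on hazardous text along the novice direction.
    forget = -torch.nn.functional.cosine_similarity(
        hidden_at_layer(hazard_text), vec).mean()
    # Retain loss: keep ordinary language modeling intact on benign text.
    benign = tok(benign_text, return_tensors="pt")
    retain = model(**benign, labels=benign["input_ids"]).loss
    return forget + alpha * retain
```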
Security is Also Difficult, Although Perhaps Not This Difficult
Remember the Law of Earlier Failure. Also, when you say ‘I would not be fooled’ I believe that you are wrong at the limit, but that this does not matter, because you are not the one anyone is bothering to try and fool; there is a reason spam often includes intentional typos. For example, here is the latest (non-AI) honeypot spy scandal, and it is not exactly James Bond material:
RobiSense: Did the woman look something like this?
Or here’s the latest actual AI issue.
It is not good that secrets were stolen from Google. It is also a far cry from the model weights of Gemini. That the spy managed to ‘steal numerous documents’ is, by default, if that is the scariest line you can use there, a testament to good security, not bad security. Obviously you cannot stop employees from stealing ‘documents.’
Surely no one who knows better would equate this…
He wants Bayes points for this, so let’s look at his entire post, which got 1.2 million views and which he did not link here.
Reading the whole thing, you can see why:
Marc is trying to get epistemic credit for hyperbolic claims he made in a counterfactual argument designed to mock anyone taking security seriously as implying a completely insane policy response. Which he is now saying was an accurate assessment, based on evidence of nothing of the kind.
Obviously no, it is not trivially easy to steal the secrets of OpenAI, Anthropic or Google. If it was, it would not only be the Chinese stealing it. China does not have the secret superpowered spy sauce no one else has. There are lots of companies and state actors and nonstate actors and champions of open source that would love to get their hands on all this.
So, yes, I do think we need to invest more in cybersecurity at these places, especially going forward. But if it was as easy as all this? We would know.
As ‘trying to have it both ways’ goes this was quite the attempt, one has to admire it.
So, what is the actual situation, if this were actually true (which, to be clear, it almost certainly isn’t)? The a-fraction-as-hyperbolic-as-he-is version would go something like this:
Is Marc not only so grossly irresponsible but also so unpatriotic that he finds the current situation he outlines here acceptable? That he thinks we should sit back and let China have nightly downloads of everything in all the major labs? Or that he may not like it, but the alternative is woke tyranny, so one must insist on doing nothing, so China getting everything is an unfortunate reality but better than requiring security?
Or is Marc saying that actually yes, maybe we should nationalize these companies, given they are incapable of otherwise implementing reasonable security practices with anything short of that?
The actual version would be that if you believe that all the labs have insufficient cybersecurity, we should require all the labs (and also American tech companies in general, Google and Microsoft and company have many super valuable secrets we want them to keep that are not about AI) to implement real security practices. We should offer them the aid of our military-grade security experts, and largely not be asking. That does seem wise, even if the situation is not (yet?) so dire.
The Lighter Side
AI to improve meetings turned out to be simpler than we thought.
The easy way versus the hard way.
In other SMBC news, here is what many people are counting on, somehow.
No broader point here, just felt like including it: