Upstarts will not defeat them, since capital now trivially converts into superhuman labour in any field.
It is false today that big companies with 10x the galaxy brains and 100x the capital reliably outperform upstarts.[1]
Why would this change? I don't think you make the case.
My favorite example, though it might still be falsified. Google invented transformers, owns DeepMind, runs their own data centres, builds their own accelerators and has huge amounts of them, and has tons of hard-to-get data (all those books they scanned before that became no
I don't expect it that soon, but I do expect, more likely than not, that there's a covid-esque fire alarm + rapid upheaval moment.
For people who don't expect a strong government response... remember that Elon is First Buddy now. 🎢
Okay, well, I'm not going to post "Anthropic leadership conversation [fewer likes]" 😂
(Can you edit out all the "like"s, or give permission for an admin to edit them out? I think in written text it makes speakers sound, for lack of a better word, unflatteringly moronic)
McNamara was at Ford, not Toyota. I reckon he modelled manufacturing like an efficient Boeing manager not an efficient SpaceX manager
(Nitpick: I'd find the first paragraphs much easier to read if they didn't have any of the bolding)
rename the "provable safety" area as "provable safety modulo assumptions" area and be very explicit about our assumptions.
Very much agree. I gave some feedback along those lines as the term was coined; and am sad it didn't catch on. But of course "provable safety modulo assumptions" isn't very short and catchy...
I do like the word "guarantee" as a substitute. We can talk of formal guarantees, but also of a store guaranteeing that an item you buy will meet a certain standard. So its connotations are nicely in the direction of proof but without, as it were, "proving too much" :)
Interesting thread to return to, 4 years later.
FYI: I skimmed the post quickly and didn't realize there was a Patreon!
If you wanted to change that, you might want to put it at the very end of the post, on a new line, saying something like: "If you'd like to fund my work directly, you can do so via Patreon [here](link)."
Someone posted these quotes in a Slack I'm in... what Ellsberg said to Kissinger:
...“Henry, there’s something I would like to tell you, for what it’s worth, something I wish I had been told years ago. You’ve been a consultant for a long time, and you’ve dealt a great deal with top secret information. But you’re about to receive a whole slew of special clearances, maybe fifteen or twenty of them, that are higher than top secret.
“I’ve had a number of these myself, and I’ve known other people who have just acquired them, and I have a pretty good sense of w
Someone else added these quotes from a 1968 article about how the Vietnam war could go so wrong:
...Despite the banishment of the experts, internal doubters and dissenters did indeed appear and persist. Yet as I watched the process, such men were effectively neutralized by a subtle dynamic: the domestication of dissenters. Such "domestication" arose out of a twofold clubbish need: on the one hand, the dissenter's desire to stay aboard; and on the other hand, the nondissenter's conscience. Simply stated, dissent, when recognized, was made to feel at home. On th
tbf I never realized "sic" was mostly meant to point out errors, specifically. I thought it was used to mean "this might sound extreme --- but I am in fact quoting literally"
I mean that in both cases he used literally those words.
Sic is short for the Latin phrase sic erat scriptum, which means thus it was written. As this suggests, people use sic to show that a quote has been reproduced exactly from the source – including any spelling and grammatical errors and non-standard spellings.
I was only familiar with sic to mean "error in original" (I assume kave also), but this alternative use makes sense too.
It's not epistemically poor to say these things if they're actually true.
Invalid.
Compare:
A: "So I had some questions about your finances, it seems your trading desk and exchange operate sort of closely together? There were some things that confused me..."
B: "our team is 20 insanely smart engineers"
A: "right, but i had a concern that i thought perhaps ---"
B: "if you join us and succeed you'll be a multi millionaire"
A: "...okay, but what if there's a sudden downturn ---"
B: "bull market is inevitable right now"
Maybe not false. But epistemically poor form.
(crossposted to the EA Forum)
(😭 there has to be a better way of doing this, lol)
(crossposted to EA forum)
I agree with much of Leopold's empirical claims, timelines, and analysis. I'm acting on it myself in my planning as something like a mainline scenario.
Nonetheless, the piece exhibited some patterns that gave me a pretty strong allergic reaction. It made or implied claims like:
[censored_meme.png]
I like review bot and think it's good
(Sidenote: it seems Sam was kind of explicitly asking to be pressured, so your comment seems legit :)
But I also think that, had Sam not done so, I would still really appreciate him showing up and responding to Oli's top-level post, and I think it should be fine for folks from companies to show up and engage with the topic at hand (NDAs), without also having to do a general AMA about all kinds of other aspects of their strategy and policies. If Zach's questions do get very upvoted, though, it might suggest there's demand for some kind of Anthropic AMA event.)
Poor Review Bot, why do you get so downvoted? :(
Because it's obviously annoying and burning the commons. Imagine if I made a bot that posted the same comment on every post on LessWrong, surely that wouldn't be acceptable behavior.
I was around a few years ago when there were already debates about whether 80k should recommend OpenAI jobs. And that's before any of the fishy stuff leaked out, and they were stacking up cool governance commitments like becoming a capped-profit and having a merge-and-assist-clause.
And, well, it sure seems like a mistake in hindsight how much advertising they got.
Not sure how to interpret the "agree" votes on this comment. If someone is able to share that they agree with the core claim because of object-level evidence, I am interested. (Rather than agreeing with the claim that this state of affairs is "quite sad".)
Does anyone from Anthropic want to explicitly deny that they are under an agreement like this?
(I know the post talks about some and not necessarily all employees, but am still interested).
I am a current Anthropic employee, and I am not under any such agreement, nor has any such agreement ever been offered to me.
If asked to sign a self-concealing NDA or non-disparagement agreement, I would refuse.
I left Anthropic in June 2023 and am not under any such agreement.
EDIT: nor was any such agreement or incentive offered to me.
Note that, by the grapevine, sometimes serving inference requests might lose OpenAI money due to them subsidising it. Not sure how this relates to boycott incentives.
That metaphor suddenly slid from chess into poker.
If AI ends up intelligent enough and with enough manufacturing capability to threaten nuclear deterrence, I'd expect it to also deduce any conclusions I would.
So it seems mostly a question of what the world would do with those conclusions earlier, rather than not at all.
A key exception is if later AGI would be blocked on certain kinds of manufacturing to create its destabilizing tech, and if drawing attention to that earlier starts serially blocking work earlier.
I have thoughts on the impact of AI on nuclear deterrents, and on the claims made about it in the post.
But I'm uncertain whether it's wise to discuss such things publicly.
Curious if folks have takes on that. (The meta question)
y'know, come to think of it... Training and inference differ massively in how much compute they consume. So after you've trained a massive system, you have a lot of compute free to do inference (modulo needing to use it to generate revenue, run your apps, etc). Meaning that for large scale, critical applications, it might in fact be feasible to tolerate some big, multiple OOMs, hit to the compute cost of your inference; if that's all that's required to get the zero knowledge benefits, and if those are crucial
"arguments" is perhaps a bit generous of a term...
(also, lol at this being voted into negative! Giving karma as encouragement seems like a great thing. It's the whole point of it. It's even a venerable LW tradition, and was how people incentivised participation in the annual community surveys in the elden days)
(Also the arguments of this comment do not apply to Community Notes.)
the amount of people who could write sensible arguments is small
Disagree. The quality of arguments that need debunking is often way below the average LW:er's intellectual pay grade. And there's actually quite a lot of us.
Cross posting sure seems cheap. Though I think replying and engaging with existing discourse is easier than building a following of one's top level posts from scratch.
Yeah, my hypothesis is something like this might work.
(Though I can totally see how it wouldn't, and I wouldn't have thought it a few years ago, so my intuition might just be mistaken)
I don't think the numbers really check out on your claim. Only a small proportion of people reading this are alignment researchers. And of the remaining folks, many are probably on Twitter anyway, or otherwise have some similarly slack part of their daily scheduling filled with sort of random non high opportunity cost stuff.
Historically there sadly haven't been scalable ways for the average LW lurker to contribute to safety progress; now there might be a little one.
never thought I'd die fighting side by side with an elf...
Yes, I've felt some silent majority patterns.
Collective action problem idea: we could run an experiment -- 30 ppl opt in to writing 10 comments and liking 10 comments they think raise the sanity waterline, conditional on a total of 29 other people opting in too. (A "kickstarter".) Then we see if it seemed like it made a difference.
I'd join. If anyone is also down for that, feel free to use this comment as a schelling point and reply with your interest below.
(I'm not sure the right number of folks, but if we like the result we could just do another round.)
there could still be founder effects in the discourse, or particularly influential people could be engaged in the twitter discourse.
I think that's the case. Mostly the latter, some of the former.
Without commenting on the proposal itself, I think the term "eval test set" is clearer for this purpose than "closed source eval".
I'm writing a quick and dirty post because the alternative is that I wait for months and maybe not write it after all.
This is the way.
I think this is an application of a more general, very powerful principle of mechanism design: when cognitive labor is abundant, near omni-present surveillance becomes feasible.
For domestic life, this is terrifying.
But for some high stakes, arms race-style scenarios, it might have applications.
Beyond what you mentioned, I'm particularly interested in this being a game-changer for bilateral negotiation. Two parties make an agreement, consent to being monitored by an AI auditor, and verify that the auditor's design will communicate with the ...
ah that makes sense thanks
Sidenote: I'm a bit confused by the name. The all caps makes it seem like an acronym. But it seems to not be?
Intentional
Lure for
Improvised
Acronym
Derivation
International League of Intelligent Agent Deconfusion
I
Love
Interesting
Alignment
Donferences
Sure that works! Maybe use a term like "importantly misguided" instead of "correct"? (Seems easier for me to evaluate)
To anyone reading this who is considering working in alignment --
Following the recent revelations, I now believe OpenAI should be regarded as a bad faith actor. If you go work at OpenAI, I believe your work will be net negative; and will most likely be used to "safetywash" or "governance-wash" Sam Altman's mad dash to AGI. It now appears Sam Altman is at least as sketchy as SBF. Attempts to build "social capital" or "affect the culture from the inside" will not work under current leadership (indeed, what we're currently seeing are the failed results of 5+ y...
I'm around ~40% on "4 years from now, I'll think it was clearly the right call for alignment folk to just stop working at OpenAI, completely."
But, I think it's much more likely that I'll continue endorsing something like "Treat OpenAI as a manipulative adversary by default, do not work there or deal with them unless you have a concrete plan for how you are being net positive. And because there's a lot of optimization power in their company, be pretty skeptical that any plans you make will work. Do not give them free resources (like inviting the...
Hm, I disagree and would love to operationalize a bet/market on this somehow; one approach is something like "Will we endorse Jacob's comment as 'correct' 2 years from now?", resolved by a majority of Jacob + Austin + <neutral 3rd party>, after deliberating for ~30m.
Mostly seems sensible to me (I agree that a likely model is that there's a lot of deceptive and manipulative behavior coming from the top and that caring about extinction risks was substantially faked), except that I would trust an agreement from Altman much more than an agreement from Bankman-Fried.
That's more about me being interested in key global infrastructure, I've been curious about them for quite a lot of years after realising the combination of how significant what they're building is vs how few folks know about them. I don't know that they have any particularly generative AI related projects in the short term.
Anyone know folks working on semiconductors in Taiwan and Abu Dhabi, or on fiber at Tata Industries in Mumbai?
I'm currently travelling around the world and talking to folks about various kinds of AI infrastructure, and looking for recommendations of folks to meet!
If so, feel free to DM me!
(If you don't know me, I'm a dev here on LessWrong and was also part of founding Lightcone Infrastructure.)
Surely one should think of this as a vector in a space with more dimensions than 1.
In your equation you can just 1,000,000x the magnitude and it will move in the "positive direction".
In the real world you can become a billionaire from selling toothbrushes and still be "overtaken" by a guy who wrote one blog post that happened to be real dang good
I made a drawing but lw won't allow adding it on phone I think