All of Bird Concept's Comments + Replies

Impact = Magnitude * Direction

Surely one should think of this as a vector in a space with more dimensions than 1. 

In your equation you can just 1,000,000x magnitude and it will move in the "positive direction".

In the real world you can become a billionaire from selling toothbrushes and still be "overtaken" by a guy who wrote one blog post that happened to be real dang good

I made a drawing but lw won't allow adding it on phone I think

4plex
Or, worse, if most directions are net negative and you have to try quite hard to find one which is positive, almost everyone optimizing for magnitude will end up doing harm proportional to how much they optimize magnitude.
1f3mi
I was thinking something potentially similar. This is super nitpicky, but the better equation would be impact = Magnitude * ||Direction||
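(A minimal sketch of the vector framing, with notation that is mine rather than anything from the thread: write impact as a magnitude times a unit direction, and value as its projection onto some "good" axis.)

$$\vec{I} = m\,\hat{d}, \qquad \|\hat{d}\| = 1, \qquad \text{value} = \vec{I} \cdot \hat{g} = m\,(\hat{d} \cdot \hat{g})$$

If $\hat{d} \cdot \hat{g} < 0$, multiplying $m$ by 1,000,000 only makes the outcome 1,000,000 times worse, which is plex's point about most directions being net negative.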

Upstarts will not defeat them, since capital now trivially converts into superhuman labour in any field.


It is false today that big companies with 10x the galaxy brains and 100x the capital reliably outperform upstarts.[1] 

Why would this change? I don't think you make the case. 

  1. ^

    My favorite example, though it might still be falsified. Google invented transformers, owns DeepMind, runs their own data centres, builds their own accelerators (and has huge amounts of them), and has tons of hard-to-get data (all those books they scanned before that became no

[...]
4lc
They have "galaxy brains", but applying those galaxy brains strategically well towards your goals is also an aspect of intelligence. Additionally, those "galaxy brains" may be ineffective because of issues with alignment towards the company, whereas in a startup often you can get 10x or 100x more out of fewer employees because they have equity and understand that failure is existential for them. Demis may be smart, but he made a major strategic error if his goal was to lead in the AGI race, and despite the fact that the did he is still running DeepMind, which suggests an alignment/incentive issue with regards to Google's short term objectives.
4Alexander Gietelink Oldenziel
OpenAI is worth about 150 billion dollars and has the backing of Microsoft. Google Gemini is apparently competitive now with Claude and GPT-4. Yes, Google was sleeping on LLMs two years ago and OpenAI is a little ahead, but this moat is tiny.
4L Rudolf L
For example:

* Currently big companies struggle to hire and correctly promote talent for the reasons discussed in my post, whereas AI talent will be easier to find/hire/replicate given only capital & legible info
* To the extent that AI ability scales with resources (potentially boosted by inference-time compute, and if SOTA models are no longer available to the public), then better-resourced actors have better galaxy brains
* Superhuman intelligence and organisational ability in AIs will mean less bureaucratic rot and fewer communication bandwidth problems in large orgs, compared to orgs made out of human-brain-sized chunks, reducing the costs of scale

Imagine for example the world where software engineering is incredibly cheap. You can start a software company very easily, yes, but Google can monitor the web for any company that makes revenue off of software, instantly clone the functionality (because software engineering is just a turn-the-crank-on-the-LLM thing now) and combine it with their platform advantage and existing products and distribution channels. Whereas right now, it would cost Google a lot of precious human time and focus to try to even monitor all the developing startups, let alone launch a competing product for each one. Of course, it might be that Google itself is too bureaucratic and slow to ever do this, but someone else will then take this strategy.

C.f. the oft-quoted thing about how the startup challenge is getting to distribution before the incumbents get to distribution. But if the innovation is engineering, and the engineering is trivial, how do you get time to get distribution right?

(Interestingly, as I'm describing it above the most key thing is not so much capital intensivity, and more just that innovation/engineering is no longer a source of differential advantage because everyone can do it with their AIs really well)

There's definitely a chance that there's some "crack" in this, either from the economics or the nature of AI perf

I don't expect it that soon, but I do expect it's more likely than not that there's a covid-esque fire alarm + rapid upheaval moment. 

For people who don't expect a strong government response... remember that Elon is First Buddy now. 🎢

It is January 2020. 

3james oofou
This aged amusingly. 
5Mitchell_Porter
By the start of April half the world was locked down, and Covid was the dominant factor in human affairs for the next two years or so. Do you think that issues pertaining to AI agents are going to be dominating human affairs so soon and so totally? 
9keltan
In reference to o3 right? Comparing it to just before the 2020 pandemic started? As in “Something large is about to happen and we are unprepared”?
4yanni kyriacos
Yep  

Okay, well, I'm not going to post "Anthropic leadership conversation [fewer likes]" 😂

(Can you edit out all the "like"s, or give permission for an admin to edit them out? I think in written text it makes speakers sound, for lack of a better word, unflatteringly moronic) 

8Zach Stein-Perlman
I already edited out most of the "like"s and similar. I intentionally left some in when they seemed like they might be hedging or otherwise communicating this isn't exact. You are free to post your own version but not to edit mine. Edit: actually I did another pass and edited out several more; thanks for the nudge.

McNamara was at Ford, not Toyota. I reckon he modelled manufacturing like an efficient Boeing manager, not an efficient SpaceX manager.

2L Rudolf L
I was referring to McNamara's government work; I forgot about his corporate job before then. I agree there's some SpaceX-to-Boeing (even pre-McDonnell Douglas merger?) axis that feels useful, but I'm not sure what to call it or what you'd do to a field (like US defence) to perpetuate the SpaceX end of it, especially over events like handovers from Kelly Johnson to the next generation.
Bird ConceptΩ237

(Nitpick: I'd find the first paragraphs much easier to read if they didn't have any of the bolding)

rename the "provable safety" area as "provable safety modulo assumptions" area and be very explicit about our assumptions.

Very much agree. I gave some feedback along those lines as the term was coined, and am sad it didn't catch on. But of course "provable safety modulo assumptions" isn't very short and catchy...

I do like the word "guarantee" as a substitute. We can talk of formal guarantees, but also of a store guaranteeing that an item you buy will meet a certain standard. So its connotations are nicely in the direction of proof but without, as it were, "proving too much" :)

2Davidmanheim
That seems fair!

Interesting thread to return to, 4 years later. 

FYI: I skimmed the post quickly and didn't realize there was a Patreon! 

If you wanted to change that, you might want to put it at the very end of the post, on a new line, saying something like: "If you'd like to fund my work directly, you can do so via Patreon [here](link)."

2abramdemski
Edited.

Someone posted these quotes in a Slack I'm in... what Ellsberg said to Kissinger: 

“Henry, there’s something I would like to tell you, for what it’s worth, something I wish I had been told years ago. You’ve been a consultant for a long time, and you’ve dealt a great deal with top secret information. But you’re about to receive a whole slew of special clearances, maybe fifteen or twenty of them, that are higher than top secret.

“I’ve had a number of these myself, and I’ve known other people who have just acquired them, and I have a pretty good sense of w

[...]
6TsviBT
So what am I supposed to do if people who control resources that are nominally earmarked for purposes I most care about are behaving this way?
Jonas V7722

Someone else added these quotes from a 1968 article about how the Vietnam war could go so wrong:

Despite the banishment of the experts, internal doubters and dissenters did indeed appear and persist. Yet as I watched the process, such men were effectively neutralized by a subtle dynamic: the domestication of dissenters. Such "domestication" arose out of a twofold clubbish need: on the one hand, the dissenter's desire to stay aboard; and on the other hand, the nondissenter's conscience. Simply stated, dissent, when recognized, was made to feel at home. On th

[...]

tbf I never realized "sic" was mostly meant to point out errors, specifically. I thought it was used to mean "this might sound extreme --- but I am in fact quoting literally"

Ruby*120

Sic is short for the Latin phrase sic erat scriptum, which means thus it was written. As this suggests, people use sic to show that a quote has been reproduced exactly from the source – including any spelling and grammatical errors and non-standard spellings.


I was only familiar with sic to mean "error in original" (I assume kave also), but this alternative use makes sense too.

It's not epistemically poor to say these things if they're actually true.

Invalid. 

Compare: 

A: "So I had some questions about your finances, it seems your trading desk and exchange operate sort of closely together? There were some things that confused me..."

B: "our team is 20 insanely smart engineers" 

A: "right, but i had a concern that i thought perhaps ---"

B: "if you join us and succeed you'll be a multi millionaire"  

A: "...okay, but what if there's a sudden downturn ---" 

B: "bull market is inevitable right now"

 

Maybe not false. But epistemically poor form. 

(crossposted to EA forum)

I agree with many of Leopold's empirical claims, timelines, and analyses. I'm acting on it myself in my planning as something like a mainline scenario. 

Nonetheless, the piece exhibited some patterns that gave me a pretty strong allergic reaction. It made or implied claims like:

  • a small circle of the smartest people believe this
  • i will give you a view into this small elite group who are the only ones who are situationally aware
  • the inner circle longed TSMC way before you
  • if you believe me, you can get 100x richer -- there's still alpha,
[...]
2kave
(I don't understand your usage of "sic" here. My guess from the first was that you meant it to mean "he really said this obviously wrong thing", but that doesn't quite make sense with the second one).
8Joseph Miller
(crossposted to the EA Forum) These are not just vibes - they are all empirical claims (except the last maybe). If you think they are wrong, you should say so and explain why. It's not epistemically poor to say these things if they're actually true.
6kave
mod note: this comment used to have a gigantic image of Rockwell's Freedom of Speech, which I removed.
Bird Concept3769

(Sidenote: it seems Sam was kind of explicitly asking to be pressured, so your comment seems legit :)  
But I also think that, had Sam not done so, I would still really appreciate him showing up and responding to Oli's top-level post, and I think it should be fine for folks from companies to show up and engage with the topic at hand (NDAs), without also having to do a general AMA about all kinds of other aspects of their strategy and policies. If Zach's questions do get very upvoted, though, it might suggest there's demand for some kind of Anthropic AMA event.) 

Because it's obviously annoying and burning the commons. Imagine if I made a bot that posted the same comment on every post on LessWrong; surely that wouldn't be acceptable behavior.

Bird Concept2227

I was around a few years ago when there were already debates about whether 80k should recommend OpenAI jobs. And that was before any of the fishy stuff leaked out, and while they were stacking up cool governance commitments like becoming a capped-profit and having a merge-and-assist clause. 

And, well, it sure seems like a mistake in hindsight how much advertising they got. 

Not sure how to interpret the "agree" votes on this comment. If someone is able to share that they agree with the core claim because of object-level evidence, I am interested. (Rather than agreeing with the claim that this state of affairs is "quite sad".)

Does anyone from Anthropic want to explicitly deny that they are under an agreement like this? 

(I know the post talks about some and not necessarily all employees, but am still interested). 

I am a current Anthropic employee, and I am not under any such agreement, nor has any such agreement ever been offered to me.

If asked to sign a self-concealing NDA or non-disparagement agreement, I would refuse.

Ivan Vendrov*14012

I left Anthropic in June 2023 and am not under any such agreement.

EDIT: nor was any such agreement or incentive offered to me.

6aysja
Agreed. I'd be especially interested to hear this from people who have left Anthropic.  
7RobertM
Did you see Sam's comment?

Note that, by the grapevine, sometimes serving inference requests might lose OpenAI money due to them subsidising it. Not sure how this relates to boycott incentives. 

5gwern
If serving those inference requests at a loss did not net benefit OA, then OA would not serve them. So it doesn't matter for the purpose of a boycott - unless you believe you know their business a lot better than they do, and can ensure you only make inference requests that are a genuine net loss to them and not a net benefit.

That metaphor suddenly slid from chess into poker. 

If AI ends up intelligent enough and with enough manufacturing capability to threaten nuclear deterrence, I'd expect it to also deduce any conclusions I would.

So it seems mostly a question of what the world would do with those conclusions earlier, rather than not at all.

A key exception is if later AGI would be blocked on certain kinds of manufacturing to create its destabilizing tech, and if drawing attention to that earlier starts serially blocking work earlier.

I have thoughts on the impact of AI on nuclear deterrence, and on the claims made about it in the post.

But I'm uncertain whether it's wise to discuss such things publicly.

Curious if folks have takes on that. (The meta question)

6owencb
My take is that in most cases it's probably good to discuss publicly (but I wouldn't be shocked to become convinced otherwise). The main plausible reason I see for it potentially being bad is if it were drawing attention to a destabilizing technology that otherwise might not be discovered. But I imagine most thoughts are kind of going to be chasing through the implications of obvious ideas. And I think that in general having the basic strategic situation be closer to common knowledge is likely to reduce the risk of war.  (You might think the discussion could also have impacts on the amount of energy going into racing, but that seems pretty unlikely to me?)

y'know, come to think of it... Training and inference differ massively in how much compute they consume. So after you've trained a massive system, you have a lot of compute free to do inference (modulo needing to use it to generate revenue, run your apps, etc). Meaning that for large-scale, critical applications, it might in fact be feasible to tolerate some big, multiple-OOMs hit to the compute cost of your inference, if that's all that's required to get the zero-knowledge benefits, and if those are crucial.
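(As a toy illustration of why a multiple-OOMs inference overhead might be tolerable for a narrow set of critical checks: all the numbers below are made-up placeholders rather than estimates of any real system, and the 10,000x figure is just the ZKVM slowdown mentioned elsewhere in this thread.)

```python
# Toy numbers only: placeholders, not figures for any actual model.
TRAIN_FLOP = 1e25            # compute spent on one big training run
INFER_FLOP_PER_QUERY = 1e12  # ordinary inference cost of a single query
ZK_OVERHEAD = 1e4            # ~10,000x slowdown quoted for current ZK virtual machines

zk_query_cost = INFER_FLOP_PER_QUERY * ZK_OVERHEAD
queries_per_training_budget = TRAIN_FLOP / zk_query_cost

# With these placeholders: 1e25 / 1e16 = 1e9 audited queries per training run's budget.
print(f"{queries_per_training_budget:.0e} zero-knowledge-audited queries "
      "fit in one training run's compute budget")
```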

"arguments" is perhaps a bit generous of a term...

(also, lol at this being voted into negative! Giving karma as encouragement seems like a great thing. It's the whole point of it. It's even a venerable LW tradition, and was how people incentivised participation in the annual community surveys in the elden days)

6habryka
The LW survey is something that has broad buy-in. In general, promising to upvote others distorts the upvote signal (since, like, people could very easily manufacture lots of karma by just upvoting each other), so there is a prior against this kind of stuff that sometimes can be overcome.

(Also the arguments of this comment do not apply to Community Notes.)

the amount of people who could write sensible arguments is small

Disagree. The quality of arguments that need debunking is often way below the average LWer's intellectual pay grade. And there are actually quite a lot of us.

2Michaël Trazzi
ok I meant something like "the number of people who could reach a lot of people (eg. roon's level, or even 10x fewer people than that) from tweeting only sensible arguments is small", but I guess that doesn't invalidate what you're suggesting.

if I understand correctly, you'd want LWers to just create a twitter account and debunk arguments by posting comments & occasionally doing community notes. that's a reasonable strategy, though the medium-effort version would still require like 100 people spending sometimes 30 minutes writing good comments (let's say 10 minutes a day on average). I agree that this could make a difference.

I guess the sheer volume of bad takes, and of people who like / retweet bad takes, is such that even in the positive case where you get like 100 people who commit to debunking arguments, this would maybe add 10 comments to the most viral tweets (that get 100 comments, so 10%), and maybe 1-2 comments for the less popular tweets (but there's many more of them).

I think it's worth trying, and maybe there are some snowball / long-term effects to take into account. it's worth highlighting the cost of doing so as well (16h of productivity a day for 100 people doing it for 10m a day, at least, given there are extra costs to just opening the app). it's also worth highlighting that most people who would click on bad takes would already be polarized, and I'm not sure if they would change their minds because of good arguments (and instead would probably just reply negatively, because the true rejection is more something about political orientation, priors about AI risk, or things like that).

but again, worth trying, especially the low-effort versions
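(For reference, the arithmetic behind that "16h of productivity a day" figure, assuming exactly 100 people at 10 minutes each:)

$$100 \text{ people} \times 10\,\tfrac{\text{min}}{\text{person}\cdot\text{day}} = 1000\,\tfrac{\text{min}}{\text{day}} \approx 16.7\,\tfrac{\text{h}}{\text{day}}$$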

Cross-posting sure seems cheap. Though I think replying and engaging with existing discourse is easier than building a following for one's top-level posts from scratch.

Yeah, my hypothesis is something like this might work.

(Though I can totally see how it wouldn't, and I wouldn't have thought it a few years ago, so my intuition might just be mistaken)

I don't think the numbers really check out on your claim. Only a small proportion of people reading this are alignment researchers. And of the remaining folks, many are probably on Twitter anyway, or otherwise have some similarly slack part of their daily schedule filled with sort of random, non-high-opportunity-cost stuff.

Historically there sadly haven't been scalable ways for the average LW lurker to contribute to safety progress; now there might be a little one.

1Nate Showell
The time expenditure isn't the crux for me; the effects of Twitter on its users' habits of thinking are the crux. Those effects also apply to people who aren't alignment researchers. For those people, trading away epistemic rationality for Twitter influence is still very unlikely to be worth it.
2mako yass
Remember that you were only proposing discreet auditing systems to mollify the elves.

They think of this as a privacy-preserving technology, because it is one, and that's largely what we're using it for. Though it's also going to cause tremendous decreases in transaction costs by allowing ledger state to be validated without requiring the validator to store a lot of data or replay ledger history. If most crypto investors could foresee how it's going to make it harder to take rent on ledger systems, they might not be so happy about it.

Oh! "'10x' faster than RISC Zero"! We're down to a 1000x slowdown then! Yay! Previous coverage btw.

Yes, I've felt some silent majority patterns.

Collective action problem idea: we could run an experiment -- 30 ppl opt in to writing 10 comments and liking 10 comments they think raise the sanity waterline, conditional on a total of 29 other people opting in too. (A "kickstarter".) Then we see if it seemed like it made a difference.

I'd join. If anyone is also down for that, feel free to use this comment as a Schelling point and reply with your interest below.

(I'm not sure what the right number of folks is, but if we like the result we could just do another round.)

1Leksu
Interested
2Chris_Leong
I'd participate.
4snewman
I'd participate.

there could still be founder effects in the discourse, or particularly influential people could be engaged in the twitter discourse.

I think that's the case. Mostly the latter, some of the former.

Without commenting on the proposal itself, I think the term "eval test set" is clearer for this purpose than "closed source eval".

1Jono
agreed

I'm writing a quick and dirty post because the alternative is that I wait for months and maybe not write it after all.

This is the way. 

Bird ConceptΩ3911

I think this is an application of a more general, very powerful principle of mechanism design: when cognitive labor is abundant, near-omnipresent surveillance becomes feasible. 

For domestic life, this is terrifying. 

But for some high stakes, arms race-style scenarios, it might have applications. 

Beyond what you mentioned, I'm particularly interested in this being a game-changer for bilateral negotiation. Two parties make an agreement, consent to being monitored by an AI auditor, and verify that the auditor's design will communicate with the ...

4mako yass
Isn't it enough to constrain the output format so that no steganographic leaks would be possible? Won't the counterparty usually be satisfied just with an hourly signal saying either "Something is wrong" (encompassing "Auditor saw a violation" / "no signal, the host has censored the auditor's report" / "invalid signal, the host has tampered with the auditor system" / "auditor has been blinded to the host's operations, or has ascertained that there are operations which the auditor cannot see") or "Auditor confirms that all systems are nominal and without violation"?

The host can remain in control of their facilities, as long as the auditor is running on tamperproof hardware. Since it's difficult to prove that a physical device can't be tampered with, it may be possible to take some components of the auditor even further and run them in a zero-knowledge virtual machine, which provides a cryptographic guarantee that the program wasn't tampered with, so long as you can make it lithe enough to fit (zero-knowledge virtual machines currently run at a 10,000x slowdown, though I don't think specialized hardware for them is available yet; crypto may drive that work). A ZKVM won't provide a guarantee that the inputs to the system aren't being controlled, but the auditor is monitoring inputs of such complexity — either footage of the real world or logs of a large training run — that it may be able to prove algorithmically to itself that the sensory inputs weren't tampered with either, and that the algorithm does have a view into the real world (I'm contending that even large state actors could not create Descartes' evil demon).
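(A minimal sketch of the constrained report channel being described here. Everything below is hypothetical illustration rather than anyone's actual design: the auditor's only output is a fixed-width, MAC'd message per interval whose payload is a single status value, so there is essentially no spare bandwidth for steganography.)

```python
import hmac
import hashlib
from enum import Enum

class Status(Enum):
    ALL_NOMINAL = 0      # auditor confirms all monitored systems comply
    SOMETHING_WRONG = 1  # violation seen, report censored, tampering, or auditor blinded

def make_report(status: Status, interval_index: int, key: bytes) -> bytes:
    """The only message the auditor may emit each interval.

    Payload is exactly one status byte plus an interval counter; the MAC binds
    it to a shared key so the host can't forge or replay reports. A fixed-width
    format leaves essentially no room to hide extra information.
    """
    payload = interval_index.to_bytes(8, "big") + bytes([status.value])
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag

def verify_report(message: bytes, expected_interval: int, key: bytes) -> Status:
    """Counterparty-side check: any anomaly (wrong length, bad MAC, wrong
    interval, unknown status byte) collapses to SOMETHING_WRONG."""
    if len(message) != 8 + 1 + 32:
        return Status.SOMETHING_WRONG
    payload, tag = message[:9], message[9:]
    if not hmac.compare_digest(tag, hmac.new(key, payload, hashlib.sha256).digest()):
        return Status.SOMETHING_WRONG
    if int.from_bytes(payload[:8], "big") != expected_interval:
        return Status.SOMETHING_WRONG
    if payload[8] not in (Status.ALL_NOMINAL.value, Status.SOMETHING_WRONG.value):
        return Status.SOMETHING_WRONG
    return Status(payload[8])
```

The hard parts, of course, are the ones the comment above is wrestling with: tamperproof hardware, and convincing yourself the auditor is actually seeing the real inputs.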
Bird ConceptΩ257

Sidenote: I'm a bit confused by the name. The all caps makes it seem like an acronym. But it seems to not be? 


Intentional
Lure for
Improvised
Acronym
Derivation

8Lorxus
It's the Independently-Led Interactive Alignment Discussion, surely.
8Alex_Altair
Interactively Learning the Ideal Agent Design

International League of Intelligent Agent Deconfusion

gw411

I
Love
Interesting
Alignment
Donferences


Sure that works! Maybe use a term like "importantly misguided" instead of "correct"? (Seems easier for me to evaluate)

Bird Concept13198

To anyone reading this who is considering working in alignment --

Following the recent revelations, I now believe OpenAI should be regarded as a bad faith actor. If you go work at OpenAI, I believe your work will be net negative, and will most likely be used to "safetywash" or "governance-wash" Sam Altman's mad dash to AGI. It now appears Sam Altman is at least as sketchy as SBF. Attempts to build "social capital" or "affect the culture from the inside" will not work under current leadership (indeed, what we're currently seeing are the failed results of 5+ y...

1Linch
I weakly disagree. The fewer safety-motivated people want to work at OpenAI, the stronger the case for any given safety person to work there. Also, now that there are enough public scandals, hopefully anybody wanting to work at OpenAI will be sufficiently guarded and going in with their eyes fully open, rather than naive/oblivious.
Raemon344

I'm around ~40% on "4 years from now, I'll think it was clearly the right call for alignment folk to just stop working at OpenAI, completely." 

But, I think it's much more likely that I'll continue endorsing something like "Treat OpenAI as a manipulative adversary by default, do not work there or deal with them unless you have a concrete plan for how you are being net positive.  And because there's a lot of optimization power in their company, be pretty skeptical that any plans you make will work. Do not give them free resources (like inviting the...

Hm, I disagree and would love to operationalize a bet/market on this somehow; one approach is something like "Will we endorse Jacob's comment as 'correct' 2 years from now?", resolved by a majority of Jacob + Austin + <neutral 3rd party>, after deliberating for ~30m.

Ben Pace126

Mostly seems sensible to me (I agree that a likely model is that there's a lot of deceptive and manipulative behavior coming from the top and that caring about extinction risks was substantially faked), except that I would trust an agreement from Altman much more than an agreement from Bankman-Fried.

That's more about me being interested in key global infrastructure. I've been curious about them for quite a few years, after realising the combination of how significant what they're building is vs how few folks know about them. I don't know that they have any particularly generative-AI-related projects in the short term. 

Anyone know folks working on semiconductors in Taiwan and Abu Dhabi, or on fiber at Tata Industries in Mumbai? 

I'm currently travelling around the world and talking to folks about various kinds of AI infrastructure, and looking for recommendations of folks to meet! 

If so, feel free to DM me! 

(If you don't know me, I'm a dev here on LessWrong and was also part of founding Lightcone Infrastructure.)

1mesaoptimizer
Could you elaborate on how Tata Industries is relevant here? Based on a DDG search, the only news I find involving Tata and AI infrastructure is one where a subsidiary named TCS is supposedly getting into the generative AI gold rush.