In scenario B, where a random child runs up, I wonder if a non-Bayesian might prefer that you just eliminate (girl, girl) and say that the probability of two boys is 1/3?
In Puzzle 1 in my post, the non-Bayesian has an interpretation that's still plausibly reasonable, but in your scenario B it seems like they'd be clowning themselves to take that approach.
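For what it's worth, here's a minimal Monte Carlo sketch of how I'm picturing scenario B (my assumptions: each child is independently a boy or girl with probability 1/2, and the child who runs up is a uniformly random one of the two and happens to be a boy):

```python
import random

trials = 1_000_000
naive_numer = naive_denom = 0   # "eliminate (girl, girl)" approach
bayes_numer = bayes_denom = 0   # condition on the random child who ran up

for _ in range(trials):
    kids = [random.choice("BG"), random.choice("BG")]
    both_boys = kids.count("B") == 2

    # Non-Bayesian reading: just throw out the (girl, girl) families.
    if "B" in kids:
        naive_denom += 1
        naive_numer += both_boys

    # Scenario B: one of the two children runs up at random and is a boy.
    runner = random.choice(kids)
    if runner == "B":
        bayes_denom += 1
        bayes_numer += both_boys

print("eliminate (girl, girl):", naive_numer / naive_denom)   # ~1/3
print("random child is a boy: ", bayes_numer / bayes_denom)   # ~1/2
```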
So I think we're on the same page that whenever things get real/practical/bigger-picture, then you gotta be Bayesian.
Thanks for this post.
I'd love to have a regular (weekly/monthly/quarterly) post that's just "here's what we're focusing on at MIRI these days".
I respect and value MIRI's leadership on the complex topic of building understanding and coordination around AI.
I spend a lot of time doing AI social media, and I try to promote the best recommendations I know to others. Whatever thoughts MIRI has would be helpful.
Given that I think about this less often and less capably than you folks do, it seems like there's a low hanging fruit opportunity for people like me to s...
I'd ask: If one day your God stopped existing, would anything observably change?
Seems like a meaningless concept: a node in their causal model of reality that has no power to constrain expectations, but which the person likes because knowing the node exists in their own belief network brings them emotional reward.
When an agent is goal-oriented, they want to become more goal-oriented, and to maximize the goal-orientedness of the universe with respect to their own goal.
Because expected value tells us that the more resources you control, the more robustly you can maximize your probability of success in the face of whatever comes at you, and the higher your maximum possible utility is (if you have a utility function without an easy-to-hit max score).
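To make that concrete, here's a minimal toy sketch, with entirely hypothetical numbers and shock model, just to illustrate the direction of the effect: an agent whose goal fails if random shocks ever exhaust its resources gets a strictly better chance of success by controlling more of them.

```python
import random

# Toy model with made-up numbers: the agent's goal fails if random "shocks"
# ever exhaust its resources. More resources controlled -> higher probability
# of still being in the game at the end (and a higher achievable ceiling).
def p_success(starting_resources, n_shocks=10, trials=50_000):
    successes = 0
    for _ in range(trials):
        r = starting_resources
        for _ in range(n_shocks):
            r -= random.uniform(0, 1)  # each shock destroys up to 1 unit
            if r <= 0:
                break
        else:
            successes += 1
    return successes / trials

for resources in (3, 5, 8, 12):
    print(resources, round(p_success(resources), 3))
# Survival probability climbs steadily with the amount of resources controlled.
```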
“Maximizing goal-orientedness of the universe” was how I phrased the prediction that conquering resources involves having them aligned to your goal / aligned agents helping you control them.
I'm happy to have that kind of debate.
My position is "goal-directedness is an attractor state that is incredibly dangerous and uncontrollable if it's somewhat beyond human-level in the near future".
The form of those arguments seems to be "technically it doesn't have to be". But realistically it will be lol. Not sure how much more there will be to say.
Thanks. Sure, I’m always happy to update on new arguments and evidence. The most likely way I see possibly updating is to realize the gap between current AIs and human intelligence is actually much larger than it currently seems, e.g. 50+ years as Robin seems to think. Then AI alignment research has a larger chance of working.
I also might lower P(doom) if international govs start treating this like the emergency it is and do their best to coordinate to pause. Though unfortunately even that probably only buys a few years of time.
Finally I can imagine someho...
Thanks for your comments. I don’t get how nuclear and biosafety represent models of success. Humanity rose to meet those challenges only somewhat adequately, and half the reason society hasn’t collapsed from, e.g., a first thermonuclear explosion being set off either intentionally or accidentally is pure luck. All it takes to topple humanity is something like nukes but a little harder to coordinate on (or much harder).
Here's a better transcript hopefully: https://share.descript.com/view/yfASo1J11e0
I updated the link in the post.
I guess I just don’t see it as a weak point in the doom argument.
This is kind of baffling to read, particularly in light of the statement by Eliezer that I quoted at the very beginning of my post.
If the argument is (and indeed it is) that "many superficially appealing solutions like corrigibility, moral uncertainty etc are in general contrary to the structure of things that are good at optimization" and the way we see this is by doing homework exercises within an expected utility framework, and the reason why we must choose an EU framework is because ...
Context is a huge factor in all these communications tips. The scenario I'm optimizing for is when you're texting someone who has a lot of options, and you think it's high expected value to get them to invest in a date with you, but the most likely way that won't happen is if they hesitate to reply to you and tap away to something else. That's not always the actual scenario though.
Imagine you're the recipient, and the person who's texting you met your minimum standard to match with, but is still a-priori probably not worth your time and effort going on a d...
Bonus points in a dating context: by being specific and authentic you drive away people who won't be compatible. In the egg example, even if the second party knows nothing about the topic, they can continue the conversation with "I can barely boil water, so I always take a frozen meal in to work" or "I don't like eggs, but I keep pb&j at my desk" or just swipe left and move on to the next match.
Yeah nice. A statement like "I'm looking for something new to watch" lowers the stakes by making the interaction more like what friends talk about rather than about an interview for a life partner, increasing the probability that they'll respond rather than pausing for a second and ending up tapping away.
You can do even more than just lowering the stakes if you inject a sense that you're subconsciously using the next couple conversation moves to draw out evidence about the conversation partner, because you're naturally perceptive and have various standards...
So you simply ask them: "What do you want to do?" And maybe you add "I'm completely fine with anything!" to ensure you're really introducing no constraints whatsoever and you two can do exactly what your friend desires.
This error reminds me of people on a dating app who kill the conversation by texting something like "How's your week going?"
When texting on a dating app, if you want to keep the conversation flowing nicely instead of getting awkward/strained responses or nothing, I believe the key is to anticipate that a couple seconds of low-effort processi...
Hmm, I think people have occasionally asked me "how's your week going" on dating apps and I've liked it overall - I'm pretty sure I'd prefer it over your suggested alternative! No doubt to a large extent because I suck at cooking and wouldn't know what to say. Whereas a more open-ended question feels better: I can just ramble a bunch of things that happen to be on my mind and then go "how about yourself?" and then it's enough for either of our rambles to contain just one thing that the other party might find interesting.
It feels like your proposed question...
Can confirm, I also didn't have a good experience with open-ended questions on dating apps. I get more responses with binary-choice questions that invite elaboration, e.g. "Are you living here or just visiting?" and "How was your Friday night, did you go out or stay in?".
Outside of dating, another example that comes to mind is questions like "What's your favorite movie?". I now avoid the "what's your favorite" questions because they require the respondent to assess their entire life history and make a revealing choice, as if I'm giving them a personality ...
Your baseline scenario (0 value) thus assumes away the possibility that civilization permanently collapses (in some sense) in the absence of some path to greater intelligence (whether via AI or whatever else), which would also wipe out any future value. This is a non-negligible possibility.
Yes, my mainline no-superintelligence-by-2100 scenario is that the trend toward a better world continues to 2100.
You're welcome to set the baseline number to a negative, or tweak the numbers however you want to reflect any probability of a non-ASI existential disas...
Founder here :) I'm biased now, but FWIW I was also saying the same thing before I started this company in 2017: a good dating/relationship coach is super helpful. At this point we've coached over 100,000 clients and racked up many good reviews.
I've personally used a dating coach and a couples counselor. IMO it helps in two ways:
Personally, I just have the habit of reaching for specifics when I begin a communication, to help make things clear. This post may help.
Unlike the other animals, humans can represent any goal in a large domain like the physical universe, and then, in a large fraction of cases, think of useful actions that steer the universe toward that goal to an appreciable degree.
Some goals are more difficult than others / require giving the human control over more resources than others, and measurements of optimization power are hard to define, but this definition is taking a step toward formalizing the claim that humans are more of a "general intelligence" than animals. Presumably you agree with t...
I don’t get what point you’re trying to make about the takeaway of my analogy by bringing up the halting problem. There might not even be something analogous to the halting problem in my analogy of goal-completeness, but so what?
I also don’t get why you’re bringing up the detail that “single correct output” is not 100% the same thing as “single goal-specification with variable degrees of success measured on a utility function”. It’s in the nature of analogies that details are different yet we’re still able to infer an analogous conclusion on some dimension...
These 4 beefs aren't about the original accusations; Ozy's previous post was about the original accusations. Rather, these 4 beefs are concerns that Ozy already had about Effective Altruism in general, and which the drama around Nonlinear ended up highlighting as a side-effect.
Because these beefs are more general, they won't capture the ways Alice and Chloe were harmed as specifically. However, I think on a community level these 4 dynamics should arguably be a bigger concern than the more specific abuse Alice and Chloe faced, because they seem to some extent self-reinforcing, e.g. "Do It For The Gram" will attract and reward a certain kind of person who isn't going to be effectively altruistic.
Meaningful claims don't have to be specific; they just have to be able to be substantiated by a nonzero number of specific examples. Here's how I imagine this conversation:
Chris: Love your neighbor!
Liron: Can you give me an example of a time in your life where that exhortation was relevant?
Chris: Sure. People in my apartment complex like to smoke cigarettes in the courtyard and the smoke wafts up to my window. It's actually a nonsmoking complex, so I could complain to management and get them to stop, but I understand the relaxing feeling of a good smoke, s...
I agree that if a goal-complete AI steers the future very slowly, or very weakly - as by just trying every possible action one at a time - then at some point it becomes a degenerate case of the concept.
(Applying the same level of pedantry to Turing-completeness, you could similarly ask if the simple Turing machine that enumerates all possible output-tape configurations one-by-one is a UTM.)
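Here's a toy illustration of that degenerate case in code rather than Turing machines (my own analogy, not from the post): a "program" that just enumerates every possible output string technically covers every target output eventually, but applies no selection pressure toward any particular one, which is the same trivial sense in which an enumerate-all-actions agent "covers" every goal.

```python
from itertools import product

def enumerate_all_outputs(alphabet="01", length=8):
    """Degenerate 'universal' generator: emits every possible output string
    of the given length, one by one, with zero regard for any particular goal."""
    for symbols in product(alphabet, repeat=length):
        yield "".join(symbols)

# Every 8-bit string shows up eventually, so in a trivial sense this covers
# all targets -- but it exerts no optimization pressure toward any of them,
# which is why it's a degenerate case rather than a useful notion of generality.
gen = enumerate_all_outputs()
print([next(gen) for _ in range(3)])  # ['00000000', '00000001', '00000010']
```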
The reason "goal-complete" (or "AGI") is a useful coinage, is that there's a large cluster in plausible-reality-space of goal-complete agents with a reasonable amount of...
Hmm it seems to me that you're just being pedantic about goal-completeness in a way that you aren't symmetrically being for Turing-completeness.
You could point out that "most" Turing machines output tapes full of 10^100 1s and 0s in a near-random configuration, and every computing device on earth is equally hopeless at doing that.
That's getting into details of the scenario that are hard to predict. Like I said, I think most scenarios where goal-complete AI exists are just ones where humans get disempowered and then a single AI fooms (or a small number make a deal to split up the universe and foom together).
As to whether humans will prevent goal-complete AI: some of us are yelling "Pause AI!"
Humans will trust human-brain-capable AI models to, say, drive a bus, despite the poor reliability, as long as they crash less than humans?
Yes, because the goal-complete AI won't just perform better than humans, it'll also perform better than narrower AIs.
(Well, I think we'll actually be dead if the premise of the hypothetical is that goal-complete AI exists, but let's assume we aren't.)
A goal is essentially a specification of a function to optimise, and all optimisation algorithms perform equally well (or rather poorly) when averaged across all functions.
Well, I've never met a monkey that has an "optimization algorithm" by your definition. I've only met humans who have such optimization algorithms. And that distinction is what I'm pointing at.
Goal-completeness points to the same thing as what most people mean by "AGI".
E.g. I claim humans are goal-complete General Intelligences because you can give us any goal-specification and we'll very...
Fine, I agree that if computation-specific electronics, like logic gates, weren't reliable, then it would introduce reliability as an important factor in the equation. Or in the case of AGI, that you can break the analogy to Turing-complete convergence by considering what happens if a component specific to goal-complete AI is unreliable.
I currently see no reason to expect such an unreliable component in AGI, so I expect that the reliability part of the analogy to Turing-completeness will hold.
In scenario (1) and (2), you're giving descriptions at a level o...
But microcontrollers are reliable for the same reason that video-game circuit boards are reliable: They both derive their reliability from the reliability of electronic components in the same manner, a manner which doesn't change during the convergence from application-specific circuits to Turing-complete chips.
The engineer who designed it didn't trust the microcontroller not to fail in a way that left the heating element on all the time. So it had a thermal fuse to prevent this failure mode.
If the microcontroller fails to turn off the heating element, tha...
A great post that helped inspire me to write this up is Steering Systems. The "goal engine + steering code" architecture that we're anticipating for AIs is analogous to the "computer + software" architecture whose convergence I got to witness in my lifetime.
I'm surprised this post isn't getting any engagement (yet), because for me the analogy to Turing-complete convergence is a deep source of my intuition about powerful broad-domain goal-optimizing AIs being on the horizon.
I made a short clip highlighting how Legg seems to miss an opportunity to acknowledge the inner alignment problem, since his proposed alignment solution seems to be a fundamentally training-based / black-box approach.
Here’s a 2-min edited video of the protest.
Most people who hear our message do so well after the protest, via sharing of this kind of media.
The SF one went great! Here’s a first batch of pics. A lot of the impact will come from sharing the pics and videos.
I think the impact will be pretty significant:
Just in case you missed that link at the top:
This is a historic event, the first time hundreds of people are coming out in 8 countries to protest AI.
I'm helping with logistics for the San Francisco one which you can join here. Feel free to contact me or Holly on DM/email for any reason.
Hey Quintin thanks for the diagram.
Have you tried comparing the cumulative amount of genetic info over 3.5B years?
Isn't it a big coincidence that the time of brains that process info quickly / increase information rapidly is also the time when those brains are much more powerful than all other products of evolution?
(The obvious explanation in my view is that brains are vastly better optimizers/searchers per computation step, but I'm trying to make sure I understand your view.)
Appreciate the detailed analysis.
I don’t think this was a good debate, but I felt I was in a position where I would have had to invest a lot of time to do better by the other side’s standards.
Quintin and I have agreed to do an X Space debate, and I’m optimistic that format can be more productive. While I don’t necessarily expect to update my view much, I am interested to at least understand what the crux is, which I’m not super clear on atm.
Here’s a meta-level opinion:
I don’t think it was the best choice of Quintin to keep writing replies that were dispropor...
Actually, the only time I know of that they cashed in early was selling half their Coinbase shares at the direct listing after holding for 7 years.
Their racket was to be the #1 crypto fund with the most assets under management ($7.6B total) so that they could collect the most management fees (probably about $1B total). It's great business for a16z to be in the sector-leader AUM game even when the sector makes no logical sense.
I'm just saying Marc's reputation for publicly making logically-flimsy arguments and not updating on evidence should be considered when he enters a new area of discourse.
This article is just saying "doomers are failing to prevent doom for various reasons, and also they might be wrong that doom is coming soon". But we're probably not wrong, and not being doomers isn't a better strategy. So it's a lame article IMO.