Thank you for the details! I've changed my mind about the locus of responsibility, and don't think Wascher seems as directly culpable as before. I haven't updated my heuristic, though; I still think there should be legal consequences for decisions that cause human deaths.
My new guess is that something more like "the airport" should be held accountable and fined some substantial amount of money for the deaths, with the money going to the victims' families.
Having looked into it a little more, I see they were sued substantially over these deaths, so it sounds like that broadly happened.
I liked reading these examples; I wanted to say that it initially seemed to me a mistake not to punish Wascher, whose error led to the deaths of 35 people.
I have a weak heuristic that, when you want to enforce rules, costs and benefits aren't fungible. You do want to reward Wascher's honesty, but I still think that if you accidentally cause 35 people to die, this is evidence that you are bad at your job, and separately it is very important to disincentivize that behavior for others who might be more likely to make that mistake recklessly. There must be a reliable...
I don't think that propaganda must necessarily involve lying. By "propaganda," I mean aggressively spreading information or communication because it is politically convenient / useful for you, regardless of its truth (though propaganda is sometimes untrue, of course).
When a government puts up posters saying "Your country needs YOU", this is intended to evoke a sense of duty and a sense of glory to be had; sometimes this sense of duty is appropriate, but sometimes your country wants you to participate in terrible wars for bad reasons. The government is ...
I'm saying that he presents it as something he believes from his position of expertise and private knowledge, without argument or evidence, because it is exceedingly morally and financially beneficial to him (he gets to make massive amounts of money and not be a moral monster), rather than because he has any evidence for it.
It is similar to a President of a country who has just initiated a war saying "If there's one thing I've learned in my life it's that war is inevitable, and there's just a question of who wins and how...
Can you expand on this? How can you tell the difference, and does it make much of a difference in the end (e.g., if most people get corrupted by power regardless of initial intentions)?
But I don't believe most people get corrupted by power regardless of initial intentions? I don't think Francis Bacon was corrupted by power, I don't think James Watt was corrupted by power, I don't think Stanislav Petrov was corrupted by power, and all of these people had far greater influence over the world than most people who are "corrupted by power".
I'm hearing you'd be ...
Not sure I get your overall position. But I don’t believe all humans are delusional about the most important questions in their lives. See here for an analysis of pressures on people that can cause them to be insane on a topic. I think you can create inverse pressures in yourself, and you can also have no pressures and simply use curiosity and truth-seeking heuristics. It’s not magic to not be delusional. It just requires doing the same sorts of cognition you use to fix a kitchen sink.
Not only would most people be hopelessly lost on these questions (“Should I give up millions-of-dollars-and-personal-glory and then still probably die just because it is morally right to do so?”), they have also picked up something that they cannot put down. These companies have 1,000s of people making millions of dollars, and they will reform in another shape if the current structure is broken apart. If we want to put down what has been picked up more stably, we must use other forces that do not wholly arise from within the companies.
My sense is that most of the people with lots of power are not taking heroic responsibility for the world. I think that Amodei and Altman intend to achieve global power and influence but this is not the same as taking global responsibility. I think, especially for Altman, the desire for power comes first relative to responsibility. My (weak) impression is that Hassabis has less will-to-power than the others, and that Musk has historically been much closer to having responsibility be primary.
I don’t really understand this post as doing something other than ...
I think that Anthropic is doing some neat alignment and control work, but it is also the company most effectively incentivizing people who care about existential risk to sell out, endorse propaganda, silence themselves, and get on board with the financial incentives of massive monetization and capabilities progress. In this way I see it as doing more damage than OpenAI (though OpenAI used to have this mantle pre-Anthropic, while the Amodei siblings were there, with Christiano as a researcher and Karnofsky on the board).
I don't really know the relative numbers, in my mind the uncertainty I have spans orders of magnitude. The numbers are all negative.
I couldn’t get two sentences in without hitting propaganda, so I set it aside. But I’m sure it’s of great political relevance.
I'm a bit out of the loop, I used to think Anthropic was quite different from the other labs and quite in sync with the AI x-risk community.
Do you consider them relatively better? How would you quantify the current AI labs (Anthropic, OpenAI, Google DeepMind, DeepSeek, xAI, Meta AI)?
Suppose that the worst lab has a -100 influence on the future for each $1 they spend. A lab half as bad has a -50 influence on the future for each $1 they spend. A lab that's actually good (by half as much) might have a +50 influence for each $1.
What numbers would you give to...
Quick take: it's focused on interpretability as a way to solve prosaic alignment, ignoring the fact that prosaic alignment is clearly not scalable to the types of systems they are actively planning to build. (And it seems to actively embrace the fact that interpretability is a capabilities advantage in the short term, but pretends that it is a safety thing, as if the two are not at odds with each other when engaged in racing dynamics.)
Key ideas include long timelines, slow takeoff, eventual explosive growth, optimism about alignment, concerns about overregulation, concerns about hawkishness towards China, advocating the likelihood of AI sentience and desirability of AI rights, debating the desirability of different futures, and so on.
Small semantic note: these are not ideas new to Epoch; they are a new package of positions on ideas predominantly originating from the MIRI/LW cluster that you mentioned earlier.
Of note: the AI Alignment Forum content is a mirror of LW content, not distinct. It is a strict subset.
I wrote this because I am increasingly noticing that the rules for "which worlds to keep in mind/optimize" are often quite different from "which worlds my spreadsheets say are the most likely worlds", and that this is in conflict with my heuristics, which would've said "optimize the world-models in your head for being the most accurate ones – the ones that will give you the most accurate answers to most questions" rather than something like "optimize the world-models in your head for being the most useful ones".
(Though the true answer is some more complicat...
Update from chatting with him: he said he was just a freelancer doing a year exclusively with NYT, and he wasn't in a position to write on behalf of the NYT on the issue (e.g. around their deanonymization policies). This wasn't satisfying to me, so I will keep to being off the record.
I occasionally get texts from journalists asking to interview me about things around the aspiring rationalist scene. A few notes on my thinking and protocols for this:
Oops! Then we have taken that feature down for a bit until further testing is done (and the devs have had a little more sleep).
While we always strive to deliver the premium unfinished experience you expect from EA, it seems this bug slipped past our extensive testing. We apologize; a day-one patch is already in development.
(I expect you will see your picoLightcones in the next 30-60 mins.)
Edit: And you should have now gotten them, and any future purchases should go through ~immediately.
I have not gotten them.
An idea I've been thinking about for LessOnline this year is a blogging awards ceremony. The idea is that there's a voting procedure on the blogposts of the year in a bunch of different categories, a shortlist is made, and winners are awarded a prize.
I like opportunities for celebrating things in the online, written, truth-seeking ecosystem. I'm interested in reacts on whether people would be pro something like this happening, and comments on suggestions for how to do it well. (Epistemic status: tentatively excited about this idea.)
Here's my firs...
Same, here's a screenshot. Perhaps Molony is using a third-party web viewer?
Seeing this, I update toward a heuristic of "all Polymarket variation within 4 percentage points is noise".
I think the math works out such that the variation is much more extreme when you get to much more extreme probabilities. Going from 4% to 8% is 2x profits, but going from 50% to 58% is only 1.16x profits.
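Here's a minimal sketch of that arithmetic (treating market prices as probabilities and ignoring fees and spread, which the real numbers would not):

```python
def profit_multiple(buy_price: float, sell_price: float) -> float:
    # Buying "Yes" at buy_price and selling after the market moves to
    # sell_price multiplies your stake by sell_price / buy_price.
    return sell_price / buy_price

print(profit_multiple(0.04, 0.08))  # 2.0  -> a 4-point move at the tail doubles your money
print(profit_multiple(0.50, 0.58))  # 1.16 -> an 8-point move near 50% is only a 16% gain
```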
I tried to invite Iceman to LessOnline, but I suspect he no longer checks the old email associated with that account. If anyone knows up to date contact info, I’d appreciate you intro-ing us or just letting him know we’d love to have him join.
I'll pass, but thanks.
I think my front-end productivity might be up 3x? A shoggoth helped me build a Stripe shop and do a ton of UI design that I would've been hesitant to take on myself (without hiring someone else to work with), as well as increasing the speed at which I churn through front-end designs.
(This is going from “wouldn’t take on the project due to low skill” to “can take it on and deliver it in a reasonable amount of time”, which is different from “takes top programmer and speeds them up 3x”.)
I have a bit of work to do on the scheduling app before sending it around to everyone this year; I'm not certain when I will get to that, but my guess is around 4 weeks from now.
Relatedly: we have finished renovating the final building on our campus, so there will be more rooms for sessions this year than last year.
Am I being an idiot, or does 99%< technically work? Like, it implies that 99% is less than the value, mirroring how <1% means 1% is greater than the value.
I personally have it as a to-do to just build polls.
(React if you want to express that you would likely use this.)
Edited, should be working fine now, thx!
Something a little different: Today I turn 28. If you might be open to doing something nice for me for my birthday, I would like to request the gift of data. I have made a 2-4 minute anonymous survey about me as a person, and if you have a distinct sense of me (even just from reading my LW posts/comments) I would greatly appreciate you filling it out and letting me know how you see me!
It's an anonymous survey where you rate me on lots of attributes like "anxious", "honorable", "wise" and more. All multiple-choice. Two years ago I al...
I'm never sure if it makes sense to add that clause every time I talk about the future.
Curated. These are some more detailed predictions of the future, different from others', and one of the best bear cases I've read.
This feels a bit less timeless than many posts we curate, but my guess is that (a) it'll be quite interesting to re-read this in 2 years, and (b) it makes sense to record good and detailed predictions like this more regularly in the field of AI, which is moving so much faster than most of the rest of the world.
Thanks for this short story! I have so many questions.
Yes, I would be interested in reading another story about your time there. This stor...
Intercom please! It helps for us to have back-and-forth like "What device / operating system / browser?" and other relevant q's.
That sounds good to me, i.e. draft this post, and then make it a comment in one of those places instead (my weak guess is that a quick take is better, but whatever you like).
Posted either as a comment on the seasonal open thread or using the quick takes / shortform feature, which posts it in your shortform (e.g. here is my shortform).
I'm saying that this seems to me not on the level of substance of a post, so it'd be better as a comment of one of the two types above, and also that it's plausible to me you'd get more engagement as a comment in the open thread.
FWIW this feels like it should be a shortform or open-thread comment rather than a post.
I have used my admin powers to put it into a collapsible section so that people who expand this in recent discussion do not have to scroll for 5 seconds to get past it.
Though if the text changes, then it degrades gracefully to just linking to the right webpage, which is the current norm.
I have a general belief that internet epistemic hygiene norms should include that, when you quote someone, not only should you link to the source, but you should link to the highlight of that source. In general, if you highlight text on a webpage and right-click, you can "copy link to highlight", which, when opened, scrolls to and highlights that text. (Random example on Wikipedia.)
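If you want to construct one of these by hand, here's a minimal sketch of the URL format that "copy link to highlight" produces (the scroll-to-text-fragment standard); the page and quoted sentence are just a made-up example:

```python
from urllib.parse import quote

def link_to_highlight(page_url: str, quoted_text: str) -> str:
    # A text-fragment URL is just the page URL plus "#:~:text=" and the
    # percent-encoded quote; supporting browsers scroll to and highlight it.
    return f"{page_url}#:~:text={quote(quoted_text, safe='')}"

# Hypothetical example: link straight to a sentence rather than the whole page.
print(link_to_highlight(
    "https://en.wikipedia.org/wiki/Hyperlink",
    "the text that is linked from is known as anchor text",
))
```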
Further on this theme, archive.is has the interesting feature of constantly altering the URL to point to the currently highlighted bit of text, making this even easier. (Example, a...
I have misgivings about the text-fragment feature as currently implemented. It is at least now a standard, and Firefox implements reading text-fragment URLs (it just doesn't conveniently allow creation without a plugin or something), which was my biggest objection before; but there are still limitations to it which show that much of what the text-fragment 'solution' solves is the self-inflicted problem of many websites being too lazy to provide useful anchor IDs anywhere in the page. (I don't know how often I go to link a section of a blog post, w...
"Copy link to highlight" is not available in Firefox. And while e.g. Bing search seems to automatically generate these "#:~:text=" links, I find they don't work with any degree of consistency. And they're even more affected by link rot than usual, since any change to the initial text (like a typo fix) will break that part of the link.
The point that "small protests are the only way to get big protests" may be directionally accurate, but I want to note that there have been large protests that happened without that. Here's a shoggoth listing a bunch, including the 1989 Tiananmen Square Protests, the 2019 Hong Kong Anti-Extradition Protests, the 2020 George Floyd Protests, and more.
The shoggoth says spontaneous large protests tend to be in response to triggering events and do rely on pre-existing movements that are ready to mobilize, the latter of which your work is helping build.
I want to contrast two perspectives on human epistemology I've been thinking about for over a year.
There's one school of thought about how to do reasoning about the future which is about naming a bunch of variables, putting probability distributions over them, multiplying them together, and doing Bayesian updates when you get new evidence. This lets you assign probabilities, and to lots of outcomes at once. "What probability do I assign that the S&P goes down, and the Ukraine/Russia war continues, and I find a new romantic partner?" I'll call this the "sp...
Further detail on this: Cotra has more recently updated at least 5x against her original 2020 model in the direction of faster timelines.
Greenblatt writes:
Here are my predictions for this outcome:
- 25th percentile: 2 year (Jan 2027)
- 50th percentile: 5 year (Jan 2030)
Cotra replies:
My timelines are now roughly similar on the object level (maybe a year slower for 25th and 1-2 years slower for 50th)
This means 25th percentile for 2028 and 50th percentile for 2031-2.
The original 2020 model assigns 5.23% by 2028, and 9.13% and 10.64% by 2031 and 2032 respectively. Each t...
Note that the capability milestone forecasted in the linked short form is substantially weaker than the notion of transformative AI in the 2020 model. (It was defined as AI with an effect at least as large as the industrial revolution.)
I don't expect this adds many years; for me it adds ~2 years to my median.
(Note that my median for time from 10x to this milestone is lower than 2 years, but median to Y isn't equal to median to X + median from X to Y.)
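To illustrate that last point, here's a minimal sketch with made-up right-skewed distributions (lognormals; none of these numbers are anyone's actual forecasts):

```python
import numpy as np

rng = np.random.default_rng(0)

# X ~ years until the 10x milestone, Y ~ additional years from 10x to the
# stronger milestone. Lognormals chosen only because timelines are right-skewed.
x = rng.lognormal(mean=0.5, sigma=0.8, size=1_000_000)
y = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)

print("median(X) + median(Y):", np.median(x) + np.median(y))
print("median(X + Y):        ", np.median(x + y))
# For right-skewed distributions the second number comes out larger, i.e. the
# median time to Y is not the median time to X plus the median from X to Y.
```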
High expectation of x-risk and having lots to work on is why I have not been signed up for cryonics personally. I don't think it's a bad idea, but it has never risen up my personal stack of things worth spending tens of hours on.
I agree that the update was correct. But you didn't state a concrete action to take?
I disagree, but FWIW, I do think it's good to help existing, good contributors understand why they got the karma they did. I think your comment here is an example of that, which I think is prosocial.
FWIW in my mind I was comparing this to things like Glen Weyl's Why I Am Not a Technocrat, and thought this was much better. (Related: Scott Alexander's response, Weyl's counter-response).
I wrote that this "is the best sociological account of the AI x-risk reduction efforts of the last ~decade that I've seen." The line has some disagree reacts inline; I expect this is primarily an expression that the disagree-ers have a low quality assessment of the article, but I would be curious to see links to any other articles or posts that attempt something similar to this one, in order to compare whether they do better/worse/different. I actually can't easily think of any (which is why I felt it was not that bold to say this was the best).
Edit: I've expanded the opening paragraph so that my comment isn't mistaken for agreement with the object-level assessment of the article.
I'm not particularly resolute on this question. But I get this sense when I look at (a) the best agent foundations work that's happened over ~10 years of work on the matter, and (b) the work output of scaling up the number of people working on 'alignment' by ~100x.
For the first, trying to get a better understanding of the basic concepts like logical induction and corrigibility and low-impact and ontological updates, while I feel like there's been progress (in timeless decision theory taking a clear step forward in figuring out how to think about decision-makers ...
I don't share the feeling that not enough of relevance has happened over the last ten years for us to seem on track for solving it in a hundred years, if the world's technology[1] were magically frozen in time.
Some more insights from the past ten years that look to me like they're plausibly nascent steps in building up a science of intelligence and maybe later, alignment:
Oops, I didn't send my reply comment. I've just posted it; yes, that information did change my mind about this case.