All of Ruby's Comments + Replies

Ruby61

For me, S2-explicitly, I can't justify being quite that confident (maybe 90-95%), but emotionally 9:1 odds feels very much like "that's what's happening".
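(For readers less fluent in odds notation, here is the standard conversion between odds and probability. This is a generic identity added for reference, not something from the original exchange.)

```latex
p = \frac{a}{a+b} \ \text{for odds of } a\!:\!b,
\qquad 9\!:\!1 \;\Rightarrow\; p = 0.9 \ (90\%),
\qquad 95\% \;\Rightarrow\; 19\!:\!1 \ \text{odds}.
```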

Ruby20

I'm just wondering if we were ever sufficiently positively justified to anticipate a good future, or if we were just uncertain about the future and then projected our hopes and dreams onto this uncertainty, regardless of how realistic that was.

I think that's a very reasonable question to be asking. My answer is I think it was justified, but not obvious.

My understanding is that it wasn't taken for granted that we had a way to get more progress from simply more compute until the deep learning revolution, and even then people updated on specific additional data points... (read more)

3MondSemmel
One more virtue-turned-vice for my original comment: pacifism and disarmament: the world would be a more dangerous place if more countries had more nukes etc., and we might well have had a global nuclear war by now. But also, more war means more institutional turnover, and the destruction and reestablishment of institutions is about the only mechanism of institutional reform which actually works. Furthermore, if any country could threaten war or MAD against AI development, that might be one of the few things that could possibly actually enforce an AI Stop.
Ruby20

So gotta keep in mind that probabilities are in your head (I flip a coin, it's already tails or heads in reality, but your credence should still be 50-50). I think it can be the case that we were always doomed even if we weren't yet justified in believing that.

Alternatively, it feels like this pushes up against philosophies of determinism and free will. The whole "well, the algorithm is a written program and it'll choose what it chooses deterministically" but also from the inside there are choices.

I think a reason to have been uncertain before and update more n... (read more)

3MondSemmel
I know that probabilities are in the map, not in the territory. I'm just wondering if we were ever sufficiently positively justified to anticipate a good future, or if we were just uncertain about the future and then projected our hopes and dreams onto this uncertainty, regardless of how realistic that was. In particular, the Glorious Transhumanist Future requires the same technological progress that can result in technological extinction, so I question whether the former should've ever been seen as the more likely or default outcome.

I've also wondered about how to think about doom vs. determinism. A related thorny philosophical issue is anthropics: I was born in 1988, so from my perspective the world couldn't have possibly ended before then, but that's no defense whatsoever against extinction after that point.

Re: AI timelines, again this is obviously speaking from hindsight, but I now find it hard to imagine how there could've ever been 50-year timelines. Maybe specific AI advances could've come a bunch of years later, but conversely, compute progress followed Moore's Law and IIRC had no sign of slowing down, because compute is universally economically useful. And so even if algorithmic advances had been slower, compute progress could've made up for that to some extent.

Re: solving coordination problems: some of these just feel way too intractable. Take the US constitution, which governs your political system: IIRC it was meant to be frequently updated in constitutional conventions, but instead the political system ossified and the last meaningful amendment (18-year voting age) was ratified in 1971, or 54 years ago. Or, the US Senate made itself increasingly ungovernable with the filibuster, and even the current Republican-majority Senate didn't deign to abolish it. Etc. Our political institutions lack automatic repair mechanisms, so they inevitably deteriorate over time, when what we needed was for them to improve over time instead.
Ruby20

But since the number is subjective, living your life like you know you are right is certainly wrong

 

I don't think this makes sense. Suppose you have a subjective belief that a vial of tasty fluid is 90% likely to be lethal poison; you're going to act in accordance with that belief. Now if other people think differently from you, and you think they might be right, maybe you adjust your final subjective probability to something else, but at the end of the day it's yours. That it's subjective doesn't rule out its being pretty extreme.
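To make the "acting on a subjective 90%" point concrete, here is a minimal expected-value sketch; the utilities are made-up numbers for illustration, not anything from the comment:

```python
# Expected-utility sketch with illustrative (made-up) numbers.
# A purely subjective 90% credence that the vial is lethal still
# pins down the action, even though the probability lives "in the map".

p_poison = 0.9          # subjective credence that the fluid is lethal
u_tasty = 1.0           # small payoff if it's merely a tasty drink
u_dead = -1_000_000.0   # catastrophic payoff if it's poison
u_abstain = 0.0         # baseline: don't drink

ev_drink = p_poison * u_dead + (1 - p_poison) * u_tasty
print(f"EV(drink)   = {ev_drink:,.1f}")   # hugely negative
print(f"EV(abstain) = {u_abstain:,.1f}")  # 0.0, so you abstain
# The number being subjective doesn't make acting on it optional;
# adjusting for others' views just changes p_poison before you act.
```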

If what you mean is you can't be that confident given disagreement, I dunno, I wish I could have that much faith in people.

7mattmacdermott
In another way, being that confident despite disagreement requires faith in people — yourself and the others who agree with you. I think one reason I have a much lower p(doom) than some people is that although I think the AI safety community is great, I don’t have that much more faith in its epistemics than everyone else’s.
6Richard Korzekwa
Another victory for trend extrapolation!
Ruby20

"Serendipity" is a term I've been seen used for this, possibly was Venkatesh Rao.

Ruby42

Curated. The wiki pages collected here, despite being written in 2015-2017, remain excellent resources on concepts and arguments for key AI alignment ideas (both those still widely used and those lesser known). I found that even for concepts/arguments like the orthogonality thesis and corrigibility, I felt a gain in crispness from reading these pages. The concept of, e.g., epistemic and instrumental efficiency was one I didn't have, yet it feels useful in thinking about the rise of increasingly powerful AI.

Of course, there's also non-AI content that got imported. The Bayes guide likely remains the best resource for building Bayes intuition, and the same goes for the extremely thorough guide on logarithms.

Ruby42

I think the guide should be 10x more prominent in this post.

2Screwtape
10x is a bit tricky in a small post like this, but I agree with the direction. Thank you for the nudge. I broke the guide out into its own line and bolded it, does that seem better?
Ruby20

You should see the option when you click on the triple dot menu (next to the Like button).

2Algon
Thanks! Clicking on the triple dots didn't display any options when I posted this comment. But they do now. IDK what went wrong.
Ruby20

So the nice thing about karma is that if someone thinks a wikitag is worthy of attention for any reason (article, tagged posts, importance of concept), they're able to upvote it and make it appear higher.

Much of the current karma comes from Ben Pace and me, who did a pass. Rationality Quotes didn't strike me as a page I particularly wanted to boost up the list, but if you disagree with me you're able to Like it.

In general, I don't think having a lot of tagged posts should mean a wikitag should be ranked highly. It's a consideration, but I like it flowing via peop... (read more)

Ruby30

Interesting. Doesn't replicate for me. What phone are you using?

2Warty
android phone with google chrome (related phenomena can be observed by scrolling to the 16-17 boundary and lowering the browser window width)
Answer by Ruby197

It's a compass rose, thematic with the Map and Territory metaphor for rationality/truthseeking.

The real question is why does NATO have our logo. 

6quetzal_rainbow
This is LGBTESCREAL agenda
1KvmanThinking
Ah. Thanks! (by the way, when these questions get answered, should I take them down or leave them up for others?)
Ruby179

Curated!  I like this post for the object-level interestingness of the cited papers, but also for pulling in some interesting models from elsewhere and generally reminding us that this is something we can do.

In times of yore, LessWrong venerated the neglected virtue of scholarship. And well, sometimes it feels like it's still neglected. It's tough because indeed many domains have a lot of low quality work, especially outside of hard sciences, but I'd wager on there being a fair amount worth reading, and appreciate Buck pointing at a domain where that seems to be the case.

Ruby20

Was there the text of the post in the email or just a link to it?

1DusanDNesic
Oh, I didn't notice, but yeah, just a link to it, not the whole text!
Ruby50

Curated. I was reluctant to curate this post because I found myself bouncing off it somewhat due to length – I guess in pedagogy there's a tradeoff between explaining at length (you convey enough info but lose people) and keeping it brief (people read it but don't get enough). Based on a private convo, Raemon thinks the length is warranted.

I'm curating because I do think this kind of project is valuable. Every day it feels easier to lose our minds entirely to AI, and I think it's important to remember we can think better or worse, and we should be tryin... (read more)

2Domenic
It's interesting to compare this to the other curated posts I got in my inbox over the last week, What is malevolence? and How will we update about scheming. Both of those (especially the former) I bounced off of due to length. But this one I stuck with for quite a while, before I started skimming in the worksheet section. I think the instinct to apply a length filter before sending a post to many peoples' inboxes is a good one. I just wish it were more consistently applied :)
Ruby246

This doesn't seem right. Suppose there are two main candidates for how to get there, I-5 and J-6 (but who knows, maybe we'll be surprised by a K-7) and I don't know which Alice will choose. Suppose I know there's already a Very General Helper and Kinda Decent Generalizer, then I might say "I assign a 65% chance that Alice is going to choose the I-5 and will try to contribute having conditioned on that". This seems like a reasonable thing to do. It might be for naught, but I'd guess in many cases the EV of something definitely helpful if we go down Route A is ... (read more)

Yup, if you actually have enough knowledge to narrow it down to e.g. a 65% chance of one particular major route, then you're good. The challenging case is when you have no idea what the options even are for the major route, and the possibility space is huge.
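A toy version of the tradeoff in this exchange, with entirely made-up numbers, just to show where the 65% figure does and doesn't carry the argument:

```python
# Toy EV comparison (illustrative numbers only): specialize on the route
# you think is most likely vs. do route-agnostic ("generally helpful") work.

p_route = 0.65              # credence Alice takes the I-5-style route
v_specialized = 10.0        # value of route-specific work if that route is taken
v_generic = 4.0             # value of generic work regardless of route

ev_specialized = p_route * v_specialized   # worthless if another route is taken
ev_generic = v_generic

print(ev_specialized, ev_generic)  # 6.5 vs 4.0 -> conditioning wins here
# If the possibility space is huge, p_route collapses toward 0 and the
# generic option dominates -- which is the worry in the reply above.
```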

Ruby*1311

Edit: we are not going to technically curate this post, since it's an EA Forum crosspost and, for boring technical reasons, that breaks the curation email. I will leave this notice up though.

Curated. This piece definitely got me thinking. If we grant that some people are unusually altruistic, empathetic, etc., it stands to reason that there are others on the other end of various distributions. And then we should also expect various selection effects on where they end up.

It was definitely a puzzle piece clicking for me that these traits can coexist with [genui... (read more)

1DusanDNesic
I did somehow get this in my email, so it is curated?
Ruby50

Welcome! Don't be too worried, you can try posting some stuff and see how it's received. Based on how you wrote this comment, I think you won't have much trouble. The New User Guide and other stuff gets worded a bit sternly because of the people who tend not to put in much effort at all and expect to be well received – which doesn't sound like you at all. It's hard to write one document that's stern to those who need it and more welcoming to those who need that, unfortunately.

Ruby2110

Curated!  It strikes me that asking "how would I update in response to...?" is both a sensible and straightforward thing to be asking, and yet not a form of question I'm seeing. I think we could be asking the same about slow vs fast takeoff and similar questions.

The value and necessity of this question also isn't just about not waiting for future evidence to come in, but realizing that "negative results" require interpretation too. I also think there's a nice degree of "preregistration" here as well that seems neat and maybe virtuous. Kudos and thank you.

Answer by Ruby60

I'm curious why the section on "Applying Rationality" in the About page you cited doesn't feel like an answer.

Applying Rationality

You might value Rationality for its own sake, however, many people want to be better reasoners so they can have more accurate beliefs about topics they care about, and make better decisions.

Using LessWrong-style reasoning, contributors to LessWrong have written essays on an immense variety of topics on LessWrong, each time approaching the topic with a desire to know what's actually true (not just what's convenient or pleasant to

... (read more)
0ashen8461
Thank you for your response. On reflection, I realize my original question was unclear. At its core is an intuition about the limits of critical thinking for the average person. If this intuition is valid, I believe some members of the community should, rationally, behave differently. While this kind of perspective doesn't seem uncommon, I feel its implications may not be fully considered. I also didn’t realize how much this intuition influenced my thinking when writing the question. My thoughts on this are still unclear, and I remain uncertain about some of the underlying assumptions, so I won’t argue for it here. Apologies for the confusion. I no longer endorse my question.
Ruby42

Errors are my own

At first blush, I find this caveat amusing.

1. If there are errors, we can infer that those providing feedback were unable to identify them.
2. If the author was fallible enough to have made errors, perhaps they are fallible enough to miss errors in input sourced from others.

What purpose does it serve? Given it's often paired with "credit goes to... <list of names>", it seems like an attempt to ensure that people providing feedback/input to a post are only exposed to upside from doing so, and the author takes all the downside reputation risk if t... (read more)

4Viliam
"Credit goes to A, B, and C. The errors are probably theirs too, and my error was to trust them." Better? :D
8gwillen
I think this makes sense as a reminder of a thing that is true anyway, as you somewhat already said; but also consider situations like:
* A given reviewer was only reviewing for substance, and the error is stylistic, or vice versa;
* A given reviewer was only reviewing for a subset of the subject matter;
* A given reviewer was reviewing an early draft, and an error was introduced in a later draft.
In general a given reviewer will not necessarily have a real opportunity to catch any particular error, and usually a reader won't have enough context to determine whether they did or didn't. The author by contrast always bears responsibility for errors.

I think the point of the caveat is that it is polite to thank people who helped, but putting someone's name on something implies they bear responsibility for it, and so the disclaimer is meant to keep the acknowledgement from being double-edged in an inappropriate way. Someone familiar with the writing and editing process will already in theory know all these things; someone who is not familiar maybe won't be. But ultimately I see it as kind of a phatic courtesy which merely alludes to all this.
Ruby169

This post is comprehensive, but I think "safetywashing" and "AGI is inherently risky" are placed far too near the end and get too little treatment, as I think they're the most significant reasons against.

This post also makes no mention of race dynamics and how contributing to them might outweigh the rest, and as RyanCarey says elsethread, doesn't talk about other temptations and biases that push people towards working at labs and would apply even if it was on net bad.

Ruby7-7

Curated. Insurance is a routine part of life, whether it be the car and home insurance we necessarily buy, the Amazon-offered protection one reflexively declines, the insurance we know doctors must have, businesses must have, and so on.

So it's pretty neat when someone comes along and (compellingly) says "hey guys, you (or at least most people) are wrong about when insurance makes sense to buy, the reasons you have are wrong, here's the formula".

While assumptions can be questioned, e.g. the infinite badness of going bankrupt and other fact... (read more)
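For readers wondering what "here's the formula" might look like in practice, below is a sketch of the expected-log-wealth (Kelly-style) comparison commonly used for this kind of decision. Treat the numbers, and the framing itself, as illustrative rather than as the curated post's exact formula:

```python
import math

# Kelly-style sketch: buy insurance iff it raises expected log wealth.
# All numbers are illustrative, not from the curated post.

wealth = 100_000.0   # current wealth
premium = 1_500.0    # cost of the policy
loss = 90_000.0      # size of the insurable loss
p_loss = 0.01        # probability the loss occurs this period

def expected_log_wealth(insured: bool) -> float:
    if insured:
        return math.log(wealth - premium)          # premium paid, loss covered
    return (p_loss * math.log(wealth - loss)
            + (1 - p_loss) * math.log(wealth))

print(expected_log_wealth(True) > expected_log_wealth(False))  # True here:
# the premium (1,500) exceeds the expected loss (900), yet buying still wins
# under log utility because the uninsured loss is so large relative to wealth.
```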

6Davidmanheim
The math is good, the point is useful, the explanations are fine, but the embrace of a straw-Vulcan version of rationality, dismissing any notion of people legitimately wanting things other than money, seems really quite bad, which leaves me wishing this wasn't being highlighted for visitors to the site.
Ruby146

Curated. This is a good post and in some ways ambitious as it tries to make two different but related points. One point – that AIs are going to increasingly commit shenanigans – is in the title. The other is a point regarding the recurring patterns of discussion whenever AIs are reported to have committed shenanigans. I reckon those patterns are going to be tough to beat, as strong forces (e.g. strong pre-existing conviction) cause people to take up the stances they do, but if there's hope for doing better, I think it comes from understanding the patterns.... (read more)

Ruby20

Yes, true, fixed, thanks!

Ruby20

Dog: "Oh ho ho, I've played imaginary fetch before, don't you worry."

Ruby113

My regular policy is to not frontpage newsletters, however I frontpaged this one as it's the first in the series and I think it's neat for more people to know this is a series Zvi intends to write.

Ruby51

Curated! I think it's generally great when people explain what they're doing and why in a way legible to those not working on it. Great because it lets others potentially get involved, build on it, expose flaws or omissions, etc. This one seems particularly clear and well written. While I haven't read all of the research, nor am I particularly qualified to comment on it, I like that there's a principled/systematic approach behind it, in comparison to a lot of work that isn't coming from a deeper, bigger framework.

(While I'm here though, I'll add a link to Dmitr... (read more)

Ruby*82

Curated. I really like that even though LessWrong is 1.5 decades old now and has Bayesianism assumed as a background paradigm while people discuss everything else, nonetheless we can have good exploration of our fundamental epistemological beliefs.

The descriptions of unsolved problems, or at least of the incompleteness of Bayesianism, strike me as technically correct. Like others, I'm not convinced of Richard's favored approach, but it's interesting. In practice, I don't think these problems undermine the use of Bayesianism in typical LessWrong thought. For example... (read more)

Ruby50

Welcome! Sounds like you're on the one hand at the start of a significant journey but also you've come a long distance already. I hope you find much helpful stuff on LessWrong.

I hadn't heard of Daniel Schmachtenberger, but I'm glad to have learned of him and his work. Thanks.

6MalcolmOcean
Daniel Schmachtenberger has lots of great stuff. Two pieces I recommend:
1. this article Higher Dimensional Thinking, the End of Paradox, and a More Adequate Understanding of Reality, which is about how just because two people disagree doesn't mean either is wrong
2. this Stoa video Converting Moloch from Sith to Jedi w/ Daniel Schmachtenberger, which is about races-to-the-bottom eating themselves

Also hi, welcome Sage!  I dig the energy you're coming from here.
1Sage
Thank you! I hope I do, yes; I am still learning how the forum works :) And you are welcome as well.
Ruby2-1

The actual reason why we lied in the second message was "we were in a rush and forgot." 

My recollection is we sent the same message to the majority group because:

  1. Treating it differently would have required special-casing it, and that would have taken more effort.
  2. If selectors of different virtues had received different messages, we wouldn't have been able to properly compare their behavior.
  3. [At least in my mind], this was a game/test and when playing games you lie to people in the context of the game to make things work. Alternatively, it's like how scientific experimenters mislead subjects for the sake of the study.
2Raemon
Yeah I find the ‘you want to keep the message consistent for Science’ argument convincing (but think it’s good to still stick with the most reasonable interpretation of what our word was that we can, unless we have a specific reason not to that a reasonable number of nonteammates agree makes sense.)
Ruby40

Money helps. I could probably buy a lot of dignity points for a billion dollars. With a trillion, variance definitely goes up because you could try crazy stuff that could backfire (true for a billion too). But the EV of such a world is better.

I don't think there's anything that's as simple as writing a check though.

US Congress gives money to specific things. I do not have a specific plan for a trillion dollars.

I'd bet against Terence Tao being some kind of amazing breakthrough researcher who changes the playing field.

Ruby20

Your access should be activated within 5-10 minutes. Look for the button in the bottom right of the screen.

Ruby40

Not an original observation but yeah, separate from whether it's desirable, I think we need to be planning for it.

Ruby569

Just thinking through simple stuff for myself, very rough, posting in the spirit of quick takes
 

  1. At present, we are making progress on the Technical Alignment Problem[2] and like probably could solve it within 50 years.

  2. Humanity is on track to build ~lethal superpowerful AI in more like 5-15 years.
  3. Working on technical alignment (direct or meta) only matters if we can speed up overall progress by 10x (or some lesser factor if AI capabilities are delayed from their current trajectory; rough arithmetic sketched after this list). Improvements of 2x are not likely to get us to an adequate technical
... (read more)
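Spelling out the arithmetic implicit in point 3 (a back-of-envelope reading of points 1 and 2, not a precise claim):

```latex
\text{required speedup} \;\approx\; \frac{\text{time to solve alignment at current pace}}{\text{time until superpowerful AI}}
\;\approx\; \frac{50\ \text{years}}{5\text{--}15\ \text{years}} \;\approx\; 3\times \text{ to } 10\times
```

Which is why a 2x improvement falls short on the aggressive end of the timeline range.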
5davekasten
I think this can be true, but I don't think it needs to be true: I suspect that if the government is running the at-all-costs-top-national-priority Project, you will see some regulations stop being enforceable.  However, we also live in a world where you can easily find many instances of government officials complaining in their memoirs that laws and regulations prevented them from being able to go as fast or as freely as they'd want on top-priority national security issues.  (For example, DoD officials even after 9-11 famously complained that "the lawyers" restricted them too much on top-priority counterterrorism stuff.)  
9Cole Wyeth
Though I am working on technical alignment (and perhaps because I know it is hard) I think the most promising route may be to increase human and institutional rationality and coordination ability. This may be more tractable than "expected" with modern theory and tools. Also, I don't think we are on track to solve technical alignment in 50 years without intelligence augmentation in some form, at least not to the point where we could get it right on a "first critical try" if such a thing occurs. I am not even sure there is a simple and rigorous technical solution that looks like something I actually want, though there is probably a decent engineering solution out there somewhere.
4Lao Mein
I have a few questions.
1. Can you save the world in time without a slowdown in AI development if you had a billion dollars?
2. Can you do it with a trillion dollars?
3. If so, why aren't you trying to ask the US Congress for a trillion dollars?
4. If it's about a lack of talent, do you think Terence Tao can make significant progress on AI alignment if he actually tried?
5. Do you think he would be willing to work on AI alignment if you offered him a trillion dollars?
4jmh
I really like the observation in your Further Thoughts point. I do think that is a problem people need to look at, as I would guess many will view the government involvement from an acting-in-the-public-interest view rather than from either a self-interest view (as problematic as that might be when the players keep changing) or a special interest/public choice perspective. Probably some great historical analysis has already been written about events in the past that might serve as indicators of the pros and cons here. Any historians in the group here?

"Cyborgism or AI-assisted research that gets up 5x speedups but applies differentially to technical alignment research"

How do you make meaningful progress and ensure it does not speed up capabilities?

It seems unlikely that a technique exists that is exclusively useful for alignment research and can't be tweaked to help OpenMind develop better optimization algorithms etc.

Noosphere89*18-11

I basically agree with this:

People who want to speed up AI will use falsehoods and bad logic to muddy the waters, and many people won’t be able to see through it 

In other words, there’s going to be an epistemic war and the other side is going to fight dirty, I think even a lot of clear evidence will have a hard time against people’s motivations/incentives and bad arguments.

But I'd be more pessimistic than that, in that I honestly think pretty much every side will fight quite dirty in order to gain power over AI, and we already have seen examples of st... (read more)

Ruby184

The “Deferred and Temporary Stopping” Paradigm

Quickly written. Probably missed where people are already saying the same thing.

I actually feel like there’s a lot of policy and research effort aimed at slowing down the development of powerful AI–basically all the evals and responsible scaling policy stuff.

A story for why this is the AI safety paradigm we’ve ended up in is because it’s palatable. It’s palatable because it doesn’t actually require that you stop. Certainly, it doesn’t right now. To the extent companies (or governments) are on board, it’s becaus... (read more)

Ruby70

Curated. I think Raemon's been doing a lot of work in the last year pushing this stuff, and this post pulls together in one place a lot of good ideas/advice/approach.

I would guess that because of the slow or absent feedback loops, people don't realize how bad human reasoning and decision-making is when operating outside of the familiar and without quick feedback. That's true in many domains, but certainly of the whole AI situation. Ray is going after the hard stuff here.

And at the same time, this stuff ends up feeling like the "eat your vegetables" of reasoning and decision-mak... (read more)

Ruby41

Yeah, I think a question is whether I want to say "that kind of wireheading isn't myopic" vs "that isn't wireheading". Probably fine either way if you're consistent / taboo adequately.

Ruby20

My guess is Ben created the event while on the East Coast and 6pm got timezone converted for West Coast. I've fixed it.

Ruby20

Once I'm rambling, I'll note another thought I've been mulling over:

My notion of value is not the same as the value that my mind was optimized to pursue. Meaning that I ought to be wary that typical human thought patterns might not be serving me maximally.

That's of course on top of the fact that evolution's design is flawed even by its own goals; humans rationalize left, right, and center, are awfully myopic, and we'll likely all die because of it.

Ruby102

There's an age-old tension between ~"contentment" and ~"striving" with no universally accepted compelling resolution, even if many people feel they have figured it out. Related:

In my own thinking, I've been trying to ground things out in a raw consequentialism that one's cognition (including emotions) is just supposed to take you towards more value (boring, but reality is allowed to be)[1].

I fear that a lot of what people do is ~"wireheading". The problem with wireheading is it's myopic. You feel good now (small amount of value) at the expense of greater v... (read more)

7tailcalled
I think gratitude also has value in letting you recognize what is worth maintaining and what has historically shown itself to have lots of opportunities and therefore in the future may have opportunities too.
1ABlue
I don't think wireheading is "myopic" when it overlaps with self-maintenance. Classic example would be painkillers; they do ~nothing but make you "feel good now" (or at least less bad), but sometimes feeling less bad is necessary to function properly and achieve long-term value. I think that gratitude journaling is also part of this overlap area. That said I don't know many peoples' experiences with it so maybe it's more prone to "abuse" than I expect.