For what it's worth, I was just learning about the basics of MIRI's research when this came out, and reading it made me less convinced of the value of MIRI's research agenda. That's not necessarily a major problem, since the expected change in belief after encountering a given post should be 0, and I already had a lot of trust in MIRI. However, I found this post by Jessica Taylor vastly clearer and more persuasive (it was written before "Rocket Alignment", but I read "Rocket Alignment" first). In particular, I would ...
Maybe people who rationalized their failure to lose weight by "well, even Eliezer is overweight, it's just metabolic disprivilege"
How many people raised their hands when Eliezer asked about the probability estimate? When I was watching the video I gave a probability estimate of 65%, and I'm genuinely shocked that "not many" people thought he had over a 55% chance. This is Eliezer we're talking about...
I wonder if it negatively impacts the cohesiveness/teamwork ability of the resulting AI safety community by disproportionately attracting a certain type of person? It seems unlikely that everyone would enjoy this style
FWIW you can bet on some of these on PredictIt -- for example, PredictIt assigns only a 47% chance Trump will win in 2020. That's not a huge difference, but still worth betting 5% of your bankroll on (after fees) if you bet half-Kelly. (If you want to bet with me for whatever reason, I'd also be willing to bet up to $700 that Trump doesn't win, at PredictIt odds, if I don't have to tie up capital.)
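For anyone following along, half-Kelly sizing on a binary prediction-market contract can be sketched like this. The 52% belief below is an assumed illustration (the comment doesn't state the commenter's own estimate); note that ~52% vs. a 47% price does come out to roughly 5% of bankroll at half-Kelly.

```python
def half_kelly_fraction(p, q):
    """Half-Kelly bet fraction for buying a binary contract.

    p: your probability that the contract pays out
    q: market price of the contract (0 < q < 1)

    A share bought at $q pays $1 if the event happens, so the net odds
    are b = (1 - q) / q, and full Kelly simplifies to
        f = (p * b - (1 - p)) / b = (p - q) / (1 - q).
    """
    f = (p - q) / (1 - q)
    return max(0.0, f / 2)  # never bet a negative fraction

# Assumed belief of 52% against a 47% market price -> about 4.7% of bankroll.
fraction = half_kelly_fraction(0.52, 0.47)
```

(This ignores PredictIt's fees and withdrawal costs, which the comment notes would shave the edge further.)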
We can test whether the most popular books & music of 2019 sold fewer copies than the most popular books & music of 2009 (I might or might not look into this later)
GDP is 2x higher than in 2000
Why not use per capita real GDP (+25% since 2000)?
I'm thinking that if there were liquid prediction markets for amplifying ESCs, people could code bots to do exactly what John suggests and potentially make money. This suggests to me that there's no principled difference between the two ideas, though I could be missing something (maybe you think the bot is unlikely to beat the market?)
I think I'd feel differently about John's list if it contained things that weren't goodhartable, such as... I don't know, most things are goodhartable. For example, citation density does probably have an impact (not just a correlation) on credence score. But giving truth or credibility points for citations is extremely gameable. A score based on citation density is worthless as soon as it becomes popular because people will do what they would have anyway and throw some citations in on top. Popular authors may not even have to do that t...
Based on the quote from Jessica Taylor, it seems like the FDT agents are trying to maximize their long-term share of the population, rather than their absolute payoffs in a single generation? If I understand the model correctly, that means the FDT agents should try to maximize the ratio of FDT payoff : 9-bot payoff (to maximize the ratio of FDT:9-bot in the next generation). The algebra then shows that they should refuse to submit to 9-bots once the population of FDT agents gets high enough (Wolfram|Alpha link), without needing to drop the random encounter...
What's the difference between John's suggestion and amplifying ESCs with prediction markets? (not rhetorical)
I was somewhat confused by the discussion of LTFF grants being rejected by CEA; is there a public writeup of which grants were rejected?
I don't think there is a public writeup. So here is a quick summary:
In order to do this, the agent needs to be able to reason approximately about the results of their own computations, which is where logical uncertainty comes in
Why does being updateless require thinking through all possibilities in advance? Can you not make a general commitment to follow UDT, but wait until you actually face the decision problem to figure out which specific action UDT recommends taking?
Well, it's been 8 years; how close are ML researchers to a "proto-AGI" with the capabilities listed? (embarrassingly, I have no idea what the answer is)
Apparently an LW user did a series of interviews with AI researchers in 2011, some of which included a similar question. I know most LW users have probably seen this, but I only found it today and thought it was worth flagging here.
What are the competing explanations for high time preference?
A better way to phrase my confusion: How do we know the current time preference is higher than what we would see in a society that was genuinely at peace?
The competing explanations I was thinking of were along the lines of "we instinctively prefer having stuff now to having stuff later"
Yeah, I was implicitly assuming that initiating a successor agent would force Omega to update its predictions about the new agent (and put the $1m in the box). As you say, that's actually not very relevant, because it's a property of a specific decision problem rather than CDT or son-of-CDT.
(I apologize in advance if this is too far afield of the intended purpose of this post)
How does the claim that "group agents require membranes" interact with the widespread support for dramatically reducing or eliminating restrictions on immigration ("open borders" for short) within the EA/LW community? I can think of several possibilities, but I'm not sure which is true:
Would trying to become less confused about commitment races before building a superintelligent AI count as a metaphilosophical approach or a decision theoretic one (or neither)? I'm not sure I understand the dividing line between the two.
if you're interested in anything in particular, I'll be happy to answer.
I very much appreciate the offer! I can't think of anything specific, though; the comments of yours that I find most valuable tend to be "unknown unknowns" that suggest a hypothesis I wouldn't previously have been able to articulate.
Have you written anything like "cousin_it's life advice"? I often find your comments extremely insightful in a way that combines the best of LW ideas with wisdom from other areas, and would love to read more.
The prior probability ratio is 1:99, and the likelihood ratio is 20:1, so the posterior probability is 120:991 = 20:99, so you have probability of 20/(20+99) of having breast cancer.
What does "120:991" mean here?
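For reference, the odds-form arithmetic works out as follows, which suggests "120:991" is "1×20 : 99×1" with the multiplication signs dropped in typesetting:

```python
from fractions import Fraction

# Odds form of Bayes' rule: posterior odds = prior odds x likelihood ratio,
# multiplied term by term.
prior_for, prior_against = 1, 99   # prior odds 1:99
lr_for, lr_against = 20, 1         # likelihood ratio 20:1

post_for = prior_for * lr_for              # 1 x 20 = 20
post_against = prior_against * lr_against  # 99 x 1 = 99

# Convert posterior odds 20:99 to a probability: 20 / (20 + 99) = 20/119.
p = Fraction(post_for, post_for + post_against)
```

This matches the quote's final answer of 20/(20+99), so the intermediate step appears to be a typo rather than a different calculation.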
After thinking about it some more, I don't think this is true.
A concrete example: Let's say there's a CDT paperclip maximizer in an environment with Newcomb-like problems that's deciding between 3 options.
1. Don't hand control to any successor
2. Hand off control to a "LDT about correlations formed after 7am, CDT about correlations formed before 7am" successor
3. Hand off control to an LDT successor.
My understanding is that the CDT agent would take the choice that causes the highest number of paperclips to be created (in ...
That makes sense to me, but unfortunately I'm no closer to understanding the quoted passage. Some specific confusions:
Can someone explain/point me to useful resources to understand the idea of time preference as expressed in this post? In particular, I'm struggling to understand these sentences:
This suggests that near the center time preference has increased to the point where we’re creating scarcity faster than we’re alleviating it, while at the periphery scarcity is still actually being alleviated because there’s enough scarcity to go around, or perhaps marginal areas do not suffer so much from total mobilization.
I also don't understand ...
I think quantitative easing is an example (if I understood the post correctly, which I'm not sure about). By buying up bonds, the central bank is putting more dollars into the economy, which reduces the "amount of stuff produced per dollar", thus creating scarcity (in other words, QE increases aggregate demand). To alleviate this pressure, people make more stuff in order to meet the excess demand (i.e. unemployment rates go down). Forcing the unemployment rate down is the same as "requiring almost everyone to do things"
Maybe the claim that climate scientists are liars? I don't know if it's true, but if I knew it were false I'd definitely downvote the post...
I understand that, but I don't see why #2 is likely to be achievable. Corrigibility seems very similar to Wei Dai's translation example, so it seems like there could be many deceptive actions that humans would intuitively recognize as not corrigible, but which would fool an early-stage LBO tree into assigning a high reward. This seems like it would be a clear example of "giving a behaviour a high reward because it is bad". Unfortunately I can't think of any good examples, so my intuition may simply be mistaken.
Incidentally, it see...
That makes sense; so it's a general method that's applicable whenever the bandwidth is too low for an individual agent to construct the relevant ontology?
plus maybe other properties
That makes sense; I hadn't thought of the possibility that a security failure in the HBO tree might be acceptable in this context. OTOH, if there's an input that corrupts the HBO tree, isn't it possible that the corrupted tree could output a supposed "LBO overseer" that embeds the malicious input and corrupts us when we try to verify it? If the HBO tree is insecure, it seems like a manual process that verifies its output must be insecure as well.
I don't understand the argument that a speed prior wouldn't work: wouldn't the abstract reasoner still have to simulate the aliens in order to know what output to read from the zoo earths? I don't understand how "simulate a zoo earth with a bitstream that is controlled by aliens in a certain way" would ever get a higher prior weight than "simulate an earth that never gets controlled by aliens". Is the idea that each possible zoo earth with simple-to-describe aliens has a relatively similar prior weight to the real earth, so they collectively have a much higher prior weight?
I think it’s likely that these markets would quickly converge to better predictions than existing political prediction markets
Why would you expect this to be true? I (and presumably many others) spend a lot of time researching questions on existing political prediction markets because I can win large sums ($1k+ per question) doing so. I don't see why anyone would have an incentive to put in a similar amount of time to win Internet Points, and as a result I don't see why these markets would outperform existing political prediction markets...
Is there any information on how von Neumann came to believe Catholicism was the correct religion for Pascal's Wager purposes? "My wife is Catholic" doesn't seem like very strong evidence...
How do you ensure that property #3 is satisfied in the early stages of the amplification process? Since no agent in the tree will have context, and the entire system isn't very powerful yet, it seems like there could easily be inputs that would naively generate a high reward "by being bad", which the overseer couldn't detect.
From an epistemic rationality perspective, isn't becoming less aware of your emotions and body a really bad thing? Not only does it give you false beliefs, but "not being in touch with your emotions/body" is already a stereotyped pitfall for a rationalist to fall into...
Is meta-execution HBO, LBO, or a general method that could be implemented with either? (my current credences: 60% LBO, 30% general method, 10% HBO)
How does this address the security issues with HBO? Is the idea that only using the HBO system to construct a "core for reasoning" reduces the chances of failure by exposing it to fewer inputs/using it for less total time? I feel like I'm missing something...
Yep, I misread the page, my mistake
and from my perspective this is a good thing, because it means we've made moral progress as a society.
I know this is off-topic, but I'm curious how you would distinguish between moral progress and "moral going-in-circles" (don't know what the right word is)?
(Keeping in mind that I have nothing to do with the inquiry and can't speak for OP)
Why is it desirable for the inquiry to turn up a representative sample of unpopular beliefs? If that were explicitly the goal, I would agree with you; I'd also agree (?) that questions with that goal shouldn't be allowed. However, I thought the idea was to have some examples of unpopular opinions to use in a separate research study, rather than to directly research what unpopular beliefs LW holds.
If the conclusion of the research turns out to be "here is...
I downvoted because I think the benefit of making stuff like this socially unacceptable on LW is higher than the cost of the OP getting one less response to their survey. The reasons it might be "strong-downvote-worthy had it appeared in most other possible contexts" still apply here, and the costs of replacing it with a less-bad example seem fairly minimal.
I think the US is listed because it's mandatory that we register for the draft
Euthanasia should be a universal right.
This doesn't sound non-normative at all?
My current best-guess answer for what "HCH + annotated functional programming" and no indirection is:
Instead of initializing the tree with the generic question "what should the agent do next", you initialize the tree with the specific question you want an answer for. In the context of IDA, I think (??) this would be a question sampled from the distribution of questions you want the IDA agent to be able to answer well.
Is it fair to say the HCH + AFP part mainly achieves capability amplification, and the indirection part mainly achieves ...
Huh, I thought that all amplification/distillation procedures were intended as a way to approximate HCH, which is itself a tree. Can you not meaningfully discuss "this amplification procedure is like an n-depth approximation of HCH at step x", for any amplification procedure?
For example, the internal structure of the distilled agent described in Christiano's paper is unlikely to look anything like a tree. However, my (potentially incorrect?) impression is that the agent's capabilities at step x are identical to an HCH tree of depth x i...
Huh, what would you recommend I do to reduce my uncertainty around meta-execution (e.g. "read x", "ask about it as a top level question", etc)?
Is this necessarily true? It seems like this describes what Christiano calls "delegation" in his paper, but wouldn't apply to IDA schemes with other capability amplification methods (such as the other examples in the appendix of "Capability Amplification").
Why are you calling this a nitpick? IMO it's a major problem with the post -- I was very unhappy that no mention was made of this obvious problem with the reasoning presented.