AidanGoth — LessWrong


Replying to All AGI Safety questions welcome (especially basic ones) [May 2023]

I thought it dealt with these ok -- could you be more specific?

It's linear because it's an expectation. It is under-specified in that it needs us to assume or prove the marginal distributions for the $X_{i}$, and I guess that's problematic if an algorithm for doing that is a big part of what the authors are looking for. But if we do have marginal distributions for each $X_{i}$, then $E(X_{i}^{2})$, $E(X_{i}^{\prime 2})$ and $E(X_{i}^{\prime 2} \mid \pi^{\prime})$ are well-defined and $\tilde{E}\left(\sum_{i=1}^{n} X_{i}^{2} \mid \pi\right) = \sum_{i=1}^{n} E(X_{i}^{\prime 2} \mid \pi^{\prime})$.

Replying to All AGI Safety questions welcome (especially basic ones) [May 2023]

AidanGoth · 3y

This question is in the spirit of "I think I'm doing something dumb / obviously wrong -- help me see why" but it's maybe too niche for this thread. (Answers that redirect me to a better place to ask are welcome.)

I recently read Paul Christiano, Eric Neyman and Mark Xu's "Formalizing the presumption of independence" (https://arxiv.org/pdf/2211.06738.pdf). My understanding is that they aim to formalise some types of reasonable (but defeasible) “hand-waving” in otherwise formal proofs, in a way that maintains the underlying deductive structure of a formal proof and responds appropriately to new information / arguments. They're particularly interested in heuristic estimators that presume the independence of random variables so long as...

Can we learn much by studying the behaviour of RL policies?

AidanGoth

Economists sometimes study revealed preferences, which are preferences that we can infer from choices, e.g. when given the choice between an apple and an orange, if I choose an apple, then I have revealed a preference for an apple over an orange. I'm wondering about the revealed preferences of RL policies (applying behavioural econ / experimental econ to RL policies). We can elicit revealed preferences from RL policies by observing their actions following various histories and we can see whether the revealed preferences satisfy various decision-theoretic axioms.
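To make the idea of checking an axiom concrete, here is a minimal sketch in Python. It assumes each observation is a (menu, chosen option) pair, which for an RL policy could stand for the actions available after some history and the action the policy took. The function names and the fruit data are hypothetical and chosen purely for illustration. The check looks for strict preference cycles, which is one simple way revealed preferences can fail transitivity.

```python
from itertools import permutations


def revealed_preferences(observations):
    """Turn (menu, chosen) observations into a set of (preferred, rejected) pairs."""
    prefs = set()
    for menu, chosen in observations:
        for option in menu:
            if option != chosen:
                prefs.add((chosen, option))
    return prefs


def preference_cycles(prefs):
    """Return every ordered triple (a, b, c) with a over b, b over c and c over a."""
    items = sorted({x for pair in prefs for x in pair})
    return [
        (a, b, c)
        for a, b, c in permutations(items, 3)
        if (a, b) in prefs and (b, c) in prefs and (c, a) in prefs
    ]


# Hypothetical observations: each entry is (options offered, option chosen).
cyclic = [
    (("apple", "orange"), "apple"),
    (("orange", "pear"), "orange"),
    (("apple", "pear"), "pear"),
]
consistent = [
    (("apple", "orange"), "apple"),
    (("orange", "pear"), "orange"),
    (("apple", "pear"), "apple"),
]

cyclic_result = preference_cycles(revealed_preferences(cyclic))
consistent_result = preference_cycles(revealed_preferences(consistent))
print("cycles in first data set:", cyclic_result)
print("cycles in second data set:", consistent_result)
```

Each cycle is reported once per rotation, so a single three-way cycle appears three times. A real experiment would also need to handle stochastic policies and indifference, which this sketch ignores.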

Revealed preferences don’t tell us anything about the inner workings of an agent but they can tell us whether or not an agent is acting as...

Forecasting extreme outcomes

AidanGoth

This document explores and develops methods for forecasting extreme outcomes, such as the maximum of a sample of n independent and identically distributed random variables. I was inspired to write this by Jaime Sevilla’s recent post with research ideas in forecasting and, in particular, his suggestion to write an accessible introduction to the Fisher–Tippett–Gnedenko Theorem.

I’m very grateful to Jaime Sevilla for proposing this idea and for providing great feedback on a draft of this document.

Summary

The Fisher–Tippett–Gnedenko Theorem is similar to a central limit theorem, but for the maximum of random variables. Whereas central limit theorems tell us about what happens on average, the Fisher–Tippett–Gnedenko Theorem tells us what happens in extreme cases. This makes it especially useful...
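As a rough illustration of the kind of limit the theorem describes, the sketch below simulates one standard special case: the maximum of n i.i.d. Exp(1) variables, shifted by ln n, is approximately standard Gumbel with CDF exp(-exp(-x)). This example is not taken from the post itself. The sample sizes, seed and function names are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def normalised_max_exponential(n, rng):
    """Draw n i.i.d. Exp(1) variables and return their maximum minus ln(n)."""
    return max(rng.expovariate(1.0) for _ in range(n)) - math.log(n)


def gumbel_cdf(x):
    """CDF of the standard Gumbel distribution."""
    return math.exp(-math.exp(-x))


def empirical_cdf(samples, x):
    """Fraction of samples that are at most x."""
    return sum(s <= x for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [normalised_max_exponential(500, rng) for _ in range(2000)]

# Compare the simulated distribution of the shifted maximum with the Gumbel limit.
for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  empirical={empirical_cdf(samples, x):.3f}  gumbel={gumbel_cdf(x):.3f}")
```

The two columns should agree up to sampling noise. Other parent distributions need different normalising constants and can lead to the Fréchet or Weibull limits instead.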