All of AlexMennen's Comments + Replies

AlexMennen1814

I might indeed want to create a precedent here and maybe try to fundraise for some substantial fraction of it.

I wonder if it might be more effective to fund legal action against OpenAI than to compensate individual ex-employees for refusing to sign an NDA. Trying to take vested equity away from ex-employees who refuse to sign an NDA seems unlikely to hold up in court, and if we can establish a legal precedent that OpenAI cannot do this, that might make other ex-employees much more comfortable speaking out against OpenAI than the possibility that third-p... (read more)

habryka116

Yeah, at the time I didn't know how shady some of the contracts here were. I do think funding a legal defense is a marginally better use of funds (though my guess is funding both is worth it).

AlexMennenΩ120

Yeah, sorry that was unclear; there's no need for any form of hypercomputation to get an enumeration of the axioms of U. But you need a halting oracle to distinguish between the axioms and non-axioms. If you don't care about distinguishing axioms from non-axioms, but you do want to get an assignment of truth values to the atomic formulas Q(i,j) that's consistent with the axioms of U, then that is applying a consistent guessing oracle to U.

AlexMennenΩ160

I see that when I commented yesterday, I was confused about how you had defined U. You're right that you don't need a consistent guessing oracle to get from U to a completion of U, since the axioms are all atomic propositions, and you can just set the remaining atomic propositions however you want. However, this introduces the problem that getting the axioms of U requires a halting oracle, not just a consistent guessing oracle, since to tell whether something is an axiom, you need to know whether there actually is a proof of a given thing in T.

2jessicata
The axioms of U are recursively enumerable. You run all M(i,j) in parallel and output a new axiom whenever one halts. That's enough to computably check a proof if the proof specifies the indices of all axioms used in the recursive enumeration.
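
A minimal sketch of that dovetailing procedure, with step-able toy "machines" standing in for the M(i,j) (single-indexed here for simplicity; everything below is illustrative):

```python
def toy_machine(k):
    """Toy stand-in for M(i, j): halts after k steps if k is even, else runs forever."""
    steps = 0
    while k % 2 == 1 or steps < k:
        yield
        steps += 1

def enumerate_halting(max_stages=60):
    """Dovetail: at stage n, start machine n and give every live machine one more step.
    Whenever a machine halts, output its index -- the 'output a new axiom' step above."""
    live, found = {}, []
    for stage in range(max_stages):
        live[stage] = toy_machine(stage)
        for k in list(live):
            try:
                next(live[k])
            except StopIteration:        # machine k halted
                found.append(k)
                del live[k]
    return found

print(enumerate_halting())  # even indices appear, in the order their machines halt
```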
AlexMennenΩ170

I think what you proved essentially boils down to the fact that a consistent guessing oracle can be used to compute a completion of any consistent recursively axiomatizable theory. (In fact, it turns out that a consistent guessing oracle can be used to compute a model (in the sense of functions and relations on a set) of any consistent recursively axiomatizable theory; this follows from what you showed and the fact that an oracle for a complete theory can be used to compute a model of that theory.)

I disagree with

Philosophically, what I take from this is th

... (read more)
2jessicata
U axiomatizes a consistent guessing oracle producing a model of T. There is no consistent guessing oracle applied to U. In the previous post I showed that a consistent guessing oracle can produce a model of T. What I show in this post is that the theory of this oracle can be embedded in propositional logic so as to enable provability preserving translations.
AlexMennenΩ370

a consistent guessing oracle rather than a halting oracle (which I theorize to be more powerful than a consistent guessing oracle).

This is correct. Or at least, the claim I'm interpreting this as is that there exist consistent guessing oracles that are strictly weaker than a halting oracle, and that claim is correct. Specifically, it follows from the low basis theorem that there are consistent guessing oracles that are low, meaning that access to a halting oracle makes it possible to tell whether any Turing machine with access to the consistent guessing or... (read more)

2jessicata
Thanks, didn't know about the low basis theorem.
AlexMennenΩ120

I don't understand what relevance the first paragraph is supposed to have to the rest of the post.

2jessicata
LS shows to be impossible one type of infinitarian reference, namely to uncountably infinite sets. I am interested in showing to be impossible a different kind of infinitarian reference. "Impossible" and "reference" are, of course, interpreted differently by different people.

Something that I think is unsatisfying about this is that the rationals aren't privileged as a countable dense subset of the reals; it just happens to be a convenient one. The completions of the dyadic rationals, the rationals, and the algebraic real numbers are all the same. But if you require that an element of the completion, if equal to an element of the countable set being completed, must eventually certify this equality, then the completions of the dyadic rationals, rationals, and algebraic reals are all constructively inequivalent.

2Gurkenglas
Hmmmm. What if I said "an enumeration of the first-order theory of (union(Q,{our number}),<)"? Then any number can claim to be equal to one of the constants.

This means that, in particular, if your real happens to be rational, you can produce the fact that it is equal to some particular rational number. Neither Cauchy reals nor Dedekind reals have this property.

2Gurkenglas
Sure! Fortunately, while you can use this to prove any rational real innocent of being irrational, you can't use this to prove any irrational real guilty of being irrational, since every first-order formula can only check against finitely many constants.

perhaps these are equivalent.

They are. To get enumerations of rationals above and below out of an effective Cauchy sequence, once the Cauchy sequence outputs a rational q such that everything afterwards can only differ from q by at most ε, you start enumerating rationals below q−ε as below the real and rationals above q+ε as above the real. If the Cauchy sequence converges to x, and you have a rational r < x, then once the Cauchy sequence gets to the point where everything after is guaranteed to differ by at most ... (read more)
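
A rough Python sketch of that conversion, assuming the Cauchy sequence comes with an explicit modulus (cauchy(n) is within 2**-n of the real); the function names and the sqrt(2) example are just for illustration:

```python
from fractions import Fraction

def cauchy_sqrt2(n):
    """Effective Cauchy sequence for sqrt(2): returns a rational within 2**-n of it."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo > Fraction(1, 2 ** n):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mid * mid < 2 else (lo, mid)
    return lo

def enumerate_below_and_above(cauchy, stages=60):
    """Once cauchy(n) is known, any rational below cauchy(n) - 2**-n is certified below
    the real, and any rational above cauchy(n) + 2**-n is certified above it."""
    below, above = set(), set()
    candidates = [Fraction(p, q) for q in range(1, 8) for p in range(3 * q)]
    for n in range(stages):
        c, eps = cauchy(n), Fraction(1, 2 ** n)
        below |= {q for q in candidates if q < c - eps}
        above |= {q for q in candidates if q > c + eps}
    return below, above

below, above = enumerate_below_and_above(cauchy_sqrt2)
print(max(below), min(above))  # rationals from the candidate pool tightly bracketing sqrt(2)
```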

My take-away from this:

An effective Cauchy sequence converging to a real x induces recursive enumerators for {q ∈ ℚ : q < x} and {q ∈ ℚ : q > x}, because if q < x, then q < x − ε for some rational ε > 0, so you eventually learn this.

The constructive meaning of a set is that membership should be decidable, not just semi-decidable.

If x is irrational, then {q ∈ ℚ : q < x} and {q ∈ ℚ : q > x} are complements, and each semi-decidable, so they are decidable. If x is r... (read more)

2tailcalled
You can encode semidecidable sets constructively as functions into Σ, where Σ is the Sierpinski type. The Sierpinski type can be encoded various ways, e.g. coinductively by generators True:Σ and Maybe:Σ→Σ subject to the quotient relation Maybe(x)=x, which leads to two values, True and False:=Maybe(False). The Sierpinski type is closed under countable disjunctions and finite conjunctions, and therefore functions into it are closed under countable unions and finite intersections. In classical math, excluded middle implies that the Sierpinski type, the booleans, and the truth values are all isomorphic, but in constructive math these equivalences are taboo. In particular, the not function on the Sierpinski type is taboo, and would imply that the Sierpinski type is isomorphic to booleans.
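
A loose computational reading of this in Python (not the coinductive encoding described above, just the view of a Sierpinski value as a possibly-nonterminating search, so "True" is a search that succeeds and "False" is one that never does):

```python
# A Sierpinski value is modeled as a generator that yields dummy steps
# and returns (StopIteration) exactly when it verifies "True".
def true_value():
    return iter(())          # succeeds immediately

def false_value():
    def forever():
        while True:
            yield
    return forever()         # never succeeds

def disjunction(values):
    """Countable disjunction: dovetail the searches; succeed if any one succeeds."""
    def search():
        pending = iter(values)
        live = []
        while True:
            nxt = next(pending, None)
            if nxt is not None:
                live.append(nxt)
            for v in list(live):
                try:
                    next(v)
                except StopIteration:
                    return   # some disjunct verified, so the disjunction is True
            yield
    return search()

def observe(value, budget=1000):
    """Run a search for a bounded number of steps. We can positively observe 'True',
    but never positively observe 'False' -- which is why there is no 'not' function."""
    for _ in range(budget):
        try:
            next(value)
        except StopIteration:
            return True
    return "unknown so far"

print(observe(disjunction([false_value(), false_value(), true_value()])))  # True
print(observe(false_value()))                                              # unknown so far
```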

If board members have an obligation not to criticize their organization in an academic paper, then they should also have an obligation not to discuss anything related to their organization in an academic paper. The ability to be honest is important, and if a researcher can't say anything critical about an organization, then non-critical things they say about it lose credibility.

1River
"anything related to", depending how it's interpreted, might be overly broad, but something like this seems like a necessary implication, yes. Is that a bad thing?

Yeah, I wasn't trying to claim that the Kelly bet size optimizes a nonlogarithmic utility function exactly, just that, when the number of rounds of betting left is very large, the Kelly bet size sacrifices a very small amount of utility relative to optimal betting under some reasonable assumptions about the utility function. I don't know of any precise mathematical statement that we seem to disagree on.

Well, we've established the utility-maximizing bet gives different expected utility from the Kelly bet, right? So it must give higher expected utility or it

... (read more)

Yeah, I was still being sloppy about what I meant by near-optimal, sorry. I mean the optimal bet size will converge to the Kelly bet size, not that the expected utility from Kelly betting and the expected utility from optimal betting converge to each other. You could argue that the latter is more important, since getting high expected utility in the end is the whole point. But on the other hand, when trying to decide on a bet size in practice, there's a limit to the precision with which it is possible to measure your edge, so the difference between optimal... (read more)

2philh
So like, this seems plausible to me, but... yeah, I really do want to distinguish between * This maximizes expected utility * This doesn't maximize expected utility, but here are some heuristics that suggest maybe that doesn't matter so much in practice If it doesn't seem important to you to distinguish these, then that's a different kind of conversation than us disagreeing about the math, but here are some reasons I want to distinguish them: * I think lots of people are confused about Kelly, and speaking precisely seems more likely to help than hurt. * I think "get the exact answer in spherical cow cases" is good practice, even if spherical cow cases never come up. "Here's the exact answer in the simple case, and here are some considerations that mean it won't be right in practice" seems better than "here's an approximate answer in the simple case, and here are some considerations that mean it won't be right in practice". * Sometimes it's not worth figuring out the exact answer, but like. I haven't yet tried to calculate the utility-maximizing bet for those other utility functions. I haven't checked how much Kelly loses relative to them under what conditions. Have you? It seems like this is something we should at least try to calculate before going "eh, Kelly is probably fine". * I've spent parts of this conversation confused about whether we disagree about the math or not. If you had reliably been making the distinction I want to make, I think that would have helped. If I had reliably not made that distinction, I think we just wouldn't have talked about the math and we still wouldn't know if we agreed or not. That seems like a worse outcome to me. Well, we've established the utility-maximizing bet gives different expected utility from the Kelly bet, right? So it must give higher expected utility or it wouldn't be utility-maximizing.

I do want to note though that this is different from "actually optimal"

By "near-optimal", I meant converges to optimal as the number of rounds of betting approaches infinity, provided initial conditions are adjusted in the limit such that whatever conditions I mentioned remain true in the limit. (e.g. if you want Kelly betting to get you a typical outcome of  in the end, then when taking the limit as the number  of bets goes to infinity, you better have starting money , where  is the geometric growth rate you ... (read more)

4philh
Thanks for clarifying! Um, but to clarify a bit further, here are three claims one could make about these examples: 1. As wealth→∞, the utility maximizing bet at given wealth will converge to the Kelly bet at that wealth. I basically buy this. 2. As n→∞, the expected utility from utility-maximizing bets at timestep n converges to that from Kelly bets at timestep n. I'm unsure about this. 3. For some finite n, the expected utility at timestep n from utility-maximizing bets is no higher than that from Kelly bets. I think this is false. (In the positive: I think that for all finite n, the expected utility at timestep n from utility-maximizing bets is higher than that from Kelly bets. I think this is the case even if the difference converges to 0, which I'm not sure it does.) I think you're saying (2)? But the difference between that and (3) seems important to me. Like, it still seems that to a (non-log-money) utility maximizer, the Kelly bet is strictly worse than the bet which maximizes their utility at any given timestep. So why would they bet Kelly? ---------------------------------------- Here's why I'm unsure about 2. Suppose we both have log-money utility, I start with $2 and you start with $1, and we place the same number of bets, always utility-maximizing. After any number of bets, my expected wealth will always be 2x yours, so my expected utility will always be log(2) more than yours. So it seems to me that "starting with more money" leads to "having more log-money in expectation forever". Then it similarly seems to me that if I get to place a bet before you enter the game, and from then on our number of bets is equal, my expected utility will be forever higher than yours by the expected utility gain of that one bet. Or, if we get the same number of bets, but my first bet is utility maximizing and yours is not, but after that we both place the utility-maximizing bet; then I think my expected utility will still be forever higher than yours. And the sam

The reason I brought this up, which may have seemed nitpicky, is that I think this undercuts your argument for sub-Kelly betting. When people say that variance is bad, they mean that because of diminishing marginal returns, lower variance is better when the mean stays the same. Geometric mean is already the expectation of a function that gets diminishing marginal returns, and when it's geometric mean that stays fixed, lower variance is better if your marginal returns diminish even more than that. Do they? Perhaps, but it's not obvious. And if your marginal... (read more)

1RationalDino
The reason why variance matters is that high variance increases your odds of going broke. In reality, gamblers don't simply get to reinvest all of their money. They have to take money out for expenses. That process means that you can go broke in the short run, despite having a great long-term strategy. Therefore instead of just looking at long-term returns you should also look at things like, "What are my returns after 100 trials if I'm unlucky enough to be at the 20th percentile?" There are a number of ways to calculate that. The simplest is to say that if p is your probability of winning, the expected number of times you'll win is 100p. The variance in a single trial is p(1-p). And therefore the variance of 100 trials is 100p(1-p). Your standard deviation in wins is the square root, or 10sqrt(p(1-p)). From the central limit theorem, at the 20th percentile you'll therefore win roughly 100p - 8.5sqrt(p(1-p)) times. Divide this by 100 to get the proportion q that you won. Your ideal strategy on this metric will be Kelly with p replaced by that q. This will always be less than Kelly. Then you can apply that to figure out what rate of return you'd be worrying about if you were that unlucky. Any individual gambler should play around with these numbers. Base it on your bankroll, what you're comfortable with losing, how frequent and risky your bets are, and so on. It takes work to figure out your risk profile. Most will decide on something less than Kelly. Of course if your risk profile is dominated by the pleasure of the adrenaline from knowing that you could go broke, then you might think differently. But professional gamblers who think that way generally don't remain professional gamblers over the long haul.
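
A quick sketch of the calculation described above, assuming even-money bets so that the full Kelly fraction is 2p − 1 (the 0.85 z-score for the 20th percentile is where the ~8.5 figure comes from):

```python
from math import sqrt

def pessimistic_kelly(p, n_trials=100, z=0.85):
    """Replace the win probability p by its ~20th-percentile empirical frequency
    over n_trials bets, then size the bet as Kelly would at that frequency."""
    std_wins = sqrt(n_trials * p * (1 - p))
    q = (n_trials * p - z * std_wins) / n_trials   # pessimistic win frequency
    full_kelly = 2 * p - 1                         # even-money Kelly fraction
    adjusted = max(0.0, 2 * q - 1)
    return full_kelly, adjusted

print(pessimistic_kelly(0.6))  # full Kelly 0.2 vs. a noticeably smaller fraction
```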

Correct. This utility function grows fast enough that it is possible for the expected utility after many bets to be dominated by negligible-probability favorable tail events, so you'd want to bet super-Kelly.

If you expect to end up with lots of money at the end, then you're right; marginal utility of money becomes negligible, so expected utility is greatly affected by negligible-probability unfavorable tail events, and you'd want to bet sub-Kelly. But if you start out with very little money, so that at the end of whatever large number of ... (read more)

2philh
Okay, "Kelly is close to optimal for lots of utility functions" seems entirely plausible to me. I do want to note though that this is different from "actually optimal", which is what I took you to be saying. Oops! I actually was just writing things without thinking much and didn't realize it was the same.

If you bet more than Kelly, you'll experience lower average returns and higher variance.

No. As they discovered in the dialog, average returns are maximized by going all-in on every bet with positive EV. It is typical returns that will be lower if you don't bet Kelly.
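
A small simulation of that distinction, with made-up numbers (60% win probability on even-money bets, 100 bets): the all-in strategy has the larger mean, while Kelly has the far larger typical (median) outcome.

```python
import random
import statistics

P, N_BETS, N_RUNS = 0.6, 100, 10_000   # hypothetical even-money bets with an edge

def exact_mean(f):
    # Each bet multiplies expected wealth by 1 + f*(2P - 1).
    return (1 + f * (2 * P - 1)) ** N_BETS

def simulated_median(f, seed=0):
    rng = random.Random(seed)
    finals = []
    for _ in range(N_RUNS):
        w = 1.0
        for _ in range(N_BETS):
            w *= (1 + f) if rng.random() < P else (1 - f)
        finals.append(w)
    return statistics.median(finals)

kelly = 2 * P - 1   # 0.2
for f in (kelly, 1.0):
    print(f"f={f:.1f}: exact mean {exact_mean(f):.3g}, simulated median {simulated_median(f):.3g}")
# All-in (f=1.0) maximizes the mean but its median outcome is 0;
# Kelly's mean is far smaller, but its typical outcome is much better.
```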

1RationalDino
Dang it. I meant to write that as, That said, both median and mode are valid averages, and Kelly wins both.
1SimonM
I think the disagreement here is on what "average" means. All-in maximises the arithmetic average return. Kelly maximises the geometric average. Which average is more relevant is equivalent to the Kelly debate though, so hard to say much more

The Kelly criterion can be thought of in terms of maximizing a utility function that depends on your wealth after many rounds of betting (under some mild assumptions about that utility function that rule out linear utility). See https://www.lesswrong.com/posts/NPzGfDi3zMJfM2SYe/why-bet-kelly

2philh
So I claim that Kelly won't maximize E(√money), or more generally E(money^x) for any x, or E(1−e^(−money)), or E(log(√money)), or E(min(money,x)), or even E(log(money+x)), but it'll get asymptotically close when money >> x. Do you disagree? Your "When to act like your utility is logarithmic" section sounds reasonable to me. Like, it sounds like the sort of thing one could end up with if one takes a formal proof of something and then tries to explain in English the intuitions behind the proof. Nothing in it jumps out at me as a mistake. Nevertheless, I think it must be mistaken somewhere, and it's hard to say where without any equations.

For two, your specific claims about the likely confusion that Eliezer's presentation could induce in "laymen" is empirically falsified to some degree by the comments on the original post: in at least one case, a reader noticed the issue and managed to correct for it when they made up their own toy example, and the first comment to explicitly mention the missing unitarity constraint was left over 10 years ago.

Some readers figuring out what's going on is consistent with many of them being unnecessarily confused.

I don't think this one works. In order for the channel capacity to be finite, there must be some maximum number of bits N you can send. Even if you don't observe the type of the channel, you can communicate a number n from 0 to N by sending n 1s and N-n 0s. But then even if you do observe the type of the channel (say, it strips the 0s), the receiver will still just see some number of 1s that is from 0 to N, so you have actually gained zero channel capacity. There's no bonus for not making full use of the channel; in johnswentworth's formulation of the problem, there's no such thing as some messages being cheaper to transmit through the channel than others.

2Yair Halberstadt
A fair point. Or a similar argument, you can only transfer one extra bit of information this way, since the message representing a number of size 2n is only 1 bit larger than the message representing n.

We "just" need to update the three geometric averages on this background knowledge. Plausibly how this should be done in this case is to normalize them such that they add to one.

My problem with a forecast aggregation method that relies on renormalizing to meet some coherence constraints is that then the probabilities you get depend on what other questions get asked. It doesn't make sense for a forecast aggregation method to give probability 32.5% to A if the experts are only asked about A, but have that probability predictably increase if the experts are a... (read more)
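
A toy numerical illustration of that objection, with made-up expert probabilities and one plausible way the renormalization could go:

```python
from math import prod

def geo_mean(ps):
    return prod(ps) ** (1 / len(ps))

# Two hypothetical experts' probabilities for mutually exclusive outcomes A, B, C.
experts = {"A": [0.3, 0.35], "B": [0.5, 0.2], "C": [0.2, 0.45]}
raw = {k: geo_mean(v) for k, v in experts.items()}

# Asked only about A (vs. not-A): renormalize over the two answers.
raw_not_a = geo_mean([1 - p for p in experts["A"]])
print("A asked alone:     ", round(raw["A"] / (raw["A"] + raw_not_a), 3))

# Asked about A, B, C together: renormalize so the three sum to one.
print("A asked with B, C: ", round(raw["A"] / sum(raw.values()), 3))
# The aggregate probability assigned to A changes depending on which other
# questions the experts happened to be asked.
```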

AlexMennenΩ460

Oh, derp. You're right.

AlexMennenΩ34-2

I think the way I would rule out my counterexample is by strengthening A3 to if  and  then there is ...

6Scott Garrabrant
That does not rule out your counterexample. The condition is never met in your counterexample.
Answer by AlexMennenΩ10152

Q2: No. Counterexample: Suppose there's one outcome such that all lotteries are equally good, except for the lottery that puts probability 1 on that outcome, which is worse than the others.

4Scott Garrabrant
I meant the conclusions to all be adding to the previous one, so this actually also answers the main question I stated, by violating continuity, but not the main question I care about. I will edit the post to say that I actually care about concavity, even without continuity.
5Scott Garrabrant
Nice! This, of course, seems like something we should salvage, by e.g. adding an axiom that if A is strictly preferred to B, there should be a lottery strictly between them.

I'm not sure why you don't like calling this "redundancy". A meaning of redundant is "able to be omitted without loss of meaning or function" (Lexico). So ablation redundancy is the normal kind of redundancy, where you can remove sth without losing the meaning. Here it's not redundant, you can remove a single direction and lose all the (linear) "meaning".

Suppose your datapoints are   (where the coordinates  and  are independent from the standard normal distribution), and the feature you're trying to measure is ... (read more)

What you're calling ablation redundancy is a measure of nonlinearity of the feature being measured, not any form of redundancy, and the view you quote doesn't make sense as stated, as nonlinearity, rather than redundancy, would be necessary for its conclusion. If you're trying to recover some feature f, and there's any vector v and scalar c such that f(x) = v·x + c for all data x (regardless of whether there are multiple such (v, c), which would happen if the data is contained in a proper affine subsp... (read more)
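
A numpy sketch of the point that a single (rank-1) ablation suffices when the feature is linearly represented; the data, dimensions, and probe are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.normal(size=(n, d))
v_true = rng.normal(size=d)
y = (X @ v_true > 0).astype(float)      # a linearly represented binary feature

def probe_accuracy(X, y):
    """Fit a least-squares linear probe and report its classification accuracy."""
    A = np.c_[X, np.ones(len(X))]
    w, *_ = np.linalg.lstsq(A, y - 0.5, rcond=None)
    return ((A @ w > 0).astype(float) == y).mean()

# Ablate along a single direction: the difference of the class means.
u = X[y == 1].mean(0) - X[y == 0].mean(0)
u /= np.linalg.norm(u)
X_ablated = X - np.outer(X @ u, u)

print("before ablation:", probe_accuracy(X, y))                # ~1.0
print("after rank-1 ablation:", probe_accuracy(X_ablated, y))  # close to chance (~0.5)
```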

1Fabien Roger
Yep, high ablation redundancy can only exist when features are nonlinear. Linear features are obviously removable with a rank-1 ablation, and you get them by running CCS/Logistic Regression/whatever. But I don't care about linear features since it's not what I care about since it's not the shape the features have (Logistic Regression & CCS can't remove the linear information). The point is, the reason why CCS fails to remove linearly available information is not because the data "is too hard". Rather, it's because the feature is non-linear in a regular way, which makes CCS and Logistic Regression suck at finding the direction which contains all linearly available data (which exists in the context of "truth", just as it is in the context of gender and all the datasets on which RLACE has been tried). I'm not sure why you don't like calling this "redundancy". A meaning of redundant is "able to be omitted without loss of meaning or function" (Lexico). So ablation redundancy is the normal kind of redundancy, where you can remove sth without losing the meaning. Here it's not redundant, you can remove a single direction and lose all the (linear) "meaning".

Ablating along the difference of the means makes both CCS & Supervised learning fail, i.e. reduce their accuracy to random guessing. Therefore:

  • The fact that Recursive CCS finds many good direction is not due to some “intrinsic redundancy” of the data. There exist a single direction which contains all linearly available information.
  • The fact that Recursive CCS finds strictly more than one good direction means that CCS is not efficient at locating all information related to truth: it is not able to find a direction which contains as much information
... (read more)
1Fabien Roger
"Redundancy" depends on your definition, and I agree that I didn't choose a generous one. Here is an even simpler example than yours: positive points are all at (1...1) and negative points are all at (-1...-1). Then all canonical directions are good classifiers. This is "high correlation redundancy" with the definition you want to use. There is high correlation redundancy in our toy examples and in actual datasets. What I wanted to argue against is the naive view that you might have which could be "there is no hope of finding a direction which encodes all information because of redundancy", which I would call "high ablation redundancy". It's not the case that there is high ablation redundancy in both our toy examples (in mine, all information is along (1...1)), and in actual datasets.

My point wasn't that the equation didn't hold perfectly, but that the discrepancies are very suspicious. Two of the three discrepancies were off by exactly one order of magnitude, making me fairly confident that they are the result of a typo. (Not sure what's going on with the other discrepancy.)

5Stephen McAleese
You were right. I forgot the 1B parameter model row so the table was shifted by an order of magnitude. I updated the table so it should be correct now. Thanks for spotting the mistake.

In the table of parameters, compute, and tokens, compute/(parameters*tokens) is always 6, except in one case where it's 0.6, one case where it's 60, and one case where it's 2.75. Are you sure this is right?
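
For reference, the consistency check being applied here, sketched with a hypothetical row (the real numbers are in the post's table):

```python
def flops_ratio(compute, params, tokens):
    """Under C ≈ 6·N·D, compute / (params * tokens) should be ≈ 6 for every row."""
    return compute / (params * tokens)

# Hypothetical row: a 1e9-parameter model trained on 2e10 tokens.
params, tokens = 1e9, 2e10
compute = 6 * params * tokens                 # ≈ 1.2e20 FLOPs
print(flops_ratio(compute, params, tokens))   # 6.0; a row giving 0.6 or 60 suggests a shifted entry
```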

1Stephen McAleese
Thanks for spotting this. I noticed that I originally used the formula C=6DN when it should really be C≈6DN because this is the way it's written in the OpenAI paper Scaling Laws for Neural Language Models (2020). I updated the equation. The amount of compute used during training is proportional to the number of parameters and the amount of training data: C∝DN→C≈kDN→C≈6DN. Where there is a conflict between this formula and the table, I think the table should be used because it's based on empirical results whereas the C≈6DN formula is more like a rule of thumb.
AlexMennenΩ7128

It would kind of use assumption 3 inside step 1, but inside the syntax, rather than in the metalanguage. That is, step 1 involves checking that the number encoding "this proof" does in fact encode a proof of C. This can't be done if you never end up proving C.

One thing that might help make clear what's going on is that you can follow the same proof strategy, but replace "this proof" with "the usual proof of Lob's theorem", and get another valid proof of Lob's theorem, that goes like this: Suppose you can prove that []C->C, and let n be the number encodi... (read more)

9Gurkenglas
Similarly: löb = □ (□ A → A) → □ A □löb = □ (□ (□ A → A) → □ A) □löb -> löb: löb premises □ (□ A → A). By internal necessitation, □ (□ (□ A → A)). By □löb, □ (□ A). By löb's premise, □ A.

If that's how it works, it doesn't lead to a simplified cartoon guide for readers who'll notice missing steps or circular premises; they'd have to first walk through Lob's Theorem in order to follow this "simplified" proof of Lob's Theorem.

The revelation that he spent maybe 10x as much on villas for his girlfriends as EA cause areas

Source?

The idea that he was trying to distance himself from EA to protect EA doesn't hold together because he didn't actually distance himself from EA at all in that interview. He said ethics is fake, but it was clear from context that he meant ordinary ethics, not utilitarianism.

3Matt Goldenberg
I don't think it's all that clear, there are reasonable interpretations where he's saying that he simply cared about winning as in making money, and it's possible he was trying to unambiguously lean into that narrative but did so poorly.

"Having been handed this enormous prize, how do I maximize the probability that I max out on utility?" Hm, but that actually doesn't give back any specific criterion, since basically any strategy that never bets your whole stack will win.

That's not quite true. If you bet more than double Kelly, your wealth decreases. But yes, Kelly betting isn't unique in growing your wealth to infinity in the limit as the number of bets increases.
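
A quick check of that claim for even-money bets with a hypothetical 60% win probability (so Kelly is f = 0.2): the expected log-growth per bet crosses zero near double Kelly and is clearly negative beyond it.

```python
from math import log

P = 0.6
kelly = 2 * P - 1   # 0.2 for even-money bets

def growth_rate(f):
    """Expected log-growth per even-money bet at betting fraction f."""
    return P * log(1 + f) + (1 - P) * log(1 - f)

for f in (0.5 * kelly, kelly, 2 * kelly, 2.5 * kelly):
    print(f"f = {f:.2f}: growth rate per bet {growth_rate(f):+.4f}")
# Positive below ~double Kelly, negative above it: betting much more than
# double Kelly shrinks your wealth in the long run.
```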

If the number of bets is very large, but due to some combination of low starting wealth relative to the utility bound and slow growth rate, it is not possible to get close to maximum utility, then Kelly betting should be optimal.

I basically endorse what kh said. I do think it's wrong to think you can fit enormous amounts of expected value or disvalue into arbitrarily tiny probabilities.

2Charlie Steiner
Yes, I would agree with this. If we suppose our utility function is bounded, then when given unlimited access to a gamble in our favor, we should basically be asking "Having been handed this enormous prize, how do I maximize the probability that I max out on utility?" Hm, but that actually doesn't give back any specific criterion, since basically any strategy that never bets your whole stack will win. What happens if you try to minimize the expected time until you hit the maximum? Optimal play will definitely diverge from the Kelly criterion when your stack is close to the maximum. But in the limit of a large maximum I think you recover the Kelly criterion, for basically the reason you give in this post.

It is true that in practice, there's a finite amount of credit you can get, and credit has a cost, limiting the practical applicability of a model with unlimited access to free credit, if the optimal strategy according to the model would end up likely making use of credit which you couldn't realistically get cheaply. None of this seems important to me. The easiest way to understand the optimal strategy when maximum bet sizes are much smaller than your wealth is that it maximizes expected wealth on each step, rather than that it maximizes expected log wealt... (read more)

Access to credit. In the logarithmic model, you never make bets that could make your net worth zero or negative.

2Dagon
Access to credit, presuming it's finite, just moves the floor, it doesn't change the shape.  It gets complicated when credit has a cost, because it affects the EV of bets that might force you to dip into credit for future bets.  If it's zero-interest, you can just consider it part of your bankroll.  Likewise future earnings - just part of your bankroll (though also complicated if future earnings are uncertain).

Again, the max being a small portion of your net worth isn't the assumption behind the model; the assumption is just that you don't get constrained by lack of funds, so it is a different model. It's true that if the reason you don't get constrained by lack of funds is that the maximum bets are small relative to your net worth, then this is also consistent with maximizing log wealth on each step. But this isn't relevant to what I brought it up for, which was to use it as a step in explaining the reason for the Kelly criterion in the section after it.

2Dagon
I suspect I'm dense, or at least missing something. If ability to make future bets aren't impacted by losing earlier bets, that implies that you cannot bet more than Kelly.  Or are there other ways to not be constrained by lack of funds?   An example of a bet you'd make in the linear model, which you wouldn't make in the logarithmic (bankroll-preserving) model, would help a lot.

No. The point of the model where acting like your utility is linear is optimal wasn't that this is a more realistic model than the assumptions behind the Kelly criterion; it's just another simplified model, which is slightly easier to analyze, so I was using it as a step in showing why you should follow the Kelly criterion when it is your wealth that constrains the bet sizes you can make. It's also not true that the linear-utility model I described is still just maximizing log wealth; for instance, if the reason that you're never constrained by available funds is that you have access to credit, then your wealth could go negative, and then its log wouldn't even be defined.

2Dagon
Sure, I'm a fan of simplified calculations.  But it's not either-or.  Kelly simplifies to "bet your edge" for even-money wagers, and that's great.  It simplifies to "bet the max on +EV wagers" in cases where "the max" is a small portion of your net worth.  It's great, but it's not a different model, it's just a simplified calculation for special cases.

Most of the arguments for Kelly betting that you address here seem like strawmen, except for (4), which can be rescued from your objection, and an interpretation of johnswentworth's version of (2), which you actually mention in footnote 3, but seem unfairly dismissive of.

The assumption according to which your derived utility function is logarithmic is that expected utility doesn't get dominated by negligible-probability tail events. For instance, if you have a linear utility function and you act like it, you almost surely get 0 payout, but your expected p... (read more)

But in fact, I expect the honest policy to get significantly less reward than the training-game-playing policy, because humans have large blind spots and biases affecting how they deliver rewards.

The difference in reward between truthfulness and the optimal policy depends on how humans allocate rewards, and perhaps it could be possible to find a clever strategy for allocating rewards such that truthfulness gets close to optimal reward.

For instance, in the (unrealistic) scenario in which a human has a well-specified and well-calibrated probability... (read more)

5Ajeya Cotra
Yeah, I definitely agree with "this problem doesn't seem obviously impossible," at least to push on quantitatively. Seems like there are a bunch of tricks from "choosing easy questions humans are confident about" to "giving the human access to AI assistants / doing debate" to "devising and testing debiasing tools" (what kinds of argument patterns are systematically more likely to convince listeners of true things rather than false things and can we train AI debaters to emulate those argument patterns?) to "asking different versions of the AI the same question and checking for consistency." I only meant to say that the gap is big in naive HFDT, under the "naive safety effort" assumption made in the post. I think non-naive efforts will quantitatively reduce the gap in reward between honest and dishonest policies, though probably there will still be some gap in which at-least-sometimes-dishonest strategies do better than always-honest strategies. But together with other advances like interpretability or a certain type of regularization we could maybe get gradient descent to overall favor honesty.
AlexMennenΩ570

It sounds to me like, in the claim "deep learning is uninterpretable", the key word in "deep learning" that makes this claim true is "learning", and you're substituting the similar-sounding but less true claim "deep neural networks are uninterpretable" as something to argue against. You're right that deep neural networks can be interpretable if you hand-pick the semantic meanings of each neuron in advance and carefully design the weights of the network such that these intended semantic meanings are correct, but that's not what deep learning is. The other t... (read more)

AlexMennenΩ570

This seems related in spirit to the fact that time is only partially ordered in physics as well. You could even use special relativity to make a model for concurrency ambiguity in parallel computing: each processor is a parallel worldline, detecting and sending signals at points in spacetime that are spacelike-separated from when the other processors are doing these things. The database follows some unknown worldline, continuously broadcasts its contents, and updates its contents when it receives instructions to do so. The set of possible ways that the pro... (read more)

2Vladimir_Nesov
If you mark something like causally inescapable subsets of spacetime (not sure how this should be called), which are something like all unions of future lightcones, as open sets, then specialization preorder on spacetime points will agree with time. This topology on spacetime is non-Frechet (has nontrivial specialization preorder), while the relative topologies it gives on space-like subspaces (loci of states of the world "at a given time" in a loose sense) are Hausdorff, the standard way of giving a topology for such spaces. This seems like the most straightforward setting for treating physical time as logical time.
AlexMennenΩ340

Wikipedia claims that every sequence is Turing reducible to a random one, giving a positive answer to the non-resource-bounded version of any question of this form. There might be a resource-bounded version of this result as well, but I'm not sure.

  1. By "optimal", I mean in an evidential, rather than causal, sense. That is, the optimal value is that which signals greatest fitness to a mate, rather than the value that is most practically useful otherwise. I took Fisherian runaway to mean that there would be overcorrection, with selection for even more extreme traits than what signals greatest fitness, because of sexual selection by the next generation. So, in my model, the value of  that causally leads to greatest chance of survival could be , but high values for  are eviden
... (read more)
1Ryan Kidd
I think it's important to distinguish between "fitness as evaluated on the training distribution" (i.e. the set of environments ancestral peacocks roamed) and "fitness as evaluated on a hypothetical deployment distribution" (i.e. the set of possible predation and resource scarcity environments peacocks might suddenly face). Also important is the concept of "path-dependent search" when fitness is a convex function on X which biases local search towards X=1, but has global minimum at X=−1. 1. In this case, I'm imagining that Fisherian runaway boosts X as long as it still indicates good fitness on-distribution. However, it could be that X=1 is the "local optimum for fitness" and in reality X=−1 is the global optimum for fitness. In this case, the search process has chosen an initial X-direction that biases sexual selection towards X=1. This is equivalent to gradient descent finding a local minimum. 2. I think I agree with your thoughts here. I do wonder if sexual selection in humans has reached a point where we are deliberately immune to natural selection pressure due to such a distributional shift and acquired capabilities.

Fisherian runaway doesn't make any sense to me.

Suppose that each individual in a species of a given sex has some real-valued variable X, which is observable by the other sex. Suppose that, absent considerations about sexual selection by potential mates for the next generation, the evolutionarily optimal value for X is 0. How could we end up with a positive feedback loop involving sexual selection for positive values of X, creating a new evolutionary equilibrium with a positive optimal value of X when taking into account sexual selectio... (read more)

1Ryan Kidd
In the context of your model, I see two potential ways that Fisherian runaway might occur: 1. Within each generation, males that survive with higher X are consistently fitter on average than males that survive with lower X because the fitness required to survive monotonically increases with X. Therefore, in every generation, choosing males with higher X is a good proxy for local improvements in fitness. However, the performance detriments of high X "off-distribution" are never signalled. In an ML context, this is basically distributional shift via proxy misalignment. 2. Positive feedback that negatively impacts fitness "on-distribution" might occur temporarily if selection for higher X is so strong that it has "acquired momentum" that ensures females will select for higher X males for several generations past the point the trait becomes net costly for fitness. This is possible if the negative effects of the trait take longer to manifest selection pressure than the time window during which sexual selection boosts the trait via preferential mating. This mechanism is temporary, however, but I can see search processes halting prematurely in an ML context.

I know this was tagged as humor, but taking it seriously anyway,

I'm skeptical that breeding octopuses for intelligence would yield much in the way of valuable insights for AI safety, since octopuses and humans have so much in common that AGI wouldn't. That said, it's hard to rule out that uplifting another species could reveal some valuable unknown unknowns about general intelligence, so I unironically think this is a good reason to try it.

Another, more likely to pay off, benefit to doing this would be as a testbed for genetically engineering humans for hi... (read more)

6Yair Halberstadt
Definitely would be super cool! That's the main reason I actually want this done.

One example of a class of algorithms that can solve its own halting problem is the class of primitive recursive functions. There's a primitive recursive function H that takes as input a description of a primitive recursive function f and an input x, and outputs 1 if f halts on x, and 0 otherwise: this program is given by H(f, x) = 1, because all primitive recursive functions halt on all inputs. In this case, it is the diagonalizing program that does not exist.

I think  should exist, at least for... (read more)

If a group decides something unanimously, and has the power to do it, they can do it. That would take them outside the formal channels of the EU (or in another context of NATO) but I do not see any barrier to an agreement to stop importing Russian gas followed by everyone who agreed to it no longer importing Russian gas. Hungary would keep importing, but that does not seem like that big a problem.

If politicians can blame Hungary for their inaction, then this partially protects them from being blamed by voters for not doing anything. But it doesn't protect ... (read more)

If you have a 10-adic integer, and you want to reduce it to a 5-adic integer, then to know its last n digits in base 5, you just need to know what it is modulo 5^n. If you know what it is modulo 10^n, then you can reduce it modulo 5^n, so you only need to look at the last n digits in base 10 to find its last n digits in base 5. So a base-10 integer ending in ...93 becomes a base-5 integer ending in ...33, because 93 mod 25 is 18, which, expressed in base 5, is 33.
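
A small sketch of that reduction (illustrative only): to read off the last n base-5 digits of a 10-adic integer, take its last n base-10 digits and reduce mod 5**n.

```python
def base5_tail_from_base10_tail(tail_base10, n):
    """Given the last n (or more) base-10 digits of a 10-adic integer,
    return its last n base-5 digits, least significant first."""
    value = int(tail_base10) % 5 ** n
    digits = []
    for _ in range(n):
        value, d = divmod(value, 5)
        digits.append(d)
    return digits

print(base5_tail_from_base10_tail("93", 2))  # [3, 3] -> the 5-adic integer ends in ...33
```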

The Chinese remainder theorem tells us that we can go backwards: given a 5-adic in... (read more)
