OK, that makes more sense then. I'm not sure why you call it 'Fun Theory', though. It sounds like you intend it to be a theory of 'the good life', but a non-hedonistic one. Strangely, it is a theory on which people having 'fun' in the ordinary sense is not what matters, despite the name.
This is a moral theory about what should be fun
I don't think that can be right. You are not saying that there is a moral imperative for certain things to be fun, or to not be fun, as that doesn't really make sense (at least I can't make sense of it). You are instead saying that certain conditions are bad, even when the person is having fun (in the ordinary sense). Maybe you are saying that what is good for someone mostly maps to their fun, but with several key exceptions (which the theory then lists).
In any event, I agree with Z.M. Davis that you should capitalize your 'Fun' when you are using it in a technical sense, and explaining the sense in more detail or using a different word altogether might also help.
Eliezer,
Are you saying that one's brain state can be identical in two different scenarios, but that you are having a different amount of fun in each? If so, I'm not sure you are talking about what most people call fun (i.e. a property of your experiences). If not, then what quantity are you talking about in this post, such that you have less of it if certain counterfactuals are true?
I would drop dead of shock
Eliezer, just as it was interesting to ask what probability estimate 'Nuts!' amounted to, I think it would be very useful for the forum of Overcoming Bias to ask what your implicit probability estimate is for a 500-state TM being able to solve the halting problem for all TMs of up to 50 states.
I imagine that 'I would drop dead of shock' was intended to convey a probability of less than 1 in 10,000, or maybe 1 in 1,000,000?
Sorry, I didn't see that you had answered most of this question in the other thread where I first asked it.
Toby, if you were too dumb to see the closed-form solution to problem 1, it might take an intense effort to tweak the bit on each occasion, or perhaps you might have trouble turning the global criterion of total success or failure into a local bit-fixer; now imagine that you are also a mind that finds it very easy to sing MP3s...
The reason you think one problem is simple is that you perceive a solution in closed form; you can imagine a short program, much shorter than 10 million bits, that solves it, and the work of inventing this program was done in your mind without apparent effort. So this problem is very trivial on the meta-level because the program that solves it optimally appears very quickly in the ordering of possible programs and is moreover prominent in that ordering relative to our instinctive transformations of the problem specification.
But if you were trying random solutions and the solution tester was a black box, then the alternating-bits problem would indeed be harder - so you can't be measuring the raw difficulty of optimization if you say that one is easier than the other.
This is why I say that the human notion of "impressiveness" is best constructed out of a more primitive notion of "optimization".
We also do, legitimately, find it more natural to talk about "optimized" performance on multiple problems than on a single problem - if we're talking about just a single problem, then it may not compress the message much to say "This is the goal" rather than just "This is the output."
I take it then that you agree that (1) is a problem of 9,999,999 bits and that the travelling salesman version is as well. Could you take these things and generate an example which doesn't just give 'optimization power', but 'intelligence', or maybe just 'intelligence-without-adjusting-for-resources-spent'? You say it is measured over a set of problem domains, but presumably not over all of them, given the no-free-lunch theorems. Any example, or is this vague?
Eliezer,
I'm afraid that I'm not sure precisely what your measure is, and I think this is because you have given zero precise examples, even of its subcomponents. For example, here are two optimization problems:
1) You have to output 10 million bits. The goal is to output them so that no two consecutive bits are different.
2) You have to output 10 million bits. The goal is to output them so that when interpreted as an MP3 file, they would make a nice sounding song.
Now, the solution space for (1) consists of two possibilities (all 1s, all 0s) out of 2^10,000,000, for a total of 9,999,999 bits. The solution space for (2) is millions of times wider, leading to fewer bits. However, intuitively, (2) is a much harder problem, and things that optimize (2) are doing more of the work of intelligence: after all, (1) can be achieved in a few lines of code and very little time or space, while (2) takes much more of these resources.
(2) is a pretty complex problem, but can you give some specifics for (1)? Is it exactly 9,999,999 bits? If so, is this the 'optimization power'? Is this a function of the size of the solution space and the size of the problem space only? If there were another program attempting to produce a sequence of 100 million bits coding some complex solution to a large travelling salesman problem, such that only two bitstrings suffice, would this have the same amount of optimization power, or is it a function of the solution space itself and not just its size?
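For concreteness, here is the kind of calculation I am guessing at, treating optimization power as the log2 of the fraction of the outcome space that hits the target (my reading of it, not necessarily yours):

    from math import log2

    # One guess at the measure: with an n-bit output space and k acceptable
    # outputs, the score is n - log2(k) bits -- a function of the sizes alone.
    def optimization_bits(n_output_bits, n_solutions):
        return n_output_bits - log2(n_solutions)

    print(optimization_bits(10**7, 2))   # 9999999.0 for problem (1)
    # On this reading, any other problem whose space and solution set have
    # the same sizes would score exactly the same, whatever the two
    # acceptable strings happen to be.

If your measure reduces to something like this for a single fixed problem, then it really is just a function of the two sizes, which is what I am trying to pin down.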
Without even a single simple example, it is impossible to narrow down your answer enough to properly critique it. So far I see it as no more precise than Legg and Hutter's definition.
I agree with David's points about the roughness of the search space being a crucial factor in a meaningful definition of optimization power.
I'm not sure that I get this. Perhaps I understand the maths, but not the point of it. Here are two optimization problems:
1) You have to output 10 million bits. The goal is to output them so that no two consecutive bits are different.
2) You have to output 10 million bits. The goal is to output them so that when interpreted as an MP3 file, they would make a nice sounding song.
Now, the solution space for (1) consists of two possibilities (all 1s, all 0s) out of 2^10,000,000, for a total of 9,999,999 bits. The solution space for (2) is millions of times wider, leading to fewer bits. However, intuitively, (2) is a much harder problem, and things that optimize (2) are doing more of the work of intelligence: after all, (1) can be achieved in a few lines of code and very little time or space, while (2) takes much more of these resources.
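Just to stress how little work (1) requires, here is essentially the whole program (a sketch; any constant string will do):

    # Problem (1) in a couple of lines: any constant string satisfies
    # "no two consecutive bits are different".
    solution = "0" * 10**7                      # all zeros; all ones works too
    assert all(solution[i] == solution[i + 1]   # verify the constraint
               for i in range(len(solution) - 1))

Nothing remotely like this exists for (2), which is the disparity I am pointing at.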
But if you say "Shut up and do what seems impossible!", then that, to me, sounds like dispelling part of the essential message - that what seems impossible doesn't look like it "seems impossible", it just looks impossible.
"Shut up and do what seems impossible!" is the literally correct message. The other one is the exaggerated form. Sometimes exaggeration is a good rhetorical device, but it does turn off some serious readers.
"Don't do it, even if it seems right" sounds merely clever by comparison
This was my point. This advice is useful and clever, though not profound. The literal presentation is both clearer about what it is saying and clear that it is not profound. I would have thought that the enterprise of creating statements that sound more profound than they are is not a very attractive one for rationalists. Memorable statements are certainly a good thing, but making them literally false and spuriously paradoxical does not seem worth it. This isn't playing fair. Any statement can be turned into a pseudo-profundity with these methods: witness many teachings of cults throughout the ages. I think these are the methods of what you have called 'Dark Side Epistemology'.
Eliezer,
Crossman and Crowley make very good points above, delineating three possible types of justification for some of the things you say:
1) Don't turn him in because the negative effects of undermining the institution will outweigh the benefits
2) Don't turn him in because [some non-consequentialist reason on non-consequentialist grounds]
3) Don't turn him in because you will have rationally/consequentialistically tied yourself to the mast, making it impossible to turn him in, in order to achieve greater benefits.
(1) and (3) are classic pieces of consequentialism, the first dating back at least to Mill. If your reason is like those, then you are probably a consequentialist and there is no need to reinvent the wheel: I can provide some references for you. If you support (2), perhaps on some kind of Newcomb's problem grounds, then this deserves a clear explanation. Why, on account of a tricky paradoxical situation that may not even be possible, will you predictably start choosing to make things worse in situations that are not Newcomb situations? Unless you are explicit about your beliefs, we can't help debug them effectively, and you then can't hold them with confidence, for they won't have undergone peer scrutiny. [The same still goes for your meta-ethical claims].
the value of this memory card, was worth more than the rest of the entire observable universe minus the card
I doubt this would be true. I think the value of the card would actually be close to zero (though I'm not completely sure). It does let one solve the halting problem up to 10,000 states, but only via a procedure with time and space complexity O(busy_beaver(n)), since you still have to run the machines until the card's information lets you rule out the stragglers ever halting. In other words, using the entire observable universe as computing power and the card as an oracle, you might be able to solve the halting problem for 7-state machines or so. Not that good... The same goes for having the first 10,000 bits of Omega. What you really want are the bits of Tau, which directly encode whether the nth machine halts. Sure, you need exponentially more of them, but your computation is then much faster.
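To see where the busy beaver blow-up comes from, here is a toy sketch in Python, with 'machines' stood in for by generators and the card stood in for by a count of how many of them halt (just one way such an oracle could be packaged, not necessarily what the card in the post contains):

    # Toy model: "machines" are generators; some halt, some loop forever.
    # The "card" only says how many of them halt. To learn *which* ones
    # halt, we must run everything until that many have stopped, i.e.
    # until the slowest halting machine finishes -- the analogue of
    # busy_beaver(n) steps.

    def halts_after(k):
        def machine():
            for _ in range(k):
                yield
        return machine()

    def loops_forever():
        def machine():
            while True:
                yield
        return machine()

    machines = {0: halts_after(3), 1: loops_forever(),
                2: halts_after(1000), 3: loops_forever()}
    card_count = 2                    # the "card": how many of them halt

    halted, steps = set(), 0
    while len(halted) < card_count:
        steps += 1
        for i, m in list(machines.items()):
            try:
                next(m)               # advance each running machine one step
            except StopIteration:
                halted.add(i)
                del machines[i]

    print("halting machines:", sorted(halted))   # [0, 2]
    print("steps needed:", steps)                # 1001: set by the slowest halter

The answer only arrives once the slowest genuinely halting machine has finished, which for real Turing machines means on the order of busy_beaver(n) steps; with the bits of Tau you could instead just look each answer up.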