I like the idea from Pretentious Penguin that IIA might not be satisfied in general, but that if you first get the agent to read A, B, and C, and then offer either {A,B} or {A,B,C} as the option set, (a specific instance of) IIA could be satisfied in that context.
You can gain info by being presented with more options, but once you have gained info, you could just be invariant to being presented with the same info again.
So you would get IIA*: "whether you prefer option A or B is independent of whether I offer you an irrelevant option C, provided that you had already processed {A,B,C} beforehand".
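One way to write this down (my own notation, not from the thread): let $C_m$ denote the agent's choice function once its internal state $m$ already incorporates having processed $\{A,B,C\}$. Then IIA* is roughly the requirement that
\[
C_m(\{A,B\}) \;=\; C_m(\{A,B,C\}) \cap \{A,B\} \qquad \text{whenever } C_m(\{A,B,C\}) \cap \{A,B\} \neq \emptyset,
\]
i.e. once the information carried by C has been absorbed into $m$, offering C alongside A and B no longer changes which of A and B gets chosen.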
You can't have processed all possible information at any finite time, so the above is weaker than the original IIA.
I also haven't checked whether IIA* introduces additional problems of its own.
Is there actually (only) a small number of moral worldviews?
My own moral worldview cares about the journey and has a bunch of preferences for "not going too fast", "not losing important stuff", "not cutting lives short", and "not forcing people to grow up much faster than they would like". But my own moral worldview also cares about not having the destination artificially limited. From my vantage point (which is admittedly just intuitions barely held together with duct tape), it seems plausible that there is a set of intermediate preferences between MM and GG, somewhat well-indexed by a continuous "comfortable speed". Here are some questions on which I think people might differ according to their "comfortable speed":
- how far behind the frontier are you entitled to remain and still live a nice life? (i.e. how much should we subsidize people who wait for the second generation of upload tech before uploading?)
- how much risk are you allowed to take in pushing the frontier?
- how much consensus do we require to decide the path forward to greater capabilities? (eg choosing which of the following is legitimate: genetic edits, uploading, or artificial intelligence)
- how much control do we want to exert over future generations?
- how much do you value a predictable future?
- how comfortable are you with fast change?
If my "comfortable speed" model is correct, then maybe you would want to assign regions of the lightcone to various preferences according to some gradient.
There can also be preferences over how much the present variance within humanity keeps interacting in the future.
With this picture in mind, one can see that whether an endeavor is infinite depends on one's measure: e.g., if all you're interested in within mathematics is finding a proof of some single theorem, then maybe "math" seems finite to you.
Informally and intuitively, securing the future (in a conflict) naturally induces a measure that makes math seem finite, but enjoying the future (after you have somewhat secured it) naturally induces a measure that makes math seem infinite.
- If E is an infinite endeavor, then "how should one do E?" is also infinite.[18] For example: math is infinite, so "how should one do math?" is infinite; ethics is infinite, so "how should one do ethics?" is infinite.[19]
- If E is an infinite endeavor and there is a "faithful reduction" of E to another endeavor F, then F is also infinite. (In particular, if an infinite endeavor E is "faithfully" a subset of another endeavor F, then F is also infinite.)[20] For example, math being infinite implies that stuff in general is infinite; "how should one do math?" being infinite implies that "how should one think?" is infinite.
- If an endeavor constitutes a decently big part of an infinite endeavor, then it is infinite.[21][22] For example, to the extent that language is and will remain highly load-bearing in thinking, [figuring out how thinking should work] being infinite implies that [figuring out how language should work] is also infinite.
Important and useful, but also keep in mind:
a'. If "how should one do E?" is a finite endeavor, then E is finite.
b'. If F is finite, then any E that reduces to F must also be finite.
c'. If an endeavor contains not too much more than a finite endeavor, then it is finite.
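For what it's worth (my own remark), reading "finite" as "not infinite", a' and b' are just the contrapositives of the first two closure properties above; e.g. for the first one:
\[
\big(E \text{ infinite} \Rightarrow \text{"how should one do } E\text{?" infinite}\big)
\;\Longleftrightarrow\;
\big(\text{"how should one do } E\text{?" finite} \Rightarrow E \text{ finite}\big),
\]
and c' is the contrapositive of the third property once "E is a decently big part of F" is read as "F contains not too much more than E". So a'-c' add no new content, but they are the forms one reaches for when arguing that something is finite.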
Perhaps "infinitude clusters"? Perhaps "infinitudes don't exist alone"? Perhaps "infinitudes are not isolated"? Perhaps "non-solitary infinitudes"?
I want to preclude scenarios which look like doing a bunch of philosophy.
A (perhaps related) pathology: a mind that, whenever it sees the symbol "0", must prove a novel theorem in ZFC in order to complete the perception of the symbol. For such a mind, many (but not all!) repetitive and boring tasks will induce an infinite endeavor (because the mind contains such an endeavor).
I think it is a mistake to think of these as finite problems[12] — they are infinite.
It is often possible to build multiple instances of a "big/small" dichotomy, such that the theory has similar results. (e.g. [small="polynomial time"] vs [small="decidable"]).
When I imagine using your concept to think about AI, I want to say that cloning a strawberry/proving the Riemann hypothesis is a finite task that nevertheless likely implies an endeavor of a different character than [insert something you could do manually in a few months].
I wonder if the [small=finite] version of your concept is the one we should be using.
I'd maybe rather say that an infinite endeavor is one for which, after any (finite) amount of progress, the amount of progress that could still be made is greater than the amount of progress already made; or, more precisely, one for which at any point the quantity of "genuine novelty/challenge" that remains to be met is greater than the quantity met already.
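As a rough formalization of that second phrasing (my notation; $\mu$ stands in for whatever measure of "genuine novelty/challenge" one settles on):
\[
E \text{ is infinite} \;\iff\; \text{for every finite course of progress } p \text{ within } E:\;\; \mu(\text{novelty remaining after } p) > \mu(\text{novelty met during } p).
\]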
[Edit: this line of reasoning is addressed later in the post; this comment is an artefact of not having first read the whole post. Specifically, see "With this picture in mind, one can see that whether an endeavor is infinite depends on one's measure".]
Imagine it makes sense to quantify the "genuine novelty" of puzzles using rational numbers, and imagine an infinite sequence of puzzles such that the "genuine novelty" of the i-th puzzle, given puzzles {0..i-1}, is equal to 1. Hence the sum of their "genuine novelty" diverges. Now, create a game which assigns 1/2^{i+1} points to solving the i-th puzzle. My guess as to the intended behaviour of your concept in this "tricky" example:
- the endeavour of maximizing the score on this game is infinite.
- the endeavour of being pretty sure to get a decent score at this game is not infinite.
- the game itself is neither necessarily finite nor infinite.
- for this game to support the weight of math/physics/cooking mushroom pies/etc..., it is not sufficient for me to become "very good at the game".
- The winner of this game is not determined by the finitude of the endeavors of the players [this is trivial, but note, less trivially, that for any finite number N, the prefix sequence (puzzles 0..N-1) maps to a finite endeavor and the postfix sequence (puzzles N, N+1, ...) maps to an infinite endeavor, with values in the game (if we let the endeavors "complete") of 1-1/2^{N} and 1/2^{N} respectively; a quick numerical check is sketched below].
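A quick numerical check of those prefix/postfix values, assuming only the scoring rule above (1/2^{i+1} points for the i-th puzzle, with i starting at 0):

```python
# Sketch: scores in the hypothetical puzzle game where solving the i-th puzzle
# (i = 0, 1, 2, ...) is worth 1/2^(i+1) points.

def prefix_score(n: int) -> float:
    """Score from completing the finite prefix endeavor (puzzles 0..n-1)."""
    return sum(1 / 2 ** (i + 1) for i in range(n))  # = 1 - 1/2^n

def postfix_score(n: int) -> float:
    """Score from completing the infinite postfix endeavor (puzzles n, n+1, ...)."""
    return 1 / 2 ** n  # geometric tail: sum over i >= n of 1/2^(i+1) = 1/2^n

for n in (1, 3, 10):
    print(n, prefix_score(n), postfix_score(n))
# e.g. n=3 -> prefix 0.875 = 1 - 1/8, postfix 0.125 = 1/8; the two always sum to 1
```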
A different way to make the "game" a test of the concept boundary would be: you get 1 point if you give the solution to at least one of the puzzles in the infinite sequence, else you get 0 points. This game lets you trivially construct infinite endeavors, despite yielding (for every infinite endeavor X) no benefit over a finite prefix (of X). Understanding all the different ways you could have won the game is an infinite endeavor.
It is good to notice the spectrum above. Likely, for a fixed amount of compute/effort, one extreme of this spectrum gets much less agency than the other extreme. Call that the direct effect.
Are there other direct effects? For instance, do you get the same ability to "cure cancer" for a fixed amount of compute/effort across the spectrum? Agency seems useful, so the ability you get per unit of compute is probably correlated with the agency across this spectrum.
If we are in a setting where an outside force demands that you reach a given ability level, then this second effect matters indirectly, because it means you will have to use a larger amount of compute.
[optional] To illustrate this problem, consider something that I don't think anyone considers safer: instead of using gradient descent, just sample the weights of the neural net at random until you get a low loss. (I am not trying to make an analogy here.)
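A toy sketch of that procedure (my own illustrative example, using a tiny linear model instead of a real neural net):

```python
# Toy illustration of "sample the weights at random until you get a low loss",
# as an (extremely inefficient) alternative to gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # toy inputs
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)   # toy targets

def loss(w: np.ndarray) -> float:
    return float(np.mean((X @ w - y) ** 2))

threshold = 0.05
best_loss = float("inf")
for draws in range(1, 500_000 + 1):
    w = rng.normal(size=3) * 3.0              # sample candidate weights at random
    best_loss = min(best_loss, loss(w))
    if best_loss < threshold:
        break

# Typically needs orders of magnitude more draws than gradient descent needs steps.
print(draws, best_loss)
```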
It would be great if someone had a way to compute the "net" effect on agency across the spectrum, also taking into account the indirect path (more compute needed -> more compute -> more agency). I suspect the answer might depend on which ability you need to reach, and we might or might not be able to figure it out without experiments.
Good point. I think that if you couple the answers of an oracle to reality by some random process, then you are probably fine.
However, many people want to use the outputs of the oracle in very obvious ways. For instance, you ask it what code you should put into your robot, and then you just put that code into the robot.
Could we have an oracle (i.e. something trained according to some Truth criterion) such that, when you use it very straightforwardly like this, it exerts optimization pressure on the world?
One potential issue with "non-EA ideologies don’t even care about stars" is that in biological humans, ideologies don't get transmitted perfectly across generations.
It might matter (a lot) whether [the descendants of the humans currently subscribing to "non-EA ideologies" who end up caring about stars] feel trapped in an "unfair deal".
The above problem might be mitigated by allowing migration between the two zones (as long as the rules of the zones are respected). (i.e. the children of star-dwellers who want to come back can do so unless they would break the invariants that allow earth-dwellers to be happy, perhaps with some extra leeway/accommodation beyond what is allowed for native earth-dwellers; and the children of earth-dwellers who want to start their own colony have some room to do so, reserved in the contract.)
One potential source of other people's disagreement is the following intuition: "surely, once the star-dwellers expand, they will use their overwhelming power to conquer the earth." Related to this intuition is the fact that expansion which starts out exponential will eventually be bounded by cubic growth (and eventually quadratic, due to gravitational effects, etc.). Basically, a deal is struck now, in conditions of plenty, but eventually resources will grow scarce, and by then the balance of power will have decayed to nothing.