A tangential note on third-party technical contributions to LW (if that's a thing you care about): the uncertainty about whether changes will be accepted, the uncertainty about (and lack of visibility into) how that decision is made or even who makes it, and the lack of a known process for making pull requests or getting feedback on ideas are incredibly anti-motivating.
So what happens when AIXI determines that there's a large computer, call it BRAIN, whose outputs tend to exactly correlate with its own? AIXI may then discover the hypothesis that the observed effects of AIXI's outputs on the world are really caused by BRAIN's outputs. It may attempt to test this hypothesis by making some trivial modification to BRAIN so that its outputs differ from AIXI's at some inconsequential time (not by dropping an anvil on BRAIN, because that would be very costly if the hypothesis is true). After verifying this, AIXI may then determine that various hardware improvements to BRAIN will cause its outputs to more closely match those of the theoretical Solomonoff inductor, thus improving AIXI's long-term payoff.
I mean, AIXI is waaaay too complicated for me to actually properly predict, but is this scenario actually so unreasonable?
Other possible implications of this scenario have been discussed on LW before.
If you can't do either of these things, then you have little hope of choosing correct contrarian beliefs.
Notably, even if you can't do either of these things, sometimes you can rationally reject the mainstream position if you can conclude that the incentive structure for the "typical experts" makes them hopelessly biased in a particular direction.
This shouldn't lead to rejection of the mainstream position, exactly, but rejection of the evidential value of mainstream belief, and reversion to your prior belief / agnosticism about the object-level question.
Not sure I agree. Let me try to spell it out.
Here's what I always assumed a UDT agent to be doing: on one hand, it has a prior over world programs - let's say without loss of generality that there's just one world program, e.g. a cellular automaton containing a computer containing an operating system running the agent program. It's not pre-sliced, the agent doesn't know what transistors are, etc. On the other hand, the agent knows a quined description of itself, as an agent program with inputs and outputs, in whatever high-level language you like.
Then the agent tries to prove theorems of the form, "if the agent program implements such-and-such mapping from inputs to outputs, then the world program behaves a certain way". To prove a theorem of that form, the agent might notice that some things happening within the world program can be interpreted as "inputs" and "outputs", and the logical dependence between these is provably the same as the mapping implemented by the agent program. (So the agent will end up finding transistors within the world program, so to speak, while trying to prove theorems of the above form.) Note that the agent might discover multiple copies of itself within the world and set up coordination based on different inputs that they receive, like in Wei's post.
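The procedure described above can be sketched in a toy form. Everything here is my own simplified construction, not any actual UDT implementation: the "proof" step is replaced by direct evaluation of a tiny stand-in world program, and the hypothetical `world_program` payoff structure is invented purely for illustration.

```python
# Toy sketch of the UDT-style procedure described above: enumerate
# candidate input->output mappings, determine what the world program
# does if the agent implements each mapping, and pick the mapping
# whose consequences score best. Real UDT would prove theorems about
# an opaque world program; here we just evaluate a transparent one.

from itertools import product

INPUTS = [0, 1]   # possible observations the agent program can receive
OUTPUTS = [0, 1]  # possible actions it can emit

def world_program(mapping):
    """Stand-in for the single world program from the prior. Somewhere
    inside it, the agent's mapping gets applied to an embedded 'input'.
    This toy world pays off iff the agent outputs 1 on input 1 and 0 on
    input 0 (an arbitrary logical dependence, chosen for illustration)."""
    return 10 if (mapping[1] == 1 and mapping[0] == 0) else 0

def udt_choose():
    best_mapping, best_utility = None, float("-inf")
    # Each candidate policy is a total mapping from inputs to outputs.
    for outs in product(OUTPUTS, repeat=len(INPUTS)):
        mapping = dict(zip(INPUTS, outs))
        # "If the agent program implements this mapping, then the world
        # program behaves a certain way" -- here verified by evaluation.
        utility = world_program(mapping)
        if utility > best_utility:
            best_mapping, best_utility = mapping, utility
    return best_mapping, best_utility
```

Note that the agent chooses over whole mappings, not over individual actions after seeing an input, which is what lets multiple copies receiving different inputs coordinate.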
This approach also has a big problem, which is roughly the opposite of the problem described in RobbBB's post. Namely, it requires us to describe our utility function at the base level of reality, but that's difficult because we don't know how paperclips are represented at the base level of reality! We only know how we perceive paperclips. Solving that problem seems to require some flavor of Paul's "indirect normativity", but that's broken and might be unfixable as I've discussed with you before.
Solving that problem seems to require some flavor of Paul's "indirect normativity", but that's broken and might be unfixable as I've discussed with you before.
Do you have a link to this discussion?
That seems to be seriously GAZP-violating. I'm trying to figure out how to put my thoughts on this into words, but... there doesn't seem to be anywhere the data is stored that could "notice" the difference. The actual program that is being the person doesn't contain a "realness counter". There's nowhere in the data that could "notice" that there's, well, more of the person. (Whatever it even means for there to be "more of a person".)
Personally, I'm inclined in the opposite direction: even N separate copies of the same person are the same as 1 copy of the same person until they diverge, and how much difference there is between them is, well, how separate they are.
(Though, of course, those funky Born stats confuse me even further.) But I'm fairly inclined toward the view that extra copies of the exact same mind don't add more person-ness, though as they diverge from each other, there may be more person-ness. (Perhaps it would be meaningful to talk about additional fractions of person-ness rather than just one and then suddenly two whole persons. I'm less sure on that.)
Why not go a step further and say that 1 copy is the same as 0, if you think there's a non-moral fact of the matter? The abstract computation doesn't notice whether it's instantiated or not. (I'm not saying this isn't itself really confused - it seems like it worsens and doesn't dissolve the question of why I observe an orderly universe - but it does seem to be where the GAZP points.)
Have Eliezer's views (or anyone else's who was involved) on the Anthropic Trilemma changed since that discussion in 2009?
I wonder if it would be fair to characterize the dispute summarized in/following from this comment on that post (and elsewhere) as over whether the resolutions to (wrong) questions about anticipation/anthropics/consciousness/etc. will have the character of science/meaningful non-moral philosophy (crisp, simple, derivable, reaching consensus across human reasoners to the extent that settled science does), or that of morality (comparatively fuzzy, necessarily complex, not always resolvable in principled ways, not obviously on track to reach consensus).
Where Recursive Justification Hits Bottom and its comments should be linked for their discussion of anti-inductive priors.
(Edit: Oh, this is where the first quote in the post came from.)
I agree with the message, but I'm not sure whether I think things with a binomial monkey prior, or an anti-inductive prior, or that don't implement (a dynamic like) modus ponens on some level even if they don't do anything interesting with verbalized logical propositions, deserve to be called "minds".
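To make the "binomial monkey prior" point concrete, here's a minimal illustration of my own (not from the linked post): a mind with a uniform prior over all bit sequences can never learn from observation, because under that prior the posterior probability of the next bit being 1 stays exactly 1/2 no matter what it has seen.

```python
# Under a uniform prior over all bit sequences of a fixed length
# (the "binomial monkey" prior), observations carry no evidence:
# P(next bit = 1 | history) = 1/2 for every history.

from fractions import Fraction
from itertools import product

def posterior_next_is_one(history, seq_len=10):
    """P(next bit = 1 | history) under a uniform prior over all bit
    sequences of length seq_len. Assumes len(history) < seq_len."""
    consistent = [s for s in product([0, 1], repeat=seq_len)
                  if list(s[:len(history)]) == history]
    ones = sum(1 for s in consistent if s[len(history)] == 1)
    return Fraction(ones, len(consistent))
```

Even after observing nine 1s in a row, `posterior_next_is_one([1] * 9)` is still 1/2: such a prior is not just unhelpful but constitutionally incapable of induction, which is part of why I hesitate to call a thing running it a "mind".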
I included the word "sufficient" as an ass-covering move, because one facet of the problem is we don't really know what will serve as a "sufficient" amount of training data in what context.
But what specific types of tasks do you think machines still can't do, given sufficient training data? If your answer is something like "physics research," my rejoinder would be that if you could generate training data for that job, a machine could do it.
I don't see how we know, or anything close to know, that deep NNs with "sufficient training data" would be sufficient for all problems. We've seen them be sufficient for many different problems and can expect them to be sufficient for many more, but all?