One imagines that software could do what humans do -- hunt around in the space of optimizations until one looks plausible, try to find a proof, and then if it takes too long, try another. This won't necessarily enumerate the set of provable optimizations (much less the set of all safe optimizations), but it will produce some.
To do that it's going to need a decent sense of probability and expected utility. Problem is, OpenCog (and SOAR, too, when I saw it) still rests on a fundamentally certainty-based way of looking at AI tasks, rather than one focused on probability and optimization.
I don't see why this follows. It might be that mildly smart random search, plus a theorem prover with a fixed timeout, plus a benchmark, delivers a steady stream of useful optimizations. The probabilistic reasoning and utility calculation might be implicit in the design of the "self-improvement-finding submodule", rather than an explicit part of the overall architecture. I don't claim this is particularly likely, but neither does undecidability seem like the fundamental limitation here.
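To make the "random search, plus a prover with a fixed timeout, plus a benchmark" picture concrete, here is a minimal sketch. Everything in it is a made-up toy, not anything from OpenCog or a real prover: the candidate list, the cost numbers, and the names `try_prove` and `search` are all hypothetical. The "prover" is just an exhaustive check over a small finite input domain with a step budget, which counts as a real proof only because the domain is finite.

```python
import random

# Toy finite input domain (assumption): exhaustive checking is a genuine proof here.
DOMAIN = range(64)

def reference(x):
    # The original, slow "program": sum of 0..x-1 via a loop.
    total = 0
    for i in range(x):
        total += i
    return total

# Hypothetical candidates: (name, implementation, assumed benchmark cost).
CANDIDATES = [
    ("closed_form", lambda x: x * (x - 1) // 2, 3),
    ("off_by_one", lambda x: x * (x + 1) // 2, 3),
    ("builtin_sum", lambda x: sum(range(x)), 50),
]
REFERENCE_COST = 100  # assumed cost of the original program

def try_prove(candidate, budget):
    """Return True if proved equal on DOMAIN, False if refuted, None on timeout."""
    steps = 0
    for x in DOMAIN:
        if steps >= budget:
            return None  # gave up: neither proved nor refuted
        steps += 1
        if candidate(x) != reference(x):
            return False
    return True

def search(budget, seed=0):
    """Try each candidate once in random order; keep proved, cheaper ones."""
    rng = random.Random(seed)
    order = list(CANDIDATES)
    rng.shuffle(order)
    best_name, best_cost = "reference", REFERENCE_COST
    accepted = []
    for name, fn, cost in order:
        if cost >= best_cost:
            continue  # the benchmark says this would not be an improvement
        if try_prove(fn, budget) is True:
            best_name, best_cost = name, cost
            accepted.append(name)
    return best_name, accepted
```

With a budget of 64 steps the loop ends up on `closed_form`; with a budget of 10 it proves nothing and keeps the reference, although it still refutes `off_by_one` cheaply. There is no explicit probability or utility calculation anywhere: whatever judgment exists is baked into the candidate generator, the timeout, and the benchmark.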
Personal opinion: OpenCog is attempting to get as general as it can within the logic-and-discrete-maths framework of Narrow AI. They are going to hit a wall as they try to connect their current video-game-like environment to the real world, and find that they failed to integrate probabilistic approaches reasonably well. Also, without probabilistic approaches, you can't get around Rice's Theorem to build a self-improving agent.
Wellll.... the agent could make "narrow" self-improvements. It could build a formal specification for a few of its component parts and then perform the equivalent of provable compiler optimizations. But it would have a very hard time strengthening its core logic, as Rice's Theorem would interfere: proving that certain improvements are improvements (or, even, that the optimized program performs the same task as the original source code) would be impossible.
But it would have a very hard time strengthening its core logic, as Rice's Theorem would interfere: proving that certain improvements are improvements (or, even, that the optimized program performs the same task as the original source code) would be impossible.
This seems like the wrong conclusion to draw. Rice's theorem (and other undecidability results) imply that there exist optimizations that are safe but cannot be proven to be safe. It doesn't follow that most optimizations are hard to prove. One imagines that software could do what humans do -- hunt around in the space of optimizations until one looks plausible, try to find a proof, and then if it takes too long, try another. This won't necessarily enumerate the set of provable optimizations (much less the set of all safe optimizations), but it will produce some.
You might look into all the work that's been done with functional MRI analysis of the brain -- your post reminds me of that. The general technique of "watch the brain and see which regions have activity correlated with various mental states" is well known, well enough that all sorts of limitations and statistical difficulties have been pointed out (see Wikipedia for citations).
One man's modus ponens is another man's modus tollens.
In other words, even if this is completely correct, it doesn't disprove relativity. Rather, it disproves either relativity or most versions of utilitarianism--pick one.
It seems like all it shows is that we ought to keep our utility functions Lorentz-invariant. Or, more generally, when we talk about consequentialist ethics, we should only consider consequences that don't depend on aspects of the observer that we consider irrelevant.
I've seen the topic of flow discussed in a wide range of circles, from the popular media to very specialized forums. It seems like people are in general agreement that a flow state would be ideal when working, and is generally easy to induce when doing something like coding, since it meets most of the requirements for a flow-inducing activity.
I'm curious if anyone has made a substantial effort to reach a 'flow' state in tasks outside of coding, like reading or doing math, and what they learned. Are there easy tricks? Is it possible? Is flow just a buzzword that doesn't really mean anything?
I find reading is just about the easiest activity to get into that state with. I routinely get so absorbed in a book that I forget to move. And I think that's the experience of most readers. It's a little harder with programming actually, since there are all these pauses while I wait for things to compile or run, and all these times when I have to switch to a web browser to look something up. With reading, you can just keep turning pages.
The canonical example is that of a child who wants to steal a cookie. That child gets its morality mainly from its parents. The child strongly suspects that if it asks, all parents will indeed confirm that stealing cookies is wrong. So it decides not to ask, and happily steals the cookie.
I find this example confusing. I think what it shows is that children (humans?) aren't very moral. The reason the child steals instead of asking isn't anything to do with the child's subjective moral uncertainty -- it's that the penalty for stealing-before-asking is lower than stealing-after-asking, and the difference in penalty is enough to make "take the cookie and ask forgiveness if caught" better than "ask permission".
I suspect this is related to our strong belief in being risk-averse when handing out penalties. If I think there's a 50% chance my child misbehaved, the penalty won't be 50% of what it would be if they were caught red-handed. Often, if there's substantial uncertainty about guilt, the penalty is basically zero -- perhaps a warning. Here, the misbehavior is "doing a thing you knew was wrong". Even if the child knows the answer in advance, when the child explicitly asks and is refused, the parent gets new evidence about the child's state of mind, and this is the evidence that really matters.
I suspect this applies to the legal system and society more broadly as well -- because we don't hand out partial penalties for possible guilt, we encourage people to misbehave in ways that are deniable.
It seems worth reflecting on the fact that the point of the foundational LW material discussing utility functions was to make people better at reasoning about AI behavior and not about human behavior.
I think part of Eliezer's point was also to introduce decision theory as an ideal for human rationality. (See http://lesswrong.com/lw/my/the_allais_paradox/ for example.) Without talking about utility functions, we can't talk about expected utility maximization, so we can't define what it means to be ideally rational in the instrumental sense (and we also can't justify Bayesian epistemology based on decision theory).
So I agree with the problem stated here, but "let's stop talking about utility functions" can't be the right solution. Instead we need to emphasize more that having the wrong values is often worse than being irrational, so until we know how to obtain or derive utility functions that aren't wrong, we shouldn't try to act as if we have utility functions.
Without talking about utility functions, we can't talk about expected utility maximization, so we can't define what it means to be ideally rational in the instrumental sense
I like this explanation of why utility-maximization matters for Eliezer's overarching argument. I hadn't noticed that before.
But it seems like utility functions are an unnecessarily strong assumption here. If I understand right, expected utility maximization and related theorems imply that if you have a complete preference over outcomes, and have probabilities that tell you how decisions influence outcomes, you have implicit preferences over decisions.
But even if you have only partial information about outcomes and partial preferences, you still have some induced ordering of the possible actions. We lose the ability to show that there is always an optimal 'rational' decision, but we can still talk about instances of irrational decision-making.
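One way to make this concrete (a sketch of one possible formalization, not necessarily the one intended above) is to represent partial preferences as a set of admissible utility functions and say that action A beats action B only when A has at least as much expected utility under every one of them, and strictly more under at least one. The actions, outcomes, and numbers below are invented purely for illustration.

```python
from fractions import Fraction as F

# Hypothetical actions: each maps outcomes to probabilities.
ACTIONS = {
    "a": {"good": F(1, 2), "bad": F(1, 2)},
    "b": {"good": F(1, 4), "bad": F(3, 4)},
    "c": {"good": F(1, 2), "odd": F(1, 2)},
}

# Partial preferences (assumption): every admissible utility function ranks
# "good" above "bad", but they disagree about where "odd" belongs.
UTILITIES = [
    {"good": 1, "bad": 0, "odd": 2},
    {"good": 1, "bad": 0, "odd": -1},
]

def expected_utility(action, utility):
    return sum(p * utility[outcome] for outcome, p in ACTIONS[action].items())

def dominates(x, y):
    """True if x is at least as good as y under every utility, better under one."""
    diffs = [expected_utility(x, u) - expected_utility(y, u) for u in UTILITIES]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)

def dominated_actions():
    """Actions that some other action beats under every admissible utility."""
    return {y for y in ACTIONS for x in ACTIONS if x != y and dominates(x, y)}
```

Here `dominated_actions()` returns `{"b"}`: choosing "b" is irrational no matter how the open question about "odd" is settled, while "a" and "c" are simply incomparable. So there is no single optimal decision, but there is still a well-defined class of mistakes.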
Essentially every post would have been better if it had included some additional thing. Based on various recent comments I was under the impression that people want more posts in Discussion so I've been experimenting with that, and I'm keeping the burden of quality deliberately low so that I'll post at all.
I appreciate you writing this way -- speaking for myself, I'm perfectly happy with a short opening claim, with the subtleties and evidence emerging in the comments that follow. A dialogue can be a better way to illuminate a topic than a long comprehensive essay.
That is a monumentally difficult undertaking, unfeasible with current hardware limitations, certainly impossible in the "moments" timescale.
Doing an audit to catch all vulnerabilities is monstrously hard. But finding some vulnerabilities is a perfectly straightforward technical problem.
It happens routinely that people develop new and improved vulnerability detectors that can quickly find vulnerabilities in existing codebases. I would be unsurprised if better optimization engines in general lead to better vulnerability detectors.