I don't mean alignment with human concerns. I mean that the AI itself is engaged in the same project we are: building a smarter system than itself. So if it's hard for us to control the alignment of such a system, it should be hard for the AI as well. (In theory you can imagine that it's only hard at our specific level of intelligence, but in fact all the arguments that AI alignment is hard seem to apply equally well to the AI making an improved AI as to us making an AI.)
See my reply above. The AI x-risk arguments require the assumption that superintelligence nece...
Those are reasonable points, but note that the arguments for AI x-risk depend on the assumption that any superintelligence will necessarily be highly goal-directed. Thus, either the argument fails because superintelligence doesn't imply goal-directedness,
And given that simply maximizing the intelligence of future AIs is merely one goal in a huge space, it seems highly unlikely that (especially if we try to avoid this one goal) we just get super unlucky and the AI ends up with the one goal that is compatible with improvement.
I like the idea of this sequence, but -- given the goal of spelling out the argument in terms of first principles -- I think more needs to be done to make the claims precise or to acknowledge that they are not.
I realize that you might be unable to be more precise given the lack of precision in this argument generally -- I don't understand how people have invested so much time/money on research to solve the problem and so little on making the argument for it clear and rigorous -- but if that's the case I suggest you indicate where the definitions are insufficient...
Maybe this question has already been answered, but I don't understand how recursive self-improvement of AIs is compatible with the AI alignment problem being hard.
I mean, doesn't the AI itself face the alignment problem when it tries to improve/modify itself substantially? So wouldn't a sufficiently intelligent AI refuse to create such an improvement for fear that the goals of the improved AI would differ from its own?
I'd just like to add that even if you think this piece is completely mistaken, I think it certainly shows we are not knowledgeable enough about what values and motives are, and how they work in us (much less in AI), to confidently predict that AIs will be usefully described by a single global utility function, or that they will work to subvert their reward system, or the like.
Maybe that will turn out to be true, but before we spend so many resources on trying to solve AI alignment, let's try to make the argument for the great danger much more rigorous first... that's usually the best way to start anyway.
This is one of the most important posts ever on LW though I don't think the implications have been fully drawn out. Specifically, this post raises serious doubts about the arguments for AI x-risk as a result of alignment mismatch and the models used to talk about that risk. It undercuts both Bostrom's argument that an AI will have a meaningful (self-aware?) utility function and Yudkowsky's reward button parables.
The role these two arguments play in convincing people that AI x-risk is a hard problem is to explain why, if you don't anthropomorphize shoul...
I absolutely think that the future of online marketing involves more asking people for their preferences. I know I go into my settings on Google to actively curate what they show me.
Indeed, I think Google is leaving a fucking pile of cash on the table by not adding an "I dislike this" button and a little survey on their ads.
I feel there is something else going on here too.
Your claimed outside view asks us to compare a clean codebase with an unclean one, and I absolutely agree that it's a good case for using currentDate when initially writing code.
But you motivated this by considering refactoring, and I think things go off the rails there. If the only issue in your codebase was that you called currentDate yyymmdd consistently, or even had other consistently weird names, it wouldn't be a mess; it would just have slightly weird conventions. Any coder working on it for a non-trivial len...
I don't think this is a big problem. The people who use ad blockers are both a small fraction of internet users and the most sophisticated ones, so I doubt they are a major issue for website profit. I mean, sure, Facebook is eventually going to try to squeeze out the last few percent of users if they can do so with an easy countermeasure, but if this was really a big concern websites would be pushing to get that info back from the company they use to host ads. Admittedly, when I was working on ads for Google (I'm not cut out to be out of academia so I went...
Interesting, but I think it's the other end of the equation where the real problem lies: voting. Given the facts that
1) A surprisingly large fraction of the US population has tried hard drugs of one kind or another.
2) Even those who haven't almost surely know people who have, and seem to find it interesting/fascinating/etc., not horrifying behavior that deserves prison time.
So given that people would never dream of sending to prison their friend who tried coke, or even the friend who sold that friend some of his stash, how do we end up with...
So your intent here is to diagnose the conceptual confusion that many people have with respect to infinity, yes? And your thesis is that people are confused about infinity because they think it has a unique referent, while in fact positive and negative infinity are different?
I think you are on to something, but it's a little more complicated, and that's what gets people confused. The problem is that there are in fact a number of different concepts we use the term infinity to describe, which is why it is so super confusing (and I bet there are more...
Stimulants are an excellent short-term solution. If you absolutely need to get work done tonight and can't sleep, amphetamine (e.g. Adderall) is a great option. Indeed, there are a number of studies/experiments (including those the Air Force relies on to justify giving pilots amphetamines) backing up the fact that it improves the ability to get tasks done while sleep deprived.
Of course, if you are having long-term sleep problems it will likely make those problems worse.
There is a lot of philosophical work on this issue, some of which recommends taking conditional probability as the fundamental unit (in which case Bayes' theorem only applies for non-extremal values). For instance, see this paper.
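To make the parenthetical concrete (a standard fact, not something specific to the linked paper): the ratio form of Bayes' theorem presupposes the conditioning event has positive probability.

```latex
% Ratio definition: P(A | B) = P(A \cap B) / P(B), which requires P(B) > 0.
\[
P(A \mid B) \;=\; \frac{P(B \mid A)\,P(A)}{P(B)}, \qquad P(B) > 0.
\]
% If P(. | .) is instead taken as primitive (e.g. Popper functions),
% P(A | B) can be defined even when P(B) = 0 -- exactly the extremal
% case where the ratio form above no longer applies.
```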
Computability is just \Delta^0_1 definability. There are plenty of other notions of definability you could try to cash out this paradox in terms of. Why pick \Delta^0_1 definability?
If the argument worked for any particular definability notion (e.g. arithmetical definability) it would be a problem. Thus, the solution needs to explain, for any concrete notion of definable set, why the argument doesn't go through for that notion.
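For reference, the standard characterization behind identifying computability with \Delta^0_1 (textbook computability theory, added as background):

```latex
% A set A of natural numbers is computable iff it is \Delta^0_1:
\[
A \in \Delta^0_1 \iff A \in \Sigma^0_1 \ \text{and}\ A \in \Pi^0_1,
\]
% i.e. both A and its complement are computably enumerable. Arithmetical
% definability (\Sigma^0_n / \Pi^0_n for arbitrary n) is strictly broader,
% which is why the choice of definability notion matters here.
```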
But that's not what the puzzle is about. There is nothing about computability in it. It is supposed to be a paradox along the lines of Russell's set of all sets that don't contain themselves.
The response about formalizing what counts as a set defined by an English sentence is exactly correct.
Yah, enumerable means something different than computably enumerable.
This is just the standard Sleeping Beauty paradox, and I'd suggest that the issue isn't unique to FNC.
However, you are a bit quick in concluding it is time-inconsistent, as it's not altogether clear that one is truly referring to the same event before and after you have the observation. The hint here is that in the standard Sleeping Beauty paradox the supposed update involves only information you were already certain you would get.
I'd argue that what's actually going on is that you are evaluating slightly different questions in the two cases.
Don't -- at least outside of Silicon Valley, where oversubscription may actually be a problem. It's a good intention, but it will inevitably make people worry they aren't welcome or aren't the right sort of people. Instead, describe what one does or what one talks about in a way that will appeal to the kind of people who would enjoy coming.
Given that you just wrote a whole post to say hi and share your background with everyone, I'm pretty confident you'll fit right in and won't have any problems with being too shy. Writing a post like this rather than just commenting is such a Less Wrong kind of thing to do, so I think you'll be right at home.
Searle can be any X?? WTF? That's a bit confusingly written.
The intuition Searle is pumping is that since he, as a component of the total system, doesn't understand Chinese, it seems counterintuitive to conclude that the whole system understands Chinese. When Searle says he is the system, he is pointing to the fact that he is doing all the actual interpretation of instructions, and it seems weird to think that the whole system has some extra experiences that let it understand Chinese even though he does not. When Searle uses the word understand he does not
...I essentially agree with you that science can't bridge the is-ought gap (see caveats) but it's a good deal more complicated than the arguments you give here allow for (they are a good intro but I felt it's worth pointing out the complexities).
I agree with your general thrust, except for your statement that "you longtermists can simply forgo your own pleasure wireheading and instead work very hard on the whole growth and reproduction agenda": if we are able to wirehead in an effective manner, it might be morally obligatory to force them into wireheading to maximize utility.
Also, your concern about some kind of disaster caused by wireheading addiction and resulting deaths and damage is pretty absurd.
Yes, people are more likely to do drugs when they are more available, but even if the government can't bar the devices that enable wireheading from legal purchase, it will still require a greater effort to put together your wireheading setup than it currently does to drive to the right part of the nearest city (discoverable via Google) and purchase some heroin. Even if it did become very easy to access it's still not true that m
...You make a lot of claims here that seem unsupported and based on nothing but vague analogy with existing primitive means of altering our brain chemistry. For instance, a key claim that most of your consequences seem to depend on is this: "It is great to be in a good working mood, where you are in the flow and every task is easy, but if one feels “too good”, one will be able only to perform “trainspotting”, that is mindless staring at objects."
Why should this be true at all? The reason heroin abusers aren't very productive (and, imo, heroin isn't the mo
...Skimming the paper I'm not at all impressed. In particular, they make frequent and incautious use of all sorts of approximations that are only valid up to a point under certain assumptions but make no attempt to bound the errors introduced or justify the assumptions.
This is particularly dangerous to do in the context of trying to demonstrate a failure of the 2nd law of thermodynamics, as the very thought experiments that might be useful will generally break those heuristics and approximations. Worse, the 2nd law is only a statistical regularity, not a true
...Not everyone believes that everything is commensurable, and people often wish to be able to talk about these issues without implicitly presuming that fact.
Moreover, "values" suggests something that is desirable because it is a moral good. A priority can be something I just happen to selfishly want. For instance, I might hold diminishing suffering as a value, yet my highest current priority might be torturing someone to death because they killed a loved one of mine (having that priority would be a moral failing on my part, but that doesn't make it impossible).
Two thoughts.
First, as a relatively in-shape person who walks a ton (no car, living in the Midwest), I can attest that I often wish I had a golf cart/scooter solution. They don't need to be a replacement for walking (though it's good that they can be); they might also appeal to those of us who like to walk a lot but need a replacement for a car when it gets really hot or we need to carry groceries (motorcycle-style scooters require licenses and can't always be driven on campuses or in parks). It would be great if these became less socially disapproved of for the no
...Except that if you actually go try to do the work, people's pre-theoretic understanding of rationality doesn't correspond to a single precise concept.
Once you step into Newcomb-type problems it's no longer clear how decision theory is supposed to correspond to the world. You might be tempted to say that decision theory tells you the best way to act... but it no longer does that, since it's not that the two-boxer should have picked one box. The two-boxer was incapable of so picking, and what EDT is telling you is something more like: you sho...
First, let me say I 100% agree with the idea that there is a problem in the rationality community of viewing rationality as something like momentum or gold (I named my blog rejectingrationality after this phenomenon and tried to deal with it in my first post).
However, I'm not totally sure everything you say falls under that concept. In particular, I'd say that rationality realism is something like the belief that there is a fact of the matter about how best to form beliefs or take actions in response to a particular set of experiences and that ...
If you want to argue against that piece of reasoning, give it a different name, because it's not the Chinese room argument. I took multiple graduate classes with Professor Searle and, while there are a number of details one could quibble over, Said definitely gets the overall outline correct, and the argument you advanced is not his Chinese room argument.
That doesn't mean we can't talk about your argument; just don't insist it is Searle's Chinese room argument.
To the extent they define a particular idealization, it's one which isn't interesting/compelling. What one would need, to say there was a well-defined question here, is a single definition of what a rational agent is that everyone agreed on, which one could then show favors such-and-such decision theory.
To put the point differently, you and I can agree on absolutely every fact about the world and mathematics and yet disagree about which is the best decision theory, because we simply mean slightly different things by "rational agent". Moreover,...
Obviously you can, and if you call that NEW idealization an X-agent (or, more likely, redefine the word rationality in that situation), then there may be a fact of the matter about how an X-agent will behave in such situations. What we can't do is assume that there is a fact of the matter about what a rational agent will do that outstrips the definition.
As such, it doesn't make sense to say CDT or TDT or whatever is right before introducing a specific idealization relative to which we can prove it gives the correct answer. But that idealiz...
Your criticism of the philosophy/philosophers is misguided on a number of counts.
1. You're basing those criticisms on the presentation in a video designed to present philosophy to the masses. That's like reading some phys.org article claiming that electrons can be in two locations at once and using that to criticize the theory of quantum mechanics.
2. The problem philosophers are interested in addressing may not be the one you are thinking of. Philosophers would never suggest that the assumption of logical omniscience prevents one from usi...
In particular, I'd argue that the paradoxical aspects of Newcomb's problem result from exactly this kind of confusion between the usual agent idealization and the fact that actual actors (human beings) are physical beings subject to the laws of physics. The apparent paradox results because we are used to idealizing individual behavior in terms of agents, where that formalism requires we specify the situation in terms of a tree of possibilities, with each path corresponding to an outcome and with the payoff computed by looking at the pa...
It doesn't really make sense to talk about the agent idealization at the same time as talking about effective precommitment (i.e. deterministic/probabilistic determination of actions).
The notion of an agent is an idealization of actual actors in terms of free choices, e.g., idealizing individuals in terms of choices of functions on game-theoretic trees. This idealization is incompatible with thinking of such actors as being deterministically or probabilistically committed to actions for those same 'choices.'
Of course, ultimately, actual...
You are getting the statement of the Chinese room wrong. The claim isn't that the human inside the room will learn Chinese. Indeed, it's a key feature of the argument that the person *doesn't* ever count as knowing Chinese. It is only the system consisting of the person plus all the rules written down in the room, etc., which knows Chinese. This is what's supposed to be (but isn't convincingly, IMO) an unpalatable conclusion.
Secondly, no one is suggesting that there isn't an algorithm that can be followed which makes it appear as i...
We can identify places we know (inductively) tend to lead us astray, and even identify tricks that help us avoid common fallacies which often afflict humans. However, it's not at all clear that this actually makes us more rational in any sense.
If you mean act-rationality, we'd have to study whether this was a good life choice. If you mean belief-rationality, you'd have to specify some kind of measure/notion of importance to decide when it really matters that you believed the true thing. After all, if it's just maximizing the numb...
The inversely proportional thing is a bad move. Sorting through potential charitable causes is itself charitable work, and it's just crazy inefficient to do that by everyone voting on what tugs at their heartstrings rather than by paying someone smart to consider all the various pet causes and evaluate them. Worse, the causes that are least well known are often unknown for very good reason, but will now get special attention.
The reason you are right about cases like the doctor example is that when you are actually in a location that then gets hit you *are* l...
Note that your whole delegation argument rests on the idea that you have (and know you have) some kind of superior knowledge (or virtue) about what needs to get done, and you're just searching for the best way to get it done. The reason it made sense to stay involved in the local campaign was that you had the special advantage of being the person who knew the right way to improve the city, so you could offer something more than any other equally virtuous person you might hand the money to instead.
In contrast, in the village case you *don't* ha...
Why assume whatever beings simulated us evolved?
Now I'm sure you're going to say that a universe where intelligent beings just pop into existence fully formed is surely less simple than one where they evolve. However, when you give it some more thought that's not true, and it's doubtful whether Occam's razor even applies to initial conditions.
I mean, suppose for a moment the universe is perfectly deterministic (Newtonian, or a no-collapse interpretation). In that case the Kolmogorov complexity of a world starting with a big bang that gives rise...
No, because we want the probability of being a simulation conditional on having complex surroundings, not the probability of having complex surroundings conditional on being a simulation. The fact that a very great number of simulated beings are created in simple universes doesn't mean that none is ever simulated in a complex one, nor does it tell us anything about whether being such a simulation is more likely than being in a physical universe.
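In Bayes' theorem form (just restating the point symbolically, with "sim" and "complex" as informal shorthand):

```latex
\[
P(\mathrm{sim} \mid \mathrm{complex})
  \;=\; \frac{P(\mathrm{complex} \mid \mathrm{sim})\,P(\mathrm{sim})}{P(\mathrm{complex})}
\]
% A small P(complex | sim) does not by itself make P(sim | complex) small;
% that also depends on the prior P(sim) and on P(complex).
```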
Ok, this is a good point. I should have added a requirement that the true solution is C-infinity on the part of the manifold that isn't in our temporal past. The backwards heat equation is ill-posed for this reason... it can't be propagated arbitrarily far forward (i.e. back).
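To spell out the standard reason the backwards heat equation is ill-posed (a textbook fact, added for context):

```latex
% For u_t = u_{xx} with periodic data, each Fourier mode evolves as
\[
\hat{u}_k(t) \;=\; \hat{u}_k(0)\, e^{-k^2 t}.
\]
% Running time backwards multiplies mode k by e^{+k^2 t}, so an arbitrarily
% small high-frequency perturbation of the data produces an arbitrarily large
% change in the recovered solution: no continuous dependence on the data,
% hence ill-posed in the sense of Hadamard.
```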
Which way you think this goes probably depends on just how strongly you think Occam's razor should be applied. We are all compelled to let the probability of a theory's truth go to zero as its Kolmogorov complexity goes to infinity, but there is no prima facie reason to think it drops off particularly fast or slow. If you think, as I do, that there is only relatively weak favoring of simpler scientific laws, while intelligent creatures would quite strongly favor simplicity as a cognitive technique for managing complexity, you get my conclusion. But I'll admit the other direction isn't implausible.
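One way to make the fast-or-slow point concrete (my gloss on a standard observation, not something argued above): normalization forces the limit but not the rate.

```latex
% Any prior over countably many theories T must satisfy P(T) -> 0 along
% any enumeration, so in particular as K(T) -> infinity. The Solomonoff-style
% choice fixes one specific, fast rate:
\[
P(T) \;\propto\; 2^{-K(T)},
\]
% but the limit condition alone does not single out that rate; how sharply
% the prior should penalize complexity is exactly what is in dispute.
```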
The problem with this kind of analysis is that one is using the intuition of a physical scenario to leverage an ambiguity in what we mean by agent and decision.
Ultimately, the notions of decision and agent are idealizations. Any actual person or AI only acts as the laws of physics dictate, and agents, decisions, and choices don't appear in any description in terms of fundamental physics. Since people (and programs) are complex systems that often make relatively sophisticated choices about their actions, we introduce the idealization of agents and decis...
I worry such a plan will face significant legal hurdles. As suggested, the building would probably not fall into the exceptions to the federal Fair Housing Act (is that right?) for choosing roommates (it's not a single-family dwelling but a group of apartments in some sense).
But you want to choose who lives there based exactly on political/religious beliefs (almost by definition it's impossible to be a rationalist and a dogmatic, unquestioning conservative Christian). Also by aspects of family makeup, in that you don't want people living in this community to...
Sorry, but you can't get around the fact that humans are not well equipped to compute probabilities. We can't even state what our priors are in any reasonable sense much less compute exact probabilities.
As a result, using probabilities has come to be associated with having some kind of model. If you've never studied the question and are asked how likely you think it is that there are intelligent aliens, you say something like "I think it's quite likely". You only answer with a number if you've broken it down into a model (chance life evolves averag...
Uhh, why not just accept that you aren't and can never be perfectly rational, and use those facts in positive ways?
Bubbles are psychologically comforting and help generate communities. Rationalist bubbling (which ironically includes the idea that they don't bubble) probably does more to build the community and correct other wrong beliefs than almost anything else.
Until and unless rationalists take over society, the best strategy is probably just to push for a bubble that actively encourages breaking other (non-rationalist) bubbles.
So the equations should be (definition of vaccine efficacy from Wikipedia):

.6 * p(sick2) = p(sick2) - p(sick1)
p(sick1) - .4 * p(sick2) = 0

i.e. efficacy is the difference between the unvaccinated and vaccinated rates of infection divided by the unvaccinated rate. You have to assume there is no selective pressure in terms of who gets the vaccine (i.e. that the vaccinated have the same flu risk pool as the general population, which is surely untrue) to get your assumption that

.42 * p(sick1) + .58 * p(sick2) = .1
p(sick1) + 1.38 * p(sick2) = .238

Substituting p(sick1) = .4 * p(sick2):

1.78 * p(sick2) = .238
p(sick2) = .13 (weird I...
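As a sanity check, the same computation in code (a minimal sketch; the 0.6 efficacy, 0.42 vaccinated share, and 0.1 overall attack rate are the figures assumed above):

```python
# Solve for infection rates given vaccine efficacy and overall attack rate.
# efficacy = (p_sick_unvax - p_sick_vax) / p_sick_unvax   (Wikipedia definition)
# overall  = vax_share * p_sick_vax + (1 - vax_share) * p_sick_unvax

efficacy = 0.6    # assumed vaccine efficacy
vax_share = 0.42  # assumed fraction of the population vaccinated
overall = 0.1     # assumed overall attack rate

# From the efficacy definition: p_sick_vax = (1 - efficacy) * p_sick_unvax.
# Substitute into the overall rate and solve for p_sick_unvax:
p_sick_unvax = overall / (vax_share * (1 - efficacy) + (1 - vax_share))
p_sick_vax = (1 - efficacy) * p_sick_unvax

print(f"p(sick | unvaccinated) = {p_sick_unvax:.3f}")  # ~0.134, i.e. the .13 above
print(f"p(sick | vaccinated)   = {p_sick_vax:.3f}")    # ~0.053
```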
Not with respect to their revealed preferences for working in high-risk jobs, as I understand it. There are a bunch of economic papers on this, but it was a surprisingly low number.
I agree that's a possible way things could be. However, I don't see how it's compatible with accepting the arguments that say we should assume that alignment is a hard problem. I mean, absent such arguments, why expect you have to do anything special beyond normal training to solve alignment?
As I see the argumentative landscape the high x-risk estimates depend on arguments that claim to give reason to believe that alignment is just a generally hard problem. I don't see anything in those arguments that distinguishes between these two cases.
In other words o...