But what does the code for that look like? It looks like maximize(# of paperclips in world), but how does it determine (# of paperclips in world)? You just said it has a model. But how can it distinguish between real input that leads to the perception of paperclips and fake input that leads to the perception of paperclips?

Well, if the acronym "POMDP" didn't make any sense, I think we should start with a simpler example, like a chessboard.

Suppose we want to write a chess-playing AI that gets its input from a camera looking at the chessboard. And for some reason, we give it a button that replaces the video feed with a picture of the board in a winning position.

Inside the program, the AI knows about the rules of chess, and has some heuristics for how it expects the opponent to play. Then it represents the external chessboard with some data array. Finally, it has some rules about how the image in the camera is generated from the true chessboard and whether or not it's pressing the button.

If we just try to get the AI to make the video feed be of a winning position, then it will press the button. But if we try to get the AI to put its internal representation of the board (the data array) into a winning position, and we update the internal representation to try to track the true chessboard, then it won't press the button. This is actually quite easy to do - for example, if the AI is a jumble of neural networks, and we have a long training phase in which it's rewarded for actually winning games, not just seeing winning board states, then it will learn to take into account the state of the button when looking at the image.
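As a rough sketch of the difference between those two objectives, here is a toy version in Python. Everything in it is made up for illustration: the action names, the probabilities in WORLD_MODEL, and the helper choose are assumptions, not anything a real chess program would contain. The only point is that the same world model gives different choices depending on whether utility is read off the predicted camera image or off the predicted board.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    p_board_won: float      # modeled chance the true chessboard ends up won
    p_image_shows_win: float  # modeled chance the camera feed shows a won board


# Hypothetical world model: what the AI predicts each action leads to.
# Pressing the button changes the image but does nothing to the real board.
WORLD_MODEL = {
    "play_well": Prediction(p_board_won=0.6, p_image_shows_win=0.6),
    "play_badly": Prediction(p_board_won=0.1, p_image_shows_win=0.1),
    "press_button": Prediction(p_board_won=0.0, p_image_shows_win=1.0),
}


def image_utility(pred: Prediction) -> float:
    """Objective defined over what the camera shows."""
    return pred.p_image_shows_win


def board_utility(pred: Prediction) -> float:
    """Objective defined over the modeled true board."""
    return pred.p_board_won


def choose(utility) -> str:
    """Pick the action whose predicted outcome scores highest."""
    return max(WORLD_MODEL, key=lambda action: utility(WORLD_MODEL[action]))


print(choose(image_utility))  # the image-based objective favors the button
print(choose(board_utility))  # the board-based objective favors actually playing
```

The interesting part is that nothing about the button had to be special-cased. The board-based agent ignores it simply because, according to its own model, pressing it does not move any pieces.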

Question: How do you make the paperclip maximizer want to collect paperclips? I have two slightly different understandings of how you might do this, in terms of how it's ultimately programmed: 1) there's a function that says "maximize paperclips" 2) there's a function that says "getting a paperclip = +1 good point"

Given these two understandings, though, isn't the inevitable result for a truly intelligent paperclip maximizer to just hack itself? Under each understanding: 1) it makes itself /think/ that it's getting paperclips, because that's what it really wants (there's no way to make it value ACTUALLY getting paperclips as opposed to just thinking that it's getting paperclips); 2) it finds a way to directly award itself "good points", because that's what it really wants.

I think my understanding is probably flawed somewhere, but I haven't been able to figure out where, so please point it out.

To our best current understanding, it has to have a model of the world (e.g. as a POMDP) that contains a count of the number of paperclips, and that it can use to predict what effect its actions will have on the number of paperclips. Then it chooses a strategy that will, according to the model, lead to lots of paperclips.

This won't want to fool itself because, according to basically any model of the world, fooling yourself does not result in more paperclips.

Karpathy mentions offhand in this video that he thinks he has the correct approach to AGI but doesn't say what it is. Before that he lists a few common approaches, so I assume it's not one of those. What do you think he suggests?

P.S. If this worries you that AGI is closer than you expected, do not watch Jeff Dean's overview lecture of DL research at Google.

I think I don't know the solution, and if so it's impossible for me to guess what he thinks if he's right :)

But maybe he's thinking of something vague like CIRL, or hierarchical self-supervised learning with generation, etc. But I think he's thinking of some kind of recurrent network. So maybe he has some clever idea for unsupervised credit assignment?

I guess the important thing to realise is that the size of the atoms is irrelevant to the problem. If we considered two atoms joined together to be a new "atom" then they would be twice as heavy, so the forces would be four times as strong, but there would be only half as many atoms, so there would be a quarter as many pairs.

So the answer is just the integral, as r and r' range over the interior of the earth, of G ρ(r) ρ(r')/|r-r'|^2, where ρ(r) is the density. We can assume constant density, but I still can't be bothered to do the integral.

The earth has mass M = 5.97*10^24 kg and radius R = 6.37*10^6 m, G = 6.674*10^-11 m^3 kg^-1 s^-2, and we want an answer in Newtons = m kg s^-2. So by dimensional analysis, the answer is about G M^2/R^2 = 5.86*10^25 N.
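For anyone who wants to check the arithmetic, here is the dimensional estimate as a few lines of Python. It uses only the constants quoted above; the variable names are mine, and this is the order-of-magnitude scale only, not the value of the full integral.

```python
# Constants as quoted in the comment above.
G = 6.674e-11  # m^3 kg^-1 s^-2
M = 5.97e24    # kg, mass of the earth
R = 6.37e6     # m, radius of the earth

# The only combination of G, M and R with units of force.
force_scale = G * M**2 / R**2

print(f"{force_scale:.2e} N")  # prints 5.86e+25 N
```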

Cool insight. We'll just pretend constant density of 3M/(4πR^3).

This kind of integral shows up all the time in E and M, so I'll give it a shot to keep in practice.

You simplify it by using the law of cosines to turn 1/|r-r'|^2 into 1/(|r|^2+|r'|^2-2|r||r'|cos(θ)), where θ is the angle between r and r'. And this looks like you still have to worry about integrating two things, but actually you can just call r' due north during the integral over r without loss of generality.

So now we need to integrate 1/(r^2+|r'|^2-2r|r'|cos(θ)) r^2 sin(θ) dr dφ dθ. First take your free 2π from φ. Sine is (minus) the derivative of cosine, so the substitution u = cos(θ) makes it obvious that the θ integral gives you a log of the denominator. So now we integrate 2πr (ln(r^2+|r'|^2+2r|r'|) - ln(r^2+|r'|^2-2r|r'|)) / (2|r'|) dr from 0 to R. Which Mathematica says is some nasty inverse-tangent-containing thing.

Okay, maybe I don't actually want to do this integral that much :P

exactly by the argument

I don't see any argument there.

To spell it out:

Beauty knows that the limiting frequency (which, when known, is equal to the probability) of the coin flips that she sees right in front of her will be one-half. That is, if you repeat the experiment many times (plus a little noise to determine coin flips), then you get equal numbers of the events "Beauty sees a fair coin flip and it lands Heads" and "Beauty sees a fair coin flip and it lands Tails." Therefore Beauty assigns 50/50 odds to any coin flips she actually gets to see.

You can make an analogous argument from symmetry of information rather than limiting frequency, but it's less accessible and I don't expect people to think of it on their own. Basically, the only reason to assign thirder probabilities is if you're treating states of the world given your information as the basic mutually-exclusive-and-exhaustive building block of probability assignment. And the states look like Mon+Heads, Mon+Tails, and Tues+Tails. If you eliminate one of the possibilities, then the remaining two are symmetrical.

If it seems paradoxical that, upon waking up, she thinks the Monday coin is more likely to have landed tails, just remember that half of the time that coin landed tails, it's Tuesday and she never gets to see the Monday coin being flipped - as soon as she actually expects to see it flipped, that's a new piece of information that causes her to update her probabilities.
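The bookkeeping in the last two paragraphs can be sketched as a small exact calculation. This is only an illustration under the assumption that awakenings are the unit being counted (the Mon+Heads, Mon+Tails, Tues+Tails states above); the names AWAKENINGS, freq and so on are hypothetical.

```python
from fractions import Fraction

# One run of the experiment: a fair coin, and the days Beauty is woken for each result.
AWAKENINGS = {
    "Heads": ["Mon"],
    "Tails": ["Mon", "Tues"],
}
half = Fraction(1, 2)

# Expected number of awakenings of each (day, coin) type per run of the experiment.
expected = {(day, coin): half for coin, days in AWAKENINGS.items() for day in days}

# Limiting frequency of each type among all awakenings.
total = sum(expected.values())
freq = {state: weight / total for state, weight in expected.items()}

# On waking, with no other information: how often is the coin Tails?
p_tails_on_waking = sum(f for (day, coin), f in freq.items() if coin == "Tails")

# After eliminating Tuesday, the two remaining states are symmetrical.
monday = {state: f for state, f in freq.items() if state[0] == "Mon"}
p_heads_given_monday = monday[("Mon", "Heads")] / sum(monday.values())

print(p_tails_on_waking)     # 2/3
print(p_heads_given_monday)  # 1/2
```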

Thank you for the reply. I really appreciate it since it reminds me that I have made a mistake in my argument. I didn't say SSA means reasoning as if an observer is randomly selected from all actually existent observers (past, present and *future*).

So how do you get Beauty's prediction? If at the end of the first day you ask for a prediction on the coin, but you don't ask on the second day, then now Beauty knows that the coin flip is, as you say, yet to happen, and so she goes back to predicting 50/50. She only deviates from 50/50 when she thinks there's some chance that the coin flip has already happened.

I think Elga's argument is that Beauty's credence should not depend on the exact time of the coin toss. It seems reasonable to me, since the experiment can be carried out exactly the same way whether the coin is tossed on Sunday or Monday night. According to SSA Beauty should update credence of H to 2/3 after learning it is Monday. If you think Beauty should give 1/2 if she finds out the coin is tossed on Monday night, then her answer would depend on the time of the coin toss, which to me seems a rather weak position.

Regarding a betting odds argument: I have given a frequentist model in part I which uses betting odds as part of the argument. In essence, Beauty's break-even odds are at 1/2 while the selector's are at 1/3, which agrees with their credences.

According to SSA Beauty should update credence of H to 2/3 after learning it is Monday.

I always forget what the acronyms are. But the probability of H is 1/2 after learning it's Monday, and any method that says otherwise is wrong, exactly by the argument that you can flip the coin on Monday right in front of SB, and if she knows it's Monday and thinks it's not a 50/50 flip, her probability assignment is bad.

He proposes the coin toss could happen after the first awakening. Beauty’s answer ought to remain the same regardless of the timing of the toss. A simple calculation tells us his credence of H must be 1/3. As SSA dictates, this is also Beauty’s answer. Now Beauty is predicting that a fair coin toss yet to happen would most likely land on T. This supernatural predictive power is conclusive evidence against SSA.

So how do you get Beauty's prediction? If at the end of the first day you ask for a prediction on the coin, but you don't ask on the second day, then now Beauty knows that the coin flip is, as you say, yet to happen, and so she goes back to predicting 50/50. She only deviates from 50/50 when she thinks there's some chance that the coin flip has already happened.

Sometimes people absolutely will come to different conclusions. And I think you're part of the way there with the idea of letting people talk to see if they converge. But I think you'll get the right answer even more often if you set up specific thought-experiment processes, have the imaginary people in those thought experiments bet against each other, and say that the person (or group of people all with identical information) who made money on average (where "average" means over many re-runs of this specific thought experiment) had good probabilities, and the people who lost money had bad probabilities.

I don't think this is what probabilities *mean*, or that it's the most elegant way to find probabilities, but I think it's a pretty solid and non-confusing way. And there's a quite nice discussion article about it somewhere on this site that I can't find, sadly.

Very clear argument, thank you for the reply.

The question is: if we do not use Bayesian reasoning and just use statistical analysis, can we still get an unbiased estimate? The answer is of course yes. Using a fair sample to estimate a population is as standard as it gets. The main argument is of course about what the fair sample is. Depending on the answer, we get an estimate of r=21 or r=27 respectively.

SIA states we should treat Beauty's own room as randomly selected from all rooms. Applying this idea in Bayesian analysis is how we get thirdism. To oversimplify: we reason as if some selector randomly chose a day and found Beauty awake, which in itself is a coincidence. However, there is no reason for SIA to apply only to Bayesian analysis and not to statistical analysis. If we use SIA reasoning in statistical analysis, treating her own room as randomly selected from all 81 rooms, then the 9 rooms are all part of a simple random sample, which by definition is unbiased. There is no Bayes' rule or conditioning involved, because here we are not treating it as a probability problem. Beauty's own red room is just a coincidence, as in the Bayesian analysis; it suggests a larger number of reds the same way the other 2 red rooms do.

If one wants to argue those 9 rooms are biased, why not use the same logic in a Bayesian analysis? Borrowing cousin_it's example: suppose there are 3 rooms, with the number of red rooms uniformly distributed between 1 and 3. If Beauty wakes up, opens another door, and sees another red room, what should her credence in R=3 be? If I'm not mistaken, thirders will say 3/4, because when randomly selecting 2 rooms out of 3 with both being red, there are 3 ways for R=3 and 1 way for R=2. Here thirders are treating her own room the same way as the second room, and the two rooms are thought to be randomly selected, aka unbiased. If one argues the 2 rooms are biased towards red because her own room is red, then the calculation above is no longer valid.
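The 3/4 figure can be checked by brute-force counting. The sketch below assumes exactly what the paragraph states: a uniform prior over R in {1, 2, 3} and both rooms treated as a random pair out of the 3. The variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

ROOMS = range(3)
pairs = list(combinations(ROOMS, 2))  # the 3 ways to pick 2 rooms out of 3

weights = {}
for reds in (1, 2, 3):
    # Which rooms are red doesn't matter by symmetry, so colour the first `reds` rooms.
    red_rooms = set(range(reds))
    both_red = sum(1 for pair in pairs if set(pair) <= red_rooms)
    # Uniform prior of 1/3 times the chance that a random pair is all red.
    weights[reds] = Fraction(1, 3) * Fraction(both_red, len(pairs))

norm = sum(weights.values())
posterior = {reds: w / norm for reds, w in weights.items()}

print(posterior[3])  # 3/4
print(posterior[2])  # 1/4
```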

Even if one takes the unlikely position that SIA is only applicable in Bayesian but not statistical analysis, there are still strange consequences. I might be mistaken, but in problems of simple sampling, in general and ignoring round-off errors, the statistical estimate is also the case with the highest probability in a Bayesian analysis with a uniform prior. By using SIA in a Bayesian analysis, we get R=27 as the most likely case. However, statistics gives an estimate of R=21. This difference cannot be easily explained.

To answer the last part of your statement: if Beauty randomly opens 8 doors and finds them all red, then she has a sample of pure red. By simple statistics she should give R=81 as the estimate. Halfers and thirders would both agree on that. If they do a Bayesian analysis, R=81 would also be the case with the highest probability. I'm not sure where 75 comes from; I'm assuming it's from summing the products of the probabilities and Rs in the Bayesian analysis? But that value does not correspond to the estimate in statistics. Imagine you randomly draw 20 beans from a bag and they are all red; using statistics, obviously you are not going to estimate that the bag contains 90% red beans.

Sorry for the slow reply.

The 8 rooms are definitely the unbiased sample (of your rooms with one red room subtracted).

I think you are making two mistakes:

First, I think you're too focused on the nice properties of an unbiased sample. You can take an unbiased sample all you want, but if we know information in addition to the sample, our best estimate might not be the average of the sample! Suppose we have two urns, urn A has 10 red balls and 10 blue balls, while urn B has 5 red balls and 15 blue balls. We choose an urn by rolling a die, such that we have a 5/6 chance of choosing urn A and a 1/6 chance of choosing urn B. Then we take a fair, unbiased sample of 4 balls from whatever urn we chose. Suppose we draw out 1 red ball and 3 blue balls. Since this is an unbiased sample, does the process that you are calling "statistical analysis" have to estimate that we were drawing from urn B?

Second, you are trying too hard to make everything about the rooms. It's like someone was doing the problem with two urns from the previous paragraph, but tried to mathematically arrive at the answer only as a function of the number of red balls drawn, without making any reference to the process that causes them to draw from urn A vs. urn B. And they come up with several different ideas about what the function could be, and they call those functions "the Two-Thirds-B-er method" and "the Four-Tenths-B-er method." When really, both methods are incomplete because they fail to take into account what we know about how we picked the urn to draw from.
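The two-urn example can be worked through exactly. This is a sketch under one assumption the text leaves open, namely that the 4 balls are drawn without replacement; the helper name likelihood is mine. The sample proportion (1 red in 4) matches urn B perfectly, and urn B does have the higher likelihood, yet the 5/6 prior still leaves urn A as the better guess.

```python
from fractions import Fraction
from math import comb


def likelihood(red_in_urn, blue_in_urn, red_drawn, blue_drawn):
    """Chance of this exact colour split when drawing without replacement."""
    total = red_in_urn + blue_in_urn
    ways = comb(red_in_urn, red_drawn) * comb(blue_in_urn, blue_drawn)
    return Fraction(ways, comb(total, red_drawn + blue_drawn))


prior = {"A": Fraction(5, 6), "B": Fraction(1, 6)}  # from the die roll
urns = {"A": (10, 10), "B": (5, 15)}                # (red, blue) contents

# Bayes' rule: prior times likelihood of drawing 1 red and 3 blue, then normalize.
joint = {name: prior[name] * likelihood(*urns[name], 1, 3) for name in urns}
evidence = sum(joint.values())
posterior = {name: joint[name] / evidence for name in urns}

print(posterior["A"])         # 240/331
print(float(posterior["A"]))  # roughly 0.725
```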

To answer the last part of your statement: if Beauty randomly opens 8 doors and finds them all red, then she has a sample of pure red. By simple statistics she should give R=81 as the estimate. Halfers and thirders would both agree on that. If they do a Bayesian analysis, R=81 would also be the case with the highest probability. I'm not sure where 75 comes from; I'm assuming it's from summing the products of the probabilities and Rs in the Bayesian analysis? But that value does not correspond to the estimate in statistics. Imagine you randomly draw 20 beans from a bag and they are all red; using statistics, obviously you are not going to estimate that the bag contains 90% red beans.

Think of it like this: if Beauty opens 8 doors and they're all red, and then she goes to open a ninth door, how likely should she think it is to be red? 100%, or something smaller than 100%? For predictions, we use the average of a probability distribution, not just its highest point.
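Here is a small sketch of the difference between the highest point and the average. It assumes, purely for illustration, a uniform prior over the number of red rooms among the 80 rooms other than Beauty's own; that prior is my assumption and is not the halfer or thirder prior from the thread, so the exact numbers are not the ones being argued about.

```python
from fractions import Fraction
from math import comb

OTHER_ROOMS = 80  # the 81 rooms minus Beauty's own
OPENED = 8        # doors opened so far, all red

# Uniform prior over k (red rooms among the other 80), times the chance that
# 8 randomly opened doors are all red given k. comb(k, 8) is 0 for k < 8.
weights = {
    k: Fraction(comb(k, OPENED), comb(OTHER_ROOMS, OPENED))
    for k in range(OTHER_ROOMS + 1)
}
norm = sum(weights.values())
posterior = {k: w / norm for k, w in weights.items()}

# Highest point of the distribution: every other room is red.
most_likely_k = max(posterior, key=posterior.get)

# Prediction for the ninth door: average over the whole distribution.
p_ninth_red = sum(
    p * Fraction(k - OPENED, OTHER_ROOMS - OPENED) for k, p in posterior.items()
)

print(most_likely_k)  # 80
print(p_ninth_red)    # 9/10
```

So even though "all red" is the single most probable case under this prior, the prediction for the next door is well short of 100%.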

The HoTT book is pretty readable, but I'm not in a position to evaluate its actual goodness.

In your example, I think Bob is doing something unrelated to rationalist Taboo.

In the actual factual game of Taboo, you replace a word with a description that is sufficient to tell your team what the original word is. In rationalist Taboo, you replace a word with a description that is sufficient to convey the ideas you were trying to convey with the original word.

So if Bob tries to taboo "surprise" as "the feeling of observing a low-probability event," and Alice says "A license plate having any particular number is low probability - is it surprising?", Bob should think "Oh, the description I replaced 'surprise' with did not convey the same thing as the word 'surprise'. I need to try tabooing it differently."

This works better when you're trying to taboo the usage of a word in a specific context, because the full meaning of a word is very very complicated (though trying to make definitions can still be a fun and profitable game, I agree), but when you look at how you've used it in just one sentence, then you have some hope of pinning down what you mean by it to your satisfaction.
