as they code I notice nested for loops that could have been one matrix multiplication.
This seems like an odd choice for your primary example.
For something to experience pain, some information needs to exist (e.g. in the mind of the sufferer, informing them that they are experiencing pain). There are known information limits, e.g. https://en.wikipedia.org/wiki/Bekenstein_bound or https://en.wikipedia.org/wiki/Landauer%27s_principle
These limits are related to entropy, space, energy, etc., so if you further assume the universe is finite (or perhaps equivalently, that the malicious agent can only access a finite portion of the universe due to e.g. speed-of-light limits), then there is an upper b...
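To make the flavor of these limits concrete, here's a minimal sketch (illustrative numbers only; I'm plugging rough, made-up brain-sized values into the Bekenstein bound, not claiming anything precise about suffering):

```python
import math

# Physical constants (SI units).
HBAR = 1.054571817e-34  # reduced Planck constant, J*s
C = 2.99792458e8        # speed of light, m/s

def bekenstein_bound_bits(radius_m, mass_kg):
    """Bekenstein bound: maximum number of bits that can be stored in a sphere
    of the given radius containing the given mass-energy."""
    energy = mass_kg * C ** 2
    return 2 * math.pi * radius_m * energy / (HBAR * C * math.log(2))

# Rough, illustrative inputs for a brain-sized system: ~0.1 m radius, ~1.5 kg.
print(f"{bekenstein_bound_bits(0.1, 1.5):.2e} bits")  # on the order of 10**42 bits
```

Even very generous assumptions give a finite (if astronomically large) cap on how much pain-relevant information a bounded system can hold.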
Yeah, which I interpret to mean you'd "lose" (where getting $10 is losing and getting $200 is winning). Hence this is not a good strategy to adopt.
99% of the time for me, or for other people?
99% for you (see https://wiki.lesswrong.com/wiki/Least_convenient_possible_world )
More importantly, when the fiction diverges by that much from the actual universe, it takes a LOT more work to show that any lessons are valid or useful in the real universe.
I believe the goal of these thought experiments is not to figure out whether you should, in practice, sit in the waiting room or not (honestly, nobody cares what some rando on the internet would do in some rando waiting room).
Instead, the goal is to provide unit...
Any recommendations for companies that can print and ship the calendar to me?
Okay, but then what would you actually do? Would you leave before the 10 minutes is up?
why do I believe that its accuracy for other people (probably mostly psych students) applies to my actions?
Because historically, in this fictional world we're imagining, when psychologists have said that a device's accuracy was X%, it turned out to be within 1% of X%, 99% of the time.
I really should get around to signing up for this, but...
Seems like the survey is now closed, so I cannot take it at the moment I'm seeing the post.
suppose Bob is trying to decide to go left or right at an intersection. In the moments where he is deciding to go either left or right, many nearly identical copies in nearly identical scenarios are created. They are almost entirely all the same, and if one Bob decides to go left, one can assume that 99%+ of Bobs made the same decision.
I don't think this assumption is true (and thus perhaps you need to put more effort into checking/arguing it's true, if the rest of your argument relies on this assumption). In the moments where Bob is trying to decide ...
I see some comments hinting at this pseudo-argument, but I don't think I saw anyone make it explicitly:
Say I replace one neuron in my brain with a little chip that replicates what that neuron would have done. Say I replace two, three, and so on, until my brain is now completely artificial. Am I still conscious, or not? If not, was there a sudden cut-off point where I switched from conscious to not-conscious, or is there a spectrum and I was gradually moving towards less and less conscious as this transformation occurred?
If I am still conscious...
I played the game "blind" (i.e. I avoided reading the comments before playing) and was able to figure it out and beat the game without ever losing my ship. I really enjoyed it. The one part that I felt could have been made a lot clearer was that the "shape" of the mind signals how quickly they move towards your ship; I think I only figured that out around level 3 or so.
I'm not saying this should be discussed on LessWrong or anywhere else.
You might want to lead with that, because there have been some arguments in the last few days that people should repeal the "Don't talk about politics" rule on a rationality-focused Facebook group, and I thought you were trying to argue in favor of repealing those rules.
But I'm saying that the impact of this article and broader norm within the rationalsphere made me think in these terms more broadly. There's a part of me that wishes I'd never read...
From a brief skim (e.g. "A Democratic candidate other than Yang to propose UBI before the second debate", "Maduro ousted before end of 2019", "Donald Trump and Xi Jinping to meet in March 2019", etc.), this seems to be focused on "non-personal" (i.e. global) events, whereas my understanding is the OP is interested in tracking predictions for personal events.
Spreadsheet sounds "good enough" if you're not sure you even want to commit to doing this.
That said, I'm "mildly interested" in doing this, but I don't really have inspiration for questions I'd like to make predictions on. I'm not particularly interested in doing predictions about global events and would rather make predictions about personal events. I would like a site that lets me see other people's personal predictions (really, just the questions they're predicting an answer to -- I don't car...
I think this technique only works for one-on-one (or a small group), live interactions. I.e. it doesn't work well for online writings.
The two components that are important for ensuring this technique is successful are:
1. You should tailor the confusion to the specific person you're trying to teach.
2. You have to be able to detect when the confusion is doing more damage than good, and abort it if necessary.
Note: I'm not sure if at the beginning of the game, one of the agents [of AlphaStar] is chosen according to the Nash probabilities, or if at each timestep an action is chosen according to the Nash probabilities.
It's the former. During the video demonstration, the pro player remarked how after losing game 1, in game 2 he went for a strategy that would counter the strategy AlphaStar used in game 1, only to find AlphaStar had used a completely different strategy. The AlphaStar representatives responded saying there's actually 5 AlphaStar agen...
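A toy way to see the distinction I was asking about (agent names and mixture weights are made up; this is not AlphaStar's actual code):

```python
import random

# Hypothetical agent pool and Nash-mixture weights, purely for illustration.
AGENTS = ["agent_1", "agent_2", "agent_3", "agent_4", "agent_5"]
NASH_WEIGHTS = [0.30, 0.25, 0.20, 0.15, 0.10]

def per_game_sampling(n_steps):
    """The 'former' reading: sample one agent per game; it picks every action."""
    agent = random.choices(AGENTS, weights=NASH_WEIGHTS)[0]
    return [agent] * n_steps

def per_timestep_sampling(n_steps):
    """The 'latter' reading: re-sample which agent acts at every timestep."""
    return [random.choices(AGENTS, weights=NASH_WEIGHTS)[0] for _ in range(n_steps)]
```

The pro player's experience (a whole-game strategy that stayed coherent within a game but changed between games) is what you'd expect under per-game sampling.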
I'm assuming you think wireheading is a disastrous outcome for a super intelligent AI to impose on humans. I'm also assuming you think if bacteria somehow became as intelligent as humans, they would also agree that wireheading would be a disastrous outcome for them, despite the fact that wireheading is probably the best solution that can be done given how unsophisticated their brains are. I.e. the best solution for their simple brains would be considered disastrous by our more complex brains.
This suggests the possibility that maybe the best solution that can be applied to human brains would be considered disastrous for a more complex brain imagining that humans somehow became as intelligent as them.
I feel like this game has the opposite problem of 2-4-6. In 2-4-6, it's very easy to come up with a hypothesis that appears to work with every set of test cases you come up with, and thus become overconfident in your hypothesis.
In your game, I had trouble coming up with any hypothesis that would fit the test cases.
Yeah, but which way is the arrow of causality here? Like, was he already a geeky intellectual, and that's why he's both good at calculus/programming and he reads SSC/OB/LW? Or was he "pretty average", started reading SSC/OB/LW, and then that made him become good at calculus/programming?
Yes, genetics + randomness determines most variation in human behavior, but the SSC/LW stuff has helped provide some direction and motivation.
Would any "participants in nuclear war" (for lack of a better term) be interested in killing escaping rich westerners?
Just don't ask your AI system to optimize for general and long-term preferences without a way for you to say "actually, stop, I changed my mind".
I believe that reduces to "solve the Friendly AI problem".
It's not clear to me that for all observers in our universe, there'd be a distinction between "a surgeon from a parallel universe suddenly appears in our universe, and that surgeon has memories of existing in a universe parallel to the one he now finds himself in." vs "a surgeon, via random quantum fluctuations, suddenly appears in our universe, and that surgeon has memories of existing in a universe parallel to the one he now finds himself in."
In your example, rather than consider all infinitely many parallel universes, you c...
So does that mean a GLUT in the zombie world cannot be conscious, but a GLUT in our world (assuming infinite storage space, since apparently we were able to assume that for the zombie world) can be conscious?
suppose that we (or Omega, since we're going to assume nigh omniscience) asked the person whether JFK was murdered by Lee Harvey Oswald or not, and if they get it wrong, then they are killed/tortured/dust-specked into oblivion/whatever.
Okay, but what is the utility function Omega is trying to optimize?
Let's say you walk up to Omega and tell it "Was JFK murdered by Lee Harvey Oswald or not? And by the way, if you get this wrong, I am going to kill you/torture you/dust-speck you."
Unless we've figured out how to build safe oracles, with very high pro...
I also inferred rape from the story. It was the part about how in desperation, he reached out and grabbed at her ankle. And then he was imprisoned in response to that.
But what then makes it recommend a policy that we will actually want to implement?
First of all, I'm assuming that we're taking as axiomatic that the tool "wants" to improve itself (or else why would it have even bothered to consider recommending that it be modified to improve itself?); i.e. improving itself is favorable according to its utility function.
Then: It will recommend a policy that we will actually want to implement, because its model of the universe includes our minds, and it can see that recommending a policy we will actually want to implement leads it to a higher-ranked state in its utility function.
To steelman the parent argument a bit, a simple policy can be dangerous, but if an agent proposed a simple and dangerous policy to us, we probably would not implement it (since we could see that it was dangerous), and thus the agent itself would not be dangerous to us.
If the agent were to propose a policy that, as far as we could tell, appears safe, but was in fact dangerous, then simultaneously:
Can you be a bit more specific in your interpretation of AIXI here?
Here are my assumptions, let me know where you have different assumptions:
I think LearnFun might be informative here. https://www.youtube.com/watch?v=xOCurBYI_gY
LearnFun watches a human play an arbitrary NES game. It is hardcoded to assume that as time progresses, the game is moving towards a "better and better" state (i.e. it assumes the player is trying to win and is at least somewhat effective at achieving their goals). The key point here is that LearnFun does not know ahead of time what the objective of the game is. It infers what the objective of the game is from watching humans play. (More technically, it observes ...
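A toy sketch of the "infer the objective by assuming the observed trajectory is improving" idea (a simplification I wrote for illustration, not the actual LearnFun algorithm, which searches for lexicographic orderings over memory locations):

```python
def infer_progress_locations(snapshots, top_k=5):
    """Given memory snapshots from a human playthrough (equal-length lists of
    byte values, one per frame), score each memory location by how consistently
    its value increases over time, and return the locations that look most like
    a 'score' the player is driving upward."""
    n_locations = len(snapshots[0])
    scores = []
    for loc in range(n_locations):
        values = [snap[loc] for snap in snapshots]
        ups = sum(1 for a, b in zip(values, values[1:]) if b > a)
        downs = sum(1 for a, b in zip(values, values[1:]) if b < a)
        scores.append((ups - downs, loc))
    return [loc for _, loc in sorted(scores, reverse=True)[:top_k]]

# Hypothetical usage: `snapshots` recorded once per frame while the human plays.
# objective_locations = infer_progress_locations(snapshots)
```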
The analogy with cryptography is an interesting one, because...
In cryptography, even after you've proven that a given encryption scheme is secure, and that proof has been centuply (100 times) checked by different researchers at different institutions, it might still end up being insecure, for many reasons.
Examples of reasons include: the proof's hardness assumptions turning out to be false, the security model not capturing the attacks that actually matter (e.g. side channels), or the deployed implementation diverging from the scheme that was actually proven secure.
Also, if you're going to measure information content, you really need to fix a formal language first, or else "the number of bits needed to express X" is ill-defined.
Basically, learn model theory before trying to wield it.
I don't know model theory, but isn't the crucial detail here whether or not the number of bits needed to express X is finite or infinite? If so, then it seems we can handwave the specific formal language we're using to describe X, in the same way that we can handwave which encoding we use for Turing Machines generally when talking ab...
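For reference, the fact doing the work in the Turing-machine case is the invariance theorem: for any two universal machines $U$ and $V$ there is a constant $c_{U,V}$, independent of $x$, such that

$$K_U(x) \le K_V(x) + c_{U,V},$$

so whether the description length of $x$ is finite (and how fast it grows) doesn't depend on which universal machine you fix. My (possibly naive) hope is that the analogous statement holds once you fix any sufficiently expressive formal language.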
I can't directly observe Eliezer winning or losing, but I can make (perhaps very weak) inferences about how often he wins/loses given his writing.
As an analogy, I might not have the opportunity to play a given videogame ABC against a given blogger XYZ that I've never met and will never meet. But if I read his blog posts on ABC strategies, and try to apply them when I play ABC, and find that my win-rate vastly improves, I can infer that XYZ also probably wins often (and probably wins more often than I do).
I guess I'm asking "Why would a finite-universe necessarily dictate a finite utility score?"
In other words, why can't my utility function be:
I suspect that if we're willing to say human minds are Turing Complete[1], then we should also be willing to say that an ant's mind is Turing Complete. So when imagining a human with a lot of patience and a very large notebook interacting with a billion year old alien, consider an ant with a lot of patience and a very large surface area to record ant-pheromones upon, interacting with a human. Consider how likely it is that human would be interested in telling the ant things it didn't yet know. Consider what topics the human would focus on telling the ant, ...
Why not link to the books or give their ISBNs or something?
There are at least two books on model theory by Hodges: ISBN:9780521587136 and ISBN:9780511551574
Why would we give the AI a utility function that assigns 0 utility to an outcome where we get everything we want but it never turns itself off?
The designer of that AI might have (naively?) thought this was a clever way of solving the friendliness problem. Do the thing I want, and then make sure to never do anything again. Surely that won't lead to the whole universe being tiled with paperclips, etc.
Alternately, letting "utility" back in, in a universe of finite time, matter, and energy, there does exist a maximum finite utility which is the sum total of the time, matter, and energy in the universe.
Why can't my utility function be:
?
I.e. why should we forbid a utility function that returns infinity for certain scenarios, except insofar that it may lead to the types of problems that the OP is worrying about?
But what about prediction markets?
Yes, this is a parable about AI safety research, with the humans in the story acting as the AI, and the aliens acting as us.
Right, I suspect just having heard about someone's accomplishments would be an extremely noisy indicator. You'd want to know what they were thinking, for example by reading their blog posts.
Eliezer seems pretty rational, given his writings. But if he repeatedly lost in situations where other people tend to win, I'd update accordingly.
I assume that you accept the claim that it is possible to define what a fair coin is, and thus what an unfair coin is.
If we observe some coin, at first, it may be difficult to tell if it's a fair coin or not. Perhaps the coin comes from a very trustworthy friend who assures you that it's fair. Maybe it's specifically being sold in a novelty store and labelled as an "unfair coin" and you've made many purchases from this store in the past and have never been disappointed. In other words, you have some "prior" probability belief that the c...
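A minimal sketch of the kind of update I have in mind (assuming, purely for illustration, that "unfair" means a known fixed bias of 90% heads):

```python
def update_p_fair(p_fair_prior, flips, p_heads_if_unfair=0.9):
    """Bayesian update of the belief that a coin is fair, given observed flips.
    `flips` is a string like "HHTH"; a fair coin is 50% heads, and 'unfair' is
    assumed (for illustration only) to mean a fixed 90% heads bias."""
    def likelihood(p_heads):
        result = 1.0
        for flip in flips:
            result *= p_heads if flip == "H" else (1.0 - p_heads)
        return result

    posterior_fair = p_fair_prior * likelihood(0.5)
    posterior_unfair = (1.0 - p_fair_prior) * likelihood(p_heads_if_unfair)
    return posterior_fair / (posterior_fair + posterior_unfair)

# Trustworthy-friend prior of 95% fair, then eight heads in a row:
print(update_p_fair(0.95, "HHHHHHHH"))  # belief in fairness drops to roughly 0.15
```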
People who win are not necessarily rationalists. A person who is a rationalist is more likely to win than a person who is not.
Consider someone who just happens to win the lottery vs someone who figures out what actions have the highest expected net profit.
Edit: That said, be careful not to succumb to the argument from consequences (http://rationalwiki.org/wiki/Argument_from_consequences): maybe Genghis Khan really was one of the greatest rationalists ever. I've never met the guy nor read any of his writings, so I wouldn't know.
Actually, I think "Rationalists should WIN" regardless of what their goals are, even if that includes social wrestling matches.
The "should" here is not intended to be moral prescriptivism. I'm not saying in an morally/ethically ideal world, rationalists would win. Instead, I'm using "should" to help define what the word "Rationalist" means. If some person is a rationalist, then given equal opportunity, resources, difficult-of-goal, etc., they will on average, probabilistically win more often than someone who was not ...
rationalists as people who make optimal plays versus rationalists as people who love truth and hate lies
It's only possible for us to systematically make optimal plays IF we have a sufficient grasp of truth. There's only an equivocation in the minds of people who don't understand that one goal is a necessary precursor for the other.
No, I think there is an equivocation here, though that's probably because of the term "people who love truth and hate lies" instead of "epistemic rationalist".
An epistemic rationalist wants to know truth...
The problem with the horses of one color problem is that you are using sloppy verbal reasoning that hides an unjustified assumption that n > 1.
I'm not sure what you mean. I thought I stated it each time I was assuming n=1 and n=2.
In the induction step, we reason "The first horse is the same colour as the horses in the middle, and the horses in the middle have the same colour as the last horse. Therefore, all n+1 horses must be of the same colour". This reasoning only works if n > 1, because if n = 1, then there are no "horses in the middle", and so "the first horse is the same colour as the horses in the middle" is not true.
I think this argument is misleading.
Re "for game theoretical reasons", the paperclipper might take revenge if it predicted that doing so would be a signalling-disincentive for other office-supply-maximizers from stealing paperclips. In other words, the paperclip-maximizer is spending paperclips to take revenge solely because in its calculation, this actually leads to the expected total number of paperclips going up.
What does it mean for a program to have intelligence if it does not have a goal?
This is a very interesting question, thanks for making me think about it.
(Based on your other comments elsewhere in this thread), it seems like you and I are in agreement that intelligence is about having the capability to make better choices. That is, of two agents given an identical problem and identical resources to work with, the more intelligent one is more likely to make the "better" choice.
What does "better" mean here? We need to define some...
Feedback:
...Need an example? Sure! I have two dice, and they can each land on any number, 1-6. I’m assuming they are fair, so each has probability of 1/6, and the logarithm (base 2) of 1/6 is about -2.585. There are 6 states, so the total is 6* (1/6) * 2.585 = 2.585. (With two dice, I have 36 possible combinations, each with probability 1/36, log(1/36) is -5.17, so the entropy is 5.17. You may have noticed that I doubled the number of dice involved, and the entropy doubled – because there is exactly twice as much that can happen, but the average entropy is
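(For what it's worth, the arithmetic in the quoted passage checks out; a quick way to verify it under the stated fair-dice assumptions:)

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities)

print(entropy_bits([1 / 6] * 6))    # one fair die:  ~2.585 bits
print(entropy_bits([1 / 36] * 36))  # two fair dice: ~5.170 bits, exactly double
```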
Imagine someone named Omega offers to play a game with you. Omega has a bag, and they swear on their life that exactly one of the following statements is true:
Omega then has an independent neutral third party reach into the bag and pull out a random piece of paper which they then hand to you. You look at the piece of paper and it says "1" on it. Omega doesn't get to look at the piece of p...