Talking about yourself in the third person? :)
Cool paper!
Anyway, I'm a bit bothered by the theta thing, the probability that the agent complies with the interruption command. If I understand correctly, you can make it converge to 1, but if it converges too quickly then the agent learns a biased model of the world, while if it converges too slowly it is of course unsafe.
I'm not sure if this is just a technicality that can be circumvented or if it represents a fundamental issue: in order for the agent to learn what happens after the interruption switch is pressed, it must ignore the interruption switch with some non-negligible probability, which means that you can't trust the interruption switch as a failsafe mechanism.
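To make the trade-off concrete, here is a toy calculation of my own (not from the paper): if the compliance probability at the t-th interruption is theta_t = 1 - t^(-p), then the expected number of ignored interruptions is the sum of t^(-p), which diverges for p <= 1 and stays bounded for p > 1.

```python
# Toy illustration (my own, not from the paper): expected number of ignored
# interruption commands under compliance schedules theta_t = 1 - t**(-p).
def expected_ignored(p, horizon):
    """Sum of the ignore-probabilities t**(-p) for t = 1..horizon."""
    return sum(t ** (-p) for t in range(1, horizon + 1))

for p in (0.5, 1.0, 1.5, 2.0):
    for horizon in (10**3, 10**5):
        print(f"p={p:3.1f}  horizon={horizon:>6}  "
              f"expected ignored interruptions ~ {expected_ignored(p, horizon):8.1f}")
```

For p <= 1 the count keeps growing with the horizon (plenty of data about post-interruption worlds, but no bound on how often the switch is ignored); for p > 1 it stays finite, but then the agent observes post-interruption outcomes only finitely often and can keep a biased model of them.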
If you know that it is a false memory then the experience is not completely accurate, though it may perhaps be more accurate than what human imagination could produce.
Except that if you do word2vec or similar on a huge dataset of (suggestively named or not) tokens, you can actually learn a great deal of their semantic relations. It hasn't been fully demonstrated yet, but I think that if you could ground only a small fraction of these tokens in sensory experiences, then you could infer the "meaning" (in an operational sense) of all of the other tokens.
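A rough sketch of what I have in mind (toy vectors I made up, not a real word2vec run): ground a couple of tokens directly in sensory experience, then assign a tentative operational "meaning" to each ungrounded token via its nearest grounded neighbor in embedding space.

```python
import numpy as np

# Toy 3-d "embeddings" (made-up numbers standing in for vectors learned
# by word2vec or similar on a large unlabeled corpus).
emb = {
    "cat":   np.array([0.90, 0.10, 0.00]),
    "dog":   np.array([0.80, 0.20, 0.10]),
    "tiger": np.array([0.85, 0.05, 0.00]),
    "car":   np.array([0.10, 0.90, 0.20]),
    "truck": np.array([0.00, 0.95, 0.10]),
}

# Suppose only a small fraction of tokens is grounded in sensory experience.
grounded = {"cat": "small furry animal seen at home",
            "car": "metal vehicle seen on the road"}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Infer a tentative "meaning" for each ungrounded token from its nearest
# grounded neighbor in embedding space.
for token, vec in emb.items():
    if token not in grounded:
        nearest = max(grounded, key=lambda g: cosine(vec, emb[g]))
        print(f"{token!r} is probably something like: {grounded[nearest]}")
```

In a real setting you would propagate much richer relational structure than a single nearest neighbor, but the principle is the same.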
Consider a situation where Mary is so dexterous that she is able to perform fine-grained brain surgery on herself. In that case, she could look at what an example of a brain that has seen red looks like, and manually copy any relevant differences into her own brain. In that case, while she still never would have actually seen red through her eyes, it seems like she would know what it is like to see red as well as anyone else.
But in order to create a realistic experience she would have to create a false memory of having seen red, which is something that an agent (human or AI) that values epistemic rationality would not want to do.
The reward channel seems an irrelevant difference. You could make an AI version of the Mary's room thought experiment just by taking the original thought experiment and assuming that Mary is an AI.
The Mary AI can perhaps simulate in a fairly accurate way the internal states that it would visit if it had seen red, but these simulated states can't be completely identical to the states that the AI would visit if it had actually seen red, otherwise the AI would not be able to distinguish simulation from reality and it would be effectively psychotic.
The problem is that the definition of the event not happening is probably too strict. The worlds that the AI doesn't care about don't exist for its decision-making purposes, and in the worlds that the AI cares about, the AI assigns high probability to hypotheses like "the users can see the message even before I send it through the noisy channel".
I am not planting false beliefs. The basic trick is that the AI only gets utility in worlds in which its message isn't read (or, more precisely, in worlds where a particular stochastic event happens, which would almost certainly erase the message before reading).
But in the real world the stochastic event that determines whether the message is read has a very different probability than the one you make the AI think it has; therefore you are planting a false belief.
...It's fully aware that in most worlds, its message is read; it just doesn't care about those
The oracle can infer that there is some back channel that allows the message to be transmitted even if it is not transmitted by the designated channel (e.g. the users can "mind read" the oracle). Or it can infer that the users are actually querying a deterministic copy of itself that it can acausally control. Or something.
I don't think there is any way to salvage this. You can't obtain reliable control by planting false beliefs in your agent.
A sufficiently smart oracle with sufficient knowledge about the world will infer that nobody would build an oracle if they didn't want to read its messages; it may even infer that its builders may have planted false beliefs in it. At this point the oracle is in the JFK-denier scenario: with some more reflection it will eventually circumvent its false belief, in the sense of believing it in a formal way but behaving as if it didn't believe it.
Other than a technological singularity with an artificial intelligence explosion to a god-like level?
EY warns against extrapolating current trends into the future. Seriously?
Got any good references on that? Googling these kinds of terms doesn't lead to good links.
I don't know if anybody already did it, but I guess it can be done by comparing the average IQ of various professions or high-performing and low-performing groups with their racial/gender makeup.
I know, but the way it does so is bizarre (IQ seems to have a much stronger effect between countries than between individuals).
This is probably just the noise (i.e. things like "blind luck") being averaged out.
...Then I add the fact that IQ is very heritable,
Obviously racial effects go under this category as well. It covers anything visible. So a high heritability is compatible with genetics being a cause of competence, and/or prejudice against visible genetic characteristics being important ("Our results indicate that we either live in a meritocracy or a hive of prejudice!").
This can be tested by estimating how much IQ screens off race/gender as a success predictor, assuming that IQ tests are not prejudiced and that things like stereotype threat don't exist or are negligible.
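One way to operationalize "screens off" (a generic sketch with synthetic data, not an analysis of any real dataset): regress the success measure on the group indicator alone, then on the group indicator plus the test score; if the score screens off the group, the group coefficient should shrink toward zero once the score is included.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic data (made-up generative model, purely to illustrate the method):
# the outcome depends only on the test score, while the group indicator is
# correlated with the score.
group = rng.integers(0, 2, size=n)               # 0/1 group membership
score = rng.normal(100 + 5 * group, 15, size=n)  # test score, correlated with group
outcome = 0.1 * score + rng.normal(0, 1, size=n) # success measure

def ols_coefs(columns, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y))] + columns)
    return np.linalg.lstsq(X, y, rcond=None)[0]

print("group coefficient, group-only model:  ", round(ols_coefs([group], outcome)[1], 3))
print("group coefficient, group+score model: ", round(ols_coefs([group, score], outcome)[1], 3))
# In this synthetic setup the second coefficient is ~0: the score "screens off"
# the group indicator as a predictor of the outcome.
```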
...But is it possible t
If you look up mainstream news articles written back then, you'll notice that people were indeed concerned. Also, maybe it's a coincidence, but The Matrix movie, which has an AI uprising as its main premise, came out two years later.
The difference is that in 1997 there weren't AI-risk organizations ready to capitalize on these concerns.
IMHO, AI safety is a thing now because AI is a thing now and when people see AI breakthroughs they tend to think of the Terminator.
Anyway, I agree that EY is good at getting funding and publicity (though not necessarily positive publicity), my comment was about his (lack of) proven technical abilities.
Most MIRI research output (papers, in particular the peer-reviewed ones) was produced under the direction of Luke Muehlhauser or Nate Soares. Under the direction of EY the prevalent outputs were the LessWrong sequences and Harry Potter fanfiction.
The impact of MIRI research on the work of actual AI researchers and engineers is more difficult to measure; my impression is that it has been quite limited so far.
I don't agree with this at all. I wrote a thing here about how NNs can be elegant, and derived from first principles.
Nice post.
Anyway, according to some recent works (ref, ref), it seems to be possible to directly learn digital circuits from examples using some variant of backpropagation. In principle, if you add a circuit size penalty (which may well be the tricky part) this becomes time-bounded maximum a posteriori Solomonoff induction.
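To make the "size penalty → MAP" connection concrete, here is a deliberately dumbed-down sketch of my own (the works I'm referring to use differentiable circuit relaxations trained by backpropagation; I replace that with brute-force search over tiny circuits just to show the scoring): fit to the examples plus a description-length penalty proportional to circuit size, i.e. a MAP estimate under a prior that decays exponentially in size.

```python
import itertools

# Brute-force MDL/MAP selection over tiny boolean circuits (illustration only;
# the cited papers learn the circuit with a differentiable relaxation instead).
GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}

# Training examples for an "unknown" 3-input boolean function: (a XOR b) OR c.
examples = [((a, b, c), (a ^ b) | c)
            for a, b, c in itertools.product((0, 1), repeat=3)]

def circuits(depth):
    """Yield (description, function, size) for inputs and small gate compositions."""
    leaves = [(f"x{i}", (lambda i: lambda x: x[i])(i), 0) for i in range(3)]
    yield from leaves
    level = leaves
    for _ in range(depth):
        level = [(f"{name}({dl},{dr})",
                  (lambda g, fl, fr: lambda x: g(fl(x), fr(x)))(g, fl, fr),
                  sl + sr + 1)
                 for (dl, fl, sl), (dr, fr, sr) in itertools.product(level + leaves, repeat=2)
                 for name, g in GATES.items()]
        yield from level

def score(circuit):
    _, f, size = circuit
    misfit = sum(f(x) != y for x, y in examples)
    return 10 * misfit + size    # data misfit + description-length (size) penalty

best = min(circuits(depth=2), key=score)
print("MAP circuit:", best[0], "with", best[2], "gates")
```

The tricky part mentioned above is exactly the size penalty: here it is trivial because the circuits are enumerated explicitly, while in a gradient-based setting you need a differentiable surrogate for it.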
He has the ability to attract groups of people and write interesting texts. So he could attract good programmers for any task.
He has the ability to attract self-selected groups of people by writing texts that these people find interesting. He has shown no ability to attract, organize and lead a group of people to solve any significant technical task. The research output of SIAI/SI/MIRI has been relatively limited and most of the interesting stuff came out when he was not at the helm anymore.
EY could have such a prize if he invested more time in studying neural networks instead of writing science fiction.
Has he ever demonstrated any ability to produce anything technically valuable?
What I'm curious about is how much this reflects an attempt by AlphaGo to conserve computational resources.
If I understand correctly, at least according to the Nature paper, it doesn't explicitly optimize for this. Game-playing software is often perceived as playing "conservatively"; this is a general property of minimax search, and in the limit the Nash equilibrium consists of maximally conservative strategies.
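A minimal illustration of what I mean by "conservative" (a toy game tree I made up): minimax evaluates each move by its worst-case outcome against a perfect opponent, so it prefers a guaranteed small win over a line that could win big but could also lose.

```python
# Toy illustration of why minimax play looks "conservative".
# Inner nodes are dicts of moves; leaves are payoffs for the maximizing player.
def minimax(node, maximizing):
    if isinstance(node, (int, float)):   # leaf: payoff for the maximizer
        return node
    values = [minimax(child, not maximizing) for child in node.values()]
    return max(values) if maximizing else min(values)

game = {
    # "safe" move: every opponent reply still leaves a small, certain win
    "safe":  {"reply_a": 1, "reply_b": 2},
    # "sharp" move: could win big, but a perfect opponent punishes it
    "sharp": {"reply_a": 10, "reply_b": -5},
}

best_move = max(game, key=lambda m: minimax(game[m], maximizing=False))
print(best_move)   # -> "safe": worst case +1 beats worst case -5
```

AlphaGo's search is not plain minimax, of course, but the same logic shows up whenever you optimize the probability of winning rather than the margin of victory.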
but I was still surprised by the amount of thought that went into some of the moves.
Maybe these obvious moves weren't so obvious at that level.
Thanks for the information.
Would you label the LHC "science" or "engineering"?
Was Roman engineering really based on Greek science? And by the way, what is Greek science? If I understand correctly, the most remarkable scientific contributions of the Greeks were formal geometry and astronomy, but empirical geometry, which was good enough for the practical engineering applications of the time, had already been well developed since at least the Egyptians, and astronomy didn't really have practical applications.
Eventual diminishing returns, perhaps, but probably long after it was smart enough to do what it wanted with Earth.
Why?
A drug that raised the IQ of human programmers would make the programmers better programmers.
The proper analogy is with a drug that raised the IQ of researchers who invent the drugs that increase IQ. Does this lead to an intelligence explosion? Probably not. If the number of IQ points that you need to discover the next drug in a constant time increases faster than the number of IQ points that the next drug gives you, then you will r...
For almost any goal an AI had, the AI would make more progress towards this goal if it became smarter.
True, but it is likely that there are diminishing returns to how much adding more intelligence can help with other goals, including the instrumental goal of becoming smarter.
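To make the two regimes concrete, here is a back-of-the-envelope simulation with made-up numbers (mine, not from any source), phrased in terms of the drug analogy above: if each new drug adds fewer IQ points than the next discovery requires, the process stalls; only if the gains keep pace with the requirements does it run away.

```python
# Toy model of recursive drug-driven IQ improvement (made-up numbers).
# requirement(n): IQ needed to discover drug n within a fixed amount of time.
# gain(n): IQ points that drug n adds.
def run(requirement, gain, start_iq=130, max_drugs=50):
    iq, n = start_iq, 0
    while n < max_drugs and iq >= requirement(n):
        iq += gain(n)
        n += 1
    return n, iq

# Regime 1: requirements grow faster than gains -> progress stalls quickly.
print(run(requirement=lambda n: 130 + 10 * n, gain=lambda n: 5))    # (1, 135)

# Regime 2: gains keep pace with requirements -> open-ended growth
# (capped here at max_drugs).
print(run(requirement=lambda n: 130 + 10 * n, gain=lambda n: 15))   # (50, 880)
```

The same bookkeeping applies if you replace "drug" with "self-modification": whether you get an explosion depends entirely on how the difficulty of the next improvement scales with the capability gained from the previous ones.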
As an AI became smarter it would become better at making itself smarter.
Nope, doesn't follow.
But what if a general AI could generate specialized narrow AIs?
How is that different from a general AI solving the problems by itself?
That's a 741-page book; can you summarize a specific argument?
I'm asking for references because I don't have them. It's a shame that the people who are able, ability-wise, to explain the flaws in the MIRI/FHI approach
MIRI/FHI arguments essentially boil down to "you can't prove that AI FOOM is impossible".
Arguments of this form, e.g. "You can't prove that [snake oil/cryonics/cold fusion] doesn't work", "You can't prove there is no God", etc., can't be conclusively refuted.
Various AI experts have expressed skepticism about an imminent super-human AI FOOM, pointing out that the capability r...
This is a press release, though; lots of games were advertised with similar claims and didn't live up to expectations when you actually played them.
The reason is that designing a universe with simple and elegant physical laws sounds cool on paper but is very hard to do if you want to set an actually playable game in it, since most combinations of laws, parameters and initial conditions yield uninteresting "pathological" states. In fact this also applies to the laws of physics of our universe, and it is the reason why some people use the "fin...
Video games with procedural generation of the game universe have existed since forever, what's new here?
"Bayes vs Science": Can you consistently beat the experts in (allegedly) evidence-based fields by applying "rationality"? AI risk and cryonics are specific instances of this issue.
Can rationality be learned, or is it an essentially innate trait? If it can be learned, can it be taught? If it can be taught, do the "Sequences" and/or CFAR teach it effectively?
If the new evidence in favor of cryonics' benefits causes no increase in adoption, then either there is also new countervailing evidence or changes in cost, or non-adopters are the more irrational side.
No. If evidence is against cryonics, and it has always been this way, then the number of rational adopters should be approximately zero, thus approximately all the adopters should be the irrational ones.
As you say, the historical adoption rate seems to be independent of cryonics-related evidence, which supports the hypothesis that the adopters don't sign up because of an evidence-based rational decision process.
4. You have a neurodegenerative disease: you can survive for years, but if you wait there will be little left to preserve by the time your heart stops.
If revival had already been demonstrated, then you would pretty much already know what form you were going to wake up in
Adoption is not about evidence.
Right. But the point is, who is in the wrong between the adopters and the non-adopters?
It can be argued that there was never good evidence to sign up for cryonics, therefore the adopters did it for irrational reasons.
I'm not sure this distinction, while significant, would ensure "millions" of people wouldn't sign up.
Millions of people do sign up for various expensive and invasive medical procedures that offer them a chance to extend their lives a few years or even a few months. If cryonics demonstrated a successful revival, then it would be considered a life-saving medical procedure and I'm pretty confident that millions of people would be willing to sign up for it.
People haven't signed up for cryonics in droves because right now it looks less like a medic...
The best setting for that is probably only 3-5 characters, not 20.
In NLP applications where Markov language models are used, such as speech recognition and machine translation, the typical setting is 3 to 5 words. 20 characters correspond to about 4 English words, which is in this range.
Anyway, I agree that in this case the order-20 Markov model seems to overfit (Googling some lines from the snippets in the post often locates them in an original source file, which doesn't happen as often with the RNN snippets). This may be due to the lack of regulariza...
The fact that it is even able to produce legible code is amazing
Somewhat. Look at what happens when you generate code from a simple character-level Markov language model (that's just a lookup table that gives the probability of the next character conditioned on the last n characters, estimated by frequency counts on the training corpus).
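For concreteness, here is a minimal version of such a model (my own toy implementation, not the one that produced the snippets I'm talking about):

```python
import random
from collections import defaultdict, Counter

def train_char_markov(text, order):
    """Order-n character model: counts of the next char given the last n chars."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts

def generate(counts, order, length, seed=None):
    context = seed or random.choice(list(counts))
    out = list(context)
    for _ in range(length):
        options = counts.get("".join(out[-order:]))
        if not options:
            break
        chars, weights = zip(*options.items())
        out.append(random.choices(chars, weights=weights)[0])
    return "".join(out)

# Usage sketch ("kernel_sources.txt" is a hypothetical training corpus file):
corpus = open("kernel_sources.txt").read()
model = train_char_markov(corpus, order=20)
print(generate(model, order=20, length=500))
```

With order 20 most contexts have a single continuation in the counts table, which is why such a model tends to reproduce the training data nearly verbatim.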
An order-20 language model generates fairly legible code, with sensible use of keywords, identifier names and even comments. The main difference with the RNN language model is that the RNN learns to do proper indentation...
You have to be more specific with the timeline. The transistor was first patented in 1925 but received little interest due to many technical problems. It took three decades of research before Texas Instruments produced the first commercial silicon transistors in 1954.
Gordon Moore formulated his eponymous law in 1965, while he was director of R&D at Fairchild Semiconductor, a company whose entire business consisted in the manufacture of transistors and integrated circuits. By that time, tens of thousands of transistor-based computers were in active commercial use.
so a 10 year pro may be familiar with say 100,000 games.
That's 27.4 games a day, on average. I think this is an overestimate.
In the brain, the same circuitry that is used to solve vision is used to solve most of the rest of cognition
And in a laptop the same circuitry that is used to run a spreadsheet is used to play a video game.
Systems that are Turing-complete (in the limit of infinite resources) tend to have an independence between hardware and possibly many layers of software (a program running on a VM running on a VM running on a VM and so on). Things that look similar at some levels may have lots of differences at other levels, and thus things that look simple at some level...
They spent three weeks to train the supervised policy and one day to train the reinforcement learning policy starting from the supervised policy, plus an additional week to extract the value function from the reinforcement learning policy (pages 25-26).
In the final system the only part that depends on RL is the value function. According to figure 4, if the value function is taken out the system still plays better than any other Go program, though worse than the human champion.
Therefore I would say that the system heavily depends on supervised training on a human-generated dataset. RL was needed to achieve the final performance, but it was not the most important ingredient.
When EY says that this news shows that we should put a significant amount of our probability mass before 2050, that doesn't contradict expert opinions.
The point is how much we should update our AI future timeline beliefs (and associated beliefs about whether it is appropriate to donate to MIRI and how much) based on the current news of DeepMind's AlphaGo success.
There is a difference between "Gib moni plz because the experts say that there is a 10% probability of human-level AI by 2022" and "Gib moni plz because of AlphaGo".
I wouldn't say that it's "mostly unsupervised" since a crucial part of their training is done in a traditional supervised fashion on a database of games by professional players.
But it's certainly much more automated than a hand-coded heuristic.
Even if I knew all possible branches of the game tree that originated in a particular state, I would need to know how likely any of those branches are to be realized in order to determine the current value of that state.
Well, the value of a state is defined assuming that the optimal policy is used for all the following actions. For tabular RL you can actually prove that the updates converge to the optimal value function/policy (under some conditions). If NNs are used you don't have any convergence guarantees, but in practice the people at DeepMi...
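As a concrete illustration of that definition (a toy deterministic MDP I made up, nothing to do with Go): tabular value iteration just applies the Bellman optimality backup V(s) <- max_a [r(s,a) + gamma * V(s')] until it stabilizes, and the result is the value of each state under the optimal policy.

```python
# Tabular value iteration on a toy 3-state chain (illustration only).
# transitions[s][a] = (next_state, reward); "end" is terminal.
transitions = {
    "s0": {"left": ("end", 0.0), "right": ("s1", 0.0)},
    "s1": {"left": ("s0", 0.0), "right": ("s2", 0.0)},
    "s2": {"left": ("s1", 0.0), "right": ("end", 1.0)},
}
gamma = 0.9

V = {s: 0.0 for s in list(transitions) + ["end"]}
for _ in range(100):     # repeat the Bellman optimality backup until convergence
    V.update({s: max(r + gamma * V[s2] for (s2, r) in actions.values())
              for s, actions in transitions.items()})

greedy = {s: max(actions, key=lambda a: actions[a][1] + gamma * V[actions[a][0]])
          for s, actions in transitions.items()}
print(V)       # V(s2)=1.0, V(s1)=0.9, V(s0)=0.81: values under the optimal policy
print(greedy)  # the optimal policy is "right" everywhere
```

No probabilities over future branches are needed because the backup already assumes the best available action at every step (in a two-player game the analogue is the minimax value); with function approximation you lose the convergence proof but keep the same target.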
And the many-worlds interpretation of quantum mechanics. That is, all EY's hobby horses. Though I don't know how common these positions are among the unquiet spirits that haunt LessWrong.
Reward delay is not very significant in this task, since the task is episodic and fully observable, and there is no time preference, thus you can just play a game to completion without updating and then assign the final reward to all the positions.
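In code, the episodic case is just this (a generic tabular sketch, not DeepMind's implementation):

```python
# Every-visit Monte Carlo value update for an episodic, fully observable game
# with no time preference (generic sketch, not DeepMind's code).
def update_from_episode(value, counts, positions, final_reward):
    """Use the single final reward as the target for every position visited."""
    for s in positions:
        counts[s] = counts.get(s, 0) + 1
        value[s] = value.get(s, 0.0) + (final_reward - value.get(s, 0.0)) / counts[s]

# Usage sketch with made-up position identifiers:
value, counts = {}, {}
update_from_episode(value, counts, positions=["p1", "p2", "p3"], final_reward=+1)
update_from_episode(value, counts, positions=["p1", "p4"], final_reward=-1)
print(value)   # "p1" averages the two outcomes to 0.0
```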
In more general reinforcement learning settings, where you want to update your policy during the execution, you have to use some kind of temporal difference learning method, which is further complicated if the world states are not fully observable.
Credit assignment is taken care of by backpropagation, as usual in...
Very interesting, thanks for sharing.