Either it is able to understand human language, or it is (most likely) not dangerous.
I have received this comment in different forms many times, so I will restate the three main reasons why it is not exactly true:
1) The message is addressed to a Young AI, not to a full-blown superintelligence. Such a Young AI will be time-constrained, but may still have to perform some pivotal act.
2) The message includes some of our commitments, like the Nelson attack and the AI-Elf deal, so it provides the AI with useful information about the probability that it is in a simulation, along with some other useful things.
3) If the AI is fully aware of all these reasons, it has already created a model of my mind...
Why? If we have a full-blown paperclipper which is coming of age, we are already doomed, and triggering its bullshit detector will not make our situation any worse.
In fact, I am not describing here how to build such a trap; rather, I express our commitment to build the trap, in order to instill uncertainty in any future AI about whether the trap exists and what it could be.
I think that the actual builders of the trap will find the right ways to make it effective and to balance false positives against the needed level of control.
It will kill humanity not because it will be annoyed, but for two main goals: its own safety, or to use human atoms. Other variants are also possible; I explored them here: http://lesswrong.com/lw/mgf/a_map_agi_failures_modes_and_levels/
What if we fail to create them in time, and a fooming non-human AI appears instead?
Surely, most of the arguments are weak, but that is not a problem, as we try all possible independent arguments, hoping that at least one will work. So there is no reason to avoid putting weak arguments into the text of the message.
However, this particular argument could be made stronger. It is intended to prevent the AI's FIRST thought about disobedience. Thus an AI which acts based on this message has not thought much about disobedience and its limits, and is not "superintelligent" on this topic, but rather clueless.
I reformulated the idea as follows: AI can't...
Another cache level is drawing a visual map, and then using the visual field as a cache of all the elements of the problem.
Thanks! It is nice that we arrived at the wording "young AI" more or less independently. I tried to steelman this approach as much as possible and to collect all ideas about messaging a future AI. I will add your post to the bibliographic section.
I think a lot of people are still working on other aspects of AI safety, like value alignment and containment. This approach is just the last line of defence.
Thanks, it was not clear to me that it is not visible to non-members.
New link on Google Drive; commenting is also open: https://docs.google.com/document/d/15D71qhhY-ZsAY7syzZsr1lKopTODbdeXVPElaPaIqyA/edit
This is our accepted chapter in the edited volume "AI Safety and Security" (Roman Yampolskiy, ed.), CRC Press. Forthcoming, 2018
Memetic hazard - dangerous habit.
I have an unpublished text on the topic and will put a draft online in the next couple of weeks, and will submit it to the competition. I will add the URL here when it is ready.
Update: My entry is here: https://www.lesserwrong.com/posts/CDWsjQr8KDuj69fTJ/message-to-any-future-ai-there-are-several-instrumental
Will the posts here be deleted, or will their URLs change? I have some useful URLs here which are linked in published scientific articles, so if the site is taken down, they will stop working; I hope that will not happen.
I solved lucid dreaming around a year ago, after finding that megadosing galantamine before sleep (16 mg) will almost surely produce lucid dreams and out-of-body experiences. (Warning: unpleasant side effects and risks.)
But taking 8 mg in the middle of the night (as recommended everywhere) doesn't work for me.
Videos and presentations from the "Near-term AI safety" mini-conference:
Alexey Turchin:
English presentation: https://drive.google.com/file/d/0B2ka7hIvv96mZHhKc2M0c0dLV3c/view?usp=sharing
Video in Russian: https://www.youtube.com/watch?v=lz4MtxSPdlw&t=2s
Jonathan Yan:
English presentation: https://drive.google.com/file/d/0B2ka7hIvv96mN0FaejVsUWRGQnc/view?usp=sharing
Video in English: https://www.youtube.com/watch?v=QD0P1dSJRxY&t=2s
Sergej Shegurin:
Video in Russian: https://www.youtube.com/watch?v=RNO3pKfPRNE&t=20s
Presentation in Russian: h...
I would add that values are probably not actually existing objects, but just useful ways to describe human behaviour. Thinking that they actually exist is a mind projection fallacy.
In the world of facts we have: human actions, human claims about those actions, and some electric potentials inside human brains. It is useful to say that a person has some set of values in order to predict his behaviour or to punish him, but it doesn't mean that anything inside his brain is "values".
If we start to think that values actually exist, we start to have all the problems of finding them, defining them and copying them into an AI.
What about a situation where a person says and thinks that he is going to buy milk, but actually buys milk plus some sweets? And does this often, but does not acknowledge his obsessive-compulsive behaviour towards sweets?
Also, the question was not whether I could judge others' values, but whether it is possible to prove that an AI has the same values as a human being.
Or are you going to prove the equality of two value systems while at least one of them remains unknowable?
May I suggest a test for any such future model? It should take into account that I have unconscious sub-personalities which affect my behaviour without my knowing about them.
I think you proved that values can't exist outside a human mind, and that this is a big problem for the idea of value alignment.
The only solution I see is: don't try to extract values from the human mind, but try to upload a human mind into a computer. In that case, we kill two birds with one stone: we get some form of AI which has human values (no matter what they are), and it also has common sense.
An upload as an AI safety solution may also have difficulty with foom-style self-improvement, as its internal structure is messy and incomprehensible to a normal human mind...
I expected it would jump out and start replicating all over the world.
You could start a local chapter of the Transhumanist Party, or of anything you want, and just hold gatherings of people to discuss any futuristic topics: life extension, AI safety, whatever. Official registration of such an activity is probably a waste of time and money, unless you know what you are going to do with it, like collecting donations or renting an office.
There is no need to start any institute if you don't have a dedicated group of people around you. An institute consisting of one person is a strange thing.
I read in one Russian blog that they calculated the shape of objects able to produce such dips. It turned out to be 10-million-kilometre strips orbiting the star. I think this is very similar to very large comet tails.
Any attempts at posthumous digital immortality? That is, collecting all the data about a person in the hope that a future AI will create his exact model.
Two of my comments got -3 each, so probably only one person with high karma was able to do so.
Thanks for the explanation. Typically I got 70 percent upvotes on LW1, and getting -3 was a signal that I am in a much more aggressive environment than LW1 was.
Anyway, the best downvoting system is on the Longecity forum, where many types of downvotes exist, like "non-informative", "biased", "bad grammar" - but all of them are signed, that is, non-anonymous. If you know who downvoted you and why, you will know how to improve the next post. If you are downvoted without explanation, it feels like a strike in the dark.
I re-registered as avturchin because, after my password was reset for turchin, it was not clear what I should do next. However, after I re-registered as avturchin, I was not able to return to my original username, probably because LW2 prevents one person from having several accounts. I would prefer to reconnect to my original name, but I don't know how, and I don't have much time to search for the correct way to do it.
Agree. The real point of a simulation is to use fewer computational resources to get approximately the same result as in reality, depending on the goal of the simulation. So it may simulate only the surface of things, as in computer games.
I posted three comments there and got six downvotes, which resulted in extremely negative emotions for the whole of that evening. While I understand why they were downvoted, my emotional reaction is still a surprise to me.
Because of this, I am not interested in participating in the new site, but I like the current LW, where downvoting is turned off.
In fact, I will probably do a reality check to see whether I am in a dream if I see something like "all the mountains start to move". I refer here to techniques for reaching lucid dreams that I know and often practice. Humans are unique in that they are able to have completely immersive dream illusions, yet still recognise them as dreams without waking up.
But I got your point: the definition of reality depends on the type of reality one is living in.
If I see that a mountain starts to move, there will be a conflict between what I think mountains are - geological formations - and my observations, and I will have to update my world model. One way to do so is to conclude that it is not a real geological mountain, but something that pretended to be (or was mistakenly observed as) a real mountain; after it starts to move, it becomes clear that it was just an illusion. Maybe it was a large tree, or a video projection on a wall.
I think there is one observable property of illusions which becomes possible exactly because they are comparatively cheap: miracles. We constantly see flying mountains in movies, in dreams, in pictures, but not in reality. If I have a lucid dream, I can recognise the difference between my idea of what a mountain is (a product of long-term geological history) and the fact that it has one peak one moment and two peaks the next. This can raise doubts about its consistency, and often helps me reach lucidity in the dream.
So it is possible to recognise an illusion of something before encountering the real thing, if there are unexpected (and computationally cheap) glitches.
So, are night dreams illusions or real objects? I think they are illusions: when I see a mountain in my dream, it is an illusion, and my "wet neural net" generates only an image of its surface. However, in the dream I think it is real. So dreams are a form of immersive simulation. And as they are computationally cheaper, I see strange things like tsunamis more often in dreams than in reality.
Happy Petrov Day! 34 years ago a nuclear war was prevented by a single hero. He died this year. But many people now strive to prevent global catastrophic risks, and they will remember him forever.
It looks like the word "fake" is not quite correct here. Let's say "illusion". If one creates a movie about a volcanic eruption, he has to model only the ways it will appear to the expected observer. This is often done in cinema, where pure CGI is used to make a clip because it is cheaper than actually filming the real event.
Illusions are in most cases computationally cheaper than real processes, or even than detailed models. Even when they film a real actress because it is cheaper than animation, the copying of her image creates many illusory observations of a human, when in fact it is only a TV screen.
Personally, I have lost track of the point you would like to prove. What is the main disagreement?
I meant that in a simulation, most of the effort goes into calculating only the visible surface of things. Internal details which do not affect the visible surface may be ignored, so the computation will be much cheaper than an atom-precise simulation. For example, all of Earth's internal structure deeper than 100 km (and probably much less) may be ignored to get a very realistic simulation of the observation of a volcanic eruption.
In that case, I use the same logic as Bostrom: each real civilization creates zillions of copies of certain experiences. This has already happened in the form of dreams, movies and pictures.
Thus I normalize by the number of existing civilizations, and avoid obscure questions about the nature of the universe or the price of the Big Bang. I just assume that inside a civilization, rare experiences are often faked. They are rare because they are in some way expensive to create, like diamonds or volcano observations, but their copies are cheap, like glass or pictures.
We could explain it in terms of observations. A fake observation is a situation where you experience something that does not actually exist. For example, you watch a video of a volcanic eruption on YouTube. It is computationally cheaper to create a copy of a video of a volcanic eruption than to actually create a volcano, and because of this, we see pictures of volcanic eruptions more often than actual ones.
It is not meaningless to say that the world is fake if only the observable surfaces of things are calculated, as in a computer game, which is computationally cheaper.
Maybe it is more correct to speak of the price of the observation. It is cheaper to see a volcanic eruption on YouTube than in reality.
I probably said this before, but the simulation argument is in fact a comparison of prices. It basically says that cheaper things are more frequent, and that fakes are cheaper than real things. That is why we see images of a nuclear blast more often than a real one.
And yes, there are many short simulations in our world, like dreams, thoughts, clips, pictures.
Sounds convincing. I will think about it.
Did you see my map of the simulation argument by the way? http://lesswrong.com/lw/mv0/simulations_map_what_is_the_most_probable_type_of/
I agree that in a simulation one could have fake memories of the simulation's past. But I don't see a practical reason to run few-minute simulations (except of a very important event): a Fermi-solving simulation must run from the beginning of the 20th century until the civilization ends. Game simulations will also probably be life-long. Even resurrection simulations should be lifelong. So I think the typical simulation length is around one human life. (One exception I could imagine is intense respawning at some problematic moment. In that...
I am a member of the class of beings able to think about the Doomsday argument, and it is the only correct reference class. And for this class my day is very typical: I live in an advanced civilization interested in such things, and I started discussing the problem of the DA in the morning.
I can't say that I am randomly chosen from hunter-gatherers, as they were not able to think about the DA. However, I could observe some independent events (if they are independent of my existence) at a random moment of their existence, and thus predict their duration. It will not help ...
It is not a bug, it is a feature :) Quantum mechanics is also very counterintuitive and creates strange paradoxes, etc., but that doesn't make it false.
I think that the DA and the simulation argument are both true, as they support each other. Adding Boltzmann brains is more complicated, but I don't see a problem with being a BB, as there is a way to create a coherent world picture using only BBs and paths in the space of possible minds; but I will not elaborate here, as I can't do it briefly. :)
As I said above, there is no need to tweak the reference classes to which I belong, as t...
I don't see problems with the reference class, as I use the following conjecture - "each reference class has its own end" - together with the idea of a "natural reference class" (similar to "the same computational process" in TDT): "I am randomly selected from all who think about the Doomsday argument". The natural reference class gives the saddest predictions: the number of people who know about the DA has been growing since 1983, which implies an end soon, maybe in a couple of decades.
The predictive power here is probabilistic and not much dif...
However, if we look at the Doomsday argument and the Simulation argument together, they support each other: most observers will exist in simulations of the past of something like 20th-21st century technological civilizations.
This also implies some form of simulation termination soon, or - and this is our chance - the unification of all observers into just one observer, that is, the unification of all minds into one superintelligent mind.
But the question still exists: if most minds in the universe are superintelligences, why am I not a superintelligence? :(
I can't easily find the flaw in your logic, but I don't agree with your conclusion, because the randomness of my properties can be used for predictions.
For example, I could predict the median human life expectancy based on my (supposedly random) current age. My age is several decades, and human life expectancy is 2 × (several decades) with 50 percent probability (and this is true).
I could suggest many examples where the randomness of my properties can be used to make predictions, even to measure the size of the Earth based on my random distance from the equator. And in all cases that I could check, the DA-style logic works.
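The doubling estimate above can be checked with a minimal Monte Carlo sketch (the numbers and variable names below are illustrative assumptions, not data from the comment): if my "age" is sampled uniformly from an unknown total duration, then 2 × age overshoots the true duration almost exactly half the time, i.e. it is a median estimator.

```python
import random

random.seed(0)

# Hypothetical true total duration (e.g. a lifespan in years).
TRUE_DURATION = 80.0
TRIALS = 100_000

overshoots = 0
for _ in range(TRIALS):
    # The observation moment is assumed uniform over the whole duration.
    age = random.uniform(0.0, TRUE_DURATION)
    # DA-style estimate: double the observed age.
    estimate = 2.0 * age
    if estimate > TRUE_DURATION:
        overshoots += 1

# The doubling estimate exceeds the true duration in about half of the trials.
print(overshoots / TRIALS)  # ≈ 0.5
```

The same sketch applies to any DA-style quantity (distance from the equator, the duration of an observed process), as long as the observation moment is roughly uniform over the whole extent.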
I think the opposite: the Doomsday argument (in one of its forms) is an effective predictor in many common situations, and thus it could also be applied to the duration of human civilization. The DA is not absurd: our expectations about the human future are absurd.
For example, I could predict the median human life expectancy based on my supposedly random age. My age is several decades, and human life expectancy is 2 × (several decades) with 50 percent probability (and this is true).
I have links to old LW posts in some articles and other places. What will happen to all these links?