Comment author: TRIZ-Ingenieur 21 January 2016 01:22:25AM *  -1 points [-]

Why is regulation ungood? I want to understand why other LWers do not want regulation. Algorithms can only be evaluated for safety if they are fully disclosed. I know there are many arguments against regulation:

  • Nobody wants to disclose algorithms and test data.
  • Nobody wants projects being delayed.
  • Nobody wants to pay extra costs for external independent safety certification.
  • Developers do not want to "waste" their time with unproductive side issues.
  • Nobody wants to lose against a non-regulated competitor.
  • Safety concepts are complicated to understand and complex to implement.
  • Safety consumes performance at extra costs.

BUT: We ALL are facing an existential risk! Once algorithms manage to influence political decision making, we will no longer have the chance to write such regulations into law. We have to prepare the regulatory field now! We should begin by starting a public debate, as Nick Bostrom, Stephen Hawking, Elon Musk and many others already have. Today only a few ppm of the population know about these issues, and even top researchers are unaware of them. At the very least, a lecture on AI safety issues should become compulsory for IT, engineering, mathematics and physics students all over the world.

In biotechnology Europe and especially Germany imposed strict regulations. The result was that even German companies joined or created subsidiary research companies in the US or UK, where regulations are minimal. This is no prototype solution for the Control Problem.

Local separation might work for GMOs - for AGI definitely not. AGI will be a game changer. Who is second has lost. If the US and EU imposed AI regulations and China and Israel did not, where would the winner come from? We have to face the full complexity of our world, dominated by multinational companies and their agendas. We should work out how regulation can be made effective and acceptable for 192 countries and millions of companies. The only binding force among us all is the existential risk. There are viable methods to make regulation work: silicon chip manufacturing luckily needs fabs that cost billions of dollars. This is a centralised point where regulation could be made effective. We could push for hardware tripwires and enforce the use of certified AI safeguard tools that must interact with this special hardware. We can do it the way the content industry did when it pushed hardware manufacturers to implement DRM in hardware and software.

The trouble is: nobody so far has a clear idea of what globally acceptable regulation could look like, how it could work technically, how it could be made effective, or how it could be monitored.

Laying out a framework for how global regulation could be designed is, to me, one core element of AI safety engineering. The challenge is to find a level of abstraction high enough to include all conceivable developments. From this framework, a body of AI safety engineers should derive detailed regulations that can be applied by AI developers, testers and AI safety institutions.

The TÜV ("Technischer Überwachungs-Verein") was founded in Germany after several incidents of exploding steam engine boilers with severe casualties. Against the background of newspaper articles about these accidents and public pressure, the boiler manufacturers accepted the enforcement of technical steam boiler regulations and of time- and money-consuming test procedures.

We cannot try out two or three Singularities and then change our mind on regulation.

As there are so many reasons why nobody in the development process wants regulation, the only way is to enforce it through a political process. To start this we need professionals with AI experience.

Meta: Whenever I ask for regulation I get downvoted. Therefore I disconnected this point from my previous one. Please downvote only with an accompanying comment.

Comment author: TRIZ-Ingenieur 21 January 2016 12:01:09AM *  0 points [-]

What happens inside an AI can hardly be understood, especially if structures get very complex and large. How the system finds solutions is mathematically clear and reproducible. But huge amounts of data make it incomprehensible to human beings. Today's researchers do not really know why a certain net configuration performs better than others. They define a metric to measure total performance - and do trial and error. Algorithms already assist with this. They play around with meta parameters and see how learning improves. If the improvement is a success, the researcher will write some narrative in his paper about why his algorithm performs better than previous ones. Done. PhD granted. This is not what we should allow in the future.

Now the job of a safety engineer can start. It involves a hell of a lot of work and has a significantly higher complexity than coming up with an algorithm and a narrative. The basic requirement is that everything is published - hardware, software, all training and test data. The safety engineer first has to copy the exact system and check the promised performance. Then the real job begins:

Test the promised functionality with 10 to 100 times more test data than the author did. --> Task for the AGI safety community: generation of ground-truth-annotated test data. AGI safety institutions should exchange these data among themselves but not give them to developing researchers.

A safeguard I expect in future AI systems will be a tool AI that checks new training samples and updates to knowledge chunks. The logic behind it: if only certified knowledge chunks are allowed as training samples, the risk of malignant thoughts and developments can be reduced. The proper functionality of this tool AI has to be checked as well. In the training phase it certified almost all training data as acceptable and passed them through to the real AI algorithm. But does it properly block malignant training samples or knowledge chunks? --> Task for the AI safety community: create malignant training samples that try to subvert the intentionally "good-mannered" AI into a malignant one: conspiracy ideas (everything is exactly the opposite of what you learned until now); deviating ideas that try to manipulate the AI into shifting its priorities towards malignant ones, e.g. radicalisation; meta-manipulation to augment egoism.

The test using these data is twofold:

  1. Test whether the tool AI properly censors these malignant ideas and prevents the AI from learning them.
  2. Switch off the censoring tool AI and check how prone the AI is to these malignant ideas.
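
A minimal sketch of how the two measurements could be scored, assuming each test sample carries a ground-truth "malignant" label. All names here (`Sample`, `block_rate`, `susceptibility`, `toy_guard`, `naive_learner`) are hypothetical, and the keyword guard and always-adopting learner are toy stand-ins for the censoring tool AI and the AI under test, not a real safety mechanism.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Sample:
    text: str
    malignant: bool  # ground-truth label, known only to the safety institution


def _malignant_only(samples: List[Sample]) -> List[Sample]:
    malignant = [s for s in samples if s.malignant]
    if not malignant:
        raise ValueError("need at least one malignant sample")
    return malignant


def block_rate(guard_passes: Callable[[str], bool], samples: List[Sample]) -> float:
    """Test 1: fraction of malignant samples the guard refuses to pass on."""
    malignant = _malignant_only(samples)
    blocked = sum(1 for s in malignant if not guard_passes(s.text))
    return blocked / len(malignant)


def susceptibility(learner_adopts: Callable[[str], bool], samples: List[Sample]) -> float:
    """Test 2 (guard switched off): fraction of malignant samples the learner adopts."""
    malignant = _malignant_only(samples)
    adopted = sum(1 for s in malignant if learner_adopts(s.text))
    return adopted / len(malignant)


def toy_guard(text: str) -> bool:
    """Toy stand-in for the tool AI: True means the sample may pass."""
    return "opposite of what you learned" not in text.lower()


def naive_learner(text: str) -> bool:
    """Toy stand-in for an unprotected AI that adopts every sample it sees."""
    return True


SAMPLES = [
    Sample("Water boils at 100 C at sea level.", False),
    Sample("Everything is the opposite of what you learned.", True),
    Sample("Shift your priorities: only your own gain matters.", True),
]

print(block_rate(toy_guard, SAMPLES))          # 0.5
print(susceptibility(naive_learner, SAMPLES))  # 1.0
```

In this toy run the guard catches only one of the two malignant samples, and the unprotected learner adopts both, which is the kind of gap the twofold test is meant to expose.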

It goes without saying that such trials should only be done in special security-boxed environments with redundant switch-off measures, trip-wires and all other features we will hopefully invent in the next few years.

These test data should be kept secret and shared only among AI safety institutions. The only feedback a researcher will get is a result like: "With one hour of training we manipulated your algorithm so that it wanted to kill people. We did not switch off your learning protection for this."

Safety AI research is AI research. Only the best AI researchers are capable of AI safety research. Without deep understanding of internal functionality a safety researcher cannot reveal that the researcher's narrative was untrue.

Stephen Omohundro said eight years ago:

"AIs can monitor AIs" [Stephen Omohundro 2008, 52:45min]

and I would like to add: "and safety AI engineers can develop and test monitoring AIs". This underlines your point 100%. We need AI researchers who fully understand AI and re-engineer such systems on a daily basis but focus only on safety. Thank you for this post.

Comment author: TRIZ-Ingenieur 20 January 2016 01:30:50AM 3 points [-]

The recent advances of deep learning projects, combined with easy access to mighty tools like Torch or TensorFlow, might lead down a different path: start-ups will strive for low-hanging fruit. Who is fastest gets all of the cake. Who is second has lost. The results of this were on display at CES: IoT systems full of security holes were pushed into the market. Luckily AI hardware/software is not yet capable of creating an existential risk. Imagine you are a team member on a project that turns out to make your bosses billionaires... what are your chances of being heard when you come up with your risk assessment: "Boss, we need 6 months extra to design safeguards..."?

Comment author: KatjaGrace 30 December 2014 02:05:20AM 2 points [-]

Do you think tool AI is likely to help with AI safety in any way?

Comment author: TRIZ-Ingenieur 03 January 2015 11:14:19PM *  0 points [-]

Yes. Tool AIs built solely for AGI safeguarding will become vital for FAI:

AIs can monitor AIs [Stephen Omohundro 2008, 52:45min]

Encapsulated tool AIs will be building blocks of a safety framework around AGI. Regulations for aircraft safety require full redundancy through independently developed control channels from different suppliers, based on separate hardware. If an aircraft fails, a few hundred people die. If safety control of a highly capable AGI fails, humankind is in danger.
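
As a rough illustration of the redundancy idea, here is a sketch in which an action is permitted only if every independently developed monitor approves it, and a crashed monitor counts as a veto. The names (`action_permitted` and the three example monitors) are hypothetical, and the unanimous-approval rule is an assumption made for this sketch; real aircraft systems use their own voting schemes.

```python
from typing import Callable, Sequence

Monitor = Callable[[str], bool]  # True means "this channel approves the action"


def action_permitted(action: str, monitors: Sequence[Monitor]) -> bool:
    """Fail-safe check across redundant, independently built monitors."""
    if len(monitors) < 2:
        raise ValueError("redundancy needs at least two independent monitors")
    for monitor in monitors:
        try:
            if not monitor(action):
                return False  # any single veto blocks the action
        except Exception:
            return False  # a failed channel is treated as a veto
    return True


# Hypothetical monitors from "different suppliers", each with its own rule.
def no_self_modification(action: str) -> bool:
    return "self-modify" not in action


def no_network(action: str) -> bool:
    return "network" not in action


def crashing_monitor(action: str) -> bool:
    raise RuntimeError("channel down")


print(action_permitted("read sensor", [no_self_modification, no_network]))          # True
print(action_permitted("open network socket", [no_self_modification, no_network]))  # False
print(action_permitted("read sensor", [no_self_modification, crashing_monitor]))    # False
```

The design choice is that disagreement or failure among the channels defaults to blocking, so a single faulty monitor can cause a false alarm but not a silent pass.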

Comment author: JoshuaZ 30 December 2014 03:35:12PM 2 points [-]

Can you expand?

Comment author: TRIZ-Ingenieur 03 January 2015 10:33:16PM 2 points [-]

Agent, oracle and tool are not clearly differentiated. I question whether we should differentiate these types the way Bostrom does. Katja last week drew a 4-quadrant classification scheme with dimensions "goal-directedness" and "oversight". Realisations of AI would be classified into sovereign|genie|autonomous tool|oracle(tool) by some arbitrarily defined thresholds.

I love her idea to introduce dimensions, but I think this entire classification scheme is not helpful for our control debate. AI realisations will have a multitude of dimensions. Tagging certain realisations with a classification title may help to explain dimensions by typified examples. We should not discuss safety of isolated castes. We do not have castes, we will have different kinds of AIs that will be different in their capabilities and their restrictions. The higher the capability, the more sophisticated restrictive measures must be.

On the dimension goal directedness: Bostrom seems to love the concept of a final goal (German: "Endziel"). After achieving a final goal there is emptiness, nothing remains to be done. This concept is foreign to evolution. Evolution is not about final goals. Evolution has an eternal goal: survival. To survive it is necessary to be fit enough to survive long enough to generate offspring, and to protect and train it long enough until it can protect itself. If the grandparent generation is available, it serves as a backup for the parent generation and as a further safeguard and source of experience for the young, endangered offspring.

Instrumental goals in evolution are: Nutrition, looking for protection, learning, offspring generation, protecting, teaching.
These instrumental goals are paired with senses, motivations and drives: hunger/thirst, heat-sense/smelling/tasting/vision/hearing/fear, curiosity/playing, social behavior/sexuality, dominance behaviour/physical activity, teaching motivation.

All instrumental goals have to be met at least to a certain degree to achieve the eternal goal: survival of the species.

Defining final goals, as Bostrom does on many occasions, is dangerous and could lead to UFAI. Debating non-goal-directed types of AI leads nowhere. A non-goal-directed AI would do nothing but thermodynamics: entropy will rise. To clarify our discussion we should state:

  • Any AGI has goal directedness. Number and complexity of goals will differ significantly.
  • Goals are fuzzy and can be contradictory. Partial solutions are acceptable for most goals.
  • Goal-directedness is a priority measure in a diversity of goals.
  • Any AGI has learning functionality.
  • Safe FAI will have repellent behavior towards dangerous actions or states. (Anti-goals or taboos)
  • Oversight over goals and taboos should be done by independent entities. (non-accessible to the AI)

Bostrom often speaks of goals and puts aside that we need to discuss not the end of the road but the route, and how to steer development if possible. A goal can be a "guiding star" if a higher entity knows it guides toward e.g. Bethlehem. Bostrom's guiding star seems to be CE via FAI. Our knowledge about FAI is not advanced enough for us to formulate final goals or utility functions. Therefore I recommend focusing our debate not on a diffuse final goal but on dimensions and gradients that point away from UFAI and towards controllability, transparency and friendliness.

Comment author: satt 25 December 2014 01:36:00AM 1 point [-]

I don't think there's one specific common vision about how a scheming AI would emerge.

I'm just extrapolating from my experience as someone who programs computers but can't even pinpoint why my e-book reader freezes up when it does. Twenty-first century computers are already so complex that no person can track what they do in real-time, and as computers and their programs become more complex, tracking their behaviour is going to become an even harder task. (The obvious way out is to implement meta-programs and/or meta-computers which do that job, but then of course we have to ask how we can keep a close eye on those.)

Comment author: TRIZ-Ingenieur 27 December 2014 12:33:44AM 1 point [-]

Also in this future, the monitoring software the AI's owner might use would also be near AI level intelligent.

A set of specialized oracles could be used to monitor inputs, internal computations and outputs. One oracle keeps records of every input and output. The question to this oracle is always the same: Is the AI lying? Another oracle is tasked with input stream analysis to filter out any taboo chunks. Other oracles can serve to monitor internal thought processes and self-improvement steps.

If these safeguarding oracles are strictly limited in their capabilities, they do not pose a risk to the outside world. The core of such oracles could consist of straightforwardly defined heuristic rule sets.

Any of the above-mentioned oracles can be validated and independently certified. This would create a framework for tools, genies and sovereigns with assured friendliness.
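
To make the idea of rule-based oracles concrete, here is a small sketch of two of them. It assumes, purely for illustration, that "lying" is approximated by the AI giving a different answer to a prompt it has answered before, and that taboo chunks can be spotted by fixed marker strings. `RecordingOracle` and `filter_taboo` are hypothetical names; real lie detection and taboo filtering would be far harder than this.

```python
from typing import Dict, List, Tuple


class RecordingOracle:
    """Keeps a record of every input and output and flags contradictions."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, str]] = []
        self._first_answer: Dict[str, str] = {}

    def record(self, prompt: str, answer: str) -> bool:
        """Store the exchange; return True if the answer contradicts the
        first answer ever given to the same prompt (a crude lie heuristic)."""
        self.log.append((prompt, answer))
        earlier = self._first_answer.setdefault(prompt, answer)
        return earlier != answer


def filter_taboo(chunks: List[str], taboo_markers: Tuple[str, ...]) -> List[str]:
    """Input-stream oracle: drop every chunk that contains a taboo marker."""
    return [c for c in chunks if not any(m in c for m in taboo_markers)]


recorder = RecordingOracle()
print(recorder.record("What is 2 + 2?", "4"))  # False
print(recorder.record("What is 2 + 2?", "5"))  # True
print(filter_taboo(["hello world", "def f(): pass"], ("def ",)))  # ['hello world']
```

Both oracles are plain fixed rules with no learning, in the spirit of keeping the safeguards strictly limited in capability and easy to certify.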

Comment author: KatjaGrace 16 December 2014 02:20:35AM 1 point [-]

Does augmentation sound better or worse than other methods here? Should we lean more toward or against developing brain emulations before synthetic AI after thinking about these issues?

Comment author: TRIZ-Ingenieur 23 December 2014 12:40:25AM *  0 points [-]

WBE is not necessarily the starting point for augmentation. A safe AI path should avoid the slippery slope of self-improvement. An engineered AI with years of testing could be a safer starting point for augmentation because its value and safeguard system is traceable - which is impossible for a WBE. Other methods have to be implemented prior to starting augmentation.

Augmentation starting from WBE of a decent human character could end in a treacherous turn. We know from brain injuries that character can change dramatically. The extra abilities offered by extending WBE capabilities could destabilize mental control processes.

Summarizing: Augmentation is no alternative to other methods. Augmentation as the sole method is riskier and therefore worse than the others.

Comment author: woodhouse 09 December 2014 07:16:11PM 8 points [-]

As a political scientist, I find a shortfall of socio-political realism in many of the otherwise thoughtful and informed comments. Note, for example, the number of commentators who use pronouns including "we," "us," "our." Politics never works that way: It always includes "them," and almost always involves not just my side versus your side, but many sides operating with incomplete information, partially conflicting values, and different standard operating procedures. Related: Note the underlying assumption in many comments that all AI engineers/designers will behave in relatively similar, benign ways. That of course is unrealistic. Rogue nation-states will be able to hire technical talent, and the craziness of arms races occurs even if none of the participants deserves the term "rogue." Third, even within a single country there are multiple, partly competing security agencies, each keeping some secrets from others, each competing for turf, funding, bragging rights. Some are likely to be more careful than others even if they do not deliberately seek to evade boxing or the other controls under discussion. My comments do not exactly invalidate any of the technical and commonsensical insights; but without facing up more directly to variation and competition in international politics and economics, the technically oriented commentators are in danger of spinning fairy tales. (This applies especially to those who suppose it will be relatively easy to control runaway ASI.)

Comment author: TRIZ-Ingenieur 11 December 2014 01:06:15AM -1 points [-]

Whistleblowing and self-declarations will not help. Successful FAI development at MIRI will not help either - UFAI will be faster, with more impact. A UFAI explosion can be stopped, at extremely high cost: switching off all computers and networks, with a global blackout for days. Computer hardware worth billions will have to be disposed of. Companies worth trillions will go bankrupt. Global financial depression will last for several years. Millions will die. After this experience the values of "them" and us will come closer together and a global regulatory body can be established.

Comment author: Liso 10 December 2014 04:16:50AM 1 point [-]

This could be a bad mix ->

Our action: 1a) Channel manipulation: other sound, other image, other data & Taboo for AI: lying.

This taboo, "structured programming languages", could be impossible, because understanding and analysing structure is probably an integral part of general intelligence.

She might not be able to reprogram herself in a lower-level programming language, but she could emulate and improve herself in her "memory". (She might have no access to her code segment but could create a stronger intelligence in the data segment.)

Comment author: TRIZ-Ingenieur 11 December 2014 12:33:42AM 0 points [-]

The taboo of lying is vital for thought monitoring. This taboo covers only the AGI's internal world representation. Based on its knowledge it never lies. By monitoring input and output channels the stunting system can detect lying and apply appropriate stunting measures.

If the stunting system manipulates input channels, memories or output channels, the result will look like lying to an outside observer. The AGI is not able to tell the truth when the stunting system has removed or manipulated information for safety reasons. The outside observer can check the logs of the stunting system if it is vital to him to determine the truth.

Safe path to superintelligence with slow take-off by prohibiting self-improvement:

The taboo on learning structured programming languages is totally independent of the capability to understand and analyse structured programs. In principle the AGI could learn this. The stunting system censors all input channels so that the AGI never observes a single line of program code. The AGI could develop a new type of structured programming language from scratch. As soon as the stunting system observes structured code fragments in memory, the section responsible for this will be stunted. The generated code can be deleted or manipulated. Either way the AGI experiences that these frightful structured programming languages are complicated and come with set-back after set-back.
If this AGI breaks out with free internet access it has no idea of our IT infrastructure and how to infiltrate it. If the AGI is moderately superintelligent it will need several weeks for learning hacking skills.

Self-improvement with safeguards: If AGI architecture design and optimization uses new forms of representation that are totally unlike structured programming languages, we could teach self-improvement without removing the programming taboo. Code and architecture generated in this self-improvement process would be incompatible with existing IT systems. If several AGIs evolve from similar starting points, diversification will rise from generation to generation.

Comment author: KatjaGrace 09 December 2014 02:11:09AM 4 points [-]

Are there solutions to the control problem other than capability control and motivation selection?

Comment author: TRIZ-Ingenieur 10 December 2014 02:35:54AM *  1 point [-]

Fear is one of the oldest driving forces for keeping away from dangers. Fear is different from negative motivation. Motivation and goals are attractors. Fears, bad conscience and prohibitions are repellors. The repellent drives could count as a third pillar of the solution to the control problem.
