Petr Andreev

  1. Problems of Legal Regulation 

1.1. The adoption of such laws is a long path

Usually it is a centuries-long path: court decisions -> actual enforcement of those decisions -> substantive law -> procedures -> codes -> declarations, then conventions -> codes.

Humanity does not have this much time; it is worth focusing on real results that people can actually see. It might be necessary to build simulations to understand which behavior is irresponsible.

Where is the line between defining what is socially dangerous and cataloguing the ways to escape responsibility?

As a legal analogy, I would like to draw attention to the criminal case of Tornado Cash.

https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBOBR:2024:2069 

The developer created and continued to improve an unstoppable program that may have changed the structure of public transactions forever. Look at where the line is drawn there. Can a similar system be devised for the projection of existential risks?

1.2. There is a difference between substantive law and actual law on the ground, especially in countries built on mysticism and manipulation. The median group of voters in each country creates its own irrational picture of the world, so you should not be surprised by floating goals.

There are enough people in the world in information bubbles different from yours, so you can be sure that there are actors with values opposite to yours.

1.3. Their research can be serious while their worldview is simplified and absurd. At the same time, their resources can be extensive enough for technical workers to perform their duties properly.

  2. The Impossibility of Ideological Influence 

2.1. There is no way to ideologically influence all people and all systems simultaneously.

2.2. If I understand you correctly, more than 10 countries can spend huge sums on creating AI to accelerate the solving of scientific problems. Many of these countries are constantly struggling for their integrity and security, solving national issues, seeking the re-election of leaders, gaining benefits, fulfilling the sacred desires of populations or classes, or following other speculative or even conspiratorial theories. Usually there are layers of dozens of such theories at once.

2.3. Humanity stands on the brink of a new search for the philosopher's stone, and is ready to spend enormous resources on it. For example, quantum decryption of old Satoshi wallets plus genome decryption can create the illusion that GAI can fulfill any transhumanist alchemist's main desires: the opportunity to defeat death within the lifetime of this generation or the next two. Why should a hypothetical billionaire and/or state leader refuse this?

Or, as proposed here, the creation of a new super-IQ population. Again, do not forget that some of these beliefs can be antagonistic.

Even now, from the perspective of AI, predicting the weather in 2100 is somehow easier than in 2040. Currently there are about 3-4 countries that can create Wasteland-type weather, and they come into partial confrontation roughly every five years. Each confrontation is a tick towards Wasteland with a probability of 1-5%. If this continues, the probability of Wasteland-type weather by 2040 will be:

1 - 0.99^3 = 0.029701
1 - 0.95^3 = 0.142625

By 2100, if nothing changes:

1 - 0.99^15 = 0.1399
1 - 0.95^15 = 0.5367
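These figures follow from treating each confrontation as an independent Bernoulli trial. A quick script to reproduce them (note that 1 − 0.95^15 comes out to about 0.5367, since 0.95^15 itself is ≈ 0.4633):

```python
# Reproducing the cumulative-risk arithmetic above: independent
# five-year "ticks", each with per-tick catastrophe probability p.

def cumulative_risk(p_per_tick: float, ticks: int) -> float:
    """P(at least one catastrophe in n ticks) = 1 - (1 - p)^n."""
    return 1 - (1 - p_per_tick) ** ticks

for p in (0.01, 0.05):
    print(f"by 2040 (3 ticks, p={p}): {cumulative_risk(p, 3):.6f}")
    print(f"by 2100 (15 ticks, p={p}): {cumulative_risk(p, 15):.4f}")
# by 2040: 0.029701 and 0.142625; by 2100: 0.1399 and 0.5367
```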

(A year ago my predictions were more pessimistic, as I was in an information field that presented arguments for the Wasteland scenario in the style of "we'll go to heaven, and the rest will just die." I have since switched off that media =) Now it seems these confrontations will be tied to presidential cycles and policy, occurring roughly once every five years rather than every year, which is quite an optimistic forecast.)

Nevertheless, we have many apocalyptic scenarios: nuclear, pandemic, ecological (the latter is exacerbated by the AI problem, as it will be much easier to gather structures and goals that are antagonistic in aims). 

3. Crisis of the Rule of Law 

In world politics there has been a rollback of legal institutions since 2016 (see UN analytics), which signals a crisis of common values. Even without the AI problem, this usually indicates either the construction of a new equilibrium or a fall into chaos. I am a pessimist here and believe that in the absence of normalized common values, information bubbles become antagonistic due to the nature of hysteria (simply put, the wilder, more emotional, and more irrational information flows win). But, conversely, this is a moment when MIRI could inject the value that existential safety is very important. Especially now, because any intervention at the bottom of our doomsday clock could create the impression that MIRI solved it.

4. Problems of Detecting AI Threats 

4.1. AI problems are less noticeable than nuclear threats. (How can these clusters be detected? Are there any effective methods?)

4.2. Threat detection is more blurred, and identifying dangerous clusters is difficult. Consider the possibility of decentralized systems, like blockchain, and their impact on security. (Decentralized computing is rapidly developing, and there is progress in superconductors; is this a problem from the perspective of AI security detection?)

5. Questions about the "Switch Off" Technology 

5.1. What should a program with a "switch" look like? What is its optimal structure: 

a) Proprietary software (which blocks and functions should be closed from any distribution?) 

b) Closed/open API (what functions can MIRI or other laboratories provide, but with the ability to switch off at any moment, for example for enterprises like OpenAI?) 

c) Open source with constant updates (open libraries, but ones that require daily updates, creating the possibility of remotely disabling research code) 

d) Open code (there is an assumption that with open code there is less chance that AI will come into conflict with other AIs, or AI users with other AI users; open code can provide additional chances that an equilibrium between the different actors will be found and they will not mutually annihilate each other, because they can better predict each other's behavior)

5.2. The possibility of using multi-signatures and other methods. 

How should the button work? Should the button and its device be open information? Should it use a different code structure, a different language, or analogous technologies?

Are there advantages or disadvantages to shutdown buttons? Are there recommendations, such as "at least one out of N pressed"? Which scheme seems the most sustainable?
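One concrete reading of the multi-signature idea in 5.2 is a k-of-N threshold: the system halts only when enough independent keyholders sign a shutdown request. A minimal sketch, assuming hypothetical keyholder names and an HMAC stand-in for real cryptographic signatures (this is an illustration, not any real MIRI or industry design):

```python
import hashlib
import hmac

# Hypothetical keyholders; in a real deployment these would be
# hardware-secured keys held by independent parties.
SECRET_KEYS = {f"keyholder_{i}": f"secret_{i}".encode() for i in range(5)}
THRESHOLD = 3  # "at least k of N pressed": here 3 of 5

def sign(keyholder: str, message: bytes) -> bytes:
    """A keyholder signs the shutdown message (HMAC stands in for real signatures)."""
    return hmac.new(SECRET_KEYS[keyholder], message, hashlib.sha256).digest()

def should_shut_down(message: bytes, signatures: dict) -> bool:
    """Trigger shutdown iff at least THRESHOLD distinct keyholders signed validly."""
    valid = sum(
        1
        for holder, sig in signatures.items()
        if holder in SECRET_KEYS and hmac.compare_digest(sig, sign(holder, message))
    )
    return valid >= THRESHOLD

msg = b"SHUTDOWN cluster-42"
sigs = {h: sign(h, msg) for h in ("keyholder_0", "keyholder_2", "keyholder_4")}
print(should_shut_down(msg, sigs))  # True: 3 of 5 valid signatures
```

The trade-off is the same as in multisig wallets: a higher threshold makes a malicious shutdown harder but an urgent legitimate one slower.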

5.3. Which method is the most effective?

6. Benefits and Approval 

6.1. What benefits will actors gain by following the recommendations? Leaders of most countries make decisions not only, and not so much, from their own perspective as from the irrational desires of their sources of power, built on dozens of other, usually non-contradictory but different, values.

6.2. Possible forms of approval and assistance in generating values. Help ecology activists defend against the energy crisis? (From my point of view, AI development will not take our atoms, but it will take our energy, water, sun, etc.)

6.3. Examples of large "switch off" projects for AI infrastructure with enough GPUs and electricity, analogous to nuclear power plants but for AI. If you imagine such plants, what would the control rods for the reaction be, how would you pull them out, and what "explosives" should be laid over which pits to dump it all into acid or some other method of safe destruction?

7.1. Questions of approval and material assistance for such enterprises. What are the advantages of developing such institutions under MIRI control compared to 

7.2. The hidden maintenance of gray areas in the international market. Why is maintaining the gray segment less profitable than cooperating with MIRI from the point of view of personal goals, freedom, local goals, and the like?

8. Trust and Bluff 

8.1. How can one be sure of the honesty of MIRI's statements — that it is not playing a double game, and that these are not just declarative goals without any real action? From my experience, I can say that neither in the poker bot cases nor in the theft of money using AI in the blockchain field did I get any feedback from the Future of Life Institute project. Going further, I did not even receive a single like on my reposts on Twitter. There were no automatic responses to emails, etc. And here I agree with Matthew Barnett that there is a problem with effectiveness.

What should be presented to the public? What help can be provided? Help with UI analytics? Help investigating specific cases of violations using AI?

For example, I have a consumer-protection problem where I need to raise half a million pounds against an AI that stole money through low-liquidity trading on Binance. How can I do this?

https://www.linkedin.com/posts/petr-andreev-841953198_crypto-and-ai-threat-summary-activity-7165511031920836608-K2nF?utm_source=share&utm_medium=member_desktop

https://www.linkedin.com/posts/petr-andreev-841953198_binances-changpeng-zhao-to-get-36-months-activity-7192633838877949952-3cmE?utm_source=share&utm_medium=member_desktop

I tried writing letters to the institute and to 80,000 Hours: zero responses.

The SEC, Binance, and a bunch of regulators at least reply "no license". Okay, no. But why does 80,000 Hours not respond at all? I do not understand.

8.2. Research in open-source technologies shows greater convergence of trust. Open-source programs can show greater convergence in cooperation due to the simpler idea of collaboration: the prisoner's dilemma can be solved not only through the past statistics of the other being but also through its open-to-collaboration structure. In any case, GAI will eventually appear; possibly, open monitoring of each other's systems will allow AI users not to annihilate each other.
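The intuition here — that agents able to inspect each other's code cooperate more easily than agents relying only on past statistics — can be sketched as a toy one-shot prisoner's dilemma (a crude simplification of the "program equilibrium" idea; payoffs and strategies are illustrative):

```python
# Toy one-shot prisoner's dilemma between "open-source" agents that can
# inspect each other's strategy before moving. Illustrative only.

PAYOFFS = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def always_defect(opponent_source):
    return "D"

def mirror(opponent_source):
    # Cooperate only with opponents whose visible source is identical
    # to mine -- a crude stand-in for "provably cooperates with me".
    return "C" if opponent_source is mirror else "D"

def play(a, b):
    move_a, move_b = a(b), b(a)  # each agent sees the other's code
    return PAYOFFS[(move_a, move_b)], PAYOFFS[(move_b, move_a)]

print(play(mirror, mirror))        # (3, 3): mutual cooperation
print(play(mirror, always_defect)) # (1, 1): mirror is not exploited
```

Transparent agents reach mutual cooperation without any interaction history, while remaining safe against defectors; with opaque agents, one-shot cooperation has no such foothold.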

8.3. Comparison with the game theory of the Soviet-Harvard school, and the need for steps towards security. Current game theory is largely built on duel-like representations, where damage to the opponent is an automatic victory, and many systems at the local level continue to think this way.

Therefore it is difficult for them to believe in the mutual benefit of systems: that it is about win-win cooperation and not empty talk or just a scam for the redistribution of influence and media manipulation.

9. AI Dangers 

9.1. What poses a greater danger: multiple AIs, two powerful AIs, or one actor with a powerful AI? 

9.2. Open-source developments in the blockchain field can be both safe and dangerous. Are there any reviews?

Here is a nice Ethereum Foundation list of articles:

https://docs.google.com/spreadsheets/d/1POtuj3DtF3A-uwm4MtKvwNYtnl_PW6DPUYj6x7yJUIs/edit#gid=1299175463

What do you think about "Open Problems in Cooperative AI", "Cooperative AI: machines must learn to find common ground", and similar articles?

9.3. Have you considered including the AI problem in the list of crimes under universal jurisdiction? https://en.wikipedia.org/wiki/Universal_jurisdiction

Currently there are no AI problems, or existential crimes against humanity in general, on that list. Perhaps it is worth joining forces with opponents of eugenics, eco-activists, and nuclear alarmists to jointly define and add crimes against existential safety (to prevent the irresponsible launch of projects that, with probabilities of 0.01%+, can cause severe catastrophes; humanity avoided the Oppenheimer risk with the hydrogen bomb, but not with Chernobyl, and we do not want giga-projects to keep accepting probabilities of human extinction while treating them with neglect for the sake of local goals).

In any case, giving such crimes universal-jurisdiction status can help in finding the "off" button for a project that has already been launched, by reaching the creators of the particular dangerous object. This category allows states or international organizations to claim criminal jurisdiction over an accused person regardless of where the alleged crime was committed, and regardless of the accused's nationality, country of residence, or any other relation to the prosecuting entity.

9.4. And further, the idea of licensing: forcing actors to go through a verification system on the one hand, and on the other, ensuring that any technology is refined and becomes publicly available.

https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBOVE:2024:2078

https://uitspraken.rechtspraak.nl/details?id=ECLI:NL:RBOVE:2024:2079

A license is very important to shield a business, its CEO, and colleagues from liability. Near-worldwide monopolist operators should work more closely to defend the rights of their average consumer in order to prevent increased regulation. Industries should establish direct contracts with professional actors in their fields in a B2B manner to avoid compliance risks with consumers.

An organisation such as MIRI could provide strong experts to audit AI companies for safety, especially those large enough to create existential risk, or, conversely, to handle penalties and refunds of the sums that people accidentally lose to attacks that common AI frameworks are too weak to stop. People need to see a simple demonstration of their defence against AI, and of help from MIRI, 80,000 Hours, and other effective altruists, especially against malicious AI users who are already misaligned and have obtained $100M+. That would be enough to create it in decentralized form, if not now then within the next 10 years.

10. Examples and Suggestions 

10.1. Analogy with the criminal case of Tornado Cash. In the Netherlands there was a trial against a software developer who created a system that allows decentralized, perfect, unstoppable crime. The ruling specifically records this person's responsibility for his violation of the world's financial laws. Please consider whether it can somehow be adapted for AI safety risks: where are the lines and red flags?

10.2. Proposals for games/novels. What are the current simple learning paths? In my time it was HPMOR -> lesswrong.ru -> lesswrong.com.

At present, Harry Potter is already outdated for the new generation. What are the modern games/stories about AI safety, and how should people be directed further? How about an analogue of Khan Academy for schoolchildren, or MIT courses on this topic?

Thank you for your attention. I would appreciate it if you could point out any mistakes I have made and provide answers to any questions. While I am not sure if I can offer a prize for the best answer, I am willing to donate $100 to an effective fund of your choice for the best engagement response.

I respect and admire all of you for the great work you do for the sustainability of humanity!

I'm assuming you are already familiar with some basics, and already know what 'orthogonality' and 'instrumental convergence' are and why they're true.

Isn't it?

Key Problem Areas in AI Safety:

  1. Orthogonality: The orthogonality problem posits that goals and intelligence are not necessarily related. A system with any level of intelligence can pursue arbitrary goals, which may be unsafe for humans. This is why it’s crucial to carefully program AI’s goals to align with ethical and safety standards. Ignoring this problem may lead to AI systems acting harmfully toward humanity, even if they are highly intelligent.
  2. Instrumental Convergence: Instrumental convergence refers to the phenomenon where, regardless of a system's final goals, certain intermediate objectives (such as self-preservation or resource accumulation) become common for all AI systems. This can lead to unpredictable outcomes as AI will use any means to achieve its goals, disregarding harmful consequences for humans and society. This threat requires urgent attention from both lawmakers and developers.
  3. Lack of Attention to Critical Concepts: At the AI summit in Amsterdam (October 9-11), concepts like instrumental convergence and orthogonality were absent from discussions, raising concern. These fundamental ideas remain largely out of the conversation, not only at such events but also in more formal documents, such as the vetoed SB 1047 bill. This may be due to insufficient awareness or understanding of the seriousness of the issue among developers and lawmakers.
  4. Analysis of Past Catastrophes: To better understand and predict future AI-related disasters, it is crucial to analyze past catastrophes and the failures in predicting them. By using principles like orthogonality and instrumental convergence, we can provide a framework to explain why certain disasters occurred and how AI's misaligned goals or intermediate objectives may have led to harmful outcomes. This will not only help explain what happened but also serve as a foundation for preventing future crises.
  5. Need for Regulation and Law: One key takeaway is that AI regulation must incorporate core safety principles like orthogonality and instrumental convergence, so that future judges, policymakers, and developers can better understand the context of potential disasters and incidents. These principles will offer a clearer explanation of what went wrong, fostering more involvement from the broader community in addressing these issues. This would create a more solid legal framework for ensuring AI safety in the long term.
  6. Enhancing Engagement in Effective Altruism: Including these principles in AI safety laws and discussions can also promote greater engagement and adaptability within the effective altruism movement. By integrating the understanding of how past catastrophes might have been prevented and linking them to the key principles of orthogonality and instrumental convergence, we can inspire a more proactive and involved community, better equipped to contribute to AI safety and long-term ethical considerations.
  7. Role of Quantum Technologies in AI: The use of quantum technologies in AI, such as in electricity systems and other critical infrastructure, adds a new layer of complexity to predicting AI behavior. Traditional economic models and classical game theory may not be precise enough to ensure AI safety in these systems, necessitating the implementation of probabilistic methods and quantum game theory. This could offer a more flexible and adaptive approach to AI safety, capable of handling vulnerabilities and unpredictable threats like zero-day exploits.
  8. Rising Discrimination in Large Language Models (LLMs): At the Amsterdam summit, the "Teens in AI" project demonstrated that large language models (LLMs) tend to exhibit bias as they are trained on data that reflects structural social problems. This raises concerns about the types of "instrumental convergence monsters" that could emerge from such systems, potentially leading to a significant rise in discrimination in the future.
  9. Conclusion:

    To effectively manage AI safety, legal acts and regulations must include fundamental principles like orthogonality and instrumental convergence. These principles should be written into legislation to guide lawyers, policymakers, and developers. Moreover, analyzing past disasters using these principles can help explain and prevent future incidents, while fostering more engagement from the effective altruism movement. Without these foundations, attempts to regulate AI may result in merely superficial "false care," incapable of preventing catastrophes or ensuring long-term safety for humanity.

 

Looks like we will see a lot of instrumental convergence and orthogonality disasters, doesn't it?

'Always Look on the Bright Side of Life'  

Life is like playing Diablo on hardcore mode: you can read all the guides, create the perfect build, and find ideal companions, only to die because the internet disconnects.

Playing on hardcore is exciting: each game tells the story of how these characters will meet their end.

'Always Look on the Bright Side of Death' - Monty Python

Do you know of any interesting camps in Europe about HPMOR or something similar? My 11-year-old daughter asked where her letter to Hogwarts is. She started reading the book and asked why nobody has made a film of this great fanfic.

Do you have any ideas for good children's education camps, in Europe or elsewhere?

Version 1 (adapted):

Thank you, shminux, for bringing up this important topic, and to all the other members of this forum for their contributions.

I hope that our discussions here will help raise awareness about the potential risks of AI and prevent any negative outcomes. It's crucial to recognize that the human brain's positivity bias may not always serve us well when it comes to handling powerful AI technologies.

Based on your comments, it seems like some AI projects could be perceived as potentially dangerous, similar to how snakes or spiders are instinctively seen as threats due to our primate nature. Perhaps, implementing warning systems or detection-behavior mechanisms in AI projects could be beneficial to ensure safety.

In addition to discussing risks, it's also important to focus on positive projects that can contribute to a better future for humanity. Are there any lesser-known projects, such as improved AI behavior systems or initiatives like ZeroGPT, that we should explore?

Furthermore, what can individuals do to increase the likelihood of positive outcomes for mankind? Should we consider creating closed island ecosystems with the best minds in AI, as Eliezer has suggested? If so, what would be the requirements and implications of such places, including the need for special legislation?

I'm eager to hear your thoughts and insights on these matters. Let's work together to strive for a future that benefits all of humanity. Thank you for your input!

Version 0:

Thank you, shminux, for this topic. And the other gentlemen for this forum!

I hope I will not die with AI in a lulz manner after this comment) The human brain needs to be positive. Without this it cannot work well.

According to your text, it looks like any open AI project's buttons should look like a SNAKE or a SPIDER, at least to warn the user at the gene level that there is something dangerous in it. 

 

You already know many things about primate nature. So all you need is to use it to get what you want

 

We have the last mind journey of humankind's brains: win a GOOD future or take the loss!

 

What other GOOD projects could we focus on?

What projects have already been done but no one knows about them? Better AI detect-behaviour systems? ZeroGPT? 

What should people do to raise the probability of good scenarios for mankind? 

 

Should we make closed island ecosystems with the best minds in AI, as Eliezer said in the Bankless YouTube video, or not? 

What are the requirements for such places? Because then we would need to create special legislation for such semi-independent places. It's possible, but talking with governments is hard work. Do you REALLY need it? Or were those just emotional words from Eliezer?

 

Thank you for answers!

I guess we need to maximize each of the different good possible outcomes.

For example, to raise the probability of "many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful", humans could

prohibit all autonomous AGI use, 

especially by those that use uncontrolled clusters of graphics processors in autocracies without international AI-safety supervisors like Eliezer Yudkowsky, Nick Bostrom, or their crew.

This, plus restrictions on weak API systems and the requirement to use human operators,

would create natural borders to AI scalability, so an AGI would find it more fruitful to mimic and reach consensus with people and other AGIs, at least using humans as operators who work under AGI advice, or making humanlike personas that are simpler for working with human culture and other people.

Detection systems often use categorisation principles, 

so even if an AGI breaks some rules, without scalability it could function longer without danger, because security systems (which are also, in a sense, tech officers with AI) could not find and destroy it. 

This could create conditions that encourage the diversity and uniqueness of different AGIs,

so all neural beings (AGIs, people with AI) could win some time to find new balances for using the atoms of the multiverse.

More borders mean more time and longer life for every human; even a win of two seconds for each of 8 billion people is worth it.

More chances that the different factions will find some kind of balance: AGI, people with AGI, people under AGI, other fractions.

I remember autonomous poker AIs destroying weak ecosystems one by one, but now that industry is in sustainable growth with separate actors, each of them using AI but in very different manners.

The more separate systems there are, the more chances that, while destroying them one by one, at some point AGI will find a way to function without destroying its environment.

 

PS. A separate way: send spaceships carrying a prohibition on AGI (maybe only with life, no apes) as far as possible, so that when AGI happens on Earth, it couldn't get all of them)
