Summary: From the assumption of the existence of AIs that can pass the Strong Form of the Turing Test, we can provide a recipe for provably aligned/friendly superintelligence based on large organizations of human-equivalent AIs.

> Turing Test (Strong Form): for any human H there exists a thinking machine m(H)...
In some recent discussions I have realized that there is quite a nasty implied disagreement about whether AI alignment is a functional property or not — that is, whether your personal definition of an AI being "aligned" is purely a function of its input/output behavior, irrespective of what kind...
As of EoY 2022, MIRI has 11 people on payroll, assets of about $20M, and a lot of mindshare. Its mission is stated as follows on the most recent tax filing I can find:

> "To ensure that the creation of smarter-than-human intelligence has a positive impact. Thus, the charitable...
LLMs have almost completely negated the original reasons people had to believe in “AI Risk”
What actually bad outcome has "ethics-based" AI Alignment prevented in the present or near-past? By "ethics-based" AI Alignment I mean optimization directed at LLM-derived AIs that intends to make them safer, more ethical, harmless, etc. Not future AIs, but AIs that already exist. What bad thing would have happened if they...
"We ideally want to move reality closer to the efficient frontier of personal utopia production."
"Computers can add numbers much more accurately than humans. They can draw better pictures than humans. They can play better chess. See the pattern? Well, AIs will soon be able to generate desired outcomes for society better than humans can. I feel that the AI Alignment discourse has become somewhat...