I have previously written that I don't think Formal Proof is a good roadmap for AI alignment.
However, this framing seems unreasonably strong.

So here is a list of 20 problems related to alignment that a godlike proof oracle would be useful for.
For the purpose of this exercise, assume that you have an oracle that takes as input a math problem stated with about the level of rigor of a Math Olympiad problem, converts it into a formal specification, creates a formal proof, and then generates a human readable summary of that proof.
Notice that if you have such an oracle you can also ask it questions of the form "find X such that X has property Y"
1. Formal specification and proof of a quantum-resistant encryption scheme
Being able to encrypt messages knowing that no AI in the future will be able to read them is useful e.g. for avoiding Roko's Basilisk concerns
2. Defining Programs whose effects are limited in space and time
Suppose you have a superhuman AI and you don't want to unbox it, but you do want to use it. For example, you might ask it to build a nanobot that converts a certain amount of matter into diamond but not all matter.
3. Solve the alignment problem
Certain specifications of the alignment problem are computationally intractable but easy to specify.
for example
An AI is aligned, if after turning it on, a simulation of all living humans giving infinite time and resources to study the world created by the AI without interference would agree that it is aligned
4. Acquire 6 Million dollars and donate it to pro-alignment causes
Duh.
5. Determine whether the many-worlds hypothesis is true
...by solving the Grand Unified Theory in Physics.
If Many Worlds is false (e.g. because ER=EPR or Objective Waveform collapse is true), then certain alignment strategies (e.g. bargaining with other timelines) are no longer feasible.
6. Prove the existence of free-will (or not)
...by proving one of
- no-cloning implies that an agent's actions cannot be knowing in advance
- irreducible complexity implies an agent cannot be predicted without simulating it
7. Create a text-to-image model that provably does not reproduce any of the images in its training set
Seems useful to make people like this shut up.
8. Human Whole Brain Emulation
Take a person, put them in a VR environment, record their actions and then solve the problem:
find a neural network of with minimum complexity+error which predicts these actions
9. Proof (or disprove) the existence of God
Suppose a God of the Gaps exists. Or that God answers prayer. With a perfect model of physics this should be statistically detectable in terms of violations of natural law.
10. Prove that the Chinchilla Scaling Laws are correct (or not)
The question of whether or not this is the optimal text-prediction accuracy for a given amount of compute and data.
11. Prove a version of the Natural Abstraction Hypothesis is true (or false)
This hypothesis is currently lies on my "most likely path" for AI Alignment.
12. Create a chatbot that is provably "helpful, harmless and honest"
People seem that this would be useful. Or maybe not.
13. Prove whether the universe is deterministic (or not)
Suppose the universe is deterministic (and started from a low-entropy initialization point) then it should be possible to record a few hours of static anywhere in the universe and then find a low-Kolmogorov-complexity algorithm that outputs that static.
14. Prove the Collatz Conjecture
Per Erdos, "Mathematics may not be ready for such problems", which implies a proof would yield substantial new breakthroughs in math. Some of these may turn out to be useful for Alignment
15. Execute the Long Reflection
Given an emulation of a human brain, determine how they would solve the Alignment Problem if given 1m years to think about the problem.
16. Solve the Moral Mazes Problem
Determine what social structure optimally avoids the principal agent problems arising in large organizations.
17. TPU Design
Given a formal design language for computer chips, find the TPU Design which optimally achieves a goal (e.g GPT-3 accuracy at next character prediction).
Note: even if you don't plan to use such a TPU (because acceleration bad) knowing what it is seems important. Is it 10x faster 10,000x ?
18. Solve the Diamond Alignment Problem
Should be easier than the Alignment Problem to formally specify.
19. Stock Market Arbitrage
Find the stock trading algorithm that statistically maximizes returns when back tested against financial data.
20. Generate More questions for this List
Generate the lowest kolmogorov complexity program which outputs this list and ask it what question 21 would be.
It’s neither. I’m claiming you need more than just a theorem prover’s mental skills to accomplish the task. If you think you can formulate the specification in formal mathematical logic, so that you do only need a theorem prover to accomplish the task (and also a list of possible programs), then go ahead. I’d be excited if you succeeded.