HamletHenna comments on AI reflection problem - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (26)
It's nice that more people are starting to think about AI-related logic issues. But I don't understand this part:
Why is it okay? It seems to me that if the AI can prove ∀t. ∀P. [[t]]P → P, then it can prove that ∀P. (∃t. [[t]]P) → P and blow up by Löb's Theorem again. Like this:
Assume that ∃t. [[t]]P
Apply your statement and obtain ∃t. P
Note that t is unused and obtain P
Also, here's Wei Dai's solution from the original SL4 thread, which seems correct to me:
I agree with Wei Dai's solution, though somehow I didn't see it before. The formula I wrote was incorrect. Do you know why Eliezer is still talking about the problem seven years after Wei Dai provided the solution? Or more specifically, why he wants to have A1 ⊢ ∀P . ◻(A2⊢P) → P instead of A1 ⊢ ∀P . (A2⊢P) → P ?
I've rewritten the post, giving credit to Wei Dai, and expanding on my contribution. Does that look okay?
Your rewrite seems to have the same error as the previous version. A correct formalization of Wei's solution could look like this:
A1 ⊢ ∀P. (A1 ⊢ P) ↔ (A2 ⊢ P)
If A1=A2, that equation collapses into a tautology, as it should.