All of Tor Barstad's Comments + Replies

Tor Barstad*Ω1101

This feels to me very much unlike how I would go about getting corrigibility.

It is hard to summarize how I would go about things, because there would be lots of steps, and lots of processes that are iterative.

Prior to plausible AGI/FOOM I would box it in really carefully, and I would only interact with it in ways where its expressivity is severely restricted.

I would set up a "council" of AGI-systems (a system of systems), and when giving it requests in an oracle/genie-like manner I would see if the answers converged. At first it would be the initial AGI-syste...

4Pattern
I like this idea. That said, if things don't converge, i.e. there is disagreement, the disagreement itself could help identify what information is needed to proceed, or to reason further and more efficiently.
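To make the convergence check discussed in this thread a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Oracle` type, `ask_council`, `CouncilResult`, and the `min_agreement` threshold are hypothetical names, each council member is modelled as a plain function from a request string to an answer string, and answers are compared by exact string equality, which is a big simplification. It is not meant as the actual proposal, only as one way the "see if the answers converged" step and the "disagreement is information" point might look.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

# Hypothetical stand-in for one boxed system: maps a request to an answer string.
Oracle = Callable[[str], str]


@dataclass
class CouncilResult:
    converged: bool          # True if enough members gave the same answer
    answer: Optional[str]    # the agreed answer, or None if there was no convergence
    tally: Dict[str, int]    # every distinct answer and how many members gave it


def ask_council(
    oracles: Sequence[Oracle],
    request: str,
    min_agreement: float = 1.0,
) -> CouncilResult:
    """Ask every member the same request and check whether the answers converge.

    min_agreement is the fraction of members that must give the same answer
    (1.0 means unanimity). Answers are compared by exact equality, which is an
    assumption of this sketch.
    """
    if not oracles:
        raise ValueError("council needs at least one oracle")

    answers = [oracle(request) for oracle in oracles]
    tally = Counter(answers)
    top_answer, top_count = tally.most_common(1)[0]
    converged = top_count / len(answers) >= min_agreement

    # On disagreement, no answer is released, but the full tally is kept so the
    # disagreement can be inspected as information in its own right.
    return CouncilResult(converged, top_answer if converged else None, dict(tally))


if __name__ == "__main__":
    # Toy council: two members agree, one dissents.
    council = [lambda q: "42", lambda q: "42", lambda q: "41"]
    result = ask_council(council, "What is 6 * 7?")
    print(result.converged)  # unanimity is required by default
    print(result.tally)      # shows which answers were given and how often
```

Returning the tally rather than just a yes/no is the design choice that matches the reply above: when the members disagree, the pattern of disagreement is kept around instead of being thrown away.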