Stuart_Armstrong comments on A toy model of the control problem - Less Wrong
That still involves training it with no negative-feedback error term for excess blocks (a term that would overwhelm a mere 0.1% uncertainty).
This is supposed to be a toy model of excessive simplicity. Do you have suggestions for improving it (for purposes of presenting to others)?
Maybe explain how it works while being configured, and then stops working once B gets a better model of the situation or runs more rounds of trial and error?
Ok.