adsenanim comments on Rationality Lessons in the Game of Go - Less Wrong

40 Post author: GreenRoot 21 August 2010 02:33PM


Comments (145)


Comment author: adsenanim 26 August 2010 08:41:04AM *  0 points [-]

Sorry for the delay.

Let’s start a Fire.

The Fire requires 3 things: Air(A), Heat(H) and a Combustible(C) so that:

F == A+H+C.

We know that there are many true statements about F:

F == H+C+A

F == A+H+C

Etc.

Let’s say that these are also true:

F != A+A+A

F != B+A+A

Etc.

We also, because of trial and error, can enumerate the false statements, starting with:

F != A+H+C.

Etc.

Continuing with:

F == A+A+A

Etc.

Now this is where the flip-flop comes in:

The true and false of the basic circuit have an extraordinary number of combinations for the purposes of making fire.
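To make the size of that space concrete, here is a minimal sketch that enumerates every ordered three-part combination and sorts it into a true set or a false set. It assumes only the three ingredients A, H and C are in play (the B used above is left out because it is never defined), and the names `makes_fire`, `true_set` and `false_set` are made up for illustration.

```python
from itertools import product

INGREDIENTS = ("A", "H", "C")  # Air, Heat, Combustible


def makes_fire(combo):
    """Assumed rule: fire needs each of A, H and C at least once."""
    return set(combo) == set(INGREDIENTS)


# Every ordered way of picking three ingredients, repeats allowed.
candidates = list(product(INGREDIENTS, repeat=3))

# Sort each candidate statement "F == x+y+z" into true or false.
true_set = [c for c in candidates if makes_fire(c)]
false_set = [c for c in candidates if not makes_fire(c)]

print(len(candidates), len(true_set), len(false_set))  # 27 6 21
```

Even with only three ingredients, most candidate statements land in the false set, which is the partial picture a learner starts with.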

I came up with this idea not only because people learn games through both negative and positive reinforcement, but also because many times we only have a partial picture of the possible combinations for a win.

This is redoubled when we think of things in terms of arbitrary meanings such as air, heat and combustible.

Comment author: PhilGoetz 26 August 2010 10:44:09PM 1 point [-]

I still don't understand what the idea is.

Comment author: adsenanim 27 August 2010 02:37:50AM *  0 points [-]

The idea is this:

Not only that people can learn as much about a game from losing it as they can from winning it, but that they need to lose in order to learn how to win. The flip-flop acts as a helper in the process of trial and error.

The feedback caused by the wiring of the flip-flop's two NOR gates allows this because the switches are controlled by the true and false sets exclusively; one switch is always associated with the true statements and the other with the false.

When we start to learn, all possibilities are indeterminate; they can be either true or false. F == A+A+A is just as valid as F != A+H+C.
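For anyone who has not seen the circuit, here is a rough sketch of a flip-flop (an SR latch) built from two cross-coupled NOR gates. Treating a win as the "set" input and a loss as the "reset" input is only one assumed reading of the idea, and the function names are made up. The loop that updates both gates at once until the outputs stop changing is a simplification of how a real circuit settles.

```python
def nor(a, b):
    """NOR gate: output is 1 only when both inputs are 0."""
    return int(not (a or b))


def sr_latch(s, r, q=0, q_bar=1):
    """Settle two cross-coupled NOR gates and return (q, q_bar).

    s and r are the set and reset inputs; q and q_bar are the
    outputs remembered from the previous step.
    """
    for _ in range(4):  # a few passes are enough for these inputs
        new_q = nor(r, q_bar)      # gate 1 sees reset and the other output
        new_q_bar = nor(s, q)      # gate 2 sees set and the other output
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


state = sr_latch(s=1, r=0)             # assumed: a win sets the latch
held = sr_latch(s=0, r=0, q=state[0], q_bar=state[1])   # no input: it remembers
cleared = sr_latch(s=0, r=1, q=held[0], q_bar=held[1])  # assumed: a loss resets it
print(state, held, cleared)  # (1, 0) (1, 0) (0, 1)
```

The point of the feedback is the middle step: with both inputs off, the latch keeps whatever the last win or loss left behind.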

The flip-flop becomes sort of an ex post facto method of examining the data of the experience depending on win or loss. With a loss there can be mild sorting of possibilities, but the real sorting comes with comparing wins and losses.

Let me know if how I am representing this idea is too brief; it is still in its infancy, and as I have said elsewhere in my posts, I haven’t read everything.

Comment author: swapnil 15 April 2011 12:07:05AM 0 points [-]

"but that they need to lose in order to learn how to win."

Can't people learn from others' mistakes? What do you say?

Comment author: adsenanim 27 August 2010 03:40:10AM 0 points [-]