Part 1 was previously posted and it seemed that people liked it, so I figured that I should post part 2 - http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html
There's a story about a card-writing AI named Tully that really clarified the problem of FAI for me (I'd elaborate, but I don't want to ruin it).
No problem, pinyaka.
I don't understand very much about mathematics, computer science, or programming, so I think that, for the most part, I've expressed myself in natural language to the greatest extent that I can. I'm encouraged that about an hour and a half before my previous reply, DefectiveAlgorithm made the exact same argument that I did, albeit more briefly. It discourages me that he tabooed 'values' and you immediately used the word anyway. In case you did decide to reply, I wrote a Python-esque pseudocode example of my conception of the very high-level source code of an AGI with an arbitrary terminal value. With little technical background, my understanding is very high level, with lots of black boxes. I encourage you to do the same, so that we may compare. I would prefer that you write yours before I give you mine, so that you are not anchored by my example. This way you are forced to conceive of the AI as a program and do away with ambiguous wording. What do you say?
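(To show the level of abstraction I mean, without giving away my actual example: something like the minimal sketch below, where everything important is a black box and every name is a hypothetical placeholder, not any particular design.)

```python
# Purely illustrative sketch: an agent loop where the terminal value is just a
# black-box scoring function over predicted world states. All names are
# hypothetical placeholders.

def terminal_value(world_state):
    """Black box: scores a predicted world state against some arbitrary goal."""
    raise NotImplementedError

def predict(world_model, state, action):
    """Black box: predicts the world state that follows from taking `action`."""
    raise NotImplementedError

def execute(action):
    """Black box: acts in the world and returns the observed new state."""
    raise NotImplementedError

def agi_loop(world_model, initial_state, possible_actions):
    state = initial_state
    while True:
        # Pick whichever action leads to the predicted state that the terminal
        # value scores highest; the terminal value itself is never re-examined.
        action = max(possible_actions,
                     key=lambda a: terminal_value(predict(world_model, state, a)))
        state = execute(action)
```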
I've asked Nornagest to provide links or further reading on the value stability problem. I don't know enough about it to say anything meaningful. I thought that wireheading scenarios were only a problem for AIs whose values were loaded with reinforcement learning.
On this at least we agree.
From what I understand, even if you're biased, it's not a bad assumption. To my knowledge, in scenarios where AGIs have their values loaded with reinforcement learning, the AGIs are usually given the terminal goal of maximizing the time-discounted integral of their future reward signal. So the designers 'bias' the AGI in the way that you may be biased: maybe so that it 'cares' more about the rewards its handlers give it now than about the far greater far-future rewards it could stand to gain from wireheading itself? I don't know. My brain is tired. My question looks wrong to me.
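(To make "time-discounted integral of future reward" concrete, here's a small sketch assuming the usual exponential discount factor gamma; the numbers are made up purely for illustration.)

```python
# Discounted return: G = sum over t of gamma**t * r_t.
# With gamma < 1, rewards far in the future count for less, which is one
# (speculative) lever against a "wait a long time, then wirehead forever" plan.

def discounted_return(rewards, gamma=0.9):
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Made-up numbers: modest handler-given rewards starting now...
near_term = discounted_return([1.0] * 10)          # ~6.5
# ...versus a huge payoff that only begins after a long delay.
delayed_jackpot = discounted_return([0.0] * 50 + [100.0] * 10)  # ~3.4

print(near_term, delayed_jackpot)  # the near-term stream wins under this gamma
```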
In fairness, I only used it to describe how they'd come to be used in this context in the first place, not to try to continue with my point.
I've never done something l...