Dewey (2011) lays out the rules for one kind of agent with a mutable value system. The agent has some distribution over utility functions, which it has rules for updating based on its interaction history (where "interaction history" means the agent's observations and actions since its origin). To choose an...
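The mechanism described in the excerpt can be sketched in a few lines. This is only an illustration under assumed simplifications (finite action set, finite set of candidate utility functions, and a posterior over them already computed from the interaction history); the names `choose_action` and `posterior` are mine, not Dewey's.

```python
def choose_action(actions, posterior, history):
    """Pick the action maximizing expected utility under a posterior
    over candidate utility functions.

    posterior: dict mapping a utility function u(history, action) -> P(u | history)
    """
    def expected_value(a):
        # Average each candidate utility's verdict, weighted by its posterior.
        return sum(p * u(history, a) for u, p in posterior.items())
    return max(actions, key=expected_value)


# Toy usage: two candidate utility functions disagreeing about which
# action is best; the agent sides with the more probable one.
u_left = lambda h, a: 1.0 if a == "left" else 0.0
u_right = lambda h, a: 1.0 if a == "right" else 0.0
choice = choose_action(["left", "right"], {u_left: 0.7, u_right: 0.3}, history=[])
# -> "left"
```

As the agent's interaction history grows, the updating rules would reshape `posterior`, and with it the action the same maximization selects.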
Epistemic status: Trying to air out some thoughts for feedback, we'll see how successfully. May require some machine learning to make sense, and may require my level of ignorance to seem interesting. Many current proposals for value learning are garden-variety regression (or its close cousin, classification). The agent doing the...
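To make "garden-variety regression" concrete: a value-learning setup in this style would fit a reward model to human judgments exactly the way any regression fits labeled data. A minimal sketch, assuming a linear model and (feature vector, human rating) pairs; the function name and data are illustrative, not from any particular proposal.

```python
import numpy as np

def fit_reward_model(features, ratings):
    """Ordinary least-squares fit of a linear reward model to human ratings.

    features: list of feature vectors describing outcomes
    ratings:  list of scalar human judgments of those outcomes
    Returns the weight vector w such that features @ w approximates ratings.
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(ratings, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


# Toy usage: ratings generated by the weights (1, 2) are recovered exactly.
w = fit_reward_model([[1, 0], [0, 1], [1, 1]], [1.0, 2.0, 3.0])
# w is approximately [1.0, 2.0]
```

Everything that makes regression work or fail (distribution shift, label noise, model misspecification) then carries over directly to the value-learning setting.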
Discussion article for the meetup: Urbana-Champaign: Quorum for discourse WHEN: 06 September 2015, 2:00 PM (-0500) WHERE: Altgeld Hall, W. Green Street, Urbana, IL, 61801 Another year, another chance to come to a LW meetup. Find us at the scenic north entrance of Altgeld Hall. I'll bring delicious food. Depending...
Epistemic status: One part quotes (informative, accurate), one part speculation (not so accurate). One avenue towards AI safety is the construction of "moral AI" that is good at solving the problem of human preferences and values. Five FLI grants have recently been funded that pursue different lines of research on...
[This post borders on some well-trodden ground in information theory and machine learning, so ideas in this post have an above-average chance of having already been stated elsewhere, by professionals, better. EDIT: As it turns out, this is largely the case, under the subjects of the justifications for MML prediction...
One question I've had recently is "Are agents acting on selfish preferences doomed to having conflicts with other versions of themselves?" A major motivation of TDT and UDT was the ability to just do the right thing without having to be tied up with precommitments made by your past self...
I When preferences are selfless, anthropic problems are easily solved by a change of perspective. For example, if we do a Sleeping Beauty experiment for charity, all Sleeping Beauty has to do is follow the strategy that, from the charity's perspective, gets them the most money. This turns out to...