x

LESSWRONG
LW

KanHar — LessWrong

KanHar

KanHar

Message

6

1

8y

KanHar

KanHar has not written any posts yet.

Message

6 karma

1 comment

Member for 8 years

Replying toModels Don't "Get Reward"

Models Don't "Get Reward"

Considering the fact that weight sharing between actor and critic networks is a common practice, and given the fact that the critic passes gradients (learned from the reward) to the actor, for most practical purposes the actor gets all of the information it needs about the reward.

This is the case for many common architectures.

7

0