Joel Ye — LessWrong

Thanks for the post; it's neat to see that fields and terminology already exist for these questions.

I have two questions, asked in the hope of using this type of analysis in my own work to study a lack of transfer between two distinct datasets, A and B. (I see this is in your future work?)

1. Where does OOD data, or data that is implausible for the model, project?

2. For more complex data, might we expect the MSP to show up most clearly somewhere other than the final layer?

Re: transfer, my hypothesis is that, having trained on both A and B, we might see during inference that heldout data from A rapidly becomes easily identifiable as A, so it stands to reason that there is little to gain from any of B's features. Alternatively, is there a more optimistic test for whether we might see transfer between A and B, one we could run prior to training on B: could we tell that a sample from B is extremely unlikely or OOD, via raw likelihood or a misbehaving MSP?
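
To make the raw-likelihood version of that test concrete, here is a minimal sketch of what I have in mind. It assumes the model trained on A exposes per-token logits; the function names (`sequence_nll`, `ood_fraction`) and the 0.95 quantile threshold are hypothetical choices for illustration, not anything taken from the post.

```python
import numpy as np

def sequence_nll(logits, targets):
    """Mean per-token negative log-likelihood of one sequence.

    logits: (T, V) array of next-token logits from a model trained on A.
    targets: (T,) integer array of the tokens that actually occurred.
    """
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets)
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

def ood_fraction(nll_a_heldout, nll_b, quantile=0.95):
    """Fraction of B samples whose NLL exceeds a quantile of A's heldout NLL.

    The quantile is an arbitrary threshold (an assumption), not a tuned value.
    """
    threshold = np.quantile(np.asarray(nll_a_heldout, dtype=float), quantile)
    return float((np.asarray(nll_b, dtype=float) > threshold).mean())
```

A fraction near 1 would suggest that the A-trained model treats B as implausible; whether that actually predicts a lack of transfer is exactly the open question.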
