Epistemic status: These are first positive results. I have not yet run extensive tests to verify repeatability, so take them with a grain of salt. This post is meant to disseminate early results and collect ideas for further experiments to concretise these findings.
Tldr:
I study whether LLMs understand their training data and can use that understanding to make inferences about later training data. Specifically, I measure whether LLMs can infer which declarative facts in their training data are relevant to the current context and then leverage them. I show that finetuning LLMs on declarative data describing different personas reduces the number of iterative finetuning steps (a proxy for reinforcement learning) required to display...
They define incoherence as the fraction of error explained by variance rather than bias, and find that on more complex tasks a larger proportion of errors is incoherent, i.e. caused by variance rather than bias.
But isn't this trivially obvious? On more complex tasks, models (and humans, monkeys, etc.) make more mistakes. So unless coherent misalignment (bias) also increases with task complexity, i.e. models take more coherently misaligned actions on harder tasks, the proportion of error caused by mistakes (variance) will increase.
Mistakes increase simply because task complexity increases, and there is no reason to expect coherent misalignment to increase with task complexity as well. Therefore, their measure of incoherence will increase with task complexity.
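To make the arithmetic concrete, here is a minimal toy sketch (assuming incoherence is formalised as the variance share of a squared-error decomposition; the paper's exact definition may differ in detail): if bias is held fixed while variance grows with task complexity, the incoherence fraction mechanically rises toward 1.

```python
import numpy as np

# Toy sketch: why the variance share of error rises with task complexity
# even when "coherent misalignment" (bias) stays fixed.
# Assumes incoherence = variance / (bias^2 + variance), i.e. the fraction
# of expected squared error not explained by a systematic deviation.

rng = np.random.default_rng(0)

bias = 1.0  # fixed coherent deviation from the aligned action (0), independent of complexity

for complexity in [1, 2, 4, 8]:
    # Harder tasks -> noisier behaviour -> more "mistakes" (variance).
    noise_std = 0.5 * complexity
    actions = bias + rng.normal(0.0, noise_std, size=100_000)

    mse = np.mean(actions**2)       # expected squared error around the aligned action (0)
    variance = np.var(actions)      # error from inconsistency (incoherent)
    bias_sq = np.mean(actions)**2   # error from systematic deviation (coherent)

    incoherence = variance / mse    # variance share of total error
    print(f"complexity={complexity}: bias^2={bias_sq:.2f}, "
          f"variance={variance:.2f}, incoherence={incoherence:.2f}")
```

With bias fixed at 1, the incoherence fraction climbs from roughly 0.2 at complexity 1 to roughly 0.94 at complexity 8, purely because the noise term grows.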