I had an opportunity to ask someone from one of the mentioned labs about their plans to use external evaluators, and they said something along the lines of:
“External evaluators are very slow - we are just far better at eliciting capabilities from our models.”
Earlier, they had said much the same when I asked whether they’d been surprised by anything people had used deployed LLMs for so far ‘in the wild’: essentially no, not really, and maybe even a bit underwhelmed.