All of Bleys's Comments + Replies

Bleys*92

On further reflection, I'd tentatively propose something along these lines as an additional measure:

As I've now seen others suggest, trigger limits determined only as a percentage of the state of the art's performance.

This could be implemented as a proposal to give a government agency the power to work as the overseer and final arbiter of deciding, once per year for the following year (and ad-hoc on an emergency basis), the metrics and threshold percentages of indexing what is determined state of the art.
This would be done in consultation with representati...

Bleys1514

Already, there are dozens of fine-tuned Llama 2 models scoring above 70 on MMLU. They are laughably far from being threats. This does seem like an exceptionally low bar. GPT-4, given the right prompt crafting and adjusting for errors in MMLU, has recently been shown to be capable of 89 on MMLU. It would not be surprising for Llama models to achieve >80 on MMLU in the next six months.

I think focusing on a benchmark like MMLU is not the right approach, and will be very quickly outmoded. If we look at the other criteria (which, as you propose it now, any and all are a ...

1[anonymous]
I agree that benchmarks might not be the right criteria, but training cost isn't the right metric either IMO, since compute and algorithmic improvement will be bringing these costs down every year. Instead, I would propose an effective compute threshold, i.e. number of FLOP while accounting for algorithmic improvements.
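The effective-compute idea above can be sketched as a small calculation: raw training FLOP scaled up by an assumed rate of algorithmic progress, then compared against a fixed threshold. The 2.5x-per-year efficiency gain and the 1e25 FLOP threshold below are illustrative assumptions, not values anyone in the thread proposed.

```python
# Hypothetical sketch of an "effective compute" threshold: raw training FLOP
# adjusted for algorithmic efficiency gains that compound yearly.
# Both the 2.5x/year rate and the 1e25 FLOP cutoff are made-up placeholders.

def effective_compute(raw_flop: float, years_since_baseline: float,
                      efficiency_gain_per_year: float = 2.5) -> float:
    """Scale raw training FLOP by assumed algorithmic progress since a baseline year."""
    return raw_flop * (efficiency_gain_per_year ** years_since_baseline)

def triggers_oversight(raw_flop: float, years_since_baseline: float,
                       threshold_flop: float = 1e25) -> bool:
    """True if the run's effective compute crosses the regulatory threshold."""
    return effective_compute(raw_flop, years_since_baseline) >= threshold_flop

# A 1e24 raw-FLOP run three years after the baseline counts as
# 1e24 * 2.5**3 ≈ 1.56e25 effective FLOP, crossing the illustrative cutoff.
print(triggers_oversight(1e24, 3))  # True
```

The point of the adjustment is that the same fixed raw-FLOP threshold captures less capability every year, whereas an effective-compute threshold tightens automatically as algorithms improve.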
9Bleys
On further reflection, I'd tentatively propose something along these lines as an additional measure:

As I've now seen others suggest, trigger limits determined only as a percentage of the state of the art's performance.

This could be implemented as a proposal to give a government agency the power to work as the overseer and final arbiter of deciding, once per year for the following year (and ad-hoc on an emergency basis), the metrics and threshold percentages of indexing what is determined state of the art. This would be done in consultation with representatives from each of the big AI labs (as determined by, e.g., having invested >$100M in AI compute), and including broader public, academic, and open source AI community feedback, but ultimately decided by the agency.

The power could also be reserved for the agency to determine that specific model capabilities, if well defined and clearly measurable, could be listed as automatically triggering regulation. This very clearly makes the regulation target the true "frontier AI" while leaving others out of the collateral crosshairs.

I say tentatively, as an immediate need for any sort of specific model-capability-level regulation to prevent existential risk is not remotely apparent with the current architectures for models (autoregressive LLMs). I see the potential in the future for risk, but pending major breakthroughs in architecture. Existing models, and the immediately coming generation, are trivially knowable as non-threatening at an existential level. Why? They are incapable of objective-driven actions and planning. The worst that can be done is within the narrow span of agent-like actions that can be covered via extensive and deliberate programmatic connection of LLMs into heavily engineered systems. Any harms that might result would be at worst within a narrow scope that's either tangential to the intended actions, or deliberate human intent that's likely covered within existing criminal frameworks. The worst...
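The percentage-of-state-of-the-art trigger described above can be sketched in a few lines. The metric names, scores, and 90% threshold are all hypothetical placeholders; in the proposal, the agency would publish the real indexed metrics and percentages annually.

```python
# Minimal sketch of a percentage-of-SOTA regulatory trigger.
# SOTA scores, metric names, and the 90% threshold are hypothetical examples.

SOTA = {"mmlu": 89.0, "humaneval": 90.0}  # agency-published state-of-the-art scores
THRESHOLD = 0.90                          # fraction of SOTA that triggers review

def triggers_regulation(model_scores: dict[str, float]) -> bool:
    """A model triggers oversight if it reaches the threshold on ANY indexed metric."""
    return any(score >= THRESHOLD * SOTA[metric]
               for metric, score in model_scores.items() if metric in SOTA)

print(triggers_regulation({"mmlu": 70.0}))  # False: 70 < 0.9 * 89 = 80.1
print(triggers_regulation({"mmlu": 81.0}))  # True: 81 >= 80.1
```

Because the trigger is indexed to whatever the current frontier scores, it moves up as the state of the art advances, unlike a fixed benchmark cutoff like "MMLU > 70", which dozens of open models already clear.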
Bleys32

If you are specifically trying to just ensure that all big AI labs are under common oversight, the most direct way is via compute budget. E.g., any organization with a compute budget >$100M allocated for AI research. That would capture all the big labs. (OpenAI spent >$400M on compute in 2022 alone.)

No need to complicate it with anything else.

Bleys00

If I remember correctly, Eliezer's old site had some references to this. Wasn't this a common invocation on SL4?

Bleys00

Yes! Or even further, "I am now focusing my life on risk reduction and have significantly reduced akrasia in all facets of my life."

6AlexU
This sounds an awful lot like one of the examples I gave above. Ok, so you're focused on "risk reduction" and "reducing akrasia." So what's that mean? You've decided to buckle up, wear sunscreen, and not be so lazy? Can't I get that from Reader's Digest or my mom?
Bleys90

Worth noting: it's givewell.net. givewell.com links to a Visa card program; givewell.net is the site that aims to answer "Where should I donate?"