slg · 30

I think it'd be good to cross-post this on the EA Forum.

edit: It's been posted, link here: https://forum.effectivealtruism.org/posts/zLkdQRFBeyyMLKoNj/still-no-strong-evidence-that-llms-increase-bioterrorism

slg · 65

This post reads like it wants to convince its readers that AGI is near and will spell doom, selecting and presenting arguments in a biased way.

Many people on the Forum and LW (including myself) believe that AI Safety is very important and isn't given enough attention by important actors, but that's no reason to lower our standards for good arguments in favor of more AI Safety.

Some parts of the post that I find lacking:

 "We don’t have any obstacle left in mind that we don’t expect to get overcome in more than 6 months after efforts are invested to take it down."

I don't think more than 1/3 of ML researchers or engineers at DeepMind, OpenAI, or Anthropic would sign this statement.

"No one knows how to predict AI capabilities."

Many people are trying, though (Ajeya Cotra, EpochAI), and I don't think these efforts are worthless. A better statement might be: "New AI capabilities appear discontinuously, and we have a hard time predicting such jumps. Given this uncertainty, we should worry more about unexpected and potentially dangerous capability increases."

"RLHF and Fine-Tuning have not worked well so far."

Setting aside whether RLHF scales (as linked, Jan Leike of OpenAI doesn't think so) and whether it leads to deception: from my cursory reading and experience, ChatGPT shows substantially better behavior than Bing, which might be because the latter doesn't use RLHF.


Overall, I do agree with the article and think that recent developments have been worrying. Still, if the goal of the article is to get independently-thinking individuals to consider working on AI Safety, I'd prefer less extreme arguments.