zeshen

Feedback welcomed: www.admonymous.co/zeshen

Comments

zeshen

Agreed. I'm also pleasantly surprised that your take isn't heavily downvoted.

zeshen

There’ll be discussions about how these systems will eventually become dangerous, and safety-concerned groups might even set up testing protocols (“safety evals”).

My impression is that safety evals were deemed irrelevant because a powerful enough AGI, being deceptively aligned, would pass all of them anyway. We didn't expect the first general-ish AIs to be so dumb, like how GPT-4 was so blatant and explicit about lying to the TaskRabbit worker.

zeshen

Scott Alexander talked about explicit honesty (unfortunately paywalled) in contrast with radical honesty. In short, explicit honesty is being completely honest when asked, while radical honesty is being completely honest even without being asked. From what I understand of your post, deep honesty is about being completely honest about information you perceive to be relevant to the receiver, regardless of whether that information is explicitly requested.

Scott also links to some cases where radical honesty did not work out well, like this, this, and this. I suspect deep honesty may lead to similar risks, as you have already pointed out. 

And with regards to:

“what is kind, true, and useful?”

I think they would form a three-circle Venn diagram. Things within the intersection of all three circles would be a no-brainer. But the tricky bits are the things that are either true but not kind/useful, or kind/useful but not true. And I understood this post as a suggestion to venture more into the former.

zeshen

Can't people decide simply not to build AGI/ASI?

Yeah, many people, like the majority of users on this forum, have decided to not build AGI. On the other hand, other people have decided to build AGI and are working hard towards it. 

Side note: LessWrong has a feature for publishing posts as Questions; you might want to use it for questions in the future.

zeshen

Definitely. Also, my incorrect and exaggerated model of the community is likely based on the minority who tend to express those comments publicly, even at people who might genuinely deserve them.

zeshen

I agree that RL agents are misaligned by default, even more so the non-imitation-learned ones. I mean, even LLMs trained on human-generated data are misaligned by default, regardless of which definition of 'alignment' is used. But even granting misalignment by default, I'm less convinced that their capabilities would grow fast enough to cause an existential catastrophe in the near term, if we use LLM capability improvement trends as a reference.

zeshen

Thanks for this post. This is generally how I feel as well, but my (exaggerated) model of the AI alignment community would immediately attack me by saying "if you don't find AI scary, you either don't understand the arguments on AI safety or you don't know how advanced AI has gotten". In my opinion, a few years ago we were concerned about recursively self-improving AIs, and that seemed genuinely plausible and scary. But somehow, they didn't really happen (or haven't happened yet) despite people trying all sorts of ways to make them happen. And instead of an intelligence explosion, what we got was an extremely predictable improvement trend that was a function of only two things: data and compute. This made me qualitatively update my p(doom) downwards, and I was genuinely surprised that many people went the other way, updating upwards as LLMs got better.
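To illustrate what I mean by "a function of only two things", here is a toy Python sketch (my own, not from the post) of a Chinchilla-style parametric scaling law, where predicted loss depends smoothly on just model size and training data; the constants are purely illustrative, not a real fit:

```python
# Toy sketch: loss as a smooth function of parameters N and tokens D.
# Constants are illustrative only, not fitted to any real model.
def predicted_loss(n_params: float, n_tokens: float,
                   e: float = 1.7, a: float = 400.0, b: float = 410.0,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    return e + a / n_params**alpha + b / n_tokens**beta

# Scaling up params and data gives a smooth, predictable improvement
# rather than a discontinuous jump.
for n, d in [(1e9, 2e10), (1e10, 2e11), (1e11, 2e12)]:
    print(f"N={n:.0e}, D={d:.0e} -> loss ~ {predicted_loss(n, d):.2f}")
```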

zeshen

I've gotten push-back from almost everyone I've spoken with about this

I had also expected this reaction, and I always thought I was the only one who thinks we have basically had AGI since ~GPT-3. But looking at the upvotes on this post, I wonder if this is a much more common view.

zeshen

My first impression was also that axis lines are a matter of aesthetics. But then I browsed The Economist's visual styleguide and realized they do something similar, i.e. omit the y-axis line (in fact, they omit it on basically all their line/scatter plots, but almost always keep the gridlines).
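For anyone who wants to reproduce that look, here's a rough matplotlib sketch (my own, not from the styleguide): drop the y-axis spine but keep horizontal gridlines.

```python
# Rough sketch (assuming matplotlib): hide the y-axis line, as The
# Economist does, while keeping horizontal gridlines for readability.
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(12)
y = np.cumsum(rng.random(12))

fig, ax = plt.subplots()
ax.plot(x, y)

# Hide the y-axis spine (and the top/right box) ...
for side in ("left", "top", "right"):
    ax.spines[side].set_visible(False)

# ... but keep horizontal gridlines, drawn behind the data.
ax.yaxis.grid(True, color="0.85")
ax.set_axisbelow(True)

plt.show()
```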

Here's also an article they ran about their errors in data visualization, albeit probably fairly introductory for the median LW reader.

Answer by zeshen

I'm pretty sure you have come across this already, but just in case you haven't:

https://incidentdatabase.ai/taxonomy/gmf/
