Making a research platform for AI Alignment at https://ai-plans.com/
Come critique AI Alignment plans and get feedback on your alignment plan!
I'm really glad you wrote this!
I think you address an important distinction there, but I think there might be a further one to be made: how we measure/tell whether a model is aligned in the first place.
There seems to be a growing voice saying that if a model's output looks like the output we'd expect from an aligned AI, then it's aligned.
I think it's important to distinguish that from the idea that the model is aligned only if you actually have a strong idea of what its values are, how it got them, etc.
I don't think a lack of IQ is the reason we've been failing to make AI sensibly. Rather, it's a lack of good incentive design.
Making an AI recklessly is currently much more profitable than not doing so, which imo shows a flaw in the efforts that have gone towards making AI safe: a failure to accept that some people have very different mindsets/beliefs/core values, and to figure out a structure/argument that would incentivize people across a broad range of mindsets.
Thank you! Changed it to that!