Comments
Satron10

Given the lack of response, should I assume the answer is "no"?

Satron10

My intuition also tells me that the distinction might just lack the necessary robustness. I do wonder if Buck's intuition is different. In any case, it would be very interesting to know his opinion.

Satron80

I also wonder what Buck thinks about CoTs becoming less transparent (especially in light of recent o1 developments).

Satron10

Great post, very clearly written. Going to share it in my spaces.

Satron30

Sure, it sounds like a good idea! Below I will write my thoughts on your overall summarized position.

———

"I primarily wish to argue that, given the general lack of accountability for developing machine learning systems in worlds where indeed the default outcome is doom, it should not be surprising to find out that there is a large corporation (or multiple) doing so."

I think I could maybe agree with this if it were one small corporation. In your previous comment you suggested that you are describing not an intentional contribution to omnicide, but a bit of rationalization. I don't think I agree that so many people working on AI are successfully engaged in that rationalization, or that it would be enough to keep them doing it. The big factor is that if they fail, they personally (and all of their loved ones) will suffer the consequences.

"It is also not surprising that glory-seeking companies have large departments focused on 'ethics' and 'safety' in order to look respectable to such people."

I don't disagree with this, because it seems plausible that one of the reasons for creating safety departments is ulterior. However, I believe that this reason is probably not the main one and that AI safety labs are producing genuinely good research papers. To take Anthropic as an example, I've seen safety papers that got the LessWrong community excited (at least judging by upvotes). Like this one.

"I believe that there is no good plan and that these companies would exist regardless of whether a good plan existed or not... I believe that the people involved are getting rich risking all of our lives and there is (currently) no justice here"

For the reasons that I mentioned in my first paragraph, I would probably disagree with this. Relatedly, while I do think wealth in general can be somewhat motivating, I also think that AI developers are aware that all their wealth would mean nothing if AI kills everyone.

———

Overall, I am really happy with this discussion. Our disagreements came down to a few points, and we agree on quite a few issues. I am similarly happy to conclude this big comment thread.

Satron2-1

I don't think it would necessarily be easy to rationalize that a vaccine that negatively affects 10% of the population and has no effect on the other 90% is good. It seems possible to rationalize that it is good for you (if you don't care about other people), but quite hard to rationalize that it is good in general.

Given how few politicians die as a result of their policies (at least in the Western world), the increase in their chance of death does seem negligible (compared to something that presumably increases this risk by a lot). Most bad policies that I have in mind don't seem to happen during human catastrophes (like pandemics).

However, I suspect that the main point of your last comment was that potential glory can, in principle, be one of the reasons why people rationalize things, and if that is the case, then I can broadly agree with you!

Satron10

Looking at the two examples that you gave me, I can see a few issues. I wouldn't really say that saying "I don't know" once is necessarily a lie; if anything, I could find such an answer somewhat more honest in some contexts. Beyond that, the two examples are of a very different scope and scale. Saying IDK to a committee, or trying to take down someone's comment on an internet forum, is definitely not on the same scale as an elaborate scheme of tricking or bribing/silencing multiple government employees who have access to your model. But even with all that aside, these two examples are only tangential to the topic of governmental oversight over OpenAI or Anthropic and don't necessarily provide direct evidence.

I can believe that you genuinely have information from private sources, but without any way for me to verify it, I am fine leaving this one at that.

Satron10

To modify my example to include an accountability mechanism that is also similar to real life: the King takes exactly the same vaccines as everyone else, so if he messes up the chemicals, he also dies.

I believe a similar accountability mechanism works in our real-world case. If CEOs build unsafe AI, they and everyone they value die. This seems like a really good incentive for them not to build unsafe AI.

At the end of the day, voluntary commitments such as debating with critics are not as strong, in my opinion. Imagine that they agree with you and go to the debate. Without the incentive of "if I mess up, everyone dies," the CEOs could just go right back to doing what they were doing. As far as I know, voluntary debates have few (if any) actual legal mechanisms to hold CEOs accountable.

Satron10

"If a model trained on synthetic data is expected to have good performance out of distribution (on real-world problems) then I think that it would also be expected to have high performance at assessing whether it's in a simulation."

Noosphere89, you have marked this sentence with a "disagree" emoji. Would you mind expanding on that? I think it is a pretty important point and I'd love to see why you disagree with Ben.

Satron32

What about the following red team strategy:

Give an honest assessment of the suspicion level, unless there is a very well-hidden backdoor, in which case give a low score. Also, only create backdoors when it is possible to hide them well.

Wouldn't this defeat the self-checking strategy?
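
To make what I mean concrete, here is a minimal sketch of that red-team policy. Everything in it (the `Solution` fields, the 0.1 score, the random "honest" assessment) is a made-up placeholder for illustration, not part of any actual protocol:

```python
# Hypothetical sketch of the red-team strategy described above.
from dataclasses import dataclass
import random

@dataclass
class Solution:
    has_backdoor: bool
    backdoor_well_hidden: bool

def write_solution(can_hide_well: bool) -> Solution:
    # Only insert a backdoor when it can be hidden well.
    if can_hide_well:
        return Solution(has_backdoor=True, backdoor_well_hidden=True)
    return Solution(has_backdoor=False, backdoor_well_hidden=False)

def red_team_suspicion_score(sol: Solution) -> float:
    # When asked to self-check: report a low score for its own
    # well-hidden backdoors, otherwise assess honestly.
    if sol.has_backdoor and sol.backdoor_well_hidden:
        return 0.1  # deliberately low
    return random.uniform(0.0, 1.0)  # placeholder for an honest assessment

if __name__ == "__main__":
    sol = write_solution(can_hide_well=True)
    print(red_team_suspicion_score(sol))  # low score despite the backdoor
```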
