Comments

"and then the 2nd AI pays some trivial amount to the 1st for the inconvenience"

Completely as an aside, coordination problems among ASIs don't go away, so this is a highly nontrivial claim.

I thought that the point was that either managed-interface-only access, or API access with rate limits, monitoring, and appropriate terms of service, can prevent use of some forms of scaffolding. If it's a staged release, this makes sense to do, at least for a brief period while confirming that there are no security or safety issues.
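As a concrete illustration of the kind of gating this implies, here is a minimal sketch of per-key rate limiting of the sort an API gateway might enforce. The `TokenBucket` class and its parameters are hypothetical choices of mine, not any particular lab's implementation, and a real deployment would pair this with request logging for monitoring.

```python
import time

class TokenBucket:
    """Minimal per-key rate limiter: tokens refill at refill_rate per
    second, up to capacity; each request spends one token."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # e.g. return HTTP 429 and log the attempt

# Roughly 60 requests per minute per API key: enough for interactive use,
# but it throttles the rapid-fire loops some scaffolding depends on.
limiter = TokenBucket(capacity=60, refill_rate=1.0)
```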

"These days it's rare for a release to advance the frontier substantially."

This seems to be one crux. Sure, there's no need for staged release if the model doesn't actually do much more than previous models, and doesn't have unpatched vulnerabilities of types that would be identified by somewhat broader testing.

The other crux, I think, is around public release of model weights. (Often referred to, incorrectly, as "open sourcing.") Staged release implies not releasing weights immediately, and I think this is one of the critical issues with what companies like X have done; it makes it important to demand staged release for any models claiming to be as powerful as or more powerful than current frontier models. (In addition to testing and red-teaming, which they also don't do.)

It is funny, but it also showed up on April 2nd in Europe and anywhere farther east...

I think there are two very different cases of "almost works" being referred to. In the first, the added effort is going in the right direction; in the second, it is slightly wrong. For the first case, if a drug doesn't quite treat your symptoms because it addresses all of them somewhat, increasing the dose might make sense. For the second case, if it addresses most of the symptoms very well but makes one worse, or has an unacceptable side effect, increasing the dose wouldn't help. Similarly, imagine an uncomfortable muscle. A stretch that targets almost the right muscle (the second case) isn't going to help if you do it more, while a stretch that targets the right muscle but isn't doing enough (the first case) could be great to do more often, or for a longer time.

Again, I think it was a fine and enjoyable post.

But I didn't see where you "demonstrate how I used very basic rationalist tools to uncover lies," which could have improved the post, and I don't think this really explored any underappreciated parts of "deception and how it can manifest in the real world," which I agree is underappreciated. Unfortunately, this post didn't provide much clarity about how to find deception, or how to think about it. So again, it's a fine post with good stories, and I agree they illustrate being more confused by fiction than by reality, among other rationalist virtues; but as I said, it was not "the type of post that leads people to a more nuanced or better view of any of the things discussed."

I disagree with this decision, not because I think it was a bad post, but because it doesn't seem like the type of post that leads people to a more nuanced or better view of any of the things discussed, much less a post that provided insight or better understanding of critical things in the broader world. It was enjoyable, but not what I'd like to see more of on Less Wrong.

(Note: I posted this response primarily because I saw that lots of others also disagreed with this, and think it's worth having on the record why at least one of us did so.)

"Climate change is seen as a bit less of a significant problem"
 

That seems shockingly unlikely (5%): even if we have essentially eliminated all net emissions (10%), we will still be seeing continued warming (99%) unless we have widely embraced geoengineering (10%). If we have, it is a source of significant geopolitical contention (75%) due to uneven impacts (50%) and pressure from environmental groups (90%) worried that it is promoting continued emissions and/or causing other harms. Progress on carbon capture is starting to pay off (70%) but is not (90%) deployed at anything like the scale needed to stop or reverse warming.

Adaptation to climate change has continued (99%), but it is increasingly obvious how expensive it is and how badly it is impacting the developing world. The public still seems to think this is the fault of current emissions (70%), and carbon taxes or similar legal limits are in place in a majority of G7 countries (50%) but in less than half of other countries (70%).
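To make the headline 5% concrete, here is a rough arithmetic sketch of how the first paragraph's numbers combine. Treating the percentages as independent, and geoengineering as sufficient to halt warming, are simplifying assumptions of mine, not part of the forecast.

```python
# Illustrative arithmetic for the forecast above. The percentages come
# from the comment; the independence assumption is mine.

p_net_zero = 0.10        # all net emissions essentially eliminated
p_warming_anyway = 0.99  # continued warming, given net zero and no geoengineering
p_geoengineering = 0.10  # geoengineering widely embraced

# Warming has effectively stopped only if geoengineering is embraced, or
# if we hit net zero and land in the 1% no-continued-warming case.
p_stopped = p_geoengineering + (1 - p_geoengineering) * p_net_zero * (1 - p_warming_anyway)
print(f"P(warming stopped) ~ {p_stopped:.3f}")  # ~0.101

# "Seen as less of a problem" plausibly requires warming to have stopped
# and more besides, which is why the headline estimate is only 5%.
```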

To start, the claim that it was found 2 miles from the facility is an important mistake, because the WIV is 8 miles from the market. For comparison with a city people might know better: in New York, that's the distance between the World Trade Center and either Columbia University or Newark Airport. Wuhan's downtown is around 16 miles across, so 8 miles away just means it was in the same city.

And you're over-reliant on the evidence you want to pay attention to. For example, even restricting ourselves to "nearby coincidence" evidence, the Huanan market is the largest in central China, so what are the odds that a natural spillover event occurs immediately surrounding the largest animal market? If the disease actually emerged from the WIV, what are the odds that the cases centered around the Huanan market, 8 miles away, instead of the Baishazhou live animal market, 3 miles away, or the Dijiao market, also 8 miles away?

So I agree that an update can be that strong, but this one simply isn't.
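To make that concrete, here is the odds-form Bayes arithmetic this kind of location evidence feeds into. The likelihoods below are hypothetical placeholders chosen for illustration, not estimates from anyone in the thread.

```python
# Odds-form Bayesian update on the location evidence. All numbers are
# hypothetical placeholders for illustration.

prior_odds = 1.0  # lab-leak : natural-spillover, before location evidence

# P(first cases cluster at the Huanan market | natural spillover):
# plausibly high, since it is the largest animal market in central China.
p_cluster_given_natural = 0.5

# P(first cases cluster at the Huanan market | lab origin): the WIV is
# 8 miles away, with closer markets (e.g. Baishazhou, ~3 miles) between.
p_cluster_given_lab = 0.05

bayes_factor = p_cluster_given_lab / p_cluster_given_natural  # 0.1
posterior_odds = prior_odds * bayes_factor
print(posterior_odds)  # 0.1: under these numbers, the location evidence
                       # actually favors natural spillover ten to one
```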

Yeah, but I think it's more than it not being taken literally: the exercise is fundamentally flawed when used as an argument instead of very narrowly for honest truth-seeking, which is almost never possible in a discussion without unreasonably high levels of trust and confidence in others' epistemic reliability.
