Comments
tmeanen

Does LessWrong have a strategy for getting the ideas posted on this site out to other communities (e.g. academia, decision-makers at frontier labs, policy circles, etc)? My impression is that there are a whole lot of potentially impactful ideas floating around on this site, such as important macrostrategic considerations for short-timelines, fast-ish takeoff worlds. Do the LW mods have a strategy to get the right people hearing these ideas, or do we just wait until someone important stumbles across the site?

tmeanen

fab will check design plans and inform designers what can and can't be manufactured

I'm curious - what are the most common reasons for this? That is, what are the common requests that designers make that fabs can't manufacture?

tmeanen

Seems like a useful resource to have out there. Some other information that would be nice to have is details about the security of the data center - but there's probably limited information that could be included. [1]

  1. ^

    Because you probably don't want too many details about your infosec protocols out there for the entire internet to see. 

tmeanen

Reconnaissance might be a candidate for one of the first uses of powerful A(G)I systems by militaries - if this isn't already the case. There's already an abundance of satellite data (likely exabytes in the next decade) that could be thrown into training datasets. It's also less inflammatory than using AI systems for autonomous weapon design, say, and politically more feasible. So there's a future in which A(G)I-powered reconnaissance systems have some transformative military applications, the military high-ups take note, and things snowball from there. 

tmeanen

But if the core difficulty in solving alignment is developing some difficult mathematical formalism and figuring out the relevant proofs, then I think we won't suffer from the problems with the technologies above. In other words, I would feel comfortable delegating to and overseeing a team of AIs that have been tasked with solving the Riemann hypothesis - and I think this is what a large part of solving alignment might look like.

tmeanen

I've been in a number of arguments where people say things like "why is 90% doom such a strong claim? That assumes that survival is the default!"

Am I misunderstanding this sentence? How do "90% doom" and the assumption that survival is the default square with one another?

tmeanen

“keyboard and monitor I’m using right now, a stack of books, a tupperware, waterbottle, flip-flops, carpet, desk and chair, refrigerator, sink, etc. Under my models, if I pick one of these objects at random and do a deep dive researching that object, it will usually turn out to be bad in ways which were either nonobvious or nonsalient to me, but unambiguously make my life worse”

But I think the negative impacts that these goods have on you are (mostly) realized on longer timescales - say, years to decades. If you’re using a chair that is bad for your posture, the impacts of this are usually seen years down the line when your back starts aching. Or if you keep microwaving tupperware, you may end up with some pretty nasty medical problems, but again, decades down the line.

The property of an action having long horizons until it can be verified as good or bad for you makes delegating to smarter-than-you systems dangerous. My intuition is that there are lots of tasks that could significantly accelerate alignment research that don’t have this property, examples being codebase writing (unit tests can provide quick feedback), proof verification, etc. In fact, I can’t think of many research tasks in technical fields that have month/year/decade horizons until they can be verified - though maybe I’ve just not given it enough thought.
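To make the short-horizon case concrete, here's a minimal sketch in Python. The function `ai_written_sort` is hypothetical - a stand-in for any piece of code we might delegate - and the point is only that the unit test verifies the delegated work in well under a second, in contrast to the years-to-decades horizons of the household objects above.

```python
# Toy illustration of the "fast verification" property.
# ai_written_sort is a hypothetical stand-in for code produced by a
# delegated AI system; the test below gives feedback in milliseconds.

def ai_written_sort(xs):
    # Pretend this body was written by the system we delegated to.
    return sorted(xs)

def test_ai_written_sort():
    assert ai_written_sort([3, 1, 2]) == [1, 2, 3]
    assert ai_written_sort([]) == []
    assert ai_written_sort([5, 5, 1]) == [1, 5, 5]

if __name__ == "__main__":
    test_ai_written_sort()
    print("delegated code verified almost immediately")
```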

tmeanen

Plausibly one technology that arrives soon after superintelligence is powerful surveillance technology that makes enforcing commitments significantly easier than it historically has been. Leaving aside the potential for this to be misused by authoritarian governments, advocating for it to be developed before powerful technologies of mass destruction may be a strategy.

tmeanen

Nice, I like this concept of rogue deployment, as it highlights two distinct features that are both required for a safety method to be considered 'successful'. I'm understanding catastrophe with rogue deployment as a case where the safety measures were good enough but were bypassed/turned off, whereas catastrophe without rogue deployment involves safety measures that were fully operational the whole time but insufficient to prevent a model/human actor from causing a catastrophe.

So for example, we could get really great mech. interp tools, but avoiding catastrophe isn't guaranteed if all of these mech. interp tools are running on a single server (making them very easy to disable). To prevent rogue deployment we’d want multiple servers running these mech. interp tools, to provide redundancy in case one goes down, gets hacked, etc. So there's a concept here of the raw effectiveness of a safety method as well as its reliability. I'm sure others can probably think of more nuanced examples too.
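As a toy sketch of the reliability half of this (purely illustrative - the endpoint URLs, the `query_monitor` helper, and the JSON response format are invented assumptions, not any real API), one could imagine requiring a quorum of independently hosted monitors to sign off before an action goes through, so that disabling any single server doesn't silently remove oversight:

```python
# Hypothetical sketch: approval requires a quorum of independently hosted
# interpretability monitors, so no single disabled server removes oversight.
# Endpoints and the {"safe": true/false} response format are made up.
import json
import urllib.request

MONITOR_ENDPOINTS = [
    "https://monitor-a.example.internal/check",
    "https://monitor-b.example.internal/check",
    "https://monitor-c.example.internal/check",
]

def query_monitor(url: str, action: dict) -> bool:
    """Ask one monitor server whether the proposed action looks safe."""
    req = urllib.request.Request(
        url,
        data=json.dumps(action).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return bool(json.load(resp).get("safe", False))
    except OSError:
        # An unreachable (or disabled) monitor counts as a refusal, not a pass.
        return False

def action_approved(action: dict, quorum: int = 2) -> bool:
    """Approve only if at least `quorum` monitors independently judge the action safe."""
    votes = sum(query_monitor(url, action) for url in MONITOR_ENDPOINTS)
    return votes >= quorum
```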