Davidmanheim

Are you familiar with Davidad's program working on compositional world modeling? (The linked notes are from before the program was launched; there is ongoing work on the topic.)

The reason I ask is that embedded agents and agents in multi-agent settings should need compositional world models that include models of themselves and of other agents, which implies that hierarchical agency is part of what such a program would need to solve.

It also relates closely to work Vanessa is doing (as an "ARIA Creator") on learning-theoretic AI, related to what she has called "Frugal Compositional Languages"; see also this work by @alcatal. I understand that neither is yet addressing multi-agent world models, nor explicitly modeling the agents themselves in a compositional / embedded-agent way, though those are presumably desiderata.

That is an interesting question, but I unfortunately do not know enough to even figure out how to answer it.

Good points. Yes, storage definitely helps, and microgrids are generally able to have some storage, if only to smooth out variation in power generation for local use. But solar storms can last days, even if a large, long-lasting event is very unlikely. And it's definitely true that if large facilities have storage, shutdowns will have reduced impact - but I understand that the vulnerable transformers are used for power transmission, so local storage at the large generators won't change the need to shut down the transformers that send that power to consumers.

Do I understand correctly that the blue-green graph has a y-axis that goes above 100% median reduction, with error bars in that range? (This would happen if they estimated the proportion as a standard, unbounded normally distributed variable - not great practice, but I want to check that this is what happened.)
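For concreteness, here's a minimal sketch of the failure mode I mean, with made-up numbers (28 of 30 subjects showing a reduction): treating a proportion near 100% as normally distributed (a Wald interval) gives an upper confidence bound above 100%.

```python
import math

# Toy numbers, purely for illustration: 28 of 30 subjects show a reduction.
successes, n = 28, 30
p_hat = successes / n                    # point estimate: 93.3%
se = math.sqrt(p_hat * (1 - p_hat) / n)  # normal-approximation standard error
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"estimate: {p_hat:.1%}")
print(f"95% Wald CI: [{lower:.1%}, {upper:.1%}]")  # upper bound is ~102%, above 100%
```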

Question for a lawyer: how is non-reciprocity not an interstate trade issue that federal courts can strike down?

In addition to the point that current models are already strongly superhuman in most ways, I think that if you buy the idea that we'll be able to do automated alignment of ASI, you'll still need some reliable approach to "manual" alignment of current systems. We're already far past the point where we can robustly verify LLMs' claims or reasoning outside of narrow domains like programming and math.

But on point two, I strongly agree that agent foundations and Davidad's agendas are also worth pursuing. (And in a sane world, we would have tens or hundreds of millions of dollars in funding for each every year.) Instead, it looks like we have Davidad's ARIA funding, Jaan Tallinn and the LTFF funding some agent foundations and SLT work, and that's basically it. And MIRI abandoned agent foundations, while Open Philanthropy, it seems, isn't putting money or effort into them either.

I partly disagree; steganography is only useful when it's possible for the outside / receiving system to detect and interpret the hidden messages, so if the messages are of a type that outside systems would identify, they can and should be detectable by the gating system as well. 

That said, I'd be very interested in looking at formal guarantees that the outputs are minimally complex in some computationally tractable sense, or something similar - it definitely seems like something that @davidad would want to consider.
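To gesture at the kind of computationally tractable measure I have in mind (my own toy example, not anything from davidad's agenda): an off-the-shelf compressor gives a crude upper bound on description length, so outputs that are far less compressible than comparable plain text are a cheap red flag for hidden structure.

```python
import os
import zlib

def complexity_ratio(text: str) -> float:
    """Compressed size over raw size: a crude, computable upper-bound proxy
    for description length (true Kolmogorov complexity is uncomputable)."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw, level=9)) / len(raw)

# Hypothetical gate: flag outputs much less compressible than a baseline of
# comparable plain text, since hidden payloads add incompressible bits.
prose = "The quick brown fox jumps over the lazy dog. " * 20
payload = os.urandom(400).hex()  # stand-in for a high-entropy hidden payload
print(f"ordinary prose:    {complexity_ratio(prose):.2f}")    # very low
print(f"payload-like text: {complexity_ratio(payload):.2f}")  # much higher
```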

I really like that idea, and the clarity it provides, and have renamed the post to reflect it! (Sorry this was so slow - I'm travelling.)
