I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.
I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently
Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites
People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.
I'm interested to know how (if at all) you'd say the perspective you've just given deviates from something like this:
My current guess is that you agree with some reasonable interpretation of all these points, and maybe also have some more nuance you think is important?
Given the picture I've suggested, the relevant questions are
A complementary angle: we shouldn't be arguing over whether or not we're in for a rough ride; we should be figuring out how not to have one.
I suspect more people would be willing (both empirically and theoretically) to get behind 'ruthless consequentialist maximisers are one extreme of a spectrum which gets increasingly scary and dangerous; it would be bad if those got unleashed'.
Sure, skeptics can still argue that this just won't happen even if we sit back and relax. But I think then it's clearer that they're probably making a mistake (since origin stories for ruthless consequentialist maximisers are many and disjunctive). So the debate becomes 'which sources of supercompetent ruthless consequentialist maximisers are most likely and what options exist to curtail that?'.
"This short story perfectly depicts the motivations and psychological makeup of my milieu," I think wryly as I strong upvote. I'm going to need to discuss this at length with my therapist. Probably the author is one of those salty mid-performing engineers who didn't get the offer they wanted from Anthropic or whatever. That thought cheers me up a little.
Esther catches sight of the content on my screen over my shoulder. "I saw that too," she remarks, looking faintly worried in a way which reminds me of why I am hopelessly in love with what she represents. "Are we, like, the bad guys, or maybe deluding ourselves that we're the good guys in a bad situation? It seems like that author thinks so. It does seem like biding my time hasn't really got me any real influence yet."
I rack my brain for something virtuous to say. "Yeah, um, safety-washing is a real drag, right?" Her worry intensifies, so I know I'm pronouncing the right shibboleths. God, I am really spiritually emaciated right now. I need to cheer her up. "But think about it, we really are in the room, right? Who else in the world can say that? It's not like Vox or Krishna are going to wake up any time soon. That's a lot of counterfactual expected impact."
She relaxes. "You're right. Just need to keep vigilant for important opportunities to speak up. Thanks." We both get back to tuning RL environments and meta-ML pipelines.
I think there's a magic[1] whereby the military is somehow also fairly firmly aligned with the constitution and non-partisan, though nominally also under the president's command. I don't really get it, and I don't know how much this helps.
and it is magic, as in it's an inexplicable (to me) and presumed-contingent (social) technology ↩︎
Yes, that makes a lot of sense. So glad to hear you're doing more roadmapping here! FLF is also on this (as, I think, you know). We should compare notes at some stage.
I think Drexler had a good discussion of this point. Paraphrasing: it's not so much an axis as something like the following. Naively, for any type of exchange, there's some ratio of offence to defence costs. And historically, defensive buildout is difficult to distinguish from offensive (this is not always the case, but it is a reasonable first approximation).
But improvements to verification could enable more decoupling of the resources legibly and trustworthily dedicated to defence vs offence: I can prove that what I'm constructing is a defensive-only measure. Thus sufficient defensive buildouts to neutralise observed/anticipated offensive buildouts can be pursued more freely by a community coordinating on this kind of verification, even when the cost ratio meaningfully favours destruction.
Then you need to ask what barriers to communication, coordination, and verification might make it difficult to achieve this. And whether there are domains where offence is so cheap as to be practically undefendable. In those cases, you might need to coordinate around tech prevention and mutual restraint instead, or resort to centralisation of the needed authority to achieve safety there.
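A toy sketch of the cost-ratio point, purely to make it concrete (the numbers, budgets, and function names are all made up by me, not from Drexler): verification is what lets a coalition pool legibly-defensive spending without it reading as an offensive buildout, so the relevant comparison is pooled defensive budget versus the ratio-scaled offensive buildout.

```python
# Illustrative only: hypothetical numbers and names, not a real model.

def defence_spend_needed(offence_spend, cost_ratio):
    """Resources required to neutralise a given offensive buildout.
    cost_ratio is defence cost per unit of offence neutralised
    (cost_ratio > 1 means the exchange favours destruction)."""
    return offence_spend * cost_ratio

def coalition_can_defend(member_budgets, offence_spend, cost_ratio):
    """With verification, legibly-defensive spend can be pooled across a
    coalition without being read as offensive, so the pooled budget is
    what matters."""
    return sum(member_budgets) >= defence_spend_needed(offence_spend, cost_ratio)

# Even at a 3:1 ratio favouring destruction, a coordinating community can
# afford a sufficient defensive buildout by pooling:
print(coalition_can_defend([40, 35, 30], offence_spend=30, cost_ratio=3.0))  # True
```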
the opportunity cost of fighting over Earth's resources may be far higher than the cost of just... going elsewhere
If you're an individual or a marauding horde or whatever with a low replication or expansion rate, then sure. But I don't know - algae doesn't think, "Oh, I will leave this slightly contested region alone in favour of open territory"; it just replicates and grabs what it can of all available resources.
How can AI itself transform the playing field? Seb Krier of Google DeepMind wrote about this exact vision in "Coasean bargaining at scale".
The parts I find most disappointing about Seb's writing are an apparent attachment to (personal, delegate) agents in particular as the form factor, and a sort of implied need to wait for AGI before we can get cracking on improvements to human collective epistemics and coordination. Why wait? We've already unlocked many applicable building blocks, and agentic form factors are only a very narrow slice of the design space (and not even especially appealing for many application cases).
I otherwise find parts of Seb's articulation appealing, especially unlocking more bottom-up coordination by reducing frictions, and treating inter-human activity as an important space for uplift by technology (rather than the sometimes solipsistic individualistic uplift stories from other places).
(I forgot that more conversation might happen on a LW crosspost, and I again lament that the internet has yet to develop a unified routing system for same-content-different-edition discourse. Copied comment from a few days ago on Substack:)
I really appreciate this (and other recent) transparency. This is much improved since AI 2027.
One area I get confused by (same with Davidson, with whom I've discussed this a bit) is 'research taste'. When you say things like 'better at research taste', and when I look at your model diagram, it seems you're thinking of taste as a generic competence. But what is taste? It's nothing but a partially-generalising learned heuristic model of experiment value-of-information. (Said another way, it's a heuristic value function for the 'achieve insight' objective of research).
How do you get such learned models? No other way than by experimental throughput and observation thereof (direct or indirect: can include textbooks or notes and discussions with existing experts)!
See my discussion of research and taste
As such, taste accumulates like a stock, on the basis of experimental throughput and sample efficiency (of the individual or the team) at extracting the relevant updates to the VOI model. It 'depreciates' as you go, because the frontier of the known moves, gradually drifting outside the generalising region of the taste heuristic (eventually falling back to naive trial and error) - most saliently here with data and model scale, but also in other ways.
This makes sample efficiency (of taste accumulation) and experimental throughput extremely important - central, in my view. You might think that expert interviews and reading all the textbooks ever etc. provide a meaningful jumpstart to the taste stock. But they certainly don't help with the flow. So then you need to know how fast it depreciates over the relevant regime.
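A crude stock-and-flow sketch of what I mean (parameter names and numbers are made up, purely illustrative): inflow is sample efficiency times experimental throughput, and depreciation kicks in as the frontier moves outside the heuristic's generalising region.

```python
# Toy stock-and-flow model of 'research taste' (illustrative assumptions only).

def step_taste(taste, throughput, sample_efficiency, depreciation):
    """One period of taste accumulation: decay the existing stock, add the flow."""
    inflow = sample_efficiency * throughput
    return taste * (1 - depreciation) + inflow

# A big jumpstart to the stock (textbooks, expert interviews) washes out if
# the ongoing flow is low relative to depreciation:
taste = 10.0  # jumpstarted stock
for _ in range(20):
    taste = step_taste(taste, throughput=1.0, sample_efficiency=0.2, depreciation=0.15)
print(round(taste, 2))  # heading towards the steady state inflow / depreciation ≈ 1.33
```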
(Besides pure heuristic improvements, if you think faster, you can also reason your way to somewhat better experiment design, whether by naively pumping your taste heuristics for best-of-k, or by combining and iterating on designs. I think this reasoning boost falls off quite sharply, but I'm unsure. See my question on this)
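A rough illustration of why I'd guess the best-of-k boost falls off, assuming (purely for illustration) that the taste heuristic scores candidate designs as independent noisy draws:

```python
# Illustrative only: expected best-of-k draw from a standard normal, as a
# stand-in for 'pumping your taste heuristics for best-of-k'.
import random
import statistics

def best_of_k(k, trials=20000):
    """Average value of the best of k candidate experiment designs,
    each scored as an independent Normal(0, 1) draw."""
    return statistics.mean(max(random.gauss(0, 1) for _ in range(k)) for _ in range(trials))

for k in (1, 2, 4, 8, 16, 32):
    print(k, round(best_of_k(k), 2))
# The gain grows sublinearly (roughly sqrt(2 * ln k) for large k):
# each doubling of k buys less than the previous one.
```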