This is a linkpost for https://paulbricman.com/hypothesis-subspace/
Related recommendation: Inward and outward steelmanning — LessWrong
Imagine that you encountered a car with square wheels.
Inward steelmanning: "This is an abomination! It doesn't work! But maybe with round wheels it would be beautiful. Or maybe a different vehicle with square wheels could be beautiful."
Outward steelmanning: "This is ugly! It doesn't work! But maybe if I imagine a world where this car works, it will change my standards of beauty. Maybe I will gain some insight about this world that I'm missing."
If you want to be charitable, why not grant your opponent an entire universe with its own set of rules?
This post was written for the first Refine blog post day, at the end of a week of readings, discussions, and exercises about epistemology for doing good conceptual research. Thanks to Adam Shimi, Linda Linsefors, and Dan Clothiaux for comments.
To steelmine (as per Tamsin): to intentionally look for what productive mistakes a research direction is hinting at.
There are quite a few analogies used regularly across alignment. Some popular ones include "prompts are like programs" or "interpretability is like neuroscience on ML models." While no analogy is perfect, some are useful, as they help us recycle years of intellectual labor if we get the translation key right. This opportunity is particularly relevant if you side with relatively short timelines, because you can use it to quickly scan through entire regions of hypothesis space for (part of) a silver bullet.
This is where I'm coming from in my work at Refine. I want to use the fellowship as an opportunity to investigate a dozen or so themes that connect prosaic alignment to other tangentially related disciplines and see which ones yield productive mistakes. This top-down approach of starting with broad themes and then zooming in on details is explicitly baked into the linked artifact, which I'll be using as a sketchpad throughout the program. Its left-to-right tiled layout represents branches exploring various assumptions, technical details, and failure modes, while allowing (and welcoming) targeted feedback.
That said, here are a few handpicked excerpts from said artifact to give you a taste:
Memetic Colonies
Parametric Ecologies
Latent Resonators
If you want to skim through more themes like the ones above, consider wandering around the actual artifact for a few minutes. While any feedback you might have would be welcome, at the time of writing I'm particularly interested in leading questions which I could use to branch out into new considerations (e.g. Would this still hold if X? Why do you believe Y? How could this account for Z?). The number of comments (and reactions?) also acts as a heuristic for guiding the growth of the conversational tree in promising directions.
While I want to keep meta-level thoughts for the end of the program, I personally believe ideas are like athletes. You train them by applying stressors, and it's only by challenging them that they'll grow stronger. If that's what you want from them, giving them an easy time is not particularly helpful. I'd be really grateful if you'd help me train those suckers through your feedback!