Someone asked me about this, so here are my quick thoughts.
Although I've learned a lot of math over the last year and a half, it still isn't my comparative advantage. What I do instead is:
Find a problem
Find a problem that seems plausibly important to AI safety (low impact), or a phenomenon that's secretly confusing but not really explored (instrumental convergence). If you're looking for a problem, corrigibility strikes me as another thing that meets these criteria and is still mysterious.
Think about the problem
Stare at the problem on my own, ignoring any existing thinking as much as possible. Just think about what the problem is, what's confusing about it, what a solution would look like. In retrospect, this has helped me avoid anchoring myself. Also, my prior for existing work is that it's confused and unhelpful, and I can do better by just thinking hard. I think this is pretty reasonable for a field as young as AI alignment, but I wouldn't expect this to be true at all for e.g. physics or abstract algebra. I also think this is likely to be true in any field where philosophy is required, where you need to find the right formalisms instead of working from axioms.
Therefore, when thinking about whether "responsibility for outcomes" has a simple core concept, I nearly instantly concluded it didn't, without spending a second glancing over the surely countless philosophy papers wringing their hands (yup, papers have hands) over this debate. This was the right move. I just trusted my own thinking. Lit reviews are just proxy signals of having understood the problem and come to a well-considered conclusion.
Concrete examples are helpful: at first, thinking about vases in the context of impact measurement was helpful for getting a grip on low impact, even though it was secretly a red herring. I like to be concrete because we actually need solutions - I want to learn more about the relationship between solution specifications and the task at hand.
Make simplifying assumptions wherever possible. Assume a ridiculous amount of stuff, and then pare it down.
Don't formalize your thoughts too early - you'll just get useless mathy sludge out on the other side, the product of your confusion. Don't think for a second that having math representing your thoughts means you've necessarily made progress - for the kind of problems I'm thinking about right now, the math has to sing with the elegance of the philosophical insight you're formalizing.
Forget all about whether you have the license or background to come up with a solution. When I was starting out, I was too busy being fascinated by the problem to remember that I, you know, wasn't allowed to solve it.
Obviously, there are common-sense exceptions to this, mostly revolving around trying to run without any feet. It would be pretty silly to think about logical uncertainty without even knowing propositional logic. One of the advantages of immersing myself in a lot of math isn't just knowing more, but knowing what I don't know. However, I think it's rare to secretly lack the basic skills to even start on the problem at hand. You'll probably know if you do, because all your thoughts will keep coming back to the same kind of confusion about a formalism, or something. Then you look for ways to resolve the confusion (possibly by asking a question on LW or in the MIRIx Discord), find the thing, and get back to work.
Stress-test thoughts
So you've had some novel thoughts, and an insight or two, and the outlines of a solution are coming into focus. It's important not to become enamored with what you have, because it stops you from finding the truth and winning. Therefore, think about ways in which you could be wrong, situations in which the insights don't apply or in which the solution breaks. Maybe you realize the problem is a bit ill-defined, so you refactor it.
The process here is: break the solution, deeply understand why it breaks, and repeat. Don't get stuck with patches; there's a rhythm you pick up on in AI alignment, where good solutions have a certain flavor of integrity and compactness. It's OK if you don't find it right away. The key thing to keep in mind is that you aren't trying to pass the test cases, but rather to find brick after brick of insight to build a firm foundation of deep comprehension. You aren't trying to find the right equation, you're trying to find the state of mind that makes the right equation obvious. You want to understand new pieces of the world, and maybe one day, those pieces will make the difference.
ETA: I think a lot of these skills apply more broadly. Emotional trust in one's own ability to think seems important for taking actions that aren't e.g. prescribed by an authority figure. Letting myself just think lets me be light on my mental feet, and bold in where those feet lead me.
ETA 2: Apparently simulating drop-caps like this isn't the greatest idea. Formatting edit.
Very clever.
You're right that Said's criticism was substantive, and I didn't mean to downplay that in my comment. I do think that Said is right: my formatting messes with archiving and search, and there are better alternatives. He has successfully persuaded me of this; in fact, I'll update the post after writing this comment!
The reason I made that comment is that his tone makes it harder for me to update on and accept his arguments. Although an ideal reasoner might not mind, I do. That added difficulty is a real cost, and the tone seemed consistently unreasonable for the situation.
I don't think we should just prioritize authors getting thicker skin, although I agree it's a good thing for authors to strive for individually. Here is some of my reasoning.
Suppose I were a newcomer to the site, I wrote a post about my research habits, and then I received this comment thread in return. Would I write more posts? 2017-me would not have. Suppose I even just saw this happening on someone else's post. Would I write posts? No. I, like many people I've heard of anecdotally, was already intimidated by the perceived harshness of this site's comments. I think you might not be appreciating how brutal this site can look at first. If there are tradeoffs we can make along the lines of saying "if you're resorting to X" instead of "if you're stooping to X", tradeoffs that don't lose much informational content but do significantly reduce the potential for abrasion, it seems sensible to make them.
Truly thickening one's skin seems pretty difficult. Maybe I can just sit down and Internal Double Crux this, but even if so, do we just expect authors to do this in general?
Microhedonics. Even an author with reasonably but imperfectly thick skin might be slightly discouraged from engaging with the community. Obviously there is a balance to be struck here, but the line I drew does not seem unreasonable to me.
ETA: My comment also wasn't saying that people have to specifically follow the scripted example. They don't need to say they just prefer X, or whatever. The "good" example is probably overly flowery. Just avoid being needlessly abrasive.