This looks extremely comprehensive and useful, thanks a lot for writing it! Some of my favourite tips (like clipboard managers and Rectangle) were included, which is always a good sign. And I strongly agree with "Cursor/LLM-assisted coding is basically mandatory".
I passed this on to my mentees. Not all of it transfers to mech interp (in particular, the time between experiments is often much shorter, e.g. a few minutes or even seconds, and often almost an entire project is in de-risking mode), but much of it does. And the ability to get shit done fast is super important.
Thanks Neel! I'm glad you found it helpful. If you or your scholars recommend any other tools not mentioned in the post, I'd be interested to hear more.
I've been really enjoying voice-to-text + LLMs recently, via a great Mac app called Super Whisper (which can work with local speech-to-text models, so could also possibly be used for confidential stuff). Combining Super Whisper, Claude, and Cursor means I can just vaguely ramble at my laptop about what experiments should happen, and they happen. It's magical!
What keybinding do you set for it?
IIRC, ⌥+space conflicts with the default for ChatGPT and Alfred.app (which I use for clipboard history).
For anyone else who stumbles across this thread: when modifying the superwhisper toggle settings, hit spacebar then control, instead of control then spacebar. Also, it turns out that Control + Space is the default shortcut for switching keyboard input sources (at least on macOS Sequoia 15.3.1); make sure to disable that by going to System Settings → Keyboard → Keyboard Shortcuts → Input Sources.
Tmux allows you to set up multiple panes in your terminal that keep running in the background. Therefore, if you disconnect from a remote machine, scripts running in tmux will not be killed. We tend to run experiments across many tmux panes (especially overnight).
Does no one use the suffix & disown (e.g. python run.py & disown), which sends a command to a background process that doesn't depend on the ssh process, or the prefix nohup (e.g. nohup python run.py > run.log 2>&1 &), which does the same thing? You have to make sure any logging that goes to stdout goes to a log file instead (and in this respect tmux or screen are better).
Re your remark about uv: you forgot to mention that it's effectively a Poetry replacement, too.
Like htop: btm and btop are a little newer and nicer to look at. Also, for JSON: jq. cat file.json | jq pretty-prints JSON to the terminal.
I didn't learn about disown or nohup until recently because there was no impetus to: I'd been using tmux. (My workflow also otherwise depended on tmux; when developing locally, I liked its method of managing terminal tabs/splits.)
I find it more useful to employ text-based tools than UI/UX-based tools, as they integrate faster with LLMs. For example, AiChat (https://github.com/sigoden/aichat/) does many things: chat with most (all?) models from the command line or your text editor, upload files (pdfs, jpgs, etc.), execute bash commands, and more. It can take stdin and output to stdout, so you can chain your tools.
This is extremely useful for people like me who are just starting out with alignment research (especially parts 3 and 4). Thanks a lot for sharing!
I have one question:
asyncio is very important to learn for empirical LLM research since it usually involves many concurrent API calls
I have lots of asyncio experience, but I've never seen a reason to use it for concurrent API calls, because concurrent.futures, and especially ThreadPoolExecutor, works just as well for concurrent API calls and is more convenient than asyncio (you don't need await, you don't need the event loop, etc.).
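For concreteness, a minimal sketch of the thread-based pattern (call_api here is a hypothetical stand-in for a blocking API call):

```python
from concurrent.futures import ThreadPoolExecutor

def call_api(prompt: str) -> str:
    # Stand-in for a blocking HTTP request to an LLM API.
    return f"response to {prompt!r}"

prompts = [f"prompt {i}" for i in range(100)]

# Each call runs in a worker thread; no await or event loop needed.
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(call_api, prompts))
```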
Am I missing something? Or is this just a matter of taste?
I recently switched from using threads to using asyncio, even though I had never used asyncio before.
It was a combination of factors; maybe the general point is that threads have more overhead, and if you're doing many thousands of things in parallel, asyncio can handle it more reliably.
Threads are managed by the OS and each thread has an overhead in starting up/switching. The asyncio coroutines are more lightweight since they are managed within the Python runtime (rather than OS) and share the memory within the main thread. This allows you to use tens of thousands of async coroutines, which isn't possible with threads AFAIK. So I recommend asyncio for LLM API calls since often, in my experience, I need to scale up to thousands of concurrents. In my opinion, learning about asyncio is a very high ROI for empirical research.
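For concreteness, a minimal sketch of the asyncio pattern (call_api is again a hypothetical stand-in): create one cheap coroutine per prompt and use a semaphore to bound how many requests are in flight at once.

```python
import asyncio

async def call_api(prompt: str) -> str:
    # Stand-in for an async HTTP request to an LLM API.
    await asyncio.sleep(0.1)
    return f"response to {prompt!r}"

async def main(prompts: list[str], max_concurrent: int = 1000) -> list[str]:
    # Coroutines are cheap, so we can create one per prompt and let a
    # semaphore bound the number of in-flight requests.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded_call(prompt: str) -> str:
        async with semaphore:
            return await call_api(prompt)

    return await asyncio.gather(*(bounded_call(p) for p in prompts))

if __name__ == "__main__":
    results = asyncio.run(main([f"prompt {i}" for i in range(10_000)]))
    print(len(results))
```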
Our work centers on empirical research with LLMs. If you are conducting similar research, these tips and tools may help streamline your workflow and increase your experiment velocity. We are also releasing two repositories to promote sharing more tooling within the AI safety community.
John Hughes is an independent alignment researcher working with Ethan Perez and was a MATS mentee in the Summer of 2023. In Ethan's previous writeup on research tips, he explains the criteria that strong collaborators often have, and he puts 70% weight on "getting ideas to work quickly." Part of being able to do this is knowing what tools there are at your disposal.
This post, written primarily by John, shares the tools and principles we both use to increase our experimental velocity. Many readers will already know much of this, but we wanted to be comprehensive, so it is a good resource for new researchers (e.g., those starting MATS). If you are a well-versed experimentalist, we recommend checking out the tools in Part 2—you might find some new ones to add to your toolkit. We're also excited to learn from the community, so please feel free to share what works for you in the comments!
Quick Summary
- uv for Python package management.

Part 1: Workflow Tips
Terminal
Efficient terminal navigation is essential for productivity, especially when working on tasks like running API inference jobs or GPU fine-tuning on remote machines. Managing directories, editing files, or handling your Git repository can feel tedious when relying solely on bash commands in a standard terminal. Here are some ways to make working in the terminal more intuitive and efficient.
- zsh-autosuggestions — suggests commands based on your history as you type
- zsh-syntax-highlighting — syntax highlighting within the terminal
- zsh-completions — complete some bash commands with tab
- zsh-history-substring-search — type any substring of a previous command (it doesn't have to be the start) and use the up and down keys to cycle through relevant history.
- Version control your dotfiles, like ~/.zshrc and ~/.tmux.conf.
- Set up aliases, such as gc for git commit -m and many more in this file. Here are two which save a lot of time:
  - rl for getting the absolute file path followed by copying it to your clipboard is incredibly helpful (see custom bins in here; big shout out to Ed Rees for this one)
  - ls after cd (e.g. via a shell function like cd() { builtin cd "$@" && ls; }) so, when you change directory, you always see the files contained there.

Note: there are many recommendations here, which can be overwhelming, but all of this is automated in John's dotfiles (including installing zsh and tmux, changing key repeat speeds on Mac and setting up aliases). So, if you'd like to get going quickly, we recommend following the README to install and deploy this configuration.
Integrated Development Environment (IDE)
Choosing the right IDE can enhance your productivity, especially when using LLM coding assistants. A good IDE simplifies code navigation, debugging, and version control.
- Add a .cursorrules file, which informs the LLM how to act.
- Using breakpoint() within your code is also very useful and often quicker than debugging with print statements.
- Tooling for viewing jsonl files.

Git, GitHub and Pre-Commit Hooks
Mastering Git, GitHub, and pre-commit hooks is key to maintaining a smooth and reliable workflow. These tools help you manage version control, collaborate effectively, and automate code quality checks to prevent errors before they happen.
Add a .pre-commit-config.yaml (e.g. here), config within pyproject.toml (e.g. here), and a Makefile (e.g. here) in the root of your repo. You must first pip install pre-commit and then run make hooks.
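For illustration, a minimal .pre-commit-config.yaml might look like the following sketch (the ruff hooks and the rev pin are illustrative choices, not necessarily what our repos use):

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9  # illustrative pin; use the latest release
    hooks:
      - id: ruff         # lint (and autofix) Python code
        args: [--fix]
      - id: ruff-format  # format Python code
```

A make hooks target typically wraps pre-commit install, which registers the hooks so they run on every commit.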
Part 2: Useful Tools
Not all of these recommendations are directly related to research (e.g., time-tracking apps), but they are excellent productivity tools worth knowing about. The goal of this list is to make you aware of what’s available—not to encourage you to adopt all of these tools at once, but to provide options you can explore and incorporate as needed.
Software/Subscriptions
LLM Tools
LLM Providers
Command Line and Python Packages
- uv replaces pip, pyenv, and virtualenv. It is 10-100x faster than pip!

Part 3: Experiment Tips
De-risk and extended project mode
First, we'd like to explain that a research project is usually in one of two modes: de-risk mode and extended project mode. These modes significantly change how you should approach experiments, coding style, and project management.
The workflow should always be conditioned on the situation:
Ethan tends to be in de-risk mode for 75% of his work, and he uses Python notebooks to explore ideas (for example, many-shot jailbreaking was de-risked in a notebook with ~50 lines of code). The Alignment Science team at Anthropic is also primarily in de-risk mode for initial alignment experiments and sometimes switches to extended project mode for larger, sustained efforts.
Note: Apollo defines these modes similarly as "individual sprint mode" and "standard mode" in their Engineering Guide. We opt for different names since much of the research we are involved in can remain primarily in de-risk mode for a long time.
Tips for both modes
- Give each experiment a dated, descriptive directory, e.g. ./experiments/<name>/250109_jailbreaking_technique_v1.
- Number the scripts within each experiment directory to make the pipeline order clear, e.g. 1_run_harmbench.sh, 2_run_classifier.sh, 3_analyse_attack_success_rate.ipynb.

Tips for extended project mode
- Saving a jsonl file at the end of the experiment with all the metadata, inputs, and outputs is useful.
- When analysing results in pandas dataframes, .describe() gives quick summary statistics.
- Use a library for parsing command-line arguments (fire, hydra and simple_parsing). We use simple_parsing (see example) because it allows you to define your args in a dataclass, which gets automatically instantiated and populated from the command-line args; a short sketch follows below.
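For illustration, a minimal sketch of this pattern (not taken from the linked example; the config fields here are hypothetical):

```python
from dataclasses import dataclass

from simple_parsing import ArgumentParser


@dataclass
class ExperimentConfig:
    # Hypothetical fields, for illustration only.
    model: str = "gpt-4o-mini"
    num_samples: int = 100
    temperature: float = 1.0


parser = ArgumentParser()
parser.add_arguments(ExperimentConfig, dest="config")
args = parser.parse_args()
config: ExperimentConfig = args.config  # populated from flags like --num_samples 500
print(config)
```

Part 4: Shared AI Safety Tooling Repositories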
For many early-career researchers, there's an unnecessarily steep learning curve for even figuring out what good norms for their research code should look like in the first place. We're all for people learning and trying things for themselves, but we think it would be great to have the option to do that on top of a solid foundation that has been proven to work for others. That's why things like the ARENA curriculum are so valuable.
However, there aren't standardised templates/repos for most of the work in empirical alignment research. We think this probably slows down new researchers a lot, requiring them to unnecessarily duplicate work and make decisions that they might not notice are slowing them down. ML research, in general, involves so much tinkering and figuring things out that building from a strong template can be a meaningful speedup and provide a helpful initial learning experience.
For the MATS 7 scholars mentored by Ethan, Jan, Fabien, Mrinank, and others from the Anthropic Alignment Science team, we have created a GitHub organization called safety-research to allow everyone to easily discover and benefit from each other's code. We are piloting two repositories: 1) one for shared tooling, such as inference and fine-tuning tools, and 2) a template repo to clone at the start of a project, with examples of using the shared tooling. We are open-sourcing these two repositories and would love for others to join us!
Repo 1: safety-tooling
Repo 2: safety-examples
Note: We are very excited about UK AISI's Inspect framework, which also implements much of what is in safety-tooling and more (such as tool usage and extensive model-graded evaluations). We love the VSCode extension for inspecting log files and the terminal viewer for tracking experiment progress across models and tasks. We aim to build a bigger portfolio of research projects that use Inspect within safety-examples and to build more useful research tools that Inspect doesn't support in safety-tooling.
Acknowledgements
We'd like to thank Jack Youstra and Daniel Paleka, as many of the useful tool suggestions stem from conversations with them. For more of their recommendations, check out their blogs here and here. John would like to thank Ed Rees and others at Speechmatics, from whom he's borrowed and adapted dotfiles functionality over the years. Thanks to Sara Price, James Chua, Henry Sleight and Dan Valentine for providing feedback on this post.