TLDR: We’re hiring two research assistants to work on advancing developmental interpretability and other applications of singular learning theory to alignment. 

About Us

Timaeus’s mission is to empower humanity by making breakthrough scientific progress on alignment. Our research focuses on applications of singular learning theory to foundational problems within alignment, such as interpretability (via “developmental interpretability”), out-of-distribution generalization (via “structural generalization”), and inductive biases (via “geometry of program synthesis”). Our team spans Melbourne, the Bay Area, London, and Amsterdam, collaborating remotely to tackle some of the most pressing challenges in AI safety.

For more information on our research and the position, see our Manifund application, this update from a few months ago, our previous hiring call, and this advice for applicants.

Position Details

  • Title: Research Assistant
  • Location: Remote
  • Duration: 6-month contract with potential for extension.
  • Compensation: Starting at $35 USD per hour as a contractor (no benefits).
  • Start Date: As soon as possible.

Key Responsibilities

  • Conduct experiments using PyTorch/JAX on language models ranging from small toy systems to billion-parameter models.
  • Collaborate closely with a team of 2-4 researchers.
  • Document and present research findings.
  • Contribute to research papers, reports, and presentations.
  • Maintain detailed research logs.
  • Assist with the development and maintenance of codebases and repositories.

Projects

As a research assistant, you would likely work on one of the following two projects/research directions (this is subject to change):

  • Devinterp of language models: (1) Continue scaling up techniques like local learning coefficient (LLC) estimation to larger models to study the development of LLMs in the 1-10B range. (2) Work on validating the next generations of SLT-derived techniques such as restricted LLC estimation and certain kinds of weight- and data-correlational analysis. This builds towards a suite of SLT-derived tools for automatically identifying and analyzing structure in neural networks. 
  • SLT of safety fine-tuning: Investigate the use of restricted LLCs as a tool for measuring (1) reversibility of safety fine-tuning and (2) susceptibility to jailbreaking. Having validated many of our predictions around SLT, we are now working hard to make contact with real-world safety questions as quickly as possible.

See our recent Manifund application for a more in-depth description of this research.

Qualifications

  • Strong Python programming skills, especially with PyTorch and/or JAX.  
  • Strong ML engineering skills (you should have completed at least the equivalent of a course like ARENA).
  • Excellent communication skills. 
  • Ability to work independently in a remote setting.
  • Passion for AI safety and alignment research. 
  • A Bachelor's degree or higher in a technical field is highly desirable.
  • Full-time availability.

Bonus:

  • Familiarity with using LLMs in your workflow is not necessary but a major plus. 
  • Knowledge of SLT and Developmental Interpretability is not required, but is a plus.

Application Process

Promising candidates will be invited for an interview consisting of:

1. A 30-minute background interview, and

2. A 30-minute research-coding interview to assess problem-solving skills in a realistic setting (i.e., you will be expected to use LLMs and whatever else you can come up with).

Apply Now

Interested candidates should submit their applications by July 31st. To apply, please submit your resume, write a brief statement of interest, and answer a few quick questions here.

Join us in shaping the future of AI alignment research. Apply now to be part of the Timaeus team!

Strong ML engineering skills (you should have completed at least the equivalent of a course like ARENA).

What other courses would you consider equivalent?

To be clear, I don't care about the particular courses, I care about the skills. 

The application link in the second-last paragraph doesn't work. I see "This form can only be viewed by users in the owner's organisation".

This has been fixed, thanks.