User Comment Replies

Task: Suggest surprising experiments that challenge assumptions

Context: A researcher is considering an alignment proposal that hinges on some key assumptions. They would like to see suggestions for experiments (either theoretical thought experiments or actual real-world experiments) that could challenge those assumptions. If an experiment has already been done, the output should report its results.

Input type: An assumption about a powerful AI system

Output type: a suggestion for an experiment that could challenge that assumption. If it has been done already, the re... (read more)

Prize for Alignment Research Tasks

Nicholas Schiefer 3y 30

Task: Apply an abstract proposal to a concrete ML system

Context: A researcher is reading a highly theoretical alignment paper and is curious about whether and how it might apply to a real-world machine learning system, such as a large transformer trained using SGD. They would like to see which parts of the ML system would change under this proposal.

Input type: a theoretical alignment proposal and a description of an ML system

Output type: a description of how the ML system would change under the given proposal

Info constraints: none

Instance 1:

Input:

Abstract proposal... (read more)

Prize for Alignment Research Tasks

Nicholas Schiefer 3y 50

Task: Convert mathematical expressions into natural language

Context: A researcher is reading a paper about alignment that contains a lot of well-specified but dense mathematical notation. They would like to see a less terse and more fluent description of the same idea that’s easier to read, similar to what a researcher might say to them at a blackboard while writing the math. This might involve additional context for novices.

Input type: a piece of mathematically-dense but well-specified text from a paper

Output type: a fluent, natural language description o... (read more)

1Alex Lawsen 3y

I love this idea mostly because it would hugely improve screen reader options for alignment research.

All of Nicholas Schiefer's Comments + Replies