For 1., you could use the existing LM to judge whether each training datum ought to be included (rough sketch below), or you could curate less aggressively than GPT-3 did, e.g., by also including Reddit links with <3 karma.
For 2., you mean it should take natural-language peer-review-like feedback?
For 3., I suspect that performance on such tasks does scale, given a different prompt.
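To make point 1 a bit more concrete, here's a minimal sketch of the "use the existing LM to judge each training datum" idea, assuming GPT-2 via Hugging Face Transformers and a simple perplexity threshold; the threshold and the example documents are placeholders, and this is one way to implement it rather than a full recipe:

```python
# Minimal sketch of point 1: use an existing LM (GPT-2 via Hugging Face
# Transformers) to judge whether each candidate training document should be
# kept, here by thresholding its perplexity. The threshold and the example
# documents are placeholders, not a tuned pipeline.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str, max_length: int = 512) -> float:
    """Per-token perplexity of `text` under the pretrained LM."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
    with torch.no_grad():
        # Passing labels makes the model return the mean token-level cross-entropy.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return float(torch.exp(loss))

def filter_documents(docs, threshold: float = 100.0):
    """Keep only documents the LM finds sufficiently likely (placeholder threshold)."""
    return [doc for doc in docs if perplexity(doc) < threshold]

if __name__ == "__main__":
    candidates = [
        "Language models can be finetuned to follow natural-language feedback.",
        "asdf qwerty 12345 zxcv !!!! lorem",  # likely judged low-quality
    ]
    print(filter_documents(candidates))
```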
I'm Ethan Perez, a final-year PhD student at NYU, working on aligning language models with human preferences. I'm looking to hire research interns to work on projects in this space, starting early 2022.

I expect candidates to have strong software engineering ability, whether for ML engineering (e.g., finetuning GPT-2 to good performance on a new task) or data engineering (e.g., quickly finding high-quality subsets of text within petabytes of Common Crawl data). Ideal candidates will have experience doing ML and/or NLP research, reading papers, and coming up with ideas to test.

I'm looking for people who'd be able to work full-time (remotely), with compensation of $70/hour. I expect each project to take 4-8 months to complete and lead to a first-author publication at a top-tier ML or NLP conference.
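To give a rough sense of the ML-engineering example above, here's a minimal sketch of finetuning GPT-2 on a new text dataset with Hugging Face Transformers and Datasets; the dataset (WikiText-2 as a stand-in) and the hyperparameters are placeholders, not part of any particular project:

```python
# Minimal sketch: finetune GPT-2 on a new text dataset with Hugging Face
# Transformers/Datasets. WikiText-2 and the hyperparameters below are
# placeholders standing in for whatever task a project actually needs.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Placeholder dataset: any dataset with a "text" column works the same way.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
).filter(lambda example: len(example["input_ids"]) > 0)  # drop empty lines

# mlm=False gives the causal-LM objective (labels copied from inputs;
# the model handles the shift internally).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-finetuned",  # placeholder path
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=5e-5,
    logging_steps=100,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```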
Below are a few examples of projects I have in mind:
If what I've described sounds like a good fit, I'd love to hear from you over email! Just send me your website, resume, LinkedIn, GitHub, Google Scholar, or anything else that might be helpful for understanding your background (my email is perez at nyu dot edu).