https://github.blog/2021-06-29-introducing-github-copilot-ai-pair-programmer/
GitHub Copilot is powered by OpenAI Codex, a new AI system created by OpenAI. OpenAI Codex has broad knowledge of how people use code and is significantly more capable than GPT-3 in code generation, in part, because it was trained on a data set that includes a much larger concentration of public source code.
Will Copilot or similar systems become ubiquitous in the next few years? Will they increase the speed of software development or AI research? Will they change the skills necessary for software development?
Is this the first big commercial application of the techniques that produced GPT-3?
For anyone who's used Copilot, what was your experience like?
(Disclaimer: I work at OpenAI, and I worked on the models/research behind Copilot. You should probably model me as a biased party.)
I'll take the other side of that bet (the null hypothesis), provided the "significantly" unpacks to something reasonable. I'll possibly even pay to hire the contractors to run the experiment.
I think people make a lot of claims about new tech that's supposed to have a significant impact, and those claims end up falling flat. A new browser will revolutionize this or that; a new web programming library will make apps significantly easier to build, etc.
I think a good case in point is TypeScript. JavaScript is the most common language on the internet. TypeScript adds static typing (and all sorts of other compile-time guarantees) and has been around for a while. However, I would not say that TypeScript has significantly impacted the security/infosec situation.
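To make that concrete, here's a minimal TypeScript sketch (the function names are hypothetical, purely for illustration): static types catch a whole class of bugs at compile time, but code containing a classic security flaw still type-checks cleanly.

```ts
// Static typing catches this class of bug at compile time; plain
// JavaScript would coerce the argument and fail silently at runtime.
function hashPassword(password: string): string {
  // Hypothetical stand-in for a real hashing routine.
  return `hashed:${password}`;
}

// hashPassword(12345); // compile error: number is not assignable to string

// But the type system says nothing about security: this compiles
// cleanly even though it builds a classic SQL-injectable query string.
function findUser(name: string): string {
  return `SELECT * FROM users WHERE name = '${name}'`;
}

console.log(findUser("alice'; DROP TABLE users; --"));
```

In other words, the tooling eliminated one category of defect without moving the needle on the vulnerabilities that actually dominate infosec.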
My prediction is that Copilot will not significantly affect the computer security/infosec situation.
It's worth separating out that this line of research -- in particular, training large language models on code data -- probably has many more possible avenues of impact than a code completer in VS Code. My prediction is not about the sum of all large language models trained on code data.
I do think we agree that it would be good if models always produced the code-we-didn't-even-know-we-wanted, but for now I'm a little wary of models that can do things like optimize code outside of our ability to notice/perceive.