I first heard about OpenAI’s “research preview” of a remarkably human-like chatbot around the time I was attending NeurIPS 2022. My initial reaction was somewhat skeptical. I had worked on conversational agents as my master’s research at Carnegie Mellon University (CMU) back in 2016–2017, and at that time, the quality of most systems was pretty underwhelming. For context, Amazon Alexa once hosted a multimillion-dollar challenge for universities to build a version of Alexa that could hold a coherent 20-minute conversation—yet the winning team only managed about ten and a half minutes.
So, I expected this new chatbot to be just another clunky model. But after trying it—like everyone else—I was completely blown away. It quickly became clear we were on the verge of a new era in AI. Since then, the entire industry has shifted course, with every major player now racing to build their own GPT-like systems.
A Shift From Open to Closed?
In August 2024, Elon Musk filed a lawsuit in Northern California against OpenAI and two of its founders, Sam Altman and Greg Brockman, claiming that the company had strayed from its founding nonprofit mission and put profits ahead of the public good. Musk was an early investor and co-chaired OpenAI’s board when it launched in 2015. In his lawsuit, he described it as a “textbook tale of altruism versus greed,” alleging that Altman and Brockman “intentionally courted and deceived” him about the nature of OpenAI’s transition to a for-profit structure. Some people speculated he just had FOMO, given he was once closely involved with what might be humanity’s most powerful technology. Whatever the case, in ChatGPT’s early days, I could understand OpenAI’s reasoning: they worried that giving everyone access to such potent models could lead to malicious uses. Perhaps they wanted to fully understand the risks before going completely open-source.
I used to love OpenAI for posting nearly every idea and contribution publicly. Hearing that they might be shifting further from open-source to closed-source is perplexing—and it sparks a lot of questions in my mind. I’m not here to disrespect or accuse OpenAI of any wrongdoing; I truly believe they’re working for the greater good. But I am curious: Why can’t OpenAI remain open-source? Here are some of the questions I keep coming back to:
Capital Intensity and Open Sourcing
In an interview, Sam Altman mentioned that building and training such models requires vast resources. Does this imply that open-source labs, which often work with fewer funds, can’t develop large-scale models? This seems at odds with the progress made by open-source communities, who regularly share code, model weights, and training data.
Investor Pressure vs. Nonprofit Ambitions
I don’t know the details of OpenAI’s arrangement with Microsoft or other backers, but could investors be driving them toward a more conventional for-profit structure? That seems somewhat unlikely to me since a nonprofit approach might still have attracted enough funding or donations (though maybe not at the scale of a major corporate partnership).
If you have any insights or new questions about this, I’d love to hear them. I’m writing all this from a place of curiosity (and maybe naïve hope) that open-source remains an option for OpenAI.
What If OpenAI Was Open Source?
The open-source community is far bigger and more passionate than most people realize. People pour their time and energy into building, breaking, and experimenting with projects for nothing more than pure curiosity. If OpenAI had maintained a fully open-source philosophy, I can’t help but wonder how much more quickly we might have approached a true form of AGI.
Imagine if OpenAI had released not only its code but also the model weights for GPT-4. Everyone, from researchers in academia to startups, would be able to delve into these architectures without bearing the astronomical cost of training them from scratch. We already know these Large Language Models (LLMs) exhibit surprising (and sometimes counterintuitive) behaviors. In fact, the training of modern LLMs often contradicts what traditional machine learning conventions would suggest; for example, a nice paper by DeepMind demonstrates patterns that clash with standard ML wisdom. If GPT-4 were fully open, researchers worldwide could investigate and possibly explain these peculiar behaviors, speeding up our collective understanding and advancement in the field.
Bridging Curiosity and Concerns
Open access could also temper the public’s fears and misconceptions. Transparency often counters misinformation: the more people who can rigorously test and inspect a system, the less it seems like impenetrable “magic.” That, in turn, reduces the speculative doomsday scenarios and encourages real, data-driven discussions about a model’s capabilities and risks.
From my perspective, these AI systems aren’t magical; they’re powerful algorithms whose performance keeps exceeding expectations at a breathtaking pace. Still, they remain a mystery to many until a major player like OpenAI decides to open up more extensively. Such openness might be the key to both accelerating safe progress and fostering trust among researchers, developers, and the public.
Are there other reasons OpenAI might be moving away from open-source?