I may be completely confused about this, but my model of technological breakthroughs in history was basically this: A few guys independently connect the dots leading to a new invention, for example the telephone, approximately at the same time. One of them runs to the patent office a little faster than the others, and he gets the patent first. Now he gets to be forever known as the inventor of the telephone, and the rest of them are screwed; if they ever try to sell their own inventions, they will probably get sued to bankruptcy.

Today, we have a few different companies selling AIs (LLMs). What is different this time?

  • Is my model of the history wrong?
  • Is the current legal situation with patents different?
  • Are LLMs somehow fundamentally different from all other previous inventions?
  • Is it the fact that anyone can immediately publish all tiny partial steps in online journals, that makes it virtually impossible for any individual or institution to legally acquire the credit -- and the monopoly -- for the entire invention?
  • Or something else...?

4 Answers

faul_sname


Is the current legal situation with patents different?

My understanding is that Google did patent transformers, but the patent explicitly covered only encoder/decoder architectures. GPT-2, for example, uses a decoder-only architecture and so is not covered by that patent (and it would have been very hard for OpenAI to obtain and defend a patent for decoder-only transformers, given Google's prior art).

If your question is, instead, "why didn't the first person to come up with the idea of using computers to predict the next element in a sequence patent that idea, in full generality", keep in mind that (POSIWID aside) patents are intended "to promote the progress of science and useful arts". They are not meant as a way of allowing the first person to come up with an idea to prevent all further research in vaguely adjacent fields.

As a concrete example of the sorts of things patents don't do, take O'Reilly v. Morse, 56 U.S. 62 (1853). In his patent application, Morse claimed:

Eighth. I do not propose to limit myself to the specific machinery or parts of machinery described in the foregoing specification and claims; the essence of my invention being the use of the motive power of the electric or galvanic current, which I call electro-magnetism, however developed for marking or printing intelligible characters, signs, or letters, at any distances, being a new application of that power of which I claim to be the first inventor or discoverer.

The court's decision stated:

If this claim can be maintained, it matters not by what process or machinery the result is accomplished. For aught that we now know some future inventor, in the onward march of science, may discover a mode of writing or printing at a distance by means of the electric or galvanic current, without using any part of the process or combination set forth in the plaintiff's specification. His invention may be less complicated-less liable to get out of order-less expensive in construction, and its operation. But yet if it is covered by this patent the inventor could not use it, nor the public have the benefit of it without the permission of this patentee. [...] In fine, he claims an exclusive right to use a manner and process which he has not described and indeed had not invented, and therefore could not describe when he obtained his patent. The court is of opinion that the claim is too broad, and not warranted by law.

...which might have something to do with autoregressive language models being more popular than encoder/decoder ones.

"why didn't the first person to come up with the idea of using computers to predict the next element in a sequence patent that idea, in full generality"

 

Patents are valid for about 20 years. But Bengio et al. used neural networks to predict the next word back in 2000:

https://papers.nips.cc/paper_files/paper/2000/file/728f206c2a01bf572b5940d7d9a8fa4c-Paper.pdf

So this idea is old. Only some specific architectural aspects are new.

patents are intended "to promote the progress of science and useful arts". 

I knew this was how patents are supposed to work in theory, but I also assumed that the actual practice is different. People complain about patent trolls, patents being granted for trivial applications of existing ideas, patent claims written in a maximally vague way that later allows lawyers to claim that they apply to all kinds of things that the patent owner didn't even think about at the time, etc.

Amazon had "one click" patented; how did that promote the progress of science and useful arts?

[anonymous]
All of these things indeed happen, but if they get resolved, this tends to happen in subsequent litigation for patent infringement, in which the party that gets accused of infringing raises the defense of invalidity, which then gets resolved by factfinders and courts.

In practice, it is relatively easy to get a patent approved because this is initially not an explicitly adversarial process: the PTO (Patent and Trademark Office, in the US) simply reviews your patent claim and says 'yes'/'no' without usually getting direct input from your competitors/adversaries/random other people that might publicly assert your patent is nonsense.

But a patent alone does not physically cause most meaningful stuff to happen: in order to actually exclude others from making, buying, or selling the invention, you need to file a specific cause of action in court. And that's when the bogus patent claims are usually brought down (if the alleged infringer fights back): they are superficially reasonable enough to get past the PTO, but not past an explicitly adversarial process in which the opponent's attorney explains to an unbiased and experienced judge why the patent is invalid.

So what's the whole deal about patent trolls and other stuff like that? Well, it goes back to a clause I wrote in my first paragraph: "if they get resolved." Note that, in the story I told above, it might be easy to defeat a bogus patent in court, but you must generally still go to court in the first place. And this is a significant deterrent in many situations because of the necessary investments of time and resources, such as money. This becomes particularly prohibitive given the American rule that governs most situations that arise in the US and basically says that each party is responsible for paying its own attorney's fees (barring exceptional circumstances), regardless of who wins the case. So patent trolling persists not because terrible patents routinely survive close scrutiny, but instead because, a
Viliam
OK, now it seems to me that the nature of the patent battle is different when it is "inventor vs inventor" or "corporation vs corporation". In a "corporation vs corporation" battle, stupid patents are destroyed in court. In an "inventor vs inventor" battle, if the first inventor becomes a successful entrepreneur (or joins forces with one), it becomes an asymmetric "corporation vs other inventors" battle, and the other inventors lose. So I guess the answer to my original question is: because this time, multiple corporations immediately saw that this is going to be super profitable, so they keep each other in check. (I suppose if there were some genius in a garage with some revolutionary ideas trying to compete with the established AI companies, he would still get asymmetrically squashed like a bug... maybe using patents, maybe something else.)

Matthew Barnett


What is different this time?

I'm not confident in the full answer to this question, but I can give some informed speculation. AI progress seems to rely principally on two driving forces:

  • Scaling hardware, i.e., making training runs larger, increasing model size, and scaling datasets.
  • Software progress, which includes everything from architectural improvements to methods of filtering datasets.

On the hardware scaling side, there's very little that an AI lab can patent. The hardware itself may be patentable: for example, NVIDIA enjoys a patent on the H100. However, scaling hardware and training for longer are abstract ideas, which generally cannot be patented. This may help explain why NVIDIA currently has a virtual monopoly on producing AI GPUs, while there is essentially no legal barrier to entry for simply using NVIDIA's GPUs to train a state of the art LLM.

On the software side, it gets a little more complicated. US courts have generally held that abstract specifications of algorithms are not subject to patents, even though specific implementations of those algorithms are often patentable. As one Federal Circuit Judge has explained,

In short, [software and business-method patents], although frequently dressed up in the argot of invention, simply describe a problem, announce purely functional steps that purport to solve the problem, and recite standard computer operations to perform some of those steps. The principal flaw in these patents is that they do not contain an "inventive concept" that solves practical problems and ensures that the patent is directed to something "significantly more than" the ineligible abstract idea itself. See CLS Bank, 134 S. Ct. at 2355, 2357; Mayo, 132 S. Ct. at 1294. As such, they represent little more than functional descriptions of objectives, rather than inventive solutions. In addition, because they describe the claimed methods in functional terms, they preempt any subsequent specific solutions to the problem at issue. See CLS Bank, 134 S. Ct. at 2354; Mayo, 132 S. Ct. at 1301-02. It is for those reasons that the Supreme Court has characterized such patents as claiming "abstract ideas" and has held that they are not directed to patentable subject matter.

This generally limits the degree to which an AI lab can patent the concepts underlying LLMs, and thereby try to restrict competition via the legal process. 

Note, however, that standard economic models of economies of scale generally predict high market concentration (a few large firms) in capital-intensive industries, which seems to be true for AI as a result of massive hardware scaling. This happens even in the absence of regulatory barriers or government-granted monopolies, and it predicts what we observe fairly well: a small number of large companies at the forefront of AI development.

dr_s


I think your model only applies to some famous cases, but ignores others. Who invented computers? Who invented television networks? Who invented the internet?

Lots of things have inventors and patents only for specific chunks of them, or specific versions, but are as a whole too big to be encompassed. They're not necessarily very well defined technologies, but systems and concepts that can be implemented in many different ways. In these fields, focusing on patents is likely to be a losing strategy anyway: you'll simply stand still to protect your one increasingly obsolete good idea, like Homer Simpson in front of his sugar, while everyone else runs circles around you with their legally distinct versions of the same thing that they keep iterating and improving on. I think AI and even LLMs fall under this category. It's specifically quite hard to patent algorithms - and good thing too, or it would really have a chilling effect on the whole field. I think you can patent only a specific implementation of them, but that's very limited; you can't patent the concept of a self-attention layer, for example, as that's just math. And that kind of thing is all it takes to build your own spin on an LLM anyway.
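
To illustrate the "just math" point (a rough sketch, not any particular lab's implementation): single-head scaled dot-product self-attention comes down to a few matrix products and a row-wise softmax. The helper names and the toy matrices below are made up for the example; the `causal` flag adds the mask that decoder-only models use so each position only attends to itself and earlier positions.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability; the result sums to 1.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def self_attention(x, w_q, w_k, w_v, causal=False):
    """Return (outputs, attention_weights) for a sequence x of token vectors."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = []
    for i, row in enumerate(scores):
        scaled = [s / math.sqrt(d_k) for s in row]
        if causal:
            # Mask out later positions so token i cannot look ahead.
            scaled = [s if j <= i else float("-inf") for j, s in enumerate(scaled)]
        weights.append(softmax(scaled))
    return matmul(weights, v), weights

# Toy usage: three 2-dimensional "tokens" and identity projections (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
outputs, attn = self_attention(tokens, identity, identity, identity, causal=True)
for row in attn:
    print([round(p, 3) for p in row])
```

Nothing here is specific to one vendor's product: the projections, the scaling by the square root of the key dimension, and the softmax are all ordinary linear algebra, which is the sense in which the concept itself looks more like a formula than a machine.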

ponkaloupe


don't forget the political environment:
- locally, there's a meaningful "break up big tech" current which could make it politically difficult to simultaneously sell AI as a paradigm shift and monopolize it for yourself via the legal apparatus. cynically, firms might view regulation as a path to achieve similar ends but with fewer political repercussions, less blatant than if they leveraged patents.
- globally, the country which presently enjoys the lead in AI sees itself in an economic battle against a competitor unlikely to respect its intellectual property claims. to the degree that states view AI through any lens related to "national defense", there will be some push to maintain competitiveness at least on the global stage.

6 comments

Your history is definitely wrong. Patents don't enforce themselves. Hollywood is on the West Coast to put physical distance between itself and Edison's lawyers and muscle. The Wright brothers went down in history as the inventors of the airplane, but they wasted the rest of their lives fighting over the patents.

Linchpin patents are rare. Maybe you patent one invention to make it just barely work, but that's not the end of the story. Someone else patents something else needed to make it scalable. Now there are two patents and a bilateral monopoly.

None of this is to say that patents were unimportant, so it's not an answer at all.

[Epistemic Status: extremely not endorsed brain noise] New EA cause area just dropped! Do lots of cutting edge algorithmic AI research, and then publish that research, but patent your published research and become a patent troll!

Specifically, Eliezer should copyright the idea of "using the AI to destroy humanity". Then none of the AI companies will be legally allowed to do it! Problem solved.

dr_s

Omnicide I can get behind, but patent infringement would be a bridge too far!

That's the proper dystopian capitalism.

Result: humanity is destroyed as soon as the patent expires.