In our first blog in this multi-part series, we explored key considerations for protecting artificial intelligence (“AI”) inventions in biotech and synthetic biology. In this part 2 of the series, we will examine some key considerations and hurdles in patenting machine learning-based biotech or synthetic biology inventions.
In this series, we are focusing on artificial intelligence inventions, but as Marvin Minsky aptly pointed out, that neologism is a “suitcase” term because you can stuff a lot of intelligence classifications and different types of technologies into it. Many of the ground-breaking AI developments in biotech are in the AI subfield of Machine Learning. First, we will briefly discuss what is meant by Machine Learning and define some relevant terms. Second, we will review some real-world challenges in patenting AI inventions.
What is Machine Learning?
Machine learning (“ML”) is basically a term to cover algorithms that use statistics to find and apply patterns in digitally stored data, which can be images, numbers, words, etc. (For a user-friendly overview on the different terms, please see Karen Hao’s article “What is Machine Learning?” from the MIT Tech Review, available here.) Deep learning is a subfield of machine learning.
There are three general types of ML algorithms: Supervised Learning, Unsupervised Learning, and Reinforcement Learning. The MIT Tech Review published this helpful flow chart to explain what kind of ML the algorithm is using, though if you want a more technical explanation this is a helpful resource.
An ML algorithm is a way of classifying information, and a neural network is a type of algorithm that is meant to classify information the same way a human brain does. For example, a neural network can look at pictures, recognize certain elements such as pixel colors, and classify the pictures according to what they show. Neural networks are made up of nodes. A node is an individual computation in which an algorithm assigns significance (or weight) to each piece of input data. The sum of those weighted inputs is then passed through an activation function, which determines what, if anything, is done with the output.
Here’s a diagram of what one node might look like:
Image Credit: Skymind
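To make the description of a node concrete, here is a minimal sketch in Python. It is only an illustration, not code from any of the sources cited in this post: the function names, the example numbers, the sigmoid activation, and the bias term (a constant offset that most neural networks add, but which is not discussed above) are all our own assumptions.

```python
import math


def sigmoid(z):
    """A common activation function: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node(inputs, weights, bias=0.0, activation=sigmoid):
    """One node: weight each input, sum the results, then apply the activation.

    The bias is an assumption added for illustration; it is a constant
    offset that is learned alongside the weights in most networks.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Hypothetical inputs (e.g. three pixel intensities) and hypothetical weights.
output = node([0.5, -1.0, 2.0], [0.4, 0.3, 0.1])
print(output)
```

In a real system the weights are not chosen by hand as they are here; they are adjusted automatically during training.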
A neural network is several nodes connected together, typically arranged in layers. A network is generally described as Deep Learning (“DL”) when more than three layers are stacked.
Image Credit: Oracle
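Stacking can be sketched the same way. In the hypothetical example below, each layer is a group of nodes that all read the same inputs, and the outputs of one layer become the inputs of the next. The layer sizes and weights are arbitrary numbers chosen for illustration, and the helper names are our own. Counting the input values as a layer, this toy network has four layers.

```python
import math


def sigmoid(z):
    """Activation function that maps any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """Run one layer: every node takes a weighted sum of the same inputs."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weight_rows, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through each layer in turn."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 nodes -> 2 nodes -> 1 output node.
network = [
    ([[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]], [0.0, 0.0, 0.0]),
    ([[0.1, 0.2, 0.3], [-0.3, 0.5, 0.2]], [0.0, 0.0]),
    ([[0.6, -0.6]], [0.0]),
]

result = forward([1.0, 0.5], network)
print(result)
```

Production deep learning models follow the same pattern but with many more nodes per layer, and they rely on specialized libraries rather than hand-written loops.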
DL has spawned many of the most significant advancements in biotech in the past few years and is continuing to drive advancements. For example, DL can predict how genetic variation alters cellular processes involved in pathogenesis, use patient data to characterize disease progression, or speed up computational methods to predict protein structure.
Patenting Machine Learning Inventions
Applying for patent protection presents certain risks, especially for computer-based inventions. If your invention is merely a way to improve the functioning of a computer, without tying it to a practical application, then there is a significant risk that the patent office may ultimately reject the application because it is based on ‘ineligible subject matter.’ Abstract ideas are subject matter that is ineligible for patent protection and can include mental processes (concepts performed by the human mind), methods of organizing human activity (such as fundamental economic concepts or managing interactions between people), or mathematical relationships, formulas or calculations. This last category is particularly important to AI-based inventions. For example, under U.S. law, an invention that is a stand-alone algorithm is likely to be seen as no more than abstract mathematics and, therefore, not eligible for patent protection.
Mathematical calculations that can be performed by the human mind “are the basic tools of scientific and technological work,” which are “free to all men and reserved exclusively to none.” Mayo Collaborative Servs. v. Prometheus Labs., 566 U.S. 66 (2012). This may seem an absurd restriction to some, since a human mind could carry out the millions of calculations a neural network performs only in theory, with no guarantee of finishing them in one lifetime. However, permitting patents on basic calculations would cripple scientific exploration and advancement. Therefore, to be eligible for patent protection, an invention centered on an algorithm must significantly advance a specific technical application, not merely use an algorithm to solve a problem. The patent application must explain in detail how the claimed algorithm interacts with the physical infrastructure of the computer, network, or both, and explain the real-world problem the invention is meant to address.
As previously discussed here and here, the tying of algorithms to real world solutions is a requirement in many jurisdictions globally, including the European Patent Office (EPO) and Israel. For example, new guidelines issued by the European Patent Office stress that the AI inventions must have an application for a specific field of technology. In this respect, patent offices are taking a somewhat technical approach and considering AI elements of an invention as any other software element.
Many AI patents face an uphill battle for patentability due to the use of computer systems and algorithms and the rapidly evolving law surrounding subject matter eligibility. To address the changes in law and stem the many patent application rejections, the U.S. Patent and Trademark Office (USPTO) issued Revised Patent Subject Matter Eligibility Guidance in January 2019 and a Patent Eligibility Guidance Update in October 2019, which included examples for the revised subject matter eligibility analysis. USPTO director Andrei Iancu stated recently that rejections of AI-related patent applications have dropped from 60% to about 32% since the January 2019 guidelines were issued.
- Neural Network for Facial Detection
The USPTO’s Example 39, issued with the January 2019 Revised Patent Subject Matter Eligibility Guidance, provides a very helpful example of an allowable patent claim for a method of training a neural network for facial detection. The invention attempts to solve the problem of inaccurate facial detection by using an expanded training set of facial images and then addressing false positives by retraining the algorithm on a new set of images.
The example claim recites “A computer-implemented method of training a neural network for facial detection comprising: [a set of digital images] training the neural network in a first stage using the first training set; creating a second training set for a second stage of training comprising the first training set and digital non-facial images that are incorrectly detected as facial images after the first stage of training; and training the neural network in a second stage using the second training set.”
The USPTO analysis of this claim finds that it is patent-eligible subject matter, despite including an algorithm, because while “some of the limitations may be based on mathematical concepts, the mathematical concepts are not recited in the claims…” This shows that when an invention involves a neural network, a key focus of the claims should be the inventive means of achieving the result and not the underlying mathematical concepts. Although the claim does mention the computer-implemented method, it does not recite any mathematical relationships, formulas, or calculations.
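For readers who want to see what the two-stage procedure in the example claim looks like in practice, here is a rough sketch in Python. It is our own toy illustration, not the USPTO’s example or any patented implementation: a simple logistic-regression classifier stands in for the neural network, two-number feature vectors stand in for digital images, and every name, threshold, and data value is hypothetical.

```python
import math
import random


def predict(weights, features):
    """Return the model's estimated probability that the input is a face."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, learning_rate=0.1):
    """Fit a tiny logistic-regression model (a stand-in for a neural network)."""
    weights = [0.0] * (len(examples[0][0]) + 1)
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, features) - label
            weights[0] -= learning_rate * error
            for i, x in enumerate(features):
                weights[i + 1] -= learning_rate * error * x
    return weights


def two_stage_training(first_training_set, non_face_pool, threshold=0.5):
    # Stage 1: train on the first training set.
    stage_one = train(first_training_set)
    # Collect non-facial inputs that stage 1 incorrectly detects as faces.
    false_positives = [f for f in non_face_pool
                       if predict(stage_one, f) >= threshold]
    # Second training set = first training set + those false positives.
    second_training_set = first_training_set + [(f, 0) for f in false_positives]
    # Stage 2: train again on the second training set.
    stage_two = train(second_training_set)
    return stage_one, stage_two, second_training_set


# Hypothetical toy data: label 1 = face, label 0 = non-face.
rng = random.Random(0)
faces = [((rng.gauss(2, 1), rng.gauss(2, 1)), 1) for _ in range(30)]
non_faces = [((rng.gauss(-2, 1), rng.gauss(-2, 1)), 0) for _ in range(30)]
first_training_set = faces + non_faces
non_face_pool = [(rng.gauss(1, 1), rng.gauss(1, 1)) for _ in range(30)]

stage_one, stage_two, second_training_set = two_stage_training(
    first_training_set, non_face_pool)
print(len(first_training_set), len(second_training_set))
```

Notice that describing the method at this level (train, collect the false positives, retrain) does not require reciting any formula, which mirrors the way the example claim is drafted.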
- Deep-Learning Patents for Generating a Vaccine
One example of an invention that uses deep learning is U.S. Patent No. 10,196,427 “Epitope focusing by variable effective antigen surface concentration.” This invention “provides compositions and methods for the generation of an antibody or immunogenic composition, such as a vaccine, through epitope focusing by variable effective antigen surface concentration.”
According to the disclosure and the abstract, the invention relies heavily on “in silico bioinformatics,” meaning scientific experiments or research conducted by means of computer modeling or computer simulation in order to collect and analyze complex biological data. For example, the disclosure describes neural networks to “generate a map of the protein surfaces of a particular antigen” or to “generate an in silico library of antigenic variants.” The abstract describes one step of the invention as generating ‘in silico’ “a library of potential antigens for use in the immunogenic composition.”
However, the claims avoid tripping up on the subject matter eligibility requirement by not reciting the algorithms or the use of a computer in the claims. The claims merely describe what the computer is used to accomplish, without mentioning that the calculations are performed in silico.
For example, claim 1 recites a “method for eliciting an immune response in a human subject, the method comprising: delivering at least six antigens to the human subject, wherein each of the at least six antigens comprises: a target epitope that is common to each of the at least six antigens; and one or more non-conserved regions that are outside of the target epitope; wherein the at least six antigens are delivered such that each individual antigen of the at least six antigens is delivered in an amount that is insufficient to be immunogenic to the human subject on its own, while the at least six antigens are delivered in a combined amount that is sufficient to generate an immune response to the target epitope in the human subject.” Claim 1 and the remaining claims, all dependent, may contain limitations that are based on mathematical concepts but the claim language does not recite those mathematical concepts.
- Machine Learning for Cancer Diagnosis
Researchers have made many significant advancements in the diagnosis of different kinds of cancer through ML. Patenting inventions built on aggregated data can be a challenge, because inventions that merely present the results of collecting and analyzing information, without additional elements that identify a particular tool for presenting or applying the data, are likely to be treated as “abstract ideas.” As noted above, abstract ideas are not patentable subject matter, so an invention that involves mathematical manipulation of data, with no additional elements appended to that abstract idea, is unpatentable.
In a recent example, the U.S. Patent Trial and Appeal Board (PTAB) affirmed an Examiner’s determination that Application No. 13/417,188, aimed at using ML to modernize cancer treatment, failed subject matter eligibility. 2018 Pat. App. LEXIS 3052, *3 (PTAB April 19, 2018). In that case, the invention was a way to “connect multiple genomic alterations such as copy number, DNA methylation, somatic mutations, mRNA expression and microRNA expression” to create an “[i]ntegrated pathway analysis expected to increase the precision and sensitivity of causal interpretations for large sets of observations.”
Claim 1 of the patent application read as follows: “1. A method of conveying biological sequence data, comprising: generating a data packet including a first header containing network routing information, a second header containing header information pertaining to the biological sequence data, and a payload containing a representation of the biological sequence data relative to a reference sequence; storing the data packet in a queue in communication with a network interface; and transmitting the data packet over a network accessible through the network interface.”
The patent application was rejected by the USPTO as ineligible subject matter because the claimed “method of generating a dynamic pathway map (DPM)” was merely “algorithmic concepts involving the mathematical manipulation of data.” The Examiner determined that the “claims do not include additional elements/steps appended to the abstract idea that are sufficient to amount to significantly more than” mathematical concepts and that, even though the additional elements appended to the abstract idea integrated multiple data sources to identify reproducible and interpretable molecular signatures of tumorigenesis and progression, those elements were “routine and conventional techniques for collecting data.”
In addition to the abstract idea issues, the 13/417,188 application was also rejected by the USPTO for double patenting, which means that another patent application filed by the same inventors presumably covered the same technology. Interestingly, the USPTO issued Patent No. 10,192,641 on that other patent application. That other application included a limitation in claim 1 that reads: “formulating a treatment option for the patient based on the reference pathway activity of the factor graph, wherein at least one of the above method operations is performed through a processor.” This limitation may have provided the missing additional steps that, when appended to the abstract idea, amount to significantly more than mathematical concepts.
- Machine Learning to Create Sustainable Bioplastics
Many nascent protein engineering technology companies are developing fascinating, sustainably sourced products using ML. One such company is Arzeda, which is developing scratch-proof screens for cell phones using a renewable source you might not believe: tulips. Arzeda has ported the metabolic pathway responsible for making a natural molecule called tulipalin, found in tulips, into industrial microbes. Arzeda is harnessing the power of machine learning to combine protein design, pathway design, high-throughput (HT) screening and strain construction to create and improve designer fermentation strains for virtually any chemical.
Arzeda’s U.S. Patent No. 10,025,900 describes its invention as providing “computational methods for engineering, selecting, and/or identifying proteins with a desired activity,” but as we have seen with the other successful applications, the claims do not state the mathematical equations, but rather the process to obtain the desired results. Here is part of claim 1 of the ’900 patent:
“(c) computationally selecting one or more amino acid sequences having structural homology and/or sequence homology to the template protein having the enzymatic activity;
(d) providing a structural model for each of the amino acid sequences selected in step (c);
(e) selecting the amino acid sequences satisfying the functional site description comprising steps of computationally docking a ligand and optimizing positioning of amino acid side chains and main chain atoms of the amino acid sequences; and
(f) recombinantly expressing and confirming the enzymatic activity for at least one of the amino acid sequences that satisfies the functional site description selected from step (e), thereby making the protein having the enzymatic activity.”
Key Lessons in Patenting Machine Learning Inventions
There are two key takeaways from the USPTO guidelines and the successful ML-based patent applications. First, focus the claims on how the desired result is achieved. Like the USPTO requirements for ML patents, the EPO guidelines require an “inventive step” and a “further technical effect,” which you can read about here. Second, use caution when reciting specific mathematical equations within the claim language.
Our next post in this series will focus on the challenges and benefits of protecting your AI biotech inventions under trade secret law and on how to determine which kind of IP protection, patent or trade secret, would be most beneficial for your AI biotech invention.