Will a US patent be litigated? AI can predict this with an accuracy of 79%

Patent litigation remains common in the United States, with just under 4,000 cases filed in 2020, an increase of 11% on the 2019 figure. This is despite patent litigation being expensive, with costs ranging from USD 700,000 to USD 4 million or more.

Given these costs, and given that litigated patents may be more valuable (on average) than patents that are not litigated, it would be worthwhile to be able to predict which patents will be litigated. Such predictions can help patent owners put appropriate risk management strategies in place, and can help IP insurers price insurance premiums.

I therefore investigated whether such predictions are possible using machine learning models.

A dataset of litigated patents

I used Patseer (a patent search database) to search for US patents granted after 2011 that have been litigated, and found around 18,000 such patents. The leading owners of these patents were AbbVie (pharmaceuticals), Universal Entertainment (a Japanese developer of gaming technologies) and Intellectual Ventures (a patent investor).

The most common technical area for the litigated patents was computer technology, followed by pharmaceuticals and biotechnology.

Building a prediction model

To create a prediction model, I took the above dataset of 18,000 litigated patents and added a list of 18,000 ‘randomly selected’ non-litigated US patents with a very similar age profile (I did this by adding 10 to each of the patent numbers of the litigated patents). This led to a final dataset of around 36,000 patents.
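As a rough illustration, the control-selection step could be sketched in Python as below. The patent numbers are made up, the function name `build_control_set` is hypothetical, and the check that skips candidates which are themselves litigated is an assumption added for the sketch, not a description of the exact procedure used.

```python
def build_control_set(litigated_numbers, offset=10):
    """Return nearby patent numbers to use as non-litigated controls.

    Assumes patent numbers are plain integers (e.g. 9000000 for US9000000B2).
    """
    litigated = set(litigated_numbers)
    controls = []
    for number in sorted(litigated):
        candidate = number + offset
        # Assumption: skip candidates that are themselves litigated,
        # so that the two classes stay disjoint.
        if candidate in litigated:
            continue
        controls.append(candidate)
    return controls


# Made-up patent numbers, for illustration only.
litigated = [9000000, 9000010, 9000500]
print(build_control_set(litigated))  # [9000020, 9000510]
```

Because US patent numbers are issued roughly in date order, a patent numbered 10 higher was granted at almost the same time, which is what keeps the age profiles of the two groups similar.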

I extracted bibliographical data for all 36,000 patents from Patseer.

I then applied a range of machine learning (often referred to as ‘artificial intelligence’, or AI) models to this data, with the outcome being a predicted likelihood of litigation. The final model had an accuracy score of 79%.
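To make the accuracy figure concrete, here is a minimal sketch of how an accuracy score can be computed from predicted probabilities. The 0.5 cut-off, the function name and the example values are all assumptions for illustration; they are not taken from the actual model.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Share of patents whose thresholded prediction matches the true label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    correct = sum(
        (p >= threshold) == bool(y) for p, y in zip(probabilities, labels)
    )
    return correct / len(labels)


# Made-up predictions for ten patents (1 = litigated, 0 = not litigated).
probs = [0.97, 0.81, 0.64, 0.55, 0.30, 0.72, 0.41, 0.22, 0.08, 0.03]
labels = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

print(accuracy(probs, labels))        # 0.8
print(accuracy([1.0] * 10, labels))   # 0.5 (always predicting "litigated")
```

The second call shows why a balanced dataset matters: with equal numbers of litigated and non-litigated patents, always guessing one class scores about 50%, so that is the baseline against which 79% should be read.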

So what does an accuracy of 79% mean in practice?

This can be explained by the figure below, which shows the range of predicted probabilities on the X-axis, grouped into a series of ‘buckets’ (a predicted 0 to 5% probability of litigation, 5 to 10%, etc.), while the Y-axis shows how many of the patents in each bucket were actually litigated, or not.

Looking at this in more detail:

The orange bars show the patents that were not litigated in reality. If we look at the left-most bar (0 to 5% predicted probability of litigation), we can see that the vast majority of these patents (355 out of 362, to be precise) were not litigated.
The blue bars show the patents that were litigated in reality. If we look at the right-most bar (95% to 100% predicted probability of litigation), we can see that the vast majority of these patents (986 out of 1003) were litigated.
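The bucketing behind this kind of chart can be sketched as follows. This is an illustrative reconstruction under assumptions: the function name, the 20 equal-width buckets and the small example dataset are hypothetical, and only the two pairs of counts at the end come from the figures quoted above.

```python
from collections import Counter


def bucket_counts(probabilities, labels, n_buckets=20):
    """Count litigated and not-litigated patents per probability bucket.

    Keys are (bucket_index, was_litigated); bucket 0 covers 0 to 5%,
    bucket 19 covers 95 to 100% when n_buckets is 20.
    """
    counts = Counter()
    for p, y in zip(probabilities, labels):
        # A probability of exactly 1.0 goes into the top bucket.
        index = min(int(p * n_buckets), n_buckets - 1)
        counts[(index, bool(y))] += 1
    return counts


# Made-up predictions (1 = litigated, 0 = not litigated).
probs = [0.02, 0.04, 0.03, 0.97, 0.99, 0.96, 0.51]
labels = [0, 0, 1, 1, 1, 0, 1]
counts = bucket_counts(probs, labels)
print(counts[(0, False)], counts[(19, True)])  # 2 2

# Shares for the two end buckets, using the counts quoted above.
print(f"{355 / 362:.1%}")    # 98.1% of the lowest bucket were not litigated
print(f"{986 / 1003:.1%}")   # 98.3% of the highest bucket were litigated
```

In both end buckets roughly 98% of the patents fall on the expected side, which is why the model is most useful for patents scored near 0% or near 100%.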

So the prediction of litigation is working as we might expect. The graph also shows that the prediction is not perfect but, in my view, it is definitely helpful.

Some examples of prediction scores

The utility of these prediction scores might be easier to understand if we consider some examples.

The patent in the database with the highest predicted probability of being litigated (99.4%) was US9205056B2, filed by Purdue Pharma for Controlled release hydrocodone formulations. This patent was litigated against Alvogen Pine Brook LLC, and it is believed that the litigation was settled.
The patent in the database with the second-highest predicted probability of being litigated (99.3%) was US8192756B2, filed by Depomed for a Gastric retained gabapentin dosage form. Both Google Patents and RPX show that this patent was litigated, but I could not find any further publicly available details.
The patent in the database with the third-highest predicted probability of being litigated (99.2%) was US9131045B2, filed by Ultratec for a System for text assisted telephony. This patent was challenged by CaptionCall LLC, which led to all claims being ruled unpatentable after a PTAB Inter Partes Review.
The patent in the dataset with the lowest predicted probability of being litigated (1.2%) was US7886701B2, filed by Schaeffler Technologies for an Apparatus for the variable setting of the control times of gas exchange valves of an internal combustion engine. This patent was not litigated.

What can we do with this model?

This machine learning model can predict, with an accuracy of 79%, whether or not a granted US patent will be litigated. We can now use this model to make predictions for other US patents. The outcome will be a percentage likelihood of litigation.

I expect that these predictions will be helpful to patent attorneys, patent owners and patent insurers. If you think these predictions will be helpful to you, please contact me and I would be happy to provide litigation prediction scores for patents that you nominate.
