http://ipkitten.blogspot.com/2024/06/using-ai-tools-to-help-assess-inventive.html

The cover article of the May 2024 edition of the CIPA Journal proposed a new test for inventive step using AI. The article was inspired by the EPO’s AI-assisted search tool, AI-PreSearch. The CIPA Journal article proposes to use an AI-derived measurement of semantic similarity between the claims and the prior art as a new test for inventive step. However, in this Kat’s view, using the amount of “similarity” between the claims and the prior art as a test for inventive step would constitute a vast oversimplification of patent law, lacking any correspondence with the established legal concepts of novelty and inventive step. The proposal presented in the CIPA Journal fails to recognise that, whilst AI search tools such as AI-PreSearch may be excellent at searching the prior art, they possess no functionality for applying complex legal tests.

EPO AI-assisted search: Language models and vector search

Last year the EPO announced the introduction of a new tool to assist Examiners in patent search. According to an article by the EPO Head of Data Science Alexander Klenner-Bajaja, the AI-assisted pre-search has a relatively simple architecture that uses vector search assisted by a machine learning language model. Details of an earlier version of the model are described in Vowinckel et al. 2023.

Multi-dimensional vector space…with cats

Vector search is a standard machine learning method whereby inputs (e.g. features, images, text) are represented as vectors and compared. In language model assisted search, the language model produces a vector representation of the input text which includes its contextual semantic information. The vectors can then be compared to each other to find semantically similar texts in an embedding space. The vector space may have many thousands of dimensions. Language model assisted vector search is a widely used technique to find and recommend personalised images, music, podcasts and even AirBnBs to users.
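
To make the idea concrete, the following is a minimal sketch of vector search by cosine similarity. It is not the EPO's implementation: the three-dimensional vectors, the document names and the helper functions `cosine_similarity` and `rank_by_similarity` are all hypothetical, and a real system would obtain its vectors from a language model and store them in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, corpus):
    """Return (document id, score) pairs, most similar document first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real embeddings have hundreds or thousands of dimensions.
corpus = {
    "cat_toy": [0.9, 0.1, 0.0],
    "cat_bed": [0.7, 0.3, 0.1],
    "jet_engine": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # hypothetical embedding of the text being searched

ranking = rank_by_similarity(query, corpus)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

With these toy values the two cat-related documents rank above the jet engine. The output is a ranking by proximity and nothing more.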

AI-PreSearch uses a language model (EP-RoBERTa) that has been trained on patent documents. In AI-PreSearch, EP-RoBERTa produces a vector representation of the claims to be searched. The vector representation is then mapped to the 250,000-dimension patent subject-area classification (CPC) space. The application can then be searched against all the prior art stored and embedded in a vector database. The closer the application vector is to a prior art vector, the more textually and semantically similar the prior art is to the application. The model could be used to search the whole application or parts of it, such as the claims.

“Similarity” is not a test for inventive step 

The CIPA Journal article proposes that AI-PreSearch could be used in a new inventive step test. The article proposes: 

Suppose a new patent application is received and converted into an embedding space using a large language model. The idea for a new test for inventive step is ‘the new application is inventive if the embedding space around the embedding vector of the new application within a radius of x, is empty and there is a technical effect […]’. Values of x could be found from historical data about granted patents and the state of the art. The historical values could then be used to determine a value for x to use now.

However, in this Kat’s view, the similarity between a claimed invention and the prior art, as determined by their relative positions in the embedding space, has nothing whatsoever to do with the current legal tests for novelty and inventive step. AI-PreSearch is simply a search tool for identifying documents semantically similar to the claims. The degree of “semantic similarity” between the claims and prior art does not overlap with any of the pre-existing tests for inventiveness, whether this is the Windsurfing/Pozzoli test applied in the UK, the problem-solution approach of the EPO or the non-obviousness test of the USPTO.
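
Read literally, the test quoted above reduces to a nearest-neighbour distance check against a threshold. The following is a minimal sketch of that reading, under stated assumptions: Euclidean distance is assumed as the metric (the quoted proposal does not specify one), the function names and two-dimensional vectors are hypothetical, and the "technical effect" condition is passed in as a flag because nothing in the vector comparison can determine it.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def passes_radius_test(application, prior_art, radius, has_technical_effect):
    """Literal reading of the proposed test: no prior art vector lies within
    `radius` of the application vector, and a technical effect exists.

    `has_technical_effect` is supplied by the caller; the sketch has no way
    of assessing it from the vectors.
    """
    neighbourhood_empty = all(
        euclidean_distance(application, vec) > radius for vec in prior_art
    )
    return neighbourhood_empty and has_technical_effect


# Hypothetical two-dimensional embeddings for illustration only.
prior_art = [[0.0, 0.0], [3.0, 4.0]]
application = [0.0, 1.0]  # nearest prior art vector is at distance 1.0

print(passes_radius_test(application, prior_art, radius=0.5, has_technical_effect=True))
print(passes_radius_test(application, prior_art, radius=1.5, has_technical_effect=True))
```

In this sketch the same application passes or fails depending only on the chosen value of x. The skilled person, the closest prior art and the objective technical problem do not appear anywhere in the computation.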

In the problem-solution approach, for example, the first step is to identify the closest prior art. Superficially, it may seem that semantic “similarity” could help identify the closest prior art. However, the closest prior art is “that which in one single reference discloses the combination of features which constitutes the most promising starting point for a development leading to the invention […] In practice, the closest prior art is generally that which corresponds to a similar use and requires the minimum of structural and functional modifications to arrive at the claimed invention” (EPO Guidelines for Examination, G-VII, 5.1).

AI-PreSearch assists in identifying documents that are contextually and semantically similar to the claimed invention. However, the simple vector search of AI-PreSearch does not and cannot identify a) which disclosure constitutes the most promising starting point for a development leading to the invention, b) which disclosure corresponds to a similar use to the claimed invention or c) which disclosure requires the minimum of structural and functional modifications to arrive at the claimed invention. None of these criteria corresponds to “similarity” in vector space. Likewise, a test of similarity in vector space does not overlap with any of the steps in the Windsurfing/Pozzoli test.

The CIPA Journal article admits that there is currently no legal basis for replacing the current tests for inventive step with a “similarity” test. However, a legal basis is absent not only from the case law but also from the legal texts themselves. The European Patent Convention (EPC) states that “an invention shall be considered as involving an inventive step if, having regard to the state of the art, it is not obvious to a person skilled in the art” (Article 56 EPC). In this Kat’s view, the amount of semantic similarity between a disclosure and the claimed invention cannot be equated, on even the most stretched definition of the term, with “non-obviousness” to a skilled person.

Final thoughts

For this Kat, the use of a simple measure of “semantic similarity” between the claims and prior art as a test for inventive step would constitute an absurd reduction of the complex legal notion of inventiveness. Readers may recall the infamous exchange (infamous at least to patent attorneys) in Episode 16, Series 6 of the US legal drama Suits:

Donna: Benjamin applied for a patent and it turns out our technology overlaps with someone else’s.
Louis: How much overlap?
Donna: 32.5%
Louis: That’s over the threshold. Unless Benjamin can get you below 30…

“Only 30% overlap? That’s inventive!”

Every patent attorney knows that this exchange is legal nonsense (IPKat). The degree of overlap (or similarity) has nothing to do with novelty and inventive step, the legal concepts that govern patentability (even in the US). Furthermore, as the quote from Suits illustrates, such a system would be wide open to abuse through the judicious manipulation of the legally meaningless measure of “similarity/overlap” in patent drafts.

For this Kat, the proposal presented in the CIPA Journal ultimately fails to recognise the limited functionality of AI-PreSearch. AI-PreSearch, according to the EPO, is very good at searching. However, it has no ability to learn or apply legal tests. Importantly, AI-PreSearch’s language model EP-RoBERTa is not in the Generative Pre-trained Transformer (GPT) family of large language models made famous by OpenAI. EP-RoBERTa is based on BERT, an earlier type of language model from Google, and one of the first to use transformers to represent contextual information in language. As such, unlike ChatGPT, EP-RoBERTa has no ability to answer questions, generate text or learn to apply tests grounded in verbal reasoning. AI-PreSearch simply uses vector search to identify prior art documents and rank them by their similarity to the claims of a patent application. Whilst AI-PreSearch may be great at searching, it has no hope of providing an alternative to inventive step assessment.

GPT large language models (LLMs) such as ChatGPT, by contrast, have far greater functionality than simple AI-assisted search tools. LLMs trained on patent prosecution data and legal texts can generate legal reasoning regarding the inventiveness or otherwise of a claimed invention. LLMs may also be combined with a vector search for prior art, to perform the full functionality of search and examination. Implementation of such a process would not constitute a new test for inventive step. Instead, it would automate the legal tests currently applied by patent examiners. We are not yet at the point where AI can replace a patent examiner. Specifically, the verbal reasoning produced by LLMs is currently fairly generic and superficial (IPKat). Nonetheless, as the functionality of these tools continues to grow, a future place for AI in patent examination seems likely. In this Kat’s view, however, it is probably safe to assume that the role of AI in patents will not be as a new similarity test for inventive step.

Acknowledgements: Thanks as always to Mr PatKat (a.k.a. Dr Laurence Aitchison) for his ML insights and expertise.

Content reproduced from The IPKat as permitted under the Creative Commons Licence (UK).