RSA FG @ SIGIR 2019 in Paris

Mihai Lupu, Chief Researcher of the Research Studio Data Science of the RSA FG, together with Alexandros Bampoulidis and Luca Papariello, presented their latest work at the international SIGIR conference in Paris in July. The research team developed a novel test collection for patents, which is publicly available and more comprehensive than all previous ones.

The new patent test collection contains global data from six patent authorities – including Europe, the USA, Japan, China and Korea. It combines data across multiple languages, data types and domains. The complete collection consists of more than 60 million files and images from the past two years and is about 5 TB in size.

The new patent test collection could help in developing new tools for information retrieval. Among other things, it could support the development of tools that assign anonymised data sets to specific authors based on writing style, content and citation behaviour.

“A Horizontal Patent Test Collection”

by Mihai Lupu, Alexandros Bampoulidis and Luca Papariello, SIGIR 2019