Most of us use search engines every day, but what we’d most like to find are the things we don’t know yet. What if search systems could understand that need and provide us with complementary information? Our Research Studio Data Science is working on that question.

Whether it is when researching a paper or in a discussion where we just need to prove the other person wrong – nowadays we often rely on search engines to enhance our knowledge and find specific information. Information Retrieval (IR) systems – such as search engines – are improving every day. But what if the search systems could understand what we already know about a topic and complement our knowledge while we are searching for something new?

A substantial number of studies explore advances in different aspects of these IR systems. One line of research focuses on the users’ knowledge as a key factor in search: most of these studies model the users’ knowledge to examine behavioral effects during search that result from that knowledge and/or from the way it changes as users encounter new resources.

What am I missing? Using pre-existing knowledge to provide more helpful search results

In the DoSSIER project Knowledge Delta (KD), the Research Studio Data Science of the RSA FG investigates how web users’ knowledge can best be utilized to improve their search experience during a learning session. The project focuses on modeling both the knowledge and the search target of the users. Using this data, it then provides matching information to each user according to their goal and their pre-existing background knowledge. The knowledge delta is defined as the difference between what a user already knows and what they do not yet know within the resources on the web. It is specific to each user and each domain, as the following example demonstrates:

Let’s assume that a senior-year computer science student and a law student both want to learn about a rather novel concept such as “Deep Learning”. Most likely, the computer science student will know more of the prerequisite material for “Deep Learning”, such as statistics, general concepts, and algorithms. In contrast, the student with a law background will be less familiar with these topics. Here the knowledge delta for the computer science student will be smaller than the knowledge delta for the law student. Had the target domain been a topic related to “Civil Rights”, the situation would have been reversed, as the law student would have had the stronger prerequisite knowledge.
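The example above can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions: knowledge is treated as a plain set of concept names, and the knowledge delta as a set difference. The concept lists, the variable names and the function `knowledge_delta` are hypothetical and do not describe how the DoSSIER project actually models knowledge.

```python
# Hypothetical concept sets, invented purely for illustration.
# The article does not specify how knowledge or topics are represented.
DEEP_LEARNING = {
    "statistics",
    "linear algebra",
    "algorithms",
    "neural networks",
    "backpropagation",
}

cs_student = {"statistics", "linear algebra", "algorithms", "programming"}
law_student = {"contract law", "civil rights", "statistics"}


def knowledge_delta(target, known):
    """Return the concepts of the target topic that the user does not yet know.

    Assumption: the delta is modeled as a simple set difference.
    """
    return target - known


cs_delta = knowledge_delta(DEEP_LEARNING, cs_student)
law_delta = knowledge_delta(DEEP_LEARNING, law_student)

print(sorted(cs_delta))   # ['backpropagation', 'neural networks']
print(sorted(law_delta))  # ['algorithms', 'backpropagation', 'linear algebra', 'neural networks']
```

In this toy setup the computer science student’s delta contains two concepts and the law student’s contains four, which mirrors the smaller and larger knowledge delta described in the example.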

Evaluating IR systems in knowledge acquisition

In line with the objectives of the DoSSIER project, the research team working on it at our Data Science Studio (Yasin Ghafourian, Dr. Petr Knoth and Prof. Allan Hanbury) has recently published a paper reflecting on evaluation methodologies for a system that makes use of users’ knowledge.

The publication discusses why existing evaluation methodologies fall short of providing insight into the performance of a system that takes users’ knowledge into account during the search and retrieval process. To give direction to research on evaluating such systems, it also proposes three possible evaluation methodologies and discusses the pros and cons of each.

This publication will serve as groundwork for further developments in the DoSSIER project, in particular for implementing experimental methodologies to evaluate the knowledge-based search system that will be developed within the project. It can also serve as a starting reference for future work on knowledge acquisition tasks and their evaluation, making us all smarter in the process.