Selectical

A whitepaper on faster and more effective literature reviews, powered by AI.

Introduction

The challenge

The solution: Selectical

Background: Active Learning

Performance

Conclusion

AI-assisted Literature reviews

Systematic literature reviews are time-consuming, therefore relatively costly, and involve repetitive work. Despite several attempts by others, it has so far proved difficult to develop effective computer assistance. We have developed a tool based on artificial intelligence technology that assists with selecting relevant titles and abstracts: Selectical.

This whitepaper explains the challenges faced during systematic literature reviews in the biomedical field, and the solutions Selectical provides. Using new, real-time, self-learning AI technology, Selectical reduces the workload of reviewers by 66% while still identifying over 99% of relevant papers.

The challenge: high workload and high quality
The case of literature reviews in the medical domain

The selection of relevant scientific articles for a specific literature review is relatively time-consuming and needs the input of skilled researchers. It is important that the selected titles and abstracts include all articles that are eventually relevant for data extraction. Therefore each title/abstract requires scrutiny. Considering that literature reviews of thousands of titles/abstracts are no exception, the workload is huge.

Why has this title/abstract selection task not been automated yet?

Automation of the title/abstract task seems a logical step, but machine-assisted selection yielding satisfactory results is hampered for several reasons:

  • There is little room for false-negatives. It is important to include more than 99% of the papers that eventually qualify for data-extraction.
  • It is difficult for computers to interpret natural text. Keyword analysis is not sufficient to find all relevant papers.
  • Every new literature review project requires a new set of selection criteria. Thus, an algorithm cannot learn from previous tasks.
  • Title and abstract may not be enough to assess whether a paper needs to be included. Nevertheless, the computer needs to select these 'doubts' as well, which complicates automated selection.

Applying Artificial Intelligence (AI) is difficult under the above-mentioned constraints, among other things because AI models are usually only effective when they have been trained on similar situations. It requires a special type of AI technology to create a model that is able to replace (part of) the human work given those constraints.


The solution: Selectical
How does it work?

Selectical is an AI-driven tool that, in every literature review project, automatically learns which papers are relevant, based on the researcher's selection input. After a short while Selectical has learned enough to do the rest of the title/abstract selection job. On average, Selectical finds over 99% of all relevant papers in 34% of the manual selection time.

Selectical works in every browser and is easy to use:

  1. Upload a set of articles
    • The system processes the articles and prepares the self-learning process.
  2. Get started: the tool presents articles (titles + abstracts) to the reviewer who will assess the relevance for the project. The reviewer clicks on the button for include, exclude or doubt. Afterwards a new article will be presented.
    • The tool learns directly and automatically from the reviewer's selections.
    • After a certain number of assessments by the reviewer, which varies by project, the reviewer is done and the tool has determined all relevant and non-relevant studies.
  3. Export and download the results!

Applications

There are several applications of Selectical within the literature review process: as 'second reviewer' in case the title/abstract selection needs to be done in duplicate; as 'control tool'; and as 'primary selector'.

  1. Second review: after a 100% manual selection by a primary reviewer, Selectical can be used as the second reviewer. A 100% duplicate selection of titles/abstracts is a requirement for systematic literature reviews. Selectical can reduce the time needed for the duplicate selection by one third through the self-learning process. The results of the selection by the primary reviewer can be stored in Selectical as well, and the results of the first and second selection can easily be compared.
  2. Control tool: it is also possible to use Selectical as a control tool for the manual selection. Selectical learns from the labels (include, exclude, doubt) assigned during the manual selection and identifies articles with a high probability of being labelled incorrectly. Only a small set of articles needs to be reassessed, and their labels adapted if needed. The results can be exported after the corrections are made.
  3. Primary selector: Selectical can of course also be used as the sole, primary reviewer. Once you have acquainted yourself with Selectical's capabilities by using it as a second reviewer, you could use it as the primary and only review tool for your literature reviews as well.
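
The control-tool application can be illustrated with a small sketch. It assumes that a trained model already provides an estimated probability of relevance for each article. The function name `flag_for_recheck`, the `threshold` parameter and its default value are hypothetical choices for this illustration and are not taken from Selectical.

```python
def flag_for_recheck(human_labels, relevance_probability, threshold=0.9):
    """Return ids of articles whose human label strongly contradicts the model.

    human_labels: dict of article id -> "include", "exclude" or "doubt".
    relevance_probability: dict of article id -> model-estimated P(relevant).
    Both inputs and the threshold are assumptions made for this sketch.
    """
    flagged = []
    for article_id, label in human_labels.items():
        p_relevant = relevance_probability[article_id]
        if label == "include":
            p_label_wrong = 1.0 - p_relevant
        elif label == "exclude":
            p_label_wrong = p_relevant
        else:
            # "doubt" makes no claim about relevance, so it is never flagged.
            continue
        if p_label_wrong >= threshold:
            flagged.append(article_id)
    return flagged


labels = {"a1": "include", "a2": "exclude", "a3": "exclude", "a4": "doubt"}
scores = {"a1": 0.97, "a2": 0.95, "a3": 0.02, "a4": 0.50}
print(flag_for_recheck(labels, scores))  # ['a2']
```

Only article a2 is returned here: it was excluded by the reviewer although the model considers it very likely relevant, so it would be presented for reassessment.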

Selectical: how it works
How and why does this automated tool work?

Preparation

After article information is uploaded to Selectical, this information (title, abstract, additional information from databases such as PubMed and Embase) is processed and optimized by Selectical in order to set up efficient processing of the title/abstract selection. This initial step requires some, but limited, time and computing power, and no time from the reviewer. After this setup, the reviewer can start labelling the titles/abstracts.

Real-time self-learning AI

When the reviewer starts selecting the articles in Selectical, the AI starts learning. The AI is trained by the selections made by the reviewer to discriminate between 'relevant', 'non-relevant' and 'not sure'. This process is called Active Learning (see explanation box). Eventually the AI has learned which articles are relevant and which are not, without further intervention by the reviewer.

When can the reviewer stop?

If not all articles are labelled by the human reviewer, how will we know for sure that all relevant articles will be selected by the AI-tool? That is the challenge in developing a tool for the literature selection task. Selectical uses an innovative strategy to quantify the uncertainty about unseen articles. If no measurable uncertainty is left for the remaining, unseen articles, the human selection activities will stop. An export of the results for all articles can be made.
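
The strategy Selectical uses to quantify uncertainty is not described in this whitepaper. Purely to make the idea of "no measurable uncertainty left" concrete, the sketch below uses a hypothetical rule: the reviewer may stop once every unseen article has an estimated probability of relevance that is very close to 0 or to 1. The function name `can_stop`, the input probabilities and the `tolerance` value are all assumptions.

```python
def can_stop(unseen_probabilities, tolerance=0.01):
    """Return True when no unseen article is still uncertain.

    unseen_probabilities: model-estimated P(relevant) for each article the
    reviewer has not seen yet (a hypothetical input for this sketch).
    An article counts as 'certain' when its probability lies within
    `tolerance` of 0 (clearly irrelevant) or of 1 (clearly relevant).
    """
    return all(min(p, 1.0 - p) <= tolerance for p in unseen_probabilities)


print(can_stop([0.002, 0.001, 0.995]))  # True
print(can_stop([0.002, 0.40]))          # False
```

In the second call one article still has a probability of 0.40, so under this rule the reviewer would continue labelling.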

The technique behind this

Active Learning

Active Learning means that the self-learning Artificial Intelligence is actively adjusted by the input of the human user (the reviewer in our case). The AI learns how the task should be executed by 'copying' from the human user.

This is possible because the AI-algorithm relates a level of certainty to the decisions made. In other words, the AI can be certain or uncertain about the automated decisions made.

In the case of the selection of titles/abstracts, the AI has to learn what is a relevant article and what isn't. To do so:

  1. The AI presents the user with the title/abstract that has the highest relevance.
  2. The user labels the title/abstract as relevant or irrelevant.
  3. The AI uses this new piece of information to learn and adjusts, if necessary, its decisions regarding relevance and the level of certainty attached.
  4. Repeat from 1.

Eventually, the AI can execute the work of the human user with a high level of certainty, by repeating these steps several times. How often depends on the problem at hand.
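
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Selectical's implementation: the class `TinyRelevanceModel` is a hypothetical stand-in (a simple Naive Bayes-style word model), and `ask_reviewer` and `budget` are assumed names for the human labelling step and the number of rounds.

```python
import math
from collections import Counter


class TinyRelevanceModel:
    """Naive Bayes-style word model; a stand-in, not Selectical's model."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def update(self, text, relevant):
        self.word_counts[relevant].update(text.lower().split())
        self.doc_counts[relevant] += 1

    def score(self, text):
        """Log-odds that the text is relevant (higher means more relevant)."""
        vocab = len(set(self.word_counts[True]) | set(self.word_counts[False])) + 1
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Laplace smoothing keeps unseen words from producing log(0).
                log_odds += sign * math.log(
                    (self.word_counts[label][word] + 1) / (total + vocab)
                )
        return log_odds


def active_learning_loop(articles, ask_reviewer, budget):
    """articles: dict id -> title/abstract text; ask_reviewer(id, text) -> bool."""
    model = TinyRelevanceModel()
    unlabelled = dict(articles)
    labels = {}
    for _ in range(min(budget, len(unlabelled))):
        # Step 1: present the article the model currently rates most relevant.
        best_id = max(unlabelled, key=lambda i: model.score(unlabelled[i]))
        text = unlabelled.pop(best_id)
        # Step 2: the human reviewer labels it.
        labels[best_id] = ask_reviewer(best_id, text)
        # Step 3: the model learns from the new label, then the loop repeats.
        model.update(text, labels[best_id])
    return model, labels


articles = {
    "a1": "vaccine effectiveness trial in children",
    "a2": "road traffic noise survey",
    "a3": "vaccine effectiveness in adults",
    "a4": "bridge construction cost survey",
    "a5": "influenza vaccine trial results",
}
truth = {"a1": True, "a2": False, "a3": True, "a4": False, "a5": True}
model, labels = active_learning_loop(articles, lambda i, t: truth[i], budget=3)
print(list(labels))  # ['a1', 'a3', 'a5']
```

In this toy run the three vaccine articles are presented first, because each new label raises the score of articles that share words with the relevant ones.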

Selectical: Performance
How well is Selectical doing its task?

We use two criteria to judge the results of Selectical's work:

  • Amount of work saved

    What part of the total number of titles/abstracts does not have to be reviewed by the human reviewer? In other words, how much human work is saved by Selectical?
  • Quality

    What percentage of relevant articles is found by Selectical?
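
The two criteria can be written down as simple ratios. The sketch below is an illustration under assumed names (`work_saved`, `quality`); the numbers in the usage example are made up to show the calculation and are not results from the tests.

```python
def work_saved(total_articles, reviewed_by_human):
    """Fraction of titles/abstracts the human reviewer did not have to review."""
    return 1.0 - reviewed_by_human / total_articles


def quality(relevant_ids, found_ids):
    """Fraction of the truly relevant articles that were found."""
    relevant = set(relevant_ids)
    if not relevant:
        return 1.0  # nothing to find, so nothing was missed
    return len(relevant & set(found_ids)) / len(relevant)


# Made-up example: 1000 articles, 340 screened by hand, 3 of 4 relevant found.
print(round(work_saved(1000, 340), 2))         # 0.66
print(quality({1, 2, 3, 4}, {1, 2, 3, 9}))     # 0.75
```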

Selectical was challenged on these two criteria by testing it on literature review projects that had already been fully labelled by human reviewers. We simulated how Selectical would have performed had it been used in these projects, and compared the results. We tested 36 literature review projects, each of which was simulated 25 times with Selectical. Each simulation round used different random initial parameters. The literature review projects varied in health/disease area (e.g. infectious diseases, chronic diseases, rare diseases, nutrition, alcohol use), covered both focussed subjects (e.g. the effectiveness of a certain vaccine) and wider subjects (e.g. the natural history of a disease), and differed in size (100 to over 7000 titles/abstracts).

The 36 literature review projects included 80 thousand titles/abstracts, of which 2000 were labelled by the human reviewers as relevant.

The average results of these simulations with respect to efficiency and quality were:
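
A retrospective simulation of this kind can be sketched as a small harness that replays one fully labelled project several times, each time with a different random seed, and averages both criteria. Everything named here is an assumption for illustration: `simulate`, the `strategy` interface, and `review_random_half`, which is only a placeholder screening strategy and not how Selectical selects articles.

```python
import random
from statistics import mean


def review_random_half(article_ids, ask_reviewer, rng):
    """Placeholder strategy: the reviewer screens a random half of the articles.

    Returns (ids the human reviewed, ids finally marked relevant).
    """
    reviewed = rng.sample(article_ids, len(article_ids) // 2)
    found = {a for a in reviewed if ask_reviewer(a)}
    return set(reviewed), found


def simulate(human_labels, strategy, runs=25, base_seed=0):
    """Replay one labelled project `runs` times; return mean (work saved, quality)."""
    article_ids = sorted(human_labels)
    relevant = {a for a in article_ids if human_labels[a]}
    saved, quality = [], []
    for run in range(runs):
        rng = random.Random(base_seed + run)  # different random start per run
        reviewed, found = strategy(article_ids, human_labels.__getitem__, rng)
        saved.append(1.0 - len(reviewed) / len(article_ids))
        quality.append(len(found & relevant) / len(relevant) if relevant else 1.0)
    return mean(saved), mean(quality)


# Toy project: 200 articles, every tenth one labelled relevant by the human.
project = {i: i % 10 == 0 for i in range(200)}
avg_saved, avg_quality = simulate(project, review_random_half)
print(avg_saved, avg_quality)
```

The random placeholder saves half the work but misses many relevant articles, which is the baseline an active learning strategy has to beat on both criteria.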

Criterion               Result
Amount of work saved    66%
Quality                 99.3%

For the detailed results, please feel free to send us an email at hello@wearelandscape.nl.

Comments regarding the results

  • In smaller reviews, with fewer than 1000 titles/abstracts to be screened, the amount of work saved is small. However, in those cases Selectical still provides a nice interface to do and store the title/abstract selection work.
  • Some of the articles that the AI labelled differently from the human reviewer turned out, after scrutiny of the original human results, to be human errors. The AI had labelled them correctly.
  • Literature review projects with focussed review objectives lead to a better performance of the AI.
  • Selectical produces better results and is more user-friendly than other title/abstract selection tools. We tested several of these tools. Abstrackr included many titles/abstracts that were eventually not relevant for data extraction. This led to extra work in the next stage of article selection, namely the assessment of the full text. In the end, using Abstrackr increased the time needed for article selection compared to full selection by a human reviewer. A drawback of several other tools (such as Abstrackr, Rayyan, Bioreader, Colandr, StArt, RobotAnalyst) is that they do not present a clear stopping point for the human reviewer. Instead, the human reviewer gets feedback about the potential relevance of an article (e.g. a rating or measure of uncertainty), and it is the reviewer's decision whether to stop or not. This might prompt the human reviewer to proceed with manual selection 'just to be sure', which makes using a tool less efficient. In addition, some of the aforementioned tools (Rayyan, StArt) performed poorly with respect to their estimate of the relevance of articles; the feedback received lacked, for example, accuracy or discriminative power.

In Conclusion

High-quality, time-saving automated selection of research papers has long been a challenge for Artificial Intelligence. However, with real-time self-learning AI, Selectical returns more than 99% of the relevant articles while saving 66% of the reviewer's time.

Simulations and the experience of users show that Selectical can assist in literature review projects with a wide range of review objectives. In addition, Selectical shows better performance than comparable tools.

If you are curious and would like to know the performance of Selectical on your type of literature reviews, we offer to do some test simulations on recent projects.