Machine-assisted screening (one human, one auto-screen in Dual Screening) #439

I noticed that Nested Knowledge has Dual Screening, and also has an Inclusion Prediction Model for each study.

When I’m in Dual mode, we currently have two human screeners and one (human) adjudicator. From what I understand, it is an accepted practice for one of the two screeners to be automated. It would save me a lot of time if, instead of two human screeners, I could have only one human screener and substitute the Inclusion Prediction Model for the other. Since another human will still adjudicate, I think this would deliver those time savings without reducing quality.

Could a new Dual mode be created that has one auto-screen per study? Let me know!

4 years ago

Changed the status to Under Consideration

4 years ago

There are a few questions to consider here:

How does a user indicate they want robot-screening?
Should robot-screening be a one-time action or a continuous activity (e.g. as new references flow in, they are auto-screened)?
How can a user undo robot-screening (in case they are displeased with the results or have a protocol change)?
What level of manual control, if any, should the user have over the robot’s conservatism (likely favoring inclusion)?

My instinctive answers, in the same order:

Either in Admin Settings or as a Bulk Action from Inspector.
I could see a case for either. Continuous is more convenient, but robot-screening could be perceived as dangerous, and therefore as something that should not happen without explicit user intent.
Depends on the continuous/one-time decision. It would be simple enough to clear screening decisions based on the user (robot) who created them.
Since our models produce a probability in [0, 1], we will be computing optimal classification thresholds on the training set whenever the model is trained. We’d likely use a threshold that maximizes the geometric mean of precision and recall, but we could devise other weighting schemes on precision/recall or TPR/FPR. To start, I’d prefer that NK handle this behind the scenes, and we modify the functionality if we get user feedback.
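To make the last two answers concrete, here is a minimal sketch in Python. It is not NK's implementation: the names `ROBOT_USER`, `Decision`, `best_threshold`, `robot_screen` and `clear_robot_decisions` are all hypothetical, and the probabilities and labels are made up for illustration. It picks the threshold that maximizes the geometric mean of precision and recall on a labeled training set, records robot decisions under a dedicated screener name, and undoes them by filtering on that name.

```python
from dataclasses import dataclass
from math import sqrt

ROBOT_USER = "robot-screener"  # hypothetical screener name for the model


@dataclass
class Decision:
    record_id: int
    screener: str
    include: bool


def best_threshold(probs, labels):
    """Return (threshold, score) maximizing sqrt(precision * recall).

    Candidates are the observed probabilities, tried in ascending order.
    Ties keep the lowest threshold, which favors inclusion.
    """
    best_t, best_score = 1.0, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and not y)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        score = sqrt(precision * recall)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


def robot_screen(record_probs, threshold):
    """Turn {record_id: probability} into decisions attributed to the robot."""
    return [Decision(rid, ROBOT_USER, p >= threshold)
            for rid, p in record_probs.items()]


def clear_robot_decisions(decisions):
    """Undo robot-screening by dropping every decision the robot created."""
    return [d for d in decisions if d.screener != ROBOT_USER]


# Made-up training data: model probabilities and adjudicated labels.
train_probs = [0.05, 0.20, 0.45, 0.55, 0.70, 0.95]
train_labels = [False, False, True, False, True, True]
threshold, score = best_threshold(train_probs, train_labels)

decisions = [Decision(1, "alice", True)]
decisions += robot_screen({1: 0.90, 2: 0.30}, threshold)
human_only = clear_robot_decisions(decisions)
```

Because every robot decision carries the robot's screener name, the undo step works the same way whether robot-screening ran once or continuously.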

4 years ago

A couple further notes:

We’ll need to add reviewer-level screening decisions to our model training data (currently only looks at final, adjudicated decisions)
It would be cool to provide human-readable explanations of why a record was deemed an include or exclude, e.g. which words, keywords, or features most drove the decision. This may require some rework on the backend, since we use word embeddings, which are not directly interpretable.
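As a rough illustration of what such an explanation could look like, the sketch below assumes a simple linear bag-of-words scorer. That is a deliberately interpretable stand-in, not the embedding-based model described above, and the term weights and the `explain` helper are hypothetical. Each term's contribution is its weight times its count, and the terms with the largest absolute contribution are reported.

```python
from collections import Counter

# Hypothetical per-term weights: positive pushes toward include,
# negative pushes toward exclude. Values are made up for illustration.
TERM_WEIGHTS = {
    "randomized": 1.2,
    "trial": 0.8,
    "placebo": 0.6,
    "mice": -1.5,
    "review": -0.9,
}


def explain(text, weights, top_k=3):
    """Return the top_k (term, contribution) pairs by absolute contribution."""
    counts = Counter(text.lower().split())
    contributions = {
        term: weights[term] * n
        for term, n in counts.items()
        if term in weights
    }
    ranked = sorted(contributions.items(),
                    key=lambda kv: abs(kv[1]),
                    reverse=True)
    return ranked[:top_k]


top_terms = explain("Randomized placebo trial in mice", TERM_WEIGHTS)
```

With an embedding-based model, this kind of per-term attribution would need a separate explanation method, which is the backend rework mentioned above.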

4 years ago

Changed the status to In Progress

4 years ago

Complete in release 1.43.0

4 years ago

Changed the status to Completed

4 years ago
