A Domain-Forward Approach to Algorithmic Decision-Making: Lessons Learned from the COVID-19 Pandemic

Bryan Wilder & Maia Majumder

Introduction

During the Coronavirus Disease 2019 (COVID-19) pandemic, algorithms have become increasingly common in medical and public health decision-making, with applications ranging from diagnostics1 to forecasting2 to resource allocation.3 This phenomenon has emerged in part due to advances in artificial intelligence (AI) that have been ongoing for the last several decades. In particular, systems based on machine learning increasingly leverage complex forms of data such as images or natural language to perform a range of predictive tasks.

The COVID-19 pandemic has presented computer scientists and mathematicians with an opportunity to make their skills useful for social good. As AI researchers who have been involved in COVID-19 research since early 2020, we are enthusiastic about this merging of disciplines; however, we are also aware that decision-making systems that over-rely on algorithms can falter in the face of complex, sociotechnical problems, including those exposed by the pandemic.

A highly publicized cautionary example concerns the rule-based system used by Stanford Medical Center to prioritize workers for vaccination.4 The algorithm scored workers using a range of variables (e.g., age and professional role). In the resulting allocation, vaccines were offered preferentially to senior staff, many of whom were working remotely, while only 5 of the more than 1,300 residents who were working in person received an offer. This example illustrates how even simple algorithmic decision aids, with none of the complexity of machine learning models, can easily create life-threatening consequences in deployment. Closer involvement of domain experts (e.g., those familiar with the clinical care environment at Stanford) during the development and testing process for the algorithm might have averted this adverse outcome.
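To make this failure mode concrete, here is a minimal Python sketch of a rule-based score. It is not the Stanford algorithm: the `Worker` fields, the point values in `ROLE_POINTS`, and the on-site bonus are all hypothetical and chosen only for illustration. The sketch shows how a rule built from age and role alone can rank a remote senior staff member above an in-person resident, and how a single term of the kind a domain expert might suggest changes the ordering.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    age: int
    role: str        # hypothetical categories: "senior_faculty" or "resident"
    in_person: bool  # whether the worker currently sees patients on site


# Hypothetical point values, assumed purely for illustration.
ROLE_POINTS = {"senior_faculty": 3.0, "resident": 1.0}


def naive_score(w: Worker) -> float:
    """Score from age and role only; on-site exposure is ignored."""
    return w.age / 10.0 + ROLE_POINTS.get(w.role, 0.0)


def exposure_aware_score(w: Worker) -> float:
    """Same rule plus an assumed bonus for on-site patient contact."""
    return naive_score(w) + (10.0 if w.in_person else 0.0)


def rank(workers, score):
    """Return worker names ordered from highest to lowest priority."""
    return [w.name for w in sorted(workers, key=score, reverse=True)]


workers = [
    Worker("remote_senior", 60, "senior_faculty", False),
    Worker("onsite_resident", 29, "resident", True),
]

print(rank(workers, naive_score))
print(rank(workers, exposure_aware_score))
```

With these made-up weights, the first ranking places the remote senior staff member first and the second places the on-site resident first. Choosing which terms belong in such a rule, and how heavily to weight them, is exactly the kind of decision that benefits from people who know the care environment.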

The broader lesson extends far beyond this example, however; methods rooted in computer science often falter on first contact with another domain. As AI researchers, we propose a domain-forward approach to algorithmic decision-making. Such an approach entails close involvement by domain experts from the ground up: in conceptualizing the aims of a project, understanding the strengths and weaknesses of data sources, informing the properties required of a potential solution, and vetting the resulting system. These aims are related to a number of current topics in AI research (explainability5, interpretability6, human-in-the-loop systems7, etc.) that aim to make AI more usable by humans. In this context, though, we are most concerned with domain experts as members of the scientific team designing algorithms. Over the course of the pandemic, we have worked to operationalize this philosophy by engaging with domain experts at every step of our COVID-19-related work, from conceptualization of problems to dissemination of results. Here, we describe three case studies that showcase this process.

Examples

One project developed an agent-based model for COVID-19 dynamics across a range of early hotspots, aiming to uncover between-population differences and model the effectiveness of potential interventions.8 From the start, the team for this project included an infectious disease clinician who has been actively involved in treating COVID-19 patients. Although one of us (Majumder) has training in epidemiology, an important part of our modeling effort relied on characterizing the clinical course of the disease and how this course may vary depending on a patient’s comorbidities. Accordingly, clinical expertise was important to interpret the rapidly emerging literature and make appropriate modeling choices. We uncovered a range of differences in epidemic dynamics across populations, leveraging a model which incorporated the influence of demographics on the disease’s clinical course.
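For readers unfamiliar with agent-based models, the following is a toy Python sketch of the general idea. It is not the model from the study cited above: the susceptible/infected/recovered states, the random-mixing contact pattern, and every parameter value (including the severity probabilities in `P_SEVERE`) are assumptions made for illustration. The point it highlights is where clinical input enters: the age-dependent severity probabilities are the kind of quantity that would come from clinicians and the literature rather than from the modelers.

```python
import random

# Hypothetical probabilities that an infection becomes severe, by age group.
# In a real study these would be informed by clinical experts.
P_SEVERE = {"younger": 0.02, "older": 0.20}


def simulate(n_agents=500, frac_older=0.3, n_initial=5, contacts_per_day=5,
             p_transmit=0.05, p_recover=0.1, days=100, seed=0):
    """Run a toy agent-based simulation.

    Returns (history, severe), where history holds one (S, I, R) count
    tuple per day and severe counts severe cases among infections that
    occur during the simulation (initial infections are not counted).
    """
    rng = random.Random(seed)
    age = ["older" if rng.random() < frac_older else "younger"
           for _ in range(n_agents)]
    state = ["S"] * n_agents
    for i in rng.sample(range(n_agents), n_initial):
        state[i] = "I"

    severe = 0
    history = []
    for _day in range(days):
        # Each infected agent meets a few uniformly random agents.
        newly_infected = set()
        for i in range(n_agents):
            if state[i] != "I":
                continue
            for _contact in range(contacts_per_day):
                j = rng.randrange(n_agents)
                if state[j] == "S" and rng.random() < p_transmit:
                    newly_infected.add(j)
        # Currently infected agents may recover.
        for i in range(n_agents):
            if state[i] == "I" and rng.random() < p_recover:
                state[i] = "R"
        # Apply new infections; severity depends on the agent's age group.
        for j in newly_infected:
            state[j] = "I"
            if rng.random() < P_SEVERE[age[j]]:
                severe += 1
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history, severe


baseline_history, baseline_severe = simulate()
# Halving contacts is a crude, assumed stand-in for a distancing intervention.
reduced_history, reduced_severe = simulate(contacts_per_day=2)
```

Comparing the two runs (or runs with different values of `frac_older`) gives a rough sense of how such a model can be used to explore interventions and between-population differences, although a credible model needs far more careful structure and calibration than this sketch.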

In another study, we used natural language processing approaches to analyze the wealth of scientific papers that have been generated during the pandemic and compare knowledge generation between bench (i.e., laboratory-oriented) scientists and clinical (i.e., patient-oriented) scientists working on COVID-19.9 From conception of the project, we ensured that both bench and clinical scientists, in addition to natural language processing specialists, were represented in the study team. We found that, as of summer 2020, bench-science publications had lagged behind those produced in clinical science. The bench scientists on our team were able to help us identify potential domain-specific causes (for instance, dramatic funding cuts to bench science in recent years) to explain this gap in productivity.

In a final example, after President Trump’s April 23rd, 2020, remarks regarding the injection of bleach to treat COVID-19, we paired Internet search query and news media data with autoregressive techniques to determine the impact of this misinformation event on purchasing and off-label use of disinfectants.10 Due to the critical role of science communication (scicomm) when studying and disseminating misinformation research, we recruited a scicomm specialist to join our team at project conception. We found that the April 23rd misinformation event prompted increased interest in purchasing and off-label use of disinfectants, as well as in poison control departments across the US, likely due to poisoning events following off-label use. The scicomm specialist on our team was able to help us (1) tailor the language of our paper to focus on the egregiousness of President Trump’s remarks as opposed to shaming those who fell victim to the misinformation event and (2) disseminate our findings to other scicomm and misinformation researchers.
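As a rough illustration of how an autoregressive approach can quantify the effect of an event, the Python sketch below fits a first-order autoregressive model, AR(1), to a pre-event baseline and measures how far post-event observations exceed the model's forecast. This is an assumption-laden simplification: the text above does not specify the model used in the study, the function names are hypothetical, and any series passed in would have to come from real search-interest data.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    if var == 0:
        raise ValueError("baseline series is constant")
    cov = sum((a - mean_prev) * (b - mean_next)
              for a, b in zip(x_prev, x_next))
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(c, phi, last_value, steps):
    """Iterate the fitted recursion forward from the last baseline value."""
    out = []
    x = last_value
    for _ in range(steps):
        x = c + phi * x
        out.append(x)
    return out


def excess_over_forecast(baseline, observed):
    """Observed post-event values minus the AR(1) counterfactual forecast."""
    c, phi = fit_ar1(baseline)
    expected = forecast_ar1(c, phi, baseline[-1], len(observed))
    return [obs - exp for obs, exp in zip(observed, expected)]
```

Here `baseline` would hold search interest before the event and `observed` the values after it; consistently positive excess suggests a rise beyond what the pre-event dynamics predict. Interpreting that rise, and communicating it responsibly, is where domain expertise comes in.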

Conclusion

The COVID-19 pandemic is just one example of a complex, sociotechnical problem that can benefit greatly from domain-forward approaches to algorithmic decision-making. The strategies we adopted during the pandemic were shaped by our pre-pandemic experiences. For example, one of our long-running projects concerned HIV prevention for youth experiencing homelessness. The goal was to algorithmically target a “peer leader” intervention by identifying a set of influential youth to recruit as advocates for HIV awareness and condom use. From its start, this project was a team effort between social work researchers and AI researchers. The community context and implementation expertise provided by the social workers allowed us to identify where computational techniques would be most impactful in practice. We uncovered key challenges related to the limited data available in this domain and developed algorithms to both guide the collection of information11 about social network structures in this population and optimize the set of peer leaders12 in light of that structure. The end result was a successful field trial in which the AI-augmented intervention produced a significant reduction in rates of unprotected sex.13 For us, this project emphasized the importance of including domain experts as co-designers of an algorithmic approach from day one. Without their involvement, it would have been impossible to know which problems deserved our focus at all.
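The idea of selecting a set of influential peer leaders can be sketched with the textbook greedy heuristic for influence maximization under an independent cascade model. The Python below is only that textbook version, assuming a fully known network and a single spread probability `p`; it is not the algorithm from the cited work, which had to cope with limited data about the network. The function names and the example graph are hypothetical.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """One independent-cascade run; returns the number of people reached."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active person gets one chance per neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, p, n_sims, rng):
    """Monte Carlo estimate of the expected reach of a seed set."""
    total = sum(simulate_cascade(graph, seeds, p, rng) for _ in range(n_sims))
    return total / n_sims


def greedy_peer_leaders(graph, k, p=0.2, n_sims=200, seed=0):
    """Greedily add the person whose inclusion most increases estimated reach.

    graph maps each person to a list of contacts; every person must
    appear as a key.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    chosen = []
    for _ in range(min(k, len(nodes))):
        best, best_val = None, -1.0
        for v in nodes:
            if v in chosen:
                continue
            val = expected_spread(graph, chosen + [v], p, n_sims, rng)
            if val > best_val:
                best, best_val = v, val
        chosen.append(best)
    return chosen


# Hypothetical friendship network: a hub with three contacts, plus a pair.
example_graph = {
    "a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"],
    "e": ["f"], "f": ["e"],
}
leaders = greedy_peer_leaders(example_graph, k=2)
```

In practice the network itself is rarely known in advance, which is why guidance from people working in the community about what data can realistically be collected matters as much as the optimization step.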

As we move forward through the COVID-19 pandemic and look ahead to new epidemics that the future will inevitably bring, computational researchers should remember the value provided by domain experts. The interface between algorithms and sociotechnical problems will never be straightforward to navigate. Scientific teams which leverage a complete spectrum of expertise are our best hope for developing algorithmic approaches which meet the needs of complex medical and public health challenges.

About the Authors

Maimuna (Maia) Majumder, M.P.H., Ph.D. is a ladder-rank faculty member in the Computational Health Informatics Program at Harvard Medical School and Boston Children's Hospital. Her research interests involve the application of artificial intelligence approaches to public health problems, with a focus on emerging epidemics and digital data streams.

Bryan Wilder is a PhD Candidate in computer science at Harvard University. His research focuses on the development of techniques rooted in machine learning, optimization, and social networks for public health applications.

References

  1. Elaziz MA, Hosny KM, Salah A, Darwish MM, Lu S, Sahlol AT. New machine learning method for image-based diagnosis of COVID-19. PLOS One. 2020 Jun 26;15(6):e0235187.
  2. Poirier C, Liu D, Clemente L, Ding X, Chinazzi M, Davis J, Vespignani A, Santillana M. Real-time forecasting of the COVID-19 outbreak in Chinese provinces: machine learning approach using novel digital data and estimates from mechanistic models. Journal of Medical Internet Research. 2020;22(8):e20285.

  3. Wedlund L, Kvedar J. New machine learning model predicts who may benefit most from COVID-19 vaccination. NPJ Digital Medicine. 2021 Mar 26;4(1):59.

  4. Guo, E. and Hao, K., 2021. This is the Stanford vaccine algorithm that left out frontline doctors. [online] MIT Technology Review. Available at: <https://www.technologyreview.com/2020/12/21/1015303/stanford-vaccine-algorithm/> [Accessed 23 April 2021].

  5. Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR). 2018 Aug 22;51(5):1-42.

  6. Doshi-Velez F, Kim B. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. 2017 Feb 28.

  7. Amershi S, Cakmak M, Knox WB, Kulesza T. Power to the people: The role of humans in interactive machine learning. AI Magazine. 2014 Dec 22;35(4):105-20.

  8. Wilder B, Charpignon M, Killian JA, Ou HC, Mate A, Jabbari S, Perrault A, Desai AN, Tambe M, Majumder MS. Modeling between-population variation in COVID-19 dynamics in Hubei, Lombardy, and New York City. Proceedings of the National Academy of Sciences. 2020 Oct 13;117(41):25904-10.

  9. Doanvo A, Qian X, Ramjee D, Piontkivska H, Desai A, Majumder M. Machine learning maps research needs in COVID-19 literature. Patterns. 2020 Dec 11;1(9):100123.

  10. Rivera JM, Gupta S, Ramjee D, El Hayek GY, El Amiri N, Desai AN, Majumder MS. Evaluating interest in off-label use of disinfectants for COVID-19. The Lancet Digital Health. 2020 Nov 1;2(11):e564-6.

  11. Wilder B, Immorlica N, Rice E, Tambe M. Maximizing influence in an unknown social network. In: Proceedings of the AAAI Conference on Artificial Intelligence; 2018 (Vol. 32, No. 1).

  12. Wilder B, Onasch-Vera L, Hudson J, Luna J, Wilson N, Petering R, Woo D, Tambe M, Rice E. End-to-End Influence Maximization in the Field. In: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems; 2018 (pp. 1414-1422).

  13. Wilder B, Onasch-Vera L, Diguiseppi G, Petering R, Hill C, Yadav A, Rice E, Tambe M. Clinical trial of an AI-augmented intervention for HIV prevention in youth experiencing homelessness. In: Proceedings of the AAAI Conference on Artificial Intelligence; 2021.