Biomedical Informatics in the Age of COVID-19: A Conversation with Dr. Isaac Kohane
Interview By Sarosh Nagar
HHPR Editor Sarosh Nagar interviewed Isaac Kohane, MD, Ph.D., Chair of the Department of Biomedical Informatics, Harvard Medical School. He has engaged in many research projects that deploy biomedical informatics to improve treatments for a host of diseases. Read more about Professor Kohane here: https://dbmi.hms.harvard.edu/people/isaac-kohane
Sarosh Nagar (SN): Thank you for speaking with us today, Dr. Kohane. So, to start, what got you interested in bioinformatics in the medical field in the first place?
Isaac Kohane (IK): Well, I came into my undergraduate studies as someone ready to be part of this new era of biology, but what I found was not what I expected. I had not had very much experience with computing as a high school student, so I did not expect to become as enamored as I did with computing. There was a professor, James Kidwell, at Brown University, who was my honors thesis advisor. His course on population genetics introduced me to many of these computational approaches. It struck me as a fascinating way to explore biology, and this interest inspired many other projects in college.
I then went on to medical school, just carried by the sheer momentum of my earlier plans. But one of the biggest weaknesses I observed was that the approach to medicine was taught as if we did not have computing to assist us in our decision making. Fortunately, I was introduced to several new professionals, and I saw this emerging field of artificial intelligence and medicine. As a result, over time, I was attracted to medical informatics by the allure of approaching medicine and biomedical discovery while limiting bias and misplaced presuppositions.
SN: Fascinating. So now you’re the chief of biomedical informatics at HMS, and you’ve previously said that these discoveries in biomedical informatics have major applications in terms of coronavirus testing, which have been largely unexplored. Would you mind elaborating on that point? What major applications could you see with AI and COVID?
IK: So I think there are many applications, and I’m glad to articulate those, but there are several structural limitations in our health care system that make the use of AI against COVID harder, namely that our data remains siloed. The implementation of electronic health records has allowed us to have much more health data than we’ve ever had before, but because much of it remains siloed in individual health systems, it has been hard to create a timely picture of what was going on with our patients so that we could adapt our treatments as a consequence.
We never reached that critical mass of data. And so, in the COVID era, we went old fashioned, relying on word of mouth from colleagues in China, Italy, and France before the virus hit our shores, and even as it hit our shores, we relied on various anecdotal bits of information from various hospitals. Unfortunately, this process made us learn what we needed in months, not in days. One of the things that we could have learned earlier, for example, was that there was a strong inflammatory picture or that there was a serious coagulation problem. But the response to get there was delayed by weeks.
In fact, in March, we put together a consortium because I was alarmed that we were not gathering data. We asked ourselves: can we see what’s going on with our patients worldwide? So we assembled our consortium of 110 hospitals in over seven countries, and we started having weekly meetings. Many of the consortium members were colleagues of mine with whom I had collaborated over prior years. Using our software to extract data from the electronic health record, we were able to put together in four weeks not only the analysis but a full pre-print, all from the first week of hospitalizations. Of course, this is a purely bottom-up approach, and we were not plugged into any national decision-making system. But the point is that the pieces of data were there. They were poorly understood, siloed, and not made available for machine learning models from day 1.
Lots of things, therefore, were missed and frankly continued to be missed. Because if we had a health system that could aggregate its data as needed at the push of a button, much as we do for things that are arguably important, like the stock market or airplane reservations, we could know more. We would know exactly what the false-positive rates were on PCR-based COVID testing. We would know exactly how many patients had both a positive PCR and then a positive antibody test to analyze possible re-infection or continued shedding of the virus. And those facts are slowly coming out. But again, these were knowable easily half a year ago. So, unfortunately, the first problem of applying AI to the COVID epidemic is the data access problem.
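As a rough illustration of the aggregation point above (not something described in the interview), the sketch below pools per-hospital test counts to estimate a false-positive rate. The `SiteCounts` type, the site names, and all of the numbers are made up for the example, and it assumes each site can say, from some reference standard, which patients were truly uninfected.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Hypothetical per-hospital summary; the fields are illustrative assumptions."""
    site: str
    false_positives: int  # PCR positive, reference standard says not infected
    true_negatives: int   # PCR negative, reference standard says not infected


def pooled_false_positive_rate(sites):
    """Pool counts across sites and return FP / (FP + TN)."""
    fp = sum(s.false_positives for s in sites)
    tn = sum(s.true_negatives for s in sites)
    if fp + tn == 0:
        raise ValueError("no reference-negative patients to evaluate")
    return fp / (fp + tn)


# Made-up counts, used only to show the arithmetic.
sites = [
    SiteCounts("hospital_a", false_positives=2, true_negatives=198),
    SiteCounts("hospital_b", false_positives=3, true_negatives=297),
]
rate = pooled_false_positive_rate(sites)
```

The design point is that only small summary counts need to leave each hospital, which is one way a consortium could pool results without moving patient-level records.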
The second limitation was that we don’t have a health workforce equipped to work with augmented intelligence for machine learning. That’s a multifactorial problem, having to do with our hospitals’ leadership, the integration of quantitatively-minded individuals into the tissue of the health care delivery system, and the incentives for implementation of technology. You may have cumbersome systems now, such as those related to data entry in electronic health records, which make doctors’ lives harder. But we have billions of dollars spent in the US by the federal government on incentive payments for electronic health records, and such incentives could lead to the more rapid dissemination of AI in healthcare.
In a sense, I’m sorry to disappoint your readers because I would love to talk about some of the most elaborate and distinguishing achievements, and I’m glad to describe some smaller experiments. But, overall, I’ve talked about the larger American healthcare system, which is driven by reimbursement models that are not well aligned to the rapid dissemination of technology. Perhaps a shift to something akin to a model of value-based care would be more effective in disseminating healthcare technologies, but we will also need efforts that result in better data aggregation for these kinds of public health efforts. We also need to have informed and numerate leaders in health care.
But having said that, there are some small experiments that suggest that we can see greater success. We’ve seen great success, for example, with the data-driven machine learning approach, where the community developed great tools to detect retinopathy, a disease of the retina often caused by diabetes, which is a global problem. Google then took that technology and improved it by adding a lot more data and many more labels, and made it so good that it outperformed the best ophthalmologists. Now, there are trials happening in India and Africa, which certainly demonstrate that this retinopathy detection can easily exceed what’s available locally. This fact is especially true in places where there are not many trained ophthalmologists and so many more cases. In the end, you see an exciting foreshadowing of these tools’ potential.
SN: That’s fascinating. But to facilitate the introduction of these new innovations, what broad policies are needed to overcome these data aggregation problems in particular? What kinds of policies do you think people at government levels or in the private sector should begin enacting to make aggregation easier for you? There is much potential here, so I’m curious how this process can be facilitated.
IK: So I think that what we have to do is align the incentives and, given the fact that we’re not going to be able to change the incentive mechanism of medicine overnight, we have to develop regulatory mechanisms to allow data sharing. I’m very hopeful that we may have hit upon the right mechanism, at least in the United States. For example, the program SMART on FHIR, developed with colleagues, created a standard for accessing healthcare data out of electronic health records. Importantly, this standard was adopted by another colleague of ours who was at Duke, whom Apple later hired. Now the Apple Health app uses that API to access data from about 500 hospitals throughout the United States, and it should reach about 1400 hospitals in the next two years. At the same time, other universities and institutions are also beginning to adopt it.
Additionally, in the 21st Century Cures Act, which Congress has passed, the bill stipulated that an API such as SMART on FHIR can be implemented so patients can have access to data. Now, that’s more important than merely giving access to the patients (which is already good) since it can give us the ability to share our data with other people if we are willing and explicitly consent. As a result, I believe we can appeal to patients’ self-interest and say “contribute your data,” such as to the CDC, for example, to help with treatments.
So putting patients in charge of data flow, I think, is the essential step. Now we have several pieces of good news, with the technical infrastructure of SMART on FHIR and the legislative infrastructure of the 21st Century Cures Act. Now what remains to be seen is whether market forces and public opinion and policy will accelerate it or not. And so I know it will happen eventually. The question is ultimately how fast it may be. Some patients will choose not to share or will withdraw their data in the future, and I think that’s something that they should be entitled to do. But I believe that by and large, humankind is sufficiently altruistic that, for limited declared purposes, they will share their data, especially because they can un-share it at a later point. So I think having the patient make the decision on the aggregation of that data is the future, and rightfully so.
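For readers unfamiliar with what an API such as SMART on FHIR looks like in practice, here is a minimal sketch, under stated assumptions, of how an app holding a patient-authorized token might form a FHIR search request for that patient's observations. The server address, patient ID, and token are placeholders, and the helper name `build_observation_request` is hypothetical; the sketch only builds the request and makes no network call.

```python
from urllib.parse import urlencode


def build_observation_request(base_url, patient_id, access_token, code=None):
    """Return (url, headers) for a FHIR Observation search.

    Hypothetical helper: it follows the general FHIR REST search pattern
    ([base]/Observation?patient=...) and sends the OAuth2 bearer token
    that a SMART on FHIR authorization step would have issued.
    """
    params = {"patient": patient_id}
    if code is not None:
        params["code"] = code  # optional filter, e.g. a lab test code
    url = f"{base_url.rstrip('/')}/Observation?{urlencode(params)}"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/fhir+json",
    }
    return url, headers


# Placeholder values, not a real server, patient, or credential.
url, headers = build_observation_request(
    "https://fhir.example.org/r4",
    "example-patient-123",
    "example-token",
)
```

The token is what ties the request to the patient's consent: in this sketch, an app without a patient-granted token has nothing to put in the `Authorization` header.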
SN: That is fascinating! I assume this is your solution to any data privacy issues since the information is in the patient’s hands?
IK: Exactly! In the end, when I go to a doctor, I want an automated process that can suggest to me the best treatment or diagnosis. Maybe the doctor says something different, but I’d like to know, just based on my data, what the best of machine learning has to say about me. Should I take this drug? Is there a side effect? These are all questions I think these technologies can answer. I’ve written about this idea in past projects, but the idea is that machine learning and AI can help look after the patient. Suppose we have patient-controlled data and therefore have enabled patient-controlled decision making. In that case, I think we’ll have placed the person who cares the most about you in a position to have a substantial role in your health-related decision making.
Now, it’s not going to replace doctors, not at all, but doctors are just human beings, like the rest of us. They make mistakes. They don’t know everything. They’re not always up to date. So it’s good for us to have the best knowledge backing up that great intuitive human intelligence with which they work. Together, they can help us truly improve the quality of care in many tremendous ways.
SN: That is a truly interesting answer about how AI and biomedical informatics can intertwine with patient-focused decision-making. Thank you so much for this interview and for sharing your fascinating insights with the HHPR.