III. Overview of AI technologies in medicine
As described in the section entitled “Background and context”, a broad array of technologies can be described as AI. With high-level definitions of relevant concepts, including artificial intelligence, algorithms, and machine learning, now in place, it is necessary to explore in more detail the potential types of medical AI applications. As this report focuses on the impact of AI on the doctor-patient relationship, not all potential medical applications will be considered. As a first step, we can distinguish between three types of AI according to their intended users:
- AI for biomedical researchers
- AI for patients
- AI for health professionals
Of these categories, AI for patients and health professionals are most relevant for the purposes of this report given the focus on the doctor-patient relationship.
Other taxonomies are of course possible; a recent report by the WHO, for example, distinguishes between AI applications for use in:
- Health care
- Health research and drug development
- Health systems management and planning
- Public health and public health surveillance
The taxonomy deployed here focuses on the intended users of AI systems because appropriate solutions to ethical challenges introduced by these systems typically vary according to the interests, level of expertise, and requirements of different stakeholder groups.
Although not directly relevant to the doctor-patient relationship, it is worth reviewing a few examples of AI used for medical research. One of the most common applications in biomedical research is drug discovery. For example, computer scientists and cancer specialists at the Institute of Cancer Research and the Royal Marsden NHS Foundation Trust recently used AI to identify a new drug regime for a rare form of brain cancer in children (diffuse intrinsic pontine glioma). DeepMind’s recent advances on protein folding via AlphaFold likewise indicate the promise of AI for fundamental research. AI can also be used for structuring, labelling, and searching unstructured or heterogeneous medical datasets; image classifiers, for example, can process huge volumes of medical imaging data much faster than manual labellers. Such systems can also be useful for administrative and operational purposes, as discussed below.
One noteworthy use of AI that blurs the boundary between research and clinical care is polygenic embryo screening, in which an algorithm summarizes “the estimated effect of hundreds or thousands of genetic variants associated with an individual’s risk of having a particular condition or trait.” This practice raises the spectre of eugenics by potentially allowing parents to select embryos not only for health advantages but also for socially desirable non-disease-related traits.
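To make the underlying computation concrete, a polygenic score is essentially a weighted sum: each genetic variant contributes its estimated effect size multiplied by the number of risk alleles the individual carries. The minimal sketch below illustrates this arithmetic only; the variant identifiers, effect sizes, and genotypes are entirely hypothetical, and real screening pipelines involve thousands of variants, quality control, and calibration against a reference population.

```python
# Minimal illustration of a polygenic risk score (PRS) calculation.
# Variant IDs, effect sizes, and genotypes below are hypothetical;
# real scores aggregate thousands of variants from GWAS summary statistics.

# Estimated per-allele effect size for each variant (e.g. from a GWAS).
effect_sizes = {
    "rs0000001": 0.12,
    "rs0000002": -0.05,
    "rs0000003": 0.30,
}

# Genotype of one individual: number of risk alleles carried (0, 1 or 2).
genotype = {
    "rs0000001": 2,
    "rs0000002": 0,
    "rs0000003": 1,
}

# The raw score is simply the weighted sum of risk-allele counts.
raw_score = sum(effect_sizes[v] * genotype.get(v, 0) for v in effect_sizes)

print(f"Raw polygenic score: {raw_score:.2f}")
# In practice the raw score is then standardised against a reference
# population and converted into a relative or absolute risk estimate.
```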
Many AI applications are in development to be used directly by patients, often in collaboration with a health professional or artificial agent. These include telemedicine applications used for remote observation, clinical encounters, and video-observed therapy; virtual assistants and chat bots for information or triage; applications for managing chronic illnesses such as cardiovascular disease or hypertension; health and well-being ‘apps’; personal health monitoring systems including wearables with built-in analytics and behavioural recommendations; and remote monitoring systems for facial recognition, gait detection, biometrics, and health-related behaviours.
One purported benefit of AI systems aimed at patients is to “empower patients and communities to assume control of their own health care and better understand their evolving needs.” Health monitoring and telemedicine systems could, for example, assist patients in self-management of chronic conditions like diabetes, hypertension, or cardiovascular disease. Therapeutic “chat bots” may also be able to assist in the management of mental health conditions. It has been predicted, for example, that the GPT-3 natural language model could eventually serve as the basis for conversational agents working directly with patients, for example as an initial point of contact or (more controversially) for triaging non-critical patients. Such applications seem highly plausible given the existing deployment of ‘virtual GP’ chat bots which direct service enquiries and provide information to patients; it should be noted, however, that these chat bots have been the subject of significant debate over their ethical acceptability and regulation, and there are concerns that they may lead to reduced access to human care.
Finally, a wide variety of applications are aimed at health professionals. Three broad categories can be distinguished:
- Applications designed for diagnostics, therapeutics, and other forms of clinical care
- Applications designed for operational or administrative uses
- Applications designed for public health surveillance
The distinction between these categories is not always clear, as will be discussed below. To limit the focus of this report to the potential impact of AI on the doctor-patient relationship, only the first two categories will be surveyed. Public health surveillance could also be conceived as an extension of the clinical experience or doctor-patient relationship, insofar as patients may be contacted proactively by public health officials for clinical follow-up. Nonetheless, this report is concerned principally with the immediate clinical experience and relationship between individual health professionals and their patients.
AI systems aimed at clinical care are designed to fulfil a broad range of tasks, including diagnosis recommendations, optimization of treatment plans, and various other forms of decision-support.
According to the WHO:
“AI is being evaluated for use in radiological diagnosis in oncology (thoracic imaging, abdominal and pelvic imaging, colonoscopy, mammography, brain imaging and dose optimization for radiological treatment), in non-radiological applications (dermatology, pathology), in diagnosis of diabetic retinopathy, in ophthalmology and for RNA and DNA sequencing to guide immunotherapy.”
Future applications currently in development (but not yet deployed clinically) include systems to detect “stroke, pneumonia, breast cancer by imaging, coronary heart disease by echocardiography and detection of cervical cancer,” including systems designed specifically for use in low- and middle-income countries (LMIC). Systems are being designed to predict the risk of lifestyle diseases including cardiovascular disease and diabetes.
Development of medical image classification systems has been particularly prevalent in recent years. Prior work has shown, for example, that neural networks can achieve sensitivity for pathological findings in radiology comparable to, and in some studies higher than, that of human readers. Image classification systems can also be used to support detection of tuberculosis, COVID-19, and other conditions by interpreting staining images or X-rays. Another emerging phenomenon is that of “digital twins”: systems that simulate individual organs or multi-organ systems of individual patients for purposes of disease modelling and prediction.
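As a rough indication of how such image classifiers are typically built, the sketch below fine-tunes a standard pretrained convolutional network for a binary pathology label using PyTorch. The dataset path and the ‘normal’/‘abnormal’ folder layout are placeholder assumptions; clinical-grade systems would additionally require large, expertly labelled datasets, external validation, and regulatory approval.

```python
# Sketch of a binary pathology classifier built by fine-tuning a pretrained
# CNN (PyTorch/torchvision). The dataset path and folder layout are
# placeholders, not a real dataset.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # X-rays are single-channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Expects e.g. chest_xrays/train/{normal,abnormal}/*.png  (hypothetical layout)
train_data = datasets.ImageFolder("chest_xrays/train", transform=preprocess)
loader = DataLoader(train_data, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: normal / abnormal

criterion = nn.CrossEntropyLoss()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for images, labels in loader:          # one pass over the data, for brevity
    optimiser.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimiser.step()
```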
Generally speaking, the deployment of AI in clinical care remains nascent. Clinical efficacy has been established for relatively few systems when compared to the significant research activity in healthcare applications of AI. Research, development, and pilot testing often do not translate into proven clinical efficacy, commercialization, or widespread deployment, and the generalization of performance from trials to routine clinical practice remains largely unproven.
A 2019 meta-analysis of deep-learning image classifiers in healthcare found that despite claims of equivalent accuracy between AI systems and human healthcare professionals:
“Few studies present externally validated results or compare the performance of deep learning models and health-care professionals using the same sample.” Likewise, “poor reporting is prevalent in deep learning studies, which limits reliable interpretation of the reported diagnostic accuracy.”
The evidence base for the clinical efficacy of deep learning systems may have improved in subsequent years, but broad adoption will likely hinge on standardised reporting of accuracy so that medical regulators and clinical care excellence bodies can assess clinical efficacy.
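To illustrate what reporting of diagnostic accuracy involves, performance is commonly summarised by sensitivity and specificity computed on a held-out, ideally external, validation set. The short sketch below uses made-up labels purely to show the calculation.

```python
# Sensitivity and specificity from an external validation set (illustrative
# labels only). 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]   # ground truth from the external site
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 0, 1]   # model predictions

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

sensitivity = tp / (tp + fn)   # proportion of true cases correctly detected
specificity = tn / (tn + fp)   # proportion of non-cases correctly ruled out

print(f"Sensitivity: {sensitivity:.2f}, Specificity: {specificity:.2f}")
```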
A near-term challenge for image classifiers is to build systems that can assess multiple image or scan types, such as X-rays and CT scans, which are often considered in combination by human radiologists, whereas AI systems can typically interpret only one or the other. A similar challenge exists for detection of multiple conditions or pathologies, with existing classifiers often trained to detect only a single type of abnormality.
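One common way of moving beyond single-abnormality classifiers is a multi-label output head: rather than a single prediction over mutually exclusive classes, the network produces an independent probability for each pathology. The sketch below illustrates the idea only; the pathology names and feature dimension are illustrative assumptions, not drawn from any specific system.

```python
# Multi-label output head: one independent probability per pathology,
# rather than a single mutually exclusive class (names are illustrative).
import torch
from torch import nn

PATHOLOGIES = ["pneumonia", "effusion", "nodule", "fracture"]

head = nn.Linear(512, len(PATHOLOGIES))      # assumes 512-d image features
features = torch.randn(1, 512)               # stand-in for a CNN's output

probs = torch.sigmoid(head(features))        # independent probability per label
for name, p in zip(PATHOLOGIES, probs[0]):
    print(f"{name}: {p.item():.2f}")

# Training would use nn.BCEWithLogitsLoss against a multi-hot label vector,
# so several abnormalities can be flagged on the same image.
```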
Finally, many AI systems are designed for administrative or operational purposes, and can assist with several aspects of hospital administration and operational evaluation. Discharge planning tools, for instance, can estimate discharge dates and barriers for hospitalized patients and flag to clinicians those who are clinically ready (or nearly ready) to be discharged, along with a list of steps to be taken prior to discharge. Some systems can even schedule necessary follow-up appointments and care. Natural language processing systems could be used to automate routine or labour-intensive tasks, such as searching and navigating electronic health record (EHR) systems or preparing medical documentation and orders. According to the WHO, “Clinicians might use AI to integrate patient records during consultations, identify patients at risk and vulnerable groups, as an aid in difficult treatment decisions and to catch clinical errors. In LMIC, for example, AI could be used in the management of antiretroviral therapy by predicting resistance to HIV drugs and disease progression, to help physicians optimize therapy.”
Distinguishing between uses of AI for clinical care and research versus those used for operational and quality improvement purposes by hospitals and health systems is often difficult. Many such systems are designed to identify at-risk patients. The UCLA Health network, for example, uses a tool that identifies patients in primary care who are at high risk of being hospitalized or of making frequent visits to an emergency room in the coming year. Similarly, Oregon Health and Science University uses a regression algorithm to monitor patients across the hospital for signs of sepsis. Both are treated as operational tools for monitoring and prioritising quality of care, and not as part of clinical care or research.
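Although the specific tools used by these institutions are proprietary, risk-stratification models of this kind often amount to a regression fitted on routinely collected variables. The sketch below, using scikit-learn with entirely synthetic data and a hypothetical feature set, illustrates the general pattern; it does not represent the UCLA or OHSU systems.

```python
# Illustrative risk-stratification model: logistic regression predicting
# hospital admission within a year from routine variables. Data are synthetic
# and the feature set is hypothetical; this is not any institution's tool.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Features per patient: [age, n_chronic_conditions, ER_visits_last_year]
X = np.column_stack([
    rng.integers(18, 95, size=500),
    rng.integers(0, 8, size=500),
    rng.integers(0, 6, size=500),
])
# Synthetic outcome: 1 = hospitalised within the following year
y = (0.02 * X[:, 0] + 0.5 * X[:, 1] + 0.8 * X[:, 2]
     + rng.normal(0, 1, size=500) > 4).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a (hypothetical) patient and flag them if predicted risk is high.
patient = np.array([[72, 4, 2]])
risk = model.predict_proba(patient)[0, 1]
if risk > 0.5:
    print(f"Flag for outreach: predicted admission risk {risk:.0%}")
```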