A new paradigm in medical image analysis

By Professor Gustavo Carneiro*
Thursday, 16 July, 2020

The main goal of computer-aided diagnosis (CAD) systems in medical image analysis is the prediction of clinically relevant outcomes (eg, diagnosis, prognosis, pathology) directly from medical images. Predicting ejection fraction from echocardiography images and estimating breast cancer probability from mammograms are examples of tasks that CAD systems address.

A few decades ago, the first attempts to predict outcomes were based on CAD systems that would take medical images and directly produce outcomes using hand-designed image features and rudimentary classification models. These first attempts produced unremarkable prediction accuracy for a few reasons, including lack of computational power, and few medical images available for modelling and testing the systems.

Furthermore, hand-designed features were developed based on conversations with clinicians or on standard image processing tools, so in mathematical terms, they were not optimally designed for improving the prediction of clinical outcomes and consequently the classification results were sub-optimal.

One important breakthrough in the late ’80s was the development of an effective image segmentation tool, called the active contour model, by Dr Kass and colleagues.¹ This tool enabled the development of viable CAD systems to segment organs from medical images and allowed researchers to revisit the problem of designing CAD systems to predict clinically relevant outcomes using features extracted not only from medical images, but also from segmentation maps. Such classification based on images and segmentation maps has dominated the field until today.

Despite a strong push for the development of effective hand-designed image features and more accurate segmentation methods, the prediction accuracy never reached clinically acceptable levels.

Machine learning

The ’90s witnessed remarkable breakthroughs in machine learning, a process in which computer algorithms use datasets containing a large collection of observations (eg, images) and outcomes (eg, diagnosis) to automatically build an optimal mathematical model that can take previously unseen observations and predict their outcomes.
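
The idea of building a model from observations and outcomes, then applying it to an unseen observation, can be illustrated with a minimal sketch. It assumes a nearest-centroid classifier, chosen here only for brevity and not a method named in this article. The two-number feature vectors, the labels and the function names `fit_centroids` and `predict` are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def fit_centroids(observations, outcomes):
    """Build the model: the average feature vector of each outcome class."""
    sums, counts = {}, {}
    for features, label in zip(observations, outcomes):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Predict the outcome whose centroid is closest to the new observation."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Toy training set (made-up numbers): each observation is a pair of image
# features, each outcome is a diagnosis label.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = ["benign", "benign", "malignant", "malignant"]

model = fit_centroids(train_x, train_y)
prediction = predict(model, [0.85, 0.75])  # an observation not seen in training
print(prediction)
```

Real CAD systems use far richer models and many more examples, but the two steps are the same: fit on annotated data, then predict on new data.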

Machine learning penetrated the field of medical image analysis in the late ’90s, when researchers implemented models that could segment medical images and predict clinically relevant outcomes using large datasets containing images and relevant annotations (ie, diagnosis or segmentations). These machine-learning-based CAD systems showed promising results, but there were a few issues:

the methods relied on hand-designed features, so results were still sub-optimal; and
the models needed large, manually annotated datasets.

The second point above was particularly worrying because a lot of the segmentation annotations were not part of the manual image analysis protocol, so clinicians needed to be hired and taught how to produce such annotations.

This process had two major problems:

given that it was an expensive annotation process, datasets were never large enough to allow for the modelling of robust methods; and
since the annotations were not part of clinical protocols, they contained noise that inevitably biased the methods.

It is also interesting to note that the classification paradigm mentioned above, based on images and segmentation maps, remained virtually unaltered during this period.

Deep learning

Meanwhile, deep learning — a particular type of machine learning model formed by a hierarchical structure of classification layers — was being studied by now world-famous researchers, such as Professors Geoff Hinton, Yann LeCun, Yoshua Bengio and Jürgen Schmidhuber.

Deep learning models have been developed incessantly since the late ’80s, but there were always some roadblocks that did not allow their effective use by the scientific community: lack of computational power; small annotated datasets; ineffective modelling of deep learning classifiers, etc.

The main trade-off is that deep learning models are highly complex and tend to require datasets that are orders of magnitude larger than those needed by previous machine learning models.

In 2012, Professor Hinton had a major breakthrough that practically affected the whole scientific community (and later on, the whole of society). He and his colleagues successfully developed a deep learning model that produced the best classification result (by a large margin) in a benchmark computer vision problem.² Such a breakthrough was enabled by the use of a large dataset (one million images) and more adequate computational power based on graphics processing units.

The relevance of deep learning for medical image analysis was that it would allow the development of models that automatically learn optimal features and classifiers to produce a specific output, making the models mathematically optimal for predicting clinical outcomes.

Despite success in various classification problems, deep learning required even larger annotated datasets than previous machine-learning-based CAD systems. As mentioned above, the acquisition and annotation of medical images was an expensive and unreliable process, so many researchers studied ways to train deep learning models on small and potentially unreliable annotated datasets.

Deep learning and decision explanation

Since 2012, with deep learning, researchers have been trying a strategy using only images and outcomes from hospital databases — the same strategy as that adopted in the ’70s and ’80s. The problem is that this approach requires the system to figure out how to explain the decisions made. For example, if the system says there is a 90% probability of malignancy in a mammogram, it should highlight/delineate the region with a suspicious lesion, but the databases from hospitals typically won’t contain that information. This is the current challenge — we can use the hospital databases (containing images and clinical outcomes) without any additional annotation, but the system needs to be ‘smart’ to explain the decisions (by, for example, segmenting image regions).
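
One simple way to make a system point at the region behind its decision, without any segmentation annotation, is occlusion sensitivity: hide one image patch at a time and record how much the predicted score drops. The sketch below illustrates that general technique only and is not the method used in the work cited in this article. The `toy_score` function is a hypothetical stand-in for a trained classifier, and the 4x4 image is made up.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Return a grid holding, for each pixel, the score drop caused by
    hiding the patch that contains it."""
    base = score_fn(image)
    height, width = len(image), len(image[0])
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_score(image):
    """Hypothetical classifier: the score is the brightest pixel, capped at 1."""
    return min(1.0, max(max(row) for row in image))


# Made-up 4x4 image with a bright "lesion" in the bottom-right corner.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.9],
    [0.0, 0.0, 0.9, 0.9],
]

heat = occlusion_map(image, toy_score)
for row in heat:
    print(row)
```

Patches whose removal lowers the score the most are the ones the classifier relied on, so the resulting map acts as a coarse highlight of the suspicious region.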

The field of medical image analysis is currently working on the development of CAD systems modelled with extremely large datasets (containing hundreds of thousands of patients) that follow this new paradigm. Preliminary results show that these systems can not only produce accurate clinically relevant outcomes, but can also explain the decisions for reaching a particular outcome.³⁻⁵ We believe that in the near future, these CAD systems will also be able to automatically discover new imaging biomarkers associated with clinically relevant outcomes, possibly having a profound impact on medicine.

*Gustavo Carneiro is a professor at the School of Computer Science at The University of Adelaide and Director of Medical Machine Learning at the Australian Institute for Machine Learning.

References

1. Kass, M., Witkin, A., & Terzopoulos, D. (1988). Snakes: Active contour models. International Journal of Computer Vision, 1(4), 321-331.
2. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.
3. Carneiro, G., Pu, L. Z. C. T., Singh, R., & Burt, A. (2020). Deep Learning Uncertainty and Confidence Calibration for the Five-class Polyp Classification from Colonoscopy. Medical Image Analysis, 101653.
4. Maicas, G., Bradley, A. P., Nascimento, J. C., Reid, I., & Carneiro, G. (2019). Pre and post-hoc diagnosis and interpretation of malignancy from breast DCE-MRI. Medical Image Analysis, 58, 101562.
5. Gale, W., Oakden-Rayner, L., Carneiro, G., Palmer, L. J., & Bradley, A. P. (2019, April). Producing Radiologist-Quality Reports for Interpretable Deep Learning. In 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) (pp. 1275-1279). IEEE.
