Deep Learning Shows Potential for Accurately Reading Mammograms

The use of deep learning (DL) technology could help radiologists increase the quality of breast cancer screening programs, lower costs, and reduce variability in the cancer detection process.


Karssemeijer

And the role of DL technology in imaging doesn’t stop there. In fact, it is likely that DL computers can be trained to read mammograms as well as radiologists and — in the future — maybe even outperform them, said presenter Nico Karssemeijer, PhD, a professor of computer-aided diagnosis (CAD) at Radboud University Medical Center Nijmegen, the Netherlands, during an RSNA 2017 session.

Radiologists can fail to detect breast cancer, even when working with high-performance equipment under optimal conditions. Dr. Karssemeijer said that CAD systems were developed to help address the problem of undetected cancers in screening mammography.

“But CAD hasn’t delivered on what it was intended to do,” said Dr. Karssemeijer, also director of ScreenPoint Medical BV, a developer of DL and image analysis technology in Nijmegen.

Advances in DL technology, however, show that artificial neural networks can be trained to perform the same tasks as humans. And, according to Dr. Karssemeijer, reading screening mammograms is a task where the conditions are ideal for the application of DL, considering it is a repetitive task for which large amounts of data are available for training.

An example of the potential utility of DL in screening mammography was demonstrated in another presentation, “Detecting Breast Cancer in Mammography: How Close Are Computers to Radiologists?” by Dr. Karssemeijer and colleagues.

In the study, researchers compared the performance of a DL computer detection system to that of six radiologists in detecting breast cancer using digital mammography.

The radiologists retrospectively reviewed 155 exams: 73 malignant and 82 negative. Of the negative exams, 42 were biopsy-proven benign lesions and 40 were normal cases defined as BI-RADS 1 or 2. The DL computer system was applied to the same dataset.

The researchers found that the average area under the receiver operating characteristic curve (AUC) for the six radiologists was 0.83 (CI: 0.76-0.90), compared to 0.79 (CI: 0.72-0.86) for the DL system, suggesting that there was no statistically significant difference between the average performance of the radiologists and that of the DL system.
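For readers unfamiliar with the metric, the area under the curve figures reported above can be read as the probability that a reader, human or computer, assigns a higher suspicion score to a randomly chosen malignant exam than to a randomly chosen negative one. The following is a minimal sketch of that calculation, not the method used in the study. The function name `roc_auc` and the toy scores are assumptions made up for illustration and are not data from the research.

```python
def roc_auc(scores, labels):
    """Return the area under the ROC curve.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; tied pairs count as half.
    labels: 1 for a malignant exam, 0 for a negative exam.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up suspicion scores for seven hypothetical exams
# (illustration only, NOT data from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = roc_auc(toy_scores, toy_labels)
print(f"Toy AUC: {toy_auc:.3f}")
```

In this toy set, 11 of the 12 malignant-negative pairs are ranked correctly, so the AUC is about 0.917. A value of 0.5 would correspond to chance-level ranking and 1.0 to perfect separation of malignant from negative exams.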

DL Aids Radiology Decisions

According to Dr. Karssemeijer, the key to improving the reading of screening mammograms lies not necessarily in detecting suspicious areas on mammograms, but in deciding which of those areas radiologists should act on.

“When we develop these systems further we can get beyond the level of human performance and move to a situation where radiologists will always be involved, but more in the sense of checking computer output rather than doing first reads themselves. So that’s a good sign for the future of screening mammography,” Dr. Karssemeijer said.