Deep Learning Model Performs Better at Classifying Lung Nodules When Provided with Prior CT Images

Algorithm may reduce unnecessary follow-up CT or diagnostic interventions


Kiran Vaidhya Venkadesh
Colin Jacobs, PhD

Using a multinational dataset, a deep learning (DL) algorithm, much like a radiologist, predicted the malignancy of lung nodules more accurately when it could compare new images with those of a previous exam than previously validated models that used a single CT examination.

 

Early-stage lung cancer can manifest as small pulmonary nodules, and CT examinations are highly effective at depicting these nodules. However, many pulmonary nodules are benign, as demonstrated by the false-positive rate of 24% in the National Lung Screening Trial (NLST). It is challenging for radiologists to identify and monitor potentially malignant nodules; despite the presence of nodule management guidelines, accurate characterization remains tedious and is subject to inter- and intrareader variability.

 

In a study published in Radiology, Kiran Vaidhya Venkadesh, a PhD candidate at Radboud University Medical Center in the Netherlands, and his research team tested a version of their DL model that was newly trained to evaluate both the current CT images and those from a prior CT scan. The purpose was to provide a more precise estimation of malignancy risk.

 

The algorithm was trained with nodules from the NLST and evaluated with two external test sets from the Danish Lung Cancer Screening Trial (DLCST) and the Multicentric Italian Lung Detection Trial (MILD). The algorithm was compared with two validated models.

 

The first was a previously validated DL algorithm that only processed a single CT examination. The second was the Pan-Canadian Early Detection of Lung Cancer (PanCan) model introduced in 2013. PanCan, too, relies on information from a single CT scan, but also requires human input for variables including the size and type of nodule, the number of nodules identified in the scan, the patient’s biological sex and other factors. The investigators then tested the algorithm on size-matched external sets from the DLCST and the MILD.

 

“Working with data from multiple national registries allows us to thoroughly analyze how our algorithm would perform on a completely new, unseen dataset from another country,” said senior author Colin Jacobs, PhD, an assistant professor and principal investigator within the Department of Medical Imaging of Radboud University Medical Center.

 

“As we have seen in previous studies and in real-world situations, AI algorithms may suffer from data drift,” Dr. Jacobs explained. “Observing a reliable performance on independent external validation datasets from multiple national registries gives us trust that our algorithm will generalize well across different datasets.”


Examples of screening-detected pulmonary nodules from the Danish Lung Cancer Screening Trial (DLCST) and the Multicentric Italian Lung Detection Trial (MILD), wherein malignancy risks were estimated accurately by the deep learning (DL) algorithm that combines a current and prior CT examination. The lines correspond to the malignancy risk estimation algorithms (solid blue, DL algorithm with prior CT; dotted blue, DL algorithm; dotted green, PanCan model). The percentages correspond to the risk scores from 0% to 100%. (A) Annual low-dose axial chest CT images in a 55-year-old woman with a lung cancer diagnosis in the DLCST show a growing spiculated malignant nodule, with a volume doubling time (VDT) of 481 days. All algorithms produced high malignancy risk scores. (B) Biennial low-dose axial chest CT images in a 67-year-old man with a lung cancer diagnosis in the MILD show a growing malignant nodule (VDT, 232 days). The DL algorithms produced high malignancy risk scores. (C) Annual low-dose axial chest CT images in a 66-year-old male participant without a lung cancer diagnosis in the DLCST show a stable benign nodule in which all algorithms produced low malignancy risk scores. (D) Biennial low-dose axial chest CT images in a 78-year-old male participant without a lung cancer diagnosis in the MILD show a stable part-solid benign nodule, in which the DL algorithm, which combines current and prior CT, produced a low malignancy risk score. However, the algorithm that only processed a single CT produced a high malignancy risk score. PanCan = Pan-Canadian Early Lung Cancer Detection Study. https://doi.org/10.1148/radiol.223308 ©RSNA 2023
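The volume doubling time (VDT) cited in the caption is conventionally derived from two volume measurements under an assumption of exponential growth. The sketch below shows that standard calculation; the function name and the example volumes are hypothetical and are not taken from the study.

```python
import math


def volume_doubling_time(v1_mm3, v2_mm3, interval_days):
    """Estimate volume doubling time in days, assuming exponential growth.

    v1_mm3 and v2_mm3 are nodule volumes on the prior and current scan;
    interval_days is the time between the two scans.
    """
    if v1_mm3 <= 0 or v2_mm3 <= 0 or interval_days <= 0:
        raise ValueError("volumes and interval must be positive")
    if v1_mm3 == v2_mm3:
        return math.inf  # no measured growth, so the volume never doubles
    # A shrinking nodule (v2 < v1) yields a negative value.
    return interval_days * math.log(2) / math.log(v2_mm3 / v1_mm3)


# Hypothetical example: a nodule that quadruples in volume over one year
# has doubled twice, so its VDT is half the scan interval.
example_vdt = volume_doubling_time(100.0, 400.0, 365)
```

A shorter VDT indicates faster growth, which is why the growing malignant nodules in panels A and B are reported with a VDT while the stable benign nodules are not.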

DL Algorithm Effectively Incorporated Prior CT Imaging

In receiver operating characteristic analysis, the new DL algorithm significantly outperformed both the PanCan model and the previous DL algorithm that processed only a single examination.
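For readers unfamiliar with this kind of comparison, receiver operating characteristic analysis is commonly summarized by the area under the curve (AUC): the probability that a randomly chosen malignant nodule receives a higher risk score than a randomly chosen benign one. The sketch below computes that quantity by direct pairwise comparison. The labels and scores are invented toy values for illustration only; they are not study data, and this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of risk scores.

    labels: 1 for malignant, 0 for benign.
    scores: predicted malignancy risk for each nodule.
    """
    malignant = [s for y, s in zip(labels, scores) if y == 1]
    benign = [s for y, s in zip(labels, scores) if y == 0]
    if not malignant or not benign:
        raise ValueError("need at least one malignant and one benign case")
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0  # malignant nodule ranked above benign nodule
            elif m == b:
                wins += 0.5  # ties count as half
    return wins / (len(malignant) * len(benign))


# Hypothetical toy data: two malignant and three benign nodules.
toy_labels = [1, 1, 0, 0, 0]
toy_scores_single_ct = [0.9, 0.4, 0.6, 0.2, 0.1]
toy_scores_with_prior = [0.9, 0.7, 0.3, 0.2, 0.1]

auc_single_ct = roc_auc(toy_labels, toy_scores_single_ct)
auc_with_prior = roc_auc(toy_labels, toy_scores_with_prior)
```

In this toy setup the second set of scores ranks every malignant nodule above every benign one, so its AUC is higher. Whether such a difference is statistically significant requires a separate test, which this sketch does not attempt.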

 

The algorithm relied solely on imaging data to generate its risk estimates and did not consider any demographic information, explained Venkadesh. The NLST data used for training included participants aged 55 to 75 years with a smoking history of at least 30 pack-years.

 

“Because our algorithm was trained on data from heavy smokers, its performance on patients without a significant smoking history remains uncertain,” he said. “Nevertheless, previous research on deep learning algorithms, including ones trained on nodules from NLST like ours, has demonstrated excellent accuracy in predicting risk even in cases involving individuals who have never smoked.”

 

An accompanying Radiology editorial, authored by Carolyn Horst, MBBS, PhD, academic clinical fellow at King’s College London and Guy’s and St Thomas’ National Health Service Foundation Trust, London, and Mizuki Nishino, MD, MPH, professor of radiology at the Brigham and Women’s Hospital and Harvard Medical School, Boston, underscored how studies like this could help to deliver malignancy risk tools that assess not only a nodule’s malignancy but also its potential to cause harm within a specific future timeframe.

 

“The described DL tool may enable earlier intervention for indeterminate but ultimately malignant pulmonary nodules, thereby obviating further follow-up imaging before a conclusive histologic diagnosis is undertaken,” the editorial authors said. “This would limit the potential of so-called stage shift in indeterminate but aggressive lesions that would otherwise metastasize between scan intervals.”

 

Additionally, this deep learning model could be applied to the current screening paradigm to reduce follow-up scans for nodules that are unlikely to manifest as clinically important cancers within three years, they said.

 

Opportunities To Optimize CT Reading Workflow

By providing a more precise understanding of whether and when follow-up examinations are needed, DL tools could help radiologists reduce scan burden and anxiety for screening participants as well as unnecessary costs.

 

This study is one of the first deep learning analyses to use longitudinal CT scans for pulmonary nodule malignancy risk estimation, observed Dr. Jacobs.

 

He predicted that as future AI algorithms employ multiple longitudinal scans when they’re available, their performance will align more closely with how radiologists interpret CT scans in clinical practice.

 

For More Information

Access the Radiology study, “Prior CT Improves Deep Learning for Malignancy Risk Estimation of Screening-detected Pulmonary Nodules” and accompanying editorial, “If Only We Had a Time Machine: Prior CT in Deep Learning for Lung Nodule Prognostication.”

 
