Synopsis:
BI-RADS breast tissue composition defines which imaging modality is best suited for tissue examination. However, it is subjective and varies between readers, whereas AI techniques have been shown to remove subjectivity. We evaluate state-of-the-art AI algorithms on general whole-body non-contrast MRI to quantify the amount of fat versus nonfat tissue and compare the results with radiologists' reports. Our results show a significant correlation between the AI and the radiologists' decisions. Further, we show on a large dataset that the rate of replacement of nonfat fibroglandular tissue with fatty tissue is almost three times higher in premenopausal women than in postmenopausal women.
Summary of Main Findings:
We developed an automatic, reliable and reproducible method for separating breast fat from thoracic fat and for quantifying breast-density volume from MRI. We show an association between AI-derived and expert-assigned density classes. The fibroglandular-tissue decline in premenopausal women is 2.7 times greater than in postmenopausal women.
Introduction:
Breast cancer is the second most common cancer diagnosed in women after skin cancer. Typically, younger women have denser tissue, and as they age this glandular tissue is variably replaced by fatty tissue. Women who have or retain dense tissue are at higher risk of cancer than women with fatty replacement. Currently, radiologists use the Breast Imaging Reporting and Data System (BI-RADS) to classify breast tissue composition into 4 categories: almost entirely fat, scattered fibroglandular tissue, heterogeneous fibroglandular tissue and extreme fibroglandular tissue. Radiologists use BI-RADS to report the risk of a lesion being breast cancer and to provide guidance for imaging surveillance. Despite its long clinical success, the BI-RADS tissue composition atlas is subjective and varies between readers and even within the same reader [1,2]. Inter-reader agreement using BI-RADS is only 65% [3]. In recent years, many automated methods have been proposed to compute dense versus fatty tissue volume from mammograms and dynamic contrast-enhanced magnetic resonance imaging. These methods may offer increased reproducibility and diagnostic accuracy.
In this work, we evaluate the possibility of using routine T2-weighted MRI images to quantify the amount of dense tissue in the breast. We then identify quantifiable thresholds on the percentage volumes to classify the breast into each of the 4 classes of the BI-RADS scoring system.
Methods:
From a cohort of 1400 female subjects who underwent whole-body MRI (WB-MRI) scanning as part of a preventative health screening program, 1187 were selected. Women with breast implants, or those whose radiology report classifications were not available, were excluded from the study. Of the 1187, 698 self-reported as postmenopausal (median age 62 years) and 489 self-reported as premenopausal (median age 41 years). The WB-MRI protocol included an axial T2-weighted MRI sequence of the chest, which served as the dataset for this study. Breast composition was assigned to one of the four classes independently by radiologists at the time of WB-MRI interpretation and reporting.
The automatic quantification of the breast was done in a hierarchical manner. First, the breast was segmented using a deep neural network. An nnU-Net [4] was trained on radiologist-labeled breast tissue from 40 scans, and the trained network was used to infer results on the rest of the data. The trained model is capable of distinguishing breast fat from thoracic fat (Fig. 1). Second, another nnU-Net-based network was trained to separate fibroglandular tissue from fatty tissue within the previously segmented breast. The breast pre-segmentation step was used to minimize the risk of false positives from potentially confounding tissue types at the difficult-to-delineate edges of unsegmented breasts. The ground-truth data for tissue classification were generated by manually labeling a part of the data. nnU-Net was trained with default parameters for both tasks.
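The quantification step that follows the two segmentations can be sketched as below. This is a minimal illustration, not the authors' code: the function name `percent_fat`, the mask arguments and the voxel-spacing parameter are all assumptions, and the two boolean masks are taken to be the outputs of the breast and fibroglandular-tissue networks resampled to the same grid.

```python
import numpy as np

def percent_fat(breast_mask, fgt_mask, voxel_spacing_mm):
    """Return (fat_ml, fgt_ml, percent_fat) for one scan.

    breast_mask      : boolean array, stage-one breast segmentation (assumed input)
    fgt_mask         : boolean array, stage-two fibroglandular segmentation (assumed input)
    voxel_spacing_mm : (dz, dy, dx) voxel size in millimetres (assumed input)
    """
    breast = np.asarray(breast_mask, dtype=bool)
    # Keep only fibroglandular voxels that lie inside the segmented breast,
    # mirroring the hierarchical (breast first, tissue second) design.
    fgt = np.asarray(fgt_mask, dtype=bool) & breast

    voxel_ml = float(np.prod(voxel_spacing_mm)) / 1000.0  # mm^3 -> ml
    breast_ml = breast.sum() * voxel_ml
    if breast_ml == 0:
        raise ValueError("empty breast mask")

    fgt_ml = fgt.sum() * voxel_ml
    fat_ml = breast_ml - fgt_ml
    return fat_ml, fgt_ml, 100.0 * fat_ml / breast_ml
```

Restricting the fibroglandular mask to the breast mask means that any stray stage-two predictions outside the breast cannot inflate the dense-tissue volume.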
Results:
The breast segmentation results were evaluated by splitting the data into train and test sets and computing Dice scores on the test data. The breast segmentation model achieved a Dice score of 92% on the test data. Similarly, a Dice score of 90% was obtained for breast tissue segmentation. Fig. 2 shows that the mean percentage fat for each class increases from the dense to the fatty class. The tissue volumes by group were: premenopausal women, fibroglandular = 57.54 ml, fatty = 1847.01 ml; postmenopausal women, fibroglandular = 131.57 ml, fatty = 1446.02 ml. However, the variance of the percentage dense volume is large, suggesting a large overlap between classes and hence the need for data-driven thresholds that can estimate the classes more accurately. From the regression analysis, we observe that the rate of change of fatty tissue is larger in premenopausal women (slope = 2.84) than in postmenopausal women (slope = 1.06), as shown in Fig. 3, so these groups should be evaluated separately. Based on the plots in Fig. 2, the thresholds on percentage fat for the dense, heterogeneous, scattered and fatty classes should be < 93%, 93-96.5%, 96.5-99% and > 99%, respectively. The correlation between AI-assigned classes and radiologist reports was significant (r = 0.34, p < 0.0001).
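As an illustration of the two quantities used here, the sketch below computes a Dice score between a predicted and a reference mask and maps a percentage-fat value to a density class using the thresholds stated above. The function names are hypothetical, and the handling of values that fall exactly on a boundary (93%, 96.5%, 99%) is an assumption, since the text does not specify it.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice overlap between two binary masks, in [0, 1]."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def density_class(percent_fat):
    """Map percentage fat to a density class using the thresholds in the text.

    Boundary handling is an assumption: 93% is counted as heterogeneous,
    96.5% and 99% as scattered.
    """
    if percent_fat < 93.0:
        return "dense"
    if percent_fat < 96.5:
        return "heterogeneous"
    if percent_fat <= 99.0:
        return "scattered"
    return "fatty"
```

For example, `density_class(95.0)` returns `"heterogeneous"`, and a scan with 99.5% fat falls in the `"fatty"` class.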
Discussion:
In this study, a two-step automated process (breast segmentation followed by fat tissue subtype differentiation) was performed as part of an AI-driven calculation of breast fatty-tissue percentage on MRI. The overarching goal is a step towards optimal classification of breast density classes as per the BI-RADS system. Our results demonstrate a statistically significant correlation between AI-driven breast fatty-tissue percentage quantification and radiologist-assigned BI-RADS breast density classes. With the proposed AI methods, the quantification can be used to remove subjectivity and improve the accuracy of breast dense-tissue volume estimates. Our results also indicate that a single model for both premenopausal and postmenopausal women may not be suitable, as the rate of change differs between these populations; they should therefore be analyzed separately.
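The per-group analysis can be sketched as a separate least-squares line fit for each population. Everything in this example is an assumption made for illustration: the abstract does not state the regression variables, so the sketch regresses a fat measure on age, and the data points are synthetic, noise-free values constructed from the two reported slopes (2.84 and 1.06), not study data. The intercepts are arbitrary.

```python
import numpy as np

def fit_slope(x, y):
    """Slope of the ordinary least-squares line y = slope * x + intercept."""
    slope, _intercept = np.polyfit(np.asarray(x, float), np.asarray(y, float), deg=1)
    return float(slope)

# Synthetic, noise-free points (NOT study data); intercepts are arbitrary.
ages_pre = np.array([30.0, 35.0, 40.0, 45.0, 50.0])
ages_post = np.array([55.0, 60.0, 65.0, 70.0, 75.0])
fat_pre = 2.84 * ages_pre + 10.0
fat_post = 1.06 * ages_post + 100.0

# Fit each group on its own rather than pooling them into one model.
slope_pre = fit_slope(ages_pre, fat_pre)
slope_post = fit_slope(ages_post, fat_post)
ratio = slope_pre / slope_post

print(f"premenopausal slope  = {slope_pre:.2f}")
print(f"postmenopausal slope = {slope_post:.2f}")
print(f"ratio                = {ratio:.2f}")
```

The ratio of the two reported slopes, 2.84 / 1.06, is approximately 2.68, which is consistent with the roughly 2.7-fold difference quoted in the summary.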
Conclusion:
Our AI-driven breast tissue composition quantification correlated with the current standard of breast density classification by radiologist determination. However, the analysis also highlights the inherent subjectivity of the current standard, as large overlap between classes was observed in the ranges of fat-percentage values within each class.
References:
[1] JMH Timmers, HJ van Doorne-Nagtegaal, ALM Verbeek, GJ Den Heeten, and MJM Broeders. A dedicated BI-RADS training programme: effect on the inter-observer variation among screening radiologists. European Journal of Radiology, 81(9):2184–2188, 2012.
[2] Emily F Conant, Brian L Sprague, and Despina Kontos. Beyond BI-RADS density: a call for quantification in the breast imaging clinic. Radiology, 286(2):401–404, 2018.
[3] Emily F Conant et al. Beyond BI-RADS density: a call for quantification in the breast imaging clinic. Radiology, 286(2):401–404, 2018.
[4] F Isensee, PF Jaeger, SA Kohl, J Petersen, and KH Maier-Hein. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods, 1–9, 2020.