Breast cancer is one of the most common cancers among women worldwide, and millions of women undergo breast cancer screening every year to detect and treat the disease early. Mammography is currently the most commonly used screening method. However, existing breast cancer risk assessment models are limited in both accuracy and generalizability, which constrains the effectiveness of screening strategies.
Limitations of existing risk assessment models
- Lack of accuracy
Traditional risk assessment models, such as the Tyrer-Cuzick and Gail models, have limited accuracy in predicting breast cancer risk. For example, in a prospective UK screening cohort, the area under the ROC curve (AUC) was 0.62 for the Tyrer-Cuzick model and 0.59 for the Gail model. An AUC of 0.5 corresponds to random guessing, so both models have limited ability to identify high-risk patients.
- Insufficient use of image information
Traditional models rely primarily on clinical and patient-reported risk factors, such as age, family history, and hormonal factors, and do not fully utilize the rich information in mammograms.
- Race and equipment differences
Existing models perform inconsistently across different races and between different mammography equipment, limiting their widespread application.
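To make the AUC figures quoted above more concrete, here is a minimal sketch of how an AUC can be computed from risk scores and outcomes. It uses the pairwise definition: the probability that a randomly chosen patient who developed cancer received a higher score than a randomly chosen patient who did not. The function name `auc` and the toy data are illustrative assumptions, not taken from any of the cited studies.

```python
def auc(scores, labels):
    """Pairwise AUC: chance that a random positive case outscores a random negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up toy data: 1 = developed cancer, 0 = did not
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(toy_auc)
```

A value near 0.6, as reported for the traditional models, means the model ranks a future cancer case above a non-case only slightly more often than a coin flip would.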
Mirai, a deep learning model developed jointly by MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Jameel Clinic, was designed to address these limitations.
Features of Mirai:
Multi-time point risk prediction
Mirai can predict breast cancer risk at multiple time points (e.g., risk within 1, 2, 3, 4, and 5 years), thereby providing more comprehensive information for clinical decision-making.
Handling missing risk factor information
Mirai is able to process potentially missing risk factor information (such as age, family medical history, hormonal factors, etc.) and supplement model inputs by predicting these factors to ensure the accuracy of risk prediction.
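The fallback idea described above can be sketched in a few lines: when a risk factor is reported, use it; when it is missing, substitute the value predicted from the mammogram. This is only an illustration of the concept. The function `fill_missing_risk_factors`, the factor names, and the numbers are hypothetical, and Mirai's internal handling may differ.

```python
def fill_missing_risk_factors(reported, predicted):
    """Prefer the reported value; fall back to the image-predicted value when it is missing."""
    return {
        name: reported[name] if reported.get(name) is not None else predicted[name]
        for name in predicted
    }


# Hypothetical example: family history was not recorded for this patient
reported_factors = {"age": 52, "family_history": None}
predicted_factors = {"age": 50.3, "family_history": 0.12, "menopausal": 0.8}
model_inputs = fill_missing_risk_factors(reported_factors, predicted_factors)
print(model_inputs)
```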
Consistent risk assessment
Mirai uses a conditional adversarial training mechanism to ensure that it can perform consistent risk assessments on different types of mammography equipment. This means that no matter what device is used to take the mammogram, the model can provide relatively consistent risk prediction results.
Efficiently identify high-risk patients
Mirai performed well on multiple datasets, particularly in identifying patients at high risk of developing breast cancer within the next five years, and was more accurate than existing risk assessment models such as the Tyrer-Cuzick model.
Extensive international validation
Mirai performed consistently across different ethnicities, ages, and breast density categories, which suggests potential value in multiple clinical settings.
Technical Methods of Mirai
Dataset and model development
Dataset collection
- Massachusetts General Hospital (MGH) dataset: includes 210,819 training samples, 25,644 validation samples, and 25,855 test samples. The data comes from mammograms of 56,786 patients and contains detailed risk factor information such as age, family history, and hormonal factors.
- Karolinska University Hospital dataset: contains 19,328 samples from 7,353 patients, mainly screening data from 2008 to 2016.
- Chang Gung Memorial Hospital (CGMH) dataset: contains 13,356 samples from 13,356 patients, mainly screening data from 2010 to 2011.
Model Architecture
- Image Encoder: Each mammogram view is encoded using a shared ResNet-18 model.
- Image Aggregator: Uses a Transformer model to aggregate the encoded information from different views into a comprehensive vector.
- Risk Factor Prediction Module: Predicts traditional risk factors such as age, weight, and hormonal factors from mammograms.
- Additive Risk Layer: Combines the image aggregator output with risk factor information to predict a patient’s breast cancer risk over the next 5 years.
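One way to read the last component is as a layer that builds each year's cumulative risk by adding a non-negative increment to the previous year's value, so the predicted risk within 5 years can never be lower than the risk within 1 year. The sketch below illustrates that idea under stated assumptions: `base_logit` and `yearly_logits` stand in for values that would be computed from the combined image and risk factor representation, and the clamping and sigmoid are illustrative choices rather than a confirmed description of Mirai's implementation.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def additive_risk(base_logit, yearly_logits):
    """Return cumulative risks for years 1..K that never decrease with the horizon."""
    risks = []
    total = base_logit
    for h in yearly_logits:
        total += max(h, 0.0)          # assumption: clamp each yearly increment at zero
        risks.append(sigmoid(total))  # cumulative risk up to this year
    return risks


# Made-up inputs for a five-year horizon
example_risks = additive_risk(-4.0, [0.5, 0.3, -0.2, 0.4, 0.1])
print([round(r, 4) for r in example_risks])
```

Because the increments are clamped at zero and the sigmoid is monotonic, the negative third value in the example leaves the year-3 risk equal to the year-2 risk instead of lowering it.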
Model training and testing
Training process
- The Mirai model is trained on the MGH dataset, using mammograms and corresponding risk factor information.
- A conditional adversarial training mechanism is used to ensure consistent predictions of the model on different devices.
Model Evaluation
- The models were tested on the MGH, Karolinska and CGMH datasets to evaluate their predictive accuracy (C index and area under the ROC curve).
- The performance of the Mirai model was compared with that of other models (Tyrer-Cuzick model, Hybrid DL, and Image-Only DL).
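The C-index used in this evaluation measures how well predicted risks order patients by time to diagnosis. A minimal sketch of Harrell's form of the index follows. Which C-index variant the Mirai authors used is not stated here, and the function name `c_index` and the toy data are assumptions for illustration only.

```python
def c_index(risk, time, event):
    """Harrell's concordance index for right-censored follow-up data."""
    concordant = 0.0
    comparable = 0
    n = len(risk)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if event[i] and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # earlier event received the higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data: time in years, event 1 = cancer diagnosed, 0 = censored
toy_risk = [0.3, 0.2, 0.6, 0.1]
toy_time = [1, 5, 3, 5]
toy_event = [1, 0, 1, 0]
toy_c = c_index(toy_risk, toy_time, toy_event)
print(toy_c)
```

As with AUC, 0.5 corresponds to random ordering and 1.0 to perfect ordering.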
Specific subgroup analysis
- In the MGH dataset, subgroup analyses were performed by race (white, African American, and Asian American), age group, breast density category, and different devices to evaluate the consistent performance of the models.
- In the Karolinska dataset, C-index calculation was performed by future cancer subtype (aggressiveness, HER2 status, etc.).
High-risk patient identification
High risk threshold setting
- A 20% lifetime risk from the Tyrer-Cuzick model was used as the high-risk threshold.
- The same specificity thresholds as the Tyrer-Cuzick model were set for the Image-Only DL, Hybrid DL, and Mirai models for sensitivity comparison.
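The matched-specificity comparison described above can be sketched as follows: measure the specificity of the reference model at its fixed cutoff, find the cutoff at which the second model reaches the same specificity, and then compare sensitivities. The helper names and toy scores are hypothetical. Only the 20% lifetime-risk cutoff comes from the text.

```python
def specificity(scores, labels, threshold):
    """Fraction of non-cancer cases scored below the high-risk threshold."""
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s < threshold for s in neg) / len(neg)


def sensitivity(scores, labels, threshold):
    """Fraction of cancer cases scored at or above the high-risk threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in pos) / len(pos)


def threshold_for_specificity(scores, labels, target):
    """Smallest candidate threshold whose specificity is at least the target."""
    candidates = sorted(set(scores)) + [float("inf")]
    for t in candidates:
        if specificity(scores, labels, t) >= target:
            return t


# Made-up toy data: 1 = developed cancer, 0 = did not
labels = [0, 0, 0, 0, 1, 1, 1, 1]
reference_scores = [0.05, 0.10, 0.25, 0.12, 0.30, 0.15, 0.22, 0.08]  # lifetime risk
new_scores = [0.1, 0.2, 0.7, 0.3, 0.9, 0.6, 0.8, 0.4]                # second model

ref_spec = specificity(reference_scores, labels, 0.20)
ref_sens = sensitivity(reference_scores, labels, 0.20)
new_threshold = threshold_for_specificity(new_scores, labels, ref_spec)
new_sens = sensitivity(new_scores, labels, new_threshold)
print(ref_spec, ref_sens, new_threshold, new_sens)
```

Fixing specificity first makes the sensitivities directly comparable, because both models then flag the same share of cancer-free patients as high risk.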
High-risk patient identification performance
- The sensitivity and specificity of each model in identifying high-risk patients were evaluated, and the performance of each model on different test sets (MGH, Karolinska, and CGMH) was compared.
Bias removal and feature importance analysis
Device bias removal
- A conditional adversarial training mechanism is used to keep the model's predictions consistent across different mammography devices.
- The debiasing effect is evaluated with a device identity classifier, which tries to predict the device from the model's representation.
Feature Importance Analysis
- The importance of each risk factor in the Mirai model's prediction is assessed by calculating a significance score for each risk factor.
Prospective research and model improvement directions
Prospective studies
The actual clinical application effect of the model needs to be further verified in large-scale clinical trials.
Model improvement direction
- The model's predictive accuracy could be further improved by using 3D mammograms in conjunction with the patient's imaging history.
- Further study is needed on how to adapt the model to mammography equipment from different manufacturers to ensure wide clinical application.
Experimental Results of Mirai
Model performance
Overall Performance
- The C-index of the Mirai model on three test sets (MGH, Karolinska, and CGMH) is 0.76, 0.81, and 0.79, respectively, showing higher performance than existing models such as Tyrer-Cuzick and Hybrid DL.
- The area under the receiver operating characteristic curve (AUC) of the Mirai model was significantly higher than that of other models in identifying high-risk patients within 5 years.
Multi-time point risk prediction
- The Mirai model performed well in predicting breast cancer risk within 1, 2, 3, 4, and 5 years, achieving high AUC values at each time point.
- In the MGH dataset, the 5-year AUC of the Mirai model was 0.76, which was significantly higher than that of the Hybrid DL (P < 0.001) and Tyrer-Cuzick models (P < 0.001).
High-risk patient identification
Sensitivity and Specificity
- In the MGH dataset, the Mirai model has significantly higher sensitivity than the Tyrer-Cuzick model and other deep learning models under the same specificity conditions. For example, on the MGH test set the Mirai model flagged as high risk 41.5% of the patients who went on to develop cancer within 5 years, while the Tyrer-Cuzick model flagged only 22.9%.
- The sensitivity of the Mirai model is also significantly higher than that of the Image-Only DL model in the Karolinska and CGMH datasets. For example, in the CGMH dataset, the sensitivity of the Mirai model is 37.4%, while that of the Image-Only DL model is 24.5%.
Subgroup analysis
Race and Age Group
The Mirai model has consistent C-index performance across different races (white, African American, Asian American) and age groups (<40, 40-50, 50-60, 60-70, >70). For example, in the MGH dataset, the Mirai model has a C-index of 0.75 and 0.80 for white and Asian Americans, respectively, while the Tyrer-Cuzick model has a C-index of 0.64 and 0.54, respectively.
Breast Density Categories
The Mirai model performed consistently across different breast density categories (fatty, scattered fibroglandular, heterogeneously dense, and extremely dense), demonstrating its stability and reliability across different breast densities.
Future cancer subtypes
In the Karolinska dataset, the Mirai model's C-index performance was consistent across different future cancer subtypes (aggressiveness, HER2 status, etc.), further validating its applicability across different cancer types.
Device bias removal
Device consistency
Through conditional adversarial training, the Mirai model predicts consistent results on different mammography devices, eliminating device bias. In the MGH test set, the AUC of the device identity classifier dropped from 0.76 without adversarial training to 0.50, showing a significant bias elimination effect.
Feature Importance
Significance of risk factors
In the MGH test set, the most important risk factors include the patient's BRCA status, family history (whether a relative has had the disease), and reproductive status (whether the patient has had children). The significance score of the mammogram, however, is about 30 times higher than that of these traditional risk factors.
General Discussion
Consistency Across Races and Devices
The consistent performance of the Mirai model across ethnicities and devices suggests its potential for application in a wider range of clinical settings to improve the accuracy and efficiency of breast cancer screening.
Potential for clinical application
The Mirai model's significant advantage in identifying high-risk patients suggests that it could help improve early breast cancer detection rates and reduce overscreening and overtreatment of low-risk patients.
Conclusions of Mirai
High-performance risk prediction
The Mirai model significantly outperformed existing traditional risk assessment models (such as the Tyrer-Cuzick model) and other deep learning models (such as Hybrid DL and Image-Only DL) on three independent test sets (MGH, Karolinska, and CGMH). In particular, for 5-year breast cancer risk prediction, Mirai's AUC was higher than that of the comparison models on all test sets.
Wide applicability
The Mirai model performed consistently across different races, age groups, and breast density categories, demonstrating its broad applicability in different clinical settings. This feature enables the Mirai model to provide reliable risk predictions in a diverse patient population.
High-risk patient identification
The Mirai model significantly outperformed existing models in identifying patients at high risk of developing breast cancer within the next 5 years. For example, in the MGH test set, the Mirai model flagged as high risk 41.5% of the patients who went on to develop cancer within 5 years, while the Tyrer-Cuzick model flagged only 22.9%.
Device bias removal
Through conditional adversarial training, the Mirai model successfully eliminated the prediction bias between different mammography devices, ensuring its consistency and reliability across multiple devices.
Feature Importance Analysis
In risk prediction, the significance score of mammograms is much higher than that of traditional risk factors (such as BRCA status, family history, and reproductive status), indicating the important role of imaging information in breast cancer risk assessment.
Clinical applications and future research
Potential for clinical application
The Mirai model has demonstrated great potential for improving breast cancer screening strategies. Through more accurate risk prediction and identification of high-risk patients, it is expected to increase early detection rates of breast cancer and reduce overscreening and overtreatment, thereby improving patient outcomes and reducing healthcare costs.
Future research directions
Future studies could incorporate patients' imaging history and use 3D mammograms to improve the model's predictive accuracy.
Prospective clinical trials in larger and more diverse populations are needed to validate the effectiveness of the Mirai model in actual clinical use.
Further work is also needed on adapting the model to mammography equipment from different manufacturers, so that it can be applied in a wider range of clinical environments.
- Author: KCGOD
- URL: https://kcgod.com/ai-model-to-predict-future-breast-cancer
- Copyright: All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!