KEYWORDS: Tumor growth modeling, Cancer detection, Deep learning, Data modeling, Image processing, Image enhancement, Process modeling, Mammography, Image filtering
We present an automated method to generate synthetic contrast-enhanced mammography cases with simulated microcalcification clusters. This method accounts for existing textures in the breast, with the simulated clusters inserted in the low-energy image. In parallel, potential mass-like enhancement is modelled from real values in the recombined image. The same deep learning model was trained with different amounts and ratios of real and synthetic data. When the model was trained with real data only, malignant masses were more often correctly detected and classified than malignant microcalcification clusters. The addition of synthetic data with simulated clusters during training could increase detection sensitivity for all types of malignant lesions while maintaining similar levels of AUC for classification. This enhanced performance was consistent on both internal and external test sets. These findings demonstrate the potential applicability of synthetic data to enhance deep learning models, especially when real data are scarce or imbalanced.
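The idea of training one model on different amounts and ratios of real and synthetic data can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper name `mix_training_set`, the case identifiers, and the 25% synthetic share are all assumptions chosen for illustration.

```python
import random


def mix_training_set(real_cases, synthetic_cases, n_total, synthetic_fraction, seed=0):
    """Draw a training list with a fixed share of synthetic cases (hypothetical helper)."""
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must lie in [0, 1]")
    n_synth = round(n_total * synthetic_fraction)
    n_real = n_total - n_synth
    if n_real > len(real_cases) or n_synth > len(synthetic_cases):
        raise ValueError("not enough cases to satisfy the requested mix")
    rng = random.Random(seed)  # seeded so that a given mix is reproducible
    mixed = rng.sample(real_cases, n_real) + rng.sample(synthetic_cases, n_synth)
    rng.shuffle(mixed)
    return mixed


# Placeholder identifiers, not real data.
real = [f"real_{i:03d}" for i in range(100)]
synthetic = [f"synth_{i:03d}" for i in range(100)]
train = mix_training_set(real, synthetic, n_total=80, synthetic_fraction=0.25)
```

Repeating the call with other values of `n_total` and `synthetic_fraction` would give the series of training sets needed to compare amounts and ratios under otherwise identical settings.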
Deep learning (DL) models can be trained on contrast-enhanced mammography (CEM) images to detect and classify lesions in the breast. As they often put more emphasis on the masses enhanced in the recombined image, they can fail to recognize microcalcification clusters, since these are hardly enhanced and are mainly visible in the (processed) low-energy image. Therefore, we developed a method to create synthetic data with simulated microcalcification clusters to be used for data augmentation and explainability studies when training DL models. First, 3-dimensional voxel models of simulated microcalcification clusters were constructed from descriptors of shape and structure. In a set of 500 simulated microcalcification clusters, the ranges of the size and of the number of microcalcifications per cluster followed the distribution of real clusters. The insertion of these clusters in real images of non-delineated CEM cases was evaluated by radiologists. The realism score was acceptable for single-view applications. Radiologists could more easily categorize synthetic clusters into benign versus malignant than real clusters. In a second phase of the work, the role of synthetic data for training and/or explaining DL models was explored. A Mask R-CNN model was trained with synthetic CEM images containing microcalcification clusters. After a training run of 100 epochs, the model was found to overfit on a training set of 192 images. An evaluation with multiple test sets showed that the model's high level of sensitivity was due to it recognizing the image rather than the cluster. Synthetic data could support further tests, such as assessing the impact of particular features in both background and lesion models.
Characterization of microcalcification clusters in the breast and differentiation between benign and malignant structures on (contrast-enhanced) mammography (CEM) images is of great importance for identifying cancerous lesions. Computer algorithms may help perform these tasks, but typically need large sets of data for model training. Therefore, this paper develops a method to create synthetic microcalcification clusters that can later be used to overcome data sparsity problems. Starting from descriptors of the shape and size, both benign and malignant microcalcifications were created and then combined into 3-dimensional cluster models with realistic geometric properties. The distributions of the largest diameter and the number of microcalcifications per cluster in a set of 500 simulated clusters were set such that they agreed with those of real clusters. An existing simulation tool was then extended to insert the clusters into processed, low-energy CEM background images with appropriate contrast values. In a validation study comprising 40 real and 40 synthetic cases, radiologists were asked to evaluate realism and malignancy. It was found that the shape and the structure of the individual microcalcifications as well as the complete clusters were realistic. Thus, the descriptors were chosen correctly and enabled a good classification between benign and malignant cases. The realistic brightness and boundary smoothness showed that the simulation tool can correctly insert the 3D clusters into real background images and is suitable for creating a large set of realistic microcalcification clusters simulated in existing (contrast-enhanced) mammography images. With improvements in the correspondence of insertion location between the craniocaudal and mediolateral oblique views, which proved more challenging to simulate realistically, this promising method is expected to be applicable for modeling complete synthetic cases.
Such a dataset can be used for data enrichment where data sources are limited and for development and training purposes.
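The two steps described above, building a 3-dimensional cluster and inserting it into a background image, can be sketched in a heavily simplified form. This is not the simulation tool used in the work: the spherical calcifications, the radius range of 0.1 to 0.5 mm, the pixel size, the flat background, and the additive `contrast_per_mm` model are all assumptions made for illustration, as are the function names.

```python
import math
import random


def simulate_cluster(n_calcs, cluster_diameter_mm, rng):
    """Place n_calcs spherical calcifications inside a spherical cluster volume."""
    radius = cluster_diameter_mm / 2.0
    calcs = []
    while len(calcs) < n_calcs:
        # Rejection sampling: draw in the bounding cube, keep points inside the sphere.
        x, y, z = (rng.uniform(-radius, radius) for _ in range(3))
        if x * x + y * y + z * z <= radius * radius:
            # Last value: calcification radius in mm (assumed range).
            calcs.append((x, y, z, rng.uniform(0.1, 0.5)))
    return calcs


def insert_cluster(image, calcs, center_rc, pixel_mm, contrast_per_mm):
    """Project along z and add contrast proportional to the path length (assumed model)."""
    out = [row[:] for row in image]  # work on a copy, leave the background untouched
    for x, y, _z, r in calcs:
        reach = int(math.ceil(r / pixel_mm))
        row0 = center_rc[0] + round(y / pixel_mm)
        col0 = center_rc[1] + round(x / pixel_mm)
        for dr in range(-reach, reach + 1):
            for dc in range(-reach, reach + 1):
                rr, cc = row0 + dr, col0 + dc
                d2 = (dr * pixel_mm) ** 2 + (dc * pixel_mm) ** 2
                if 0 <= rr < len(out) and 0 <= cc < len(out[0]) and d2 <= r * r:
                    chord = 2.0 * math.sqrt(r * r - d2)  # path length through the sphere
                    out[rr][cc] += contrast_per_mm * chord
    return out


rng = random.Random(1)
cluster = simulate_cluster(n_calcs=12, cluster_diameter_mm=8.0, rng=rng)
background = [[100.0] * 200 for _ in range(200)]  # flat placeholder, not a real image
synthetic_image = insert_cluster(background, cluster, center_rc=(100, 100),
                                 pixel_mm=0.1, contrast_per_mm=5.0)
```

A more faithful version would draw the cluster diameter and the number of calcifications from distributions matched to real clusters, use irregular shapes built from the benign and malignant descriptors, and apply the same 3D model to both views.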