KEYWORDS: Prostate, Evolutionary algorithms, Magnetic resonance imaging, Image segmentation, Artificial intelligence, Deep learning, Biopsy, Data modeling, Education and training, Medicine
Purpose: Accurate whole-gland prostate segmentation is crucial for successful ultrasound-MRI fusion biopsy, focal cancer treatment, and radiation therapy. Commercially available artificial intelligence (AI) models that use deep learning algorithms (DLAs) for prostate gland segmentation are rapidly increasing in number, yet their performance in a true clinical context is rarely examined or published. In this study, we used a heterogeneous clinical MRI dataset with the aim of contributing to the validation of such AI models.
Approach: We included 123 patients in this retrospective multicenter (7 hospitals), multiscanner (8 scanners, 2 vendors, 1.5 T and 3 T) study comparing prostate contour assessment by two commercially available, Food and Drug Administration (FDA)-cleared and CE-marked algorithms (DLA1 and DLA2), using an expert radiologist's manual contours as the reference standard (RSexp) on this clinically heterogeneous MRI dataset. No in-house training of the DLAs was performed before testing. Several measures of segmentation overlap were used, the Dice similarity coefficient (DSC) being the most important.
Results: The mean and standard deviation of the DSC was 0.90 ± 0.05 for DLA1 versus the radiologist reference standard (RSexp) and 0.89 ± 0.04 for DLA2 versus RSexp. A paired t-test comparing the DSC of DLA1 and DLA2 showed no statistically significant difference (p = 0.8).
Conclusions: Two commercially available, FDA-cleared and CE-marked DLAs can perform accurate whole-gland prostate segmentation on a par with expert radiologist manual planimetry on a real-world clinical dataset. Implementing AI models in the clinical routine may free up time that can be better invested in complex work tasks, adding more patient value.
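As a concrete illustration of the overlap metric and statistical test described above, the following minimal Python sketch computes per-patient Dice scores and a paired t-test. This is not the study's evaluation code; the mask arrays, cohort size, and perturbation scheme are synthetic stand-ins.

```python
import numpy as np
from scipy.stats import ttest_rel

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two boolean segmentation masks."""
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum())

# Illustrative stand-in data: per-patient boolean masks for the expert
# reference and two algorithm outputs. In the study these would come from
# the MRI volumes; here they are random masks with small perturbations.
rng = np.random.default_rng(0)
masks_ref  = [rng.random((64, 64, 24)) > 0.5 for _ in range(10)]
masks_dla1 = [m ^ (rng.random(m.shape) > 0.95) for m in masks_ref]
masks_dla2 = [m ^ (rng.random(m.shape) > 0.95) for m in masks_ref]

dsc1 = np.array([dice(m, r) for m, r in zip(masks_dla1, masks_ref)])
dsc2 = np.array([dice(m, r) for m, r in zip(masks_dla2, masks_ref)])
print(f"DLA1 vs RSexp: {dsc1.mean():.2f} ± {dsc1.std():.2f}")
print(f"DLA2 vs RSexp: {dsc2.mean():.2f} ± {dsc2.std():.2f}")

# Paired t-test on the per-patient DSC values of the two algorithms.
t_stat, p_val = ttest_rel(dsc1, dsc2)
print(f"paired t-test: p = {p_val:.3f}")
```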
ABSTRACT When training a deep learning model, the dataset used is of great importance to ensure that the model learns relevant features of the data and will be able to generalize to new data. However, it is typically difficult to produce a dataset without some bias toward any specific feature. Deep learning models used in histopathology tend to overfit to the stain appearance of the training data: if the model is trained on data from one lab only, it will usually not generalize to data from other labs. The standard technique to overcome this problem is color augmentation of the training data, which artificially generates more variations for the network to learn. In this work we instead test the use of a so-called domain-adversarial neural network, which is designed to prevent the model from becoming biased toward features that are in reality irrelevant, such as the origin of an image. To test the technique, four datasets from different hospitals for Gleason grading of prostate cancer are used. We achieve state-of-the-art results on these particular datasets, and for two of our three test datasets the approach outperforms color augmentation.
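For readers unfamiliar with the technique, the PyTorch sketch below shows the core of a domain-adversarial setup (Ganin and Lempitsky, 2015): a gradient reversal layer feeding a domain classifier alongside the task head. The layer sizes and head names are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on the
    backward pass, so the feature extractor learns to *confuse* the
    domain classifier while still serving the grading task."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Hypothetical three-part model: shared features, a Gleason-grade head, and
# a domain (lab-of-origin) head placed behind the gradient reversal layer.
features = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())
grade_head  = nn.Linear(16, 4)   # benign, Gleason grades 3, 4, 5
domain_head = nn.Linear(16, 4)   # four source hospitals

x = torch.randn(8, 3, 64, 64)    # toy batch of image patches
f = features(x)
grade_logits  = grade_head(f)
domain_logits = domain_head(GradientReversal.apply(f, 1.0))
# Total loss = grade loss + domain loss; the reversed gradient pushes the
# shared features toward domain invariance.
```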
Prostate cancer is the most commonly diagnosed cancer in men. The diagnosis is confirmed by pathologists through visual inspection of prostate biopsies, which are classified according to Gleason score. The main goal of this paper is to automate this classification using convolutional neural networks (CNNs). The introduction of CNNs has broadened the field of pattern recognition: they replace the classical approach of designing and extracting hand-crafted features for classification with the substantially different strategy of letting the computer itself decide which features are of importance.
For automated classification of prostate cancer into four classes (benign and Gleason grades 3, 4, and 5), we propose a CNN with small convolutional filters, trained from scratch using stochastic gradient descent with momentum. The input consists of microscopic images of haematoxylin and eosin stained tissue; the output is a coarse segmentation into regions of the four classes. The dataset consists of 213 images, each considered to belong to one class only. Using four-fold cross-validation we obtained an error rate of 7.3%, which is significantly better than the previous state of the art on the same dataset. Although the dataset was rather small, good results were obtained, from which we conclude that CNNs are a promising method for this problem. Future work includes obtaining a larger dataset, which could potentially reduce the error rate further.
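A minimal PyTorch sketch in the spirit of the described approach is shown below: a small fully convolutional network with 3x3 filters producing a coarse four-class map, optimized with SGD and momentum. The layer sizes, tile dimensions, and label resolution are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

# Small fully convolutional network: two 3x3 conv blocks with pooling,
# then a 1x1 conv giving per-region logits for the four classes
# (benign, Gleason grades 3, 4, 5).
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(64, 4, 1),
)

# Training from scratch with stochastic gradient descent with momentum,
# as described in the abstract; the learning rate here is a placeholder.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

x = torch.randn(2, 3, 128, 128)        # toy batch of stained-tissue tiles
y = torch.randint(0, 4, (2, 32, 32))   # coarse class labels at 1/4 resolution
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```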