Postdoctoral research fellow working with the Department of Radiology at Mayo Clinic. Ph.D. in Computer Science from the National University of Singapore.
Publications (6)
Phantom-based quality control, the current standard of QC in medical imaging, calibrates image quality at a population level, but does not account for the influence of patient variation on quality. In this work, we present a method to evaluate task-based image quality directly in individual clinical CT exams. Noise power spectrum (NPS) is measured in selected local image regions satisfying linearity and noise stationarity constraints, and globally over the volumetric image. Together with a semi-empirical model of image resolution, NPS is used to calculate noise-equivalent quanta (NEQ), a fundamental metric of image fidelity and information content. The NEQ may be extended to task-based detectability (d’) via a specified task function and model observer. We show that this method can elucidate 1) intra-patient variations in signal detectability and 2) variations in task performance across a patient population. The method may be implemented in a hospital-wide online system that monitors imaging performance in CT exams in real time.
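As a rough illustration of how NEQ and d’ relate to the NPS and a resolution model, the sketch below uses a textbook form that is an assumption here, not the method of the abstract: NEQ(f) = MTF(f)² / NPS(f) up to a constant scale factor, and a prewhitening model observer with d’² = 2π ∫ f W(f)² NEQ(f) df for radially symmetric spectra. The function names and all spectra are hypothetical.

```python
import math

def neq(mtf, nps):
    """Noise-equivalent quanta per frequency bin, up to a constant scale factor."""
    return [m * m / n for m, n in zip(mtf, nps)]

def detectability(freqs, task, mtf, nps):
    """d' of a prewhitening observer for radially symmetric spectra.

    Integrates 2*pi*f * W(f)^2 * MTF(f)^2 / NPS(f) over f with the
    trapezoid rule and returns the square root of the result.
    """
    q = neq(mtf, nps)
    integrand = [2 * math.pi * f * w * w * qi for f, w, qi in zip(freqs, task, q)]
    area = sum(
        0.5 * (integrand[i] + integrand[i + 1]) * (freqs[i + 1] - freqs[i])
        for i in range(len(freqs) - 1)
    )
    return math.sqrt(area)

# Hypothetical spectra on a radial frequency grid (cycles/mm); not measured data.
freqs = [0.05 * i for i in range(21)]
mtf = [math.exp(-((f / 0.5) ** 2)) for f in freqs]            # assumed resolution model
nps = [5.0 + 200.0 * f * math.exp(-f / 0.3) for f in freqs]   # assumed noise power spectrum
task = [10.0 * math.exp(-((f / 0.2) ** 2)) for f in freqs]    # assumed task function W(f)

d_ref = detectability(freqs, task, mtf, nps)
d_half_noise = detectability(freqs, task, mtf, [0.5 * n for n in nps])
print(d_ref, d_half_noise)
```

Under these assumptions d’² is inversely proportional to the NPS, so halving the noise power raises d’ by a factor of √2. A patient-specific pipeline would replace the synthetic lists with spectra measured in the exam.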
Purpose. Spine surgery involves complex workflows and disparate levels of system integration that challenge the introduction of emerging technologies. This work develops a computational simulation framework based on statistical surgical process models (SPM) to quantitatively evaluate variations in the workflow and implementation of image guidance systems in terms of key outcome measures in spine surgery. Method. A statistical SPM was developed for spine surgery to describe the effects of various intraoperative technologies (viz., fluoroscopy, CT, image-to-world registration, and planning methods) and a range of procedural variables (e.g., surgeon skill, patient body mass index (BMI), target vertebrae, and fusion length) on key outcome measures, including cycle time, radiation dose and the quality of surgical product (geometric accuracy in pedicle screw placement). The model was parameterized by statistical distributions informed by clinical observation, expert feedback, literature review, and clinical data. Results. The results quantify the advantages of intraoperative CT and/or long-length scout radiography for reduced cycle time in vertebral localization – (4.8-7.2) min, compared to (5.8-12.4) min by fluoroscopy. The models further demonstrate the cycle time for imaging, registration, and planning in surgical guidance: the mean procedure cycle time for 11-level fusion was 540 min by fluoroscopy compared to 441 min for CT + navigation. Analysis of radiation dose quantified the effective dose to the patient (and operating room) between fluoroscopy and CT. The geometric accuracy of pedicle screw placement showed median error of 2.7 mm for fluoroscopy compared to 1.8 mm for CT+navigation and a corresponding reduction in frequency of pedicle breach for the latter. Conclusions. A statistical SPM provides a powerful framework for procedure simulation, evaluation of emerging technologies, and optimization of procedural workflow. 
Such modeling provides a quantitative basis to demonstrate the value of emerging technologies and to identify optimal means of integration and implementation in clinical workflow.
Detection of low contrast liver metastases varies between radiologists. Training may improve performance for lower-performing readers and reduce inter-radiologist variability. We recruited 31 radiologists (15 trainees, eight non-abdominal staff, and eight abdominal staff) to participate in four separate reading sessions: pre-test, search training, classification training, and post-test. In the pre-test, each radiologist interpreted 40 liver CT exams containing 91 metastases, circumscribed suspected hepatic metastases while under eye tracker observation, and rated confidence. In search training, radiologists interpreted a separate set of 30 liver CT exams while receiving eye tracker feedback and after coaching to increase use of coronal reformations, interpretation time, and use of liver windows. In classification training, radiologists interpreted up to 100 liver CT image patches, most with benign or malignant lesions, and compared their annotations to ground truth. Post-test was identical to pre-test. Between pre- and post-test, sensitivity increased by 2.8% (p = 0.01) but AUC did not change significantly. Missed metastases were classified as search errors (<2 seconds gaze time) or classification errors (>2 seconds gaze time) using the eye tracker. Out of 2775 possible detections, search errors decreased (10.8% to 8.1%; p < 0.01) but classification errors were unchanged (5.7% vs 5.7%). When stratified by difficulty, easier metastases showed larger reductions in search errors: for metastases with average sensitivity of 0-50%, 50-90%, and 90-100%, reductions in search errors were 16%, 35%, and 58%, respectively. The training program studied here may be able to improve radiologist performance by reducing search errors but not classification errors.
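The gaze-time rule for separating search errors from classification errors can be sketched as follows. This is a minimal illustration, not the study's software: the function names and gaze times are hypothetical, and assigning a gaze time of exactly 2 seconds to the classification category is an assumption, since the abstract only specifies "<2" and ">2".

```python
from collections import Counter

# Threshold taken from the abstract; treatment of exactly 2.0 s is an assumption.
GAZE_THRESHOLD_S = 2.0

def classify_miss(gaze_seconds, threshold=GAZE_THRESHOLD_S):
    """Label a missed metastasis by how long the reader looked at it."""
    return "search" if gaze_seconds < threshold else "classification"

def error_rates(missed_gaze_times, n_possible):
    """Fraction of all possible detections lost to each error type."""
    counts = Counter(classify_miss(t) for t in missed_gaze_times)
    return {kind: counts[kind] / n_possible for kind in ("search", "classification")}

# Hypothetical gaze times (seconds) for five missed metastases out of 50 possible detections.
missed = [0.0, 0.4, 1.9, 2.5, 6.0]
rates = error_rates(missed, n_possible=50)
print(rates)
```

Expressing both rates against the same denominator (all possible detections) matches how the abstract reports them and makes pre-test and post-test values directly comparable.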
Purpose: Radiologists exhibit wide inter-reader variability in diagnostic performance. This work aimed to compare different feature sets to predict if a radiologist could detect a specific liver metastasis in contrast-enhanced computed tomography (CT) images and to evaluate possible improvements in individualizing models to specific radiologists. Approach: Abdominal CT images from 102 patients, including 124 liver metastases in 51 patients, were reconstructed at five different kernels/doses using projection domain noise insertion to yield 510 image sets. Ten abdominal radiologists marked suspected metastases in all image sets. Potentially salient features predicting metastasis detection were identified in three ways: (i) logistic regression based on human annotations (semantic), (ii) random forests based on radiologic features (radiomic), and (iii) inductive derivation using convolutional neural networks (CNN). For all three approaches, generalized models were trained using metastases that were detected by at least two radiologists. Conversely, individualized models were trained using each radiologist’s markings to predict reader-specific metastases detection. Results: In fivefold cross-validation, both individualized and generalized CNN models achieved higher area under the receiver operating characteristic curves (AUCs) than semantic and radiomic models in predicting reader-specific metastases detection ability (p < 0.001). The individualized CNN, with a mean (SD) AUC of 0.85 (0.04), outperformed the generalized one [AUC = 0.78 (0.06), p = 0.004]. The individualized semantic [AUC = 0.70 (0.05)] and radiomic models [AUC = 0.68 (0.06)] outperformed the respective generalized versions [semantic AUC = 0.66 (0.03), p = 0.009; radiomic AUC = 0.64 (0.06), p = 0.03]. Conclusions: Individualized models slightly outperformed generalized models for all three feature sets.
Inductive CNNs were better at predicting metastases detection than semantic or radiomic features. Generalized models have implementation advantages when individualized data are unavailable.
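The difference between generalized and individualized training targets can be sketched as below. This follows one possible reading of the abstract, namely that a metastasis counts as detected for the generalized model when at least two radiologists marked it. That reading, the function names, and the reader identifiers are assumptions for illustration only.

```python
def generalized_label(detected_by_reader, min_readers=2):
    """1 if at least `min_readers` radiologists marked the metastasis, else 0."""
    return int(sum(detected_by_reader.values()) >= min_readers)

def individualized_labels(detected_by_reader):
    """One binary target per radiologist: did this reader mark the metastasis?"""
    return {reader: int(hit) for reader, hit in detected_by_reader.items()}

# Hypothetical markings of a single metastasis by three readers.
marks = {"R01": True, "R02": False, "R03": True}
print(generalized_label(marks))       # shared target for the generalized model
print(individualized_labels(marks))   # reader-specific targets
```

The same feature vector thus gets one pooled target in the generalized setting and a separate target per reader in the individualized setting, which is why the latter needs markings from each radiologist.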
There is substantial variability in the performance of radiologist readers. We hypothesized that certain readers may have idiosyncratic weaknesses towards certain types of lesions, and unsupervised learning techniques might identify these patterns. After IRB approval, 25 radiologist readers (9 abdominal subspecialists and 16 non-specialists or trainees) read 40 portal phase liver CT exams, marking all metastases and providing a confidence rating on a scale of 1 to 100. We formed a matrix of reader confidence ratings, with rows corresponding to readers, and columns corresponding to metastases, and each matrix entry providing the confidence rating that a reader gave to the metastasis, with zero confidence used for lesions that were not marked. A clustergram was used to permute the rows and columns of this matrix to group similar readers and metastases together. This clustergram was manually interpreted. We found a cluster of lesions with atypical presentation that were missed by several readers, including subspecialists, and a separate cluster of small, subtle lesions where subspecialists were more confident of their diagnosis than trainees. These and other observations from unsupervised learning could inform targeted training and education of future radiologists.
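The reader-by-metastasis confidence matrix described above can be sketched as follows. The reader and lesion names and the ratings are hypothetical, and the Euclidean row distance is only an assumed example of the kind of similarity measure a clustergram could use to group readers; the abstract does not state which metric was used.

```python
import math

def confidence_matrix(ratings, readers, lesions):
    """Rows are readers, columns are metastases; unmarked lesions get confidence 0."""
    return [[ratings.get((r, l), 0) for l in lesions] for r in readers]

def row_distance(a, b):
    """Euclidean distance between two readers' confidence profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical ratings on the 1-100 scale, keyed by (reader, metastasis).
readers = ["A", "B", "C"]
lesions = ["m1", "m2", "m3"]
ratings = {
    ("A", "m1"): 90, ("A", "m2"): 80,
    ("B", "m1"): 85, ("B", "m2"): 75,
    ("C", "m3"): 60,
}

matrix = confidence_matrix(ratings, readers, lesions)
for reader, row in zip(readers, matrix):
    print(reader, row)
print(row_distance(matrix[0], matrix[1]), row_distance(matrix[0], matrix[2]))
```

In this toy example readers A and B have similar profiles and would be grouped together, while reader C, who marked only the lesion the others missed, would sit apart.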
The diagnostic performance of radiologist readers exhibits substantial variation that cannot be explained by CT acquisition protocol differences. Studying reader detectability from CT images may help identify why certain types of lesions are missed by multiple or specific readers. Ten subspecialized abdominal radiologists marked all suspected metastases in a multi-reader-multi-case study of 102 deidentified contrast-enhanced CT liver scans at multiple radiation dose levels. A reference reader marked ground truth metastatic and benign lesions with the aid of histopathology or tumor progression on later scans. Multi-slice image patches and 3D radiomic features were extracted from the CT images. We trained deep convolutional neural networks (CNNs) to predict whether an average (generalized) or individual radiologist reader would detect or miss a specific metastasis from an image patch containing it. The individualized CNN showed higher performance, with an area under the receiver operating characteristic curve (AUC) of 0.82, compared to the generalized one (AUC = 0.78) in predicting reader-specific detectability. Random forests were used to build the respective versions from radiomic features. Both the individualized (AUC = 0.64) and generalized (AUC = 0.59) predictors from radiomic features showed limited ability to differentiate detected from missed lesions. This shows that CNNs can automatically learn features that are better predictors of reader detectability of lesions than radiomic features. Individualized prediction of difficult lesions may allow targeted training of idiosyncratic weaknesses but requires substantial training data for each reader.
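For readers unfamiliar with the AUC values quoted above, the sketch below computes AUC as the probability that a randomly chosen detected lesion receives a higher predicted score than a randomly chosen missed lesion, with ties counted as one half. This is the standard rank-based definition, not code from the study, and the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each (positive, negative) pair scores 1 for a correct ordering, 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = reader detected the metastasis, 0 = reader missed it.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ordering, so the radiomic predictors reported above (0.59 and 0.64) sit much closer to chance than the CNN predictors (0.78 and 0.82).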