1. INTRODUCTION

Photon-counting spectral computed tomography (CT) is a promising technology for next-generation CT scanners.1–3 Advantages of photon-counting detectors over standard energy-integrating detectors include a higher contrast-to-noise ratio, higher spatial resolution, and improved low-dose imaging. One common issue in photon-counting CT is detector inhomogeneity, which causes the energy thresholds to vary across detector elements and, if not corrected for, leads to streak artifacts in the sinogram domain and ring artifacts in the image domain. This inhomogeneity can arise from an insufficiently calibrated forward model, temperature differences, and defective pixels.4 Many methods have been suggested for artifact and noise reduction in CT imaging, and lately there has been a shift towards deep learning as a way to tackle these problems.4–9 In this work, we add to this literature by training a deep neural network for ring artifact correction in the sinogram domain and demonstrating its effectiveness in reducing ring artifacts in virtual monoenergetic images at a range of energy levels in photon-counting spectral CT.

2. METHOD

2.1 Photon-counting spectral CT

2.1.1 Material decomposition

Consider a multi-bin system with B > 2 energy bins and, for simplicity, a two-dimensional image space. The material decomposition starts with the ansatz that the X-ray linear attenuation coefficient μ(x, y; E) can be approximated by a linear combination of M basis materials,

    μ(x, y; E) ≈ Σ_{m=1}^{M} a_m(x, y) τ_m(E),    (1)

where a_m and τ_m(E) are the basis coefficients and basis functions, respectively. The problem is usually solved in the sinogram domain, and thus the target variables are the material line integrals

    A_m = R a_m,    (2)

where R denotes the Radon transform operator. The expected number of photons λ_j in energy bin j follows the polychromatic Beer-Lambert law

    λ_j(A) = ∫ S_j(E) exp(−Σ_{m=1}^{M} A_m τ_m(E)) dE,    (3)

where S_j(E) is the effective bin sensitivity, combining the incident spectrum and the detector response of bin j. This is our forward model.
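As a concrete illustration, the discretized forward model of Eq. (3) can be sketched in a few lines of NumPy. The energy grid, basis functions, and bin sensitivities below are toy placeholders, not the silicon-detector response model used in this work.

```python
import numpy as np

# Discretized polychromatic Beer-Lambert forward model (Eq. 3):
# lambda_j = sum_E S_j(E) * exp(-sum_m A_m * tau_m(E)) * dE.
# All spectra and basis functions here are illustrative placeholders.

E = np.linspace(20.0, 120.0, 101)            # energy grid [keV]
dE = E[1] - E[0]

# Photoelectric-like (~1/E^3) and Compton-like (~constant) basis functions.
tau = np.stack([1e5 / E**3, 0.2 * np.ones_like(E)])   # shape (M, n_E)

# Two rectangular energy bins with flat effective sensitivity S_j(E).
S = np.stack([(E < 65), (E >= 65)]).astype(float) * 1e4

def expected_counts(A, S, tau, dE):
    """Expected photon counts per bin for material line integrals A."""
    attenuation = np.exp(-A @ tau)           # shape (n_E,)
    return (S * attenuation).sum(axis=1) * dE

A = np.array([1.0, 1.0])                     # basis material line integrals
lam = expected_counts(A, S, tau, dE)         # one value per energy bin
```

The same vectorized evaluation extends directly to a full sinogram by broadcasting over detector pixels and view angles.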
Finally, the measured data is the vector y := [y_1, …, y_B], where for each j we assume that

    y_j ~ Poisson(λ_j(A)).    (4)

Hence, the (non-linear) inverse problem is to map the photon counts y to the material line integrals A := [A_1, …, A_M]. The most common approach to this problem is maximum likelihood.10–12 Setting up the objective as the log-likelihood and simplifying yields

    Â = argmin_A Σ_{j=1}^{B} (λ_j(A) − y_j log λ_j(A)).    (5)

This is subsequently solved using some iterative algorithm, e.g., the logarithmic barrier method.13

2.1.2 Data generation

After generating numerical basis material phantoms (soft tissue, bone, and iodine) by segmenting CT images from the KiTS19 dataset,9 photon-counting imaging was simulated using the fanbeam function in Matlab and a spectral response model of a photon-counting silicon detector14 with 0.5 × 0.5 mm² pixels. The simulation was performed at 120 kVp and 200 mAs with 1579 detector pixels and 1600 view angles. After simulating Poisson noise, the maximum likelihood method was used to decompose the simulated sinograms into bone and soft tissue basis sinograms, which were then reconstructed on a 583 × 583 pixel grid. To avoid streak artifacts due to photon starvation, a logarithmic barrier function was used to penalize large negative basis projection values. To simulate the effect of threshold variations, a random threshold shift (σ = 0.5 keV) was applied independently to each of the eight thresholds of each detector pixel, and two material decompositions were performed: one with the actual bin thresholds used in the simulation, including the random shift, and one with the nominal bin thresholds. The latter configuration yields images with ring artifacts.

2.2 Deep learning

2.2.1 Problem statement

We propose an image processing technique for ring artifact correction based on deep neural networks.
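The maximum likelihood decomposition of Eq. (5) can be sketched for a single detector reading as follows. The toy spectrum and basis functions are placeholders, and a generic Nelder-Mead solver stands in for the logarithmic barrier method used in this work.

```python
import numpy as np
from scipy.optimize import minimize

# Maximum likelihood estimation for one detector reading (Eq. 5):
# minimize sum_j lambda_j(A) - y_j * log(lambda_j(A)) over line integrals A.

rng = np.random.default_rng(0)
E = np.linspace(20.0, 120.0, 101)            # energy grid [keV]
dE = E[1] - E[0]
# Photoelectric-like (~1/E^3) and Compton-like (~constant) basis functions.
tau = np.stack([1e5 / E**3, 0.2 * np.ones_like(E)])
# Two rectangular energy bins with flat effective sensitivity.
S = np.stack([(E < 65), (E >= 65)]).astype(float) * 1e4

def lam(A):
    """Expected counts per bin (polychromatic Beer-Lambert, Eq. 3)."""
    return (S * np.exp(-A @ tau)).sum(axis=1) * dE

A_true = np.array([1.0, 1.0])
y = rng.poisson(lam(A_true)).astype(float)   # simulated noisy bin counts

def neg_log_likelihood(A):
    l = np.maximum(lam(A), 1e-12)            # guard the logarithm
    return np.sum(l - y * np.log(l))

res = minimize(neg_log_likelihood, x0=np.zeros(2), method="Nelder-Mead")
A_hat = res.x                                # estimated line integrals
```

With realistic count levels the estimate lands close to A_true; in practice the objective is minimized per sinogram pixel with the barrier term added to discourage large negative basis projections.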
More formally, let x ∈ ℝ^{M×H×W} denote the streak corrupted basis sinograms and y ∈ ℝ^{M×H×W} their streak artifact free counterparts, where M, H, and W are the number of basis materials, view angles, and detector pixels, respectively. Then our objective is to learn the map

    f_θ : x ↦ y.    (6)

We let f_θ be a neural network and learn the map (6) by learning the parameters θ.

2.2.2 Network architecture

UNet is a widely utilized architecture for a range of different tasks in biomedical imaging. Its defining feature is the encoder-decoder structure. We use a version of the original UNet15 shown in Fig. 1.

2.2.3 Loss functions

Mean square error (MSE) is perhaps the most commonly used loss function for applications of deep learning in biomedical imaging:

    L_MSE(θ) = (1 / MHW) ‖f_θ(x) − y‖²,    (7)

Using the MSE loss encourages the output to match the target pixel-by-pixel. This low-level per-pixel comparison is well known to produce output that is overly smooth and lacks the fine details that affect perceptual quality.16,17 For several image transformation tasks, it has proved useful to instead employ a perceptual loss function which, instead of comparing pixel-by-pixel, compares high-level feature representations of the output and target. These feature representations are extracted from a pretrained convolutional neural network. We follow Ref. 16 and use VGG1618 pretrained on ImageNet19 as the feature extractor, or loss network. Let ϕ_j denote the j-th layer of VGG16; then our perceptual loss is defined as

    L_ϕ,j(θ) = (1 / (C_j H_j W_j)) ‖ϕ_j(f_θ(x)) − ϕ_j(y)‖²,    (8)

where C_j, H_j, and W_j are the number of channels, height, and width of the feature map at layer j. We set j = 9, which corresponds to "relu2_2" in Ref. 16.

3. TRAINING DETAILS

From each of the 1600 × 1579 basis sinograms we extract 20 patches of size 256 × 256. A total of 630 samples, yielding 12600 patches, are split 70/30 into a training and a validation set. The network is trained using Adam20 with β₁ = 0.5, β₂ = 0.9, and learning rate γ = 1 × 10⁻⁴ for 100 epochs with a batch size of 16 on one NVIDIA GeForce RTX 3070 Laptop GPU. We standardize the input by dividing by the channel-wise standard deviation.
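The patch sampling and normalization described above can be sketched in NumPy. The random sinogram, patch count, and function names are illustrative; only the shapes and the divide-by-channel-std step follow the text.

```python
import numpy as np

# Sample random 256 x 256 patches from an (M, H, W) basis sinogram stack
# and standardize by dividing by the channel-wise standard deviation.

rng = np.random.default_rng(42)

def extract_patches(sino, n_patches=20, size=256):
    """Sample n_patches random size x size patches from a (M, H, W) array."""
    M, H, W = sino.shape
    patches = np.empty((n_patches, M, size, size), dtype=sino.dtype)
    for k in range(n_patches):
        i = rng.integers(0, H - size + 1)    # random view-angle offset
        j = rng.integers(0, W - size + 1)    # random detector-pixel offset
        patches[k] = sino[:, i:i + size, j:j + size]
    return patches

def standardize(x):
    """Divide by the channel-wise standard deviation (no mean subtraction)."""
    std = x.std(axis=(0, 2, 3), keepdims=True)
    return x / std

sino = rng.normal(size=(2, 1600, 1579))      # stand-in for two basis sinograms
patches = standardize(extract_patches(sino))
```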
We can obtain ring corrupted data with a range of artifact magnitudes by taking a linear combination of streak corrupted and streak free basis sinograms. In this work, we are mainly concerned with the case when the rings are barely perceptible. Let w denote the weight given to the ring corrupted basis sinogram and (1 − w) the weight given to its ring free counterpart. We found that w = 0.4 produces a realistic artifact level and w = 1 a suitable level to train on.

4. RESULTS

4.1 Qualitative results

Qualitative results are shown in Figs. 2 and 3. First, in Fig. 2, we have the results in the sinogram domain. Here, a pair of streak corrupted basis sinograms is passed through the network to produce the corresponding predicted pair. Note that despite being trained on 256 × 256 patches, the network generalizes sufficiently well to handle the entire 1600 × 1579 basis sinograms. The network does a fairly good job of removing the streaks. We subsequently reconstruct basis images from these sinograms and form virtual monoenergetic images at 40, 70, and 100 keV, displayed in Fig. 3. Streak correction in the sinogram domain translates well into ring correction in the image domain. However, some residual rings are still visible. Note that, somewhat surprisingly, there is no significant difference in performance between the network trained using the MSE loss and the network trained using the perceptual loss.

4.2 Quantitative results

Quantitative results are available in Table 1. We employ the standard metrics used in this literature, namely the structural similarity index measure (SSIM)21 and peak signal-to-noise ratio (PSNR). However, we appreciate that these are not necessarily great metrics of perceptual quality* and instead stress our qualitative results. Note that, surprisingly, the network trained with the perceptual loss achieves a higher PSNR than the network trained with the MSE loss.
However, this difference is small enough to reasonably be attributed to stochastic variation in the optimization procedure. We also investigate the resolution by adding a central circular insert in the KiTS19 phantoms and retrieving the edge spread function as an average over radial profiles in the region of interest (ROI). We then fit a Gaussian error function and estimate the resolution as its standard deviation. Both networks produce a slight decrease in resolution.

Table 1. Quantitative results
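The resolution estimate described above can be sketched as follows: fit a Gaussian error function to a radially averaged edge spread function (ESF) and read off the resolution as the fitted standard deviation. The edge profile here is synthetic, generated from the same model being fitted.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

def esf_model(r, a, b, r0, sigma):
    """Gaussian-blurred step edge: offset + scaled error function."""
    return a + b * erf((r - r0) / (np.sqrt(2.0) * sigma))

rng = np.random.default_rng(1)
r = np.linspace(-5.0, 5.0, 201)                 # radial distance from edge
sigma_true = 0.8                                # ground-truth edge blur
edge = esf_model(r, 0.5, 0.5, 0.0, sigma_true)  # ideal radially averaged ESF
edge += rng.normal(scale=0.01, size=r.size)     # measurement noise

# Least-squares fit; the estimated resolution is the fitted sigma.
popt, _ = curve_fit(esf_model, r, edge, p0=[0.5, 0.5, 0.0, 1.0])
sigma_est = abs(popt[3])
```

Comparing sigma_est between input, target, and network output quantifies any resolution loss introduced by the correction.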
5. CONCLUSION

Detector inhomogeneity, a common issue in photon-counting spectral CT, results in streak artifacts in the sinogram domain and ring artifacts in the image domain. In this work, we propose a deep learning image processing technique for ring artifact correction in the sinogram domain. Artifact corrupted data is generated by solving the material decomposition problem with a correctly and an incorrectly calibrated forward model. We trained a deep neural network to remove the streaks in the basis sinograms, which are subsequently reconstructed to produce ring corrected basis images and virtual monoenergetic images. Instead of training a network to produce output that matches the target pixel-by-pixel, we use a perceptual loss function that encourages the feature representation of the output to be similar to that of the target. Unexpectedly, we found that the network trained using the standard MSE loss performs essentially on par with the network trained using the perceptual loss. Future research will address the slight degradation in resolution caused by the networks, investigate why the networks perform so similarly, and further develop this method on a larger and more diverse dataset.

ACKNOWLEDGEMENTS

This study was financially supported by MedTechLabs and the Göran Gustafsson foundation. Mats Persson and Dennis Hein disclose research collaboration with GE Healthcare. Alma Eguizabal discloses consultancy with GE Healthcare.

REFERENCES

1. Roessl, E. and Proksa, R., "K-edge imaging in x-ray computed tomography using multi-bin photon counting detectors," Physics in Medicine and Biology 52, 4679–4696 (2007). https://doi.org/10.1088/0031-9155/52/15/020
2. Willemink, M. J., Persson, M., Pourmorteza, A., Pelc, N. J., and Fleischmann, D., "Photon-counting CT: Technical principles and clinical prospects," Radiology 289(2), 293–312 (2018). https://doi.org/10.1148/radiol.2018172656
3. Danielsson, M., Persson, M., and Sjölin, M., "Photon-counting x-ray detectors for CT," Physics in Medicine & Biology 66, 03TR01 (2021). https://doi.org/10.1088/1361-6560/abc5a5
4. Fang, W., Li, L., and Chen, Z., "Removing ring artefacts for photon-counting detectors using neural networks in different domains," IEEE Access 8, 42447–42457 (2020). https://doi.org/10.1109/Access.6287639
5. Yang, Q., Yan, P., Zhang, Y., Yu, H., Shi, Y., Mou, X., Kalra, M. K., Zhang, Y., Sun, L., and Wang, G., "Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss," IEEE Transactions on Medical Imaging 37(6), 1348–1357 (2018). https://doi.org/10.1109/TMI.2018.2827462
6. Kim, B., Han, M., Shim, H., and Baek, J., "A performance comparison of convolutional neural network-based image denoising methods: The effect of loss functions on low-dose CT images," Medical Physics 46(9), 3906–3923 (2019). https://doi.org/10.1002/mp.v46.9
7. Wang, Z., Li, J., and Enoh, M., "Removing ring artifacts in CBCT images via generative adversarial networks with unidirectional relative total variation loss," Neural Computing and Applications 31(9), 5147–5158 (2019). https://doi.org/10.1007/s00521-018-04007-6
8. Nauwynck, M., Bazrafkan, S., Heteren, A. V., Beenhouwer, J. D., and Sijbers, J., "Ring artifact reduction in sinogram space using deep learning," in 6th International Conference on Image Formation in X-Ray Computed Tomography (2020).
9. Eguizabal, A., Persson, M. U., and Grönberg, F., "A deep learning post-processing to enhance the maximum likelihood estimate of three material decomposition in photon counting spectral CT," in Medical Imaging 2021: Physics of Medical Imaging, Proc. SPIE 11595, 1080–1089 (2021).
10. Grönberg, F., Lundberg, J., Sjölin, M., Persson, M., Bujila, R., Bornefalk, H., Almqvist, H., Holmin, S., and Danielsson, M., "Feasibility of unconstrained three-material decomposition: imaging an excised human heart using a prototype silicon photon-counting CT detector," European Radiology 30(11), 5904–5912 (2020). https://doi.org/10.1007/s00330-020-07017-y
11. Alvarez, R. E., "Estimator for photon counting energy selective x-ray imaging with multibin pulse height analysis," Medical Physics 38(5), 2324–2334 (2011). https://doi.org/10.1118/1.3570658
12. Ducros, N., Abascal, J. F. P.-J., Sixou, B., Rit, S., and Peyrin, F., "Regularization of nonlinear decomposition of spectral x-ray projection images," Medical Physics 44(9), e174–e187 (2017). https://doi.org/10.1002/mp.12283
13. Boyd, S. and Vandenberghe, L., Convex Optimization, Cambridge University Press (2004). https://doi.org/10.1017/CBO9780511804441
14. Persson, M., Wang, A., and Pelc, N. J., "Detective quantum efficiency of photon-counting CdTe and Si detectors for computed tomography: a simulation study," Journal of Medical Imaging 7(4), 1–28 (2020). https://doi.org/10.1117/1.JMI.7.4.043501
15. Ronneberger, O., Fischer, P., and Brox, T., "U-net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Lecture Notes in Computer Science, 234–241, Springer International Publishing, Cham (2015).
16. Johnson, J., Alahi, A., and Fei-Fei, L., "Perceptual losses for real-time style transfer and super-resolution," in European Conference on Computer Vision (2016). https://doi.org/10.1007/978-3-319-46475-6
17. Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., and Shi, W., "Photo-realistic single image super-resolution using a generative adversarial network," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 105–114 (2017).
18. Simonyan, K. and Zisserman, A., "Very deep convolutional networks for large-scale image recognition," (2015).
19. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A. C., and Fei-Fei, L., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
20. Kingma, D. P. and Ba, J., "Adam: A method for stochastic optimization," in 3rd International Conference on Learning Representations, ICLR, San Diego, CA, USA (2015).
21. Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E., "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing 13(4), 600–612 (2004).