1. Introduction

1.1. Optical Coherence Tomography

Optical coherence tomography (OCT) [1] is a recent imaging method that produces high-resolution, cross-sectional images of tissues and materials, similar to ultrasound. OCT acquires images by measuring backscattered light. Compared to traditional imaging approaches such as computed tomography (CT) and magnetic resonance imaging (MRI), OCT has higher resolution, so it can reveal the microstructure of tissues and materials. Images from OCT systems typically have a resolution of . Over the past , OCT has been successfully used in disease diagnosis, biomedical research, material evaluation, and many other domains.

1.2. Automated Analysis of OCT Images

In practical applications, the number of OCT images obtained from the imaging device is usually very large. In addition, surgeons have so far had limited experience in analyzing these images, so methods are needed to process them automatically. Several efforts have been made in this area. Fernandez et al. [2] employed nonlinear complex diffusion and coherence-enhancing diffusion to automatically detect retinal layer structures. Baroni et al. [3] tested texture classification of retinal layers in OCT images. In Ref. 4, Koozekanani et al. devised an approach for retinal thickness measurements from OCT images using a Markov boundary model. Zysk and Boppart [5] tested the automated diagnosis of breast tumor tissue in OCT images. In Ref. 6, Bazant-Hegemark and Stone proposed a near-real-time method for OCT image classification using principal component analysis (PCA) and linear discriminant analysis (LDA). They calculate row statistics, such as the mean and standard deviation, for preprocessed images and use these values as feature vectors; PCA and LDA are then used for classification.

In most cases, image classification is the most important task. In the existing methods known to us, this is done by texture analysis, using, for example, the co-occurrence matrix or the Fourier transform.
Different kinds of machine learning methods, such as linear discriminant analysis and artificial neural networks, are used to classify these images. As OCT images are usually structure-poor, texture analysis is a suitable choice. However, all these methods extract only global features from images. Global features cannot describe nonhomogeneous images effectively: local regions of such images show different kinds of patterns, so they can no longer be processed together and need to be analyzed separately.

Over the past , many methods based on analysis of local regions in images have been developed [7–10] in the computer vision domain. In these methods, images are represented by local regions (subimages). Compared to traditional approaches, these methods are robust to background clutter and partial occurrence. When OCT images are nonhomogeneous, that is, when they have different kinds of local regions, the local region–based approach is a better choice.

In this paper, we propose an effective and accurate method for OCT image classification based on local region features and earth mover’s distance (EMD). We divide each image into several subimages (local regions), extract features from these subimages, and combine them to form signatures [11–13]. Our method is thus local region–based; compared to the traditional global approach, which computes features from the entire image, it is robust and can effectively handle the partial abnormal problem, in which only part of an image is abnormal. For classification, we implemented both the k nearest neighbor (KNN) classifier and the support vector machine (SVM) classifier with the EMD kernel. We evaluated our method on an OCT image set that contains normal skin OCT images and nevus flammeus images. In experiments, we compared our methods with a baseline approach and achieved much higher performance.
We believe that our approach can be applied to other types of OCT images, and it is especially suitable for nonhomogeneous images.

The paper is organized as follows. In Sec. 2, we describe our approach in detail, including preprocessing, the PCA+SVM approach (baseline method), and our EMD+KNN and EMD+SVM schemes. Section 3 gives experimental results and discussion. Conclusions are given in Sec. 4.

2. Methods

In order to demonstrate the effectiveness of our method, we compared it with the approach proposed in Ref. 6. Instead of using LDA, we chose a support vector machine (SVM), as it often produces state-of-the-art results. For both methods, the preprocessing procedure is identical and is similar to the one proposed in Ref. 6. In Sec. 2.1, we describe the preprocessing method. Section 2.2 describes the baseline method, that is, the PCA+SVM approach. Our new method is introduced in Sec. 2.3.

2.1. Preprocessing

For preprocessing, we employed a method similar to Ref. 6, which contains three main steps.

The first step is to remove background noise in images. We calculate the mean and standard deviation of the top 20 rows for each image, and then is used as intensity threshold. In experiments, we tested different threshold values, from to , and found that is the best choice. All pixels whose intensity is less than this threshold are considered as background.

After background removal, we conducted surface recognition and normalization identical to Ref. 6. We also performed surface smoothing in order to remove noisy points in the surfaces.

Figure 1 shows the preprocessing results. Figure 1a is the original image, and Fig. 1b is the preprocessed image. We can see that background noise was removed and all A-scans were aligned according to the surface. To show the results of surface normalization more clearly, we shift the surface point of each column to row 10, so that the “bright bar” is not located at the very top of the images.
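The background-removal step can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes the threshold has the form mean + `n_sigma` × standard deviation, computed over the top 20 rows, where `n_sigma` is a hypothetical parameter (the multiplier used in the experiments is not recoverable from the text). The function name and the list-of-rows image representation are also assumptions.

```python
from statistics import mean, pstdev


def remove_background(image, n_rows=20, n_sigma=2.0):
    """Zero out pixels below a threshold estimated from the top rows.

    image:   list of pixel rows, each a list of gray levels.
    n_sigma: hypothetical multiplier of the standard deviation (assumed form
             of the threshold: mean + n_sigma * std of the top n_rows rows).
    """
    top = [px for row in image[:n_rows] for px in row]
    threshold = mean(top) + n_sigma * pstdev(top)
    # Pixels below the threshold are treated as background and set to 0.
    cleaned = [[px if px >= threshold else 0 for px in row] for row in image]
    return cleaned, threshold


# Toy image: 20 noisy background rows followed by 2 rows containing tissue.
image = [[10, 12, 8, 10] for _ in range(20)]
image += [[200, 180, 9, 220], [150, 11, 160, 170]]
cleaned, threshold = remove_background(image)
```

In the toy image the background rows become all zeros, while the bright pixels of the last two rows are kept.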
2.2. Baseline Method: PCA and Support Vector Machine

In the baseline method, we calculate the mean value and standard deviation of each row for preprocessed images. These values are combined to form a feature vector. PCA is then performed on these feature vectors.

SVM often produces state-of-the-art results in high-dimensional problems [14–16], so we chose it for our classification task instead of LDA. SVM finds a hyperplane that separates two-class data with a maximal margin. When the data set is not linearly separable, SVM uses two strategies: soft margin classification and kernel mapping. Four kinds of kernel functions are commonly used: the linear kernel, the polynomial kernel, the radial basis function (RBF), and the sigmoid kernel. In experiments, we chose the RBF kernel:

$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2\right), \quad \gamma > 0.$$

In our implementation, we used the source code from Ref. 15, which is a widely used implementation of the various SVM algorithms.

2.3. Our Methods: EMD+KNN and EMD+SVM Schemes

The earth mover’s distance [11–13] is a distance between two distributions (signatures) that reflects the minimal amount of work that must be performed to transform one distribution into the other by moving “distribution mass” around. It is a special case of the transportation problem from linear optimization, for which efficient algorithms are available. Signatures are defined as follows:

$$P = \{(\mathbf{p}_1, w_{p_1}), \ldots, (\mathbf{p}_m, w_{p_m})\},$$

where $\mathbf{p}_i$ are $d$-dimensional vectors, and scalars $w_{p_i}$ are weights for each vector. When used to compare distributions that have the same overall mass, the EMD is a true metric and has easy-to-compute lower bounds. If two signatures $P$ and $Q$ are

$$P = \{(\mathbf{p}_1, w_{p_1}), \ldots, (\mathbf{p}_m, w_{p_m})\}, \qquad Q = \{(\mathbf{q}_1, w_{q_1}), \ldots, (\mathbf{q}_n, w_{q_n})\},$$

where $\mathbf{p}_i$ and $\mathbf{q}_j$ are $d$-dimensional vectors, and $w_{p_i}$ and $w_{q_j}$ are weights, then the EMD between $P$ and $Q$ is defined as:

$$\mathrm{EMD}(P, Q) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} d_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}},$$

where scalars $f_{ij}$ are flow values, and $d_{ij}$ is the ground distance, that is, the distance between vectors $\mathbf{p}_i$ and $\mathbf{q}_j$. Usually, this is a Euclidean distance; in our approach, we also chose Euclidean distance as ground distance. Scalars $f_{ij}$ are found to minimize the following function:

$$\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} d_{ij},$$

subject to the transportation constraints of Ref. 12: $f_{ij} \ge 0$, $\sum_{j=1}^{n} f_{ij} \le w_{p_i}$, $\sum_{i=1}^{m} f_{ij} \le w_{q_j}$, and $\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} = \min\left(\sum_{i=1}^{m} w_{p_i}, \sum_{j=1}^{n} w_{q_j}\right)$.

In Ref. 13, Rubner et al. used EMD for image and texture retrieval; they represent the image as signatures. Signatures are calculated from the color distribution of images or the output of a Gabor filter applied to texture images. Then, EMD is computed for each image pair. This distance can be used for image and texture retrieval.

In our approach, we do not use color distribution or a Gabor filter; instead, we choose local region features to form signatures. We divide each OCT image into a number of local regions (subimages), and features are extracted from these subimages and combined to form signatures. We calculate the mean of each row in a subimage, and these values are combined to form the feature vector of the subimage. Then, the feature vectors of all subimages are combined to form a matrix, which is the matrix of the signature. Additionally, each feature vector has a weight, and all of these weights form the weight vector of the signature. For classification, we tested both KNN and SVM algorithms. The main framework for our methods contains five steps.
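The signature construction and the distance between two signatures can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation. It assumes that the subimages are side-by-side column strips and that all weights equal 1 with the same number of regions in both signatures. In that special case the optimal flow is a one-to-one matching of regions, so the sketch solves a small assignment problem instead of the general transportation problem. The function names are hypothetical.

```python
from functools import lru_cache
from math import dist


def signature(image, n_sub=10, n_rows=80):
    """Return one feature vector (the row means) per vertical strip.

    image: list of pixel rows. Splitting into side-by-side column strips is
    an assumption; leftover columns on the right are ignored.
    """
    rows = image[:n_rows]
    step = len(rows[0]) // n_sub
    return [
        [sum(row[s * step:(s + 1) * step]) / step for row in rows]
        for s in range(n_sub)
    ]


def emd_equal_weights(sig_p, sig_q):
    """EMD between two equal-size signatures whose weights are all 1."""
    n = len(sig_p)
    if n != len(sig_q):
        raise ValueError("this sketch only covers signatures of equal size")
    # Euclidean ground distance between every pair of feature vectors.
    cost = [[dist(p, q) for q in sig_q] for p in sig_p]

    @lru_cache(maxsize=None)
    def best(i, used):
        # Cheapest matching of regions i..n-1 of sig_p to the regions of
        # sig_q not yet marked in the bitmask `used`.
        if i == n:
            return 0.0
        return min(
            cost[i][j] + best(i + 1, used | (1 << j))
            for j in range(n)
            if not used & (1 << j)
        )

    return best(0, 0) / n  # total work divided by total flow (n units)


img_a = [[0, 0, 4, 4], [0, 0, 4, 4]]  # bright region on the right
img_b = [[4, 4, 0, 0], [4, 4, 0, 0]]  # same region, on the left
img_c = [[0, 0, 0, 0], [0, 0, 0, 0]]  # no bright region

sig_a, sig_b, sig_c = (
    signature(im, n_sub=2, n_rows=2) for im in (img_a, img_b, img_c)
)
d_ab = emd_equal_weights(sig_a, sig_b)
d_ac = emd_equal_weights(sig_a, sig_c)
```

In this toy example `d_ab` is zero because the two images contain the same regions in a different order, whereas `d_ac` is positive because one region of `img_a` has no counterpart in `img_c`. Signatures with unequal weights or sizes would need a general transportation solver.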
The framework of our method is shown in Fig. 2. In our approach, we represent images with signatures. The effective region of OCT images contains only the top 80 rows in surface-normalized images. For each preprocessed image, we divide its top 80 rows into 10 subimages and extract the features of these subimages (the mean of each row); all these feature vectors are combined to form the signature. The weights are set to 1 for each vector, as all local regions are equally important. Once we have calculated the EMD between each training image and testing sample, we use KNN and SVM algorithms to classify the testing images. For the EMD+SVM scheme, we must calculate the EMD kernel before learning and testing.

3. Experimental Results and Discussion

3.1. Experimental Data

We used an image set to evaluate our approach. This data set contains normal skin OCT images and nevus flammeus images; the total number of images is 100, of which 50 are normal and the remaining 50 are abnormal. These images were taken from 14 patients; the size of all images is , with 256 gray levels. The practical axial resolution of our OCT imaging system is about , and the transverse resolution is about . The actual SNR is about , and the penetration depth into skin is more than .

Figure 3 shows two images from our image set: Fig. 3a is normal skin, and Fig. 3b is nevus flammeus. In normal images, the layer structure is obvious, the epidermal layer is smooth and continuous, and the dermis is uniform. In nevus flammeus images, by contrast, the layer structure is destroyed, the epidermal layer is discontinuous, and the dermis is not uniform. So, OCT can effectively discriminate normal skin and nevus flammeus. As we can see from Fig. 3, nevus flammeus images usually have local abnormal regions; they do not show a globally uniform pattern.

3.2. Experimental Results of Baseline Method

Our baseline method is quite similar to the approach in Ref. 6; the main difference is that we choose SVM instead of LDA.
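The row-statistics feature vector used by the baseline can be sketched in Python as follows. This is an illustration only: the ordering of the vector (all row means, then all row standard deviations) and the use of the population standard deviation are assumptions, the function name is hypothetical, and the PCA and SVM stages are omitted.

```python
from statistics import mean, pstdev


def row_statistics_features(image):
    """Concatenate the per-row means and per-row standard deviations.

    image: list of pixel rows from a preprocessed image.
    """
    means = [mean(row) for row in image]
    stds = [pstdev(row) for row in image]
    return means + stds


toy_image = [[1, 3], [2, 2], [0, 4]]
features = row_statistics_features(toy_image)
```

For an image with R rows the vector has 2R entries, so PCA is a natural next step to reduce its dimension before classification.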
In experiments, we tested four different numbers of principal components (pc): 20, 50, 80, and 100. The classification accuracy is shown in Fig. 4. We can see that the number of principal components is not crucial for classification performance; this phenomenon was also reported in Ref. 6. When the pc number is 20, the accuracy is 0.86; as the pc number increases to 100, the accuracy increases only to 0.89. The value of the penalty parameter C for SVM is 8 in all experiments. All results were obtained from 5-fold cross-validation.

3.3. Experimental Results of EMD+KNN Method

In our method, we represent images as signatures, which are formed by feature vectors of local regions in each image. Two signatures are shown in Fig. 5: Fig. 5a is normal skin, and Fig. 5b is nevus flammeus. We can see that signatures can effectively discriminate these two kinds of images, as they are formed by local region features.

In experiments, we tested a range of values for parameter k of the KNN algorithm. The classification results are listed in Table 1, where the values of parameter k include 1, 3, 5, 7, 9, 11, 13, and 15. We conducted 2-fold cross-validation five times; in each round, we randomly selected 50 images as training samples and used the rest as testing samples. The bottom row lists the average classification accuracy for each k value. We can see that as k increases, the performance decreases. When k is 1, the accuracy is 0.97, which is much higher than that of the baseline method.

Table 1. Classification accuracy of our EMD+KNN method.
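Given the EMD from a test image to every training image, the KNN decision is a majority vote among the k closest training images. A minimal Python sketch follows; the function name, the labels, and the distances are made up for illustration and are not taken from the experiments.

```python
from collections import Counter


def knn_predict(distances, train_labels, k=1):
    """Majority vote among the k training images with the smallest EMD.

    distances:    EMD from the test image to each training image.
    train_labels: class label of each training image, in the same order.
    With two classes and odd k, as in the values tested here, no tie occurs.
    """
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Made-up distances from one test image to five training images.
distances = [0.9, 0.2, 0.4, 1.5, 0.3]
labels = ["normal", "abnormal", "normal", "normal", "abnormal"]

pred_k1 = knn_predict(distances, labels, k=1)
pred_k3 = knn_predict(distances, labels, k=3)
pred_k5 = knn_predict(distances, labels, k=5)
```

In this toy case the prediction flips from "abnormal" to "normal" at k = 5, because distant training images start to outvote the close ones. This is one way a larger k can lower accuracy.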
The average accuracy is illustrated in Fig. 6, from which we can see the relation between parameter k and classification accuracy. k = 1 is the best choice, and as k increases, the overall accuracy decreases. When k = 1, the KNN algorithm is in fact the nearest neighbor (NN) classifier.

3.4. Experimental Results of EMD+SVM Method

Table 2 shows the classification accuracy of the EMD+SVM scheme. In experiments, we tested different values for the scaling parameter A of the EMD kernel and different values for the penalty parameter C of the SVM. Each row in Table 2 represents different values, and each column represents different values. We can see that EMD+SVM outperforms both the baseline and EMD+KNN methods; when the scaling is properly chosen, the accuracy can reach 0.99, which is a promising result. Classification accuracy with different values is shown in Fig. 7. We can see that as increases from 15 to 60, the accuracy increases as well. But when has reached 55, the improvement is negligible.

Table 2. Classification accuracy of SVM with EMD kernel. A: scaling parameter for EMD kernel; C: penalty parameter for SVM.
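The text does not state the form of the EMD kernel. A common way to turn a distance into a kernel, assumed here, is K = exp(-EMD / A) with scaling parameter A. The Python sketch below builds such a kernel matrix from pairwise EMDs; the function name and the example values are hypothetical.

```python
from math import exp


def emd_kernel(emd_matrix, a):
    """Map pairwise EMDs to kernel values with the assumed form exp(-EMD / a).

    emd_matrix: square list of lists, emd_matrix[i][j] = EMD(image i, image j).
    a:          scaling parameter (A in Table 2); must be positive.
    """
    if a <= 0:
        raise ValueError("scaling parameter must be positive")
    return [[exp(-d / a) for d in row] for row in emd_matrix]


# Made-up pairwise distances between three images.
emd_matrix = [
    [0.0, 2.0, 4.0],
    [2.0, 0.0, 1.0],
    [4.0, 1.0, 0.0],
]
kernel = emd_kernel(emd_matrix, a=2.0)
```

Identical images get a kernel value of 1, and the value decays toward 0 as the EMD grows. A larger A makes this decay slower. The resulting matrix could be passed to an SVM implementation that accepts precomputed kernels, which LIBSVM (Ref. 15) does.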
Compared to the baseline method, our approach achieved much higher performance. Essentially, the former is a global method, while the latter is a local approach. For most nevus flammeus OCT images, the pattern is not uniform; they usually contain local abnormal regions, and global features cannot describe this property. Unlike the global method, the EMD approach represents images as signatures, which are formed by combining the features of several local regions, so it can describe the different regions of an image more completely and accurately. Furthermore, EMD allows for partial matching and is robust to clutter, so our approach could achieve higher classification accuracy than the baseline method. We believe our method can also be used to classify other types of OCT images that do not show uniform, homogeneous patterns.

4. Conclusion

We have proposed a new method for OCT image classification. Our approach is based on calculating the signatures of images and the EMD between image pairs, which can effectively handle nonhomogeneous images. Experimental results demonstrated the effectiveness of our method, which achieved classification accuracy of 0.97 and 0.99 for the EMD+KNN and EMD+SVM schemes, respectively, for appropriate parameter values. Compared to the baseline method, which achieved an accuracy of 0.89, our method has clear advantages, and it is especially suitable for nonhomogeneous images. We believe that our method can also be applied to classification tasks for other types of OCT images.

Acknowledgments

This research was supported by the 863 Project of China under Grant No. 2006AA02Z472 and the National Natural Science Foundation of China under Grant No. 60971006. The authors are thankful for the anonymous reviewer’s helpful comments.

References

1. D. Huang, E. A. Swanson, C. P. Lin, J. S. Schuman, W. G. Stinson, W. Chang, M. R. Hee, T. Flotte, K. Gregory, C. A. Puliafito, and J. G. Fujimoto, “Optical coherence tomography,” Science 254, 1178–1181 (1991). https://doi.org/10.1126/science.1957169
2. D. C. Fernandez, H. M. Salinas, and C. A. Puliafito, “Automated detection of retinal layer structures on optical coherence tomography images,” Opt. Express 13, 10200–10216 (2005). https://doi.org/10.1364/OPEX.13.010200
3. M. Baroni, S. Diciotti, A. Evangelisti, P. Fortunato, and A. La Torre, “Texture classification of retinal-layers in optical coherence tomography,” 847–850 (2007).
4. D. Koozekanani, K. Boyer, and C. Roberts, “Retinal thickness measurements from optical coherence tomography using a Markov boundary model,” IEEE Trans. Med. Imaging 20, 900–916 (2001). https://doi.org/10.1109/42.952728
5. A. M. Zysk and S. A. Boppart, “Computational methods for analysis of human breast tumor tissue in optical coherence tomography images,” J. Biomed. Opt. 11(5), 054015 (2006). https://doi.org/10.1117/1.2358964
6. F. Bazant-Hegemark and N. Stone, “Near real-time classification of optical coherence tomography data using principal components fed linear discriminant analysis,” J. Biomed. Opt. 13(3), 034002 (2008). https://doi.org/10.1117/1.2931079
7. L. Zhu, A. Rao, and A. Zhang, “Theory of keyblock-based image retrieval,” ACM Trans. Inf. Syst. 20(2), 224–257 (2002).
8. S. Lazebnik, C. Schmid, and J. Ponce, “A sparse texture representation using local affine regions,” IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1265–1278 (2005). https://doi.org/10.1109/TPAMI.2005.151
9. S. Agarwal and D. Roth, “Learning a sparse representation for object detection,” 113–130 (2002).
10. G. Csurka, C. Bray, C. Dance, and L. Fan, “Visual categorization with bags of keypoints,” 1–22 (2004).
11. E. Levina and P. Bickel, “The earth mover’s distance is the Mallows distance: some insights from statistics,” 251–256 (2001).
12. Y. Rubner, C. Tomasi, and L. Guibas, “The earth mover’s distance as a metric for image retrieval,” Int. J. Comput. Vis. 40(2), 99–121 (2000). https://doi.org/10.1023/A:1026543900054
13. Y. Rubner, C. Tomasi, and L. J. Guibas, “A metric for distributions with applications to image databases,” 59–66 (1998).
14. V. Vapnik, The Nature of Statistical Learning Theory, Springer, New York (1995).
15. C.-C. Chang and C.-J. Lin, “LIBSVM: a library for support vector machines,” (2001). http://www.csie.ntu.edu.tw/~cjlin/libsvm
16. C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A practical guide to support vector classification,” (2009). www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
17. O. Chapelle, P. Haffner, and V. Vapnik, “Support vector machines for histogram-based image classification,” IEEE Trans. Neural Netw. 10(5), 1055–1064 (1999). https://doi.org/10.1109/72.788646
18. F. Jing, M. Li, H.-J. Zhang, and B. Zhang, “Support vector machines for region-based image retrieval,” II-21-4 (2003).