Gender classification, a two-class problem (male or female), has been the subject of extensive research recently and has gained a lot of attention due to its varied set of applications. The proposed work relies on individual facial features to train a convolutional neural network (CNN) for gender classification. In contrast with previously reported results that assume the facial features are independent, we treat the facial features as correlated by training a single CNN that jointly learns from all of them. In terms of accuracy, our results either outperform, or are on par with, other gender classification techniques applied to three different datasets, namely Specs on Faces, Groups, and Face Recognition Technology (FERET). In terms of model size, the proposed CNN has significantly fewer learnable parameters than techniques reported in recent work, which makes the network less sensitive to over-fitting and easier to train than techniques that use a separate CNN for each facial feature.
Markers such as CD13 and CD133 have been used to identify Cancer Stem Cells (CSC) in various tissue images. It is highly likely that CSC nuclei appear as brown in CD13 stained liver tissue images. We observe that there is a high correlation between the ratio of brown to blue colored nuclei in CD13 images and the ratio between the dark blue to blue colored nuclei in H&E stained liver images. Therefore, we recommend that a pathologist observing many dark blue nuclei in an H&E stained tissue image may also order CD13 staining to estimate the CSC ratio. In this paper, we describe a computer vision method based on a neural network estimating the ratio of dark blue to blue colored nuclei in an H&E stained liver tissue image. The neural network structure is based on a multiplication free operator using only additions and sign operations. Experimental results are presented.
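The abstract above does not define its multiplication-free operator. As a minimal sketch, one operator that uses only additions and sign tests is a ⊕ b = sign(a)·sign(b)·(|a| + |b|); this definition, and the names `mf_op`, `mf_dot` and `mf_neuron`, are assumptions made for illustration and are not taken from the paper.

```python
def mf_op(a: float, b: float) -> float:
    """Assumed multiplication-free 'product': sign(a)*sign(b)*(|a| + |b|).

    Only comparisons, absolute values and one addition are used.
    """
    if a == 0 or b == 0:
        return 0.0
    magnitude = abs(a) + abs(b)
    same_sign = (a > 0) == (b > 0)
    return magnitude if same_sign else -magnitude


def mf_dot(x, w) -> float:
    """Replace the usual dot product by a sum of mf_op terms."""
    return sum(mf_op(xi, wi) for xi, wi in zip(x, w))


def mf_neuron(x, w, bias: float) -> float:
    """A single neuron with ReLU activation built on mf_dot."""
    return max(0.0, mf_dot(x, w) + bias)
```

A layer built this way keeps the sign behaviour of an ordinary product (like signs give a positive term, unlike signs a negative one) while avoiding multiplications in the inner loop.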
The analysis and interpretation of histopathological samples and images is an important discipline in the diagnosis of
various diseases, especially cancer. An important factor in prognosis and treatment with the aim of a precision medicine
is the determination of so-called cancer stem cells (CSC) which are known for their resistance to chemotherapeutic
treatment and involvement in tumor recurrence. Using immunohistochemistry with CSC markers like CD13, CD133 and
others is one way to identify CSC. In our work we aim at identifying CSC presence on ubiquitous Hematoxylin and Eosin
(HE) staining as an inexpensive tool for routine histopathology based on their distinct morphological features.
We present initial results of a new method based on color deconvolution (CD) and convolutional neural networks
(CNN). This method performs favorably (accuracy 0.936) in comparison with a state-of-the-art method based on 1DSIFT
and eigen-analysis feature sets evaluated on the same image database. We also show that accuracy of the CNN is
improved by the CD pre-processing.
Assessment of visual quality plays a crucial role in modeling, implementation, and optimization of image- and video-processing applications. The image quality assessment (IQA) techniques basically extract features from the images to generate objective scores. Feature-based IQA methods generally consist of two complementary phases: (1) feature extraction and (2) feature pooling. For feature extraction in the IQA framework, various algorithms have been used and recently, the two-dimensional (2-D) mel-cepstrum (2-DMC) feature extraction scheme has provided promising results in a feature-based IQA framework. However, the 2-DMC feature extraction scheme completely loses image-phase information that may contain high-frequency characteristics and important structural components of the image. In this work, “2-D complex mel-cepstrum” is proposed for feature extraction in an IQA framework. The method tries to integrate Fourier transform phase information into the 2-DMC, which was shown to be an efficient feature extraction scheme for assessment of image quality. Support vector regression is used for feature pooling that provides mapping between the proposed features and the subjective scores. Experimental results show that the proposed technique obtains promising results for the IQA problem by making use of the image-phase information.
Oğuzhan Oğuz, Cem Emre Akbaş, Maen Mallah, Kasım Taşdemir, Ece Akhan Güzelcan, Christian Muenzenmayer, Thomas Wittenberg, Ayşegül Üner, A. Cetin, Rengül Çetin Atalay
In this article, algorithms for cancer stem cell (CSC) detection in liver cancer tissue images are developed. Conventionally, a pathologist examines cancer cell morphologies under a microscope. Computer-aided diagnosis (CAD) systems aim to help pathologists in this tedious and repetitive work. The first algorithm locates CSCs in CD13 stained liver tissue images. The method also has an online learning algorithm to improve the accuracy of detection. The second family of algorithms classifies cancer tissues stained with H and E, which is clinically routine and more cost effective than the immunohistochemistry (IHC) procedure. The algorithms utilize 1D-SIFT and eigen-analysis based feature sets as descriptors. Normal and cancerous tissues can be classified with 92.1% accuracy in H and E stained images. Classification accuracy of low and high-grade cancerous tissue images is 70.4%. Therefore, this study paves the way for diagnosing cancerous tissue and grading it using H and E stained microscopic tissue images.
In this paper, we present a technique for automatically classifying human carcinoma cell images using textural features. An image dataset containing microscopy biopsy images from different patients for 14 distinct cancer cell line types is studied. The images are captured using an RGB camera attached to an inverted microscopy device. Texture based Gabor features are extracted from multispectral input images. An SVM classifier is used to generate a descriptive model for the purpose of cell line classification. The experimental results show satisfactory performance, and the proposed method is versatile for various microscopy magnification options.
Infrared (IR) cameras play an important role in the measurement and analysis of object signatures. However, scientific IR cameras used for research and military purposes in particular have manual focusing systems, which reduce the sensitivity and reliability of the measurements taken. Camera autofocus algorithms extract various features from the camera images in order to define a measure for determining the most focused camera image instance. In this work, a no-reference image quality measure is modified, and the modified measure is proposed for the autofocus of infrared cameras. Experimental results show that the proposed measure can be used successfully for the autofocus of infrared cameras.
One of the main disadvantages of using commercial broadcasts in a Passive Bistatic Radar (PBR) system is the range resolution. Using multiple broadcast channels to improve the radar performance is offered as a solution to this problem. However, detection performance then suffers from the side-lobes that the matched filter creates when multiple channels are used. In this article, we introduce a deconvolution algorithm to suppress the side-lobes. The two-dimensional matched filter output of a PBR is further analyzed as a deconvolution problem. The deconvolution algorithm is based on making successive projections onto the hyperplanes representing the time delay of a target. The resulting iterative deconvolution algorithm is globally convergent because all constraint sets are closed and convex. Simulation results in an FM based PBR system are presented.
KEYWORDS: Target detection, Fermium, Frequency modulation, Atrial fibrillation, Doppler effect, Radar, Signal to noise ratio, Surveillance, Detection and tracking algorithms, Signal processing
Passive Bistatic Radar (PBR) systems use illuminators of opportunity, such as FM, TV, and DAB broadcasts. The most common illuminators of opportunity used in PBR systems are FM radio stations. Single FM channel based PBR systems do not have high range resolution and may turn out to be noisy. In order to enhance the range resolution of PBR systems, algorithms using several FM channels at the same time are proposed. In standard methods, consecutive FM channels are translated to baseband as is and fed to the matched filter to compute the range-Doppler map. Multichannel FM based PBR systems have better range resolution than single channel systems. However, spurious sidelobe peaks occur as a side effect. In this article, we linearly predict the surveillance signal using the modulated and delayed reference signal components. We vary the modulation frequency and the delay to cover the entire range-Doppler plane. Whenever there is a target at a specific range value and Doppler value, the prediction error is minimized. The cost function of the linear prediction equation has three components. The first term is the real part of the ordinary least squares term, the second term is the imaginary part of the least squares term, and the third component is the l2-norm of the prediction coefficients. Separate minimization of the real and imaginary parts reduces the side lobes and decreases the noise level of the range-Doppler map. The third term enforces the sparse solution on the least squares problem. We experimentally observed that this approach is better than both the standard least squares and other sparse least squares approaches in terms of side lobes. Extensive simulation examples will be presented in the final form of the paper.
Features extracted at salient points are used to construct a region covariance descriptor (RCD) for target tracking. In the classical approach, the RCD is computed by using the features at each pixel location, which increases the computational cost in many cases. This approach is redundant because image statistics do not change significantly between neighboring image pixels. Furthermore, this redundancy may decrease tracking accuracy while tracking large targets because statistics of flat regions dominate the region covariance matrix. In the proposed approach, salient points are extracted via Shi and Tomasi's minimum eigenvalue method over a Hessian matrix, and the RCD features extracted only at these salient points are used in target tracking. Experimental results indicate that the salient point RCD scheme provides comparable or even better tracking results than classical RCD-based, scale-invariant feature transform-based, and speeded-up robust features-based trackers while providing a computationally more efficient structure.
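To make the idea of a covariance descriptor computed only at selected points concrete, the following is a minimal sketch. It assumes the salient points have already been found (the Shi and Tomasi step is not reproduced here) and uses an illustrative feature vector [x, y, intensity, |dI/dx|, |dI/dy|]; the feature choice and the names `pixel_features` and `region_covariance` are assumptions, not the paper's exact design.

```python
def pixel_features(image, points):
    """Feature vector at each (x, y) point of a 2-D list `image`.

    Points must be interior pixels so central differences are defined.
    """
    feats = []
    for x, y in points:
        dx = (image[y][x + 1] - image[y][x - 1]) / 2.0
        dy = (image[y + 1][x] - image[y - 1][x]) / 2.0
        feats.append([float(x), float(y), float(image[y][x]), abs(dx), abs(dy)])
    return feats


def region_covariance(features):
    """Sample covariance matrix (d x d) of n >= 2 feature vectors."""
    n = len(features)
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    return [[c / (n - 1) for c in row] for row in cov]
```

Because the descriptor size depends only on the number of features d, restricting the sum to a handful of salient points reduces the cost of building it without changing its dimensions.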
In this paper, an online adaptive decision fusion framework is developed for image analysis and computer vision applications. In this framework, it is assumed that the compound algorithm consists of several sub-algorithms, each of which yields its own decision as a real number centered around zero, representing the confidence level of that particular sub-algorithm. Decision values are linearly combined with weights that are updated online according to an active fusion method based on performing orthogonal projections onto convex sets describing sub-algorithms. It is assumed that there is an oracle, who is usually a human operator, providing feedback to the decision fusion method. A video-based wildfire detection system is developed to evaluate the performance of the algorithm in handling the problems where data arrives sequentially. In this case, the oracle is the security guard of the forest lookout tower verifying the decision of the combined algorithm. Simulation results are presented.
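The weight update described above can be sketched as an orthogonal projection onto a hyperplane. The sketch assumes that each oracle feedback value y defines the hyperplane {w : w · D = y} for the current decision vector D, and that the weights are projected onto it; the function names and the `step` relaxation parameter are illustrative assumptions.

```python
def fuse(weights, decisions):
    """Linear combination of sub-algorithm decision values."""
    return sum(w * d for w, d in zip(weights, decisions))


def project_update(weights, decisions, oracle, step=1.0):
    """Move the weights toward the hyperplane {w : w . decisions = oracle}.

    With step=1.0 this is the exact orthogonal projection, so the fused
    decision for the same input equals the oracle value afterwards.
    """
    norm_sq = sum(d * d for d in decisions)
    if norm_sq == 0:
        return list(weights)  # no information in an all-zero decision vector
    error = oracle - fuse(weights, decisions)
    return [w + step * error * d / norm_sq for w, d in zip(weights, decisions)]
```

For example, with weights [0.5, 0.5], decisions [1.0, -1.0] and oracle feedback 1.0, the fused decision is 0 before the update and 1.0 after it.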
In this article, we introduce the concept of fractional wavelet transform. Using a two-channel unbalanced lifting
structure it is possible to decompose a given discrete-time signal x[n] sampled with period T into two sub-signals
x1[n] and x2[n] whose average sampling periods are pT and qT, respectively. Fractions p and q are rational
numbers satisfying the condition: 1/p + 1/q = 1. The low-band sub-signal x1[n] comes from [0, π/p] band and the high-band wavelet signal x2[n] comes from (π/p, π] band of the original signal x[n]. Filters used in the lifting structure are designed using the Lagrange interpolation formula. It is straightforward to extend the proposed
fractional wavelet transform to two or higher dimensions in a separable or non separable manner.
In this paper, a method for detection of popcorn kernels infected by a fungus is developed using image processing. The
method is based on two dimensional (2D) mel and Mellin-cepstrum computation from popcorn kernel images. Cepstral
features that were extracted from popcorn images are classified using Support Vector Machines (SVM). Experimental
results show that high recognition rates of up to 93.93% can be achieved for both damaged and healthy popcorn kernels
using 2D mel-cepstrum. The success rate for healthy popcorn kernels was found to be 97.41% and the recognition rate
for damaged kernels was found to be 89.43%.
In this paper, an adaptive color transform for image compression is introduced. In each block of the image, coefficients of the color transform are determined from the previously compressed neighboring blocks using weighted sums of the RGB pixel values, making the transform block-specific. There is no need to transmit or store the transform coefficients because they are estimated from previous blocks. The compression efficiency of the transform is demonstrated using the JPEG image coding scheme. In general, the suggested transformation results in better peak signal-to-noise ratio (PSNR) values for a given compression level.
An image feature extraction method based on the two-dimensional (2-D) mel cepstrum is introduced. The concept of one-dimensional mel cepstrum, which is widely used in speech recognition, is extended to 2-D in this article. The feature matrix resulting from the 2-D mel-cepstral analysis is applied to the support-vector-machine classifier with multi-class support to test the performance of the mel-cepstrum feature matrix. The AR, ORL, and Yale face databases are used in experimental studies, which indicate that recognition rates obtained by the 2-D mel-cepstrum method are superior to the recognition rates obtained using 2-D principal-component analysis and ordinary image-matrix-based face recognition. Experimental results show that 2-D mel-cepstral analysis can also be used in other image feature extraction problems.
Target detection in SAR images using region covariance (RC) and codifference methods is shown to be accurate
despite the high computational cost. The proposed method uses directional filters in order to decrease the search
space. As a result the computational cost of the RC based algorithm significantly decreases. Images in MSTAR
SAR database are first classified into several categories using directional filters (DFs). Target and clutter image
features are extracted using RC and codifference methods in each class. The RC and codifference matrix features
are compared using l1 norm distance metric. Support vector machines which are trained using these matrices
are also used in decision making. Simulation results are presented.
Shadows constitute a problem in many moving object detection and tracking algorithms in video. Usually, moving
shadow regions lead to larger regions for detected objects. Shadow pixels have almost the same chromaticity as the
original background pixels but they only have lower brightness values. Shadow regions usually retain the underlying
texture, surface pattern, and color value. Therefore, a shadow pixel can be represented as a.x where x is the actual
background color vector in 3-D RGB color space and a is a positive real number less than 1. In this paper, a shadow
detection method based on two-dimensional (2-D) cepstrum is proposed.
In this paper, a novel descriptive feature parameter extraction method from synthetic aperture radar (SAR) images is
proposed. The new approach is based on region covariance (RC) method which involves the computation of a covariance
matrix whose entries are used in target detection and classification. In addition the region co-difference matrix is also
introduced. Experimental results of object detection in MSTAR (moving and stationary target recognition) database are
presented. The RC and region co-difference methods deliver high detection accuracy and low false alarm rates. It is also
experimentally observed that these methods produce better results than the commonly used principal component analysis
(PCA) method when they are used with different distance metrics introduced.
An algorithm for human-eye localization in images is presented for faces with frontal pose and upright orientation. A given face region is filtered by a highpass wavelet-transform filter. In this way, edges of the region are highlighted, and a caricature-like representation is obtained. Candidate points for each eye are detected after analyzing horizontal projections and profiles of edge regions in the highpass-filtered image. All the candidate points are then classified using a support vector machine. Locations of each eye are estimated according to the most probable ones among the candidate points. It is experimentally observed that our eye localization method provides promising results for image-processing applications.
A novel method to detect flames in infrared (IR) video is proposed. Image regions containing flames appear as bright regions in IR video. In addition to ordinary motion and brightness clues, the flame flicker process is also detected by using a hidden Markov model (HMM) describing the temporal behavior. IR image frames are also analyzed spatially. Boundaries of flames are represented in wavelet domain and the high frequency nature of the boundaries of fire regions is also used as a clue to model the flame flicker. All of the temporal and spatial clues extracted from the IR video are combined to reach a final decision. False alarms due to ordinary bright moving objects are greatly reduced because of the HMM-based flicker modeling and wavelet domain boundary modeling.
In this paper, a human face detection method in images and video is presented. After determining possible face candidate
regions using color information, each region is filtered by a high-pass filter of a wavelet transform. In this way, edges of
the region are highlighted, and a caricature-like representation of candidate regions is obtained. Horizontal, vertical and
filter-like projections of the region are used as feature signals in dynamic programming (DP) and support vector machine
(SVM) based classifiers. It turns out that the support vector machine based classifier provides better detection rates
compared to dynamic programming in our simulation studies.
KEYWORDS: Wavelets, Signal processing, 3D image processing, Distortion, Computer programming, Data modeling, Wavelet transforms, Reconstruction algorithms, 3D acquisition, Linear filtering
We propose a new Set Partitioning In Hierarchical Trees (SPIHT) based
mesh compression framework. The 3D mesh is first transformed to 2D
images on a regular grid structure. Then, this image-like representation is wavelet transformed and SPIHT is applied on the wavelet domain data. The method is progressive because the
resolution of the reconstructed mesh can be changed by varying the length of the 1D data stream created by SPIHT algorithm. Nearly perfect reconstruction is possible if full length of 1D data is received.
In this paper, a moving object tracking algorithm for infrared image sequences is presented. The tracking algorithm is based on the mean-shift tracking method, which compares the histograms of moving objects in consecutive image frames. In visible-light video, the color histogram of the object is used for tracking. In forward looking infrared image sequences, the histogram is constructed not only from the pixel values but also from a highpass filtered version of the original image. The reason behind the use of highpass filter outputs in histogram construction is to capture the structural nature of the moving object. Simulation examples are presented.
Tracking moving objects in video can be carried out by correlating a template containing object pixels of the current frame. This approach may produce erroneous results under noise. We determine a set of significant pixels on the object by analyzing the wavelet transform of the template and correlate only these pixels with the current frame to determine the position of the object. These significant pixels are easily trackable features of the image and increase the performance of the tracker.
Image registration refers to the problem of spatially aligning two or more images. A challenging problem in this area is the registration of images obtained by different types of sensors. In general such images have different gray level characteristics and commonly used techniques such as those based on area correlations cannot be applied directly. On the other hand, contours representing the region boundaries are preserved in most cases. Therefore, contour based registration techniques are applicable to multimodal sensors. In this paper, various registration techniques based on subband decomposition and projection along x and y directions are introduced. The effect of binarization is investigated. Unknown translation and scaling parameters are computed using cross-correlation methods over the projections. Performance of the algorithms is compared.
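As a rough illustration of estimating translation from projections, the sketch below sums a binarized contour image along the x and y directions and cross-correlates the resulting 1-D profiles with those of a second image. It covers translation only (scaling is omitted), and the function names and the `max_shift` search range are assumptions made for this example.

```python
def projections(image):
    """Return (row_sums, col_sums) of a 2-D list image."""
    rows = [sum(r) for r in image]
    cols = [sum(c) for c in zip(*image)]
    return rows, cols


def best_shift(ref, moved, max_shift):
    """Shift s maximizing the mean-removed cross-correlation
    sum_i ref[i] * moved[i + s] over the overlapping samples."""
    ref_mean = sum(ref) / len(ref)
    mov_mean = sum(moved) / len(moved)
    r = [v - ref_mean for v in ref]
    m = [v - mov_mean for v in moved]

    def score(s):
        return sum(r[i] * m[i + s] for i in range(len(r)) if 0 <= i + s < len(m))

    return max(range(-max_shift, max_shift + 1), key=score)


def estimate_translation(ref_img, moved_img, max_shift):
    """Return (dx, dy) such that moved_img is ref_img shifted by dx columns
    and dy rows, estimated independently from the two projections."""
    ref_rows, ref_cols = projections(ref_img)
    mov_rows, mov_cols = projections(moved_img)
    return best_shift(ref_cols, mov_cols, max_shift), best_shift(ref_rows, mov_rows, max_shift)
```

Working on 1-D projections replaces a 2-D search over all (dx, dy) pairs with two independent 1-D searches.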
In this paper, a small moving object detection method in video sequences is described. In the first step, the camera motion is eliminated using motion compensation. An adaptive subband decomposition structure is then used to analyze the motion compensated image. In the highband subimages moving objects appear as outliers and they are detected using a statistical detection test based on lower order statistics. It turns out that in general, the distribution of the residual error image pixels is almost Gaussian. On the other hand, the distribution of the pixels in the residual image deviates from Gaussianity in the existence of outliers. By detecting the regions containing outliers the boundaries of the moving objects are estimated. Simulation examples are presented.
In this paper, a small moving object detection method in video sequences is described. In the first step, the camera motion is eliminated using motion compensation. An adaptive subband decomposition structure is then used to analyze the motion compensated image. In the low-high and high-low subimages small moving objects appear as outliers and they are detected using a statistical Gaussianity detection test based on higher order statistics. It turns out that in general, the distribution of the residual error image pixels is almost Gaussian. On the other hand, the distribution of the pixels in the residual image deviates from Gaussianity in the existence of outliers. Simulation examples are presented.
Inverse halftoning is the problem of recovering a continuous-tone image from a given halftoned image. In this paper, a new inverse halftoning method which uses a set theoretic formulation is introduced. The new method exploits the prior information at hand, and uses space-domain projections, frequency-domain projections, and space-scale domain projections to obtain a feasible reconstruction of the continuous-tone image. The proposed method is also extended for the inverse halftoning of color error-diffused images.
KEYWORDS: Mammography, Digital filtering, Image filtering, Computer aided diagnosis and therapy, Error analysis, Nonlinear filtering, Picture Archiving and Communication System, Breast cancer, Computing systems, Databases
With increasing use of Picture Archiving and Communication Systems, computer-aided diagnosis methods will be more widely utilized. In this paper, we develop a CAD method for the detection of microcalcification clusters in mammograms, which are an early sign of breast cancer. The method we propose makes use of 2D adaptive filtering and a Gaussianity test recently developed by Ojeda et al. for causal invertible time series. The first step of this test is adaptive linear prediction. It is assumed that the prediction error sequence has a Gaussian distribution as the mammogram images do not contain sharp edges. Since microcalcifications appear as isolated bright spots, the prediction error sequence contains large outliers around microcalcification locations. The second step of the algorithm is the computation of a test statistic from the prediction error values to determine whether the samples are from a Gaussian distribution. The Gaussianity test is applied over small, overlapping square regions. The regions in which the Gaussianity test fails are marked as suspicious regions. Experimental results obtained from a mammogram database are presented.
In this paper, compression of binary digital fingerprint images is considered. High compression ratios for fingerprint images are essential for handling huge numbers of images in databases. In our method, the fingerprint image is first processed by a binary nonlinear subband decomposition filter bank and the resulting subimages are coded using vector quantizers designed for quantizing binary images. It is observed that the discriminating properties of the fingerprint images are preserved at very low bit rates. Simulation results are presented.
Computer-aided diagnosis will be an important feature of the next generation picture archiving and communication systems. In this paper, computer-aided detection of microcalcifications in mammograms using a nonlinear subband decomposition and outlier labeling is examined. The mammogram image is first decomposed into subimages using a nonlinear subband decomposition filter bank. A suitably identified subimage is divided into overlapping square regions in which skewness and kurtosis as measures of the asymmetry and impulsiveness of the distribution are estimated. A region with high positive skewness and kurtosis is marked as a region of interest. Finally, an outlier labeling method is used to find the locations of microcalcifications in these regions. Simulation studies are presented.
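The region-of-interest step above can be sketched as follows: slide a square window over a subimage, estimate skewness and excess kurtosis in each window, and flag windows where both are high. The thresholds, window size, stride and function names are illustrative assumptions; the nonlinear subband decomposition and the final outlier labeling step are not reproduced.

```python
def skew_kurt(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0, 0.0  # constant region: treat as non-impulsive
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def flag_regions(image, size, stride, skew_thr, kurt_thr):
    """Top-left corners (row, col) of overlapping size x size windows whose
    skewness and excess kurtosis both exceed the given thresholds."""
    flagged = []
    for top in range(0, len(image) - size + 1, stride):
        for left in range(0, len(image[0]) - size + 1, stride):
            block = [image[r][c]
                     for r in range(top, top + size)
                     for c in range(left, left + size)]
            skew, kurt = skew_kurt(block)
            if skew > skew_thr and kurt > kurt_thr:
                flagged.append((top, left))
    return flagged
```

A single isolated bright pixel in an otherwise mild texture drives both statistics up sharply, which is why they serve as measures of asymmetry and impulsiveness.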
Image coding using wavelet transform, DCT, and similar transform techniques is well established. On the other hand, these coding methods neither take into account the special characteristics of the images in a database nor are they suitable for fast database search. In this paper, the digital archiving of Ottoman printings is considered. Ottoman documents are printed in Arabic letters. Witten et al. describe a scheme based on finding the characters in binary document images and encoding the positions of the repeated characters. This method efficiently compresses document images and is suitable for database search, but it cannot be applied to Ottoman or Arabic documents as the concept of character is different in Ottoman or Arabic. Typically, one has to deal with compound structures consisting of a group of letters. Therefore, the matching criterion will be according to those compound structures. Furthermore, the text images are gray tone or color images for Ottoman scripts for the reasons that are described in the paper. In our method the compound structure matching is carried out in the wavelet domain, which reduces the search space and increases the compression ratio. In addition to the wavelet transformation, which corresponds to the linear subband decomposition, we also used nonlinear subband decomposition. The filters in the nonlinear subband decomposition have the property of preserving edges in the low resolution subband image.
Multimedia and Picture Archiving and Communication System (PACS) applications require efficient ways of handling images for communication and visualization. In many Visual Information and Management Systems (VIMS), it may be required to get quick responses to queries. Usually, a VIMS database has a huge number of images and may provide lots of images for each query. For example, in a PACS, the VIMS provides 10 to 100 images for a typical query. Only a few of these images may actually be needed. In order to find the useful ones, the user has to preview each image by fully decompressing it. This is neither computationally efficient nor user friendly. In this paper, we propose a scheme which provides a magnifying glass type previewing feature. With this method, multiresolution previewing without decompressing the whole image is possible. Our scheme is based on block transform coding, which is the most widely used technique in image and video coding. In the first step of our scheme, all of the queried images are displayed in the lowest possible resolution (constructed from the DC coefficients of the coded blocks). If the user requests more information for a region of a particular image by specifying its size and place, then that region is hierarchically decompressed and displayed. In this way, large amounts of computation and bandwidth usage are avoided and a good user interface is accomplished. This method changes the ordering strategy of transform coefficients and thus reduces the compression ratio; however, this effect is small.
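The lowest-resolution preview can be illustrated with a short sketch. It assumes an orthonormal N x N 2-D DCT, for which the DC coefficient equals the block sum divided by N (that is, N times the block mean), so one preview pixel per block is recovered by a single division with no inverse transform. The function names are hypothetical, and the hierarchical decompression of a selected region is not shown.

```python
def dc_coefficient(block):
    """DC term of an orthonormal 2-D DCT of an N x N block: sum / N."""
    n = len(block)
    return sum(sum(row) for row in block) / n


def preview_from_dc(dc_grid, n=8):
    """One preview pixel per coded block: the block mean, DC / N."""
    return [[dc / n for dc in row] for row in dc_grid]
```

For an 8 x 8 block of constant value 16 the DC coefficient is 128, and the preview pixel is 16, the block mean.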