This PDF file contains the front matter associated with SPIE Proceedings Volume 12526, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access.* *Shibboleth/Open Athens users: please sign in to access your institution's subscriptions. To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Propelled by the discovery in the late 1990s of the accelerated expansion of the universe, an unknown energy, called dark energy, was hypothesized to be responsible. In February of this year a team of astronomers reported that this dark energy resides within isolated supermassive black holes. In this paper the Space Dual of Einstein's Relativity (SDER), first published in a 2008 SPIE article, is shown to theoretically support this finding. The SDER theory is anchored in the medium duality formed by a vacuum and a black hole, both extreme media for the motion and retention, respectively, of mass-energy. While motion's speeds have 'the speed of light in a vacuum', c = 2.9979 × 10^8 m/s, as an upper bound that led to Einstein's relativity in 1905, retention's paces have 'the pace of dark in a black hole', χ = 6.13 × 10^63 s/m^3, as an upper bound that led to the SDER in 2008 and to its theoretical support for the astronomers' finding that dark energy resides in isolated black holes.
Business-oriented events, which dominate the events industry and have grown strongly in postmodern Western economies and, more recently, in China in particular, are time-limited gatherings with high innovation potential. With increasing digitalization, their emphasis is shifting from knowledge mediation to network building. Managing the contact and interaction of participants, whether company representatives or sponsors, in analog, hybrid, or digital form, primarily aims to drive the exchange of experience and the initiation of projects, while the resulting network formation promotes innovation. Matchmaking is a crucial tool in this regard: AI-based solutions with learning and adaptive algorithms bring entirely new qualities to all forms of business events, such as congresses, conferences, and exhibitions. Although the collection of sufficient high-quality participant data always remains a prerequisite, relevant data analytics and a predictive "match", e.g., in the form of one-to-one meetings, ultimately lead to a more efficient and effective use of such business-oriented events.
Image registration is an important pre-processing step for many image exploitation algorithms such as geo-location, object recognition, vision-aided navigation, and image fusion. The utility and effectiveness of downstream exploitation algorithms depend on reliable image registration. Registration failure can corrupt the processing and performance of downstream algorithms by mis-associating image features. For example, feature mis-association leads to incorrect target geo-coordinates in aerial surveillance applications, or erroneous vision-based measurements in vision-aided navigation. Accurate and dependable registration failure detection mitigates the deleterious effects of erroneous registration. However, in autonomous operation modes, with no human in the loop and no ground-truth knowledge of the registration solution, verifying registration solutions is problematic. In this paper we present a machine learning-based image registration verification system that operates autonomously, without ground truth. We train a machine learning algorithm to identify correct registration solutions, even for difficult multi-modal image registration in which sensor phenomenology differences produce different feature manifestations. The verification approach includes techniques for mitigating false alarms that may arise from feature ambiguity. We present examples of feature ambiguity for correlation-based registration techniques, and describe Radon transform processing, covariance estimation, and fusion techniques for feature ambiguity detection. We present numerical verification performance results from a small pilot study designed to investigate the feasibility of using machine learning for reliable registration verification.
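The correlation-surface ambiguity idea can be illustrated with a simple peak-to-secondary-peak ratio. This is a minimal stand-in sketch only: the paper's Radon-transform and covariance machinery is not reproduced, and all names and numbers below are illustrative.

```python
import numpy as np

def peak_to_secondary(corr, exclude=2):
    """Peak-to-secondary-peak ratio of a correlation surface.

    A high ratio suggests an unambiguous registration peak; a low
    ratio flags the kind of feature ambiguity discussed above.
    """
    corr = corr.copy()
    py, px = np.unravel_index(np.argmax(corr), corr.shape)
    peak = corr[py, px]
    # Mask out the main lobe so the secondary peak can be found
    corr[max(0, py - exclude):py + exclude + 1,
         max(0, px - exclude):px + exclude + 1] = -np.inf
    return peak / corr.max()

surface = np.zeros((21, 21))
surface[10, 10] = 1.0   # dominant registration peak
surface[3, 3] = 0.2     # weak secondary peak
print(peak_to_secondary(surface))  # -> 5.0: one dominant peak, low ambiguity
```

A verification system would thresholded such a score (possibly per modality) before trusting a registration solution.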
Thermal imaging systems often suffer from low contrast, low spatial resolution, and blur under heat-radiation conditions. Currently available image enhancement methods are mostly suited to visible images; for noisy infrared images, existing methods often enhance objects and noise simultaneously, producing very poor results. This paper presents two single-infrared-image enhancement methods: (i) a bi-logarithmic histogram equalization with quasi-symmetric correction, and (ii) a method based on combined luminance and reflection decomposition with image fusion. Computer simulations on the benchmark infrared Kuangxd database show that the proposed algorithms outperform conventional image enhancement methods, including a cutting-edge learning-based method, in terms of subjective and objective evaluations.
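For context, the conventional baseline the paper improves on is classical histogram equalization. The sketch below shows that baseline only; the bi-logarithmic variant with quasi-symmetric correction is not public and is not reproduced here.

```python
import numpy as np

def histogram_equalize(img):
    """Classical histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[np.nonzero(hist)[0][0]]        # CDF at lowest occupied bin
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)  # intensity mapping table
    return lut[img]

# Low-contrast synthetic frame: values confined to [100, 120]
frame = np.random.default_rng(0).integers(100, 121, (64, 64), dtype=np.uint8)
out = histogram_equalize(frame)
print(out.min(), out.max())  # dynamic range stretched to the full [0, 255]
```

Applied naively to infrared imagery, this stretching amplifies noise along with objects, which is exactly the failure mode the proposed methods address.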
K-means is a popular unsupervised ML algorithm for analyzing and recognizing naturally occurring patterns by clustering similar points together. When applied to the color space of an image, it can recognize segments of the image where more meaningful clustering can be applied. Color quantization has been employed for decades to optimize the memory usage of saved images. Typical images are composed of red, green, and blue channels, each represented by a byte in memory; each pixel therefore occupies 24 bits, giving around 16.8 million unique colors. However, the human perceptual system is not sensitive enough to require full use of this color space, so it is beneficial to reduce the number of colors closer to what the eye can distinguish. This results in more efficient memory use while preserving detail and color separation in the image. The key issue is determining how much a picture can be quantized before it degrades to the point that a human can discern the difference. Currently, no algorithm aptly determines where this point occurs or whether each color channel should be treated identically. This research applies K-means color clustering to each color channel of the image separately to optimize compression. Replacing randomly seeded K-means with principal component analysis (PCA)-informed K-means on each color channel further improves performance.
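Per-channel K-means quantization can be sketched as below. This is an illustrative toy only: it uses plain Lloyd's algorithm with random seeding on each channel independently, not the paper's PCA-informed seeding, and all array sizes are made up.

```python
import numpy as np

def kmeans_1d(values, k, iters=20, seed=0):
    """Plain Lloyd's algorithm on a 1-D array (one color channel)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False).astype(np.float64)
    for _ in range(iters):
        # Assign each value to its nearest center, then recompute centers
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return centers[labels]  # each value replaced by its cluster center

# Quantize each channel of a random RGB image to 8 levels independently
rng = np.random.default_rng(1)
img = rng.integers(0, 256, (32, 32, 3), dtype=np.uint8)
quantized = np.stack(
    [kmeans_1d(img[..., c].ravel(), k=8).reshape(32, 32) for c in range(3)],
    axis=-1,
)
print(len(np.unique(quantized[..., 0])))  # at most 8 distinct values per channel
```

Treating channels separately, as the abstract proposes, lets each channel get a palette size matched to its own distribution rather than one shared 3-D palette.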
Computational topology has rapidly progressed into a useful tool for processing and analyzing diverse data from a topological perspective. It converts relationships between data into topological relationships, which can be analyzed using techniques such as homology, cohomology, and homotopy. In this paper we apply techniques from computational topology to path planning for vision-aided navigation of aerial platforms in GPS-denied regions. Navigation in GPS-denied areas requires external aiding to reduce the drift that naturally occurs with unaided Inertial Measurement Units. One aiding mechanism, vision-aided navigation (VAN), generates vision-based measurements by registering imagery acquired from an onboard camera to geo-located reference imagery of the fly-over region. Successful VAN requires distinct, stable, and discriminative image scene content to permit reliable image registration. Previously, we developed a probabilistic path planning approach for VAN whose planner operates on image fiducials that generate reliable and accurate vision measurements. In this paper, we develop a principled approach to refine image fiducial selection using topological techniques. We first motivate and describe the image fiducial selection problem, then describe the application of computational topology and provide numerical examples, including an analysis of the shape of image fiducial match score data and an application of cohomology to path planning.
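The simplest topological summary in this setting is 0-dimensional homology: the number of connected components (Betti-0) of a proximity graph over fiducial locations at a given scale. The sketch below illustrates just that idea with made-up points; the paper's cohomology machinery is well beyond this toy.

```python
def components(points, eps):
    """Betti-0 at scale eps: connected components of the eps-proximity
    graph, counted with union-find."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
            if d <= eps:
                parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(len(points))})

pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]  # two fiducial clusters
print(components(pts, eps=2.0))   # -> 2
print(components(pts, eps=20.0))  # -> 1
```

Tracking how this count changes as eps grows is the entry point to persistence-style analysis of fiducial distributions.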
Iris recognition is a widely used biometric technology with high accuracy and reliability in well-controlled environments. However, recognition accuracy can degrade significantly in non-ideal scenarios, such as off-angle iris images. To address these challenges, deep learning frameworks have been proposed to identify subjects from their off-angle iris images. Traditional CNN-based iris recognition systems train a single deep network on multiple off-angle iris images of the same subject to extract gaze-invariant features, and incoming off-angle images are tested with this single network to classify them into the correct subject class. In another approach, multiple shallow networks are trained, one per gaze angle, each serving as the expert for that angle; when testing an off-angle iris image, we first estimate the gaze angle and feed the probe image to its corresponding network for recognition. In this paper, we analyze the performance of both single-model and multi-model deep learning frameworks for identifying subjects from their off-angle iris images. Specifically, we compare the performance of a single AlexNet with multiple SqueezeNet models. SqueezeNet is a variation of AlexNet that uses 50x fewer parameters and is optimized for devices with limited computational resources; the multi-model approach uses multiple such shallow networks, each an expert for a specific gaze angle. Our experiments are conducted on an off-angle iris dataset of 100 subjects captured at 10-degree intervals from -50 to +50 degrees. The results indicate that angles more distant from the trained angles yield lower model accuracy than angles closer to the trained gaze angle. Our findings suggest that using SqueezeNet, which requires fewer parameters than AlexNet, can enable iris recognition on devices with limited computational resources while maintaining accuracy.
Overall, the results of this study can contribute to the development of more robust iris recognition systems that perform well in non-ideal scenarios.
The EU-funded project Real-Time On-Site Forensic Trace Qualification (RISEN) aims to enable the use of advanced sensors in the field in order to get results in near real-time. The project also aims to visualize the data by innovative means, such as virtual reality (VR). The Swedish National Forensic Centre (NFC) has been developing methods for 3D modeling of crime scenes since 2016 and has conducted several studies on the use of VR for CSI applications. This paper describes the status of and possibilities for VR in CSI training, and how the results from the RISEN project can be utilized within forensic training.
To respond to a disaster in a targeted manner, it is essential to have a comprehensive overview of the situation. Despite the widespread use of social media in disaster management, achieving near-real-time situational awareness from social media during a crisis remains challenging. Every crisis follows a trajectory consisting of countless events, each triggering many reactions, and assessing the situation at any given time requires considerable computing power. Our research addresses this problem with a toolset that enables the user to process micro-blogging data using Open Source Intelligence, event tracking, and machine learning to automate the generation of short situational awareness reports.
Object detection is a common task for defense and intelligence applications in the visible spectrum, and deep learning models achieve state-of-the-art performance for it. In addition, thermal cameras are more robust to real-world computer vision problems such as illumination changes, so they are commonly used in military applications and security surveillance. Recently, significant improvements in object detection have been made by exploiting large amounts of data. However, collecting and labeling a vast number of samples is challenging, especially in the thermal spectrum. Since deep networks are sensitive to domain shift, a model trained on visible-spectrum data may fail to generalize to thermal-spectrum data. Therefore, developing methods to adapt models across domains, e.g., visible-to-thermal, is crucial. Feature-level unsupervised domain adaptation methods train a model that maps both domains into a common feature space without requiring image pairs, reducing the gap between the two domains. Beyond feature-level adaptation, we propose applying pixel-level transformations to the source domain, i.e., visible-spectrum images, to further reduce the domain gap between the visible and thermal spectra. This yields a significant improvement in the performance of domain-adaptive object detection methods. We apply gray-scale conversion, histogram matching, histogram equalization, gamma correction, adaptive gamma correction, and Fourier domain adaptation as pixel-level transformations on visible-spectrum images. We conduct experiments on real-world datasets, using Cityscapes as the visible-spectrum dataset and FLIR ADAS as the thermal-spectrum dataset.
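One of the listed pixel-level transforms, histogram matching, can be sketched via CDF inversion. This is a generic single-channel sketch under assumed stand-in images, not the paper's pipeline.

```python
import numpy as np

def match_histogram(source, reference):
    """Remap source intensities so their histogram matches the reference's."""
    s_vals, s_counts = np.unique(source.ravel(), return_counts=True)
    r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size
    r_cdf = np.cumsum(r_counts) / reference.size
    # For each source intensity, find the reference intensity at the same CDF level
    mapped = np.interp(s_cdf, r_cdf, r_vals)
    return np.interp(source, s_vals, mapped)

rng = np.random.default_rng(0)
visible = rng.integers(0, 256, (64, 64))   # stand-in visible-spectrum image
thermal = rng.integers(50, 200, (64, 64))  # stand-in thermal reference
out = match_histogram(visible, thermal)
print(out.min() >= 50, out.max() <= 199)   # pulled into the thermal range
```

Applied before feature-level adaptation, such a transform moves the source images' intensity statistics toward the target domain without needing paired data.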
For autonomous vehicles to move safely on roads, pedestrian detection systems must detect pedestrians quickly and precisely. Researchers have noted that pedestrian skin detection is a tough challenge, since skin color can vary in appearance due to factors such as weather conditions, sunlight, occlusion, and race. Our proposed methodology uses a radar-camera fusion technique to predict obstacles in any scenario. A convolutional neural network extracts pedestrian features from RGB images and radar data. We also introduce data preparation and feature extraction steps, apply feature mapping to improve detection accuracy, and use clustering to find similarities between features so that details of darker-skinned pedestrians are captured.
In this paper, we explore low-light image enhancement as a preprocessing step to improve the quality of novel view synthesis by Neural Light Fields (NeLF). NeLF is a 3D scene representation method that employs a light field representation, unlike prior methods based on volumetric rendering. One of NeLF's main advantages is its faster rendering speed, as it requires only one network forward pass without ray marching. However, NeLF struggles to model low-illumination scenes because its viewer-centered framework does not consider the interaction between illumination and scenes. To address this issue, we propose 2D low-light image enhancement as a preprocessing solution: our approach uses alpha-rooting via the 2-D DFT to enhance low-light images before they are used by the NeLF model. We demonstrate that this approach leads to significant improvements in the quality of novel view synthesis by NeLF on low-light images, and we consider practical applications in domains such as applied human biomechanics.
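The alpha-rooting step itself is compact: each Fourier coefficient's magnitude |F| is replaced by |F|^alpha (phase preserved), which relatively boosts high-frequency detail for alpha < 1. A minimal sketch, with a synthetic patch standing in for a real low-light frame:

```python
import numpy as np

def alpha_rooting(img, alpha=0.9):
    """Alpha-rooting enhancement via the 2-D DFT."""
    F = np.fft.fft2(img.astype(np.float64))
    mag = np.abs(F)
    # Multiply each coefficient by |F|^(alpha-1); guard zero magnitudes
    rooted = np.where(mag > 0, mag ** (alpha - 1.0), 0.0) * F
    out = np.real(np.fft.ifft2(rooted))
    return np.clip(out, 0, None)  # clamp small negative ringing

rng = np.random.default_rng(0)
dark = rng.integers(0, 40, (32, 32))  # synthetic low-light patch
enhanced = alpha_rooting(dark, alpha=0.9)
print(enhanced.shape)
```

With alpha = 1 the transform is the identity, so alpha acts as a single tuning knob between "no change" and aggressive detail boosting before the frames reach the NeLF model.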
Keratoconus is a progressive eye disease prevalent worldwide, caused by a change in the curvature of the cornea. The unevenly shaped cornea causes blurry and inaccurate vision and may eventually cause blindness. Existing keratoconus detection techniques use either a less accurate keratoscope or a bulky, expensive optical coherence tomography (OCT) topography device, which makes diagnosing this disease inconvenient in remote and resource-scarce settings. In this paper, I propose a novel smartphone-based keratoconus diagnosis technique that uses a smartphone camera to acquire a 2D image of the cornea with a Placido disc reflection, applies image processing techniques including entropy-based edge detection and multiple-circle detection, and finally computes a 3D topography of the eye in an app. The proposed method is accurate, quick, reliable, and simple for clinicians and patients to use in remote settings, and can replace the existing inconvenient methods of keratoconus detection.
Semantic segmentation (SS) is a critical computer vision task that involves labeling each pixel in an image with a corresponding semantic class. It has numerous applications, such as autonomous driving, augmented reality, and video surveillance, which require real-time processing. However, many existing SS algorithms prioritize accuracy over computational efficiency, resulting in complex models with high memory and computational requirements that are difficult to deploy in applications with critical real-time constraints and limited computational resources. This paper proposes a novel CNN architecture incorporating binary neural network techniques to improve efficiency without sacrificing accuracy. The proposed MIX-Decision CNN architecture combines accuracy and efficiency and is optimized for real-time performance on embedded platforms. The paper uses the KITTI dataset to benchmark the proposed architecture on three embedded platforms: Raspberry Pi 3, Raspberry Pi 4, and NVIDIA Jetson Nano 4 GB. The paper's main contributions are two-fold: (i) the novel MIX-Decision CNN architecture offers an efficient and accurate solution for real-time SS tasks using binary neural network techniques, and (ii) the paper demonstrates the feasibility of the proposed architecture by implementing and benchmarking it on various embedded platforms. The results show promising performance for real-world scenarios where real-time processing is crucial.
The growing network of highway video surveillance cameras generates an immense amount of data that is tedious to analyze manually. Automated real-time analysis of such data enables many applications, including traffic monitoring, traffic incident detection, and smart-city planning. More specifically, assessing traffic speed and density is critical in determining dynamic traffic conditions and detecting slowdowns, traffic incidents, and traffic alerts. However, despite several advancements, numerous challenges remain in estimating vehicle speed and traffic density, which are integral parts of intelligent transportation systems (ITS). These challenges include variations in road networks, illumination constraints, weather, structural occlusion, and driver behavior. To address these issues, this paper proposes a novel deep learning-based framework for instant-level vehicle speed and traffic flow density estimation that harnesses the potential of existing large-scale highway surveillance cameras to assist in real-time traffic analysis. This is achieved using the state-of-the-art region-based Siamese MOT network, SiamMOT, which detects and associates object instances for multi-object tracking (MOT), to accurately estimate instant-level vehicle speed in live video feeds. The UA-DETRAC dataset is used to train the speed estimation model. Computer simulations show that the proposed framework a) classifies traffic density into light, medium, or heavy traffic flows, b) is robust to different road networks and illuminations without prior road information, and c) performs well against current state-of-the-art methods on adequate performance metrics.
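Once the MOT network supplies per-frame vehicle centroids, converting track displacement into speed is simple geometry. The sketch below shows only that post-tracking step; the pixel-to-meter scale and frame rate are hypothetical placeholder values, not numbers from the paper.

```python
# Hypothetical calibration constants (would come from a homography and
# the camera spec in a real deployment)
METERS_PER_PIXEL = 0.05
FPS = 25.0

def track_speed_kmh(track, meters_per_pixel=METERS_PER_PIXEL, fps=FPS):
    """Instant-level speed (km/h) from consecutive tracked centroids (pixels)."""
    speeds = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dist_m = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 * meters_per_pixel
        speeds.append(dist_m * fps * 3.6)  # m/frame -> m/s -> km/h
    return speeds

# A vehicle centroid moving 10 px per frame horizontally
track = [(100 + 10 * i, 300) for i in range(5)]
print(track_speed_kmh(track))  # 10 px * 0.05 m/px * 25 fps = 12.5 m/s = 45 km/h
```

Averaging such per-vehicle speeds, together with vehicle counts per region, is what feeds the light/medium/heavy density classification.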
The objective of image compression is to reduce the irrelevance and redundancy of image data so it can be stored or transmitted efficiently, minimizing the number of bits required to represent an image accurately. JPEG can achieve a compression ratio of 10:1 with little perceptible loss in image quality by standard metrics, and it has become the most widely used image compression standard in the world since its release. Traditionally, compression techniques have relied on linear transforms to approximate 2-D signals (images), and the omission of specific constituent vectors has been mostly arbitrary. These techniques can save large amounts of memory while retaining image integrity. Recently, techniques have been developed that use neural networks to approximate these signals. Such networks offer the advantage of decorrelating image data to find a series of vectors that represents an image more compactly than traditional techniques, using gradient descent to approach the minimum number of bits required to represent the image. These architectures are developing rapidly through informed design that draws on fields of recent focus such as computer vision and image analysis. A novel efficient neural network is proposed in this work to compress infrared images at state-of-the-art levels while preserving overall image quality, meeting demands spanning from the daily commute to combat environments.
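The "linear transform plus omitted constituent vectors" idea can be made concrete with truncated SVD: keep only the k largest singular components of the image matrix. This is a generic illustration of transform-based compression, not the paper's neural network, and the image is random stand-in data.

```python
import numpy as np

def svd_compress(img, k):
    """Rank-k approximation: an image as a truncated basis expansion."""
    U, s, Vt = np.linalg.svd(img.astype(np.float64), full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Values stored: k left vectors, k right vectors, k singular values
    stored = k * (img.shape[0] + img.shape[1] + 1)
    ratio = img.size / stored
    return approx, ratio

rng = np.random.default_rng(0)
img = rng.random((64, 64))
approx, ratio = svd_compress(img, k=8)
print(round(ratio, 2))  # 64*64 / (8*(64+64+1)) ≈ 3.97
```

A learned codec plays the same game, but lets gradient descent choose a nonlinear, data-adapted basis instead of dropping fixed singular vectors arbitrarily.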
Eye-tracking holds numerous promises for improving the mixed reality experience. While eye-tracking devices are capable of accurate gaze mapping on 2D surfaces, depth estimation of gaze points remains a challenging problem. Most gaze-based interaction applications rely on estimation techniques that map gaze data to targets on a 2D surface. This approach inevitably leads to a biased outcome, as the nearest objects in the line of sight tend to be chosen as the target of interest more often. One viable solution is to estimate gaze as a 3D coordinate (x, y, z) rather than the traditional 2D coordinate (x, y). This article first introduces a new comprehensive 3D gaze dataset collected in a realistic scene setting, using a head-mounted eye-tracker and a depth estimation camera. Next, we present a novel depth estimation model, trained on the new gaze dataset, that accurately predicts gaze depth from calibrated gaze vectors. This method could help develop a mapping between gaze and 3D objects in 3D space. The presented model improves the reliability of depth measurement of visual attention in real scenes as well as the accuracy of depth-based scenes in virtual reality environments. Improving situational awareness using 3D gaze data will benefit several domains, particularly human-vehicle interaction, autonomous driving, and augmented reality.
Skin cancer is the most common type of cancer in the United States, with 9,500 new cases diagnosed daily. It is one of the deadliest forms; however, early detection and treatment can lead to recovery. More and more modern medical systems employ deep learning (DL) vision models as an assistive secondary diagnostic tool. This progress derives from the superior performance of convolutional neural networks (CNNs) across a wide range of medical applications. However, recent work has revealed that adding small, faint perturbations to images can cause these models to make classification errors. Such adversarial attacks can undermine defense measures and hamper the operation of deep learning models in real-world settings. The objective of this paper is to explore the effects of image degradation on popular off-the-shelf DL vision models. First, we evaluate the effects of adversarial attacks on image classification accuracy, sensitivity, and specificity, and introduce pepper noise as an adversarial attack that extends the one-pixel attack on deep learning models. Second, we propose a novel texture descriptor, Ordered-Statistics Local Binary Patterns (OS-LBP), for recognizing potential skin cancer areas. Third, we demonstrate how OS-LBP mitigates some of the effects of image degradation caused by adversarial attacks.
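The pepper-noise degradation itself is easy to sketch: zero out a random fraction of pixels, generalizing the one-pixel idea to many pixels. This illustrates only the attack mechanism on a synthetic image; the OS-LBP descriptor is the paper's contribution and is not reproduced here.

```python
import numpy as np

def pepper_attack(img, fraction=0.02, seed=0):
    """Set a random fraction of pixels to 0 ('pepper' noise)."""
    rng = np.random.default_rng(seed)
    out = img.copy()
    n = int(round(fraction * img.shape[0] * img.shape[1]))
    rows = rng.integers(0, img.shape[0], n)   # sampled with replacement,
    cols = rng.integers(0, img.shape[1], n)   # so collisions are possible
    out[rows, cols] = 0
    return out

img = np.full((100, 100), 200, dtype=np.uint8)  # synthetic uniform patch
attacked = pepper_attack(img, fraction=0.02)
print((attacked == 0).sum())  # about 200 pixels zeroed (fewer on collisions)
```

Because the perturbation is sparse and extreme-valued, rank-based texture statistics (the motivation for an ordered-statistics LBP variant) are natural candidates for robustness against it.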
Compressed sensing theory allows high-resolution recovery of sparse image data. However, image scene data from an RGB camera captured at night or in fog, dust, or rain is difficult to interpret. Fusing IR camera data with RGB camera data allows us to distinguish objects in the scene from noise (dust, rain, fog). This paper demonstrates a compressive sensing methodology applied to the scattered image data of the RGB camera to determine objects or obstacles in a scene, which enables low-cost solutions to the problem of autonomous driving in adverse imaging conditions.
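The recovery step at the heart of compressed sensing can be sketched with Orthogonal Matching Pursuit: a k-sparse signal is reconstructed from far fewer random measurements than its ambient dimension. This is a minimal generic sketch with made-up dimensions, not the paper's RGB/IR fusion pipeline.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: recover a k-sparse x from y = A @ x."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))  # most correlated atom
        support.append(j)
        # Least-squares fit on the chosen atoms, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((80, 128)) / np.sqrt(80)  # random sensing matrix
x_true = np.zeros(128)
x_true[[5, 37, 80]] = [1.0, -2.0, 1.5]            # 3-sparse scene
y = A @ x_true                                    # 80 compressive measurements
x_hat = omp(A, y, k=3)
print(sorted(np.nonzero(x_hat)[0]))               # recovered support
```

In the imaging setting, the sparse vector corresponds to scene structure in a suitable basis, and the measurements to the scattered RGB data.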
Spectral Domain Optical Coherence Tomography (SD-OCT) is a widely used imaging technique in ophthalmology. However, it often suffers from severe distortion due to speckle noise, which can obscure critical retinal structures and lesions. These distortions can significantly reduce the accuracy of image-based diagnostic tasks, so developing effective techniques for reducing speckle noise and improving the quality of SD-OCT images is crucial. There are two main challenges in removing speckle noise: (1) balancing the removal of noise against the preservation of essential image details, and (2) the varying intensity and size of speckle noise, which makes it challenging to develop a one-size-fits-all approach. If too much noise is removed, the image may become overly smoothed and lose essential details; if noise is not sufficiently removed, the image may still appear noisy and distorted. Different methods and algorithms may be needed depending on the noise characteristics and the specific image being processed. Despite these challenges, various denoising techniques, such as wavelet-based, non-local means, and adaptive median filtering, have been proposed in the literature. Each method has its strengths and weaknesses, and the choice of method should be based on the noise characteristics and the desired trade-off between noise removal and image preservation. While recent work in deep learning has shown promise in denoising OCT images, it requires extensive training data and complex hardware, limiting its practicality in many settings. This paper presents an edge-preserving noise removal method for improving the quality of SD-OCT images using a new morphology-based bitonic filter, created by combining the extended Okada filter with various kernel sizes. This approach efficiently removes speckle noise from OCT images while minimizing the loss of detail and enhancing image quality.
Compared to existing methods, the presented approach is more efficient and requires fewer computational resources. It could enhance the accuracy of image-based diagnostic tasks, ultimately benefiting patients and clinicians alike.
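A morphological flavor of bitonic filtering can be sketched in plain NumPy: the simplified filter below averages a grayscale opening and closing, which suppresses isolated speckle impulses while leaving flat regions and ideal step edges untouched. This is an illustrative simplification, not the paper's extended-Okada construction:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _rank_filter(img, k, func):
    """Apply min/max over a k x k sliding window with edge padding."""
    pad = k // 2
    win = sliding_window_view(np.pad(img, pad, mode="edge"), (k, k))
    return func(win, axis=(-2, -1))

def bitonic_filter(img, k=3):
    """Simplified morphological bitonic filter: average of a grayscale
    opening (erosion then dilation) and closing (dilation then erosion)."""
    opening = _rank_filter(_rank_filter(img, k, np.min), k, np.max)
    closing = _rank_filter(_rank_filter(img, k, np.max), k, np.min)
    return 0.5 * (opening + closing)

img = np.zeros((9, 9))
img[4, 4] = 10.0                   # isolated speckle impulse
out = bitonic_filter(img)
```

On this example the opening removes the bright impulse and the closing keeps it, so the averaged output attenuates it from 10 to 5 while the flat background is unchanged.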
Deep learning-based image enhancement is challenging in the underwater and medical imaging domains, where high-quality training data is often limited. Due to water distortion and loss of color and contrast, images captured in these settings are often of poor quality, making it difficult to train deep learning models effectively on large and diverse datasets. This limitation can negatively impact the performance of these models. This paper proposes an alternative approach to supervised color image enhancement to address this challenge. Specifically, the authors propose to enhance images in both the spatial and frequency domains using their previously proposed 2 × 2 quaternion image model. The color image components, plus a gray (brightness) component, are mapped into a grayscale image of twice the size, and histogram equalization (HE) of the new gray values is then calculated. The new colors and gray values of the image are reconstructed from the equalized image in the 2 × 2 model. The approach is tested extensively through computer simulations, demonstrating that the proposed framework achieves competitive performance on quantitative and qualitative metrics compared to state-of-the-art approaches.
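The 2 × 2 packing and joint equalization can be sketched as follows; the exact channel layout within each 2 × 2 block is an assumption made for illustration:

```python
import numpy as np

def he_2x2_model(rgb):
    """Pack R, G, B and a brightness channel into one double-size
    grayscale image, equalize it jointly, then unpack the channels.
    The block layout below is an assumed illustration."""
    h, w, _ = rgb.shape
    big = np.zeros((2 * h, 2 * w))
    big[0::2, 0::2] = rgb[..., 0]          # R
    big[0::2, 1::2] = rgb[..., 1]          # G
    big[1::2, 0::2] = rgb[..., 2]          # B
    big[1::2, 1::2] = rgb.mean(axis=2)     # gray / brightness
    flat = big.astype(np.uint8).ravel()
    cdf = np.bincount(flat, minlength=256).cumsum() / flat.size
    eq = (255 * cdf[flat]).reshape(big.shape)   # global HE on packed image
    out = np.stack([eq[0::2, 0::2], eq[0::2, 1::2], eq[1::2, 0::2]], axis=2)
    return out.astype(np.uint8)

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(8, 8, 3)).astype(np.uint8)
out = he_2x2_model(rgb)
```

Equalizing the packed image couples the color channels through a single histogram, which is the point of the joint 2 × 2 representation.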
The performance of real-world computer vision systems used in outdoor surveillance and autonomous vehicles suffers severely under adverse weather conditions. Removing mist, rain streaks, adherent raindrops, and snow is therefore an important processing step in real-world applications. Several deep learning solutions have been proposed for multiple-type weather removal, but existing methods are prohibitively expensive in computational requirements and are not suitable for real-time operation. To address this issue, we propose ChebTF, a lightweight encoder-decoder architecture based on quaternion neural network principles and a novel polynomial transform block. Quantitative and qualitative assessment on synthetic benchmark datasets and real-world images demonstrates that the proposed ChebTF handles various weather artifacts comparably to other leading weather removal methods.
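The abstract does not specify the polynomial transform block, but Chebyshev polynomial features (suggested by the name ChebTF) are typically generated with the standard three-term recurrence; a minimal sketch (function name and order are illustrative, not the paper's block):

```python
import numpy as np

def cheb_features(x, order=4):
    """Chebyshev expansion T_0..T_order via the recurrence
    T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x), for x in [-1, 1], order >= 1."""
    feats = [np.ones_like(x), x]
    for _ in range(2, order + 1):
        feats.append(2 * x * feats[-1] - feats[-2])
    return np.stack(feats, axis=0)
```

For x = 0.5 this yields T_0..T_3 = 1, 0.5, -0.5, -1; stacking such polynomial maps of normalized feature values gives a cheap, fixed (non-learned) nonlinear expansion.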
Human action recognition in video sequences is one of the key problems on the path to developing and deploying computer vision systems in various spheres of life, including video surveillance, monitoring, contactless control interfaces, and video processing as a preliminary stage of analysis. Additional sources of information, such as depth and thermal sensors, provide more informative features and thus increase the reliability and stability of recognition. In this work we focus on the simultaneous extraction of skeleton information and 3D local binary dense micro-block difference information of an object from visible and thermal images. The proposed algorithm is a four-stage procedure: (a) fusion of information from visible cameras and thermal sensors based on the PLIP model (a parameterized model of logarithmic image processing); (b) image preprocessing using a 3D Gabor filter; (c) descriptor calculation using the 3D local binary dense micro-block difference (3D LBDMD) with skeleton points; and (d) classification. The algorithm captures 3D sub-volumes located inside a video sequence patch and calculates the difference in intensities between these sub-volumes; to intensify motion, convolution with a bank of 3D arbitrarily oriented Gabor filters is used. Experiments show that the proposed method performs well compared to other state-of-the-art methods.
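In the PLIP family of models, addition is commonly defined as a (+) b = a + b - ab/gamma, which keeps fused intensities below the saturation level gamma. A minimal sketch of fusing visible and thermal frames this way (gamma, the clipping range, and the function names are assumptions, not the paper's exact fusion rule):

```python
import numpy as np

def plip_add(a, b, gamma=256.0):
    """PLIP addition: a (+) b = a + b - a*b/gamma; results stay below gamma."""
    return a + b - (a * b) / gamma

def fuse_visible_thermal(vis, thermal, gamma=256.0):
    """Illustrative pixelwise fusion of visible and thermal frames."""
    fused = plip_add(vis.astype(float), thermal.astype(float), gamma)
    return np.clip(fused, 0.0, 255.0)

vis = np.full((4, 4), 100.0)
thermal = np.full((4, 4), 100.0)
fused = fuse_visible_thermal(vis, thermal)
```

Note that plain addition would give 200, while PLIP addition gives 100 + 100 - 100*100/256 = 160.9375, modeling the saturating response of logarithmic image formation.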
An image haze removal algorithm based on multiscale block-rooting processing has been developed that removes haze/fog from an image via frequency-domain coefficient correction of a set of images, followed by their fusion based on the Laplacian pyramid. A new stage is proposed for obtaining a local-global estimate of high-contrast images, which is also used in the overall fusion model. The proposed block-based multiscale enhancement method is based on a 3-D block-rooting multiscale transform-domain technique comprising: finding similar blocks in the image by block matching; grouping blocks of different sizes; applying 3-D block-matching parametric image enhancement; calculating a quality measure of the enhancement; optimizing the parameters of the enhancement method with respect to that quality measure; and fusing the differently enhanced images. The performance of the proposed algorithm is tested on the public O-HAZE database.
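Block-rooting builds on classical alpha-rooting of transform coefficients, where each Fourier coefficient is scaled by its magnitude to the power (alpha - 1), compressing magnitudes while preserving phase. A minimal single-image sketch without the block matching and fusion stages (alpha is illustrative):

```python
import numpy as np

def alpha_rooting(img, alpha=0.95):
    """Alpha-rooting: scale each Fourier coefficient by |F|^(alpha - 1),
    preserving phase; alpha < 1 relatively boosts high frequencies."""
    F = np.fft.fft2(img.astype(float))
    mag = np.abs(F)
    scale = np.ones_like(mag)
    nz = mag > 0
    scale[nz] = mag[nz] ** (alpha - 1.0)   # zero coefficients left unchanged
    return np.real(np.fft.ifft2(F * scale))

rng = np.random.default_rng(0)
img = rng.random((16, 16))
enhanced = alpha_rooting(img)
```

With alpha = 1 the operation is the identity; the full method applies this kind of rooting per matched 3-D block group rather than globally.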
The article presents a new approach to minimizing a multicriteria objective function that suppresses the noise component while preserving the boundaries of sharp changes in brightness. This property also allows the method to be used as an edge detector. The proposed approach is built on an iterative scheme, with a proof of convergence and optimization of the convergence rate. The weighted difference between estimates obtained with different processing parameters is used as the detector. Dependences are given that reflect the result of processing smooth functions and functions with discontinuity points. We show the results of the detector on a set of test data containing objects of simple shapes, and present results for detecting coating defects and determining the boundaries of plasma sphere formation during the formation of a surface layer.
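The specific multicriteria functional is not reproduced here; as a generic illustration of iterative edge-preserving smoothing in the same spirit, a Perona-Malik-style scheme damps diffusion across strong gradients, and the weighted difference of two estimates with different parameters acts as a crude edge detector (all parameters and names are illustrative):

```python
import numpy as np

def edge_preserving_smooth(img, iters=20, kappa=20.0, step=0.2):
    """Perona-Malik-style smoothing: the conductance exp(-(d/kappa)^2)
    damps diffusion across strong gradients, so edges are preserved."""
    u = img.astype(float)
    for _ in range(iters):
        total = np.zeros_like(u)
        for axis, shift in ((0, 1), (0, -1), (1, 1), (1, -1)):
            d = np.roll(u, shift, axis=axis) - u   # neighbor difference
            total += np.exp(-(d / kappa) ** 2) * d
        u = u + step * total                        # stable for step <= 0.25
    return u

rng = np.random.default_rng(0)
noisy = 100.0 + 10.0 * rng.standard_normal((32, 32))
smoothed = edge_preserving_smooth(noisy)
# difference of two estimates with different parameters as an edge detector
edges = np.abs(edge_preserving_smooth(noisy, kappa=100.0)
               - edge_preserving_smooth(noisy, kappa=5.0))
```

A large kappa smooths everything while a small kappa stops at gradients, so their difference is largest near brightness discontinuities.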