Computed Tomography (CT) imaging serves as a crucial component of modern medical diagnostics by providing important information about the internal structures of the human body. Unfortunately, CT often faces the problem of data scarcity due to radiation exposure, the need for skilled professionals, and data privacy concerns. Therefore, generative models such as Generative Adversarial Networks (GANs) have been widely applied to generating synthetic CT images, substantially changing many aspects of medical image generation and analysis. However, directly applying GANs to CT image generation remains challenging. In particular, several representative GAN-based models, including Deep Convolutional GANs (DCGAN) and BigGAN, cannot directly generate large 3D volumes of CT scans. One important reason is that the consistency and dependency between CT slices are not appropriately handled by those models. To model 3D CT scans, large volumes of CT images can be treated similarly to time series. However, GAN models built on Recurrent Neural Networks (RNNs) cannot characterize long sequences of data due to training difficulties. In this paper, we propose Transformer-based GAN models to capture long sequences of CT scans. We conduct experiments on the LUNA16 pulmonary CT image dataset to verify the proposed methods. The empirical results demonstrate that the proposed models successfully generate large CT volumes with hundreds of CT slices.
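As a rough illustration of the idea (a minimal sketch under assumed layer sizes, not the authors' architecture), a transformer generator can emit one token per slice position, so self-attention models dependencies between all slices of the volume jointly rather than sequentially as an RNN would:

import torch
import torch.nn as nn

# Hypothetical sketch: a latent code is broadcast over learned per-slice
# position tokens, a transformer mixes information across slice positions,
# and a linear head decodes each token to a small slice.
class SliceSequenceGenerator(nn.Module):
    def __init__(self, n_slices=256, d_model=128, slice_hw=64):
        super().__init__()
        self.pos = nn.Parameter(torch.randn(n_slices, d_model))  # one token per slice
        self.proj_z = nn.Linear(d_model, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.to_slice = nn.Linear(d_model, slice_hw * slice_hw)
        self.slice_hw = slice_hw

    def forward(self, z):                        # z: (batch, d_model)
        tokens = self.pos.unsqueeze(0) + self.proj_z(z).unsqueeze(1)
        h = self.encoder(tokens)                 # attention spans all slice positions
        vol = self.to_slice(h).tanh()
        return vol.view(z.size(0), -1, self.slice_hw, self.slice_hw)

volume = SliceSequenceGenerator()(torch.randn(2, 128))   # (2, 256, 64, 64) volume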
Event-based sensors (EBS) consist of a pixelated focal plane array in which each pixel is an independent asynchronous change detector. The analog asynchronous array is read by a synchronous digital readout and written to disk. As a result, EBS pixels consume minimal power and bandwidth unless the scene changes. Furthermore, the change detectors have a very large dynamic range (~120 dB) and rapid response time (~20 µs). A framing camera with comparable speed requires ~3 orders of magnitude more power and ~2 orders of magnitude higher bandwidth. These features make EBS an appealing technology for proliferation detection applications. Remote sensing deployed in the field requires low-power, low-bandwidth, and low-complexity algorithms. EBS inherently allows for low power and low bandwidth, but a drawback of event-based sensors is the lack of mature image analysis algorithms. While analysis of conventional imagers draws from decades of image processing algorithms, EBS data is a fundamentally different format: a stream of x, y, asynchronous time, and polarity (increase/decrease) values, as opposed to x, y, and intensity at a regularly sampled framerate. To leverage the advantages of EBS over conventional imagers, our team has worked to develop and refine image processing algorithms that use EBS data directly. We will discuss these efforts, including frequency and phase detection. We will also discuss field applications of these algorithms, such as degraded visual environments (e.g., fog) and defeating laser dazzling attempts.
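To make the format difference concrete, here is an illustrative sketch of the event representation described above (field names and sensor dimensions are assumptions), along with one common bridge back to frame-based tooling:

import numpy as np

# Each event is an (x, y, timestamp, polarity) tuple, not a dense frame.
event_dtype = np.dtype([("x", np.uint16), ("y", np.uint16),
                        ("t_us", np.uint64), ("p", np.int8)])  # p: +1/-1

events = np.zeros(3, dtype=event_dtype)
events[0] = (10, 20, 1000, +1)   # brightness increase at (10, 20) after 1 ms
events[1] = (10, 21, 1020, -1)   # decrease 20 us later at a neighboring pixel
events[2] = (10, 20, 1100, +1)

# One way to reuse frame-based algorithms: accumulate events over a time
# window into a signed "event frame" (this discards the fine timing EBS
# provides, which is why native EBS algorithms are preferable).
frame = np.zeros((480, 640), dtype=np.int32)
np.add.at(frame, (events["y"], events["x"]), events["p"])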
The rapid progress in deep learning, particularly in convolutional neural networks (CNNs), has significantly enhanced the effectiveness and efficiency of hyperspectral image (HSI) classification. While CNN-based approaches excel at enriching local features, they often struggle to capture long-range dependencies in sequential data. To address this limitation, an attention mechanism can be integrated with CNN architectures to capture rich global and local representations. Transformer architectures and their variants, known for their ability to model long-distance dependencies in sequential data, have gradually found applications in HSI classification tasks. Recently, the Retentive Network (RetNet) has emerged, claiming superior scalability and efficiency compared to traditional transformers. One pivotal distinction between the self-attention operator in the Transformer and the retention mechanism in RetNet lies in the introduction of a decay parameter. This parameter explicitly regulates the attention weight assigned to each token based on its distance from neighboring tokens, resulting in improved performance. However, no study has yet examined the effectiveness of RetNet for HSI analysis. In this study, we incorporate the retention mechanism and a progressive neuron expansion structure into the task of pixel-wise HSI classification, and accordingly name our proposed method the Retentive Progressive Expansion Network (R-PEN). Experimental analyses conducted on real-world hyperspectral image datasets show that the R-PEN model surpasses other pertinent deep learning models in classification performance.
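A toy illustration of the decay parameter described above (parallel, single-head retention form; all dimensions and the value of gamma are arbitrary): retention resembles attention, but a factor gamma**(n - m) explicitly down-weights token m in the output for token n as their distance grows.

import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 6, 4
gamma = 0.9
Q, K, V = (rng.standard_normal((seq_len, d)) for _ in range(3))

n, m = np.indices((seq_len, seq_len))
D = np.where(n >= m, gamma ** (n - m), 0.0)   # causal decay matrix

retention = (Q @ K.T * D) @ V                 # vs. softmax(Q K^T) V in attention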
The study in this paper builds on previous research in reinforcement learning to address the challenges of computational complexity and scalability in multi-agent, multi-target satellite sensor tasking systems. Drawing on the groundwork laid by previous research on space-based hyperspectral imaging systems, novel approaches are introduced to optimize satellite tasking efficiency. The primary innovation is a continuous space expansion method, which enhances system adaptability without necessitating intricate adjustments. Additionally, the study investigates transfer learning within larger state-action spaces, utilizing insights from smaller spaces to accelerate training in more extensive and intricate environments. Through a series of comprehensive experiments conducted in an enhanced physics-based Python simulation environment, the effectiveness and practicality of these strategies are confirmed. The outcomes reveal significant reductions in the computational complexity of multi-agent, multi-target satellite tasking, rendering it more viable for real-world implementation. This research contributes to the advancement of AI-driven satellite tasking, enhancing its efficiency in managing extensive satellite constellations.
With increasing global temperatures due to anthropogenic climate change, seasonal sea ice in the Arctic has experienced rapid retreat, with an increasing areal extent of meltponds occurring on the surface of retreating sea ice. Because meltponds have a much lower albedo than sea ice or snow, more solar radiation is absorbed by the underlying water, further accelerating the melting rate of sea ice. However, the dynamic nature of meltponds, which exhibit complex shapes and boundaries, makes manual analysis of their effects on underlying light and water temperatures tedious and taxing. Several classical image processing approaches have been used extensively for the detection of meltpond regions in the Arctic. We propose a Convolutional Neural Network (CNN)-based multiclass segmentation model termed NABLA-N (∇N) for automated detection and segmentation of meltponds. The architectural framework of NABLA-N consists of an encoding unit and multiple decoding units that decode from several latent spaces. The fusion of multiple feature spaces in the decoding units enables better feature representation by combining low- and high-level feature maps. The proposed model is evaluated on high-resolution aerial photographs of Arctic sea ice obtained during the Healy-Oden Trans Arctic Expedition (HOTRAX) in 2005 and NASA's Operation IceBridge DMS L1B Geolocated and Orthorectified image data from 2016. These images are classified into three classes: meltpond, open water, and sea ice. We determined that NABLA-N demonstrates superior performance on segmentation of meltpond data compared to other state-of-the-art networks such as UNet and Recurrent Residual UNet (R2UNet).
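A minimal sketch of the multi-decoder fusion idea described above (layer sizes and fusion by summation are assumptions, not the exact NABLA-N design): features decoded from several encoder depths are upsampled to a common resolution and fused, so low-level detail and high-level context both reach the per-pixel prediction.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiDecoderFusion(nn.Module):
    def __init__(self, n_classes=3):               # meltpond / open water / sea ice
        super().__init__()
        self.enc1 = nn.Conv2d(3, 16, 3, stride=2, padding=1)   # shallow latent space
        self.enc2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)  # deeper latent space
        self.dec1 = nn.Conv2d(16, n_classes, 1)                # decoder for latent 1
        self.dec2 = nn.Conv2d(32, n_classes, 1)                # decoder for latent 2

    def forward(self, x):
        f1 = F.relu(self.enc1(x))
        f2 = F.relu(self.enc2(f1))
        size = x.shape[-2:]
        d1 = F.interpolate(self.dec1(f1), size=size, mode="bilinear", align_corners=False)
        d2 = F.interpolate(self.dec2(f2), size=size, mode="bilinear", align_corners=False)
        return d1 + d2                              # fused multi-level class logits

logits = MultiDecoderFusion()(torch.randn(1, 3, 128, 128))     # (1, 3, 128, 128)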
Within many organizations, a vast number of communications, memos, reports, and documents accumulate on internal servers. Efficiently discovering relevant entries can reduce time spent addressing organizational needs such as personnel skills matching or anomaly resolution. However, information retrieval across these disparate data types can be challenging, as systems must be designed for each organization's domain while accounting for unstructured and inconsistent datasets. Traditional querying via search terms often requires relevancy tuning by subject matter experts, which makes retrieval systems difficult to build. We argue that the development of retrieval systems can be simplified and enhanced by embedding data with Large Language Models (LLMs), organizing information in a Knowledge Graph (KG) structure, and further encoding its relational features through a Graph Neural Network (GNN). One of the major challenges of using GNNs for information retrieval is optimizing negative edge selection: training GNNs requires a balanced ratio between positive and negative edges, yet the space of negative edges is exponentially larger than that of positive edges. In this work, we extend the LLM-GNN hybrid architecture by applying ensemble voting on a set of trained LLM-GNNs. Preliminary results have shown modest improvement on our personnel-document matching tasks. This work contributes to a developmental effort that aims to help engineers and scientists find new research opportunities, learn from past mistakes, and quickly address future needs.
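To illustrate the balancing problem described above (uniform sampling shown; node names and the 1:1 ratio are hypothetical choices, not necessarily the paper's strategy): a graph with N nodes has O(N^2) candidate non-edges, so training typically samples negatives to match the positive-edge count.

import random

def sample_negative_edges(nodes, positive_edges, seed=0):
    rng = random.Random(seed)
    positives = set(positive_edges)
    negatives = set()
    while len(negatives) < len(positives):          # balanced 1:1 ratio
        u, v = sorted(rng.sample(nodes, 2))
        if (u, v) not in positives and (v, u) not in positives:
            negatives.add((u, v))
    return list(negatives)

nodes = ["doc_a", "doc_b", "person_x", "person_y"]
pos = [("person_x", "doc_a"), ("person_y", "doc_b")]
neg = sample_negative_edges(nodes, pos)             # 2 sampled non-edges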
Improving a machine's ability to reason about the unknown has been a prominent commonality across the different emerging areas of modern supervised learning. While different approaches formalize this problem, many focus on generalized target recognition tailored to the known-vs-unknown problem setting. Overall, these approaches have created a meaningful foundation that promotes algorithm enhancement with respect to factors like detection, robustness, and internal knowledge expression. However, one major shortcoming across numerous prior works is the question of how to make use of unknown classifications in an algorithm deployment setting. Herein, we address this shortcoming by proposing a self-supervised comparison assessment methodology for computer vision tasks. Specifically, we leverage the features of foundation models across different dimensionality spaces to facilitate a comparison analysis of unknown information. Preliminary results are encouraging and demonstrate that this process not only has benefits in computer vision applications but is also flexible to methodological alterations.
Underwater imagery often exhibits significant degradation and poor quality compared to outdoor imagery. To compensate for this, Single-Image Super-Resolution (SISR) and enhancement algorithms are used to lessen this degradation and produce high-resolution images. In this study, we apply state-of-the-art Simultaneous Enhancement and Super-Resolution (SESR) and SISR models to different sets of downscaled images from the comprehensive RUOD dataset. We then conduct a qualitative and quantitative analysis of the upscaled and enhanced images using standard underwater image quality metrics (IQMs). Subsequently, we evaluate the robustness of the state-of-the-art YOLO-NAS detector against image sets with varying downscaled spatial resolutions. Lastly, we examine the impact that the SISR and SESR models have on YOLO-NAS detector performance. The findings reveal a decline in detection performance on the downscaled test images and a further decline on the upscaled and enhanced images produced by the SISR and SESR models, suggesting a negative relationship between such models and detection.
This paper experimentally evaluates the telescope-system resolution required to enable reliable deep learning-based long-range UAV detection. FRCNN, a state-of-the-art deep learning object detector, is fine-tuned for UAV detection with a custom dataset. A test dataset was created of a small UAV in front of clear and complex backgrounds at distances ranging from 500 m up to 2500 m, using a telescope with a focal length of 1325 mm and an aperture of 102 mm. At each distance, the resolution is measured with a modified version of the US Air Force resolution chart. The results show that a small UAV is detected with a mAP(0.5) above 90% in front of a complex background up to a distance of 1167 m, given a minimum resolution of 9.3 mm or 8 µrad, and up to 2222 m in front of a clear background, given a minimum resolution of 38 mm or 17.1 µrad.
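The paired figures are consistent under the small-angle approximation, where angular resolution is simply the resolved feature size divided by range:

# Quick consistency check of the reported thresholds.
for size_m, range_m in [(9.3e-3, 1167), (38e-3, 2222)]:
    print(f"{size_m * 1e3:.1f} mm at {range_m} m -> "
          f"{size_m / range_m * 1e6:.1f} urad")
# 9.3 mm at 1167 m -> 8.0 urad
# 38.0 mm at 2222 m -> 17.1 urad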
Effective preprocessing of image data plays a pivotal role in enhancing discriminative modeling capabilities in downstream machine learning tasks. This study investigates the significance of adequately mapping image data into a new feature space during the preprocessing phase, emphasizing how critical this step is to building more robust and accurate models. While traditional methods such as signal/image processing transforms have previously been explored for this purpose, this study introduces a novel approach leveraging deep learning techniques. Specifically, convolutional and pooling layers are employed to process the image data, offering a more sophisticated and adaptive method for feature extraction and representation. By employing deep learning architectures, the preprocessing phase becomes more flexible and capable of capturing intricate patterns and structures within the data. Through empirical evaluation, our approach demonstrates significant improvements in discriminative modeling across various traditional machine learning approaches. This highlights the effectiveness and versatility of deep learning-based preprocessing in enhancing the performance of downstream tasks, showcasing its potential to advance the field of image data processing and analysis.
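A minimal end-to-end sketch of this kind of pipeline (the dataset, random untrained filters, and downstream classifier are all illustrative choices, not the study's setup): convolution and pooling layers map images into a feature space, and a traditional model is trained on the result.

import torch
import torch.nn.functional as F
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()
x = torch.tensor(digits.images, dtype=torch.float32).unsqueeze(1)  # (n, 1, 8, 8)

torch.manual_seed(0)
kernels = torch.randn(8, 1, 3, 3)                 # 8 fixed convolutional filters
with torch.no_grad():
    feats = F.relu(F.conv2d(x, kernels, padding=1))
    feats = F.max_pool2d(feats, 2)                # (n, 8, 4, 4) feature space
features = feats.flatten(1).numpy()

x_tr, x_te, y_tr, y_te = train_test_split(features, digits.target, random_state=0)
clf = LogisticRegression(max_iter=2000).fit(x_tr, y_tr)
print(clf.score(x_te, y_te))                      # accuracy on held-out digits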
Wildfires are a key aspect of many ecosystems, but climate change has created conditions more conducive to devastating wildfires. Thus, it is imperative that relevant agencies learn expeditiously where small fires occur. Remote sensing is a key tool for active fire detection (AFD), and satellite imagery in particular is useful due to its wide area coverage. Semantic segmentation architectures like U-Net have been used for AFD and have proven very effective. In this paper, we apply a unique variant of U-Net called ResWnet to AFD, using a large global dataset. ResWnet achieved a precision of 95% and an F-score of 94.2%, outperforming a U-Net trained on the same dataset.
A case study analysis is presented demonstrating the deinterleaving procedure, and its characteristics in terms of parameter sensitivity, for PRI-modulation recognition in ELINT analysis. This study examines data obtained from a RADAR collection site, consisting of various PRI-modulated characteristics such as those typically occurring in ELINT-analysis environments. A specific aspect of this analysis is that it considers deinterleaving of RADAR pulses, which represent a precursor data format collected in a tactical maritime environment, to construct RADAR scans. Ultimately, the goal of this analysis is to construct these RADAR scans in order to eventually construct and identify RADAR signals, which represent a post-processed data format with respect to scans, so as to determine the geolocation of maritime targets. We present initial results of an evolving data-driven methodology based on the types of data being processed.
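For readers unfamiliar with deinterleaving, here is an illustrative sketch of one standard ingredient, a time-of-arrival difference histogram in the spirit of CDIF/SDIF-style methods. This is not necessarily the procedure used in the study; the two constant-PRI emitters and all parameters are synthetic.

import numpy as np

pri_a, pri_b = 100.0, 137.0                       # microseconds (synthetic)
toa = np.sort(np.concatenate([np.arange(0, 20000, pri_a),
                              np.arange(13, 20000, pri_b)]))  # interleaved pulses

# Histogram TOA differences up to third order; true PRIs appear as peaks.
diffs = np.concatenate([toa[k:] - toa[:-k] for k in (1, 2, 3)])
hist, edges = np.histogram(diffs, bins=np.arange(0, 300, 1.0))
peaks = edges[np.argsort(hist)[-2:]]              # two strongest difference bins
print(sorted(peaks))                              # approximately [100.0, 137.0]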
Gliomas are the primary brain tumors most commonly observed in adult patients and exhibit varying degrees of aggressiveness and prognosis. The accurate identification and diagnosis of gliomas in surgical procedures heavily relies on precise segmentation results, which involve delineating the tumor region from magnetic resonance imaging (MRI) scans of the brain. The segmentation process in conventional 3D CNN methods often relies on patch processing as a result of limitations in GPU memory. This paper presents an approach for segmenting brain tumors into distinct subregions, namely the whole tumor, tumor core, and enhancing tumor, utilizing a 3D tiled convolution (3DTC)-based segmentation method. The 3DTC method enables the inclusion of larger patch sizes without requiring hardware with high GPU memory. This study presents three significant modifications to the standard 3D U-Net. Firstly, we incorporate 3D tiled convolution as the initial layer in our proposed models. Secondly, we substitute the trilinear upsampling layer with a dense upsampling convolution layer. Lastly, we replace the standard convolution block with recurrent residual blocks in the proposed R2AU-Net. The best framework was combined with an average ensembling technique to achieve accurate results on the validation set of the BraTS 2020 dataset. The evaluation of our method on this validation dataset yielded Dice scores of 90.76%, 83.39%, and 74.77% for the whole tumor, tumor core, and enhancing tumor regions, respectively.
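A sketch of a recurrent residual block in the R2U-Net style referenced above (shown in 2D and simplified for brevity; the paper's version is 3D and its exact configuration is not given here): the convolution is applied t times, each pass re-adding the block input, and a residual skip wraps the whole unit.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentResidualBlock(nn.Module):
    def __init__(self, channels, t=2):
        super().__init__()
        self.t = t
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        h = F.relu(self.conv(x))
        for _ in range(self.t):
            h = F.relu(self.conv(x + h))   # recurrent refinement of the features
        return x + h                       # residual connection

y = RecurrentResidualBlock(16)(torch.randn(1, 16, 32, 32))  # shape preserved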
2D multi-person pose estimation is a well-studied problem for understanding humans in an image. It involves keypoint detection, which requires detecting and localizing the points of interest (human joints). Multi-person pose estimation remains challenging because of occlusion of body parts, the non-rigidity of the human body, the variable number of persons in an image, and varying scales. The most common existing method for keypoint detection is heatmap-based regression. However, it has several drawbacks: the precision relies on the resolution of the output heatmap; the pre- and post-processing required for high-resolution heatmaps is computationally costly; and the overlapping heatmap signals of spatially close keypoints cannot be distinguished. Therefore, heatmap-free pose estimation emerged to tackle these problems; KAPAO and YOLO-Pose are representative examples. Both utilize YOLO for keypoint detection, since YOLO is an extremely fast object detection method with high accuracy. A graph consists of a collection of nodes and a collection of edges that connect the nodes. A human pose can be represented as a graph, where human joints are nodes and the corresponding connections draw the pose. Graph neural networks (GNNs) are designed for data with graph structure. Inspired by these observations, we introduce a YOLO-based GNN, a heatmap-free approach for 2D multi-person pose estimation. A YOLO-based network is leveraged for keypoint detection, and the detected keypoints and connections are then re-arranged and refined by a GNN. We tested our framework on the COCO-2017 dataset, and preliminary results show superior performance in accuracy and efficiency.
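A small sketch of the pose-as-graph view described above (one common COCO skeleton definition; the keypoint values stand in for a detector's output): the 17 joints are the nodes and the fixed skeleton supplies the edges a GNN would refine.

import numpy as np

COCO_SKELETON = [(5, 7), (7, 9), (6, 8), (8, 10),         # arms
                 (11, 13), (13, 15), (12, 14), (14, 16),  # legs
                 (5, 6), (11, 12), (5, 11), (6, 12),      # torso
                 (0, 1), (0, 2), (1, 3), (2, 4)]          # head

keypoints = np.random.rand(17, 3)       # (x, y, confidence) per joint, placeholder
node_features = keypoints               # GNN node inputs

# Undirected edge list stored in both directions, as most GNN libraries expect.
edge_index = np.array(COCO_SKELETON + [(j, i) for i, j in COCO_SKELETON]).T
print(edge_index.shape)                 # (2, 32)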
In this paper, we determine associations between social media use and beliefs in conspiracy theories and misinformation among African American communities in Tuskegee County, a community with significant social problems. The primary goal of this work is to visualize how information (both false and accurate) flows through social media, traditional media, and social networks to influence decision-making in rural areas. The second goal is to examine how various other factors moderate this influence. We will examine the impacts of education, age, and other demographics, as well as measure Gigerenzer's concept of "risk literacy," which examines the accuracy of people's perceived notions of risk. We will develop our model based on data collected from in-person meetings and town halls, questionnaires, and other information collected to measure people's social media use, social networks, and their beliefs about issues such as the efficacy of COVID vaccines, their trust in the health care system, and their beliefs about mental health.
Reversible information hiding technology can hide secret or sensitive information in the redundant information of a carrier image and completely restore the original image at the receiving end. Currently, the difference histogram algorithm appears to be the most attractive approach for reversible information hiding. However, this technique cannot adequately balance embedding capacity and security. To further improve the embedding capacity and security of the difference histogram algorithm, this paper proposes a large-capacity reversible information hiding algorithm based on multi-difference histograms and Gray code. First, the original image is divided into multiple blocks of the same size. Then the blocked image is scrambled with Gray code to improve the system's security. Thereafter, a difference histogram is established for the blocked image and the zero value on the right side of the peak value is selected as the embedding position. Finally, the secret information is embedded. Experimental results show that the proposed algorithm significantly improves the embedding capacity of the carrier image while ensuring the security of the carrier image and the secret information.
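A simplified single-histogram illustration of the embedding step described above (block division, Gray-code scrambling, and the multi-histogram extension are omitted; the toy data is arbitrary): values strictly between the peak and the zero bin shift right by one to free the bin next to the peak, and each peak-valued element then carries one payload bit.

import numpy as np

def embed(diffs, bits):
    hist = np.bincount(diffs + 255, minlength=511)       # histogram over [-255, 255]
    peak = int(np.argmax(hist)) - 255                    # most frequent difference
    zero = peak + 1 + int(np.argmin(hist[peak + 256:]))  # first empty bin to the right
    out = diffs.copy()
    out[(out > peak) & (out < zero)] += 1                # shift to free the bin peak + 1
    carriers = np.flatnonzero(out == peak)[:len(bits)]
    out[carriers] += np.asarray(bits)                    # peak -> peak + bit
    return out, peak, zero

rng = np.random.default_rng(1)
diffs = rng.integers(-5, 6, size=200)                    # toy difference values
stego, peak, zero = embed(diffs, [1, 0, 1, 1])

The receiver reverses the steps exactly: peak-valued and (peak+1)-valued elements yield the bits, and shifted values move back down, restoring the original differences.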
With the trustworthiness of multimedia data challenged by editing tools, image forgery localization aims to identify regions in images that have been modified. Although existing techniques provide reasonably good results for image forgery localization, emerging editing techniques force such models to be retrained, and they depend heavily on real tampering localization maps. In this paper, we propose an attention-based fusion network that combines the RGB image and the noise residual, yielding excellent results. The noise residual is commonly regarded as a camera model fingerprint, and forgeries can be localized as deviations from its expected regular pattern. The model consists of three parts: feature extraction, attentional feature fusion, and feature output. The feature extraction module extracts RGB image features and noise residuals separately, and the attentional feature fusion module suppresses high-frequency components and supplements and enhances model-related artifacts by combining the aforementioned features. Finally, the last module generates a one-channel image as the camera model fingerprint. To avoid dependence on tampering localization maps, the model is trained with pairs of image patches coming from the same or different camera sensors by means of a Siamese network. Experimental results obtained from several datasets show that the proposed technique successfully identifies modified regions, improves the quality of camera model fingerprints, and achieves significantly better performance than existing techniques.
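A sketch of the Siamese training idea described above (the encoder and the use of a contrastive loss are illustrative assumptions, not the paper's exact design): a shared encoder maps two patches to fingerprint embeddings, and the loss pulls same-camera pairs together and pushes different-camera pairs apart, with no tampering maps needed.

import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(8), nn.Flatten(),
                        nn.Linear(16 * 8 * 8, 64))       # shared by both branches

def contrastive_loss(a, b, same_camera, margin=1.0):
    d = F.pairwise_distance(encoder(a), encoder(b))
    return torch.mean(same_camera * d ** 2 +
                      (1 - same_camera) * F.relu(margin - d) ** 2)

patch_a, patch_b = torch.randn(4, 3, 48, 48), torch.randn(4, 3, 48, 48)
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])    # 1 = same sensor, 0 = different
loss = contrastive_loss(patch_a, patch_b, labels)
loss.backward()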
Accurately segmenting skin lesions from dermoscopy images is crucial for improving the quantitative analysis of skin cancer. However, segmenting melanoma automatically is difficult due to the significant variation in melanoma and the unclear boundaries of the lesion areas. While Convolutional Neural Networks (CNNs) have made impressive progress in this area, most existing solutions struggle to effectively capture global dependencies owing to their limited receptive fields. Recently, transformers have emerged as a promising tool for modeling global context through a powerful global and local attention mechanism. In this paper, we investigated the effectiveness of various deep learning approaches, both CNN-based and transformer-based, for the segmentation of skin lesions in dermoscopy images. We also studied and compared the performance of transfer learning algorithms developed on well-established encoders such as Swin Transformer, Mix-Transformer, Vision Transformer, ResNet, VGG-16, and DenseNet. Our proposed approach involves training a neural network on polar transformations of the original dataset, with the polar origin set to the object's center point. This simplifies the segmentation and localization tasks and reduces dimensionality, making it easier for the network to converge. The ISIC 2018 dataset, containing 2,594 dermoscopy images with their ground truth segmentation masks, was used to evaluate our approach on skin lesion segmentation tasks. This dataset was randomly split into 70%, 10%, and 20% groups for training, validation, and testing purposes. The experimental results showed that using polar transformations as a pre-processing step generally improved the efficiency of both the CNN-based and transformer-based models across the dataset.
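A sketch of this polar pre-processing step (the stand-in image, the mask-centroid choice of origin, and the output size are assumptions for illustration): the image is resampled around the lesion's center point before being fed to the network.

import cv2
import numpy as np

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # stand-in image
mask = np.zeros((256, 256), dtype=np.uint8)
mask[80:160, 100:180] = 1                                         # stand-in lesion

ys, xs = np.nonzero(mask)
center = (float(xs.mean()), float(ys.mean()))                     # polar origin
max_radius = np.hypot(*image.shape[:2]) / 2

# Rows of the result sweep angle, columns sweep radius from the lesion center.
polar = cv2.warpPolar(image, (256, 256), center, max_radius,
                      cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)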
This paper examines the challenges of object tracking algorithms operating on RGB-D data. We analyze and quantify the performance of common state-of-the-art tracking methods on the intensity and depth channels. This paper investigates the tracking performance characteristics of intensity and depth channel processing, both separately and in conjunction, within complex RGB-D scenes with moving objects. A new assessment metric is introduced, called template dissimilarity assessment (TDA), to score the performance of individual tracking methods and determine when track is lost and re-initialization is appropriate. Various tracking metrics are directly compared between the intensity and depth datasets. The overall performance and the advantages of the intensity and depth tracking approaches are emphasized. Lastly, the overall performance assessment includes the algorithmic computational expense, measured via processor timing tests.
Semantic segmentation is of crucial importance in various domains due to its ability to recognize and categorize objects within an image at the pixel level. This task enables a wide range of applications, such as autonomous vehicles, environmental monitoring, and remote sensing (RS). In RS, semantic segmentation acts as the basis for applications including land cover classification. Following the success of deep learning (DL) methods in computer vision, our paper addresses the intersection between DL and RS imagery. We focus on improving the efficiency of some baseline and backbone models to ensure their adaptability to the challenges posed by RS imagery. Therefore, we evaluate state-of-the-art models on two datasets and investigate their ability to accurately segment objects in RS imagery. Our research aims to open the way for more accurate and reliable semantic segmentation methods in geospatial analysis.
This study investigates the use of metaheuristic algorithms for climate change adaptation and decision making, reviewing literature from 1997 to 2023. Our investigation shows that metaheuristic algorithms are central to dealing with the complexity of climate change, with the field exhibiting a high annual growth rate and a wide interdisciplinary effort. These algorithms, particularly genetic algorithms and particle swarm optimization, have been applied in several areas, including energy efficiency and sustainable development. This demonstrates their flexibility across different responses, and the use of these approaches may therefore help form resilient adaptation methods. In turn, the research provides substantial information on strategic development using metaheuristic algorithms, yet it has some limitations, such as probable coverage gaps and the geographical focus of existing efforts. Future research should pursue empirical validation in real-world settings and investigate applications to underrepresented areas, thereby promoting interdisciplinary collaborations in accordance with global adaptation needs. In conclusion, the study reveals the great potential of metaheuristic algorithms in fostering climate change adaptation and calls for further research to enhance inclusive, effective, and sustainable methods.
In this paper, we present an optimization methodology for reducing the number of sensors in an existing monitoring network. These sensors measure the concentration of pollutant gas in the air in order to estimate the position and intensity of a pollutant source. Two statistical methods were used and compared: the first is based on Hierarchical Agglomerative Clustering (HAC), and the second on Self-Organizing Maps (SOM). The aim is to group sensors with the same behavior based on a similarity measure, and then keep only one sensor from each cluster. The methodology was tested on synthetic data, with Bayesian inference and a Markov Chain Monte Carlo (MCMC) algorithm used to identify the pollutant source position and intensity. The initial network of 88 sensors was reduced to 21 sensors by HAC and 27 by SOM. As for the identification, both methods yielded close estimates of the source position; however, SOM generally gave better estimates of the source intensity.
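A sketch of the HAC-based reduction (the correlation-based similarity measure, random readings, and medoid selection are illustrative assumptions): sensors with similar concentration time series are grouped, and one representative per cluster is kept.

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
readings = rng.random((88, 500))                 # 88 sensors x 500 time steps

dist = pdist(readings, metric="correlation")     # 1 - correlation as dissimilarity
tree = linkage(dist, method="average")
labels = fcluster(tree, t=21, criterion="maxclust")   # cut into 21 clusters

# Keep the medoid (most central member) of each cluster as its representative.
dmat = squareform(dist)
keep = [members[np.argmin(dmat[np.ix_(members, members)].sum(axis=1))]
        for members in (np.flatnonzero(labels == c) for c in np.unique(labels))]
print(sorted(keep))                              # indices of the retained sensors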
When building or renovating a warehouse, a brainstorming phase is required to discuss robotic automation. Indeed, in order to achieve optimal performance, enhancements to the goods-selection processes are continually sought. This selection relies on important information about moving products. Recently, several new methods have emerged, yet the time available to try them remains limited. To evaluate the performance of these methods, it is necessary to carry out tests. In this paper, we introduce a small-scale simulator designed to facilitate the testing of innovations outlined in the literature. Like a real warehouse, it features a conveyor belt to simulate the movement of goods and a Ned2 robotic arm. This research presents, with limited resources, the performance of a novel object detection method. The simulation operates autonomously and is controlled by an NVIDIA Jetson Nano card, which runs novel deep learning methods. Furthermore, a depth camera is integrated to determine the 3D position of the goods.
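For the depth-camera step, a minimal sketch of pinhole deprojection (the intrinsics fx, fy, cx, cy below are placeholder values, not the simulator's calibration): a detected pixel plus its depth reading yields a 3D position in the camera frame for the arm to target.

def deproject(u, v, depth_m, fx=615.0, fy=615.0, cx=320.0, cy=240.0):
    """Map a pixel (u, v) with depth in metres to a 3D camera-frame point."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

print(deproject(400, 260, 0.75))      # item detected at pixel (400, 260)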
Chromosomal translocations involve the exchange of segments between non-homologous chromosomes. The Philadelphia chromosome, known as the t(9;22) abnormality, is an example of a translocation linked with chronic myeloid leukemia. This study leverages the capabilities of a modified Siamese architecture for the automated detection of this translocation. Highlighting its superior image recognition capabilities, this modified Siamese architecture, an innovative alternative to conventional Convolutional Neural Networks (CNNs), processes images by effectively capturing both local and global image details without the inherent biases found in traditional image analysis methods. This work underscores the specific capabilities and advantages of the proposed Siamese architecture, emphasizing its crucial role in overcoming the limitations of traditional diagnostic methods in identifying the t(9;22) translocation, and its potential to significantly enhance genetic diagnostics.