Single-pixel imaging has gained prominence for its wide working spectral range and high sensitivity. Deep-learning-based single-pixel imaging shows superiority in real-time reconstruction, particularly with limited resources. In this work, we report a novel encoder-decoder method for single-pixel imaging that aims to enhance imaging quality from extremely few measurements. First, we encode the high-dimensional target information into one-dimensional measurements using globally optimized modulation patterns, implemented by a fully connected or convolutional layer. Second, we integrate a U-Net neural network with an advanced multi-head self-attention mechanism and a pyramid pooling module to decode the measurements and reconstruct high-fidelity images. Under this strategy, the skip connections within the U-Net structure enhance the preservation of fine image features, while the multi-head self-attention mechanism and pyramid pooling module effectively capture contextual dependencies among low-dimensional measurements, thereby extracting significant image features and enhancing reconstruction quality. Simulation results on the STL-10 dataset validate the effectiveness of the reported technique. At a resolution of 96 × 96 pixels and an ultra-low sampling rate of 1%, we consistently achieved the highest image fidelity compared with traditional single-pixel reconstruction methods for both grayscale and color images.
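A minimal sketch (assuming a PyTorch implementation) of the encoding stage described above: a fully connected layer whose weight rows act as learnable modulation patterns, compressing a 96 × 96 image into 1D measurements at a 1% sampling rate. The names (`PatternEncoder`, `n_pix`, `n_meas`) are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

class PatternEncoder(nn.Module):
    def __init__(self, n_pix=96 * 96, sampling_rate=0.01):
        super().__init__()
        n_meas = int(n_pix * sampling_rate)          # ~92 measurements at 1%
        # Each row of the weight matrix is one learnable modulation pattern.
        self.patterns = nn.Linear(n_pix, n_meas, bias=False)

    def forward(self, img):                          # img: (B, 1, 96, 96)
        return self.patterns(img.flatten(1))         # (B, n_meas)

encoder = PatternEncoder()
y = encoder(torch.rand(4, 1, 96, 96))                # simulated 1D measurements
print(y.shape)                                       # torch.Size([4, 92])
```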
Near-space surveillance is emerging as a pivotal tool for climate observation, resource exploration, and disaster evaluation. Large-scale multi-modal data improves analysis accuracy, but its acquisition faces the challenge of limited downlink bandwidth and storage resources on near-space platforms. In this work, we developed a near-space multi-modal surveillance system that not only enables multi-modal video acquisition but also realizes efficient data storage control and low-latency, low-bandwidth data transmission. Specifically, a snapshot hyperspectral camera (2048×2048 spatial resolution, 5 nm spectral resolution, covering a wide spectral range from 400 to 1000 nm), an infrared camera (640×512 pixel resolution), and an RGB camera (2448×2048 pixel resolution) were integrated to enable synchronous wide-spectrum multi-modal data acquisition at a maximum frame rate of 24 fps. To handle the massive heterogeneous data generated by the multiple cameras, a B+Tree index was constructed with the data acquisition time as the primary key, which reduces the time complexity of data retrieval from linear to logarithmic. A UDP-based image transmission protocol was designed to reduce communication latency by eliminating the head-of-line blocking and handshake delays incurred by TCP. Remote resource management, on-demand image transmission selection, and flexible multi-camera acquisition control were implemented to further reduce storage space and transmission bandwidth usage. Experiments validated the system’s capability to operate normally at -50 degrees Celsius and 5 kPa pressure, while confirming the enhanced stability, reliability, and efficiency brought by the aforementioned design.
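A hedged sketch of time-keyed logarithmic retrieval: the system uses a B+Tree, while here Python's stdlib `bisect` on a sorted timestamp list stands in to illustrate the O(log n) lookup that replaces a linear scan. All names are illustrative.

```python
import bisect

timestamps = [0.0, 0.041, 0.083, 0.125, 0.166]   # acquisition times (primary key)
records = ["frame0", "frame1", "frame2", "frame3", "frame4"]

def query_range(t0, t1):
    """Return all frames acquired in [t0, t1] in O(log n + k)."""
    lo = bisect.bisect_left(timestamps, t0)
    hi = bisect.bisect_right(timestamps, t1)
    return records[lo:hi]

print(query_range(0.04, 0.13))                   # ['frame1', 'frame2', 'frame3']
```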
Spectral images provide rich spatial and spectral information, enabling quantitative analysis of the same material and qualitative analysis of different materials. Owing to the outstanding material identification capability of spectral technology, it is widely used in high-precision target detection tasks in complex scenes. Adversarial-sample attack techniques are advancing rapidly; however, most research on adversarial attacks in object detection has focused on RGB three-channel images, and the exploration of adversarial techniques for object detection in multi-channel spectral images is still in its early stages. In this work, we propose a method for adversarial sample generation based on spectral images, which is a black-box, targeted attack. The reported technique, named Spectral Detection Adversary (SDA), causes spectral-image object detection networks to misclassify camouflage targets as real targets. We introduce a spectral analysis and comparison method to distinguish between real targets and camouflage targets. Additionally, we propose a spectral-dimension encoding method for various categories of real and camouflaged targets, thereby disrupting the adversary’s spectral-image object detection network. In the most effective group, experiments showed a reduction of more than 60% in the recall of camouflaged targets and a decrease of over 30% in the precision of real targets.
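An illustrative sketch of the kind of per-pixel spectral comparison used to distinguish real from camouflaged targets: the spectral angle between two spectra, a common similarity metric. The paper's exact comparison method is not specified here; values and names are toy examples.

```python
import numpy as np

def spectral_angle(s1, s2):
    """Angle (radians) between two spectral vectors; 0 means identical shape."""
    cos = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return np.arccos(np.clip(cos, -1.0, 1.0))

real = np.array([0.10, 0.35, 0.60, 0.55, 0.20])        # real-target spectrum
camo = np.array([0.12, 0.30, 0.40, 0.65, 0.45])        # camouflage spectrum
print(f"spectral angle: {spectral_angle(real, camo):.3f} rad")
```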
In recent years, deep learning has exhibited remarkable performance in image classification. Nevertheless, traditional deep-learning-based techniques heavily depend on the availability of high-quality images for conveying information. This reliance results in inefficient utilization of hardware and software resources across various stages, including image acquisition, storage, and processing. Additionally, these techniques often require substantial amounts of data to learn the underlying mapping effectively, posing challenges in practical scenarios where acquiring a sufficient volume of paired data proves difficult. In this paper, we introduce a novel approach for image-free few-shot recognition using a single-pixel detector. Our method comprises two fundamental stages. First, we design a neural network that integrates encoding and decoding modules, which learns optimized encoding masks based on statistical priors. Second, we employ these optimized masks to generate compressed 1D measurements, which are fed into the classification network, preceded by the decoding module trained in the first stage; the parameters of this decoding module serve as the initialization for the second stage of training. Furthermore, we incorporate a meta-training strategy, commonly used in few-shot classification, to mitigate dataset requirements during the second stage of training. Simulation results illustrate the effectiveness of our approach for image-free classification directly from 1D measurements, bypassing the time-consuming image reconstruction process. Our technique achieves a substantial reduction in data volume by two orders of magnitude while relying on only a limited number of paired data samples.
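A minimal sketch of N-way K-shot episode construction, the standard sampling scheme behind the meta-training stage mentioned above; here the episodes are built directly from 1D single-pixel measurements. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(measurements_by_class, n_way=5, k_shot=1, k_query=5):
    """measurements_by_class: dict {label: (num_samples, n_meas) array of 1D
    measurements}. Returns support and query sets for one training episode."""
    classes = rng.choice(list(measurements_by_class), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        pool = measurements_by_class[c]
        idx = rng.choice(len(pool), size=k_shot + k_query, replace=False)
        support.append(pool[idx[:k_shot]])
        query.append(pool[idx[k_shot:]])
    return np.stack(support), np.stack(query)

data = {c: rng.random((20, 92)) for c in range(10)}   # toy: 10 classes, 92 meas.
s, q = sample_episode(data)
print(s.shape, q.shape)                               # (5, 1, 92) (5, 5, 92)
```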
Infrared imaging is widely used in low-light environments, remote sensing, and camouflage recognition due to its high sensitivity, fast response, and all-weather adaptability. However, conventional infrared cameras are limited by their high cost per pixel, low resolution, and low contrast, resulting in the loss of valuable scene information. To address this challenge, we report a method for fusing high-resolution visible-light images with infrared images acquired by a single-pixel detector. Our method consists of two modules: a single-pixel imaging (SPI) module and a fusion module. The SPI module uses a spatial light modulation system to obtain one-dimensional measurements of the infrared scene and then applies a reconstruction algorithm to recover a high-quality infrared single-pixel image at a low sampling rate, avoiding the need for costly infrared cameras. The fusion module combines the infrared single-pixel image with a high-resolution visible image to obtain a hybrid image that merges the information of interest from both, thereby achieving low-cost imaging that preserves the detailed information of visible images while maintaining the salient features of infrared images. A series of simulations demonstrates that our method benefits from the high resolution and high contrast of visible-light imaging and the high sensitivity of infrared imaging. Our technique has the potential to improve complex-environment detection, nighttime detection, and covert detection.
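A toy fusion sketch under stated assumptions: the paper's fusion module is learned, whereas here a fixed pixel-wise weighting merely illustrates combining visible detail with infrared salience after upsampling the low-resolution infrared image. All names and weights are illustrative.

```python
import numpy as np

def fuse(visible, infrared, alpha=0.6):
    """Pixel-wise convex combination after nearest-neighbor upsampling."""
    scale = visible.shape[0] // infrared.shape[0]
    ir_up = np.kron(infrared, np.ones((scale, scale)))   # upsample via np.kron
    return alpha * visible + (1 - alpha) * ir_up

vis = np.random.rand(256, 256)      # high-resolution visible image
ir = np.random.rand(64, 64)         # low-resolution infrared single-pixel image
print(fuse(vis, ir).shape)          # (256, 256)
```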
Digital holography (DH) is a dependable method for observing micro-nano structures and 3D distributions by combining amplitude and phase information. In recent years, pixel super-resolved (PSR) techniques have improved the space-bandwidth product (SBP) of holography. By introducing measurement diversity, PSR phase retrieval approaches are able to overcome the low-resolution issues caused by limited sensor pixels. However, existing wavefront-modulation PSR techniques usually require dozens or even hundreds of randomly generated phase masks, resulting in time-consuming measurement and reconstruction. Reducing the amount of data saves time but leads to poor accuracy and noise robustness. In this paper, we propose a novel PSR holography method with complementary patterns. Specifically, we use a pair of patterns whose values are exactly complementary, while the other masks are randomly generated 0-1 phase patterns. With this pair, the integrity of the target information contained in the diffraction intensity data set is guaranteed. In addition, the method can effectively improve resolution with limited data, speeding up the measurement and reconstruction process. A series of simulations demonstrates the effectiveness of complementary patterns, achieving more than a 3 dB enhancement in PSNR compared with purely random phase patterns.
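A sketch of the complementary pattern pair: one random binary (0-1) phase mask plus its exact complement, so every pixel is modulated in at least one of the two exposures and the target information is fully covered. Sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
mask = rng.integers(0, 2, size=(128, 128))   # random 0-1 phase pattern
complement = 1 - mask                        # exactly complementary pattern

assert np.all(mask + complement == 1)        # together they cover every pixel
# The remaining masks in the sequence are generated randomly, as in the paper.
```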
KEYWORDS: Neural networks, Wavefronts, Coherence imaging, Biological imaging, Data modeling, Holography, Convolution, Super resolution, Image restoration, Education and training
Large-scale computational imaging can provide a remarkable space-bandwidth product beyond the limit of optical systems. In coherent imaging (CI), the joint reconstruction of amplitude and phase further expands the information throughput and sheds light on label-free observation of biological samples at the micro- or even nano-scale. Existing large-scale CI techniques usually require multiple scans/modulations to guarantee measurement diversity and long exposure times to achieve a high signal-to-noise ratio. Such cumbersome procedures restrict clinical applications that require rapid, low-phototoxicity cell imaging. In this work, a complex-domain-enhancing neural network for large-scale CI, termed CI-CDNet, is proposed for various large-scale CI modalities with satisfactory reconstruction quality and efficiency. CI-CDNet exploits the latent coupling information between amplitude and phase (such as their shared features), realizing multidimensional representations of the complex wavefront. The cross-field characterization framework empowers strong generalization and robustness across coherent modalities, allowing high-quality and efficient imaging under extremely short exposure time and small data volume. We apply CI-CDNet to various large-scale CI modalities, including Kramers–Kronig-relations holography, Fourier ptychographic microscopy, and lensless coded ptychography. A series of simulations and experiments validates that CI-CDNet can reduce exposure time and data volume by more than one order of magnitude. We further demonstrate that the high-quality reconstruction of CI-CDNet benefits subsequent high-level semantic analysis.
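A hedged sketch of the complex-domain representation idea: amplitude and phase stacked as channels so a real-valued network can exploit their coupling. The actual CI-CDNet architecture is more elaborate; the function name is illustrative.

```python
import torch

def to_two_channel(field):
    """field: complex tensor (B, H, W) -> real tensor (B, 2, H, W)."""
    return torch.stack((field.abs(), field.angle()), dim=1)

wavefront = torch.randn(1, 64, 64, dtype=torch.complex64)
x = to_two_channel(wavefront)     # joint amplitude-phase input to the network
print(x.shape)                    # torch.Size([1, 2, 64, 64])
```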
Broadband multispectral filter arrays (BMSFAs) have emerged as an attractive alternative for spectral imaging due to their compactness and high light throughput. A BMSFA compresses the multispectral data cube into 2D measurements, from which the data cube is reconstructed using the pre-calibrated spectral response. In practice, the BMSFA spectral response is usually calibrated wavelength-by-wavelength using costly ultra-narrowband filters. In addition, this process introduces noise and inter-spectral crosstalk that can severely degrade reconstruction quality. In this work, we report a novel deep-learning-based calibration technique called deep calibration. The technique generates varied spectral illumination and collects sets of BMSFA camera measurements together with the corresponding true spectra, yielding a more accurate characterization of the BMSFA’s spatial-spectral modulation. Furthermore, a reconstruction network with a hybrid CNN-ViT architecture is employed to learn the demodulation process from the collected dataset. Using this network as a decoder, the scene’s hyperspectral data can then be accurately reconstructed from the measurements. Extensive experiments validated that the reported technique performs with high efficiency and accuracy in both calibration and reconstruction of BMSFA imaging.
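A sketch of the BMSFA forward model that the calibration must characterize: each 2D measurement pixel integrates the scene spectrum weighted by that pixel's broadband filter response. The symbols (`S`, `x`, `y`) and sizes are illustrative.

```python
import numpy as np

H, W, L = 32, 32, 61                    # spatial size, number of spectral bands
x = np.random.rand(H, W, L)             # multispectral data cube of the scene
S = np.random.rand(H, W, L)             # per-pixel spectral response (calibrated)

y = np.sum(S * x, axis=-1)              # 2D compressed measurement, shape (H, W)
```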
Poor lighting conditions in the real world lead to ill-exposed captured images, which suffer from compromised aesthetic quality and information loss for post-processing. Recent exposure correction works address this problem by learning the mapping from images of multiple exposure intensities to well-exposed images. However, this requires a large amount of paired training data, which is hard to obtain in data-inaccessible scenarios. This paper presents a highly robust exposure correction method based on self-supervised learning. Specifically, two sub-networks are designed to deal with under- and over-exposed regions in ill-exposed images, respectively; this hybrid architecture enables adaptive ill-exposure correction. A fusion module then fuses the under-exposure-corrected and over-exposure-corrected images to obtain a well-exposed image with vivid color and clear textures. Notably, the training process is guided by histogram-equalized images through a histogram equalization prior (HEP), which means the presented method requires only ill-exposed images as training data. Extensive experiments on real-world image datasets validate the robustness and superiority of this technique.
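A sketch of the histogram equalization prior (HEP): the self-supervised guidance is the histogram-equalized version of the ill-exposed input itself, so no paired well-exposed data are needed. A pure-NumPy equalization is shown for illustration.

```python
import numpy as np

def equalize(img):
    """img: uint8 grayscale array -> histogram-equalized uint8 array."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())     # normalize to [0, 1]
    return (cdf[img] * 255).astype(np.uint8)

under_exposed = (np.random.rand(64, 64) * 60).astype(np.uint8)   # dark image
target = equalize(under_exposed)     # guidance image for the self-supervised loss
```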
Deep learning shows great potential for super-resolution microscopy, offering visualization of biological structures with unprecedented detail and high flexibility. An effective pathway toward this goal is structured illumination microscopy (SIM) augmented by deep learning, owing to its ability to double the resolution beyond the light diffraction limit in real time. Although deep-learning-based SIM works effectively, it is generally a black box whose latent principles are difficult to explain; the generated super-resolution biological structures may therefore contain unreliable information for clinical diagnosis. This limitation impedes further applications in safety-critical fields like medical imaging. In this paper, we report a reliable deep-learning-based SIM technique with uncertainty maps. These uncertainty maps characterize imperfections arising from various disturbances, such as measurement noise, model error, incomplete training data, and out-of-distribution testing data. Specifically, we employ a Bayesian convolutional neural network to quantify uncertainty and explore its application in SIM. The backbone of the reported network combines U-Net and ResNet, taking three low-resolution images from different structured illumination angles as inputs. The outputs are high-resolution images with doubled resolution beyond the numerical aperture, together with pixel-wise confidence intervals quantifying the reconstructed images. A series of simulations and experiments validates that the reported uncertainty quantification framework offers reliable uncertainty maps and high-fidelity super-resolution images. Our work may promote practical applications of deep-learning-based super-resolution microscopy.
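A hedged sketch of pixel-wise uncertainty via Monte Carlo sampling of a dropout-based network, a standard Bayesian approximation; the paper's Bayesian CNN details may differ. `model` is any network with dropout layers; the dummy model below is illustrative.

```python
import torch

def mc_uncertainty(model, x, n_samples=32):
    """Return mean prediction and per-pixel std over stochastic forward passes."""
    model.train()                      # keep dropout active at test time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(0), preds.std(0)   # reconstruction, uncertainty map

model = torch.nn.Sequential(torch.nn.Conv2d(3, 1, 3, padding=1),
                            torch.nn.Dropout2d(0.2))
# Three channels mimic the three structured-illumination-angle inputs.
mean, std = mc_uncertainty(model, torch.rand(1, 3, 64, 64))
```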
The acquisition time of Fourier single-pixel imaging (FSI) is tied to the number of modulations, imposing a tradeoff between efficiency and accuracy. This work reports a mathematical analytic tool for efficient sparse FSI sampling: an efficient and adjustable sampling strategy that captures more scene information with fewer modulations. Specifically, we first conduct a statistical importance ranking of the Fourier coefficients of natural images. We then design a sparse sampling strategy for FSI with a polynomially decaying probability over the ranking; the sparsity of the captured Fourier spectrum can be adjusted by altering the polynomial order. We utilize a compressive sensing (CS) algorithm for sparse FSI reconstruction. From quantitative results, we obtain empirical rules for the optimal sparsity of FSI under different noise levels and sampling ratios.
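A sketch of the polynomially decaying sampling strategy: Fourier coefficients are ranked by statistical importance, then drawn with probability proportional to rank^(-alpha), where alpha (the polynomial order) tunes the sparsity. Names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_sample(ranking, n_select, alpha=2.0):
    """ranking: coefficient indices sorted from most to least important."""
    ranks = np.arange(1, len(ranking) + 1, dtype=float)
    p = ranks ** (-alpha)                 # polynomially decaying probability
    p /= p.sum()
    chosen = rng.choice(len(ranking), size=n_select, replace=False, p=p)
    return [ranking[i] for i in chosen]

ranking = list(np.ndindex(64, 64))               # toy importance order
selected = sparse_sample(ranking, n_select=164)  # ~4% sampling ratio
```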
Single-pixel imaging uses a single-pixel detector to capture all photons emitted from a two-dimensional scene, and then reconstructs the scene image from the one-dimensional measurement data and the corresponding illumination coding using single-pixel reconstruction methods such as linear superposition, compressed sensing, or deep learning. Compared with traditional cameras, single-pixel imaging has the advantages of high signal-to-noise ratio and wide spectrum, and has therefore been widely used in multispectral imaging. However, traditional single-pixel image reconstruction methods suffer from low resolution, long reconstruction times, and poor reconstruction quality. In this paper, we propose a single-pixel image reconstruction method based on a neural network, which achieves better reconstruction quality at lower sampling rates than traditional methods. Specifically, we first use a small set of optimized patterns to simulate a single-pixel camera sampling the image and obtain the measurement values, and then extract multi-channel high-dimensional semantic features from these values through a high-dimensional semantic feature extraction network. A multi-scale residual network module is then used to construct a feature-pyramid upsampling module that upsamples the high-dimensional semantic features. During training, the network parameters and patterns are jointly optimized to obtain the optimal network model and patterns. With the help of large-scale pre-training, our reconstructed images have higher resolution, shorter reconstruction times, and better reconstruction quality.
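A minimal sketch of the joint optimization described above: the modulation patterns (encoder weights) and the reconstruction network are trained end-to-end against the ground-truth image. The decoder here is a stand-in, not the paper's feature-pyramid network; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

n_pix, n_meas = 64 * 64, 40
encoder = nn.Linear(n_pix, n_meas, bias=False)        # learnable patterns
decoder = nn.Sequential(nn.Linear(n_meas, 256), nn.ReLU(),
                        nn.Linear(256, n_pix))        # stand-in reconstructor
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)

for step in range(100):
    img = torch.rand(8, n_pix)                        # stand-in training batch
    loss = nn.functional.mse_loss(decoder(encoder(img)), img)
    opt.zero_grad(); loss.backward(); opt.step()      # patterns + net updated jointly
```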
Conventional high-level sensing techniques require high-fidelity images to extract visual features, incurring high software or hardware complexity. We present a single-pixel sensing (SPS) technique that performs high-level sensing directly from a small number of coupled single-pixel measurements, without the conventional image acquisition and reconstruction process. The technique consists of three steps: binarized light modulation, single-pixel coupled detection, and end-to-end deep-learning-based decoding. The binarized modulation patterns are optimized together with the decoding network by a two-step training strategy, leading to the fewest required measurements and optimal sensing accuracy. The effectiveness of SPS is experimentally demonstrated on the classification task of the handwritten MNIST dataset, achieving 96% classification accuracy at ∼1 kHz. The reported SPS technique is a novel framework for efficient machine intelligence with low hardware and software complexity. Furthermore, the coupled measurements inherently encrypt the target information.
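A hedged sketch of binarized pattern optimization with a straight-through estimator, a common trick for training binary weights; the paper's two-step training strategy may differ in detail. Sizes and names are illustrative.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return (w > 0).float()            # binary {0,1} pattern in the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                   # identity gradient in the backward pass

w = torch.randn(40, 28 * 28, requires_grad=True)    # real-valued latent patterns
patterns = BinarizeSTE.apply(w)                     # binarized modulation patterns
loss = patterns.sum(); loss.backward()              # gradients still flow to w
```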
Phase imaging observes the phase of light that has interacted with the target. Conventional phase imaging methods such as interferometry employ two-dimensional sensors for image capture, resulting in a limited spectral range and low signal-to-noise ratio (SNR). Single-pixel imaging (SPI) provides an alternative for high-SNR acquisition of target information over a wide spectral range. However, conventional SPI can only reconstruct light intensity without phase, and existing phase imaging methods using a single-pixel detector require phase modulation, leading to low light efficiency, slow modulation speed, and poor noise robustness. In this paper, we propose a novel single-pixel phase imaging method without phase modulation. First, binary intensity modulation is applied, providing a simplified optical setup and high light efficiency. Second, inspired by phase-retrieval theory, we derive a joint optimization algorithm to reconstruct both the amplitude and phase information of the target from the intensity measurements collected by a single-pixel detector. Both simulations and experiments demonstrate that the proposed method offers high SNR, high frame rate, wide spectral range (UV+VIS+NIR), and strong noise robustness. The method can be widely applied in optics, materials science, and life science.
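A gradient-descent sketch of the joint amplitude-phase recovery idea, assuming a coherent single-pixel model y_i = |&lt;P_i, U&gt;|^2 with binary intensity patterns P_i. The paper's algorithm is derived analytically; this autograd version only illustrates the optimization objective (and, like any phase retrieval, recovers U up to a global phase).

```python
import torch

n_pix, n_meas = 32 * 32, 2048
P = (torch.rand(n_meas, n_pix) > 0.5).float()          # binary intensity patterns
true_U = torch.exp(1j * torch.rand(n_pix) * 3.14)      # ground-truth complex field
y = (P.to(torch.complex64) @ true_U).abs() ** 2        # single-pixel intensities

amp = torch.ones(n_pix, requires_grad=True)
phase = torch.zeros(n_pix, requires_grad=True)
opt = torch.optim.Adam([amp, phase], lr=0.05)
for _ in range(500):
    U = amp * torch.exp(1j * phase)                    # current field estimate
    loss = ((P.to(torch.complex64) @ U).abs() ** 2 - y).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```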
Demosaicing is an essential technique in filter-array (FA) based color and multispectral imaging. It aims to recover missing pixels in different spectral bands. Existing methods are limited to specific FAs and local regularization. To enhance generalization across different FA structures and improve reconstruction quality, we present a non-local low-rank regularized demosaicing method based on the non-local grouped sparsity of natural images. Specifically, the optimization model consists of two parts: a fidelity term derived from the image formation model, and a low-rank term on non-local groups of similar image patches. Together, the two terms remove noise and distortion while preserving image details. The model is solved by weighted nuclear norm minimization within the alternating direction method of multipliers (ADMM) framework. Experiments validate that the proposed algorithm generalizes well across different FA patterns and channel numbers, and improves reconstruction accuracy compared with existing demosaicing algorithms.
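A sketch of the weighted nuclear norm proximal step at the core of such solvers: singular values of a matrix of grouped similar patches are soft-thresholded with weights inversely proportional to their magnitude, so larger (signal-carrying) singular values shrink less. Variable names and the weighting constant are illustrative.

```python
import numpy as np

def wnnm_shrink(group, c=1.0, eps=1e-8):
    """group: (patch_dim, n_patches) matrix of similar patches."""
    U, s, Vt = np.linalg.svd(group, full_matrices=False)
    weights = c / (s + eps)                 # larger singular values shrink less
    s_hat = np.maximum(s - weights, 0.0)    # weighted soft-thresholding
    return (U * s_hat) @ Vt                 # low-rank estimate of the group

noisy_group = np.random.rand(64, 30)
low_rank = wnnm_shrink(noisy_group, c=0.5)
```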
Conventional single-pixel imaging (SPI) is unable to directly obtain a target’s depth information due to the lack of depth modulation and corresponding decoding. Existing SPI-based depth imaging systems utilize multiple single-pixel detectors to capture multi-angle images, or introduce depth modulation devices such as optical gratings to achieve three-dimensional imaging; these methods require bulky systems and high computational complexity. In this paper, we present a novel and efficient three-dimensional SPI method that requires no additional hardware compared to a conventional SPI system. Specifically, we propose a multiplexing illumination strategy combining random and sinusoidal patterns, which simultaneously encodes the target’s spatial and depth information into a measurement sequence captured by a single-pixel detector. To decode the three-dimensional information from the one-dimensional measurements, we build and train a deep convolutional neural network. The end-to-end framework greatly accelerates reconstruction, reduces computational complexity, and improves reconstruction precision. Both simulations and experiments validate the method’s effectiveness and efficiency for depth imaging.
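A sketch of the multiplexing illumination idea: each projected pattern combines a random component (encoding spatial detail) with a sinusoidal fringe (whose deformation carries depth cues). The mixing weights, sizes, and fringe frequency are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, freq = 128, 128, 8
xx = np.arange(W) / W

random_part = rng.random((H, W))                           # spatial encoding
sinusoid = 0.5 + 0.5 * np.cos(2 * np.pi * freq * xx)[None, :].repeat(H, axis=0)
pattern = 0.5 * random_part + 0.5 * sinusoid               # multiplexed, in [0, 1]
```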
Bundle adjustment (BA) is an important task for feature matching in applications such as image stitching and position mapping. It aims to reconstruct the 8-parameter homography matrix used for perspective transformation among different images. Existing algorithms such as the Levenberg-Marquardt (LM) algorithm and the Gauss-Newton (GN) algorithm require extensive computation and a large number of iterations. To accelerate reconstruction, we propose a novel BA algorithm based on adaptive moment estimation (Adam). The Adam solver uses the mean and uncentered variance of the gradients from previous iterations to dynamically adjust the gradient direction of the current iteration, which improves reconstruction quality and increases convergence speed. Besides, it requires only first-derivative calculations and thus has low computational complexity. Both simulations and experiments validate that the proposed method converges faster than conventional BA methods.
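A hedged sketch of Adam-based homography refinement: the eight free parameters of H (with H[2,2] fixed to 1) are optimized to minimize the reprojection error between matched points. This illustrates the solver choice, not the paper's full BA pipeline; data and names are synthetic.

```python
import torch

def warp(h8, pts):
    """Apply the homography parameterized by 8 values to (N, 2) points."""
    H = torch.cat([h8, torch.ones(1)]).reshape(3, 3)
    p = torch.cat([pts, torch.ones(len(pts), 1)], dim=1) @ H.T
    return p[:, :2] / p[:, 2:3]

src = torch.rand(50, 2)                         # matched keypoints, image 1
dst = warp(torch.tensor([1.1, 0.02, 3.0, -0.01, 0.95, -2.0, 1e-4, 2e-4]), src)

h8 = torch.tensor([1., 0., 0., 0., 1., 0., 0., 0.], requires_grad=True)  # identity
opt = torch.optim.Adam([h8], lr=1e-2)           # first-order solver, as in the paper
for _ in range(2000):
    loss = (warp(h8, src) - dst).pow(2).mean()  # reprojection error
    opt.zero_grad(); loss.backward(); opt.step()
```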
Existing multispectral imagers mostly use 2D array sensors to separately measure 2D data slices of a 3D spatial-spectral data cube. They suffer from low photon efficiency, limited spectral range, and high cost. To address these issues, we propose to conduct multispectral imaging using a photodiode, taking full advantage of its high sensitivity, wide spectral range, low cost, and small size. Specifically, utilizing the photodiode’s fast response, a scene’s 3D spatial-spectral information is sinusoidally multiplexed into a dense 1D measurement sequence and then demultiplexed computationally under the single-pixel imaging scheme. A proof-of-concept setup is built to capture multispectral data of 256 pixels × 256 pixels × 10 wavelength bands ranging from 450 nm to 650 nm. The imaging scheme holds great potential for various biological applications such as fluorescence microscopy and endoscopy.
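A sketch of the frequency-multiplexing idea: if each wavelength band is modulated at a distinct sinusoidal frequency, the dense 1D photodiode sequence can be demultiplexed per band from its Fourier component. Frequencies, rates, and intensities are illustrative.

```python
import numpy as np

fs, T = 1000.0, 1.0                              # sampling rate (Hz), duration (s)
t = np.arange(0, T, 1 / fs)
band_freqs = [50, 60, 70]                        # one carrier frequency per band
band_intensity = [0.8, 0.3, 0.5]                 # scene intensity in each band

signal = sum(a * np.cos(2 * np.pi * f * t)       # 1D photodiode measurement
             for a, f in zip(band_intensity, band_freqs))
spectrum = np.fft.rfft(signal) / len(t) * 2      # amplitude spectrum
freqs = np.fft.rfftfreq(len(t), 1 / fs)
for f in band_freqs:                             # recovers ~0.8, 0.3, 0.5
    print(f, abs(spectrum[np.argmin(abs(freqs - f))]))
```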
Conventional multispectral imaging methods detect photons of a 3D hyperspectral data cube separately in either the spatial or spectral dimension using array detectors, and are thus photon-inefficient and limited in spectral range. Besides, they are usually bulky and expensive. To address these issues, this paper presents single-pixel multispectral imaging techniques featuring high sensitivity, wide spectral range, low cost, and light weight. Two mechanisms are proposed, and experimental validations are reported.
Optical coherence tomography (OCT) is an important interferometric diagnostic technique that provides cross-sectional views of the subsurface microstructures of biological tissues. However, the imaging quality of high-speed OCT is limited by large speckle noise. To address this problem, we propose a multiframe algorithmic method to denoise OCT volumes. Mathematically, we build an optimization model that forces the temporally registered frames to be low-rank and the gradient in each frame to be sparse, under constraints from the logarithmic image formation and nonuniform noise variance. In addition, a convex optimization algorithm based on the augmented Lagrangian method is derived to solve this model. Results reveal that our approach outperforms other methods in terms of both speckle-noise suppression and crucial detail preservation.
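A schematic form of the optimization model described above, with illustrative notation (the paper's exact formulation may differ): registered log-domain frames are encouraged to be low-rank across time and piecewise-smooth within each frame, with a fidelity term weighted by the nonuniform noise variance.

```latex
% X_k: k-th registered log-domain frame; Y_k = log(I_k): log of the k-th
% measured frame; W_k: weights from the nonuniform noise variance map.
\min_{\{X_k\}} \; \sum_k \big\| W_k \odot (Y_k - X_k) \big\|_F^2
 + \lambda_1 \, \mathrm{rank}\!\big( [\,\mathrm{vec}(X_1),\dots,\mathrm{vec}(X_K)\,] \big)
 + \lambda_2 \sum_k \big\| \nabla X_k \big\|_1
```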
Capturing four-dimensional light field data sequentially using a coded aperture camera is an effective approach but suffers from a low signal-to-noise ratio. Although multiplexing can help raise acquisition quality, noise remains a major issue, especially for fast acquisition. To address this problem, this paper proposes a noise-robust light field reconstruction method. First, a scene-dependent noise model is studied and incorporated into the light field reconstruction framework. Then, we derive an optimization algorithm for the final reconstruction. We build a prototype by hacking an off-the-shelf camera for data capture and prove the concept. The effectiveness of this method is validated with experiments on real captured data.
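A hedged sketch of noise-aware multiplexed recovery: measurements y = Ax + n with signal-dependent noise are inverted with a weighted least-squares term, the weights coming from the scene-dependent noise model. A ridge regularizer stands in for the paper's prior; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rays, n_meas = 64, 32
A = rng.integers(0, 2, size=(n_meas, n_rays)).astype(float)   # coded-aperture mux
x_true = rng.random(n_rays)
y_clean = A @ x_true
var = 0.01 + 0.05 * y_clean                      # signal-dependent noise variance
y = y_clean + rng.normal(0, np.sqrt(var))        # noisy multiplexed measurements

W = np.diag(1.0 / var)                           # weights from the noise model
lam = 1e-2                                       # ridge stand-in for the prior
x_hat = np.linalg.solve(A.T @ W @ A + lam * np.eye(n_rays), A.T @ W @ y)
```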