This PDF file contains the front matter associated with SPIE Proceedings Volume 11911, including the Title Page, Copyright information, and Table of Contents.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Texture recognition studies texture over a large area; the goal is to classify the textures in the area and determine the category to which they belong. The basic procedure is to extract texture features with a feature extraction method and then classify them with an appropriate classifier. In this paper, Principal Component Analysis (PCA) is first used to reduce the feature dimension of the original image, yielding lower-dimensional texture features. The features are then fused with a coarse-to-fine strategy, and finally the texture is classified by a double probabilistic neural network. Experimental results show that, compared with common classification methods, the proposed texture image recognition method improves the correct recognition rate.
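The PCA step can be sketched as centering the feature matrix and projecting onto its top singular vectors. This is a minimal sketch of dimensionality reduction alone, not the paper's full pipeline, which follows PCA with coarse-to-fine fusion and a probabilistic neural network classifier:

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project feature vectors onto their top principal components.

    features: (n_samples, n_dims) array of texture feature vectors.
    Returns the (n_samples, n_components) reduced representation.
    """
    # Center the data so the decomposition is about the mean.
    centered = features - features.mean(axis=0)
    # SVD of the centered matrix gives the principal axes in vt,
    # sorted by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]   # top-k principal directions
    return centered @ components.T   # low-dimensional projection

# Example: 100 texture vectors of dimension 64 reduced to 8 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
Z = pca_reduce(X, 8)
```

Because the singular values are sorted, the first reduced coordinate always carries at least as much variance as the second.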
Nowadays, with the rapid expansion of video information, labeling video quickly and effectively and rapidly locating its key content is a major challenge in multimedia processing and computer vision. Advanced digital equipment and ever-multiplying media channels produce a large amount of multimedia information every day, and the spread of broadcast communication and Internet technology allows this information to be disseminated, shared, and applied worldwide. With the increasing ubiquity of modern video technology, the study of sports visual culture requires analyzing the various visual images produced in the field of sports. This paper proposes a moving-image classification method based on visual attention analysis, which greatly reduces the amount of data to be processed and improves classification efficiency.
In this paper, we first show that the region proposal network (RPN) has several defects, such as performance that depends on preset parameters and poor adaptability to different tasks. Second, we integrate the idea of adaptive training sample selection (ATSS) into the RPN to define samples adaptively, and add a centerness branch to suppress low-quality anchors. We also merge the branch structure before the classification and regression tasks to further improve the performance of ATSS_RPN. As demonstrated in our experiments, the proposed ATSS_RPN achieves a 7.5% higher recall than the original RPN. We further verify our scheme on two classic optical aerial image datasets: ATSS_RPN delivers considerably better performance than the original RPN on four mainstream two-stage detectors (including the state-of-the-art DetectoRS), with 7% higher mAP on Airbus Ship and 1.5% higher mAP on VisDrone2019 when Faster R-CNN is used as the baseline.
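The core of adaptive sample definition in ATSS is a statistics-driven IoU threshold instead of a preset one. A minimal sketch of that idea follows; candidate pre-selection by center distance and the centerness branch are omitted, so this is illustrative only:

```python
import numpy as np

def atss_select(ious):
    """Adaptive Training Sample Selection for one ground-truth box.

    ious: IoU of each candidate anchor with the ground-truth box
    (candidates are assumed pre-filtered by centre distance).
    The positive/negative threshold adapts to the statistics of the
    candidate set rather than being a fixed hand-tuned value.
    """
    ious = np.asarray(ious, dtype=float)
    thr = ious.mean() + ious.std()       # adaptive IoU threshold
    return np.flatnonzero(ious >= thr)   # indices kept as positives

# Candidates with mostly low IoU: only the clearly overlapping
# anchors pass the adaptive threshold.
pos = atss_select([0.1, 0.15, 0.2, 0.7, 0.75])
```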
With the continuous development of science and technology, computer technology has made people's way of life more diversified and intelligent. Computer image recognition technology in particular has been widely applied across industries, and its importance has become a hallmark of the information era. In China, computer image recognition technology occupies a significant place in overall strategic planning for science and technology; its development not only strengthens the country's position in international science and technology but also promotes progress and application in many fields. How can image recognition technology be comprehensively promoted and applied? This requires thorough and in-depth research that grasps its principles, characteristics, prospects, and applications.
This paper proposes an infrared and visible image fusion algorithm based on a three-layer guided filter and a Composition Analysis Convolutional Neural Network (CACNN), addressing the problems of coarse background information and lost detail in images fused by traditional infrared and visible fusion algorithms. First, the original image is decomposed into a base layer and a detail layer through the multi-scale decomposition of the three-layer guided filter. Then, the CACNN model and a regional energy method are used to guide the fusion of the base and detail layers. Finally, the fused image is obtained by merging the fused base and detail layers. Experimental verification shows that, compared with similar algorithms, the proposed model substantially improves evaluation metrics such as Average Gradient (AG) and Spatial Frequency (SF), and better retains and presents the detailed texture information of the images.
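The base/detail split in the first step can be sketched with a single gray-scale guided filter (the paper uses a three-layer variant, so this is a simplified stand-in). A minimal self-guided implementation, with `box` as a plain mean filter:

```python
import numpy as np

def box(img, r):
    """Mean filter over a (2r+1) x (2r+1) window (edge-padded)."""
    pad = np.pad(img, r, mode='edge')
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += pad[dy:dy + h, dx:dx + w]
    return out / (2 * r + 1) ** 2

def guided_filter(I, p, r=2, eps=1e-3):
    """Gray-scale guided filter (He et al.): local linear model p ~ a*I + b."""
    mean_I, mean_p = box(I, r), box(p, r)
    var_I = box(I * I, r) - mean_I ** 2
    cov_Ip = box(I * p, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box(a, r) * I + box(b, r)

# Self-guided filtering yields a smooth base layer; the residual is
# the detail layer, as in the decomposition stage of the pipeline.
rng = np.random.default_rng(0)
img = rng.random((16, 16))
base = guided_filter(img, img)
detail = img - base
```

By construction the two layers reconstruct the input exactly, which is what lets the fused base and detail layers be merged back into a single image.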
Document images frequently appear in digital libraries, social media, e-mail, and elsewhere. Duplicate copies of the same content burden management systems and waste network traffic and storage resources. This paper proposes a new algorithm for detecting duplicate document images in large-scale image datasets. The key idea is to exploit the structured nature of document images arising from page layout: text lines are extracted as the elementary features of the document image, and the Fréchet distance is introduced to measure the similarity of these features. Experimental results on different types of electronic documents show the advantages of the proposed algorithm in accuracy and stability.
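The discrete Fréchet distance between two extracted text-line curves has a simple dynamic-programming form (Eiter and Mannila). A minimal sketch on 2-D polylines, not tied to the paper's text-line extraction:

```python
from math import hypot

def frechet(P, Q):
    """Discrete Fréchet distance between two polylines P and Q
    (lists of (x, y) points), via the classic DP recurrence."""
    n, m = len(P), len(Q)
    d = lambda i, j: hypot(P[i][0] - Q[j][0], P[i][1] - Q[j][1])
    ca = [[0.0] * m for _ in range(n)]
    ca[0][0] = d(0, 0)
    for i in range(1, n):                 # first column
        ca[i][0] = max(ca[i - 1][0], d(i, 0))
    for j in range(1, m):                 # first row
        ca[0][j] = max(ca[0][j - 1], d(0, j))
    for i in range(1, n):
        for j in range(1, m):
            ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1],
                               ca[i][j - 1]), d(i, j))
    return ca[n - 1][m - 1]

# Two parallel text-line baselines one unit apart.
dist = frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)])
```

Unlike a pointwise distance, this coupling-based measure respects the ordering of points along each line, which suits comparing layout structures.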
With the development of computer technology, the theory of color systems has advanced rapidly. Color can be described by specific element values, which defines a virtual vectorized color space. At the same time, researchers at home and abroad have gradually realized that perceptual color factors, and their quantification in applications, are increasingly important in color design. This provides theoretical guidance for in-depth study of color imagery in visual images and for exploring the cognitive processes behind color impressions. This paper studies image color design strategy from a computational perspective, moving gradually from designers' traditional experience toward computed color-image strategies: by computationally learning users' color preferences and cognitive processes, the gap between designers' and users' color-image cognition can be narrowed as much as possible, perceptual color needs can be better served, the success rate of color-related image research and development can be increased, and development costs can be reduced.
Aiming at the difficulty of accurately detecting and locating pedestrians around construction-site lifting equipment, where airborne dust concentration is high and visibility is low, a pedestrian detection and positioning method based on monocular vision combined with millimeter-wave radar is proposed. The method feeds the RGB image from the monocular camera into the YOLOv4 algorithm to detect pedestrian targets. The obstacle distance obtained by the millimeter-wave radar is fused with the pixel coordinates of the pedestrian target obtained by the monocular camera, and the spatial position of the pedestrian is then computed from the positioning principle of monocular vision. Experimental results show that the accuracy and recall of pedestrian detection reach 94.09% and 93.52%, respectively, higher than the 90.79% and 89.13% of the YOLOv3 algorithm, while the detection speed reaches 65 FPS; the relative error of pedestrian positioning in the lateral distance is 3.52%. The method is accurate and fast, and can achieve real-time pedestrian detection and positioning at hoisting-equipment sites.
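One common way to fuse a radar range measurement with a pixel detection is pinhole back-projection. The sketch below assumes calibrated, aligned sensors and uses hypothetical intrinsic parameters; it illustrates the geometry, not the paper's exact positioning model:

```python
def locate_pedestrian(u, v, depth, fx, fy, cx, cy):
    """Back-project a detected pixel (u, v) into camera coordinates
    using a radar-measured depth along the optical axis.

    fx, fy: focal lengths in pixels; cx, cy: principal point.
    Returns (X, Y, Z) in the same units as depth.
    """
    X = (u - cx) * depth / fx   # lateral offset
    Y = (v - cy) * depth / fy   # vertical offset
    return X, Y, depth

# A detection centred on the principal point lies on the optical axis
# (intrinsics here are illustrative values, not calibrated ones).
X, Y, Z = locate_pedestrian(640, 360, 10.0,
                            fx=1000, fy=1000, cx=640, cy=360)
```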
With the popularity of the Internet, the way college counselors communicate with students has changed: online communication has largely replaced face-to-face conversation. To compensate for the lack of non-verbal information in Internet communication, emoticons emerged. Communication between college counselors and students takes place mainly on the QQ platform, where emoticons serve as nonverbal cues and influence how warm the other party is perceived to be. This study explores the use of emoji and the influence of their placement on the perception of warmth, aiming to offer suggestions on how counselors can communicate with college students online and build good relationships. The results show that the use of emoticons has a significant effect on perceived warmth, and the effect is significant when the emoji is placed before the textual information.
This article analyzes the development status and components of 3D animation production in combination with the key points of 3D animation design and production, including scenario design, character design, modeling redesign, action design, environment design, 3D scanning, 3D-printed models, and 3D scene production. By studying the causes of, and remedies for, problems such as low production efficiency, low technology integration, and lack of cultural value, it aims to improve the soundness of 3D animation design and production and create conditions for the economic development of the industry.
Traditional remote sensing image matching based on registration algorithms suffers from poor smoothness of the matching results in practical applications. Therefore, a remote sensing image matching method based on a neural network is proposed: the network extracts control points from the remote sensing images, the image feature points are visually processed, and the control points are then matched. Experimental results show that the new matching method effectively improves anti-interference ability, reduces noise in the matching process, and improves the smoothness of the matching results.
The navigation computer circuit is the core component of a strapdown inertial navigation system (SINS); its design precision directly affects the performance of the SINS and even of the whole control system. This paper uses a DSP+FPGA architecture to design the hardware platform and interface program of the navigation computer circuit, builds a SINS/GPS vehicle-mounted system, and carries out road-vehicle experiments, which effectively verify the performance of the SINS navigation computer circuit.
The 3D reconstruction of meteorological radar data is an important basis for visualizing meteorological data; it can reflect the basic characteristics of atmospheric motion and the spatial distribution of clouds. Data collected by meteorological radar lie on the conical surfaces formed by volume scanning and are discrete, uneven, and irregularly distributed, so importing them into a 3D data field easily causes data redundancy or loss and complicates computation. In this paper, a trapezoidal cell is used as the computational unit and adaptive Barnes interpolation is used to construct the 3D data field of the cone; different types of triangular patches are combined where the isosurface intersects a unit cell, and the correspondence between the vertices of triangular facets in adjacent cells solves the problem of holes at cell junctions. Experimental results show that this method shortens the time needed to locate and compute the isosurface from the original data, and the reconstruction results better reflect the characteristics of atmospheric motion.
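A single-pass Barnes scheme weights each radar sample by a Gaussian of its distance to the query point; the paper uses an adaptive variant, so the fixed-parameter sketch below is illustrative only:

```python
from math import exp

def barnes(points, values, x, y, kappa=1.0):
    """Single-pass Barnes (Gaussian-weighted) interpolation at (x, y).

    points: list of (px, py) sample locations; values: sample values;
    kappa: smoothing length-scale parameter (illustrative name).
    """
    ws = [exp(-((px - x) ** 2 + (py - y) ** 2) / kappa)
          for px, py in points]
    return sum(w * v for w, v in zip(ws, values)) / sum(ws)

# At the centre of a unit square, all four samples are equidistant,
# so the interpolated value is their plain mean.
pts = [(0, 0), (1, 0), (0, 1), (1, 1)]
vals = [1.0, 2.0, 3.0, 4.0]
v = barnes(pts, vals, 0.5, 0.5)
```

An adaptive scheme would vary `kappa` with the local data density, tightening the weights where the radar samples are closely spaced.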
The use of emoji in the network environment is very important for establishing intimacy, and this also matters for college counselors communicating with students online. The experiment simulated three different WeChat dialogue modes. Forty-one sophomore students took part in a 2 (valence: positive, negative) × 3 (dialogue mode: text-only, emoji-only, text+emoji) within-subjects design, and eye-movement technology was used to examine the effect of counselors' different response styles on students' perception of the counselor's intimacy. The results showed that with the text+emoji dialogue mode, students gazed at the area of interest more often and for a longer total time in both positive and negative conditions, suggesting that the text+emoji dialogue mode promotes students' perceived intimacy with their counselor.
To generate images with complete structure and clear content, a gradient-guided image inpainting method is proposed that introduces a gradient branch to guide the inpainting. To better fuse the gradient-branch features with the generator's inpainting results, a feature equalization module with an attention mechanism is introduced that balances features effectively and suppresses the learning of unimportant feature information. Finally, to avoid using KL or JS divergence to measure the distribution gap between two samples, this paper adopts the Wasserstein distance and designs the adversarial discriminator network based on WGAN-GP. Experiments on the Paris StreetView and CelebA datasets show that our method obtains satisfactory inpainting results with complete structure and clear content.
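In the network itself the Wasserstein distance is estimated by a WGAN-GP critic during training; as a standalone illustration of the metric, the 1-D empirical Wasserstein-1 distance between equal-size samples has a closed form, since the optimal coupling in one dimension is the monotone pairing of sorted values:

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-D Wasserstein-1 distance between equal-size samples:
    the mean absolute difference of the sorted samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    return float(np.mean(np.abs(a - b)))

# Shifting a sample by 1 moves it exactly distance 1 away.
d = wasserstein_1d([0, 1, 2], [1, 2, 3])
```

Unlike KL or JS divergence, this distance stays finite and informative even when the two sample supports do not overlap, which is the motivation for using it in adversarial training.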
The quality of an automobile engine connecting rod directly affects the transmission performance of the engine and is an important factor in vehicle safety. Target detection is a key technology affecting the precision of machine-vision-based multi-parameter identification of connecting rod quality. This paper proposes a sub-pixel target detection method for automobile engine connecting rod images. A multi-parameter image acquisition system for connecting rod quality inspection was built. Multi-threshold analysis was applied to eliminate image shadows, and a Gaussian homomorphic filter was designed to enhance the contrast between the image target and the background. Sub-pixel analysis and the Hough transform are used to detect straight lines, circles, and other objects in connecting rod images, providing a basis for machine-vision-based automatic identification of multi-parameter connecting rod quality. An example of target detection in an engine connecting rod image demonstrates the effectiveness of the proposed method.
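The Gaussian homomorphic filtering step can be sketched as a log transform, a Gaussian high-emphasis filter in the frequency domain, and exponentiation back; the parameter names and values below are illustrative, not the paper's:

```python
import numpy as np

def homomorphic(img, sigma=10.0, gamma_l=0.5, gamma_h=1.5):
    """Gaussian homomorphic filter: attenuate low-frequency
    illumination and boost high-frequency reflectance, raising the
    contrast between target and background."""
    h, w = img.shape
    log_img = np.log1p(img.astype(float))          # multiplicative -> additive
    F = np.fft.fftshift(np.fft.fft2(log_img))
    # Gaussian high-emphasis transfer function centred on DC:
    # gamma_l at zero frequency, approaching gamma_h at high frequency.
    y, x = np.ogrid[:h, :w]
    d2 = (y - h / 2) ** 2 + (x - w / 2) ** 2
    H = gamma_l + (gamma_h - gamma_l) * (1 - np.exp(-d2 / (2 * sigma ** 2)))
    out = np.fft.ifft2(np.fft.ifftshift(F * H)).real
    return np.expm1(out)                           # undo the log transform

filtered = homomorphic(np.random.default_rng(1).random((32, 32)))
```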
To solve the problems of ghosting, deformation, and skewed matching in large-parallax image mosaicking, an improved SPHP method is proposed. The steps are as follows: 1. In the feature extraction stage, feature points are extracted by the ASIFT algorithm. 2. In the feature matching stage, feature points are coarsely matched by the KNN algorithm and then purified by the RANSAC algorithm. 3. In the image alignment stage, the SPHP algorithm computes a spatial transformation model and a similarity transformation model to correct the shape and appearance of the stitching. 4. In the image fusion stage, a linear weighted fusion algorithm blends the overlapping area. Compared with the traditional AutoStitch algorithm and the original SPHP algorithm, the experimental results show that the quality of the improved SPHP algorithm is significantly better, with normalized mutual information improved by 1.4% over SPHP.
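The purification in step 2 can be illustrated with a toy RANSAC loop. The sketch below fits a pure-translation model rather than the homography the actual pipeline uses, so it shows the consensus idea only:

```python
import random

def ransac_translation(matches, tol=2.0, iters=200, seed=0):
    """Toy RANSAC purification of feature matches.

    matches: list of ((x1, y1), (x2, y2)) point pairs.
    Hypothesise a translation from one randomly chosen match and keep
    the largest consensus set; outlier matches are discarded.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1          # hypothesised translation
        inliers = [m for m in matches
                   if abs((m[1][0] - m[0][0]) - dx) < tol
                   and abs((m[1][1] - m[0][1]) - dy) < tol]
        if len(inliers) > len(best):
            best = inliers
    return best

# Three consistent matches (shift ~(5, 0)) plus one gross outlier.
matches = [((0, 0), (5, 0)), ((1, 1), (6, 1)),
           ((2, 0), (7, 0)), ((3, 3), (30, 40))]
good = ransac_translation(matches)
```

In the real pipeline each hypothesis is a homography estimated from four matches, but the accept/reject logic is the same.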
Image super-resolution is an ill-posed problem: one low-resolution image may correspond to multiple different high-resolution images. We propose a novel strategy of non-blind multi-scale conditional super-resolution (MSCSR), which takes not only the low-resolution image but also a scale-factor label as model input, and trains the network on images with the corresponding scale factor. The condition is embedded into the network by combining the condition label with channel attention. Our model can reconstruct images with different degrees of detail by feeding the same low-resolution image with different scale factors.
Brain MRI (magnetic resonance imaging) ventricle segmentation is a key basis for brain disease diagnosis and three-dimensional reconstruction. Aiming at the insignificant gray-scale differences and low resolution between brain tissues in MRI, a ventricle segmentation method integrating Otsu thresholding with region growing is proposed. The method first uses Otsu's method to determine the best global segmentation threshold, which improves the seed-point selection and growth rules of region growing; the improved region growing and prior knowledge are then used to segment the ventricle region, and mathematical morphology is applied to fill holes and smooth rough edges. Experiments show that, compared with Otsu threshold segmentation and traditional region growing on brain MRI images, the proposed method has obvious advantages in segmentation quality, precision, and recall: its average segmentation intersection-over-union (IoU) is 0.706319, 42.13% and 23.71% higher than the two comparison algorithms, respectively, and the method shows good robustness.
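The global threshold in the first step comes from Otsu's criterion, which picks the threshold maximizing between-class variance over a gray-level histogram. A minimal sketch on a 256-bin histogram:

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: choose the threshold that maximises the
    between-class variance of the background/foreground split."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()   # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * p[:t]).sum() / w0   # class means
        mu1 = (levels[t:] * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2        # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Bimodal intensities: dark tissue around 30, bright region around 200.
rng = np.random.default_rng(2)
img = np.concatenate([rng.normal(30, 5, 500), rng.normal(200, 5, 500)])
t = otsu_threshold(np.clip(img, 0, 255))
```

In the paper this threshold then informs the seed selection and growth rules of the region-growing stage.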
In structured-light three-dimensional measurement, the absolute phase can be obtained from the wrapped phase and the period order K, and the value of K directly determines the quality of the point cloud. This paper proposes a novel structured-light fringe encoding and decoding scheme that quickly and accurately solves for the period order K and the wrapped phase: only four fringe patterns need to be projected to recover the absolute phase. The method calculates the high-frequency wrapped phase from two sinusoidal fringe patterns, then uses a step fringe pattern and a uniform fringe pattern to compute the period order K of the wrapped phase, and finally recovers the absolute phase from the wrapped phase and K. Experimental tests show that the proposed method is accurate enough for fast 3D reconstruction.
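The decoding rests on the standard phase-unwrapping relation: the absolute phase is the wrapped phase plus 2π times the period order K. A minimal sketch of that final step (recovering K itself is the paper's contribution and is not shown here):

```python
from math import pi

def absolute_phase(wrapped, k):
    """Recover the absolute phase from the wrapped phase (in
    [0, 2*pi)) and the integer fringe period order K."""
    return wrapped + 2 * pi * k

# A point in the third fringe period with wrapped phase 1.5 rad.
phi = absolute_phase(1.5, 3)
```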
Hyperspectral images (HSIs) cover very large areas, and achieving excellent classification performance within limited running time remains a challenging issue. To reduce running time and improve accuracy, a parallel-branch expectation-maximization (PBEM) attention method is proposed for HSI classification in this article. To our knowledge, this may be the first study to apply the expectation-maximization attention methodology to hyperspectral image classification, and the first to combine the disout layer with expectation-maximization attention in this setting. Experimental results on benchmark datasets demonstrate the superiority of the proposed methodology in hyperspectral image classification, especially on small-sample classification tasks.
Search engines aim to provide retrieval results that satisfy users, and users benefit when the search results are classified. This paper proposes a new image classification algorithm for organizing photos based on the rules of photographic aesthetics. The main idea comes from the principles of photographic composition: the subjects in a photo are identified with an image sharpness evaluation algorithm, the composition patterns frequently used by photographers are summarized, and the images are classified according to their photographic composition. Experiments were designed to validate the feasibility and effectiveness of the proposed algorithm, and the results show that the proposed scheme is consistent with the human visual system and subjective judgement.
This paper presents a survey and implementation of BERT language models, analyzing and summarizing their strengths and limitations on relevant tasks and the reasons behind them. On this basis, a family of BERT-like models that address these limitations is examined: ERNIE 2.0 (Baidu), which adds and refines pre-training tasks, and MT-DNN (Microsoft), which introduces multi-task learning downstream. By comparing, with controlled variables, how the native models change on specific tasks and how pre-training affects models with the same architecture in real-world applications, we summarize and project the likely directions and characteristics of future technological iterations in language modeling for natural language processing.
To quickly and effectively detect lung information in different medical images, this paper designs an improved VGG16-based lung opacity classification and detection method using deep transfer learning. Offline data augmentation is applied to increase the number of samples, VGG16 is modified, and transfer learning is employed to train the lung recognition model. The results show that the improved VGG16 network achieves an accuracy of 85% in the classification and recognition of lung images and can accurately detect pathological changes in the lungs.
Image Processing Technology and Intelligent Recognition and Detection
In this paper, we first review the development of remote sensing satellites in China, establishing that artificial intelligence (AI) is the key to achieving global monitoring of geographic environment change in the era of big data. Next, to improve upon existing specialized and isolated AI solutions, we propose a change detection method based on the geometric registration of multi-modal images and the fusion of multi-modal radiation information. Our objective is to promote the development of AI and multi-modal remote sensing image fusion technology in the field of intelligent change detection of the global geographic environment.
A new method based on a Network in Network (NIN) structure is proposed to detect target changes from multi-temporal optical remote sensing images. Firstly, the changed areas are captured by a change detection method based on multi-feature fusion, and the changed patches are obtained by morphological processing. Then, a convolutional neural network with an NIN structure is constructed to train the target recognition model using a small number of samples and to distinguish the original images corresponding to the changed patches. Finally, a recognition strategy combining preliminary screening and thorough screening is designed, and multiple thresholds are assigned according to patch size to avoid the false detections possible with a single threshold. In experiments with multi-temporal airport images, the overall accuracy of aircraft target change detection using the proposed method was 91.89%, with a false alarm rate of 10.71%, indicating that this method can accurately and reliably detect target changes.
The vehicle-mounted millimeter-wave radar has been widely used in modern vehicles. The radar needs to provide accurate feedback on the surrounding environment, and driverless cars also demand extremely high safety. Therefore, it is of great significance to enhance the ability of the vehicle-mounted millimeter-wave radar to detect surrounding targets. Traditional vehicle-mounted millimeter-wave radar usually processes the target echo signal to finally obtain the target information. This signal processing is greatly affected by the weather environment, so it is difficult to sustain a high detection rate. Therefore, a deep learning-based radar spectrum target detection method is proposed to quickly identify the objects present in the echo spectrum. First, the vehicle-mounted millimeter-wave radar spectrum data are extracted and made into a dataset for training; then the SSD (Single Shot MultiBox Detector) model is adjusted and optimized according to the characteristics of the dataset, and the spectral images are identified using the optimized trained model. In the experiment, 11,635 images were divided into training and test sets at a ratio of 9:1, and the average mAP over the various targets reached 94.35%, showing that deep learning-based detection of the radar echo spectrum achieves good results. The proposed spectral target detection method can be applied to vehicle-mounted millimeter-wave radar.
Object detection is a hot topic in computer vision. Recently, as COVID-19 has spread globally, epidemic prevention and control has become routine, and wearing masks when entering and leaving public places and taking public transportation is now the norm. Recognition of face masks is therefore of increasing concern, and fast, accurate mask identification is essential. Faster R-CNN is currently a relatively advanced object detection algorithm; it has the advantages of fast detection speed and high detection accuracy and is widely used in various fields. However, it often fails to perform well on small objects. This paper builds on the Faster R-CNN object detection algorithm and introduces an FPN to handle multi-scale mask recognition and detection. The feature map at each resolution is fused with the next resolution's feature map by element-wise summation, combining shallow, high-resolution layers with deep, semantically rich layers to improve the ability to detect small objects. The method is validated on a dataset of 2000 face mask images. Experiments show that the improved method is effective and outperforms the original algorithm.
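The element-wise FPN fusion described above can be sketched in NumPy (a simplified illustration, not the paper's implementation): the coarse, semantically rich map is upsampled 2x and added element-wise to a 1x1-projected lateral map from the shallower level. The channel counts and weights below are made up for the example.

```python
import numpy as np

def fuse_fpn_level(deep_map, shallow_map, lateral_w):
    """Fuse a coarse (deep) feature map into the next finer (shallow) one.

    deep_map:    (C, H, W)     semantically rich, low resolution
    shallow_map: (C_in, 2H, 2W) high resolution
    lateral_w:   (C, C_in)     1x1-conv weights projecting shallow channels to C
    """
    # 1x1 "lateral" convolution = per-pixel channel projection
    lateral = np.einsum('oc,chw->ohw', lateral_w, shallow_map)
    # nearest-neighbour 2x upsampling of the deep map
    upsampled = deep_map.repeat(2, axis=1).repeat(2, axis=2)
    # element-wise summation, as in the FPN top-down pathway
    return lateral + upsampled

deep = np.ones((4, 8, 8))
shallow = np.ones((16, 16, 16))
w = np.full((4, 16), 1.0 / 16)           # projects 16 channels down to 4
fused = fuse_fpn_level(deep, shallow, w)
print(fused.shape)   # (4, 16, 16)
```

In a real FPN the lateral projection and a post-fusion 3x3 convolution are learned; this sketch only shows the resolution-matching and summation step.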
Aircraft skin image fault detection plays an important role in aircraft maintenance, but traditional detection methods have low efficiency. This paper uses the YOLOv4 object detection algorithm for fault detection in aircraft skin images. To address model size and detection speed, we propose the YOLOv4-ML detection algorithm: the CSPDarkNet53 backbone network in YOLOv4 is replaced by the MobileNetv3 network, the SiLU activation function is used in the shallow feature extraction network, and an adaptive training sample selection method is introduced during training to improve the quality of network training. The experimental results show that the lightweight design improves the detection speed by 10% while losing only 2.6% of the detection accuracy, which verifies the effectiveness of the improved algorithm.
Regression-based text detection methods are a current research focus due to their simple network structure and fast inference speed. However, most of them suffer from the limited receptive field of the convolutional neural network and simplistic feature fusion in the feature pyramid. As a consequence, previous algorithms still have shortcomings, such as difficulty in accurately detecting long texts and inconsistency across feature scales. To address these two problems, we first incorporate a densely connected atrous convolutional module into the feature extraction network, which enlarges the receptive field and in turn strengthens the extraction of high-level semantic information. Secondly, we weight and re-fuse the features from different levels of the feature pyramid, which filters out conflicting information at various levels to maintain the scale invariance of the features. Extensive experiments on the ICDAR2015 and MSRA-TD500 datasets prove the effectiveness of the method.
Vehicle monitoring using cameras is an important task for highway management. At present, most vehicle detection algorithms are applied in the daytime. However, compared with the daytime, the road environment at night is darker and vehicle bodies are less distinct, so existing algorithms struggle to meet the requirements of vehicle detection in night scenes. Due to the complexity of night scenes, blurred video images, and excessive noise, nighttime vehicle detection poses a great challenge. To address these problems, this paper proposes a nighttime vehicle detection framework based on an improved Faster R-CNN. Firstly, a deep residual network is used to extract features, and a spatial attention mechanism is integrated so that the network pays more attention to the road region rather than the background region. Secondly, a balanced feature module is introduced to make full use of the extracted visual features. Finally, Soft-NMS replaces NMS to reduce the number of missed vehicles. Experimental results show that the AP of the improved Faster R-CNN model is 4.18% higher than that of the baseline Faster R-CNN model.
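The Soft-NMS step mentioned above can be sketched as follows (a minimal Gaussian-decay variant, not the paper's exact code): instead of discarding boxes that overlap the current top-scoring box, their scores are decayed in proportion to the overlap, so a genuinely distinct but overlapping vehicle is less likely to be missed.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of discarding boxes.

    boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,). Returns kept indices
    in descending order of (decayed) score.
    """
    scores = scores.astype(float).copy()
    idxs = np.arange(len(scores))
    keep = []
    while len(idxs) > 0:
        m = idxs[np.argmax(scores[idxs])]        # highest-scoring remaining box
        keep.append(int(m))
        idxs = idxs[idxs != m]
        if len(idxs) == 0:
            break
        ious = np.array([iou(boxes[m], boxes[i]) for i in idxs])
        scores[idxs] *= np.exp(-(ious ** 2) / sigma)   # Gaussian decay
        idxs = idxs[scores[idxs] > score_thresh]       # drop near-zero scores
    return keep

def iou(a, b):
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]])
scores = np.array([0.9, 0.8, 0.7])
print(soft_nms(boxes, scores))   # all three survive; the overlap is only down-weighted
```

With classical hard NMS at a typical IoU threshold, the second box here would be suppressed outright; Soft-NMS merely reduces its score.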
In recent years, there has been great progress on the object-tracking task. Most recent methods follow the Joint Detection and Embedding (JDE) paradigm, which accomplishes the detection and Re-ID tasks in a single module and can thus reduce time cost and achieve a higher processing FPS. However, the large computational requirements of existing JDE-based methods, which usually demand several expensive GPU devices, remain an obstacle to wide industrial application. In this paper, we propose a new lightweight structure named ShuffleXnet, and further build a simple module named Pyramid-ShuffleXnet (PSXnet) for the Multiple-Object Tracking (MOT) task. The motivation of this work is to reduce the amount of computation and make the network easier to deploy in online, real-time applications. Experimental results show that our method achieves nearly 28% higher FPS than FairMOT with only a 6.7% lower Multi-Object Tracking Accuracy (MOTA) score on the MOT17 dataset.
Whether a mobile robot can accurately recognize terrain features affects its motion control. Since outdoor unstructured terrains have no obvious boundary information, it is necessary to segment the terrain boundary. This paper builds an outdoor unstructured terrain dataset, then uses the real-time semantic segmentation network EDANet to train and test the effect of unstructured terrain segmentation. The experimental results show that EDANet can achieve a good balance between segmentation accuracy and segmentation speed for unstructured terrain recognition.
The Moon is the heavenly body closest to Earth. In order to conduct an in-depth study on the Moon, select the landing site, and/or plan for roving exploration, researchers need to understand how long the Moon has existed and how it was formed. An internationally common method for age dating of the Moon in areas without lunar soil samples is to determine the absolute age of the Moon based on the number and sizes of impact craters. For the identification and extraction of impact craters required for age dating, we combined histogram of oriented gradients (HOG) features and support-vector machine (SVM) classifiers to set up a sample pool (including positive and negative samples) for lunar impact craters, thereby achieving automatic identification and extraction of impact craters of different sizes in the landing area of Chang'e-5.
This paper studies the problem of face recognition and proposes an extended local binary pattern (LBP) method to achieve it. We propose a further improvement on the original LBP operator, named extended LBP: 8 additional pixels are sampled alongside the original 8 to improve discriminative ability. The added 8 pixels fall into two groups, because their sampling radii differ. After the extended local binary pattern is computed, an SVM classifier performs face recognition. Experiments show that the proposed algorithm achieves good face recognition performance.
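The two-radius sampling idea can be illustrated with a small sketch (the exact sampling pattern of the paper's extended LBP is assumed; a square ring is used here as a common discrete approximation): a standard 8-neighbour code is computed at radius 1 and again at radius 2, and the two codes describe the local texture together.

```python
import numpy as np

def lbp_code(img, r, c, radius):
    """8-neighbour LBP code at pixel (r, c) with the given sampling radius."""
    center = img[r, c]
    # 8 neighbours on the square ring of the given radius
    offsets = [(-radius, -radius), (-radius, 0), (-radius, radius), (0, radius),
               (radius, radius), (radius, 0), (radius, -radius), (0, -radius)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr, c + dc] >= center:     # threshold against the centre
            code |= 1 << bit
    return code

def extended_lbp(img, r, c):
    """Pair of codes from two radii: 16 sampled pixels in total."""
    return lbp_code(img, r, c, 1), lbp_code(img, r, c, 2)

patch = np.array([[9, 9, 9, 9, 9],
                  [9, 5, 5, 5, 9],
                  [9, 5, 1, 5, 9],
                  [9, 5, 5, 5, 9],
                  [9, 9, 9, 9, 9]])
print(extended_lbp(patch, 2, 2))   # (255, 255): every neighbour >= centre value 1
```

Histograms of such codes over image blocks would then form the feature vector passed to the SVM.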
With the spread of the epidemic around the world, wearing masks has become the simplest and most effective way to block COVID-19. Given the lack of data and model designs fitted to the epidemic scenario, we propose an integrated masked face recognition system with three cascaded convolutional neural networks. Firstly, an SSD model detects the masked face to eliminate interference from the irrelevant background. Then, an Hourglass network regresses the key points of the occluded face and crops the aligned, unmasked eye-brow area. Finally, we fine-tune a pretrained FaceNet to fully adapt to the eye-brow region data. Experiments on numerous laboratory and in-the-wild images prove that our method can recognize masked subjects effectively.
In the practical application of smart education classroom teaching, students' faces may be occluded by various factors (such as clothing, environment, and lighting), resulting in low accuracy and robustness of face recognition. To solve this problem, we introduce a new image restoration and recognition method based on WGAN (Wasserstein Generative Adversarial Network). When using a deep convolutional generative adversarial network for unsupervised training, we add a conditional category label c to guide the generator in generating sample data. At the same time, a dual-discriminator mechanism is introduced to enhance the feature extraction ability of the model: the local discriminator better repairs the details of the occluded area, while the global discriminator judges the authenticity and overall visual coherence of the restored image. Part of the convolutional layers of the global discriminator are used to construct a VGG-like feature extractor, followed by a fully connected layer and a sigmoid layer, which accelerates the convergence of the network and improves the robustness of the method. To improve training stability and reduce overfitting, L2 regularization is added to the context loss to enhance the continuity between local and whole images and improve restoration quality and recognition accuracy. Using peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and recognition accuracy as evaluation indexes, we achieved good results on the CelebA and CelebA-HQ datasets.
Video-based person re-identification (Re-ID) has drawn increasing attention, as video surveillance offers richer spatial and temporal information that can potentially reduce visual ambiguities and occlusion. For visual ambiguities, multi-scale features help distinguish similar pedestrian sequences through different semantic information. For occlusion, a Graph Convolutional Network (GCN) can effectively leverage the complementary information between node pairs for the Re-ID task. In this paper, we propose a novel Multi-Scale Representation with Graph Learning (MSR-GL) network consisting of three branches: a global branch, a shallow branch, and a graph branch. The global and shallow branches extract multi-scale features from different layers of a CNN backbone. In particular, an extra Bottleneck module, whose parameters are independent of the other branches, is introduced for the shallow feature maps. In the graph branch, the adjacency relationships are dynamically modeled through a temporal-spatial symmetrical transformation between nodes; the node features are then updated by the adjacency matrix and aggregated into video-level graph features. We conduct extensive experiments on three widely adopted benchmarks (MARS, DukeMTMC-VideoReID, and iLIDS-VID). Results show that we achieve superior results compared with several recent state-of-the-art methods, with 90.28% rank-1 accuracy and 85.20% mAP on MARS.
Based on the concept of deep learning, convolutional neural network (CNN) processing of extracted image features has recently been applied to tackle early fire detection during surveillance. However, such methods generally need more computational time and memory and seldom take into account the smoke that is always produced before fires, resulting in relatively poor detection speed and accuracy. In this paper, we propose a novel image-based fire and smoke detection network. Inspired by the YOLOv5 architecture, and considering YOLOv5's untargeted feature extraction and limited receptive fields, the SSHC (Single Stage Headless Context) module is added to the backbone to enhance the feature extraction of flames and smoke, and the RFB (Receptive Field Block) module is added to the fusion layer to increase the receptive field of our network. Our network not only detects fire and smoke well across different fire scenes, shooting angles, and lighting conditions, but also achieves a speed of 83 FPS, meeting real-time detection requirements. Meanwhile, we have built a high-quality fire and smoke detection dataset, collected from real scenes and annotated under strict and reasonable rules, to verify the superiority of our network. Our proposed network achieves 97.2% accuracy for fire detection and 92.4% accuracy for smoke detection. Experimental results on benchmark fire-smoke datasets reveal the effectiveness of the proposed framework and validate its suitability for fire and smoke detection in surveillance systems compared to state-of-the-art methods.
Among all kinds of weather events, rainfall plays a vital role in human production and life. China has a vast territory and complex topography and is severely affected by rainfall: landslides, mudslides, and floods have occurred in many parts of the country, with severe damage to the economy and agriculture. Therefore, accurate rainfall prediction is very important. This article builds three rainfall prediction models, based respectively on Random Forests (RF), Support Vector Machines (SVM), and Long Short-Term Memory (LSTM) networks, applies them to precipitation prediction in Yibin City, Sichuan Province, and compares their prediction results. The experiments found that the mean absolute errors (MAE) of the RF, SVM, and LSTM models were 0.085, 0.089, and 0.061; the mean squared errors (MSE) were 0.016, 0.015, and 0.010; and the mean absolute percentage errors (MAPE) were 66.38, 79.27, and 50.62. Among the three models, LSTM has the smallest prediction error, indicating that its accuracy in predicting rainfall is higher than that of the RF and SVM models.
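The three error metrics used for the comparison above are standard and can be computed as follows (the numbers in the example are made up for illustration, not the paper's rainfall data):

```python
import numpy as np

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - yhat)))

def mse(y, yhat):
    """Mean squared error."""
    return float(np.mean((y - yhat) ** 2))

def mape(y, yhat):
    """Mean absolute percentage error (requires nonzero targets)."""
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

# toy example with made-up values
y    = np.array([2.0, 4.0, 5.0])   # observed precipitation
yhat = np.array([2.5, 3.0, 5.0])   # model prediction
print(mae(y, yhat), mse(y, yhat), round(mape(y, yhat), 2))
```

Note that MAPE is scale-free while MAE and MSE carry the units of the target, which is why the paper reports all three.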
This article proposes a ship target recognition method based on the FAST detector and Faster R-CNN. Firstly, the FAST detector is used to extract the feature points of the ship target. Then, a sliding-window scheme is added and the convolutional layer structure of Faster R-CNN is improved, using suitable anchors to identify the target; a recognition method combining a real-world model identification frame with region proposals is designed to obtain the target information. Finally, non-maximum suppression is used to filter out redundant identification rectangles, realizing accurate identification of the real-world ship target. Experimental comparison and analysis show that this method has advantages in extracting feature points with greater recognition utility and in recognition rate.
With the development of hyperspectral technology and the increase of hyperspectral dimensionality, a single model is difficult to apply to feature selection, feature extraction, and feature integration for hyperspectral images, leading to undesirable hyperspectral classification results. To improve classification accuracy, an algorithm uniting a convolutional neural network and multi-head attention is proposed. Firstly, the PCA algorithm is used for dimensionality reduction of the hyperspectral data; then, a multi-scale convolutional neural network is used to mine features; finally, a residual layer and a classification layer integrate the convolution results and classify the hyperspectral image. Verification on the open-source hyperspectral datasets Pavia, Salinas, and Indian Pines shows that the proposed algorithm can efficiently improve hyperspectral classification accuracy.
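The PCA dimensionality-reduction step applied to the hyperspectral bands can be sketched via an eigendecomposition of the band covariance matrix (a minimal illustration; the paper's preprocessing details and component count are not specified):

```python
import numpy as np

def pca_reduce(X, k):
    """Project samples X of shape (n, d) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                      # centre each band
    cov = np.cov(Xc, rowvar=False)               # (d, d) band covariance
    vals, vecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]    # k largest-variance directions
    return Xc @ top                              # (n, k) reduced features

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                   # e.g. 100 pixels, 20 spectral bands
Z = pca_reduce(X, 3)
print(Z.shape)   # (100, 3)
```

The variance of the projected columns decreases from the first component to the last, which is what makes a small k retain most of the spectral information.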
This paper proposes a novel baseline for deep person Re-ID methods by introducing a human pose-based attention mechanism. Benefiting from deep convolutional networks, there has been great progress in person re-identification (Re-ID) in recent years, which aims at retrieving the same person identities from images captured by different cameras. Most existing methods focus on designing complex network structures to achieve higher scores on public datasets, but few works pay attention to baseline design. A strong baseline is crucial in experiments and makes the elaborated proposed methods more convincing. The present study uses a pre-trained human pose estimator to extract human key-point information. We then propose a novel manner of fusing pose information with the global feature from ResNet50, which leads the network to concentrate more on discriminative key-point feature areas. Our work achieves 94.8% rank-1 accuracy and 87.4% mean average precision (mAP) on Market1501 and, to the best of our knowledge, outperforms all other existing baselines that only use ResNet50. Moreover, experimental results also suggest that, with the help of pose information, our work is naturally robust against misalignment and occlusion problems.
Object detection has a wide range of applications in daily life and industrial fields. However, the success of object detection depends on a huge amount of manually labeled data. In this paper, based on the YOLO object detection model, two types of pedestrians are identified, and after data augmentation and training, the performance of the model is analyzed. This paper also studies a simplification method for the training set: through down-sampling, the size of the training set is progressively reduced, yielding a simplification strategy for training set preparation. This paper aims to provide a training set preparation and simplification method for object detection in a specific scene, so as to save computational cost and improve the efficiency of resource use.
The surface quality of a steel ball is one of the most critical indicators for assessing its quality. Therefore, a new method is presented in this study that uses a pre-trained YOLOv4 model based on deep transfer learning to detect surface defects of steel balls. The method automatically extracts features from steel ball surface images and accurately detects surface defects. The experimental results show that the IoU and mAP are 0.9325 and 0.9137 respectively, better than other methods in previous studies.
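The IoU figure reported above is the standard intersection-over-union between predicted and ground-truth boxes; a minimal computation (illustrative only, not the paper's code) looks like this:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

# a prediction covering the right half of a 10x10 ground-truth box
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))   # 50 / 150 ≈ 0.333
```

mAP then averages, over defect classes, the area under the precision-recall curve computed with a match defined by an IoU threshold.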
Remote sensing aircraft target detection is an important task in the field of remote sensing image interpretation. General target detection networks often include a large number of parameters, detect slowly, and perform poorly when directly applied to aircraft detection tasks. To solve these issues, a novel remote sensing aircraft target detection method based on a lightweight YOLOv4 is proposed in this paper. Firstly, Lightweight YOLOv4 adopts MobileNetV3 and depthwise separable convolution to greatly reduce the number of model parameters. Then, to further enhance the feature extraction ability of the network and make it more lightweight, this paper uses a feature enhancement module (FEM) and a residual fusion module (RFM). Extensive experiments on the DOTA aircraft dataset demonstrate that Lightweight YOLOv4 significantly improves detection accuracy and efficiency while having fewer model parameters.
This paper introduces the latest research progress of marine target recognition technology based on three detection methods: radar, infrared, and visible light. It also compares the advantages and disadvantages of the different technical means and summarizes the current common ideas for solving target recognition problems. The results show that the choice of detection method must be considered according to the actual application background, and the specific algorithm to be used needs to be determined by the difference between the target features and the background features.
The current COVID-19 pandemic continues with new variants, whose mutations are unpredictable. Thus, predicting mutations in viruses has profound meaning for vaccine and drug development as well as prevention measures. Currently, the documented mutations in SARS-CoV-2 are not yet abundant, especially for building a phylogenetic tree, so it is useful to build a predictive model from virus data with abundant mutations, such as influenza A virus. In this study, a neural network with the feedforward backpropagation algorithm is employed to predict the probabilistically possible mutation positions and mutated amino acids in hemagglutinins from Eurasian H1 influenza A viruses. The study demonstrates an encouraging result and suggests the possibility of continuing work along this research line.
Motion blur is one of the most common forms of image degradation. There are many ways to remove it, but each has problems, such as checkerboard artifacts, poor restoration of image texture details, or high computational cost. This paper proposes an improved blind motion deblurring method based on a Generative Adversarial Network that achieves good results. The method replaces the deconvolution (transposed convolution) of the traditional generative adversarial network model with a combination of up-sampling and convolution, effectively removing the checkerboard artifacts common in image processing and alleviating the poor restoration of texture details. The results show that the method achieves the expected purpose, with a clear deblurring effect on both objective and subjective evaluation metrics.
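The upsample-then-convolve replacement for transposed convolution can be sketched in a few lines. This is a generic single-channel illustration of the technique named in the abstract (nearest-neighbour upsampling followed by a same-padded convolution gives every output pixel uniform kernel coverage, avoiding the uneven overlap that causes checkerboard patterns); it is not the paper's network.

```python
import numpy as np

def upsample_conv(x, kernel):
    # Nearest-neighbour upsample by 2, then a same-padded 2-D
    # convolution. This replaces transposed convolution and avoids the
    # uneven kernel overlap that produces checkerboard artifacts.
    up = x.repeat(2, axis=0).repeat(2, axis=1)
    kh, kw = kernel.shape
    padded = np.pad(up, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(up)
    for i in range(up.shape[0]):
        for j in range(up.shape[1]):
            out[i, j] = np.sum(padded[i:i+kh, j:j+kw] * kernel)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
smooth = np.full((3, 3), 1.0 / 9.0)   # averaging kernel
y = upsample_conv(x, smooth)
```

A constant input stays exactly constant after this operation, whereas a stride-2 transposed convolution generally modulates it with a periodic (checkerboard) pattern.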
Virus evolution is important because it generates new mutations, which can be harmful to humans. Because evolution is a process along the time course, and many mathematical tools describe phenomena along the time course, it is possible to apply such tools to virus evolution. Neuraminidase is one of the two surface proteins of influenza A virus and plays an important role in influenza transmission, so it is important to model its evolution. However, a protein sequence must be converted into numerical values to be workable in mathematical tools, and a driving force for evolution must be defined. In this study, we first use the amino-acid pair predictability as a measure of the driving force for evolution to convert 3828 neuraminidases sampled from 1956 to 2008 into numerical values; second, we use a system of differential equations to describe the mutation of neuraminidases; and third, we use the analytical solution to fit the evolution of neuraminidases. The results show a promising and encouraging trajectory of neuraminidase evolution along the time course.
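The "analytic solution fitted to a time course" step can be illustrated with the logistic equation, a standard growth ODE used here purely as a stand-in (the paper's actual system of differential equations is not specified in the abstract). The sketch checks the closed-form solution against a forward-Euler integration of the same equation.

```python
import numpy as np

# Stand-in for the paper's approach: describe the drift of a per-year
# numerical value (e.g. an amino-acid-pair predictability measure) with
# a logistic ODE dx/dt = r*x*(1 - x/K), and use its analytic solution
# to fit the time course.
def logistic_analytic(t, x0, r, K):
    return K / (1.0 + (K / x0 - 1.0) * np.exp(-r * t))

def logistic_euler(t_end, x0, r, K, dt=1e-3):
    # Forward-Euler integration of the same ODE, for comparison.
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * r * x * (1.0 - x / K)
    return x

t = 10.0
analytic = logistic_analytic(t, x0=0.1, r=0.5, K=1.0)
numeric = logistic_euler(t, x0=0.1, r=0.5, K=1.0)
```

In a real fit, (x0, r, K) would be estimated by least squares against the 1956-2008 values.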
Image Classification Algorithms and Deep Learning Applications
Cassava is an important food-security crop in Africa because it can withstand harsh environments; however, viral diseases are a main cause of crop failure. Based on photos of cassava provided by local farmers in Africa, deep learning can identify common viral diseases so that they can be treated. This paper introduces an image classification algorithm based on an ensemble learning [4] model that combines Vision Transformer [2] and EfficientNet [1]. Experiments show that the proposed model improves on traditional image classification methods and can effectively help local farmers.
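A common way to ensemble two classifiers such as these is soft voting: average the class probabilities of both backbones and take the argmax. The sketch below assumes this combination rule (the abstract does not state which one the paper uses) and uses illustrative logit arrays in place of real model outputs.

```python
import numpy as np

# Soft-voting ensemble sketch: average the softmax probabilities of the
# two backbones (stand-in logits for "ViT" and "EfficientNet") and take
# the argmax per sample. The logit values are illustrative.
def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

vit_logits = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
eff_logits = np.array([[1.8, 0.9, 0.2], [0.1, 0.4, 1.9]])

probs = 0.5 * softmax(vit_logits) + 0.5 * softmax(eff_logits)
pred = probs.argmax(axis=1)
```

Averaging probabilities (rather than raw logits) keeps each model's contribution on a comparable scale regardless of its logit magnitudes.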
Target tracking based on RGB images is affected by factors such as illumination and haze, making it difficult to distinguish the tracked target from the background and easily leading to drift or even loss of the target. Target tracking based on infrared images is not affected by such illumination factors, but the target's color, texture, and other characteristic information is missing. Therefore, in order to obtain the target's color, texture, and other characteristic information in a poorly illuminated environment while tracking the object accurately and quickly, this paper proposes an RGB and infrared image fusion tracking method based on a deep convolutional network. Firstly, the fusion method for RGB and infrared images is studied; secondly, a target tracking network based on the Siamese network is established to extract convolutional features of the target template and the current search region; finally, the response map is calculated by a deep cross-correlation module. The algorithm is evaluated on the VOT2019-RGBT dataset. Experimental results show that it can effectively track targets that are partially occluded or filmed under poor or strongly changing illumination, which is of great significance for improving tracking accuracy in complex backgrounds.
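The cross-correlation step at the heart of Siamese trackers can be sketched directly: slide the template feature map over the search-region feature map and take the response peak as the target location. The feature maps below are synthetic placeholders, not outputs of the paper's network.

```python
import numpy as np

def xcorr(search, template):
    # Slide the template over the search region; each response value is
    # the inner product of the template with the underlying window.
    th, tw = template.shape
    H = search.shape[0] - th + 1
    W = search.shape[1] - tw + 1
    resp = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            resp[i, j] = np.sum(search[i:i+th, j:j+tw] * template)
    return resp

search = np.zeros((8, 8))
template = np.ones((3, 3))
search[4:7, 2:5] = 1.0          # "target" embedded at row 4, col 2
resp = xcorr(search, template)
peak = np.unravel_index(resp.argmax(), resp.shape)
```

In the depth-wise variant used by modern Siamese trackers, this correlation is applied per feature channel and the responses are combined by a small prediction head.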
Stereo matching methods based on deep learning are developing rapidly. Before the AANet+ adaptive aggregation stereo matching network was proposed, deep networks based on 3D convolution occupied the core position, but they suffered from a large number of parameters and high memory consumption in cost-volume construction and cost aggregation. AANet+ alleviates these defects of 3D convolution by virtue of deformable convolution together with intra-scale and cross-scale aggregation modules. However, AANet+ also has shortcomings: its stacked-hourglass modules extract features in multiple scale directions, but some information is lost when the scales are fused by plain convolution. Therefore, this paper uses the ECA channel attention module to improve the feature extraction module of AANet+, introducing channel attention into the traditional stacked-hourglass module to extract richer features. In addition, channel attention modules are introduced when obtaining multi-scale features, reducing information loss and increasing contextual connections. Experimental results show that the improved network achieves better results than AANet+ on the synthetic Scene Flow dataset and the real-world KITTI 2015 dataset.
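For reference, the ECA module mentioned here is lightweight enough to sketch in full: global average pooling per channel, a 1-D convolution across neighbouring channel descriptors, then a sigmoid gate that rescales each channel. The toy feature map and kernel below are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eca(x, kernel):
    # x: (C, H, W) feature map. ECA: global average pooling gives one
    # descriptor per channel; a 1-D conv mixes each descriptor with its
    # k-1 neighbours; the sigmoid of the result gates the channel.
    C = x.shape[0]
    desc = x.mean(axis=(1, 2))                       # (C,)
    k = len(kernel)
    padded = np.pad(desc, (k // 2, k // 2), mode="edge")
    gate = np.array([np.dot(padded[c:c+k], kernel) for c in range(C)])
    return x * sigmoid(gate)[:, None, None]

x = np.ones((4, 2, 2))
# identity 1-D kernel: each channel gated only by its own descriptor
out = eca(x, kernel=np.array([0.0, 1.0, 0.0]))
```

Unlike SE attention, ECA needs no fully-connected bottleneck, so it adds almost no parameters, which is why it suits the hourglass modules here.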
In deep-learning-based computer vision, the backbone network is very important, and its performance usually affects vision tasks such as object detection and segmentation. This article proposes GSnet, an improvement of GhostNet that incorporates ideas from ShuffleNetV2. The concatenate and channel-shuffle operations of ShuffleNetV2 are used to further improve GhostNet, and the attention module in GhostNet is improved and optimized: long-distance non-local features are used to strengthen SA attention before it is embedded in GhostNet. Experiments on the CIFAR-100 dataset verify that the method reduces the number of parameters and computations relative to the original network while improving top-1 accuracy by 1.26%. Code will be made available at https://github.com/Yipzcc.
The lack of lighting in the space environment results in low segmentation accuracy and target loss. To solve this problem, a satellite component tracking method based on few-shot learning is proposed in this paper. First, we design a convolutional neural network that takes the mask information of the first frame as input and outputs the true label and importance weight parameters. The few-shot learner combines the true labels, the importance weight parameters, and the first-frame features to generate target model parameters. Subsequent frames combine the target model parameters with extracted features, and the target mask is output after encoding and decoding. The algorithm is evaluated on a new satellite component dataset, and the simulation results show that the proposed method improves segmentation accuracy and reduces the target loss rate compared with SiamMask in low-light environments.
Human pose recognition based on skeleton node data collected by a depth camera is a key problem in the field of human-computer interaction. To improve the accuracy of human pose recognition, a new algorithm based on multiple features and a random forest model is proposed. Firstly, a 93-dimensional feature vector is defined, containing joint coordinate features and distance features, where the distance features are selected according to the spatial positions of the joints. Then, in the pose recognition process, the random forest model is combined with the Bagging algorithm to ensure sample balance and improve the classifier's performance on different samples. Finally, the constructed classifier is tested on the UTKinect-Action3D dataset. The experimental results show that the algorithm can effectively identify a variety of human postures, with a recognition rate above 90%. The fusion of multiple features is of great significance for improving the accuracy of human pose recognition.
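One plausible way to assemble a 93-dimensional vector from 20 depth-camera joints is 60 raw coordinates plus 33 selected inter-joint distances; the sketch below uses that split and an arbitrary pair selection purely as an assumption, since the abstract does not list the exact features.

```python
import numpy as np

# Sketch of the multi-feature vector: 20 Kinect-style joints give 60
# raw (x, y, z) coordinates; selected inter-joint Euclidean distances
# are appended to reach 93 dimensions. The 33 pairs chosen here are
# illustrative, not the paper's exact selection.
rng = np.random.default_rng(0)
joints = rng.normal(size=(20, 3))                    # one (x, y, z) per joint

pairs = [(i, j) for i in range(11) for j in range(i + 1, 11)][:33]
coord_feat = joints.reshape(-1)                      # 60 values
dist_feat = np.array([np.linalg.norm(joints[i] - joints[j])
                      for i, j in pairs])            # 33 values
feature = np.concatenate([coord_feat, dist_feat])    # 93-dimensional
```

The resulting vectors would then be fed to a bagged random-forest classifier as the abstract describes.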
Chaos theory, an essential component of modern nonlinear dynamical systems theory, provides a method for image encryption based on the Henon map. The method generates a pseudo-random sequence from a secret key using the Henon chaotic map; this sequence is then combined with the original image by exclusive-or (XOR) operations to produce the encrypted image, and the same key recovers the plain image during decryption. The simulation results and theoretical analysis we conducted demonstrate the strength of this algorithm in terms of randomness, confidentiality, sensitivity, and efficiency.
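The scheme described here is compact enough to sketch end to end: iterate the Henon map from key-dependent initial conditions, quantise the trajectory into a byte keystream, and XOR it with the image. The quantisation rule below is one reasonable choice, not necessarily the paper's.

```python
import numpy as np

def henon_keystream(n, x0=0.1, y0=0.3, a=1.4, b=0.3):
    # Iterate the Henon map x' = 1 - a*x^2 + y, y' = b*x and quantise
    # the trajectory into a byte keystream. (x0, y0) act as the key.
    x, y = x0, y0
    ks = np.empty(n, dtype=np.uint8)
    for i in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        ks[i] = int(abs(x) * 1e6) % 256
    return ks

def encrypt(img, key):
    flat = img.reshape(-1)
    ks = henon_keystream(flat.size, *key)
    return (flat ^ ks).reshape(img.shape)   # XOR; decryption is identical

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
key = (0.1, 0.3)
cipher = encrypt(img, key)
plain = encrypt(cipher, key)   # XOR twice with the same keystream restores the image
```

Because XOR is its own inverse, encryption and decryption share one function; key sensitivity follows from the chaotic map's sensitive dependence on (x0, y0).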
Retinal blood vessel segmentation plays an important part in the treatment of ocular disease. Lately, automatic segmentation based on deep learning, which can solve the low efficiency and strong subjectivity of manual segmentation, has attracted the attention of researchers. In this paper, a novel segmentation model called Residual-path Res-Dense Net (RRD-Net) is proposed for vessel segmentation. In RRD-Net, Res-blocks and Dense-blocks are used to speed up the convergence of the network and learn more intrinsic features. In addition, the introduction of the residual path reduces the semantic gap between connected features and eliminates its potential impact on segmentation accuracy. We use the benchmark datasets DRIVE and CHASE_DB1 to evaluate the proposed network; accuracy, sensitivity, and F1 score demonstrate the effectiveness of RRD-Net.
The idea of SDN provides a new way to solve the problems faced by network-layer mobility management in traditional networks. However, existing mobility management schemes in SDN still suffer from large network overhead and prolonged handover time while users move. In this paper, a network-layer mobility management mechanism supporting IPv6 in SDN is studied. Firstly, by analyzing and designing two key technologies, mobility awareness and mobility handover, and combining them with position prediction theory, an IPv6 mobility management mechanism supporting pre-handover in SDN is proposed. Secondly, according to the position prediction results, the mechanism calculates and issues the handover flow tables before the movement occurs, which reduces the signaling overhead and handover delay of the mobile node. Finally, using standard performance evaluation and analysis methods, the signaling cost and handover delay are compared with those of the OPMIPv6-C scheme, which is also implemented on SDN.
The college English teaching mode is constantly changing. Taking college students' autonomous English learning as its starting point, this paper introduces the concept of cloud computing, explores the problems in traditional college English teaching, and compares it with cloud-based autonomous English learning to understand its advantages. It then analyzes the future of cloud-computing-assisted autonomous English learning at three levels: universities, teachers, and students.
In order to make up for the lack of information from a single-band image in fire detection, improve fire detection and early-warning capability, and enhance the completeness and accuracy of fire-scene description, this paper proposes an image fusion algorithm based on NSST and feature weighting to fuse visible-light and infrared images of the fire area. First, the color visible-light and infrared images are converted to the HSI space and grayscaled to obtain the source images required for fusion. Secondly, the NSST transform is applied to the source images to obtain low-frequency and high-frequency subband coefficients. The low-frequency coefficients are then fused using contrast-feature weighting and color-transfer technology, and the high-frequency coefficients are fused using variance-weighted information-entropy feature weighting. Next, the inverse NSST and inverse HSI transforms are applied to the fused coefficients to obtain the fused image. Finally, images fused by different algorithms are evaluated through subjective visual perception and objective metrics, and the performance difference between traditional algorithms and the proposed algorithm is discussed. The experimental results show that the image fused by the proposed algorithm has higher contrast, richer texture, clearer outlines, and better visual effect.
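The high-frequency fusion rule can be illustrated with a simplified stand-in: at each coefficient, keep the value from whichever source band has the larger local variance. This is a plain variance rule, not the paper's full variance-weighted information-entropy measure, and the toy sub-bands are synthetic.

```python
import numpy as np

def fuse_highfreq(a, b, win=3):
    # Keep, per pixel, the high-frequency coefficient from the source
    # with larger local variance (a simplified stand-in for the paper's
    # variance-weighted information-entropy rule). Ties keep `a`.
    def local_var(x):
        pad = win // 2
        p = np.pad(x, pad, mode="edge")
        v = np.zeros_like(x)
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                v[i, j] = p[i:i+win, j:j+win].var()
        return v
    return np.where(local_var(a) >= local_var(b), a, b)

a = np.zeros((6, 6)); a[2:4, 2:4] = 1.0   # detail only in the "visible" band
b = np.zeros((6, 6)); b[0, 0] = 1.0       # detail only in the "infrared" band
fused = fuse_highfreq(a, b)
```

The fused sub-band inherits the detail from each band where that band is locally more active, which is the intent of activity-based fusion rules.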
Objective: To explore the relationship between resilience, emotional balance, and mental health. Methods: 512 college students were surveyed using a self-compiled questionnaire, the CD-RISC-10, the ABS, and the Kessler-10 scale. Results: As the epidemic was prolonged (after three months), the overall proportion of negative emotions continued to rise; resilience was at an upper-middle level, while emotional balance and mental health were at a middle level. Resilience differed significantly by gender (t=4.68, p<0.001), and positive affect and emotional balance differed significantly between urban and rural students (t=-2.853, -2.492, p<0.01). Resilience and Kessler-10 mental health scores were negatively correlated (r=-0.287, p<0.01). Conclusion: The epidemic has a certain impact on the emotions and lives of adolescents; improving their resilience and emotional balance can therefore promote mental health.
Few-shot classification, which aims to learn a precise classifier for novel classes from a few annotated support samples, is a challenging task. Current few-shot classification algorithms first extract prototypes from the support images and then use them to classify the query images with diverse, elaborately designed matching methods. However, current algorithms ignore the importance of prior knowledge, which seriously hampers generalized prototype learning and weakens classification performance. Meanwhile, the query features obtained from the extractor are in fact confounders, which disrupt the separation between different classes and dramatically limit classification performance. To tackle these challenges, we propose a novel prior-knowledge-guided few-shot classification network with class disentanglement. Specifically, the prior-knowledge-guided module boosts the original prototypes with more generalized knowledge in a selective combination manner, while the class disentanglement module disentangles the confounder (the extracted query features), eliminating the disruption between different classes. By combining prior knowledge and disentangling the confounder, the designed network efficiently addresses few-shot classification. Comprehensive experiments on miniImageNet and tieredImageNet demonstrate the superiority of the proposed network.
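For readers unfamiliar with the prototype pipeline this abstract builds on, the baseline step is simple: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. The embeddings below are synthetic stand-ins; the paper's contribution then refines these prototypes with prior knowledge.

```python
import numpy as np

# Baseline prototype matching: prototype = mean of support embeddings,
# query classified by nearest prototype (Euclidean distance).
rng = np.random.default_rng(0)
# Synthetic embeddings: 3 classes, 5 support samples each, 4-D features,
# centred at clearly separated means.
support = {c: rng.normal(loc=c * 3.0, size=(5, 4)) for c in range(3)}
prototypes = np.stack([support[c].mean(axis=0) for c in range(3)])

query = rng.normal(loc=6.0, size=4)            # drawn near class 2's centre
dists = np.linalg.norm(prototypes - query, axis=1)
pred = int(dists.argmin())
```

The confounding the paper targets arises when query features carry class-irrelevant variation that distorts exactly these distances.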
CT images play a vital role in the diagnosis of liver cancer. However, CT images often contain significant noise, which hampers doctors' diagnoses. In response to this problem, this paper applies the BM3D denoising algorithm to liver-cancer CT images. The BM3D algorithm first finds similar blocks through block matching, stacks these similar blocks into three-dimensional groups, performs collaborative filtering, and finally obtains the clear image through aggregation. Experimental results show that the BM3D algorithm can effectively remove the noise in liver-cancer CT images.
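The block-matching (grouping) stage of BM3D can be sketched on its own: compare a reference patch against every same-sized patch by sum of squared differences and keep the k most similar. This is only the grouping step on a synthetic image; the collaborative 3-D filtering and aggregation stages are omitted.

```python
import numpy as np

def match_blocks(img, ref_top, ref_left, size=4, k=3):
    # BM3D grouping step: rank all same-sized patches by SSD against the
    # reference patch and return the k best positions, which would then
    # be stacked into a 3-D group for collaborative filtering.
    ref = img[ref_top:ref_top+size, ref_left:ref_left+size]
    scores = []
    for i in range(img.shape[0] - size + 1):
        for j in range(img.shape[1] - size + 1):
            patch = img[i:i+size, j:j+size]
            scores.append((np.sum((patch - ref) ** 2), (i, j)))
    scores.sort(key=lambda s: s[0])
    return [pos for _, pos in scores[:k]]

rng = np.random.default_rng(0)
img = rng.normal(size=(12, 12))
img[8:12, 8:12] = img[0:4, 0:4]          # plant an exact copy of the reference
matches = match_blocks(img, 0, 0, size=4, k=2)
```

Real BM3D restricts the search to a window around the reference and thresholds the SSD, but the ranking logic is the same.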
Model search based on the distillation framework aims to train candidate models adequately and to guide a correct evaluation of each architecture. This NAS method can easily obtain intermediate-level supervision indicators and thus significantly improves the results. However, distillation-based model search also has shortcomings. First, supervision indicators differ greatly across teacher-student pairs, so determining a highly adaptable supervision indicator is an important issue. Second, different teacher models introduce different biases. To address these problems, this paper proposes the following measures. Firstly, it adopts a more adaptable supervision indicator, which effectively handles the large differences between teacher-student pairs. Secondly, to reduce the bias introduced by the teacher model, it adopts the largest teacher model as the guidance model during network training. Finally, it uses a reinforcement learning algorithm to guide the search within the internal network and introduces more supervision signals, making the layer-wise supervision effect more pronounced. The results show that these measures effectively improve model performance and consistency.
Medical image segmentation is an essential task in the field of image recognition and has high accuracy requirements. As a popular model, U-net has achieved much, such as localizing lesion areas and segmenting objects, but it falls short when segmenting very small details. In this paper, we improve the U-net algorithm by adding ResBlock and Batch Normalization modules and by increasing the network depth, so as to improve its accuracy for tumor segmentation in brain medical images. The K-Means algorithm is also used to separate the brain region from the background in order to find the relative size of the brain occupied by the tumor, achieving both image segmentation and tumor size measurement.
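The K-Means ratio computation can be sketched on a synthetic "scan": cluster pixel intensities into brain versus background, then divide the tumor-mask area by the brain-cluster area. The 1-D k-means and the toy image below are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20):
    # Minimal 1-D k-means over pixel intensities, used to separate the
    # (brighter) brain region from the (dark) background.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = values[labels == c].mean()
    return labels, centers

# Synthetic "scan": dark background with a brighter brain disc; the
# tumor is a sub-region of the brain given by a separate mask.
img = np.zeros((32, 32))
yy, xx = np.mgrid[0:32, 0:32]
brain = (yy - 16) ** 2 + (xx - 16) ** 2 < 12 ** 2
img[brain] = 0.7
tumor = (yy - 16) ** 2 + (xx - 12) ** 2 < 3 ** 2   # e.g. from the U-net output

labels, centers = kmeans_1d(img.reshape(-1))
brain_pixels = (labels == centers.argmax()).sum()  # cluster with the higher centre
tumor_ratio = tumor.sum() / brain_pixels
```

In practice the tumor mask would come from the improved U-net, and the ratio gives the relative tumor size the abstract mentions.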
The DUDnCNN algorithm combines a feedforward denoising network with a U-Net whose symmetric connections enable better feature fusion. The first improved algorithm proposed here changes the network structure and adds asymmetric connections to achieve feature fusion between different layers. In addition, improved networks with respect to residual learning and different connection schemes are proposed and trained under different noise levels. The experimental comparison shows that the improved DUDnCNN algorithm offers better fidelity and noise reduction as well as a certain generalization ability, and that it merits further investigation in seismic-data fidelity, noise reduction, and seismic interpretation.
Traditional image classification methods have defects: they cannot process massive image data and cannot meet the speed and accuracy requirements of image classification. Deep learning outperforms traditional machine learning in computer vision and has become the mainstream method for image classification. This paper surveys the commonly used deep-learning models in the field of image classification, analyzes their error rates, architecture designs, and application scenarios, compares the current networks with outstanding classification performance through experiments, and verifies the advantages and disadvantages of each model. Finally, the development trend of deep learning in image classification is summarized, and possible future research directions are discussed.
With the rapid development of society, science, and technology, instruments are widely used in settings ranging from daily life to the chemical industry, electric power, and aerospace. Pointer instruments are still widely used in traditional industries, and upgrading them intelligently without changing the original equipment structure remains an urgent research direction. The object of this paper is a kind of pointer pressure instrument used in industry, and its purpose is to build a reading system for such instruments based on an embedded hardware platform and a simple software algorithm. The designed system automatically reads the value indicated by the instrument dial. It runs on embedded hardware with a Linux + Qt + OpenCV design, using an open-source, cross-platform, and relatively simple image-processing pipeline that combines Canny edge detection, Hough circle detection, and Hough line detection.
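After the Hough steps locate the dial centre (circle) and the needle (line), the reading follows from the needle angle by linear interpolation over the dial's sweep. The sketch below assumes a typical sweep of 225 deg down to -45 deg for a 0-1.6 MPa gauge; both the sweep and the range are illustrative, not taken from the paper.

```python
import numpy as np

def needle_reading(center, tip, angle_min=225.0, angle_max=-45.0,
                   value_min=0.0, value_max=1.6):
    # Map the needle angle (dial centre -> needle tip) linearly onto the
    # gauge's value range. angle_min is the zero mark, angle_max the
    # full-scale mark; image coordinates have y pointing down.
    dx = tip[0] - center[0]
    dy = center[1] - tip[1]
    angle = np.degrees(np.arctan2(dy, dx))
    if angle < angle_max:              # fold into the dial's sweep range
        angle += 360.0
    frac = (angle_min - angle) / (angle_min - angle_max)
    return value_min + frac * (value_max - value_min)

# Needle pointing straight up (90 deg) is halfway through the sweep.
reading = needle_reading(center=(100, 100), tip=(100, 50))
```

With OpenCV, `center` would come from `cv2.HoughCircles` and `tip` from the longest `cv2.HoughLinesP` segment passing near the centre.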
Over the past decades, natural language processing (NLP) has been a hot topic in many fields, e.g., sentiment analysis and news topic classification. As a very powerful pre-trained language model, Bidirectional Encoder Representations from Transformers (BERT) has achieved promising results in many language understanding tasks, including text classification. However, how to fine-tune BERT efficiently for different text classification tasks remains a critical problem. In this paper, a general solution is proposed for fine-tuning BERT on single-text classification tasks. Compared with traditional fine-tuning strategies that skip any further pre-training step, the performance of BERT is boosted by pre-training on within-task data. Moreover, the proposed solution obtains superior results on six widely used text classification datasets.
Nowadays, the development of computing power has turned the users of social and e-commerce platforms from information receivers into publishers, which stimulates the need for sentiment analysis. Generally speaking, there are two main approaches to sentiment analysis, namely sentiment-lexicon methods and machine learning methods, with current research focusing on deep learning. To explore the applicability of deep learning for analyzing consumer psychology in the real world, this paper first uses a crawler to obtain consumer reviews of the latest models of four mobile phone brands (Apple, Huawei, Xiaomi, and Samsung) from two major e-commerce platforms in China. Then, 20,000 reviews are collected and labelled for sentiment. The preprocessed data are combined with a keyword extraction method to define first- and second-level categories, and the data in each category are fed into a text sentiment analysis model based on Bi-LSTM and a sentiment lexicon. By scoring the reviews, the sentiment orientation of users towards different mobile phone brands is obtained. Finally, this study gives corresponding suggestions and countermeasures. This set of analysis models can help mobile phone developers dynamically monitor changes in consumer sentiment and grasp trends in the mobile phone market in time.
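The lexicon side of such a hybrid model can be sketched in a few lines: sum the polarity weights of matched sentiment words, with a simple negation flip. The tiny lexicon, negator set, and scoring rule below are illustrative assumptions, far simpler than the paper's Bi-LSTM-plus-lexicon model.

```python
# Toy lexicon-based review scoring with one-step negation handling.
# Both the lexicon entries and their weights are illustrative.
LEXICON = {"great": 1.0, "fast": 0.5, "slow": -0.5, "terrible": -1.0}
NEGATORS = {"not", "never"}

def score(review):
    words = review.lower().split()
    total, negate = 0.0, False
    for w in words:
        if w in NEGATORS:
            negate = True          # flip the polarity of the next hit
            continue
        if w in LEXICON:
            total += -LEXICON[w] if negate else LEXICON[w]
        negate = False
    return total

s1 = score("great phone fast delivery")
s2 = score("not great terrible battery")
```

In the hybrid setup, such lexicon scores would complement the Bi-LSTM's learned sentiment prediction rather than replace it.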
In order to ensure that a nondestructive-testing climbing robot can track a weld accurately and in real time, a method for weld edge detection and centerline extraction is designed. A histogram-based method is used to achieve fast median filtering of the image, and the background and the target weld are segmented by adaptive thresholding. To obtain the contour information of the weld, the binarized image undergoes secondary filtering and area filling, combined with morphological opening and erosion operations. After Canny edge detection and Hough-transform line detection, the equations of the straight lines on both sides of the weld are obtained, and the centerline position of the weld is calculated. Experimental results show that this method greatly reduces the amount of computation, shortens the image-processing time, and can accurately identify the center of the weld.
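The "histogram-based fast median filter" mentioned above commonly refers to a running-histogram scheme (Huang's method) for 8-bit images: instead of re-sorting the window at every pixel, the histogram is updated incrementally as the window slides. A minimal sketch of that idea, assuming a square window and edge-replicated borders (an illustration, not the paper's implementation):

```python
import numpy as np

def median_filter_hist(img, k=3):
    """Sliding-window median for an 8-bit image using a running 256-bin
    histogram: subtract the leaving column, add the entering column,
    then scan the histogram for the median rank."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    half = (k * k) // 2 + 1                  # rank of the median element
    for r in range(img.shape[0]):
        hist = np.zeros(256, dtype=int)
        for dr in range(k):                  # build histogram for the first window
            for dc in range(k):
                hist[padded[r + dr, dc]] += 1
        for c in range(img.shape[1]):
            if c > 0:                        # slide the window one column right
                for dr in range(k):
                    hist[padded[r + dr, c - 1]] -= 1      # column leaving
                    hist[padded[r + dr, c - 1 + k]] += 1  # column entering
            csum = 0
            for v in range(256):             # smallest value reaching the median rank
                csum += hist[v]
                if csum >= half:
                    out[r, c] = v
                    break
    return out
```

The per-pixel cost is bounded by the histogram updates rather than a full window sort, which is the source of the speed-up the abstract reports.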
The SpCL algorithm introduces a unified contrastive loss function built on a hybrid memory model and, combined with an adaptive clustering criterion, achieves the best results on the cross-domain pedestrian re-identification task. However, the algorithm ignores many basic aspects of the pedestrian re-identification task, which limits its performance. In this work, the algorithm is improved in several respects, and we call the improved algorithm SpCL++. Experiments show that SpCL++ performs much better than SpCL and achieves the current state-of-the-art results on cross-domain pedestrian re-identification.
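The unified contrastive loss in SpCL is an InfoNCE-style objective computed against a hybrid memory whose entries mix source-class centroids, target-cluster centroids, and un-clustered instance features. A simplified numpy sketch of that loss for a single query (assumed temperature and interface; not the authors' code):

```python
import numpy as np

def unified_contrastive_loss(query, memory, pos_idx, tau=0.05):
    """InfoNCE-style loss over an L2-normalised hybrid memory.
    `memory` is an (N, d) array of memory entries; `pos_idx` is the
    entry the query should be pulled toward, all others act as negatives."""
    q = query / np.linalg.norm(query)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    logits = m @ q / tau                    # cosine similarities / temperature
    logits -= logits.max()                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[pos_idx])          # cross-entropy toward the positive
```

The loss is small when the query is close to its assigned centroid and far from the rest of the memory, which is exactly the pull/push behaviour the clustering criterion regulates.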
Implicit correlations exist among multi-source geospatial data, but these association relations cannot be displayed intuitively or retrieved effectively, which makes the data difficult to use and share. To address this, an association-relation topic map is constructed in this article using the topic-map tool Ontopia. In addition, to improve the efficiency of building topic maps, an automatic generation algorithm for association-relation topic maps, implemented in C#, is proposed. Experimental results show that the association-relation topic map can be constructed correctly using the Ontopia tool together with the automatic generation algorithm.
Automatic ABO blood-type recognition enables automated blood typing, effectively improving the speed and accuracy of blood-type analysis, and has wide clinical applications. In this paper, an automatic blood-type recognition algorithm based on image processing is designed. First, the blood-type card image is preprocessed with image enhancement and median filtering. Next, the micro-column tubes are segmented by template matching. Threshold analysis then converts the grayscale image into a binary image. Finally, the distribution of red-blood-cell aggregates in the micro-column tubes is identified to determine the ABO blood type of the sample. Experimental results show that the algorithm can effectively segment the micro-column tubes on the blood-type card, recognize the red-blood-cell distribution, and complete the interpretation of the blood group.
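The final interpretation step above rests on a simple physical fact: in gel-card typing, agglutinated cells are trapped near the top of the micro-column while free cells settle at the bottom. A hypothetical decision rule sketching that step on a binarized tube image (the threshold `ratio` and the top/bottom split are illustrative assumptions, not the paper's parameters):

```python
import numpy as np

def agglutination_positive(binary_col, top_frac=1 / 3, ratio=1.0):
    """Call a reaction positive when at least `ratio` times as many
    cell pixels (value 1) sit in the top `top_frac` of the binarized
    micro-column as in the bottom `top_frac`."""
    h = binary_col.shape[0]
    k = max(1, int(h * top_frac))
    top = binary_col[:k].sum()               # agglutinates stay at the top
    bottom = binary_col[-k:].sum()           # free cells pellet at the bottom
    return top >= ratio * max(bottom, 1)

def abo_type(anti_a_pos, anti_b_pos):
    """Map the anti-A / anti-B reaction pattern to an ABO group."""
    return {(True, True): "AB", (True, False): "A",
            (False, True): "B", (False, False): "O"}[(anti_a_pos, anti_b_pos)]
```

Running the rule on the anti-A and anti-B tubes of one card and combining the two boolean results yields the sample's ABO group.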
The rehabilitation model for stroke patients is relatively simple and cumbersome. A "rehabilitation +" smart home service system is built, interaction-innovation research is carried out on its game-community component, and serious-game interaction design scenarios are introduced to increase patients' enthusiasm for rehabilitation and improve its efficiency. This paper analyzes the current design status of existing rehabilitation products through a review of domestic and foreign literature, identifies user pain points, maps the tedious and disordered pain points into a scene-demand classification model, and derives user-demand insights across objectives, demands, experience, and application scenarios. Combined with the KJ method, these insights are summarized to meet the rehabilitation needs of stroke patients: an information-architecture diagram is built, a concise, efficient, and usable "rehabilitation +" smart home interactive system is constructed, and interactive, innovative designs for serious games are developed. The approach improves the efficiency of home rehabilitation for stroke patients through engaging gameplay and achieves high-quality rehabilitation outcomes, thereby saving time, labor, and social costs. The organic combination of medical rehabilitation, virtual equipment, and serious games will be a development trend of future smart medical care, providing new ideas for smart rehabilitation.
Deep learning is a relatively new field in machine learning research that constructs structured models to extract features by simulating the cognitive mechanisms of the human brain. The whole training process requires only the cooperative work of computers, with no human involvement. In this paper, we explore the application and prospects of image recognition for power-grid equipment identification, with respect to the principles of deep learning and the current state of development of image recognition.