This work presents the development of an autonomous mapping and navigation system tailored for a holonomic robot operating in unknown environments, leveraging Light Detection and Ranging (LiDAR) technology. The core of this research lies in integrating a sophisticated Simultaneous Localization and Mapping (SLAM) algorithm with real-time LiDAR data processing, exploiting the GPU's parallel capabilities to process several subspaces concurrently. By fusing advances in hardware with sophisticated algorithmic techniques, this work introduces a novel method to accelerate route planning for an omnidirectional mobile robot, establishing a new paradigm in path planning for omnidirectional mobile robots and marking an important milestone in the search for more agile and capable robotic systems.
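The abstract does not spell out the subspace decomposition, so the following is only a rough sketch: it scores a batch of candidate path segments against a costmap in one vectorized NumPy pass, the same data-parallel pattern a GPU kernel would map onto threads. The function `batch_segment_cost`, the grid size, and the random costmap are all hypothetical.

```python
# Hypothetical sketch: score many candidate path segments at once.
# A GPU kernel would evaluate each segment (or subspace) in its own thread.
import numpy as np

def batch_segment_cost(costmap, starts, goals, samples=32):
    """Total traversal cost of N straight segments, evaluated in one pass."""
    t = np.linspace(0.0, 1.0, samples)                          # (S,)
    pts = starts[:, None, :] + t[None, :, None] * (goals - starts)[:, None, :]
    rows = np.clip(pts[..., 0].round().astype(int), 0, costmap.shape[0] - 1)
    cols = np.clip(pts[..., 1].round().astype(int), 0, costmap.shape[1] - 1)
    return costmap[rows, cols].sum(axis=1)                      # (N,) costs

costmap = np.random.rand(100, 100)                              # toy cost grid
starts = np.zeros((64, 2))
goals = np.random.randint(0, 100, size=(64, 2)).astype(float)
print(batch_segment_cost(costmap, starts, goals)[:5])
```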
This paper presents the development of an autonomous navigation system for Unmanned Aerial Vehicles (UAVs) using visual references. The proposal employs a Convolutional Neural Network (CNN) to classify traffic signal images, enabling UAVs to navigate dynamic, evolving environments. This research involves the configuration of the Robot Operating System (ROS) for UAV communication, the implementation of a specialized CNN for image classification, and the integration of this network into the navigation system. The result is a system for image acquisition and UAV manipulation based on CNN outputs. We present experimental results specifically designed to demonstrate the efficiency of the proposal and to validate the analysis and implementation.
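The network is not specified layer by layer in the abstract; a minimal sketch of such a classifier in Keras, assuming 64x64 RGB crops and three signal classes (red, yellow, green), might look like this:

```python
# Minimal CNN classifier sketch; layer sizes and the three-class
# output are assumptions, not the paper's architecture.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_signal_classifier(input_shape=(64, 64, 3), num_classes=3):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_signal_classifier()
model.summary()
```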
This paper presents the development of a LiDAR-based object classification system using machine learning and signal processing. The proposal explores Support Vector Machines (SVM) and neural networks to classify terrain with the help of a LiDAR that scans an area much as a camera takes a picture. The project involves processing the data to generate a point cloud that lets us visualize the scans taken by the Light Detection and Ranging (LiDAR) sensor. The dataset was built by taking multiple scans of three types of terrain: flat, grassy, and rocky. This paper shows experimental results of machine learning models built around LiDAR-acquired data and small datasets, along with point cloud visualizations and a simple signal processing technique.
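The abstract does not list the features fed to the SVM; the sketch below assumes simple per-scan height statistics and synthetic point clouds, purely to illustrate the pipeline:

```python
# Illustrative only: terrain classification with an SVM over assumed
# height statistics; the paper's actual features are not specified.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def scan_features(points):
    """points: (N, 3) LiDAR point cloud; returns a small feature vector."""
    z = points[:, 2]
    return np.array([z.std(), z.max() - z.min(), np.abs(np.diff(z)).mean()])

# Synthetic stand-in data: 0 = flat, 1 = grassy, 2 = rocky
rng = np.random.default_rng(0)
scans = [rng.normal(scale=s, size=(500, 3))
         for s in (0.01, 0.05, 0.2) for _ in range(20)]
y = np.repeat([0, 1, 2], 20)
X = np.array([scan_features(p) for p in scans])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
print(clf.score(X, y))
```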
The colorization of monochromatic images has demonstrated utility in enhancing human comprehension of images and boosting the accuracy of subsequent image-processing tasks. Nonetheless, the performance of current fully automated colorization methodologies often depends on the nature of the input image and the architectural specifics of the employed algorithms. In response to this challenge, this paper introduces a novel methodology aimed at effectively predicting the most suitable colorization model for a given input image. The comprehensive approach is characterized by exceptional accuracy across diverse datasets.
KEYWORDS: Image processing, Autonomous vehicles, Cameras, Detection and tracking algorithms, Unmanned vehicles, RGB color model, Mobile robots, Algorithm development, Image segmentation, Control systems
This paper presents the implementation of a lane detection and tracking algorithm for the autonomous navigation of an Ackermann-steering mobile robot. The proposed implementation employs an RGB camera mounted on the robot; the image information is processed through the lane detection and tracking algorithm to define the robot's present and future position within the lane. This information is used to determine the wheel orientation required to steer the robot within the lane. The implementation employs a Raspberry Pi as the primary logic controller to process the image received from the RGB camera. The Ackermann-steering mobile robot performs steering and navigation with a proportional-integral-derivative (PID) controller that manages the orientation of the steering. Experimental results are presented to validate the implementation on a physical Ackermann-steering mobile robot.
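A minimal version of the PID steering loop, assuming the lane detector outputs the lateral offset of the lane center from the image center; the gains are illustrative, not the paper's:

```python
# Minimal PID steering sketch; gains and the pixel-offset input are assumed.
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error, dt):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID(kp=0.02, ki=0.001, kd=0.01)
lateral_offset_px = 35.0            # lane center minus image center (assumed)
steer_deg = pid.update(lateral_offset_px, dt=1 / 30)
print(f"steering command: {steer_deg:.2f} deg")
```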
This paper presents an in-depth exploration of a Neural Network designed to recolor grayscale images with minimal input requirements. The paper delves into the intricate process of training the network, which involves carefully selecting a fitness function and creating an effective adversarial network. Throughout the paper, various alternatives are considered and evaluated until a suitable approach is identified for further training. Notably, the implementation adopts a random batch sampling approach to gather images in each batch selection, allowing for diverse and comprehensive training. Moreover, several techniques, including Batch Normalization, Leaky ReLU, and Label Smoothing, are strategically employed to tackle challenges related to generalization and achieve a balanced interplay between the generator and discriminator. The experimental results are thoroughly discussed, showcasing the substantial progress achieved in addressing the problem at hand. Remarkably, the Neural Network attains a Structural Similarity Index (SSIM) of -0.5944 on the test set and -0.5922 on the training set, signifying its proficiency in accurately recoloring grayscale images. This paper contributes valuable insights into the realm of image recoloring using neural networks and demonstrates the effectiveness of the proposed methodology in achieving good results.
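SSIM figures like those reported above can be computed with a standard implementation; the sketch below uses scikit-image as a stand-in for the authors' evaluation code, on synthetic data:

```python
# SSIM between a ground-truth image and a recolored output;
# scikit-image is an assumed stand-in for the authors' metric code.
import numpy as np
from skimage.metrics import structural_similarity as ssim

rng = np.random.default_rng(1)
truth = rng.random((128, 128, 3))
recolored = np.clip(truth + rng.normal(scale=0.1, size=truth.shape), 0, 1)

score = ssim(truth, recolored, channel_axis=-1, data_range=1.0)
print(f"SSIM: {score:.4f}")
```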
KEYWORDS: LIDAR, Sensors, Cameras, Environmental sensing, Robotics, Monte Carlo methods, Mobile robots, 3D metrology, Tunable filters, Stereoscopic cameras
This paper presents the implementation of localization algorithms for indoor autonomous mobile robots in known environments. The proposed implementation employs two sensors, an RGB-D camera and a 2D LiDAR, to detect the environment and map an occupancy grid that allows the robot to perform autonomous/remote navigation throughout the environment while localizing itself. The implementation uses the data retrieved from the perception sensors and odometry to estimate the position of the robot through the Monte Carlo Localization algorithm. The proposed implementation employs the Robot Operating System (ROS) framework on an NVIDIA Jetson TX2 and the TurtleBot 2. Experimental results were obtained using a physical implementation of the mobile robot in an indoor environment.
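A stripped-down Monte Carlo Localization cycle (predict with odometry, weight by a sensor likelihood, resample) conveys the core of the algorithm; the likelihood function here is a placeholder, and the odometry update ignores the heading transform for brevity:

```python
# Simplified MCL step; the sensor likelihood is a placeholder and the
# motion model skips the heading rotation for brevity.
import numpy as np

rng = np.random.default_rng(2)
N = 500
particles = rng.uniform([0, 0, -np.pi], [10, 10, np.pi], size=(N, 3))  # x, y, theta

def mcl_step(particles, odom_delta, likelihood_fn):
    # Predict: apply the odometry increment with additive noise
    noisy = odom_delta + rng.normal(scale=[0.05, 0.05, 0.01], size=particles.shape)
    particles = particles + noisy
    # Update: weight each particle by how well it explains the scan
    w = np.array([likelihood_fn(p) for p in particles])
    w /= w.sum()
    # Resample proportionally to weight
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

particles = mcl_step(particles, np.array([0.1, 0.0, 0.02]),
                     likelihood_fn=lambda p: np.exp(-np.linalg.norm(p[:2] - [5, 5])))
print(particles.mean(axis=0))       # crude pose estimate
```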
KEYWORDS: 3D modeling, Tunable filters, Face image reconstruction, 3D image reconstruction, Cameras, RGB color model, Data modeling, 3D image processing, Light sources and illumination, Image processing
This work focuses on the description of face reconstruction using several imaging techniques. The main purpose is to explore various opto-electronic configurations for camera capture in order to obtain an accurate reconstruction. In this work, we used different camera technologies and lenses for face reconstruction. We compared the analyzed techniques and applied objective measures to evaluate the best camera configuration for reconstruction accuracy. Computer simulation results are presented to evaluate the proposed system in terms of reconstruction accuracy and computational efficiency.
This paper presents a comparison between implementations of different convolutional neural network models, varying the usage of pooling layers, to address the problem of hiragana character classification. The study focuses on understanding how the selection and usage of different pooling layers affect a model's accuracy convergence. To assess this, eight models were tested with different configurations using minimum pooling, average pooling, and max pooling schemes. Experimental results to validate the analysis and implementation are provided.
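For reference, the three pooling schemes compared above can be expressed in Keras as follows; since there is no built-in minimum pooling layer, it is written here as -MaxPool(-x):

```python
# Sketch of the three pooling variants; minimum pooling is expressed
# via the identity min(x) == -max(-x).
import tensorflow as tf
from tensorflow.keras import layers

def pooling_layer(kind):
    if kind == "max":
        return layers.MaxPooling2D()
    if kind == "avg":
        return layers.AveragePooling2D()
    if kind == "min":
        return lambda x: -layers.MaxPooling2D()(-x)
    raise ValueError(kind)

x = tf.random.uniform((1, 8, 8, 1))
for kind in ("max", "avg", "min"):
    print(kind, pooling_layer(kind)(x).shape)
```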
This paper presents the implementation of a convolutional neural network employing two different malware datasets. These datasets are converted to images, processed, and resized to 64x64 pixels. Through image processing, the convolutional neural network can accurately classify the malware families in the datasets. Experimental results to validate the analysis and implementation are provided; they were specifically designed to show the proposal's effectiveness and efficiency.
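One common way to render a binary as a 64x64 grayscale image, consistent with the description above; the truncate-or-tile padding policy is an assumption:

```python
# Bytes-to-image conversion sketch; the padding/truncation policy
# is assumed, not taken from the paper.
import numpy as np
from PIL import Image

def malware_to_image(raw: bytes, side: int = 64) -> Image.Image:
    buf = np.frombuffer(raw, dtype=np.uint8)
    buf = np.resize(buf, side * side)        # truncate or tile to fit
    return Image.fromarray(buf.reshape(side, side), mode="L")

raw = bytes(range(256)) * 20                 # placeholder for a real binary
malware_to_image(raw).save("sample.png")
```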
This paper presents the implementation of a driving assistance algorithm based on semantic segmentation. The proposed implementation uses a convolutional neural network architecture known as U-Net to perform image segmentation of traffic scenes captured by the self-driving car during navigation; the segmentation assigns a specific class to every pixel. The driving assistance algorithm uses the data retrieved from the semantic segmentation to evaluate the environment and provides the results to the self-driving car to support its decision making. The evaluation is based on the frequency of the pixels of each class and on an equation that calculates the importance weight of each pixel from its position and its class. Experimental results are presented to evaluate the feasibility of the proposed implementation.
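The paper's weighting equation is not reproduced here, so the sketch below only illustrates the idea: an assumed per-class base weight scaled by a position term that favors the bottom-center of the frame, where the vehicle's immediate path lies:

```python
# Illustrative pixel-importance map; both the class weights and the
# position term are assumptions standing in for the paper's equation.
import numpy as np

CLASS_WEIGHT = {0: 0.1, 1: 1.0, 2: 2.0}     # e.g., sky, road, pedestrian (assumed)

def importance_map(seg):
    h, w = seg.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = (ys / h) * (1.0 - np.abs(xs - w / 2) / (w / 2))  # favors bottom-center
    cls = np.vectorize(CLASS_WEIGHT.get)(seg)
    return cls * pos

seg = np.random.randint(0, 3, size=(120, 160))
print(importance_map(seg).sum())
```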
KEYWORDS: Sensors, Mobile robots, Cameras, Environmental sensing, Computer simulations, Monte Carlo methods, Navigation systems, 3D modeling, Robotics, Mathematical modeling
This paper presents the implementation of a simultaneous localization and mapping (SLAM) algorithm for autonomous mobile robot navigation. The proposed implementation uses an RGB-D camera to detect the environment and map an occupancy grid that allows the mobile robot to perform autonomous navigation through the environment. The implementation employs the Robot Operating System (ROS) and Adaptive Monte Carlo Localization to estimate the mobile robot's current position in the environment from the data retrieved from the RGB-D camera and the odometry. The mobile robot performs autonomous navigation by verifying that it can move safely while avoiding obstacles. Experimental results are presented to validate the implementation.
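The occupancy grid can be maintained with the standard log-odds update; whether the authors use exactly this form is an assumption, but it is the usual construction:

```python
# Standard log-odds occupancy update; the constants are illustrative.
import numpy as np

L_OCC, L_FREE, L_PRIOR = 0.85, -0.4, 0.0

def update_cell(logodds, hit: bool):
    return logodds + (L_OCC if hit else L_FREE) - L_PRIOR

grid = np.zeros((50, 50))                   # log-odds, 0 = unknown
grid[10, 10] = update_cell(grid[10, 10], hit=True)
prob = 1 - 1 / (1 + np.exp(grid[10, 10]))   # back to occupancy probability
print(prob)
```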
This paper presents the implementation of a mobile service robot with a manipulator and a navigation stack that interacts with and moves through an environment to provide a delivery-type service. The implementation uses a LiDAR sensor and an RGB-D camera to navigate and detect objects that can be picked up by the manipulator and delivered to a target location. The robot navigation stack includes mapping, localization, obstacle avoidance, and trajectory planning for robust autonomous navigation across an office environment. The manipulator uses the RGB-D camera to recognize specific objects that can be picked up. Experimental results are presented to validate the implementation and its robustness.
In this paper, we present a reliable method for three-dimensional object reconstruction based on multi-camera sensor arrays. Camera arrays are an efficient strategy for high-performance imaging and view interpolation. However, challenges arise, such as data size, real-time processing, and the need for automatic calibration. We analyze several methods for three-dimensional object digitalization using camera arrays. This analysis allows us to compare different algorithms and to combine several 3D reconstruction methods in an efficient and fast way. Experimental results are presented using real and synthetic laboratory objects in terms of objective metrics for 3D reconstruction accuracy. The robustness of these methods for real applications is discussed.
The present paper explores the implementation of the RRT* path planning algorithm, aided by a depth sensor, on a physical robot for path planning and re-planning in a partially known or unknown environment. The robot is capable of omnidirectional motion and aims to move from a starting location to a goal location in different environments. The proposed algorithm allows the robot to move through a map while avoiding collisions by detecting unknown obstacles and updating the map for further planning and motion if required. The implementation and experimental results are presented for indoor environments with partial or no prior knowledge of the environment, in order to achieve autonomous navigation for a holonomic drive robot using a depth camera as the optical sensing device.
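A compact 2D RRT* core (steer, choose the cheapest parent, rewire neighbors) is sketched below; collision checking against the depth-sensor map is reduced to a single circular obstacle, and all constants are illustrative:

```python
# Compact RRT* sketch; the obstacle, step size, and rewiring radius
# are assumptions, and segment collision checks are omitted for brevity.
import math, random

random.seed(3)
OBST, R = (5.0, 5.0), 1.5                   # assumed circular obstacle
STEP, RADIUS = 0.5, 1.2

def collision_free(p):
    return math.dist(p, OBST) > R

def steer(a, b):
    d = math.dist(a, b)
    t = min(1.0, STEP / d) if d > 0 else 0.0
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

nodes = [(1.0, 1.0)]
parent, cost = {0: None}, {0: 0.0}
for _ in range(2000):
    sample = (random.uniform(0, 10), random.uniform(0, 10))
    near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
    new = steer(nodes[near], sample)
    if not collision_free(new):
        continue
    # choose the cheapest parent within RADIUS, then rewire neighbors
    nbrs = [i for i in range(len(nodes)) if math.dist(nodes[i], new) < RADIUS]
    best = min(nbrs, key=lambda i: cost[i] + math.dist(nodes[i], new))
    nodes.append(new)
    k = len(nodes) - 1
    parent[k], cost[k] = best, cost[best] + math.dist(nodes[best], new)
    for i in nbrs:
        c = cost[k] + math.dist(new, nodes[i])
        if c < cost[i]:
            parent[i], cost[i] = k, c
print(len(nodes), "nodes; cost to last:", round(cost[len(nodes) - 1], 2))
```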
This paper presents the implementation of mapping, localization, and navigation algorithms for a mobile service robot in an unknown environment. The implementation uses a 3D LiDAR sensor to detect the environment and map an occupancy grid that allows global localization and navigation through the environment. The robot estimates its current position through the Monte Carlo localization algorithm using LiDAR and odometry data. The navigation stack uses inflation to determine whether the service robot can safely navigate through the environment and avoid obstacles. Experimental results were obtained using a simulated robot in an indoor environment without prior knowledge of the obstacles present in the environment.
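Costmap inflation of the kind mentioned above can be sketched with a distance transform: every free cell within the robot's radius of an obstacle is marked lethal. The radius value below is illustrative:

```python
# Inflation sketch: mark cells within the robot's radius of an obstacle
# as lethal; the radius in cells is an assumed value.
import numpy as np
from scipy.ndimage import distance_transform_edt

def inflate(occupancy, robot_radius_cells=3):
    dist = distance_transform_edt(~occupancy.astype(bool))
    return occupancy | (dist <= robot_radius_cells)

grid = np.zeros((20, 20), dtype=bool)
grid[10, 10] = True                          # one obstacle cell
print(inflate(grid).sum(), "lethal cells after inflation")
```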
A robust algorithm for Japanese handwritten hiragana character classification is proposed, using a machine learning approach with minimal training data to reduce computational power and time consumption. The proposed algorithm utilizes image recognition techniques to process samples from a dataset. Six different models involving convolutional neural networks are implemented using previously processed image templates, in order to achieve strong results with the least possible amount of training data. Prediction results were evaluated by splitting the dataset into training and validation data at a ratio of 5:95, respectively, achieving 96.95% as the highest accuracy across the different models, competitive with state-of-the-art classifiers trained with an 80:20 ratio.
Template matching is an effective method for object recognition because it provides high accuracy in estimating the location of targets and robustness to the presence of scene noise. These features are useful for vision-based robot navigation assistance, where reliable detection and location of scene objects are essential. In this work, the use of advanced template matched filters for robot navigation assistance is presented. Several filters are constructed by the optimization of objective performance criteria. These filters are exhaustively evaluated in synthetic and experimental scenes in terms of efficiency of target detection, accuracy of target location, and processing time.
This paper presents a proposed algorithm implementing A* for path planning in a partially known environment. Using a differential mobile robot, navigation is accomplished with a LiDAR sensor that detects any potential changes in the environment. The proposed algorithm estimates a safe path-planning trajectory from the robot's origin to a target coordinate given by the user. If the robot encounters an unknown obstacle that does not belong to the known environment, it updates the map, recalculates the trajectory, and proceeds along the new path. Experimental results were obtained in an indoor environment cluttered with unknown obstacles and described by partially known maps.
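A minimal grid A* of the kind used above; on detecting a new obstacle, the robot would mark the corresponding cell blocked and simply run the search again:

```python
# Minimal grid A* with Manhattan heuristic; replanning amounts to
# updating the grid and calling astar() again.
import heapq

def astar(grid, start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    openq, came, g = [(h(start), start)], {}, {start: 0}
    while openq:
        _, cur = heapq.heappop(openq)
        if cur == goal:
            path = [cur]
            while cur in came:
                cur = came[cur]
                path.append(cur)
            return path[::-1]
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + d[0], cur[1] + d[1])
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]]:         # 1 = obstacle
                continue
            ng = g[cur] + 1
            if ng < g.get(nxt, float("inf")):
                g[nxt], came[nxt] = ng, cur
                heapq.heappush(openq, (ng + h(nxt), nxt))
    return None

grid = [[0] * 6 for _ in range(6)]
grid[2][1] = grid[2][2] = grid[2][3] = 1
print(astar(grid, (0, 0), (5, 5)))
```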
KEYWORDS: Image filtering, 3D acquisition, Detection and tracking algorithms, Object recognition, 3D modeling, 3D image processing, 3D applications, Target recognition, Target detection, Cameras
This paper proposes frequency-domain correlation filtering to solve object recognition of three-dimensional (3D) targets. We perform a linear correlation in the frequency domain between an input frame of the video sequence and a designed filter. This operation measures the correspondence between the two signals. In order to produce a high matching score, we design a bank of correlation filters, in which each filter contains unique information of the target in a single view and statistical parameters of the scene. In this paper, we demonstrate the feasibility of correlation filters used to solve 3D object recognition and their robustness to different image conditions such as noise, cluttered background, and geometrical distortions of the target. The evaluation performance presents a high accuracy in terms of quantitative metrics.
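The frequency-domain correlation at the heart of the method can be sketched in a few lines: multiply the scene spectrum by the conjugate filter spectrum, invert, and peak-pick. The synthetic scene below embeds the template so the peak location is known:

```python
# FFT-based linear correlation sketch; the scene and template are
# synthetic, and the filter is a plain matched template for illustration.
import numpy as np

def correlate(scene, template):
    s = np.fft.fft2(scene)
    t = np.fft.fft2(template, s=scene.shape)   # zero-pad filter to scene size
    plane = np.real(np.fft.ifft2(s * np.conj(t)))
    return np.unravel_index(np.argmax(plane), plane.shape), plane

rng = np.random.default_rng(4)
scene = rng.random((128, 128))
template = scene[40:56, 60:76].copy()          # embed a known target view
loc, _ = correlate(scene, template)
print("detected at", loc)                      # expect (40, 60)
```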
KEYWORDS: Detection and tracking algorithms, Particles, 3D image processing, 3D acquisition, Image processing, 3D modeling, Particle filters, Image filtering, Target detection, Signal to noise ratio
This research presents an algorithm for three-dimensional (3-D) pose tracking of a rigid object by processing sequences of monocular images. The pose trajectory of the object is estimated by performing linear correlation between the current scene and a filter bank constructed from different views of a 3-D model of the target, which are created synthetically with computer graphics. The pose tracking is guided by particle filters that dynamically adapt the filter bank by taking into account the kinematics of the target in the scene. Experimental results obtained with the proposed algorithm in processing synthetic and real images are presented and discussed. These results show that the proposed algorithm achieves a higher accuracy of pose tracking in terms of objective metrics, in comparison with that of existing similar algorithms.
A reliable approach for object segmentation based on template-matching filters is proposed. The system employs an adaptive strategy for the generation of space-variant filters which take into account several versions of the target and local statistical properties of the input scene. Moreover, the proposed method considers the geometric modifications of the target while it moves through a video sequence. The detection accuracy of the matched filter yields the location of the target of interest. The estimated location coordinates are used to compute the support area covered by the target using the watershed segmentation technique. In each frame, the filter adapts according to the geometrical changes of the target in order to estimate its current support region. Experimental tests carried out on a video sequence show that the proposed system yields very good performance in detection accuracy and object segmentation efficiency in real-life scenes.
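The watershed step can be sketched with scikit-image, seeding one marker at the matched filter's peak and one in the background; the data here is synthetic:

```python
# Watershed support-region sketch; the seed at (45, 45) stands in for
# the matched filter's detected peak, and the image is synthetic.
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

img = np.zeros((100, 100))
img[30:60, 30:60] = 1.0                       # bright square "target"
img += np.random.default_rng(5).normal(scale=0.05, size=img.shape)

markers = np.zeros(img.shape, dtype=int)
markers[45, 45] = 1                           # seed from the filter's peak
markers[5, 5] = 2                             # background seed
labels = watershed(np.abs(ndimage.sobel(img)), markers)
print("target area:", (labels == 1).sum(), "pixels")
```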
A visual approach to environment recognition for robot navigation is proposed. This work includes a template matching filtering technique to detect obstacles and feasible paths using a single camera to sense a cluttered environment. In this problem statement, a robot can move from the start to the goal by choosing a single path among multiple possible ways. In order to generate an efficient and safe path for mobile robot navigation, the proposal employs a pseudo-bacterial potential field algorithm to derive optimal potential field functions using evolutionary computation. Simulation results are evaluated in synthetic and real scenes in terms of accuracy of environment recognition and efficiency of path planning computation.
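The attractive/repulsive potential-field step that the evolutionary stage would tune can be sketched as follows; the gains and influence range are assumptions, precisely the parameters the pseudo-bacterial algorithm optimizes:

```python
# Standard artificial-potential-field step; K_ATT, K_REP, and D0 are
# assumed values of the kind the evolutionary stage would tune.
import numpy as np

K_ATT, K_REP, D0 = 1.0, 1.0, 2.0

def apf_step(pos, goal, obstacles, step=0.05):
    force = K_ATT * (goal - pos)                        # attractive term
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if d < D0:                                      # short-range repulsion
            force += K_REP * (1 / d - 1 / D0) / d**3 * diff
    return pos + step * force

pos, goal = np.array([0.0, 0.0]), np.array([10.0, 10.0])
obstacles = [np.array([5.0, 4.0])]
for _ in range(300):
    pos = apf_step(pos, goal, obstacles)
print(pos)                                              # approaches the goal
```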
A reliable method for three-dimensional digitization of human faces based on the fringe projection technique is presented. The proposed method employs robust fringe analysis algorithms for accurate phase computation. The quality of the resultant 3D face model is characterized in terms of accuracy of surface computation using objective metrics. We present experimental results obtained with real and synthetic laboratory objects. The potential of this method for use in the field of face recognition is discussed.
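The usual core of fringe analysis is phase-shifting recovery; a four-step version is sketched below, though the exact number of shifts the authors use is an assumption:

```python
# Four-step phase-shifting recovery; whether the paper uses exactly
# four shifts is an assumption.
import numpy as np

def four_step_phase(i0, i1, i2, i3):
    """Fringe images with phase shifts of 0, pi/2, pi, 3*pi/2."""
    return np.arctan2(i3 - i1, i0 - i2)     # wrapped phase in (-pi, pi]

x = np.linspace(0, 4 * np.pi, 256)
truth = np.tile(x, (64, 1))
frames = [1 + np.cos(truth + k * np.pi / 2) for k in range(4)]
wrapped = four_step_phase(*frames)
print(wrapped.shape, wrapped.min(), wrapped.max())
```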
Computer vision is an important task in robotics applications. This work proposes an approach for autonomous mobile robot navigation that integrates template-matching filters for obstacle detection with the evolutionary artificial potential field method for path planning. The recognition system employs a digital camera to sense the environment of a mobile robot. The captured scene is processed by a bank of space-variant filters in order to find the obstacles and a feasible area for robot navigation. The path planning employs evolutionary artificial potential fields to derive optimal potential field functions using evolutionary computation. Simulation results to validate the analysis and implementation are provided; they were specifically designed to show the effectiveness and efficiency of the proposal.
The problem of 3D pose recognition of a rigid object is difficult to solve because the pose in a 3D space can vary with multiple degrees of freedom. In this work, we propose an accurate method for 3D pose estimation based on template matched filtering. The proposed method utilizes a bank of space-variant filters which take into account different pose states of the target and local statistical properties of the input scene. The state parameters of location coordinates, orientation angles, and scaling parameters of the target are estimated with high accuracy in the input scene. Experimental tests are performed for real and synthetic scenes. The proposed system yields good performance for 3D pose recognition in terms of detection efficiency, location and orientation errors.
An accurate algorithm for three-dimensional (3-D) pose recognition of a rigid object is presented. The algorithm is based on adaptive template matched filtering and local search optimization. When a scene image is captured, a bank of correlation filters is constructed to find the best correspondence between the current view of the target in the scene and a target image synthesized by means of computer graphics. The synthetic image is created using a known 3-D model of the target and an iterative procedure based on local search. Computer simulation results obtained with the proposed algorithm in synthetic and real-life scenes are presented and discussed in terms of accuracy of pose recognition in the presence of noise, cluttered background, and occlusion. Experimental results show that our proposal presents high accuracy for 3-D pose estimation using monocular images.
KEYWORDS: 3D acquisition, Object recognition, 3D modeling, RGB color model, Detection and tracking algorithms, 3D image processing, Light sources and illumination, Image filtering, Computing systems, Image processing
In this paper, we solve the problem of pose recognition of a 3D object in non-uniformly illuminated and noisy scenes. The recognition system employs a bank of space-variant correlation filters constructed with an adaptive approach based on local statistical parameters of the input scene. The position and orientation of the target are estimated with the help of the filter bank. For an observed input frame, the algorithm computes the correlation between the observed image and the bank of filters using a combination of data and task parallelism, taking advantage of a graphics processing unit (GPU) architecture. The pose of the target is estimated by finding the template that best matches the current view of the target within the scene. The performance of the proposed system is evaluated in terms of recognition accuracy, location and orientation errors, and computational performance.
KEYWORDS: Object recognition, Light sources and illumination, Light sources, Ray tracing, Signal to noise ratio, Computer simulations, Visualization, Photons, RGB color model, Light
Light interaction with matter is of remarkable complexity. Adequate modeling of global illumination has been a vastly studied topic since the beginning of computer graphics, and it remains an unsolved problem. The rendering equation for global illumination is based on the refraction and reflection of light interacting with matter within an environment. This physical process possesses a high computational complexity when implemented on a digital computer. The appearance of an object depends on light interactions with the surface of the material, such as emission, scattering, and absorption. Several image-synthesis methods have been used to realistically render the appearance of light incident on an object. Recent global illumination algorithms employ mathematical models and computational strategies that improve the efficiency of the simulation. This work presents a review of the state of the art of global illumination algorithms and focuses on the efficiency of a computational implementation on a graphics processing unit. A reliable system is developed to simulate realistic scenes in the context of real-time object recognition under different lighting conditions. Computer simulation results are presented and discussed in terms of discrimination capability and robustness to additive noise, considering several lighting model reflections and multiple light sources.
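For reference, the rendering equation discussed above, in its standard hemispherical form (outgoing radiance equals emitted radiance plus the BRDF-weighted integral of incoming radiance over the hemisphere):

```latex
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\,(\omega_i \cdot \mathbf{n})\,\mathrm{d}\omega_i
```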
A real-time system for illumination-invariant object tracking is proposed. The system is able to estimate at a high rate the position of a moving target in an input scene corrupted by a highly cluttered background and nonuniform illumination. The position of the target is estimated with the help of a bank of space-variant correlation filters. The filters in the bank adapt their parameters according to the local statistical parameters of the observed scene in a small region centered at the coordinates of a predicted position for the target in each frame. The prediction is carried out by exploiting information from present and past frames and by using a dynamic motion model of the target in a two-dimensional plane. Computer simulation results obtained with the proposed system are presented and discussed in terms of tracking accuracy, computational complexity, and tolerance to nonuniform illumination.
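The prediction step can be as simple as a constant-velocity model; the paper's exact dynamic model is not specified, so the sketch below is only indicative:

```python
# Constant-velocity predictor used to center the local search window;
# this is an assumed stand-in for the paper's dynamic motion model.
import numpy as np

F = np.array([[1, 0, 1, 0],     # state: [x, y, vx, vy], unit time step
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)

def predict(state):
    return F @ state

state = np.array([50.0, 40.0, 2.0, -1.0])
print(predict(state)[:2])       # predicted target position for the next frame
```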
A real-time system for multiclass object recognition is proposed. The system is able to identify and correctly classify several moving targets from an input scene by using a bank of adaptive correlation filters with complex constraints implemented on a graphics processing unit. The bank of filters is synthesized with the help of an iterative algorithm based on complex synthetic discriminant functions. At each iteration, the algorithm optimizes the discrimination capability of each filter in the bank by using all available information about the known patterns to be recognized and unwanted patterns to be rejected, such as false objects or a background. Computer simulation results obtained with the proposed system in real and synthetic scenes are presented and discussed in terms of pattern recognition performance and real-time operation speed.
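The classic synthetic-discriminant-function construction that the iterative algorithm builds on is compact enough to show directly: a filter whose correlation with each training image at the origin equals a prescribed complex constraint value:

```python
# Classic SDF filter: h = X (X^H X)^{-1} u, so that X^H h = u exactly.
# The training images are random placeholders; the constraint vector u
# accepts the first three patterns and rejects the last two.
import numpy as np

rng = np.random.default_rng(6)
X = rng.random((1024, 5))                       # 5 vectorized training images
u = np.array([1, 1, 1, 0, 0], dtype=complex)    # accept / reject constraints

h = X @ np.linalg.solve(X.conj().T @ X, u)      # SDF filter vector
print(np.round(X.conj().T @ h, 6))              # reproduces the constraints
```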