The present generation of mobile handheld devices comes equipped with a large number of sensors. The key sensors include the Ambient Light Sensor, Proximity Sensor, Gyroscope, Compass and the Accelerometer. Many mobile applications are driven by the readings obtained from one or two of these sensors. However, the presence of multiple sensors enables the determination of more detailed activities carried out by the user of a mobile device, and thus the development of smarter mobile applications that respond more appropriately to user behavior and device usage. In the proposed research we use recent advances in machine learning to fuse together the data obtained from all key sensors of a mobile device. We investigate the use of single and ensemble classifier based approaches to identify a mobile device's behavior in the space in which it is present. Feature selection algorithms are used to remove non-discriminant features that often lead to poor classifier performance. As the sensor readings are noisy and include a significant proportion of missing values and outliers, we use machine learning based approaches to clean the raw data obtained from the sensors before use. Based on selected practical case studies, we demonstrate the ability to accurately recognize device behavior based on multi-sensor data fusion.
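The paper does not include source code; the following is a minimal sketch, in Python with scikit-learn, of the kind of pipeline the abstract describes. The synthetic data, the median-imputation cleaning step, the ANOVA-based feature selection, and the random-forest ensemble are illustrative assumptions rather than the authors' exact choices.

```python
# Illustrative sketch only: dataset, feature layout and classifier choice
# are assumptions, not the paper's published method.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical fused feature matrix: rows are time windows, columns are
# statistics drawn from the accelerometer, gyroscope, compass, proximity
# and ambient-light sensors; labels are device-usage activities.
X = rng.normal(size=(500, 20))
y = rng.integers(0, 4, size=500)          # e.g. 4 device-behavior classes
X[rng.random(X.shape) < 0.05] = np.nan    # simulate missing sensor readings

pipeline = make_pipeline(
    SimpleImputer(strategy="median"),      # clean missing values before use
    SelectKBest(f_classif, k=10),          # drop non-discriminant features
    RandomForestClassifier(n_estimators=100, random_state=0),  # ensemble
)
print(cross_val_score(pipeline, X, y, cv=5).mean())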
The present generation of Ambient Light Sensors (ALSs) in mobile handheld devices suffers from two practical shortcomings. ALSs are narrow-angle, i.e. they respond effectively only within a narrow angle of operation, and they exhibit latency. As a result, mobile applications that operate based on ALS readings can perform sub-optimally, especially in environments with non-uniform illumination. Such applications may either adapt with unacceptable levels of latency and/or demonstrate a discrete nature of operation. In this paper we propose a framework to predict the ambient illumination of an environment in which a mobile device is present. The predictions are based on an illumination model that is developed from a small number of readings taken during an application calibration stage. We use a machine learning based approach in developing the models. Five different regression models were developed, implemented and compared, based on Polynomial, Gaussian, Sum of Sine, Fourier and Smoothing Spline functions. Approaches to remove noisy data, missing values and outliers were applied prior to the modelling stage to eliminate their negative effects on modelling. The prediction accuracy for all models was found to be above 0.99 when measured using the R-squared test, with the best performance being from the Smoothing Spline. In this paper we discuss the mathematical complexity of each model and investigate the compromises involved in finding the best model.
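As an illustration of the modelling and evaluation described above, here is a hedged sketch in Python: it fits two of the five model families (a polynomial and a smoothing spline) to hypothetical calibration readings and scores them with the R-squared measure used in the paper. The data, polynomial degree and smoothing factor are assumptions for demonstration only.

```python
# Minimal sketch, not the authors' code: fit two of the five model
# families to hypothetical ALS calibration readings and compare R-squared.
import numpy as np
from scipy.interpolate import UnivariateSpline

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical calibration data: device position vs. measured lux.
x = np.linspace(0.0, 3.0, 25)
lux = 400.0 * np.exp(-((x - 1.5) ** 2)) \
      + np.random.default_rng(1).normal(0, 5, x.size)

poly = np.poly1d(np.polyfit(x, lux, 5))           # polynomial model
spline = UnivariateSpline(x, lux, s=len(x) * 25)  # smoothing spline model

print("polynomial R^2:", r_squared(lux, poly(x)))
print("spline     R^2:", r_squared(lux, spline(x)))
```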
Multiexposure image fusion algorithms are used for enhancing the perceptual quality of an image captured by sensors of limited dynamic range. This is achieved by rendering a single scene based on multiple images captured at different exposure times. Similarly, multifocus image fusion is used when the limited depth of focus on a selected focus setting of a camera results in parts of an image being out of focus. The solution adopted is to fuse together a number of multifocus images to create an image that is focused throughout. A single algorithm that can perform both multifocus and multiexposure image fusion is proposed. This algorithm is a new approach in which a set of unregistered multiexposure/focus images is first registered before being fused, to compensate for the possible presence of camera shake. The registration of images is done by identifying matching key-points in constituent images using scale invariant feature transforms (SIFT). The random sample consensus algorithm is used to identify inliers among the SIFT key-points, removing outliers that can cause errors in the registration process. Finally, the coherent point drift algorithm is used to register the images, preparing them to be fused in the subsequent fusion stage. For the fusion of images, a new approach based on an improved version of a wavelet-based contourlet transform is used. The experimental results and the detailed analysis presented prove that the proposed algorithm is capable of producing high-dynamic-range (HDR) or multifocus images by registering and fusing a set of multiexposure or multifocus images taken in the presence of camera shake. Further, a comparison of the performance of the proposed algorithm with a number of state-of-the-art algorithms and commercial software packages is provided. In particular, our literature review has revealed that this is one of the first attempts in which the compensation of camera shake, a very likely practical problem when capturing HDR images with handheld devices, has been addressed as part of a multifocus and multiexposure image enhancement system.
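To make the registration front end concrete, the sketch below shows the SIFT key-point matching and RANSAC inlier-selection steps using OpenCV. It is an illustration under stated assumptions, not the authors' implementation: the image file names are placeholders, OpenCV's homography-based RANSAC stands in for the paper's inlier step, and the subsequent CPD registration and WBCT fusion stages are omitted.

```python
# Hedged sketch of the SIFT + RANSAC front end using OpenCV.
import cv2
import numpy as np

ref = cv2.imread("exposure_0.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder inputs
mov = cv2.imread("exposure_1.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(ref, None)
kp2, des2 = sift.detectAndCompute(mov, None)

# Match SIFT descriptors and keep the best correspondences.
matches = sorted(cv2.BFMatcher(cv2.NORM_L2).match(des1, des2),
                 key=lambda m: m.distance)[:200]
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC separates inliers from outliers while estimating a homography;
# the inlier mask selects the key-points passed on to the registration stage.
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
inliers = [m for m, keep in zip(matches, inlier_mask.ravel()) if keep]
print(f"{len(inliers)} inlier correspondences out of {len(matches)}")
```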
Multi-exposure image fusion algorithms are used for enhancing the perceptual quality of an image captured by sensors of limited dynamic range. This is achieved by rendering a single scene based on multiple images captured at different exposure times. Similarly, multi-focus image fusion is used when the limited depth of focus on a selected focus setting of a camera results in parts of an image being out of focus. The solution adopted is to fuse together a number of multi-focus images to create an image that is focused throughout. In this paper we propose a single algorithm that can perform both multi-focus and multi-exposure image fusion. This algorithm is a novel approach in which a set of unregistered multi-exposure/focus images is first registered before being fused. The registration of images is done by identifying matching key points in constituent images using Scale Invariant Feature Transforms (SIFT). The RANdom SAmple Consensus (RANSAC) algorithm is used to identify inliers among the SIFT key points, removing outliers that can cause errors in the registration process. Finally, we use the Coherent Point Drift algorithm to register the images, preparing them to be fused in the subsequent fusion stage. For the fusion of images, a novel approach based on an improved version of a Wavelet Based Contourlet Transform (WBCT) is used. The experimental results presented prove that the proposed algorithm is capable of producing HDR or multi-focus images by registering and fusing a set of multi-exposure or multi-focus images taken in the presence of camera shake.
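The improved WBCT used in the fusion stage is not available in standard libraries; as a hedged stand-in, the following sketch fuses two registered images with a plain discrete wavelet transform (PyWavelets) using the common maximum-absolute-coefficient rule for detail bands. It illustrates the general wavelet-domain fusion principle only, not the paper's transform.

```python
# Stand-in for the paper's WBCT fusion: plain DWT fusion with a
# max-absolute-coefficient rule, using PyWavelets.
import numpy as np
import pywt

def wavelet_fuse(img_a, img_b, wavelet="db4", level=3):
    ca = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(float), wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]          # average the approximation band
    for (ha, va, da), (hb, vb, db) in zip(ca[1:], cb[1:]):
        # keep the coefficient with the larger magnitude in each detail band
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in ((ha, hb), (va, vb), (da, db))))
    return pywt.waverec2(fused, wavelet)

rng = np.random.default_rng(2)
a = rng.random((128, 128))   # placeholders for registered input images
b = rng.random((128, 128))
print(wavelet_fuse(a, b).shape)
```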
Automatic speaker identification in a videoconferencing environment will allow conference attendees to focus their attention on the conference rather than having to be engaged manually in identifying which channel is active and who may be the speaker within that channel. In this work we present a real-time, audio-coupled video based approach to address this problem, with the focus on the video analysis side. The system is driven by the need for detecting a talking human via the use of computer vision algorithms. The initial stage consists of a face detector, which is subsequently followed by a lip-localization algorithm that segments the lip region. We propose a novel approach for lip movement detection based on image registration using the Coherent Point Drift (CPD) algorithm, a technique for rigid and non-rigid registration of point sets. We provide experimental results to analyse the performance of the algorithm when used to monitor real-life videoconferencing data.
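As a rough illustration of the video front end, the sketch below uses OpenCV's stock Haar cascade as the face detector and takes the lower third of the detected face box as a crude lip region; simple frame differencing on that region stands in for the CPD-based lip movement detector. The cascade choice, region heuristic, motion threshold and file name are all assumptions for demonstration, not the authors' system.

```python
# Hedged sketch of the face-detection and lip-region stages; frame
# differencing replaces the paper's CPD-based movement detector.
import cv2
import numpy as np

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def lip_region(gray_frame):
    faces = cascade.detectMultiScale(gray_frame, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    return gray_frame[y + 2 * h // 3 : y + h, x : x + w]  # lower face third

cap = cv2.VideoCapture("conference.mp4")   # placeholder video file
prev = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    lips = lip_region(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    if lips is not None and prev is not None and lips.shape == prev.shape:
        motion = np.mean(cv2.absdiff(lips, prev))  # proxy for lip movement
        print("talking?", motion > 5.0)            # assumed threshold
    prev = lips
cap.release()
```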