Object detection from infrared-band (thermal) imagery has been a challenging problem for many years. With the advent of deep Convolutional Neural Networks (CNN), the automated detection and classification of objects of interest within a scene has become popular due to notable increases in performance over earlier approaches in the field. These advances in CNN approaches are underpinned by the availability of large-scale, annotated image datasets, which are typically available for visible-band (RGB) imagery. By contrast, there is a lack of prior work that specifically targets object detection in infrared-band images, owing to limited dataset availability, which in turn stems from the more limited availability of, and access to, infrared-band imagery and associated hardware in general. A viable solution to this problem is transfer learning, which enables the use of such CNN techniques within infrared-band (thermal) imagery by leveraging prior training on visible-band (RGB) image datasets and subsequently requiring only a secondary, smaller volume of infrared-band (thermal) imagery for CNN model fine-tuning. This is performed by adopting an existing CNN, pre-optimized for generalized object recognition in visible-band (RGB) imagery, and subsequently fine-tuning the resultant model weights towards our specific infrared-band (thermal) imagery domain task. We use two state-of-the-art object detectors, Single Shot Detector (SSD) with a VGG-16 CNN backbone pre-trained on the ImageNet dataset, and You-Only-Look-Once (YOLOv3) with a DarkNet-53 CNN backbone pre-trained on the MS-COCO dataset, to illustrate our visible-band to infrared-band transfer learning paradigm. Exemplar results reported over the FLIR Thermal and MultispectralFIR benchmark datasets show significant improvements in mAP detection performance, to {0.804 (MsFIR), 0.710 (FLIR)} for SSD and {0.520 (MsFIR), 0.308 (FLIR)} for YOLOv3, via the use of transfer learning from initial visible-band based CNN training.
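The fine-tuning workflow this abstract describes can be sketched in code. The following is a minimal illustration only, not the authors' implementation: it assumes PyTorch/torchvision's `ssd300_vgg16` detector (whose VGG-16 backbone is pre-trained on visible-band data), a hypothetical number of thermal-domain classes, and a placeholder thermal image batch standing in for a real annotated thermal dataset.

```python
# Minimal transfer-learning sketch: start from a visible-band pre-trained SSD,
# swap its classification head for the thermal-domain classes, and fine-tune.
import torch
import torchvision
from torchvision.models.detection import _utils as det_utils
from torchvision.models.detection.ssd import SSDClassificationHead

NUM_CLASSES = 5  # hypothetical: 4 thermal object classes + background

# Detector with a VGG-16 backbone pre-trained on visible-band (RGB) imagery.
model = torchvision.models.detection.ssd300_vgg16(weights="DEFAULT")

# Replace the classification head so its outputs match the thermal-domain task.
in_channels = det_utils.retrieve_out_channels(model.backbone, (300, 300))
num_anchors = model.anchor_generator.num_anchors_per_location()
model.head.classification_head = SSDClassificationHead(
    in_channels, num_anchors, NUM_CLASSES)

# Placeholder batch standing in for a real thermal DataLoader; single-channel
# thermal frames are commonly replicated to 3 channels to fit an RGB backbone.
images = [torch.rand(3, 300, 300)]
targets = [{"boxes": torch.tensor([[50.0, 50.0, 150.0, 150.0]]),
            "labels": torch.tensor([1])}]

# Fine-tune all weights at a small learning rate on the thermal imagery.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
model.train()
loss_dict = model(images, targets)  # detection models return a dict of losses
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```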
The rapid growth of image digitization and of image collections in recent years makes it challenging and burdensome to organize, categorize, and retrieve similar images from voluminous collections. Content-based image retrieval (CBIR) is immensely convenient in this context. A considerable number of local feature detectors and descriptors are present in the CBIR literature. We propose a model to predict the best feature combinations for image retrieval applications. Several spatial complementarity criteria of local feature detectors are analyzed and then used in a regression framework to find the combination of detectors that is optimal for a given dataset and better adapted to each given image; the proposed model is also useful for optimally fixing other parameters, such as the k in k-nearest-neighbor retrieval. Three public datasets of various contents and sizes are employed to evaluate the proposal, which is validated by notable improvements in retrieval quality over classical approaches. Finally, the proposed image search engine is applied to the cultural photographic collections of a French museum, where it demonstrates its added value for the exploration and promotion of these contents at different levels, from their archiving up to their exhibition in situ or ex situ.
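The detector-selection idea can be sketched as follows. This is an illustrative outline only: the detector pool, the complementarity criteria, and the training scores are hypothetical placeholders the abstract does not specify, and the random-forest regressor merely stands in for the paper's regression framework.

```python
# Sketch: regress retrieval quality from spatial-complementarity criteria of
# detector combinations, then pick the best-scoring combination per image.
from itertools import combinations
import numpy as np
from sklearn.ensemble import RandomForestRegressor

DETECTORS = ["SIFT", "SURF", "ORB", "BRISK"]  # hypothetical detector pool

def complementarity_features(combo, image):
    """Hypothetical criteria for a detector combination on one image,
    e.g. keypoint coverage, spatial overlap, density dispersion."""
    return np.random.rand(3)  # placeholder: replace with real measurements

# Training data: combinations described by their criteria, labelled with the
# retrieval quality (e.g. mAP) they achieved on a validation set (placeholders).
X_train = np.random.rand(200, 3)
y_train = np.random.rand(200)
reg = RandomForestRegressor(n_estimators=100).fit(X_train, y_train)

def best_combination(image):
    """Predict retrieval quality for each candidate combination; return the best."""
    combos = [c for r in (2, 3) for c in combinations(DETECTORS, r)]
    feats = np.stack([complementarity_features(c, image) for c in combos])
    return combos[int(np.argmax(reg.predict(feats)))]

print(best_combination(image=None))  # image is unused by the placeholder criteria
```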