Geometric Incremental Support Vector Machine for Object Detection from Capsule Endoscopy Videos

doi:10.1117/3.1002311.ch6

Book: Computer-Aided Cancer Detection and Diagnosis: Recent Advances

Editor(s): Jinshan Tang; Sos S. Agaian

Published: 2013

Abstract

Capsule endoscopy (CE) is a method used to visualize the entire small intestine. It is a widely adopted procedure for diagnosing gastrointestinal diseases including obscure bleeding, Crohn's disease, gastric ulcers, and colon cancer. The CE videos used in this research were produced with the PillCam® by Given Imaging. The imaging component of this system is a vitamin-sized capsule that comprises a color CMOS camera, a battery, a light source, and a wireless transmitter. The device captures two images per second for approximately eight hours and generates approximately 55,000 color images of 256 × 256 pixels over its operating life. Reviewing CE videos to make diagnostic decisions is a tedious task: the reviewer watches the video playback and marks suspicious frames and anatomical landmarks. It usually takes more than one hour to annotate a full-length video, and a typical mid-size hospital produces an average of twelve CE videos per day. Given this large volume of data, computer algorithms are in great demand to reduce the review time by identifying frames that contain signs of lesions, bleeding, and polyps, and by segmenting videos into gastrointestinal sections.

Many existing learning algorithms require all training data to be present in memory to achieve the best generalization performance. With limited computing power and memory, such a learning scheme is usually difficult to implement. Incremental learning has great potential to accommodate examples that become available over time or that represent a change of perception. The initial data set is used to create a model; when new data become available, they are integrated to update the classifier. In practice, clinical videos are acquired over time. Furthermore, knowledge of the visual appearance of diseases in CE video changes over time, because CE has been in clinical practice for a relatively short time. It is therefore practical to build a classifier from the initial data and revise it as new examples arrive.
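The update scheme described above (build a model from the initial data, then revise it when new examples arrive) can be illustrated with a minimal sketch. It is not the geometric incremental SVM of this chapter: as a stand-in, it uses a plain linear SVM trained by sub-gradient steps on the hinge loss. The class name `IncrementalLinearSVM`, the two-dimensional frame descriptors, and the values of `lam`, `eta`, and `epochs` are all assumptions made for illustration.

```python
class IncrementalLinearSVM:
    """Linear SVM trained by sub-gradient steps on the regularized hinge loss.

    Illustrative stand-in only; this is not the chapter's geometric method.
    """

    def __init__(self, n_features, lam=0.01, eta=0.1):
        self.w = [0.0] * n_features  # weight vector
        self.b = 0.0                 # bias term
        self.lam = lam               # L2 regularization strength (assumed value)
        self.eta = eta               # fixed step size (assumed value)
        self.t = 0                   # total number of examples processed

    def decision(self, x):
        return sum(wj * xj for wj, xj in zip(self.w, x)) + self.b

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1

    def partial_fit(self, X, y, epochs=20):
        """Update the current model using only the examples in this batch."""
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                self.t += 1
                violated = yi * self.decision(xi) < 1
                # Regularization step: shrink the weights slightly.
                self.w = [(1 - self.eta * self.lam) * wj for wj in self.w]
                if violated:
                    # Hinge-loss step: push the boundary away from this example.
                    self.w = [wj + self.eta * yi * xj
                              for wj, xj in zip(self.w, xi)]
                    self.b += self.eta * yi
        return self


# Hypothetical 2-D frame descriptors; +1 = suspicious frame, -1 = normal frame.
batch1_X = [[2.0, 1.0], [1.5, 2.0], [-1.0, -2.0], [-2.0, -1.5]]
batch1_y = [1, 1, -1, -1]
model = IncrementalLinearSVM(n_features=2).partial_fit(batch1_X, batch1_y)
w_initial = list(model.w)

# Later, newly annotated frames arrive; only they are used for the update.
batch2_X = [[1.0, 0.5], [-0.5, -1.0]]
batch2_y = [1, -1]
margin_before = model.decision(batch2_X[0])
model.partial_fit(batch2_X, batch2_y)
margin_after = model.decision(batch2_X[0])
```

In this sketch the second call to `partial_fit` starts from the weights learned on the first batch and never revisits the earlier examples, so memory use depends only on the size of the incoming batch. Because it keeps no earlier examples at all, a new batch that conflicts with the old data could pull the boundary away from them.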