Multimodal pattern matching for audio-visual query and retrieval
1 January 2001
Proceedings Volume 4315, Storage and Retrieval for Media Databases 2001; (2001) https://doi.org/10.1117/12.410927
Event: Photonics West 2001 - Electronic Imaging, 2001, San Jose, CA, United States
Abstract
A necessary capability for content-based retrieval is support for the query-by-example paradigm. In the past, there have been several attempts to use low-level features for video retrieval. None of these approaches, however, uses the multimedia information content of the video. We present an algorithm for matching multimodal patterns for the purpose of content-based video retrieval. Our approach differs from the state of the art in content-based retrieval in two ways: it uses the information content of multiple media, and it places a strong emphasis on temporal similarity. At the core of the pattern matching scheme is a dynamic programming algorithm, which leads to a significant improvement in performance. By coupling audio with video, the algorithm can also be applied to grouping shots based on audio-visual similarity. This is much more effective for constructing scenes from shots than using visual content alone.
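The abstract does not specify the dynamic programming formulation, so the following is only a minimal sketch of the general idea, assuming a standard dynamic time warping (DTW) recurrence rather than the authors' actual algorithm. The function names `fuse`, `frame_distance`, and `dtw_distance`, the Euclidean frame distance, and the `audio_weight` parameter are all hypothetical choices made for illustration.

```python
import math


def fuse(visual, audio, audio_weight=0.5):
    """Concatenate per-frame visual and audio features into one vector.

    Assumption: both streams are already sampled at the same frame rate.
    audio_weight is a hypothetical knob scaling the audio contribution.
    """
    return [list(v) + [audio_weight * x for x in a]
            for v, a in zip(visual, audio)]


def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, candidate):
    """Dynamic time warping cost between two feature sequences.

    cost[i][j] holds the cheapest alignment of the first i query frames
    with the first j candidate frames. Lower totals mean the clips are
    more similar in content and temporal order.
    """
    n, m = len(query), len(candidate)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], candidate[j - 1])
            # Extend the best of: skip a query frame, skip a candidate
            # frame, or advance both sequences together.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

Under these assumptions, a query clip would be compared against each database clip by calling `dtw_distance(fuse(qv, qa), fuse(cv, ca))` and ranking candidates by ascending cost. The same pairwise cost could serve as the similarity measure when grouping shots into scenes.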
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Milind Ramesh Naphade, Roy R. Wang, and Thomas S. Huang "Multimodal pattern matching for audio-visual query and retrieval", Proc. SPIE 4315, Storage and Retrieval for Media Databases 2001, (1 January 2001); https://doi.org/10.1117/12.410927
CITATIONS
Cited by 24 scholarly publications.
KEYWORDS
Video
Distortion
Computer programming
Feature extraction
Visualization
Databases
Multimedia