Date of Original Version

5-2014

Type

Conference Proceeding

Journal Title

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

First Page

1375

Last Page

1379

Rights Management

© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract or Description

A huge number of videos on the Internet carry little textual information, which makes retrieving them with a text query challenging. Previous work explored semantic concepts for content analysis to assist retrieval. However, human-defined concepts may fail to cover the data, and there is a potential gap between these concepts and the semantics expected from a user's query. Also, building a corpus is expensive and time-consuming. To address these issues, we propose a semi-automatic framework to discover the semantic concepts. We limit ourselves to the audio modality here. In the paper, we also discuss how to select a meaningful vocabulary from the discovered hierarchical sub-categories and provide an approach to detect all the concepts without further annotation. We evaluate the method on the NIST 2011 multimedia event detection (MED) dataset.

DOI

10.1109/ICASSP.2014.6853822

Published In

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1375-1379.