Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries that involve a video clip (say, a 5-second video segment).
We propose two schemes for query by video clip:
(i) Retrieval based on key frames follows the traditional representation of shots: key frames are computed from the video, and image features are extracted around them. For each key frame in the query, a similarity value (using color, texture, and motion) is computed against the key frames in the database video. Consecutive key frames in the database video that are highly similar to the query key frames are then grouped to generate the set of retrieved video clips (a sketch of this matching step is given below).
(ii) In retrieval using sub-sampled frames, we uniformly sub-sample both the query clip and the database video; retrieval is then based on matching color and texture features of the sub-sampled frames (also sketched below).

Initial experiments on two video databases (a basketball video with approximately 160 frames and a CNN news video with approximately 200 frames) show promising results. Experiments using segments from one basketball video as the query and a different basketball video as the database show that the feature representation and matching schemes are robust. We are currently investigating methods for improving the performance of the system using semantic knowledge of the given domain, object segmentation and tracking, detection of text and faces, and combinations of the various matching schemes.
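The abstract does not give implementation details, so the following is only a minimal sketch of the key-frame matching in scheme (i). It assumes per-channel color histograms compared by histogram intersection as a stand-in for the full color/texture/motion features, and an illustrative similarity threshold of 0.7; the function names and parameter values are hypothetical, not the paper's.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Per-channel color histogram, L1-normalized.

    A stand-in for the color/texture/motion features described above;
    `frame` is assumed to be an H x W x 3 uint8 array."""
    hist = np.concatenate([
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
        for c in range(frame.shape[-1])
    ]).astype(float)
    return hist / (hist.sum() + 1e-9)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for L1-normalized histograms; higher is more similar."""
    return float(np.minimum(h1, h2).sum())

def match_key_frames(query_key_frames, db_key_frames, sim_threshold=0.7):
    """Return (start, end) index ranges of consecutive database key frames
    whose best similarity to any query key frame exceeds the threshold;
    each range corresponds to one retrieved clip."""
    q_feats = [color_histogram(f) for f in query_key_frames]
    d_feats = [color_histogram(f) for f in db_key_frames]

    # Best similarity of each database key frame to any query key frame.
    best_sim = [max(histogram_intersection(d, q) for q in q_feats) for d in d_feats]

    # Group runs of consecutive database key frames that clear the threshold.
    clips, start = [], None
    for i, s in enumerate(best_sim):
        if s >= sim_threshold and start is None:
            start = i
        elif s < sim_threshold and start is not None:
            clips.append((start, i - 1))
            start = None
    if start is not None:
        clips.append((start, len(best_sim) - 1))
    return clips
```

Here `match_key_frames(query_key_frames, db_key_frames)` would return index ranges into the database key-frame list; mapping those back to clip boundaries in the full video is left out of the sketch.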
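Scheme (ii) can be sketched in the same spirit: uniformly sub-sample both the query and the database video, compute a per-frame feature, and slide the query's feature sequence over the database's. The sub-sampling step of 10, the color-histogram feature, and the top-k scoring are illustrative assumptions rather than the paper's actual parameters.

```python
import numpy as np

def frame_feature(frame, bins=8):
    """Per-channel color histogram, L1-normalized; texture features
    could be concatenated here to match the description above."""
    hist = np.concatenate([
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
        for c in range(frame.shape[-1])
    ]).astype(float)
    return hist / (hist.sum() + 1e-9)

def query_by_clip(query_frames, db_frames, step=10, top_k=3):
    """Slide the sub-sampled query over the sub-sampled database video and
    return the top_k (similarity, starting frame index) pairs."""
    # Uniformly sub-sample both sequences and compute per-frame features.
    q_feats = [frame_feature(f) for f in query_frames[::step]]
    d_feats = [frame_feature(f) for f in db_frames[::step]]

    scores = []
    for start in range(len(d_feats) - len(q_feats) + 1):
        window = d_feats[start:start + len(q_feats)]
        # Average histogram intersection over the aligned sub-sampled frames.
        sim = float(np.mean([np.minimum(q, d).sum() for q, d in zip(q_feats, window)]))
        scores.append((sim, start * step))

    scores.sort(reverse=True)   # highest-similarity windows first
    return scores[:top_k]
```

In practice the frames would be decoded H x W x 3 arrays, and the returned starting indices would mark candidate clips of the same length as the query in the database video.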