Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/177222
Title: MOTION-BASED MULTIMEDIA INDEXING
Authors: PEH CHIN HWEE
Issue Date: 1999
Citation: PEH CHIN HWEE (1999). MOTION-BASED MULTIMEDIA INDEXING. ScholarBank@NUS Repository.
Abstract: The lack of a framework that accelerates multiple uses of motion in content-based video indexing, together with the absence of a robust algorithm for general three-dimensional motion recovery, has restricted the current use of motion information in representing video content. This thesis proposes a framework for effective indexing that is supported by many visual modules, each showing how a particular aspect of motion can be used directly, possibly with differing degrees of interdependence. Technically, the main task of video indexing is broken down into several visual tasks that are directly tied to the cinematic tasks performed in the production stage. This induces a more direct coupling between the end-tasks and the requisite qualitative motion-based computations. The approach contrasts with the traditional treatment of the motion-based indexing problem, in which a strictly monolithic hierarchy is constructed that is often not tied to any immediate task. The thesis demonstrates how a number of indexing capabilities can be based upon video production tasks that are reconstructed from, and directly specified by, motion features. Specifically, three visual modules are studied in detail: the resolution of camera operation, the measurement of time-to-collision, and the recognition of moving objects. Each module demonstrates how a different aspect of motion can be used directly for video indexing. Experiments were conducted to substantiate how each visual module derives different motion characteristics for its visual task. It is argued that this multi-faceted use of motion information results in a more robust and comprehensive motion-based indexing system.
The first visual module explores how general camera motion parameters can be resolved. Although the exact motion parameters cannot be recovered, the qualitative results obtained directly indicate the types of camera operation performed. The difficulties involved in obtaining quantitative results are also discussed, specifically the ambiguities that arise in solving a flow field resulting from a simultaneous zoom and Z-translation of the camera. The second visual module demonstrates the possibility of using a time-to-collision measure for video indexing. It is shown that this measure is capable of describing the content of the scene. The term "apparent time-to-collision" is introduced to redefine the time-to-collision measure that results from a zoom combined with a Z-translation, and the characteristics of this new measure are studied in detail. The third visual module illustrates the direct use of object motion to analyze video content. The concept of extended spatio-temporal texture is introduced to characterize motion in a video scene. This texture is derived from low-level motion features. By applying classical texture analysis tools to it, the characteristics of video content are systematically described.
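The zoom and Z-translation ambiguity mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes a textbook model in which the camera approaches a frontoparallel plane, so that both forward translation and zoom produce a purely radial flow field, and every function and parameter name below is hypothetical. Under that assumption the two rates simply add, and a time-to-collision computed from flow divergence is only an "apparent" value whenever zoom is present.

```python
def radial_flow(x, y, rate):
    """Image flow at (x, y) for a radial expansion with the given rate (1/s)."""
    return (rate * x, rate * y)


def combined_flow(x, y, tz_over_z, zoom_rate):
    """Flow from forward translation plus zoom (assumed frontoparallel scene).

    tz_over_z: approach speed divided by depth, in 1/s (hypothetical name).
    zoom_rate: relative focal-length change (df/dt)/f, in 1/s (hypothetical name).
    """
    u1, v1 = radial_flow(x, y, tz_over_z)
    u2, v2 = radial_flow(x, y, zoom_rate)
    return (u1 + u2, v1 + v2)


def divergence(flow_fn, x, y, h=1e-3):
    """Central-difference estimate of du/dx + dv/dy at (x, y)."""
    du_dx = (flow_fn(x + h, y)[0] - flow_fn(x - h, y)[0]) / (2 * h)
    dv_dy = (flow_fn(x, y + h)[1] - flow_fn(x, y - h)[1]) / (2 * h)
    return du_dx + dv_dy


def time_to_collision(div):
    """For purely radial flow, divergence = 2 * rate, so TTC = 2 / divergence."""
    return 2.0 / div


def translation_only(x, y):
    # Pure approach: depth / speed = 2 s.
    return combined_flow(x, y, tz_over_z=0.5, zoom_rate=0.0)


def translation_with_zoom(x, y):
    # Same approach, but the camera also zooms in at 0.5 per second.
    return combined_flow(x, y, tz_over_z=0.5, zoom_rate=0.5)


true_ttc = time_to_collision(divergence(translation_only, 0.3, -0.2))
apparent_ttc = time_to_collision(divergence(translation_with_zoom, 0.3, -0.2))
```

In this toy model `apparent_ttc` is shorter than `true_ttc`, and the flow from `translation_with_zoom` is identical to that of a pure translation with `tz_over_z=1.0`, which is why the two operations cannot be separated from the flow field alone.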
URI: https://scholarbank.nus.edu.sg/handle/10635/177222
Appears in Collections: Master's Theses (Restricted)

Files in This Item:
File: b22109286.pdf; Size: 7.18 MB; Format: Adobe PDF; Access Settings: RESTRICTED; Version: None



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.