BUILDING EFFECTIVE AND SCALABLE VISUAL OBJECT RECOGNITION SYSTEMS | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/47343

DC Field	Value
dc.title	BUILDING EFFECTIVE AND SCALABLE VISUAL OBJECT RECOGNITION SYSTEMS
dc.contributor.author	CHEN QIANG
dc.date.accessioned	2013-10-31T18:00:13Z
dc.date.available	2013-10-31T18:00:13Z
dc.date.issued	2013-06-18
dc.identifier.citation	CHEN QIANG (2013-06-18). BUILDING EFFECTIVE AND SCALABLE VISUAL OBJECT RECOGNITION SYSTEMS. ScholarBank@NUS Repository.
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/47343
dc.description.abstract	Visual object recognition is of fundamental importance to artificial intelligence. In this thesis, we aim to build the most effective general object recognition system on well-known benchmarks, e.g. PASCAL VOC. Furthermore, we successfully scale this system to a large-scale setting with much lower complexity than other works. This thesis addresses a number of key issues that must be solved to build a working system. For feature representation, we first introduce SuperCoding, which extends GMM-based coding to second-order statistics while retaining its favourable linearity. Based on the coded features, we perform object-centric pooling by means of the proposed Generalized Hierarchical Matching (GHM) with useful side information. For model learning, we consider the high-level task context from the object detection and classification tasks. We develop a novel mutual and iterative contextualization scheme for both tasks based on the so-called Contextualized Support Vector Machine (Context-SVM) method. Extensive experiments show the effectiveness of these novel methods. Furthermore, we scale this effective system to the large-scale setting with thousands of categories and millions of images. By means of efficient Pointwise Fisher Vector coding, per-pixel pooling and context modelling, our experiments show that the proposed system can perform detection of 1000 object classes in less than one minute on the ImageNet ILSVRC2012 dataset using a single CPU, while achieving performance comparable to state-of-the-art algorithms. To sum up, by utilizing several novel key components, we build an effective visual object recognition system demonstrated on benchmarks and propose a scalable solution for the large-scale object recognition problem.
dc.language.iso	en
dc.subject	artificial intelligence, computer vision, object recognition, system, scalability
dc.type	Thesis
dc.contributor.department	ELECTRICAL & COMPUTER ENGINEERING
dc.contributor.supervisor	YAN SHUICHENG
dc.description.degree	Ph.D
dc.description.degreeconferred	DOCTOR OF PHILOSOPHY
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Ph.D Theses (Open)

Files in This Item:

File	Description	Size	Format	Access Settings	Version
ChenQ.pdf		5.83 MB	Adobe PDF	OPEN	None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.