Title: Attribute-based Image Retrieval: Towards Bridging the Semantic and Intention Gaps
Keywords: image retrieval, attributes, semantic hierarchy, image representation, supervised learning, image semantics
Issue Date: 11-Nov-2013
Citation: ZHANG HANWANG (2013-11-11). Attribute-based Image Retrieval: Towards Bridging the Semantic and Intention Gaps. ScholarBank@NUS Repository.
Abstract: This thesis is concerned with Content-based Image Retrieval (CBIR), the task of searching for images in a large repository based on their visual contents. In particular, we target semantically similar images, which correspond more closely to human needs. The current state-of-the-art solutions model image semantics with popular semantic concepts such as objects (e.g., "dog", "person"), events (e.g., "sports", "birthday"), or scenes (e.g., "outdoor", "wild"). Such high-level semantic concepts have been shown to be promising for CBIR. However, progress is hampered by the "semantic gap" between the extracted low-level visual features and the desired high-level semantics. Moreover, even if images were well annotated with proper concepts, another notorious gap would still lead to unsatisfactory results. This is the "intention gap" between the intent envisioned by the user and the ambiguous semantics delivered by the query at hand, owing to the inability of the query to express the user's intent precisely. To bridge these two gaps, we propose a novel Attribute-based Image Retrieval framework. Here, attributes refer to properties that characterize objects, such as visual appearances (e.g., "round" as shape, "metallic" as texture), sub-components (e.g., "has wheel", "has leg"), functionalities (e.g., "can fly", "can swim"), and various other discriminative properties (e.g., properties that dogs have but cats do not). On one hand, attributes act as intermediate semantics that naturally connect the low-level visual features and the high-level concepts, narrowing the semantic gap. This is because attributes generally depict common visual properties, which can be extracted and modeled more easily than high-level concepts with higher visual variance. On the other hand, attributes enrich the existing concept-based semantic representation of images, enabling a more comprehensive semantic measurement of images.
With the help of attributes, users can deliver a more expressive and precise semantic description of their intent, and hence narrow the intention gap. In this thesis, we aim to conduct a thorough study of how attributes may help in CBIR, towards bridging both the semantic gap and the intention gap. Experiments are systematically conducted on a large-scale, real-world Web image dataset, and the results conclusively demonstrate the effectiveness of the proposed attribute-based image retrieval architecture.
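The core idea of using attributes as intermediate semantics can be illustrated with a minimal sketch: low-level features are mapped to attribute confidence scores by per-attribute classifiers, and images are then compared in the resulting attribute space. The attribute vocabulary, the linear classifiers, and the cosine-similarity ranking below are illustrative assumptions, not the thesis's actual method or data.

```python
import numpy as np

# Hypothetical attribute vocabulary (illustrative, not the thesis's vocabulary).
ATTRIBUTES = ["round", "metallic", "has_wheel", "has_leg", "can_fly", "can_swim"]

def attribute_scores(features, weights, bias):
    """Map low-level visual features to attribute confidences via
    per-attribute linear classifiers (sigmoid of a linear score)."""
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))

def rank_by_attributes(query_vec, db_vecs):
    """Rank database images by cosine similarity in attribute space."""
    q = query_vec / np.linalg.norm(query_vec)
    d = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(-sims), sims

# Toy data standing in for extracted visual features and trained classifiers.
rng = np.random.default_rng(0)
n_feat, n_attr, n_imgs = 16, len(ATTRIBUTES), 5
W = rng.normal(size=(n_feat, n_attr))
b = rng.normal(size=n_attr)

db_features = rng.normal(size=(n_imgs, n_feat))
db_attr = attribute_scores(db_features, W, b)

# Query with the attribute profile of image 2; it should rank itself first.
query_attr = db_attr[2]
order, sims = rank_by_attributes(query_attr, db_attr)
print(order[0])
```

Because the query shares image 2's attribute vector exactly, its cosine similarity with itself is 1 and it tops the ranking; in practice the query-side attribute vector would come from the user's attribute-annotated query rather than from a database image.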
Appears in Collections:Ph.D Theses (Open)

Files in This Item:
thesis.pdf (10.07 MB, Adobe PDF)



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.