Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/49363
DC Field | Value
---|---
dc.title | Attribute-based Image Retrieval: Towards Bridging the Semantic and Intention Gaps
dc.contributor.author | ZHANG HANWANG
dc.date.accessioned | 2014-02-28T18:00:44Z
dc.date.available | 2014-02-28T18:00:44Z
dc.date.issued | 2013-11-11
dc.identifier.citation | ZHANG HANWANG (2013-11-11). Attribute-based Image Retrieval: Towards Bridging the Semantic and Intention Gaps. ScholarBank@NUS Repository.
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/49363
dc.description.abstract | This thesis is concerned with Content-based Image Retrieval (CBIR), the task of searching a large repository for images based on their visual content. In particular, we target semantically similar images, which correspond more closely to human needs. Current state-of-the-art solutions model image semantics with popular semantic concepts such as objects (e.g., "dog", "person"), events (e.g., "sports", "birthday"), or scenes (e.g., "outdoor", "wild"). Such high-level semantic concepts have been shown to be promising for CBIR. However, progress is hampered by the "semantic gap" between the extracted low-level visual features and the desired high-level semantics. Moreover, even if images were well annotated with proper concepts, another notorious gap would still lead to unsatisfactory results: the "intention gap" between the intents envisioned by users and the ambiguous semantics delivered by the query at hand, which arises from the inability of the query to express the users' intents precisely. To bridge these two gaps, we propose a novel attribute-based image retrieval framework. Here, attributes refer to properties that characterize objects, such as visual appearance (e.g., "round" as shape, "metallic" as texture), sub-components (e.g., "has wheel", "has leg"), functionality (e.g., "can fly", "can swim"), and various other discriminative properties (e.g., properties that dogs have but cats do not). On one hand, attributes act as intermediate semantics that naturally connect low-level visual features and high-level concepts, narrowing the semantic gap: attributes generally depict common visual properties, which can be extracted and modeled more easily than high-level concepts, whose visual variance is higher. On the other hand, attributes enrich the existing concept-based semantic representation of images and enable a more comprehensive semantic measurement of images. With the help of attributes, users can deliver a more expressive and precise semantic description of their intents, leading to a smaller intention gap. In this thesis, we aim to conduct a thorough study of how attributes may help in CBIR, towards bridging both the semantic and intention gaps. Experiments are systematically conducted on a large-scale real-world Web image data set, and the results conclusively demonstrate the effectiveness of the proposed attribute-based image retrieval architecture.
dc.language.iso | en
dc.subject | image retrieval, attributes, semantic hierarchy, image representation, supervised learning, image semantics
dc.type | Thesis
dc.contributor.department | COMPUTER SCIENCE
dc.contributor.supervisor | CHUA TAT SENG
dc.description.degree | Ph.D.
dc.description.degreeconferred | DOCTOR OF PHILOSOPHY
dc.identifier.isiut | NOT_IN_WOS
Appears in Collections: | Ph.D Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version
---|---|---|---|---|---
thesis.pdf | | 10.07 MB | Adobe PDF | OPEN | None