Lizi Liao

Email Address
dcsliao@nus.edu.sg


Organizational Units
COMPUTING
faculty

Publication Search Results

Now showing 1 - 10 of 13
  • Publication
    Knowledge Enhanced Neural Fashion Trend Forecasting
    (Association for Computing Machinery, Inc, 2020-10-26) Yunshan Ma; Yujuan Ding; Xun Yang; Lizi Liao; Wai Keung Wong; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    Fashion trend forecasting is a crucial task for both academia and industry. Although some efforts have been devoted to tackling this challenging task, they only studied limited fashion elements with highly seasonal or simple patterns, which could hardly reveal the real fashion trends. Towards insightful fashion trend forecasting, this work focuses on investigating fine-grained fashion element trends for specific user groups. We first contribute a large-scale fashion trend dataset (FIT) collected from Instagram with extracted time series fashion element records and user information. Furthermore, to effectively model the time series data of fashion elements with rather complex patterns, we propose a Knowledge Enhanced Recurrent Network model (KERN) which takes advantage of the capability of deep recurrent neural networks in modeling time series data. Moreover, it leverages internal and external knowledge in the fashion domain that affects the time-series patterns of fashion element trends. Such incorporation of domain knowledge further enhances the deep learning model in capturing the patterns of specific fashion elements and predicting future trends. Extensive experiments demonstrate that the proposed KERN model can effectively capture the complicated patterns of objective fashion elements, thereby making preferable fashion trend forecasts. © 2020 ACM.
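    A minimal sketch of the general idea described in the abstract, not the paper's KERN architecture: a recurrent network forecasts a fashion-element trend series, with a hypothetical per-step "knowledge" feature vector (e.g. season indicators or related-element trends) concatenated to the trend value at each time step. All layer sizes are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TrendForecaster(nn.Module):
        def __init__(self, knowledge_dim=8, hidden=32):
            super().__init__()
            # trend value plus knowledge features feed the recurrent layer
            self.rnn = nn.LSTM(input_size=1 + knowledge_dim,
                               hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)   # next-step trend value

        def forward(self, trend, knowledge):
            # trend: (batch, steps, 1); knowledge: (batch, steps, knowledge_dim)
            out, _ = self.rnn(torch.cat([trend, knowledge], dim=-1))
            return self.head(out[:, -1])       # forecast for the following step

    # usage: one training step on a toy 12-step series
    model = TrendForecaster()
    pred = model(torch.randn(4, 12, 1), torch.randn(4, 12, 8))
    loss = nn.MSELoss()(pred, torch.randn(4, 1))
    loss.backward()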
  • Publication
    Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media
    (2019-10-21) Yunshan Ma; Xun Yang; Lizi Liao; Yixin Cao; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    Fashion knowledge helps people to dress properly and addresses not only the physiological needs of users, but also the demands of social activities and conventions. It usually involves three mutually related aspects: occasion, person, and clothing. However, there are few works focusing on extracting such knowledge, which would greatly benefit many downstream applications, such as fashion recommendation. In this paper, we propose a novel method to automatically harvest fashion knowledge from social media. We unify the three tasks of occasion, person, and clothing discovery from multiple modalities of images, texts, and metadata. For person detection and analysis, we use off-the-shelf tools due to their flexibility and satisfactory performance. For clothing recognition and occasion prediction, we unify the two tasks by using a contextualized fashion concept learning module, which captures the dependencies and correlations among different fashion concepts. To alleviate the heavy burden of human annotation, we introduce a weak label modeling module which can effectively exploit machine-labeled data as a complement to clean data. In experiments, we contribute a benchmark dataset and conduct extensive experiments from both quantitative and qualitative perspectives. The results demonstrate the effectiveness of our model in fashion concept prediction, and the usefulness of the extracted knowledge through comprehensive analysis. © 2019 Association for Computing Machinery.
  • Publication
    Knowledge-aware Multimodal Dialog Systems
    (Association for Computing Machinery, Inc, 2018-10-26) Lizi Liao; Yunshan Ma; Xiangnan He; Richang Hong; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    By offering a natural way for information seeking, multimodal dialogue systems are attracting increasing attention in several domains such as retail and travel. However, most existing dialogue systems are limited to the textual modality, which cannot be easily extended to capture the rich semantics in the visual modality, such as product images. For example, in the fashion domain, the visual appearance of clothes and matching styles play a crucial role in understanding the user's intention. Without considering these, the dialogue agent may fail to generate desirable responses for users. In this paper, we present a Knowledge-aware Multimodal Dialogue (KMD) model to address the limitation of text-based dialogue systems. It gives special consideration to the semantics and domain knowledge revealed in visual content, and features three key components. First, we build a taxonomy-based learning module to capture the fine-grained semantics in images (e.g., the category and attributes of a product). Second, we propose an end-to-end neural conversational model to generate responses based on the conversation history, visual semantics, and domain knowledge. Lastly, to avoid inconsistent dialogues, we adopt a deep reinforcement learning method which accounts for future rewards to optimize the neural conversational model. We perform extensive evaluation on a multi-turn task-oriented dialogue dataset in the fashion domain. Experiment results show that our method significantly outperforms state-of-the-art methods, demonstrating the efficacy of modeling the visual modality and domain knowledge for dialogue systems. © 2018 Association for Computing Machinery.
  • Publication
    Dialogue state tracking with incremental reasoning
    (MIT Press Journals, 2021-01-01) Liao, Lizi; Long, Le Hong; Ma, Yunshan; Lei, Wenqiang; Chua, Tat-Seng; DEPARTMENT OF COMPUTER SCIENCE
    Tracking dialogue states to better interpret user goals and feed downstream policy learning is a bottleneck in dialogue management. Common practice has been to treat it as a problem of classifying dialogue content into a set of pre-defined slot-value pairs, or of generating values for different slots given the dialogue history. Both have limitations in considering the dependencies that occur across dialogue turns, and both lack reasoning capabilities. This paper proposes to track dialogue states incrementally, reasoning over dialogue turns with the help of back-end data. Empirical results demonstrate that our method outperforms state-of-the-art methods in terms of joint belief accuracy on MultiWOZ 2.1, a large-scale human–human dialogue dataset spanning multiple domains. © 2021, MIT Press Journals. All rights reserved.
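    A minimal sketch of the conventional classification formulation mentioned above (not the paper's reasoning-based tracker): a fixed-size encoding of the dialogue history is classified into a value for each pre-defined slot. The slot names and dimensions below are hypothetical.

    import torch
    import torch.nn as nn

    SLOT_VALUES = {"restaurant_area": 5, "restaurant_pricerange": 3}  # candidate values per slot

    class SlotClassifierTracker(nn.Module):
        def __init__(self, enc_dim=128):
            super().__init__()
            # one classification head per pre-defined slot
            self.heads = nn.ModuleDict({
                slot: nn.Linear(enc_dim, n) for slot, n in SLOT_VALUES.items()
            })

        def forward(self, dialogue_encoding):
            # returns a distribution over candidate values for every slot
            return {slot: head(dialogue_encoding).softmax(-1)
                    for slot, head in self.heads.items()}

    # usage: predict a belief state from a (placeholder) dialogue encoding
    tracker = SlotClassifierTracker()
    belief_state = tracker(torch.randn(1, 128))
    print({slot: probs.argmax(-1).item() for slot, probs in belief_state.items()})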
  • Publication
    Neural Collaborative Filtering
    (International World Wide Web Conferences Steering Committee, 2017-03-04) Xiangnan He; Lizi Liao; Hanwang Zhang; Liqiang Nie; Xia Hu; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    In recent years, deep neural networks have yielded immense success in speech recognition, computer vision, and natural language processing. However, the exploration of deep neural networks for recommender systems has received relatively less scrutiny. In this work, we strive to develop techniques based on neural networks to tackle the key problem in recommendation — collaborative filtering — on the basis of implicit feedback. Although some recent work has employed deep learning for recommendation, it primarily used deep learning to model auxiliary information, such as textual descriptions of items and acoustic features of music. When it comes to modeling the key factor in collaborative filtering — the interaction between user and item features — these works still resorted to matrix factorization and applied an inner product on the latent features of users and items. By replacing the inner product with a neural architecture that can learn an arbitrary function from data, we present a general framework named NCF, short for Neural network-based Collaborative Filtering. NCF is generic and can express and generalize matrix factorization under its framework. To supercharge NCF modelling with non-linearities, we propose to leverage a multi-layer perceptron to learn the user–item interaction function. Extensive experiments on two real-world datasets show significant improvements of our proposed NCF framework over the state-of-the-art methods. Empirical evidence shows that using deeper layers of neural networks offers better recommendation performance. © 2017 International World Wide Web Conference Committee (IW3C2).
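    A minimal sketch of the core NCF idea (the layer sizes and training loop are illustrative assumptions, not the paper's exact configuration): the inner product of matrix factorization is replaced by a multi-layer perceptron over concatenated user and item embeddings, trained on implicit feedback with a binary cross-entropy loss.

    import torch
    import torch.nn as nn

    class NCF(nn.Module):
        def __init__(self, num_users, num_items, dim=32, hidden=(64, 32, 16)):
            super().__init__()
            self.user_emb = nn.Embedding(num_users, dim)
            self.item_emb = nn.Embedding(num_items, dim)
            layers, in_dim = [], 2 * dim
            for h in hidden:                        # MLP over the concatenated
                layers += [nn.Linear(in_dim, h),    # user/item latent vectors
                           nn.ReLU()]
                in_dim = h
            self.mlp = nn.Sequential(*layers)
            self.out = nn.Linear(in_dim, 1)         # predicted interaction score

        def forward(self, users, items):
            x = torch.cat([self.user_emb(users), self.item_emb(items)], dim=-1)
            return torch.sigmoid(self.out(self.mlp(x))).squeeze(-1)

    # usage: score a (user, item) pair and train against an implicit-feedback label
    model = NCF(num_users=1000, num_items=5000)
    score = model(torch.tensor([3]), torch.tensor([42]))
    loss = nn.BCELoss()(score, torch.tensor([1.0]))   # observed interaction = 1
    loss.backward()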
  • Publication
    Automatic Fashion Knowledge Extraction from Social Media
    (2019-10-21) Yunshan Ma; Lizi Liao; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    Fashion knowledge plays a pivotal role in helping people dress properly. In this paper, we present a novel system to automatically harvest fashion knowledge from social media. It unifies three tasks of occasion, person, and clothing discovery from multiple modalities of images, texts, and metadata. A contextualized fashion concept learning model is applied to leverage the rich contextual information for improving fashion concept learning performance. At the same time, to counter the label noise within the training data, we employ a weak label modeling method to further boost performance. We build a website to demonstrate the quality of the fashion knowledge extracted by our system. © 2019 Association for Computing Machinery.
  • Publication
    Crowd Knowledge Enhanced Multimodal Conversational Assistant in Travel Domain
    (Springer, 2020-01-05) Lizi Liao; Lyndon Kennedy; Lynn Wilcox; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    We present a new solution towards building a crowd knowledge enhanced multimodal conversational system for travel. It aims to assist users in completing various travel-related tasks, such as searching for restaurants or things to do, in a multimodal conversational manner involving both text and images. To achieve this goal, we ground this research in the combination of multimodal understanding and recommendation techniques, which explores the possibility of a more convenient information seeking paradigm. Specifically, we build the system in a modular manner where each module is enriched with crowd knowledge from social sites. To the best of our knowledge, this is the first work that attempts to build intelligent multimodal conversational systems for travel, and it takes an important step towards developing human-like assistants for the completion of daily life tasks. Several current challenges are also pointed out as our future directions. © 2020, Springer Nature Switzerland AG.
  • Publication
    Attributed Social Network Embedding
    (IEEE Computer Society, 2017-05-12) Lizi Liao; Xiangnan He; Hanwang Zhang; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    Embedding network data into a low-dimensional vector space has shown promising performance for many real-world applications, such as node classification and entity retrieval. However, most existing methods have focused only on leveraging network structure. For social networks, besides the network structure, there also exists rich information about social actors, such as user profiles in friendship networks and textual content in citation networks. This rich attribute information about social actors reveals the homophily effect, exerting a huge impact on the formation of social networks. In this paper, we explore attributes as a rich evidence source in social networks to improve network embedding. We propose a generic Attributed Social Network Embedding framework (ASNE), which learns representations for social actors (i.e., nodes) by preserving both structural proximity and attribute proximity. While the structural proximity captures the global network structure, the attribute proximity accounts for the homophily effect. To justify our proposal, we conduct extensive experiments on four real-world social networks. Compared to the state-of-the-art network embedding approaches, ASNE learns more informative representations, achieving substantial gains on the tasks of link prediction and node classification. Specifically, ASNE significantly outperforms node2vec with an 8.2 percent relative improvement on the link prediction task, and a 12.7 percent gain on the node classification task. © 1989-2012 IEEE.
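    A minimal sketch of the joint objective described above (hypothetical dimensions, not the authors' implementation): each node gets an id-based structural embedding and an attribute-derived embedding, and both feed a shared neighbour predictor so that the learned representation preserves structural proximity (link reconstruction) together with attribute proximity (homophily).

    import torch
    import torch.nn as nn

    class ASNESketch(nn.Module):
        def __init__(self, num_nodes, attr_dim, dim=64):
            super().__init__()
            self.struct_emb = nn.Embedding(num_nodes, dim)   # id-based structure embedding
            self.attr_proj = nn.Linear(attr_dim, dim)        # attribute embedding
            self.score = nn.Linear(2 * dim, num_nodes)       # predict linked neighbours

        def forward(self, node_ids, node_attrs):
            h = torch.cat([self.struct_emb(node_ids),
                           torch.relu(self.attr_proj(node_attrs))], dim=-1)
            return self.score(h)                             # logits over neighbour nodes

    # usage: maximise the likelihood of an observed edge (node 7 -> node 42)
    model = ASNESketch(num_nodes=100, attr_dim=16)
    logits = model(torch.tensor([7]), torch.randn(1, 16))
    loss = nn.CrossEntropyLoss()(logits, torch.tensor([42]))
    loss.backward()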
  • Publication
    Interpretable Multimodal Retrieval for Fashion Products.
    (Association for Computing Machinery, Inc, 2018-10-26) Lizi Liao; Xiangnan He; Bo Zhao; Chong-Wah Ngo; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    Deep learning methods have been successfully applied to fashion retrieval. However, the latent meaning of learned feature vectors hinders the explanation of retrieval results and the integration of user feedback. Fortunately, many online shopping websites organize fashion items into hierarchical structures based on product taxonomy and domain knowledge. Such structures help reveal how humans perceive the relatedness among fashion products. Nevertheless, incorporating structural knowledge into deep learning remains a challenging problem. This paper presents techniques for organizing and utilizing fashion hierarchies in deep learning to facilitate the reasoning of search results and user intent. The novelty of our work originates from the development of an EI (Exclusive & Independent) tree that can cooperate with deep models for end-to-end multimodal learning. The EI tree organizes fashion concepts into multiple semantic levels and augments the tree structure with exclusive as well as independent constraints. It describes the different relationships among sibling concepts and guides the end-to-end learning of multi-level fashion semantics. From the EI tree, we learn an explicit hierarchical similarity function to characterize the semantic similarities among fashion products. This facilitates an interpretable retrieval scheme that can integrate concept-level feedback. Experiment results on two large fashion datasets show that the proposed approach can characterize the semantic similarities among fashion items accurately and capture users' search intent precisely, leading to more accurate search results compared to the state-of-the-art methods. © 2018 Association for Computing Machinery.
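    A toy illustration only (the paper learns its hierarchical similarity end-to-end; this merely shows the kind of structure-aware comparison a concept tree enables): two products can be compared by how deep their root-to-leaf concept paths agree, so sibling concepts under "dress" score higher than a dress/shoe pair. The concept names are hypothetical.

    def path_similarity(path_a, path_b):
        """Paths are root-to-leaf concept lists, e.g. ["clothing", "dress", "maxi"]."""
        shared = 0
        for a, b in zip(path_a, path_b):
            if a != b:
                break
            shared += 1                      # count the common prefix of the two paths
        return shared / max(len(path_a), len(path_b))

    print(path_similarity(["clothing", "dress", "maxi"],
                          ["clothing", "dress", "mini"]))   # ~0.67
    print(path_similarity(["clothing", "dress", "maxi"],
                          ["clothing", "shoes", "boot"]))   # ~0.33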
  • Publication
    Neural Multimodal Belief Tracker with Adaptive Attention for Dialogue Systems
    (Association for Computing Machinery, Inc, 2019-05-13) Zheng Zhang; Lizi Liao; Minlie Huang; Xiaoyan Zhu; Tat-Seng Chua; DEPARTMENT OF COMPUTER SCIENCE
    Multimodal dialogue systems are attracting increasing attention as a more natural and informative way for human-computer interaction. As one of their core components, the belief tracker estimates the user's goal at each step of the dialogue and provides a direct way to validate the ability of dialogue understanding. However, existing studies on belief trackers are largely limited to the textual modality, which cannot be easily extended to capture the rich semantics in multimodal systems such as those with product images. For example, in the fashion domain, the visual appearance of clothes plays a crucial role in understanding the user's intention. In this case, existing belief trackers may fail to generate accurate belief states for a multimodal dialogue system. In this paper, we present the first neural multimodal belief tracker (NMBT) to demonstrate how multimodal evidence can facilitate semantic understanding and dialogue state tracking. Given the multimodal inputs, while applying a textual encoder to represent textual utterances, the model gives special consideration to the semantics revealed in the visual modality. It learns concept-level fashion semantics by delving deep into image sub-regions and integrating concept probabilities via multiple instance learning. Then, in each turn, an adaptive attention mechanism learns to automatically emphasize different evidence sources from both visual and textual modalities for more accurate dialogue state prediction. We perform extensive evaluation on a multi-turn task-oriented dialogue dataset in the fashion domain and the results show that our method achieves superior performance compared to a wide range of baselines. © 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
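    A minimal sketch of the adaptive-attention idea described above (assumed dimensions and a generic classifier head, not the NMBT model itself): per-turn textual and visual feature vectors are weighted by learned attention scores before the fused vector is used to predict the belief state.

    import torch
    import torch.nn as nn

    class AdaptiveFusion(nn.Module):
        def __init__(self, dim=64, num_states=10):
            super().__init__()
            self.attn = nn.Linear(dim, 1)              # scores each modality's evidence
            self.classifier = nn.Linear(dim, num_states)

        def forward(self, text_feat, visual_feat):
            # stack the two evidence sources: (batch, 2, dim)
            evidence = torch.stack([text_feat, visual_feat], dim=1)
            weights = self.attn(evidence).softmax(dim=1)     # adaptive emphasis per modality
            fused = (weights * evidence).sum(dim=1)          # weighted combination
            return self.classifier(fused)                    # belief-state logits

    # usage: fuse placeholder textual and visual features for a batch of two turns
    model = AdaptiveFusion()
    logits = model(torch.randn(2, 64), torch.randn(2, 64))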