Please use this identifier to cite or link to this item: https://doi.org/10.1145/3343031.3351034
DC Field                       Value
dc.title                       MMGCN: Multimodal Graph Convolution Network for Personalized Recommendation of Micro-video
dc.contributor.author          Yinwei Wei
dc.contributor.author          Xiang Wang
dc.contributor.author          Liqiang Nie
dc.contributor.author          Xiangnan He
dc.contributor.author          Richang Hong
dc.contributor.author          Tat-Seng Chua
dc.date.accessioned            2020-05-06T04:14:44Z
dc.date.available              2020-05-06T04:14:44Z
dc.date.issued                 2019-10-21
dc.identifier.citation         Yinwei Wei, Xiang Wang, Liqiang Nie, Xiangnan He, Richang Hong, Tat-Seng Chua (2019-10-21). MMGCN: Multimodal Graph Convolution Network for Personalized Recommendation of Micro-video. ACM MM 2019: 1437-1445. ScholarBank@NUS Repository. https://doi.org/10.1145/3343031.3351034
dc.identifier.isbn             9781450368896
dc.identifier.uri              https://scholarbank.nus.edu.sg/handle/10635/167770
dc.description.abstract        Personalized recommendation plays a central role in many online content sharing platforms. To provide a quality micro-video recommendation service, it is crucial to consider the interactions between users and items (i.e., micro-videos) as well as the item contents from various modalities (e.g., visual, acoustic, and textual). Existing work on multimedia recommendation largely exploits multi-modal contents to enrich item representations, while less effort is made to leverage the information interchange between users and items to enhance user representations and capture users' fine-grained preferences on different modalities. In this paper, we propose to exploit user-item interactions to guide the representation learning in each modality and, in turn, personalize micro-video recommendation. We design a Multimodal Graph Convolution Network (MMGCN) framework built upon the message-passing idea of graph neural networks, which yields modal-specific representations of users and micro-videos to better capture user preferences. Specifically, we construct a user-item bipartite graph in each modality and enrich the representation of each node with the topological structure and features of its neighbors. Through extensive experiments on three publicly available datasets, Tiktok, Kwai, and MovieLens, we demonstrate that our proposed model significantly outperforms state-of-the-art multi-modal recommendation methods. © 2019 Association for Computing Machinery.
dc.subject                     Graph Convolution Network
dc.subject                     Micro-video Understanding
dc.subject                     Multi-modal Recommendation
dc.type                        Conference Paper
dc.contributor.department      DEPT OF COMPUTER SCIENCE
dc.description.doi             10.1145/3343031.3351034
dc.description.sourcetitle     ACM MM 2019
dc.description.page            1437-1445
dc.grant.id                    R-252-300-002-490
dc.grant.fundingagency         Infocomm Media Development Authority
dc.grant.fundingagency         National Research Foundation
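
Note: the following is a minimal, illustrative PyTorch sketch of the modal-specific graph convolution described in the abstract above: a user-item bipartite graph per modality, a few message-passing layers that enrich each node with its neighbors' features, and a combination of the resulting modal-specific representations. It is not the authors' code; the class names, layer sizes, residual connection, and additive fusion step are assumptions made for illustration only.

    import torch
    import torch.nn as nn

    class ModalGCNLayer(nn.Module):
        """One propagation layer on the user-item bipartite graph of a single modality."""

        def __init__(self, dim):
            super().__init__()
            self.linear = nn.Linear(dim, dim)

        def forward(self, x, adj):
            # adj: row-normalized (users+items) x (users+items) adjacency matrix.
            # Aggregate neighbor representations, transform them, and keep a
            # residual connection so each node retains its own features.
            neighbor = adj @ x
            return torch.relu(self.linear(neighbor) + x)

    class MMGCNSketch(nn.Module):
        """Modal-specific propagation followed by a simple additive fusion (an assumption)."""

        def __init__(self, num_users, num_items, modal_dims, dim=64, num_layers=2):
            super().__init__()
            self.num_users = num_users
            # Learnable user embeddings per modality; raw item features
            # (visual/acoustic/textual) are projected into the same space.
            self.user_emb = nn.ModuleDict({m: nn.Embedding(num_users, dim) for m in modal_dims})
            self.item_proj = nn.ModuleDict({m: nn.Linear(d, dim) for m, d in modal_dims.items()})
            self.layers = nn.ModuleDict(
                {m: nn.ModuleList([ModalGCNLayer(dim) for _ in range(num_layers)]) for m in modal_dims}
            )

        def forward(self, item_feats, adj):
            fused = 0.0
            for m, feats in item_feats.items():
                # Stack user embeddings and projected item features into one node matrix.
                x = torch.cat([self.user_emb[m].weight, self.item_proj[m](feats)], dim=0)
                for layer in self.layers[m]:
                    x = layer(x, adj)
                fused = fused + x  # additive fusion across modalities (illustrative choice)
            return fused  # rows [0, num_users) are users, the remaining rows are items

    # Toy usage: 3 users, 4 micro-videos, two modalities with random features.
    num_users, num_items = 3, 4
    model = MMGCNSketch(num_users, num_items, {"visual": 128, "acoustic": 64})
    n = num_users + num_items
    adj = torch.rand(n, n)
    adj = adj / adj.sum(dim=1, keepdim=True)  # crude row normalization for the demo
    item_feats = {"visual": torch.rand(num_items, 128), "acoustic": torch.rand(num_items, 64)}
    reps = model(item_feats, adj)
    scores = reps[:num_users] @ reps[num_users:].T  # user-item preference scores
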
Appears in Collections: Staff Publications; Elements

Files in This Item:
File                   Description    Size       Format       Access Settings    Version
3343031.3351034.pdf                   3.29 MB    Adobe PDF    OPEN               None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.