Please use this identifier to cite or link to this item: https://doi.org/10.1109/TMM.2020.3042706
Title: A Hybrid Approach for Detecting Prerequisite Relations in Multi-Modal Food Recipes
Authors: Pan, Liangming
Chen, Jingjing 
Liu, Shaoteng
Ngo, Chong-Wah 
Kan, Min-Yen 
Chua, Tat-Seng 
Keywords: Science & Technology
Technology
Computer Science, Information Systems
Computer Science, Software Engineering
Telecommunications
Computer Science
Feature extraction
Training
Task analysis
Semantics
Pipelines
Deep learning
Predictive models
Food recipes
cooking workflow
prerequisite trees
multi-modal fusion
cause-and-effect reasoning
deep learning
Issue Date: 1-Jan-2021
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation: Pan, Liangming, Chen, Jingjing, Liu, Shaoteng, Ngo, Chong-Wah, Kan, Min-Yen, Chua, Tat-Seng (2021-01-01). A Hybrid Approach for Detecting Prerequisite Relations in Multi-Modal Food Recipes. IEEE Transactions on Multimedia 23 : 4491-4501. ScholarBank@NUS Repository. https://doi.org/10.1109/TMM.2020.3042706
Abstract: Modeling the structure of culinary recipes is the core of recipe representation learning. Current approaches mostly focus on extracting the workflow graph from recipes based on text descriptions. Process images, which constitute an important part of cooking recipes, have rarely been investigated in recipe structure modeling. We study this recipe structure problem from a multi-modal learning perspective by proposing a prerequisite tree to represent recipes with cooking images at a step-level granularity. We propose a simple yet effective two-stage framework to automatically construct the prerequisite tree for a recipe by (1) using a trained classifier that fuses multi-modal features as input to detect pairwise prerequisite relations; then (2) applying different strategies (greedy method, maximum weight, and beam search) to build the tree structure. Experiments on the MM-ReS dataset demonstrate the advantages of introducing process images for recipe structure modeling. Also, compared with neural methods that require large amounts of training data, we show that our two-stage pipeline can achieve promising results using only 400 labeled prerequisite trees as training data.
URI: https://scholarbank.nus.edu.sg/handle/10635/229622
ISSN: 15209210
19410077
DOI: 10.1109/TMM.2020.3042706
Appears in Collections: Staff Publications
Elements

Files in This Item:
File: TMM20_Paper.pdf
Size: 7.94 MB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
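
Illustrative note: the two-stage framework summarized in the abstract (pairwise prerequisite scores first, tree construction second) can be pictured with a minimal Python sketch. This is not the authors' implementation. It assumes a hypothetical `scores` dictionary in which `scores[(i, j)]` is a classifier's confidence that step `i` is a prerequisite of a later step `j`, and it assumes a simple greedy rule in which every step after the first picks its single highest-scoring earlier step as its parent. The function name, the data layout, and the numbers are invented for illustration, and the paper's own greedy, maximum-weight, and beam-search strategies may differ.

```python
from typing import Dict, List, Tuple


def greedy_prerequisite_tree(
    scores: Dict[Tuple[int, int], float], num_steps: int
) -> List[int]:
    """Return parent[j] for every step j; the first step is the root (-1).

    Assumption (not from the paper): each later step j greedily attaches to
    the earlier step i with the highest pairwise score scores[(i, j)].
    Missing pairs are treated as having score 0.0.
    """
    parent = [-1] * num_steps
    for j in range(1, num_steps):
        # Only earlier steps (i < j) are candidate prerequisites, so the
        # result can never contain a cycle.
        parent[j] = max(range(j), key=lambda i: scores.get((i, j), 0.0))
    return parent


# Hypothetical pairwise scores for a four-step recipe.
scores = {
    (0, 1): 0.9,
    (0, 2): 0.2, (1, 2): 0.7,
    (0, 3): 0.6, (1, 3): 0.1, (2, 3): 0.3,
}

tree = greedy_prerequisite_tree(scores, 4)
print(tree)  # [-1, 0, 1, 0]
```

In this toy example, step 1 attaches to step 0, step 2 attaches to step 1, and step 3 attaches to step 0. In the setting the abstract describes, the scores would come from the trained multi-modal classifier rather than being hand-written.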