Please use this identifier to cite or link to this item: https://doi.org/10.1109/TMM.2020.3042706
DC Field: Value
dc.title: A Hybrid Approach for Detecting Prerequisite Relations in Multi-Modal Food Recipes
dc.contributor.author: Pan, Liangming
dc.contributor.author: Chen, Jingjing
dc.contributor.author: Liu, Shaoteng
dc.contributor.author: Ngo, Chong-Wah
dc.contributor.author: Kan, Min-Yen
dc.contributor.author: Chua, Tat-Seng
dc.date.accessioned: 2022-08-01T05:56:08Z
dc.date.available: 2022-08-01T05:56:08Z
dc.date.issued: 2021-01-01
dc.identifier.citation: Pan, Liangming, Chen, Jingjing, Liu, Shaoteng, Ngo, Chong-Wah, Kan, Min-Yen, Chua, Tat-Seng (2021-01-01). A Hybrid Approach for Detecting Prerequisite Relations in Multi-Modal Food Recipes 23 : 4491-4501. ScholarBank@NUS Repository. https://doi.org/10.1109/TMM.2020.3042706
dc.identifier.issn: 15209210
dc.identifier.issn: 19410077
dc.identifier.uri: https://scholarbank.nus.edu.sg/handle/10635/229622
dc.description.abstract: Modeling the structure of culinary recipes is the core of recipe representation learning. Current approaches mostly focus on extracting the workflow graph from recipes based on text descriptions. Process images, which constitute an important part of cooking recipes, have rarely been investigated in recipe structure modeling. We study this recipe structure problem from a multi-modal learning perspective, by proposing a prerequisite tree to represent recipes with cooking images at a step-level granularity. We propose a simple-yet-effective two-stage framework to automatically construct the prerequisite tree for a recipe by (1) utilizing a trained classifier that fuses multi-modal features as input to detect pairwise prerequisite relations; then (2) applying different strategies (greedy method, maximum weight, and beam search) to build the tree structure. Experiments on the MM-ReS dataset demonstrate the advantages of introducing process images for recipe structure modeling. Also, compared with neural methods which require large amounts of training data, we show that our two-stage pipeline can achieve promising results using only 400 labeled prerequisite trees as training data.
dc.publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.source: Elements
dc.subject: Science & Technology
dc.subject: Technology
dc.subject: Computer Science, Information Systems
dc.subject: Computer Science, Software Engineering
dc.subject: Telecommunications
dc.subject: Computer Science
dc.subject: Feature extraction
dc.subject: Training
dc.subject: Task analysis
dc.subject: Semantics
dc.subject: Pipelines
dc.subject: Deep learning
dc.subject: Predictive models
dc.subject: Food recipes
dc.subject: cooking workflow
dc.subject: prerequisite trees
dc.subject: multi-modal fusion
dc.subject: cause-and-effect reasoning
dc.subject: deep learning
dc.type: Conference Paper
dc.date.updated: 2022-07-19T07:43:23Z
dc.contributor.department: DEPARTMENT OF COMPUTER SCIENCE
dc.description.doi: 10.1109/TMM.2020.3042706
dc.description.volume: 23
dc.description.page: 4491-4501
dc.published.state: Published
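
The second stage described in the abstract (building a tree from pairwise prerequisite scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_prerequisite_tree`, the score-matrix layout, and the rule that each step attaches to exactly one later step are all assumptions made for illustration, and the scores are made-up numbers rather than classifier output.

```python
from typing import Dict, List


def greedy_prerequisite_tree(scores: List[List[float]]) -> Dict[int, int]:
    """Map each step to the later step it most likely feeds into.

    scores[i][j] is a hypothetical classifier confidence that step i is a
    prerequisite of step j; only entries with i < j are read.
    Assumption: every step except the last attaches to one later step,
    so the result is a tree rooted at the final step.
    """
    n = len(scores)
    parent: Dict[int, int] = {}
    for i in range(n - 1):
        # Greedy choice: the later step with the highest pairwise score.
        parent[i] = max(range(i + 1, n), key=lambda j: scores[i][j])
    return parent


# Toy scores for a four-step recipe (illustrative values only).
toy_scores = [
    [0.0, 0.2, 0.7, 0.1],
    [0.0, 0.0, 0.6, 0.3],
    [0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.0],
]
tree = greedy_prerequisite_tree(toy_scores)
print(tree)
```

Because each step picks its parent independently, this greedy variant cannot revise an early choice; the maximum-weight and beam-search strategies named in the abstract would presumably weigh the tree as a whole, but their details are not given in this record.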
Appears in Collections: Staff Publications; Elements

Files in This Item:
File: TMM20_Paper.pdf
Size: 7.94 MB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.