Please use this identifier to cite or link to this item: https://doi.org/10.1145/2393347.2393397
DC Field / Value
dc.title: Don't ask me what I'm like, just watch and listen
dc.contributor.author: Srivastava, R.
dc.contributor.author: Feng, J.
dc.contributor.author: Roy, S.
dc.contributor.author: Yan, S.
dc.contributor.author: Sim, T.
dc.date.accessioned: 2013-07-23T09:30:56Z
dc.date.available: 2013-07-23T09:30:56Z
dc.date.issued: 2012
dc.identifier.citation: Srivastava, R., Feng, J., Roy, S., Yan, S., Sim, T. (2012). Don't ask me what I'm like, just watch and listen. MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia: 329-338. ScholarBank@NUS Repository. https://doi.org/10.1145/2393347.2393397
dc.identifier.isbn: 9781450310895
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/43321
dc.description.abstract: Traditional, psychology-based approaches to personality assessment require an individual to fill out a questionnaire. This paper presents a novel way of utilizing multimodal cues to fill out the questionnaire automatically. The contributions of this work are three-fold. (1) Novel psychology-based audio/visual/lexical features are proposed and shown to be effective in predicting answers to a personality questionnaire, the Big-Five Inventory-10 (BFI-10). (2) The extracted features are used to learn linear and kernel versions of a novel regression model, 'SLoT', based on Sparse and Low-rank Transformation, which automatically predicts the BFI-10 answers. (3) The predicted answers are used to compute personality scores using the standard BFI-10 scoring scheme. We evaluated our approach on a dataset of 3907 clips (for 50 characters from movies of diverse genres) manually labeled with BFI-10 answers and personality scores as ground truth. Experiments indicate that the proposed 'SLoT' model effectively automates the answering process by emulating human understanding. We also conclude that predicting personality scores by first predicting answers is better than directly predicting scores from audio/visual features (as studied in state-of-the-art methods). © 2012 ACM.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1145/2393347.2393397
dc.source: Scopus
dc.subject: emotion recognition
dc.subject: movie analysis
dc.subject: multimodal features
dc.subject: personality assessment
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.contributor.department: ELECTRICAL & COMPUTER ENGINEERING
dc.description.doi: 10.1145/2393347.2393397
dc.description.sourcetitle: MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia
dc.description.page: 329-338
dc.identifier.isiut: NOT_IN_WOS
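
The abstract above states that predicted BFI-10 answers are converted to Big-Five personality scores with the standard BFI-10 scoring scheme. Below is a minimal Python sketch of that final scoring step, assuming the commonly published BFI-10 key (Rammstedt & John, 2007): ten items rated on a 1-5 Likert scale, reverse-keyed items scored as 6 minus the rating, and each trait score taken as the mean of its two items. The item-to-trait mapping and the function name score_bfi10 are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of standard BFI-10 scoring (not the paper's code).
# Assumptions: 1-5 Likert answers, reverse-keyed items scored as 6 - rating,
# trait score = mean of its two items (commonly published BFI-10 key).
from typing import Dict, List, Sequence, Tuple

# Trait -> (1-based item number, reverse-keyed?) pairs; the mapping is illustrative.
BFI10_KEY: Dict[str, List[Tuple[int, bool]]] = {
    "Extraversion":      [(1, True),  (6, False)],
    "Agreeableness":     [(2, False), (7, True)],
    "Conscientiousness": [(3, True),  (8, False)],
    "Neuroticism":       [(4, True),  (9, False)],
    "Openness":          [(5, True),  (10, False)],
}

def score_bfi10(answers: Sequence[int]) -> Dict[str, float]:
    """Compute Big-Five trait scores from 10 Likert answers (item 1 first)."""
    if len(answers) != 10 or not all(1 <= a <= 5 for a in answers):
        raise ValueError("expected 10 answers on a 1-5 scale")
    scores = {}
    for trait, items in BFI10_KEY.items():
        vals = [(6 - answers[i - 1]) if rev else answers[i - 1] for i, rev in items]
        scores[trait] = sum(vals) / len(vals)  # mean of the trait's two items
    return scores

# Example: scoring one set of (e.g. model-predicted) questionnaire answers.
print(score_bfi10([3, 4, 2, 3, 1, 5, 2, 4, 3, 5]))
```

In the paper's pipeline the answers would come from the SLoT regression model rather than from a human respondent; the scoring step itself is questionnaire-standard.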
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.
