The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech

Please use this identifier to cite or link to this item: https://doi.org/10.1109/APSIPA.2013.6694316

DC Field	Value
dc.title	The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech
dc.contributor.author	Duan, Z.
dc.contributor.author	Fang, H.
dc.contributor.author	Li, B.
dc.contributor.author	Sim, K.C.
dc.contributor.author	Wang, Y.
dc.date.accessioned	2014-07-04T03:15:43Z
dc.date.available	2014-07-04T03:15:43Z
dc.date.issued	2013
dc.identifier.citation	Duan, Z., Fang, H., Li, B., Sim, K.C., Wang, Y. (2013). The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech. 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2013 : -. ScholarBank@NUS Repository. https://doi.org/10.1109/APSIPA.2013.6694316
dc.identifier.isbn	9789869000604
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/78388
dc.description.abstract	Despite a long-standing effort to characterize various aspects of the singing voice and their relations to speech, the lack of a suitable and publicly available dataset has precluded any systematic study on the quantitative difference between singing and speech at the phone level. We hereby present the NUS Sung and Spoken Lyrics Corpus (NUS-48E corpus) as the first step toward a large, phonetically annotated corpus for singing voice research. The corpus is a 169-min collection of audio recordings of the sung and spoken lyrics of 48 (20 unique) English songs by 12 subjects and a complete set of transcriptions and duration annotations at the phone level for all recordings of sung lyrics, comprising 25,474 phone instances. Using the NUS-48E corpus, we conducted a preliminary, quantitative study on the comparison between singing voice and speech. The study includes duration analyses of the sung and spoken lyrics, with a primary focus on the behavior of consonants, and experiments aiming to gauge how acoustic representations of spoken and sung phonemes differ, as well as how duration and pitch variations may affect the Mel Frequency Cepstral Coefficients (MFCC) features. © 2013 APSIPA.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/APSIPA.2013.6694316
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.doi	10.1109/APSIPA.2013.6694316
dc.description.sourcetitle	2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2013
dc.description.page	-
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.
