Singing voice detection for karaoke application

Please use this identifier to cite or link to this item: https://doi.org/10.1117/12.631645

DC Field	Value
dc.title	Singing voice detection for karaoke application
dc.contributor.author	Shenoy, A.
dc.contributor.author	Wu, Y.
dc.contributor.author	Wang, Y.
dc.date.accessioned	2013-07-04T08:38:31Z
dc.date.available	2013-07-04T08:38:31Z
dc.date.issued	2005
dc.identifier.citation	Shenoy, A., Wu, Y., Wang, Y. (2005). Singing voice detection for karaoke application. Proceedings of SPIE - The International Society for Optical Engineering 5960 (2) : 752-762. ScholarBank@NUS Repository. https://doi.org/10.1117/12.631645
dc.identifier.issn	0277-786X
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/41900
dc.description.abstract	We present a framework to detect the regions of singing voice in musical audio signals. This work is oriented towards the development of a robust transcriber of lyrics for karaoke applications. The technique leverages a combination of low-level audio features and higher-level musical knowledge of rhythm and tonality. Musical knowledge of the key is used to create a song-specific filterbank to attenuate the presence of the pitched musical instruments. This is followed by subband processing of the audio to detect the musical octaves in which the vocals are present. Text processing is employed to approximate the duration of the sung passages using freely available lyrics. This is used to obtain a dynamic threshold for vocal/non-vocal segmentation. This pairing of audio and text processing helps create a more accurate system. Experimental evaluation on a small database of popular songs shows the validity of the proposed approach. Holistic and per-component evaluation of the system is conducted and various improvements are discussed.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1117/12.631645
dc.source	Scopus
dc.subject	Inverse comb filtering
dc.subject	Karaoke
dc.subject	Key
dc.subject	Lyrics
dc.subject	Rhythm
dc.subject	Singing voice
dc.subject	Tonic
dc.subject	Vocal segmentation
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.doi	10.1117/12.631645
dc.description.sourcetitle	Proceedings of SPIE - The International Society for Optical Engineering
dc.description.volume	5960
dc.description.issue	2
dc.description.page	752-762
dc.description.coden	PSISD
dc.identifier.isiut	000232176401014
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.
