Please use this identifier to cite or link to this item: https://doi.org/10.25540/4VMK-AYPV
Title: Localizing Fake Segments in Speech
Creators: SIM MONG CHENG, TERENCE; BOWEN ZHANG
NUS Contact: Terence, Mong Cheng Sim
Subject: Computer Science
DOI: doi:10.25540/4VMK-AYPV
Description: 

The Partial Synthetic Detection (Psynd) dataset is a multi-speaker English corpus of 2294 utterances, approximately 13 hours of English speech at a 24 kHz sampling rate. It is derived from LibriTTS, a corpus of read English speech (all real voices) designed for TTS research. The data samples are real utterances into which voice-cloned synthetic speech has been injected. The fake segments are generated by a state-of-the-art multi-speaker text-to-speech method and closely resemble the target speakers, who are characterized by Global Style Token (GST) and X-Vector.

Related Publications: Localizing Fake Segments in Speech
Citation: When using this data, please cite the original publication and also the dataset.
  • Bowen Zhang and Terence Sim, Localizing Fake Segments in Speech, published in the International Conference on Pattern Recognition, Montreal, Canada, 2022.
  • SIM MONG CHENG, TERENCE, BOWEN ZHANG (2022-06-20). Localizing Fake Segments in Speech. 1.0. ScholarBank@NUS Repository. [Dataset]. https://doi.org/10.25540/4VMK-AYPV
License: CC0 1.0 Universal
http://creativecommons.org/publicdomain/zero/1.0/
Appears in Collections: Other Dataset

Files in This Item:
File              Description   Size     Format  Access Settings
Psynd.zip         Dataset file  1.77 GB  ZIP     CLOSED (request a copy)
Psynd_readme.txt  Readme file   1.65 kB  Text    OPEN

This item is licensed under a Creative Commons License.