Evolution of CMU's submission to Voice Conversion Challenge 2018

This is a series of projects that we are working on with the aim of building a high-quality voice conversion system.



06 November 2017

MAIN PIPELINE USING ONLY PROVIDED DATA


CONVERSION - Copied f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with DNN with 32T32T32T32T

Converted TF1 with DNN with 256T256T256T256T

Converted TF1 with DNN with 512T512T512T512T

SYNTHESIS - TOWARDS ADDITIONAL DATA


SF1_arctic_a0001.wav

SF1_arctic_a0002.wav

Converted TF1 with DNN 32T32T32T32T

Converted TF1 with DNN 256T256T256T256T

07 November 2017

DNN, Predicted f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with Frame DNN with 64T64T64T64T

Converted TF1 with Frame DNN with 128T128T128T128T

Converted TF1 with Frame DNN with 256T256T256T256T

Converted TF1 with Frame DNN with 1024T1024T1024T1024T

AugmentedDNN, Predicted f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with frame DNN of 128T128T128T128T augmented with 100 arctic sentences

Converted TF1 with frame DNN of 256T256T256T256T augmented with 100 arctic sentences

TargetSpeakerPretrainedDNN, Predicted f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with pretrained frame DNN of 128T128T128T128T

Converted TF1 with pretrained frame DNN of 256T256T256T256T

VED, Predicted f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with Frame Variational Encoder Decoder with 32T32T32T32T

Converted TF1 with Frame Variational Encoder Decoder with 512T512T512T512T

08 November 2017

MAIN PIPELINE USING ONLY PROVIDED DATA


CONVERSION - Predicted f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with DNN with 32T32T32T32T

Converted TF1 with DNN with 64T64T64T64T

Converted TF1 with DNN with 128T128T128T128T

Converted TF1 with DNN with 256T256T256T256T

Converted TF1 with DNN with 512T512T512T512T

Converted TF1 with Keras DNN with 512T512T512T512T

Converted TF1 with DNN with 1024T1024T1024T1024T

Converted TF1 with Keras DNN with 1024T1024T1024T1024T

04 December 2017

MAIN PIPELINE


CONVERSION - Predicted f0, Predicted ceps


VCC2SF1 Original

VCC2TF1 Original

Converted TF1 with DNN - Only provided data

Converted TF1 with 1024 ELU DNN and weighted loss - Only provided data

Converted TF1 with 256 SELU DNN - Synthetic Data added