Improving Choral Music Separation through
Expressive Synthesized Data from Sampled Instruments

On this demo page, we present two parts:

First, we present synthesis demos from SoundFont-Synthesis and from our proposed Standard-Vocal and Expressive-Vocal datasets.

Second, we present separation demos on real choral music datasets from three models:
(1) Non-dataset pretrained models; (2) SoundFont-Synthesis pretrained models; and (3) our Expressive-Vocal pretrained models.

Synthesized Choral Music Dataset Demos:


Demo | SoundFont-Synthesis | Standard-Vocal (Ours) | Expressive-Vocal-Vowel (Ours) | Expressive-Vocal-Word (Ours)
#1   | [audio]             | [audio]               | [audio]                      | [audio]
#2   | [audio]             | [audio]               | [audio]                      | [audio]
#3   | [audio]             | [audio]               | [audio]                      | [audio]

Short Separation Demos (from Bach and Barbershop Collection Dataset):


Demo #1

Mixture: [audio]

Voice   | References | Non-dataset Pretrained | SoundFont-Synthesis Pretrained | Expressive-Vocal Pretrained (ours)
Soprano | [audio]    | [audio]                | [audio]                        | [audio]
Alto    | [audio]    | [audio]                | [audio]                        | [audio]
Tenor   | [audio]    | [audio]                | [audio]                        | [audio]
Bass    | [audio]    | [audio]                | [audio]                        | [audio]

Demo #2

Mixture: [audio]

Voice   | References | Non-dataset Pretrained | SoundFont-Synthesis Pretrained | Expressive-Vocal Pretrained (ours)
Soprano | [audio]    | [audio]                | [audio]                        | [audio]
Alto    | [audio]    | [audio]                | [audio]                        | [audio]
Tenor   | [audio]    | [audio]                | [audio]                        | [audio]
Bass    | [audio]    | [audio]                | [audio]                        | [audio]

Demo #3

Mixture: [audio]

Voice   | References | Non-dataset Pretrained | SoundFont-Synthesis Pretrained | Expressive-Vocal Pretrained (ours)
Soprano | [audio]    | [audio]                | [audio]                        | [audio]
Alto    | [audio]    | [audio]                | [audio]                        | [audio]
Tenor   | [audio]    | [audio]                | [audio]                        | [audio]
Bass    | [audio]    | [audio]                | [audio]                        | [audio]

Demo #4

Mixture: [audio]

Voice   | References | Non-dataset Pretrained | SoundFont-Synthesis Pretrained | Expressive-Vocal Pretrained (ours)
Soprano | [audio]    | [audio]                | [audio]                        | [audio]
Alto    | [audio]    | [audio]                | [audio]                        | [audio]
Tenor   | [audio]    | [audio]                | [audio]                        | [audio]
Bass    | [audio]    | [audio]                | [audio]                        | [audio]

Long Separation Demos (from Cantoria Dataset and Choral Singing Dataset):


Demo #1

Mixture: [audio]

Voice   | References | Non-dataset Pretrained | SoundFont-Synthesis Pretrained | Expressive-Vocal Pretrained (ours)
Soprano | [audio]    | [audio]                | [audio]                        | [audio]
Alto    | [audio]    | [audio]                | [audio]                        | [audio]
Tenor   | [audio]    | [audio]                | [audio]                        | [audio]
Bass    | [audio]    | [audio]                | [audio]                        | [audio]