An ensemble of the best vocal models. It gives the highest-quality vocal and instrumental stems among the ensembles listed in the quality table below. The latest ensemble consists of the BS Roformer, MelBand Roformer, and SCNet XL IHF vocal models.
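The description above does not say how the individual model outputs are combined. As a minimal sketch, assuming a weighted average of time-aligned waveform estimates (one common ensembling approach, not necessarily the one used here), combining three vocal estimates could look like the following. The function name `average_ensemble`, the equal weights, and the toy sample values are all hypothetical.

```python
from typing import List, Sequence


def average_ensemble(
    estimates: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average of same-length waveform estimates (hypothetical helper)."""
    if not estimates or len(estimates) != len(weights):
        raise ValueError("need at least one estimate and one weight per estimate")
    length = len(estimates[0])
    if any(len(est) != length for est in estimates):
        raise ValueError("all estimates must have the same number of samples")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Average sample by sample, normalising by the total weight.
    return [
        sum(w * est[i] for w, est in zip(weights, estimates)) / total
        for i in range(length)
    ]


# Toy vocal estimates standing in for three model outputs (not real data).
vocals_a = [0.10, 0.20, -0.30]
vocals_b = [0.12, 0.18, -0.28]
vocals_c = [0.08, 0.22, -0.32]

# Equal weights are an assumption; a real ensemble may weight models differently.
ensemble_vocals = average_ensemble([vocals_a, vocals_b, vocals_c], [1.0, 1.0, 1.0])
```

Real audio would normally be held in arrays rather than Python lists; plain lists are used here only to keep the sketch dependency-free.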
Quality table
Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals |
Ensemble (2023.09) (UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023, MDX23C, VitLarge23) | 10.44 | 16.74 | 12.76 | 12.46 | 11.17 |
Ensemble (2024.02) (BS Roformer (v1), MDX23C, VitLarge23) | 10.75 | 17.06 | 12.72 | 12.42 | --- |
Ensemble (2024.03) (BS Roformer (viperx), MDX23C) | 11.06 | 17.37 | 13.00 | 12.70 | --- |
Ensemble (2024.04) (BS Roformer (finetuned), MDX23C) | 11.33 | 17.63 | 13.57 | 13.27 | --- |
Ensemble (2024.08) (BS Roformer (finetuned), MelBand Roformer) | 11.50 | 17.81 | 13.79 | 13.50 | --- |
Ensemble (2024.12) (BS Roformer (finetuned), MelBand Roformer, SCNet XL) | 11.61 | 17.92 | 14.09 | 13.79 | --- |
Ensemble (2025.06) (BS Roformer (x2), MelBand Roformer (ft), SCNet XL IHF) | 11.93 | 18.23 | 14.46 | 14.17 | --- |
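The SDR columns report the signal-to-distortion ratio, where higher values mean the estimated stem is closer to the reference stem. The exact SDR variant behind these tables is not stated here. As a sketch, assuming the plain definition SDR = 10 * log10(sum(ref^2) / sum((ref - est)^2)) in dB, the metric could be computed as follows. The function name `sdr_db`, the `eps` guard, and the toy signals are hypothetical.

```python
import math
from typing import Sequence


def sdr_db(
    reference: Sequence[float],
    estimate: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Plain signal-to-distortion ratio in dB (assumed definition, see lead-in)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the reference stem.
    signal_energy = sum(r * r for r in reference)
    # Energy of the difference between reference and estimate.
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero and log of zero on silent or perfect inputs.
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy signals: the estimate is the reference scaled by 0.9 (not real data).
reference_vocals = [1.0, -1.0, 1.0, -1.0]
estimated_vocals = [0.9, -0.9, 0.9, -0.9]

# Error energy is 1% of the signal energy, so this is about 20 dB.
toy_sdr = sdr_db(reference_vocals, estimated_vocals)
```

Under this definition, a 1 dB gain means the error energy drops by roughly 21% relative to the reference energy, which gives a sense of scale for the differences between ensemble versions.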
Detailed statistics on the Multisong dataset:
Model | Vocals fullness | Vocals bleedless | Vocals SDR | Vocals L1Freq | Instrumental fullness | Instrumental bleedless | Instrumental SDR | Instrumental L1Freq |
Ensemble (2025.06) | 17.73 | 36.29 | 11.93 | 39.94 | 28.75 | 47.64 | 18.23 | 40.90 |
Ensemble High Vocals Fullness (2025.06) | 20.46 | 32.77 | 11.69 | 39.86 | --- | --- | --- | --- |
Ensemble High Instrumental Fullness (2025.06) | --- | --- | --- | --- | 34.79 | 41.47 | 17.69 | 40.51 |