This ensemble is based on the algorithm that took 2nd place in the Music Demixing Track of the Sound Demixing Challenge 2023. The main change compared to the contest version is the use of much better vocal models. The following models are used for vocals: UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023, the best MDX23C model, VitLarge23 and BS Roformer. For the 'bass', 'drums' and 'other' stems we use the following 4 models: demucsht_ft, demucs_ht, demucs_6s and demucs_mmi. The initial winning model is available here: https://github.com/ZFTurbo/MVSEP-MDX23-music-separation-model
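The text above does not say how the outputs of the individual models are combined. One common way to build such an ensemble is a weighted average of the waveforms that each model predicts for the same stem. The sketch below illustrates that idea only. The function name `blend_stems`, the weights and the toy sample values are hypothetical and are not taken from the MVSEP implementation.

```python
def blend_stems(estimates, weights):
    """Weighted average of several same-length waveform estimates.

    estimates: list of waveforms (lists of float samples), one per model.
    weights:   one non-negative weight per model (hypothetical values).
    """
    if not estimates or len(estimates) != len(weights):
        raise ValueError("need exactly one weight per estimate")
    length = len(estimates[0])
    if any(len(e) != length for e in estimates):
        raise ValueError("all estimates must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Blend sample by sample, normalising by the sum of the weights.
    return [
        sum(w * e[i] for e, w in zip(estimates, weights)) / total
        for i in range(length)
    ]


# Toy vocal estimates from two models (made-up numbers, not real audio).
vocals_model_a = [1.0, 0.0]
vocals_model_b = [0.0, 1.0]

# Assumption: the first model is trusted three times as much as the second.
blended = blend_stems([vocals_model_a, vocals_model_b], [3, 1])
print(blended)
```

In a scheme like this, a stronger model simply gets a larger weight, and the same function could be applied per stem ('bass', 'drums', 'other', vocals) with a different set of models for each.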
Quality table
| Algorithm name | Multisong dataset: SDR Bass | Multisong dataset: SDR Drums | Multisong dataset: SDR Other | Multisong dataset: SDR Vocals | Multisong dataset: SDR Instrumental | Synth dataset: SDR Vocals | Synth dataset: SDR Instrumental |
|---|---|---|---|---|---|---|---|
| SDR average: 11.21 (v. 2023.09.01) (UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023, MDX23C, VitLarge23) | 12.52 | 11.73 | 7.01 | 10.30 | 16.60 | 12.67 | 12.38 |
| SDR average: 11.87 (v. 2024.03.08) (BS Roformer, MDX23C, VitLarge23) | 12.53 | 11.84 | 7.15 | 10.75 | 17.06 | 12.72 | 12.42 |
| SDR average: 12.03 (v. 2024.03.28) (BS Roformer viperx edition, MDX23C) | 12.57 | 11.94 | 7.22 | 11.06 | 17.37 | 13.00 | 12.70 |
| SDR average: 12.17 (v. 2024.04.04) (BS Roformer finetuned, MDX23C) | 12.59 | 11.99 | 7.33 | 11.33 | 17.63 | 13.57 | 13.27 |
| SDR average: 12.34 (v. 2024.05.21) (BS Roformer finetuned, MDX23C) | 13.44 | 11.99 | 7.33 | 11.33 | 17.63 | 13.57 | 13.27 |
| SDR average: 12.66 (v. 2024.07.14) (BS Roformer finetuned, MDX23C) | 13.46 | 13.15 | 7.72 | 11.32 | 17.63 | 13.57 | 13.27 |
| SDR average: 12.76 (v. 2024.08.15) (BS Roformer finetuned, MelBand Roformer finetuned) | 13.48 | 13.33 | 7.81 | 11.50 | 17.81 | 13.79 | 13.50 |
| Algorithm name | MDX23 Leaderboard: SDR Bass | MDX23 Leaderboard: SDR Drums | MDX23 Leaderboard: SDR Other | MDX23 Leaderboard: SDR Vocals |
|---|---|---|---|---|
| SDR average: 11.21 (v. 2023.09.01) (UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023, MDX23C, VitLarge23) | 9.937 | 9.559 | 7.280 | 11.093 |
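For readers unfamiliar with the metric, SDR (signal-to-distortion ratio) values like those in the tables are expressed in decibels, and higher is better. The sketch below assumes the plain definition, 10·log10 of reference energy over error energy, on a single mono signal. The exact evaluation protocol behind the tables (channel handling, per-song averaging, any smoothing constant) is not described here, so the helper `sdr` and its `eps` parameter are illustrative assumptions only.

```python
import math


def sdr(reference, estimate, eps=1e-9):
    """Signal-to-distortion ratio in dB for two same-length mono signals.

    Assumed definition: 10 * log10(sum(ref^2) / sum((ref - est)^2)).
    eps is a small hypothetical constant that avoids division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy example: the estimate is the reference scaled down by 10 %.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(sdr(reference, estimate), 2))
```

Under this definition, each additional 1 dB of SDR corresponds to roughly 21 % less error energy relative to the reference, which gives a sense of scale for the differences between versions.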