This ensemble is based on the algorithm that took 2nd place in the Music Demixing Track of the Sound Demixing Challenge 2023. The main change compared to the contest version is the use of much better vocal models. We use the following models for vocals: UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023, the best MDX23C model, VitLarge23, BS Roformer, Mel Roformer and SCNet XL. For the 'bass', 'drums' and 'other' stems we use the following 4 models: demucs4ht_ft, demucs4_ht, demucs4_6s and demucs3_mmi. The initial winning model is available here: https://github.com/ZFTurbo/MVSEP-MDX23-music-separation-model
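How the individual model outputs are combined is not described here. As a rough illustration only, the sketch below assumes the simplest option, a normalized weighted average of the waveforms predicted for one stem. The function name `weighted_ensemble`, the weights and the synthetic signals are hypothetical and are not taken from the actual ensemble.

```python
import numpy as np


def weighted_ensemble(estimates, weights):
    """Blend same-shaped waveform estimates with a normalized weighted average.

    Assumption: this is a generic ensembling scheme, not the documented
    procedure of the ensemble described above.
    """
    stacked = np.stack(estimates)              # (n_models, channels, samples)
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                            # make the weights sum to 1
    return np.tensordot(w, stacked, axes=1)    # (channels, samples)


# Synthetic stand-ins for three models' estimates of one stereo stem.
rng = np.random.default_rng(0)
target = rng.standard_normal((2, 44100))
estimates = [target + 0.1 * rng.standard_normal(target.shape) for _ in range(3)]

# Hypothetical weights: the first model is trusted twice as much as the others.
blended = weighted_ensemble(estimates, weights=[2.0, 1.0, 1.0])
```

When the models' errors are largely independent, averaging tends to cancel part of them, which is the usual motivation for blending several separators rather than picking a single one.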
Quality table
Algorithm name | Multisong SDR Bass | Multisong SDR Drums | Multisong SDR Other | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental |
SDR average: 11.21 (v. 2023.09.01) | 12.52 | 11.73 | 7.01 | 10.30 | 16.60 | 12.67 | 12.38 |
SDR average: 11.87 (v. 2024.03.08) | 12.53 | 11.84 | 7.15 | 10.75 | 17.06 | 12.72 | 12.42 |
SDR average: 12.03 (v. 2024.03.28) | 12.57 | 11.94 | 7.22 | 11.06 | 17.37 | 13.00 | 12.70 |
SDR average: 12.17 (v. 2024.04.04) | 12.59 | 11.99 | 7.33 | 11.33 | 17.63 | 13.57 | 13.27 |
SDR average: 12.34 (v. 2024.05.21) | 13.44 | 11.99 | 7.33 | 11.33 | 17.63 | 13.57 | 13.27 |
SDR average: 12.66 (v. 2024.07.14) | 13.46 | 13.15 | 7.72 | 11.32 | 17.63 | 13.57 | 13.27 |
SDR average: 12.76 (v. 2024.08.15) | 13.48 | 13.33 | 7.81 | 11.50 | 17.81 | 13.79 | 13.50 |
SDR average: 13.01 (v. 2024.12.20) | 14.14 | 13.57 | 8.02 | 11.50 | 17.81 | 13.79 | 13.50 |
SDR average: 13.07 (v. 2024.12.28) | 14.14 | 13.57 | 8.10 | 11.61 | 17.92 | 14.09 | 13.79 |
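The "SDR average" in each row label appears to be the mean of the five Multisong columns. This is an inference from the numbers, not something the table states: it reproduces most labels but not all of them (for v. 2023.09.01 the same calculation gives 11.63 rather than 11.21). A minimal check for the v. 2024.12.28 row:

```python
from statistics import mean

# Multisong SDR values copied from the v. 2024.12.28 row of the table above.
multisong_sdr = {
    "bass": 14.14,
    "drums": 13.57,
    "other": 8.10,
    "vocals": 11.61,
    "instrumental": 17.92,
}

# Assumption: the label is the plain arithmetic mean of these five values.
sdr_average = round(mean(multisong_sdr.values()), 2)
print(sdr_average)  # prints 13.07
```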
Algorithm name | MDX23 Leaderboard SDR Bass | MDX23 Leaderboard SDR Drums | MDX23 Leaderboard SDR Other | MDX23 Leaderboard SDR Vocals |
SDR average: 11.21 (v. 2023.09.01) (UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023, MDX23C, VitLarge23) | 9.937 | 9.559 | 7.280 | 11.093 |
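For readers unfamiliar with the metric, SDR (signal-to-distortion ratio) is reported in dB, and higher is better. The sketch below assumes the common energy-ratio definition, 10 * log10(||reference||^2 / ||reference - estimate||^2). The exact evaluation protocol behind the tables (per-song aggregation, channel handling) is not given here, so the helper `sdr` is illustrative only.

```python
import numpy as np


def sdr(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB over the whole signal.

    Assumption: plain energy-ratio SDR; eps only guards against division
    by zero for silent or perfectly reconstructed signals.
    """
    signal_energy = np.sum(reference ** 2) + eps
    error_energy = np.sum((reference - estimate) ** 2) + eps
    return 10.0 * np.log10(signal_energy / error_energy)


# Synthetic example: the estimate is the reference scaled by 0.9, so the
# error is 10% of the reference amplitude and the energy ratio is 100.
rng = np.random.default_rng(0)
reference = rng.standard_normal((2, 44100))
score = sdr(reference, 0.9 * reference)
```

Because the scale is logarithmic, the score in this example is about 20 dB, and a gain of 1 dB corresponds to roughly 21% less error energy.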