Ensemble 4 models (vocals, instrumental)

An ensemble of UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023 and two MDX23C models. It gives the highest vocal and instrumental SDR of the algorithms listed on this page.
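The page does not say how the four model outputs are combined. A common approach, shown here only as a minimal sketch, is a weighted average of the per-model estimates of the same stem. The function name `ensemble_average` and the weights are hypothetical and are not taken from the MVSEP implementation.

```python
import numpy as np

def ensemble_average(estimates, weights=None):
    """Combine several estimates of one stem by weighted averaging.

    estimates: list of arrays with identical shape (channels, samples),
               one per model (assumption: all models share sample rate
               and length).
    weights:   optional list of non-negative numbers, one per model.
               Equal weights are used when omitted.
    """
    stacked = np.stack(estimates, axis=0).astype(np.float64)  # (models, channels, samples)
    if weights is None:
        weights = np.ones(len(estimates))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Contract over the model axis: the result has shape (channels, samples).
    return np.tensordot(weights, stacked, axes=1)
```

With equal weights this is a plain mean. Unequal weights let a stronger model contribute more to the combined stem.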

Quality table

| Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals |
|---|---|---|---|---|---|
| Ensemble of 4 models | 10.26 | 16.52 | 12.42 | 12.12 | 11.063 |
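The quality tables on this page report SDR (signal-to-distortion ratio, in dB), where higher is better. The page does not state which SDR variant is used. The sketch below assumes the simple energy-ratio definition, SDR = 10·log10(‖reference‖² / ‖reference − estimate‖²). The function name and the `eps` guard are illustrative.

```python
import numpy as np

def sdr(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB (higher is better).

    reference: ground-truth stem, any array shape.
    estimate:  separated stem, same shape as reference.
    eps:       small constant that avoids division by zero and log(0)
               (an assumption of this sketch).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))
```

Under this definition, an estimate whose error energy is 1% of the reference energy scores about 20 dB, and each additional 10 dB corresponds to a tenfold reduction in error energy.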

Ensemble 8 models (vocals, bass, drums, other)

This ensemble is based on the algorithm that took 2nd place in the Music Demixing Track of the Sound Demixing Challenge 2023. The main change compared to the contest version is the use of much better vocal models. We use the following 4 models for vocals: UVR-MDX-NET-Voc_FT, Demucs4 Vocals 2023 and two MDX23C models. For the 'bass', 'drums' and 'other' stems we use the following 4 models: demucsht_ft, demucs_ht, demucs_6s and demucs_mmi. The original contest model is available here: https://github.com/ZFTurbo/MVSEP-MDX23-music-separation-model

Quality table

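| Multisong SDR Bass | Multisong SDR Drums | Multisong SDR Other | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental |
|---|---|---|---|---|---|---|
| 12.52 | 11.73 | 6.93 | 10.17 | 16.48 | 12.25 | 11.95 |

MDX23 Leaderboard

| SDR Bass | SDR Drums | SDR Other | SDR Vocals |
|---|---|---|---|
| 9.912 | 9.576 | 7.220 | 10.944 |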

Demucs4 HT (vocals, drums, bass, other)

The Demucs4 HT algorithm splits a track into 4 stems (bass, drums, vocals, other). It is currently the best for bass/drums/other separation. It was released in 2022 and has 3 versions:

  • htdemucs_ft - best quality, but slow
  • htdemucs - lower quality, but fast
  • htdemucs_6s - it has 2 additional stems, "piano" and "guitar" (quality for them is still mediocre).

Link: https://github.com/facebookresearch/demucs/tree/ht/demucs
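As a rough illustration of choosing between these versions, the sketch below builds a command line for the `demucs` tool from the linked repository, using its `-n` option to select a pretrained model by name. It assumes Demucs is installed locally (for example via pip) and that the `demucs` executable is on the PATH. The helper names `build_demucs_command` and `separate` are hypothetical.

```python
import subprocess

def build_demucs_command(track_path, model="htdemucs_ft"):
    """Return the argument list for separating one track.

    model: name of a pretrained Demucs model, e.g. "htdemucs_ft"
           (best quality, slow) or "htdemucs" (faster).
    """
    return ["demucs", "-n", model, track_path]

def separate(track_path, model="htdemucs_ft"):
    """Run Demucs on a track; raises CalledProcessError on failure."""
    subprocess.run(build_demucs_command(track_path, model), check=True)
```

For example, `separate("song.mp3", model="htdemucs")` trades some quality for speed, in line with the list above.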

Quality table

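| Algorithm name | Multisong SDR Bass | Multisong SDR Drums | Multisong SDR Other | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals |
|---|---|---|---|---|---|---|---|---|
| htdemucs_ft | 12.05 | 11.24 | 5.74 | 8.33 | 14.63 | 10.23 | 9.94 | 9.08 |
| htdemucs | 11.74 | 10.90 | 5.57 | 8.18 | 14.49 | --- | --- | --- |
| htdemucs_6s | 11.42 | 10.59 | 2.63 | 8.17 | 14.48 | --- | --- | --- |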

MDX23C (vocals, instrumental)

The new MDX23C set of models is based on code released by kuielab for the Sound Demixing Challenge 2023. All models are full band, i.e. they don't cut high frequencies.

Quality table

| Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals |
|---|---|---|---|---|---|
| 12K FFT, Full Band, Large Conv, Hop 1024 | 9.95 | 16.26 | 11.74 | 11.44 | 10.785 |
| 12K FFT, Full Band, Large Conv | 9.71 | 16.02 | --- | --- | --- |
| 12K FFT, Full Band | 9.68 | 15.99 | --- | --- | --- |
| 12K FFT, Full Band, 6 Poolings | 9.49 | 15.79 | --- | --- | --- |
| 8K FFT, Full Band | 10.17 | 16.48 | 12.35 | 12.06 | 11.043 |

MDX B (vocals, instrumental)

MDX B models are based on kuielab code from the Music Demixing Challenge 2021. The models were retrained by the UVR team on a large dataset. For a long time these models were the best for vocals/instrumental separation.

Quality table

| Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals |
|---|---|---|---|---|---|
| UVR-MDX-NET-Voc_FT | 9.64 | 15.95 | 11.40 | 11.10 | 10.505 |
| MDX Kimberley Jensen v2 | 9.60 | 15.91 | --- | --- | 10.494 |
| MDX Kimberley Jensen v1 | 9.48 | 15.79 | --- | --- | --- |
| UVR-MDX-NET-Inst_HQ_3 | 9.38 | 15.68 | 11.32 | 11.03 | 10.254 |
| MDX Kimberley Jensen Inst | 9.28 | 15.59 | --- | --- | --- |
| UVR-MDX-NET-Inst_HQ_2 | 9.12 | 15.42 | --- | --- | --- |
| MDX UVR 2022.01.01 | 8.83 | 15.14 | --- | --- | --- |
| UVR_MDXNET_Main | 8.79 | 15.10 | --- | --- | --- |
| MDX UVR 2022.07.25 | 8.67 | 14.97 | --- | --- | --- |

Demucs4 Vocals 2023 (vocals, instrumental)

The Demucs4 Vocals 2023 model is the Demucs4 HT model fine-tuned on a large vocal/instrumental dataset. It has better vocal separation metrics than Demucs4 HT (_ft version). It usually gives worse metrics than the MDX23C models, but it can be useful in ensembles, since the model is very different from MDX23C.

Quality table

| Algorithm name | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals |
|---|---|---|---|---|---|
| Demucs4 Vocals 2023 | 9.04 | 15.35 | 11.59 | 11.29 | 9.61 |

MDX-B Karaoke (lead/back vocals)

The MDX-B Karaoke model was prepared as part of the Ultimate Vocal Remover project. The model produces high-quality lead vocal extraction from a music track. The model is available in two versions. In the first version, the neural network model is used directly on the entire track. In the second version, the track is first divided into two parts, vocal and instrumental, and then the neural network model is applied only to the vocal part. In the second version, the quality of separation is usually higher and it becomes possible to additionally separate the backing vocals into a separate track. The model was compared with two other models from UVR (they are also available on the website) on a large validation set. The metric used is SDR: the higher the better.

See the results in the table below.

| Validation type | UVR (HP-KAROKEE-MSB2-3BAND-3090) | UVR (karokee_4band_v2_sn) | MDX-B Karaoke (Type 0) | MDX-B Karaoke (Type 1) |
|---|---|---|---|---|
| Validation lead vocals | 6.46 | 6.34 | 6.81 | 7.94 |
| Validation other | 13.17 | 13.02 | 13.53 | 14.66 |
| Validation back vocals | --- | --- | --- | 1.88 |
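The two versions described above can be sketched as follows. This is an illustration under stated assumptions, not the site's implementation: `vocal_model` and `lead_model` are hypothetical callables that map an audio array to an estimated stem, and the remaining stems are obtained by simple subtraction, which the page does not confirm.

```python
import numpy as np

def karaoke_type0(mixture, lead_model):
    """Type 0: apply the lead-vocal model directly to the whole track."""
    lead = lead_model(mixture)
    # Assumption: "other" is everything left after removing the lead vocals.
    return {"lead": lead, "other": mixture - lead}

def karaoke_type1(mixture, vocal_model, lead_model):
    """Type 1: split vocals/instrumental first, then process only the vocals."""
    vocals = vocal_model(mixture)
    instrumental = mixture - vocals      # assumption: residual of the vocal stem
    lead = lead_model(vocals)            # lead model sees vocals only
    back = vocals - lead                 # assumption: backing vocals are the vocal residual
    return {"lead": lead, "back": back, "other": instrumental + back}
```

Because the second stage sees only the vocal stem, whatever the lead model leaves behind can be attributed to backing vocals, which is why only Type 1 has a back vocals row in the table.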

 


MVSep Piano (piano, other)

The MVSep Piano model is based on the MDX23C architecture. It produces high-quality separation. The model was compared with two other models (Demucs4HT (6 stems) and GSEP) on two validation sets. The first validation set includes electric piano as part of piano, while the second contains only acoustic piano (grand piano). The metric used is SDR: the higher the better.

See the results in the table below.

| Validation type | Demucs4HT (6 stems) | GSEP | MVSep Piano 2023 (Type 0) | MVSep Piano 2023 (Type 1) |
|---|---|---|---|---|
| Validation full | 2.4432 | 3.5589 | 4.9187 | 4.9772 |
| Validation (only grand piano) | 4.5591 | 5.7180 | 7.2651 | 7.2948 |

The model is available in two variants. In the first variant, the Piano model is used directly on the entire track. In the second variant, the track is first divided into two parts, vocal and instrumental, and then the Piano model is applied to the instrumental part only. In the second variant, the separation quality is usually slightly better.


MVSep Guitar (guitar, other)

The MVSep Guitar model is based on the MDX23C architecture. The model produces high-quality separation of music into a guitar part (including acoustic and electric guitar) and everything else. The model was compared with the Demucs4HT model (6 stems) on a guitar validation set. The metric used is SDR: the higher the better.

See the results in the table below.

| Validation type | Demucs4HT (6 stems) | MVSep Guitar 2023 (Type 0) | MVSep Guitar 2023 (Type 1) |
|---|---|---|---|
| Validation guitar | 7.2245 | 7.7716 | 7.9251 |
| Validation other | 13.1756 | 13.7227 | 13.8762 |

The model is available in two versions. In the first version, the neural network model for the guitar is used directly on the entire track. In the second version, the track is first divided into two parts, vocal and instrumental, and then the neural network model for the guitar is applied only to the instrumental part. In the second version, the separation quality is usually slightly higher.


Demucs3 Model B (vocals, drums, bass, other)

The Demucs3 algorithm splits a track into 4 stems (bass, drums, vocals, other). It was the winner of the Music Demixing Challenge 2021.

Link: https://github.com/facebookresearch/demucs/tree/v3

Quality table

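| Algorithm name | Multisong SDR Bass | Multisong SDR Drums | Multisong SDR Other | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental |
|---|---|---|---|---|---|---|---|
| demucs3 | 10.69 | 10.27 | 5.35 | 8.13 | 14.44 | 9.78 | 9.48 |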

Demucs3 Model A (vocals, drums, bass, other)

The Demucs3 algorithm splits a track into 4 stems (bass, drums, vocals, other). It was the winner of the Music Demixing Challenge 2021. Only MUSDB18 training data was used to train this model, so its quality is worse than that of Demucs3 Model B. Demucs3 Model A and Demucs3 Model B have the same architecture but different weights.

Quality table

| Multisong SDR Bass | Multisong SDR Drums | Multisong SDR Other | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental |
|---|---|---|---|---|---|---|
| 9.50 | 8.97 | 4.40 | 7.21 | 13.52 | --- | --- |
