MVSEP Logo
  • Home
  • News
  • Plans
  • Demo
  • FAQ
  • Create Account
  • Login

September updates

2023-09-18


1) We upgraded our main MDX23C 8K FFT model to split tracks into vocal and instrumental parts. SDR metrics have increased on MultiSong Dataset and on Synth Dataset. Separation results have improved accordingly on both Ensemble 4 and Ensemble 8 models. See the changes in the table below.

Algorithm name Multisong dataset Synth dataset MDX23 Leaderboard
SDR Vocals SDR Instrumental SDR Vocals SDR Instrumental SDR Vocals
8K FFT, Full Band (Old version) 10.01 16.32 12.07 11.77 10.85
8K FFT, Full Band (New version) 10.17 16.48 12.35 12.06 11.04

2) We have added two new models MVSep Piano (demo) and MVSep Guitar (demo). Both models are based on the MDX23C architecture. The models produce high quality separation of music into piano/guitar part and everything else. Each of the models is available in two variants. In the first variant, the neural network model is used directly on the entire track. In the second variant, the track is first split into two parts, vocal and instrumental, and then the neural network model is applied only to the instrumental part. In the second case, the separation quality is usually a bit higher. We also prepared a small internal validation set to compare the models by the quality of separation of piano/guitar from the main track. Our model was compared with two other models (Demucs4HT (6 stems) and GSEP). For the piano, we have two validation sets. The first set includes the electric piano as part of the piano part and the second set includes only the acoustic piano.
The metric used is SDR: the larger the better. See the results in the two tables below.

Validation type Algorithm name
Demucs4HT (6 stems) GSEP MVSep Piano 2023 (Type 0) MVSep Piano 2023 (Type 1)
Validation full 2.4432 3.5589 4.9187 4.9772
Validation (only grand piano) 4.5591 5.7180 7.2651 7.2948

 

Validation type Algorithm name
Demucs4HT (6 stems) MVSep Guitar 2023 (Type 0) MVSep Guitar 2023 (Type 1)
Validation guitar 7.2245 7.7716 7.9251
Validation other 13.1756 13.7227 13.8762

3) We have updated the MDX-B Karaoke model (demo). It now has better quality metrics. The MDX-B Karaoke model was originally prepared as part of the Ultimate Vocal Remover project. The model produces high quality extraction of the lead vocal part from a music track. We have also made it available in two variants. In the first variant, the neural network model is used directly on the whole track. In the second variant, the track is first divided into two parts, vocal and instrumental, and then the neural network model is applied only to the vocal part. In the second case, the separation quality is usually higher and it is possible to extract backing vocals into a separate track. The model was compared on a large validation set with two other Karaoke models from UVR (they are also available on the website). See the results in the table below.

Validation type Algorithm name
UVR (HP-KAROKEE-MSB2-3BAND-3090) UVR (karokee_4band_v2_sn) MDX-B Karaoke (Type 0) MDX-B Karaoke (Type 1)
Validation lead vocals 6.46 6.34 6.81 7.94
Validation other 13.17 13.02 13.53 14.66
Validation back vocals --- --- --- 1.88
🗎 Copy link

MVSEP Logo

turbo@mvsep.com

Advanced features

Quality Checker

Algorithms

Full API Documentation

Company

Privacy Policy

Terms & Conditions

Refund Policy

Extra

Help us translate!

Help us promote!