MVSEP Logo
  • Home
  • News
  • Plans
  • Demo
  • FAQ
  • Create Account
  • Login

MVSep DnR v3 (speech, music, sfx)

MVSep DnR v3 is a cinematic model for splitting tracks into 3 stems: music, sfx and speech. It is trained on a huge multilingual dataset DnR v3. The quality metrics on the test data turned out to be better than those of a similar multilingual model Bandit v2. The model is available in 3 variants: based on SCNet, MelBand Roformer architectures, and an ensemble of these two models. See the table below:

Algorithm name SDR Metric on DnR v3 leaderboard
music (SDR) sfx (SDR) speech (SDR)
SCNet Large  9.94 11.35 12.59
Mel Band Roformer 9.45 11.24 12.27
Ensemble (Mel + SCNet) 10.15 11.67 12.81
Bandit v2 (for reference) 9.06 10.82 12.29
🗎 Copy link

MVSEP Logo

turbo@mvsep.com

Advanced features

Quality Checker

Algorithms

Full API Documentation

Company

Privacy Policy

Terms & Conditions

Refund Policy

Extra

Help us translate!

Help us promote!