    MVSep DnR v3 (speech, music, sfx)

    MVSep DnR v3 is a cinematic model that splits a track into 3 stems: music, sfx and speech. It is trained on DnR v3, a large multilingual dataset. On the test data, its SDR scores are better overall than those of Bandit v2, a comparable multilingual model: the SCNet Large and ensemble variants lead on all three stems, while the Mel Band Roformer variant leads on music and sfx and is about level on speech (12.27 vs 12.29). The model is available in 3 variants: one based on the SCNet architecture, one based on Mel Band Roformer, and an ensemble of the two. See the table below:

    Algorithm name               SDR metric on the DnR v3 leaderboard
                                 music (SDR)   sfx (SDR)   speech (SDR)
    SCNet Large                  9.94          11.35       12.59
    Mel Band Roformer            9.45          11.24       12.27
    Ensemble (Mel + SCNet)       10.15         11.67       12.81
    Bandit v2 (for reference)    9.06          10.82       12.29
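
For readers unfamiliar with the metric, here is a minimal sketch of how an SDR value in dB can be computed, and of how two models' outputs could be combined. This page states neither the exact SDR variant used by the leaderboard nor how the ensemble is built, so both are assumptions: the sketch uses the plain energy-ratio definition of SDR and a simple weighted sample-wise average. The function names `sdr_db` and `average_ensemble` are hypothetical, and plain Python lists stand in for audio arrays to keep the example dependency-free.

```python
import math


def sdr_db(reference, estimate, eps=1e-9):
    """Signal-to-distortion ratio in dB (assumed plain energy-ratio form):
    10 * log10(sum(ref^2) / sum((ref - est)^2)). Higher is better."""
    signal = sum(r * r for r in reference)
    distortion = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero and log(0) for silent or perfect signals.
    return 10.0 * math.log10((signal + eps) / (distortion + eps))


def average_ensemble(estimates, weights=None):
    """Weighted sample-wise mean of several stem estimates.
    This is an illustrative assumption, not the documented MVSep method."""
    if weights is None:
        weights = [1.0] * len(estimates)
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, samples)) / total
        for samples in zip(*estimates)
    ]


# Toy data: a short reference stem and two estimates with opposite errors.
reference = [1.0, -1.0, 0.5, -0.5]
estimate_a = [r + 0.1 for r in reference]
estimate_b = [r - 0.1 for r in reference]
combined = average_ensemble([estimate_a, estimate_b])

print("model A:", round(sdr_db(reference, estimate_a), 2), "dB")
print("model B:", round(sdr_db(reference, estimate_b), 2), "dB")
print("ensemble:", round(sdr_db(reference, combined), 2), "dB")
```

In this toy case the two estimates err in opposite directions, so averaging cancels most of the error and the combined SDR is higher than either input. Real models' errors are only partly uncorrelated, which is consistent with the modest gains in the ensemble row of the table.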