
SCNet (vocals, instrumental)

This algorithm separates a track into vocal and instrumental parts using the SCNet neural network. The network was proposed in the paper "SCNet: Sparse Compression Network for Music Source Separation" by a group of researchers from China. The authors released the network code as open source, and the MVSep team was able to reproduce results similar to those reported in the paper. We first trained a small version of SCNet and later prepared a heavier version. The quality metrics are close to, but still slightly below, those of Roformer models, which are currently the top-performing models. In some cases, however, SCNet can work better than Roformers.

Quality metrics

Algorithm name                | Multisong SDR Vocals | Multisong SDR Instrumental | Synth SDR Vocals | Synth SDR Instrumental | MDX23 Leaderboard SDR Vocals
SCNet                         | 10.25                | 16.56                      | 12.27            | 11.97                  | ---
SCNet Large                   | 10.74                | 17.05                      | 12.89            | 12.59                  | ---
SCNet XL                      | 10.96                | 17.27                      | 13.08            | 12.78                  | ---
SCNet XL (high fullness)      | 10.92                | 17.23                      | ---              | ---                    | ---
SCNet XL (very high fullness) | 10.40                | 16.60                      | ---              | ---                    | ---
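
For readers unfamiliar with SDR, the following is a minimal sketch of the common signal-to-distortion ratio definition, 10 * log10(reference energy / error energy), expressed in dB. It is an assumption that this matches the exact evaluation used for the tables here. The function name `sdr`, the `eps` stabilizer, and the toy sine-wave signal are illustrative and not taken from the MVSep code.

```python
import math

def sdr(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Assumed definition: 10 * log10(sum(ref^2) / sum((ref - est)^2)).
    eps is a hypothetical stabilizer that avoids division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))

# Toy data: a 440 Hz sine as the "true" stem, and an estimate that is 10% too quiet.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
estimate = [0.9 * s for s in reference]

# The error carries 1% of the reference energy, so the result is close to 20 dB.
print(round(sdr(reference, estimate), 2))
```

Higher values mean the separated stem is closer to the true stem, which is why larger SDR numbers in the table indicate better separation.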

Detailed statistics on Multisong dataset:

Model                         | Vocals fullness | Vocals bleedless | Vocals SDR | Vocals L1Freq | Instrumental fullness | Instrumental bleedless | Instrumental SDR | Instrumental L1Freq
SCNet                         | 17.34           | 25.24            | 10.25      | 35.47         | 29.35                 | 32.34                  | 16.56            | 36.24
SCNet Large                   | 17.70           | 26.84            | 10.74      | 36.86         | 27.10                 | 41.47                  | 17.05            | 37.62
SCNet XL                      | 17.96           | 26.95            | 10.96      | 37.35         | 28.74                 | 39.42                  | 17.27            | 38.09
SCNet XL (high fullness)      | 21.67           | 25.00            | 10.92      | 37.70         | 31.95                 | 34.06                  | 17.23            | 37.91
SCNet XL (very high fullness) | 23.50           | 25.30            | 10.40      | 37.16         | 34.04                 | 35.15                  | 16.60            | 36.78
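
The figures above suggest a trade-off: the "fullness" variants raise vocals fullness while giving up some SDR and bleedless score. As a small illustration, the sketch below copies the vocals columns from the Multisong table and picks the top model for each metric. The dictionary layout and the helper `best_by` are hypothetical and only serve this example.

```python
# Vocals metrics copied from the Multisong table: (fullness, bleedless, SDR).
vocals = {
    "SCNet": (17.34, 25.24, 10.25),
    "SCNet Large": (17.70, 26.84, 10.74),
    "SCNet XL": (17.96, 26.95, 10.96),
    "SCNet XL (high fullness)": (21.67, 25.00, 10.92),
    "SCNet XL (very high fullness)": (23.50, 25.30, 10.40),
}

def best_by(metric_index):
    """Return the model name with the highest value for the given metric column."""
    return max(vocals, key=lambda name: vocals[name][metric_index])

best_fullness = best_by(0)
best_bleedless = best_by(1)
best_sdr = best_by(2)

print("Highest vocals fullness: ", best_fullness)
print("Highest vocals bleedless:", best_bleedless)
print("Highest vocals SDR:      ", best_sdr)
```

No single variant wins on every metric, so the choice depends on whether a fuller-sounding stem or a cleaner one matters more for the task at hand.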