1) We added a new model, BandIt Plus, for separating tracks into speech, music and effects. It can be useful for television or film clips. The model was prepared by the authors of the article "A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation" and published in their repository on GitHub. It was trained on the Divide and Remaster (DnR) dataset, and at the moment it has the best quality metrics among similar models. You can find demos here.
Quality table
DnR dataset (test), SDR in dB
Algorithm name | SDR Speech | SDR Music | SDR Effects
BandIt Plus | 15.62 | 9.21 | 9.69
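For readers unfamiliar with the SDR columns, here is a minimal sketch of how a signal-to-distortion ratio can be computed for one separated stem. It assumes the plain definition, 10 * log10 of reference energy over error energy; the exact evaluation protocol used for the figures in the table may differ. The function name `sdr_db`, the `eps` parameter and the toy signals are hypothetical and are not taken from the BandIt repository.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Assumption: plain SDR = 10 * log10(||ref||^2 / ||ref - est||^2).
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps keeps the ratio finite for a silent reference or a perfect estimate
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Hypothetical toy signals, not DnR data: the estimate is the reference
# scaled by 0.9, so the error is 10% of the reference amplitude.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 2))
```

In this toy case the error energy is one hundredth of the reference energy, which gives roughly 20 dB. Higher values mean the separated stem is closer to the reference.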
2) The code for almost all models has been updated: separation quality has increased slightly and the models run faster overall.
3) The Crowd removal model has been updated. It now removes Hollywood laughter more effectively.