1) We have released a new high-quality model, BS Roformer v2. It is based on a Transformer architecture from the ByteDance team. Its quality metrics are slightly better than those of MDX23C. The model is still being improved, so expect new releases in the near future. The demo can be viewed here.
2) All ensembles have been updated to include BS Roformer v2. The previous versions of the ensembles remain available. Ensemble SDR metrics have improved:
Vocals SDR: 10.44 -> 10.75
Instrumental SDR: 16.74 -> 17.06
3) You can now download all files produced by a separation as a single archive.
4) OpenAI's Whisper model (large-v3 version) has been added. It lets you transcribe song lyrics or dialogue from arbitrary audio.
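
For readers unfamiliar with the SDR figures quoted in item 2: SDR (signal-to-distortion ratio) is measured in dB, and higher is better. The sketch below assumes the plain definition, 10 * log10(reference energy / error energy). These notes do not state the exact evaluation procedure behind the published numbers, so treat this as an illustration of the metric, not as the scoring code. The function name `sdr` and the toy signals are made up for the example.

```python
import math
from typing import Sequence


def sdr(reference: Sequence[float], estimate: Sequence[float], eps: float = 1e-8) -> float:
    """Signal-to-distortion ratio in dB (plain definition, assumed here).

    reference: ground-truth stem samples (e.g. the true vocals)
    estimate:  separated stem samples produced by a model
    eps:       small constant that avoids division by zero and log(0)
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the true signal.
    signal_power = sum(r * r for r in reference)
    # Energy of the separation error.
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))


# Toy example: one second of a 440 Hz sine at 44.1 kHz as the "true" stem,
# and an estimate that is the same signal scaled to 90% amplitude.
reference = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
estimate = [0.9 * r for r in reference]

# The error has 10% of the amplitude, i.e. 1% of the energy: about 20 dB.
print(round(sdr(reference, estimate), 2))
```

Because the scale is logarithmic, the ensemble gains above (about 0.3 dB for both vocals and instrumental) correspond to a reduction in error energy of roughly 7% under this definition.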