Choose Separation Type
Featured Vocals Drums Bass Piano Guitar Wind Strings Super Resolution Percussion Keys MIDI

Ensembles

🔒 Ensemble (vocals, instrum) [Premium only]
Updated 18 days ago

An ensemble of the best vocal models. The algorithm gives the highest possible quality for vocal and instrumental stems. The latest ensemble consists of the BS Roformer, MelBand Roformer and SCNet XL IHF vocal models.

Vocals Monthly usage: 8 972, Monthly rating: 4.0909 (22 votes)
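
The page does not say how the ensemble combines the outputs of its member models. As a rough illustration only, the sketch below assumes a weighted sample-wise average of the vocal estimates and assumes the instrumental is taken as the residual (mixture minus vocals). The function names, weights and toy signals are hypothetical.

```python
from typing import List, Sequence


def ensemble_average(estimates: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted sample-wise average of several estimates of the same stem."""
    if not estimates or len(estimates) != len(weights):
        raise ValueError("need one weight per estimate")
    length = len(estimates[0])
    if any(len(e) != length for e in estimates):
        raise ValueError("all estimates must have the same length")
    total = sum(weights)
    return [
        sum(w * e[i] for w, e in zip(weights, estimates)) / total
        for i in range(length)
    ]


def instrumental_from(mixture: Sequence[float],
                      vocals: Sequence[float]) -> List[float]:
    """Assumed residual approach: instrumental = mixture - vocals."""
    return [m - v for m, v in zip(mixture, vocals)]


# Toy mono signals standing in for the outputs of three vocal models.
mixture = [1.0, 0.5, -0.25, 0.0]
model_a = [0.6, 0.2, -0.1, 0.0]
model_b = [0.4, 0.4, -0.3, 0.0]
model_c = [0.5, 0.3, -0.2, 0.0]

vocals = ensemble_average([model_a, model_b, model_c], [1.0, 1.0, 1.0])
instrumental = instrumental_from(mixture, vocals)
```

Equal weights are used here for simplicity; a real ensemble could weight stronger models more heavily or combine the signals in another domain.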
🔒 Ensemble (vocals, instrum, bass, drums, other) [Premium only]
Updated 261 days ago

This ensemble is based on the algorithm that took 2nd place in the Music Demixing Track of the Sound Demixing Challenge 2023. The main change compared to the contest version is much better individual stem models.

Vocals Drums Bass Monthly usage: 1 878, Monthly rating: 4.5000 (4 votes)
🔒 Ensemble All-In (vocals, bass, drums, piano, guitar, lead/back vocals, other) [Premium only]
Updated 261 days ago

This is Ensemble (vocals, instrum, bass, drums, other) with additional models for guitars, piano, wind, strings, back/lead vocals and drumsep.

Featured Vocals Drums Bass Piano Guitar Wind Strings Keys Monthly usage: 3 348, Monthly rating: 4.2667 (15 votes)

Multistem

BS Roformer SW (vocals, bass, drums, guitar, piano, other)
Updated 285 days ago

The BS Roformer SW model generates 6 stems at once with superior quality.

Featured Vocals Drums Bass Piano Guitar Keys Monthly usage: 132 782, Monthly rating: 4.7167 (639 votes)
Demucs4 HT (vocals, drums, bass, other)
Updated 290 days ago

The Demucs4 HT algorithm. It is fast and gives relatively good quality for the bass, drums and other stems.

Vocals Drums Bass Monthly usage: 10 117, Monthly rating: 4.8462 (91 votes)

Vocals / Instrumental

BS Roformer (vocals, instrumental)
Updated 77 days ago

BS Roformer model. Excellent quality for vocals/instrumental separation.

Featured Vocals Monthly usage: 101 060, Monthly rating: 4.6478 (318 votes)
MelBand Roformer (vocals, instrumental)
Updated 77 days ago

Algorithm for separating tracks into vocal and instrumental parts based on the MelBand Roformer neural network

Vocals Monthly usage: 87 383, Monthly rating: 4.5700 (100 votes)
MDX23C (vocals, instrumental)
Updated 135 days ago

A set of MDX23C models based on code released by kuielab for the Sound Demixing Challenge 2023. Very good for vocals/instrumental separation.

Vocals Monthly usage: 7 610, Monthly rating: 4.5667 (30 votes)
SCNet (vocals, instrumental)
Updated 135 days ago

Algorithm for separating tracks into vocal and instrumental parts based on the SCNet neural network

Vocals Monthly usage: 4 608, Monthly rating: 4.2647 (34 votes)
MDX B (vocals, instrumental)
Updated 135 days ago

MDX B models are based on kuielab code from the Music Demixing Challenge 2021. The models were retrained by the UVR team on a large dataset. For a long time, these models were the best for vocals/instrumental separation.

Vocals Monthly usage: 1 784, Monthly rating: 4.8000 (15 votes)
Ultimate Vocal Remover VR (vocals, music)
Updated 135 days ago

A set of models from the Ultimate Vocal Remover program, which are based on the old VR architecture. Most of the models are vocal, but there are also special models for karaoke, piano, removing reverberation effects, etc.

Vocals Monthly usage: 9 197, Monthly rating: 4.2500 (16 votes)
Demucs4 Vocals 2023 (vocals, instrum)
Updated 135 days ago

The Demucs4 Vocals 2023 model is the Demucs4 HT model fine-tuned on a large vocals dataset.

Vocals Monthly usage: 1 086, Monthly rating: 4.8947 (19 votes)
MVSep Karaoke (lead/back vocals)
Updated 135 days ago

Algorithm based on the MelBand Roformer and SCNet models that separates a track into lead vocals and everything else.

Vocals Monthly usage: 63 789, Monthly rating: 4.7406 (293 votes)
MDX-B Karaoke (lead/back vocals)
Updated 135 days ago

The MDX-B Karaoke model was prepared as part of the Ultimate Vocal Remover project. The model produces high-quality lead vocal extraction from a music track.

Vocals Monthly usage: 8 456, Monthly rating: 4.1739 (23 votes)
MVSep Crowd removal (crowd, other)
Updated 94 days ago

A unique model for removing crowd sounds from music recordings (applause, clapping, whistling, noise, laughter, etc.).

Monthly usage: 9 831, Monthly rating: 4.1489 (47 votes)
Medley Vox (Multi-singer separation)
Updated 535 days ago

Medley Vox is an algorithm for separating multiple singers within a single music track, together with an evaluation dataset for this task.

Vocals Monthly usage: 6 371, Monthly rating: 2.9524 (21 votes)
MVSep Multichannel BS (vocals, instrumental)
Updated 266 days ago

MVSep Multichannel BS uses the best vocal model to extract sound from multi-channel audio (5.1, 7.1, etc.).

Vocals Monthly usage: 2 159, Monthly rating: 4.6667 (6 votes)
MVSep Male/Female separation
Updated 436 days ago

A model for separating male and female voices within a single vocal track. The track should contain only voices, no music.

Vocals Monthly usage: 6 608, Monthly rating: 3.0571 (35 votes)
MVSep Choir (choir, other)
Updated 73 days ago

Choir Extraction Model

Monthly usage: 2 094, Monthly rating: 0 (0 votes)
MVSep SATB Choir (soprano, alto, tenor, bass)
Updated 71 days ago

Model to separate vocals and strings to SATB parts (Soprano, Alto, Tenor, and Bass)

Monthly usage: 9 977, Monthly rating: 3.1667 (18 votes)

Drums, Bass and Synth

MVSep Drums (drums, other)
Updated 283 days ago

The MVSep Drums model produces high-quality separation of music into a drums part and everything else.

Drums Monthly usage: 17 842, Monthly rating: 4.6875 (48 votes)
MVSep Bass (bass, other)
Updated 209 days ago

The MVSep Bass model produces high-quality separation of music into a bass part and everything else.

Bass Monthly usage: 11 838, Monthly rating: 4.5172 (29 votes)
MVSep Synth (synth, other)
Updated 113 days ago

Synth extraction model

Monthly usage: 9 627, Monthly rating: 3.9643 (28 votes)
DrumSep (4-6 stems: kick, snare, cymbals, toms, ride, hh, crash)
Updated 342 days ago

The DrumSep model divides the drum track into several types: 'kick', 'snare', 'toms', 'cymbals' (it includes 'hh', 'ride', 'crash').

Drums Monthly usage: 13 779, Monthly rating: 3.9552 (134 votes)
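
The relationship between the 6-stem and 4-stem outputs described above can be sketched as follows. This is an illustration under the assumption that 'cymbals' is simply the sample-wise sum of 'hh', 'ride' and 'crash'; the page only says that 'cymbals' includes them. The function name and toy data are hypothetical.

```python
from typing import Dict, List

# Sub-stems that the description groups under 'cymbals'.
CYMBAL_PARTS = ("hh", "ride", "crash")


def merge_cymbals(stems: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Collapse 'hh', 'ride' and 'crash' into a single 'cymbals' stem."""
    merged = {name: samples for name, samples in stems.items()
              if name not in CYMBAL_PARTS}
    parts = [stems[name] for name in CYMBAL_PARTS if name in stems]
    if parts:
        # Assumed mixing rule: add the sub-stems sample by sample.
        merged["cymbals"] = [sum(values) for values in zip(*parts)]
    return merged


# Two-sample toy signals for each of the six stems.
six_stems = {
    "kick": [0.9, 0.0], "snare": [0.0, 0.7], "toms": [0.0, 0.0],
    "hh": [0.1, 0.1], "ride": [0.0, 0.2], "crash": [0.3, 0.0],
}
four_stems = merge_cymbals(six_stems)
```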

Keys

MVSep Piano (piano, other)
Updated 157 days ago

The MVSep Piano model is based on the MDX23C, MelRoformer and SCNet Large architectures. It produces high-quality separation into piano and other stems.

Piano Keys Monthly usage: 8 020, Monthly rating: 4.6111 (36 votes)
MVSep Digital Piano (digital-piano, other)
Updated 157 days ago

No data found

Piano Keys Monthly usage: 2 202, Monthly rating: 4.0000 (5 votes)
MVSep Keys (keys, other)
Updated 113 days ago

MVSep Keys is a high-quality model for separating music into keyboard instruments and everything else.

Keys Monthly usage: 3 485, Monthly rating: 3.2000 (5 votes)
MVSep Organ (organ, other)
Updated 178 days ago

The MVSep Organ model produces high-quality separation of music into an organ part and everything else.

Keys Monthly usage: 2 910, Monthly rating: 3.5000 (6 votes)
MVSep Harpsichord (harpsichord, other)
Updated 138 days ago

No data found

Keys Monthly usage: 959, Monthly rating: 4.0000 (1 votes)
MVSep Accordion (accordion, other)
Updated 138 days ago

No data found

Keys Monthly usage: 1 385, Monthly rating: 4.5000 (4 votes)

Guitars

MVSep Guitar (guitar, other)
Updated 283 days ago

The MVSep Guitar model produces high-quality separation of music into a guitar part (including acoustic and electric) and everything else.

Guitar Strings Monthly usage: 7 663, Monthly rating: 4.4211 (19 votes)
MVSep Acoustic Guitar (acoustic-guitar, other)
Updated 209 days ago

No data found

Guitar Strings Monthly usage: 4 049, Monthly rating: 4.7333 (15 votes)
MVSep Electric Guitar (electric-guitar, other)
Updated 142 days ago

No data found

Guitar Strings Monthly usage: 5 220, Monthly rating: 4.5217 (23 votes)
MVSep Lead/Rhythm Guitar (lead-guitar, rhythm-guitar)
Updated 122 days ago

No data found

Guitar Strings Monthly usage: 10 196, Monthly rating: 3.8276 (29 votes)

Plucked Strings

MVSep Plucked Strings (plucked-strings, other)
Updated 122 days ago

MVSep Plucked Strings is a high-quality model for separating music into plucked string instruments and everything else.

Strings Monthly usage: 1 813, Monthly rating: 5.0000 (1 votes)
MVSep Harp (harp, other)
Updated 181 days ago

No data found

Strings Monthly usage: 945, Monthly rating: 4.0000 (1 votes)
MVSep Mandolin (mandolin, other)
Updated 175 days ago

No data found

Strings Monthly usage: 324, Monthly rating: 0 (0 votes)
MVSep Banjo (banjo, other)
Updated 142 days ago

No data found

Strings Monthly usage: 557, Monthly rating: 5.0000 (1 votes)
MVSep Sitar (sitar, other)
Updated 138 days ago

No data found

Strings Monthly usage: 417, Monthly rating: 3.0000 (2 votes)
MVSep Ukulele (ukulele, other)
Updated 138 days ago

No data found

Strings Monthly usage: 256, Monthly rating: 0 (0 votes)
MVSep Dobro (dobro, other)
Updated 138 days ago

No data found

Strings Monthly usage: 410, Monthly rating: 0 (0 votes)

Bowed Strings

MVSep Bowed Strings (strings, other)
Updated 142 days ago

MVSep Bowed Strings is a high-quality model for separating music into bowed string instruments and everything else.

Strings Monthly usage: 8 581, Monthly rating: 4.5926 (27 votes)
MVSep Violin (violin, other)
Updated 224 days ago

No data found

Strings Monthly usage: 3 075, Monthly rating: 4.5882 (17 votes)
MVSep Viola (viola, other)
Updated 195 days ago

No data found

Strings Monthly usage: 788, Monthly rating: 1.0000 (1 votes)
MVSep Cello (cello, other)
Updated 195 days ago

No data found

Strings Monthly usage: 1 175, Monthly rating: 4.7143 (7 votes)
MVSep Double Bass (double-bass, other)
Updated 181 days ago

No data found

Strings Monthly usage: 1 033, Monthly rating: 5.0000 (3 votes)

Wind

MVSep Wind (wind, other)
Updated 204 days ago

The MVSep Wind model produces high-quality separation of music into a wind part and everything else.

Wind Monthly usage: 5 955, Monthly rating: 3.8333 (12 votes)
MVSep Brass (brass, other)
Updated 113 days ago

MVSep Brass is a high-quality model for separating music into brass instruments and everything else.

Wind Monthly usage: 3 274, Monthly rating: 4.7143 (35 votes)
MVSep Woodwind (woodwind, other)
Updated 113 days ago

MVSep Woodwind is a high-quality model for separating music into woodwind instruments and everything else.

Wind Monthly usage: 1 231, Monthly rating: 4.0000 (5 votes)
MVSep Saxophone (saxophone, other)
Updated 182 days ago

No data found

Wind Monthly usage: 2 584, Monthly rating: 4.9310 (29 votes)
MVSep Flute (flute, other)
Updated 195 days ago

No data found

Wind Monthly usage: 1 833, Monthly rating: 4.2727 (11 votes)
MVSep Trumpet (trumpet, other)
Updated 181 days ago

No data found

Wind Monthly usage: 1 787, Monthly rating: 4.7778 (9 votes)
MVSep Trombone (trombone, other)
Updated 175 days ago

No data found

Wind Monthly usage: 966, Monthly rating: 4.8750 (8 votes)
MVSep Oboe (oboe, other)
Updated 157 days ago

No data found

Wind Monthly usage: 425, Monthly rating: 4.0000 (2 votes)
MVSep Clarinet (clarinet, other)
Updated 157 days ago

No data found

Wind Monthly usage: 762, Monthly rating: 5.0000 (5 votes)
MVSep French Horn (french-horn, other)
Updated 142 days ago

No data found

Wind Monthly usage: 672, Monthly rating: 4.0000 (2 votes)
MVSep Harmonica (harmonica, other)
Updated 142 days ago

No data found

Wind Monthly usage: 602, Monthly rating: 4.6667 (3 votes)
MVSep Tuba (tuba, other)
Updated 138 days ago

No data found

Wind Monthly usage: 535, Monthly rating: 5.0000 (2 votes)
MVSep Bassoon (bassoon, other)
Updated 138 days ago

No data found

Wind Monthly usage: 382, Monthly rating: 0 (0 votes)
MVSep Bagpipes (bagpipes, other)
Updated 28 days ago

The MVSep Bagpipes model provides high-quality music separation, isolating the bagpipes from the rest of the track.

Wind Monthly usage: 741, Monthly rating: 5.0000 (1 votes)

Percussion

MVSep Percussion (percussion, other)
Updated 113 days ago

MVSep Percussion is a high-quality model for separating music into percussion instruments and everything else.

Percussion Monthly usage: 3 965, Monthly rating: 2.6667 (6 votes)
MVSep Tambourine (tambourine, other)
Updated 157 days ago

No data found

Percussion Monthly usage: 715, Monthly rating: 4.0000 (4 votes)
MVSep Marimba (marimba, other)
Updated 142 days ago

No data found

Percussion Monthly usage: 619, Monthly rating: 4.6667 (3 votes)
MVSep Glockenspiel (glockenspiel, other)
Updated 142 days ago

No data found

Percussion Monthly usage: 585, Monthly rating: 4.0000 (2 votes)
MVSep Timpani (timpani, other)
Updated 142 days ago

No data found

Percussion Monthly usage: 501, Monthly rating: 4.0000 (3 votes)
MVSep Triangle (triangle, other)
Updated 138 days ago

No data found

Percussion Monthly usage: 204, Monthly rating: 3.0000 (1 votes)
MVSep Congas (congas, other)
Updated 138 days ago

No data found

Percussion Monthly usage: 709, Monthly rating: 4.8000 (5 votes)
MVSep Bells (bells, other)
Updated 138 days ago

No data found

Percussion Monthly usage: 1 726, Monthly rating: 4.0000 (5 votes)
MVSep Wind Chimes (wind-chimes, other)
Updated 138 days ago

No data found

Percussion Monthly usage: 379, Monthly rating: 5.0000 (3 votes)
MVSep Xylophone (xylophone, other)
Updated 81 days ago

No data found

Percussion Monthly usage: 575, Monthly rating: 5.0000 (3 votes)
MVSep Celesta (celesta, other)
Updated 81 days ago

No data found

Percussion Monthly usage: 413, Monthly rating: 5.0000 (1 votes)

Effects

MVSep Demucs4HT DNR (speech, music, effects)
Updated 174 days ago

No data found

Monthly usage: 2 695, Monthly rating: 3.0000 (6 votes)
BandIt Plus (speech, music, effects)
Updated 744 days ago

BandIt Plus model for separating tracks into speech, music and effects.

Monthly usage: 2 736, Monthly rating: 4.0000 (7 votes)
BandIt v2 (speech, music, effects)
Updated 612 days ago

BandIt v2 is a model for cinematic audio source separation into 3 stems: speech, music, effects/sfx. It was trained on the DnR v3 dataset.

Monthly usage: 1 252, Monthly rating: 3.8571 (7 votes)
MVSep DnR v3 (speech, music, effects)
Updated 174 days ago

MVSep DnR v3 is a cinematic model for splitting tracks into 3 stems: music, sfx and speech.

Monthly usage: 13 120, Monthly rating: 3.2941 (17 votes)
MVSep Braam (braam, other)
Updated 28 days ago

The MVSep Braam model provides high-quality music separation, isolating the cinematic "Braam" sound effect from the rest of the track.

Monthly usage: 758, Monthly rating: 0 (0 votes)
MVSep FX (fx, other)
Updated 3 days ago

No data found

Monthly usage: 2 182, Monthly rating: 3.7500 (8 votes)

Upscale and Restoration

Apollo Enhancers (by JusperLee, Lew, baicai1145)
Updated 171 days ago

The algorithm restores the quality of audio, for example MP3 files compressed to 128 kbps or lower, among other types.

Super Resolution Monthly usage: 9 440, Monthly rating: 4.5417 (24 votes)
Reverb Removal (noreverb)
Updated 90 days ago

A set of different models for removing the reverberation effect from music.

Monthly usage: 15 213, Monthly rating: 4.4444 (36 votes)
DeNoise by aufr33
Updated 595 days ago

No data found

Monthly usage: 15 150, Monthly rating: 4.7571 (70 votes)
AudioSR (Super Resolution)
Updated 366 days ago

AudioSR (Versatile Audio Super-resolution at Scale) is an algorithm that restores high frequencies.

Super Resolution Monthly usage: 3 031, Monthly rating: 4.5625 (16 votes)
FlashSR (Super Resolution)
Updated 366 days ago

FlashSR - audio super resolution algorithm for restoring high frequencies

Super Resolution Monthly usage: 2 723, Monthly rating: 4.0000 (14 votes)

ASR and TTS

Stable Audio Open Gen
Updated 114 days ago

Generating audio based on a given text prompt

Monthly usage: 309, Monthly rating: 3.3333 (6 votes)
Whisper (extract text from audio)
Updated 240 days ago

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation.

Monthly usage: 845, Monthly rating: 2.6250 (8 votes)
Parakeet (extract text from audio)
Updated 49 days ago

Parakeet by NVIDIA is a state-of-the-art automatic speech recognition (ASR) model designed for accurate and efficient conversion of spoken language into text.

Monthly usage: 379, Monthly rating: 4.0000 (2 votes)
VibeVoice (Voice Cloning)
Updated 22 days ago

VibeVoice is a model for generating natural conversational dialogues from text with the ability to use a reference voice for cloning purposes.

Monthly usage: 3 975, Monthly rating: 3.0000 (9 votes)
VibeVoice (TTS)
Updated 114 days ago

VibeVoice (TTS) is a model for generating natural conversational dialogues from text, capable of creating dialogues with up to 4 speakers and durations of up to 90 minutes.

Monthly usage: 70, Monthly rating: 0 (0 votes)
Qwen3-TTS (Custom Voice)
Updated 22 days ago

Qwen3-TTS (Custom Voice) is a powerful speech generation model with predefined speakers

Monthly usage: 35, Monthly rating: 5.0000 (1 votes)
Qwen3-TTS (Voice Design)
Updated 22 days ago

Qwen3-TTS (Voice Design) is a powerful speech generation model capable of creating a voice based on a description.

Monthly usage: 127, Monthly rating: 5.0000 (3 votes)
Qwen3-TTS (Voice Cloning)
Updated 18 days ago

Qwen3-TTS (Voice Cloning) is a powerful speech generation model capable of cloning a voice based on a reference audio file

Monthly usage: 912, Monthly rating: 2.0000 (4 votes)

Experimental and Misc

Bark (Speech Gen)
Updated 42 days ago

Bark is a generative model for realistic speech, music, and emotions.

Monthly usage: 232, Monthly rating: 4.2500 (4 votes)
MVSep MultiSpeaker (MDX23C)
Updated 628 days ago

MVSep MultiSpeaker (MDX23C) tries to isolate the loudest voice from all other voices.

Monthly usage: 523, Monthly rating: 1.0000 (1 votes)
Aspiration (by Sucial)
Updated 516 days ago

The algorithm adds a "whispering" effect to vocals.

Monthly usage: 462, Monthly rating: 0 (0 votes)
Phantom Centre extraction (by wesleyr36)
Updated 516 days ago

No data found

Monthly usage: 2 353, Monthly rating: 0 (0 votes)
Matchering (by sergree)
Updated 190 days ago

Matchering is a novel tool for audio matching and mastering.

Monthly usage: 3 299, Monthly rating: 5.0000 (13 votes)
SOME (Singing-Oriented MIDI Extractor)
Updated 122 days ago

SOME is a MIDI extractor that converts a singing voice into a MIDI sequence.

MIDI Monthly usage: 418, Monthly rating: 0 (0 votes)
Transkun (piano -> midi)
Updated 56 days ago

High-quality transcription of piano music into MIDI

MIDI Monthly usage: 1 668, Monthly rating: 4.6000 (5 votes)
Basic Pitch (MIDI Extraction)
Updated 50 days ago

Basic Pitch is a modern neural network that converts melodic audio recordings into notes (MIDI format).

MIDI Monthly usage: 991, Monthly rating: 2.7143 (7 votes)
HeartMuLa (Song Gen)
Updated 1 days ago

HeartMuLa - an open-source AI model for music generation (an alternative to Suno and Udio)

Monthly usage: 45, Monthly rating: 0 (0 votes)

Old Models

Demucs3 Model (vocals, drums, bass, other)
Updated 473 days ago

Algorithm Demucs3 (A and B versions)

Vocals Drums Bass Monthly usage: 187, Monthly rating: 2.5000 (2 votes)
MDX A/B (vocals, drums, bass, other)
Updated 476 days ago

No data found

Vocals Drums Bass Monthly usage: 82, Monthly rating: 0 (0 votes)
Vit Large 23 (vocals, instrum)
Updated 423 days ago

VitLarge23 is an experimental model based on Vision Transformers. In terms of metrics it is slightly inferior to MDX23C, but it may work better in some cases.

Vocals Monthly usage: 80, Monthly rating: 0 (0 votes)
UVRv5 Demucs (vocals, music)
Updated 790 days ago

No data found

Vocals Monthly usage: 153, Monthly rating: 0 (0 votes)
MVSep DNR (music, sfx, speech)
Updated 822 days ago

No data found

Monthly usage: 428, Monthly rating: 5.0000 (1 votes)
MVSep Old Vocal Model (vocals, music)
Updated 194 days ago

No data found

Vocals Monthly usage: 114, Monthly rating: 0 (0 votes)
Demucs2 (vocals, drums, bass, other)
Updated 1210 days ago

No data found

Vocals Drums Bass Monthly usage: 47, Monthly rating: 0 (0 votes)
Danna Sep (vocals, drums, bass, other)
Updated 1210 days ago

No data found

Vocals Drums Bass Monthly usage: 29, Monthly rating: 0 (0 votes)
Byte Dance (vocals, drums, bass, other)
Updated 1210 days ago

No data found

Vocals Drums Bass Monthly usage: 49, Monthly rating: 2.0000 (1 votes)
spleeter
Updated 476 days ago

No data found

Monthly usage: 107, Monthly rating: 0 (0 votes)
UnMix
Updated 476 days ago

No data found

Monthly usage: 146, Monthly rating: 0 (0 votes)
Zero Shot (Query Based) (Low quality)
Updated 750 days ago

No data found

Monthly usage: 104, Monthly rating: 0 (0 votes)
LarsNet (kick, snare, cymbals, toms, hihat)
Updated 476 days ago

The LarsNet model divides the drums stem into 5 types: 'kick', 'snare', 'cymbals', 'toms', 'hihat'.

Drums Monthly usage: 479, Monthly rating: 5.0000 (1 votes)
Music & Voice Separation

MVSEP separates audio into voice and music parts.

Note: Registering gives you higher priority in queue and lossless exports.



March 2026 News

1) We added an iOS app and updated our Android app. They are both now live. 

The latest release adds the following features:

  • Auto-update checker
  • You can now send reviews for separations, just like on the website
  • Bug fixes

2) We have introduced a variety of new models for separating individual instruments:

  • MVSep Lead/Rhythm Guitar (Demo)
  • MVSep Plucked Strings (Demo)
  • MVSep Percussion (Demo)
  • MVSep Keys (Demo)
  • MVSep Brass (Demo)
  • MVSep Woodwind (Demo)
  • MVSep Xylophone (Demo)
  • MVSep Celesta (Demo)
  • MVSep Choir (Demo)
  • MVSep Bagpipes (Demo)
  • MVSep Braam (Demo)
  • MVSep FX (Demo)

The current separation scheme can be found below:

Instruments chart

 

3) A new model, MVSep SATB Choir (soprano, alto, tenor, bass), has been added. 

Description: https://mvsep.com/algorithms/104
Demo 1 vocals
Demo 2 vocals
Demo strings

A huge thanks to @Dry Paint Dealer Undr for helping me create this model.
P.S. The model works not only with vocals but also with strings and some other instruments.

4) We added the powerful VibeVoice model to the Experimental section. It is available in 2 variants: Voice Cloning and Text-to-Speech.

Key Features:

  • Two models: small (1.5B parameters) and large (7B parameters)
  • Up to 4 speakers in a single recording
  • Up to 90 minutes of generated audio
  • Language support: Officially supports English (default) and Chinese, but it has been verified to work decently for other languages as well.
  • Voice cloning: The ability to upload a reference audio recording

VibeVoice (Voice Cloning): Info | Demo 1 | Demo 2
VibeVoice (TTS): Info | Demo 1

We also noted that if a sample contains some music along with words, it can make the generated voice sing. 

5) We added a new Crowd removal model based on the BSRoformer architecture. It's available in "MVSep Crowd removal (crowd, other)" under the name "BS Roformer (SDR crowd: 7.21)". The SDR has increased from 6.27 to 7.21.
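
The page does not spell out how SDR is computed. The sketch below assumes the plain signal-to-distortion ratio commonly used in demixing benchmarks: the energy of the reference stem divided by the energy of the error, expressed in decibels. The function name and toy signals are hypothetical, and the last line only shows what a 0.94 dB gain would mean for a single signal under that assumed definition.

```python
import math
from typing import Sequence


def sdr_db(reference: Sequence[float], estimate: Sequence[float],
           eps: float = 1e-9) -> float:
    """SDR in dB: 10 * log10(reference energy / error energy)."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps keeps the ratio finite for silent references or perfect estimates.
    return 10.0 * math.log10((signal + eps) / (error + eps))


# Toy example: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
score = sdr_db(reference, estimate)  # close to 20 dB

# For one fixed reference, going from 6.27 dB to 7.21 dB corresponds to
# multiplying the error energy by this factor (about 0.8).
error_ratio = 10 ** (-(7.21 - 6.27) / 10)
```

Reported SDR values are usually averages over a test set, so the error-energy interpretation applies per signal and only approximately to the averaged figures.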

6) Three new vocal models have been added.

In BS Roformer (vocals, instrumental):

  • unwa BS Roformer HyperACE v2 instrum (SDR instrum: 17.40)
  • unwa BS Roformer HyperACE v2 vocals (SDR vocals: 11.39)

In MelBand Roformer (vocals, instrumental):

  • becruily deux (SDR vocals: 11.35, SDR instrum: 17.66)

7) We added the new Transkun model. Transkun is a modern, open-source model for automatic piano music transcription (Audio-to-MIDI). The official page for the model is here. It is considered one of the best (SOTA — State of the Art) in its class. The model can recognize not only the notes themselves but also their duration, loudness (velocity), and pedal usage.

Demo | Model link
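
To make the kind of information listed above concrete (notes, duration, velocity, pedal usage), here is a minimal sketch of how such a transcription could be represented and flattened into MIDI-style events. It is not Transkun's actual output format: the `Note` and `PedalSpan` classes and the `to_events` helper are hypothetical. The only MIDI fact relied on is that the sustain pedal is controller number 64.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Note:
    pitch: int       # MIDI note number, 0-127 (60 = middle C)
    onset: float     # start time in seconds
    offset: float    # end time in seconds
    velocity: int    # MIDI velocity (loudness), 1-127

    @property
    def duration(self) -> float:
        return self.offset - self.onset


@dataclass(frozen=True)
class PedalSpan:
    onset: float     # pedal pressed, in seconds
    offset: float    # pedal released, in seconds


Event = Tuple[float, str, int, int]  # (time, kind, data1, data2)


def to_events(notes: Sequence[Note], pedals: Sequence[PedalSpan]) -> List[Event]:
    """Flatten notes and pedal spans into time-ordered MIDI-style events."""
    events: List[Event] = []
    for n in notes:
        events.append((n.onset, "note_on", n.pitch, n.velocity))
        events.append((n.offset, "note_off", n.pitch, 0))
    for p in pedals:
        events.append((p.onset, "control_change", 64, 127))  # sustain down
        events.append((p.offset, "control_change", 64, 0))   # sustain up
    return sorted(events, key=lambda e: e[0])


notes = [Note(60, 0.0, 0.5, 80), Note(64, 0.25, 1.0, 72)]
pedals = [PedalSpan(0.1, 0.9)]
events = to_events(notes, pedals)
```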

 

8) We added the new Basic Pitch model. Basic Pitch is a modern neural network from Spotify’s Audio Intelligence Lab that converts melodic audio recordings into notes (MIDI format). Unlike outdated converters, this model can "hear" not only individual notes but also chords, along with the finest nuances of a performance. Basic Pitch is an "instrument-agnostic" model. This means it handles different timbres equally well:

  • Vocals
  • Strings: Acoustic and electric guitars, violins, and cellos.
  • Keyboards: Pianos, organs, and synthesizers.
  • Winds: Flutes, saxophones, trumpets, and others.

Important: The model is designed for melodic instruments. It is not suitable for drums or percussion, as it focuses on pitch rather than rhythmic noise.

Demo | Description | Model link
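
Because the model focuses on pitch, its output ultimately has to be expressed as MIDI note numbers. The sketch below shows only the standard equal-temperament mapping from a frequency to a MIDI note (A4 = 440 Hz = note 69). It is not a description of how Basic Pitch works internally, and the function names are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz: float, a4_hz: float = 440.0) -> int:
    """Nearest MIDI note number for a frequency in Hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # 12 semitones per octave; MIDI note 69 is A4.
    return round(69 + 12 * math.log2(freq_hz / a4_hz))


def midi_to_name(note: int) -> str:
    """Note name with octave, using the convention that note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


a4 = frequency_to_midi(440.0)
middle_c = frequency_to_midi(261.63)
names = (midi_to_name(a4), midi_to_name(middle_c))
```

Unpitched sounds such as drum hits have no stable frequency to feed into a mapping like this, which is consistent with the note above that the model is not suitable for drums or percussion.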

9) We added the Bark (Speech Gen) algorithm to the Experimental section. Bark is a transformer-based model created by Suno, representing not just a traditional text-to-speech tool, but a fully generative "text-to-audio" system. Its capabilities go far beyond ordinary voicing: besides creating highly realistic speech in multiple languages, Bark can generate music, background noises, and simple sound effects. A unique feature of the model is its ability to reproduce subtle non-verbal communication, such as laughter, sighs, and crying, making the resulting sound as lifelike and natural as possible.

Demo | Description

In our experiments, it sometimes doesn't follow the text or instructions. See the demo as an example.

10) We added Qwen3-TTS, a powerful speech generation model offering support for voice cloning, voice design, ultra-high-quality human-like speech generation, and natural language-based voice control. At MVSep, we use the largest 1.7 billion parameter model. The model is available in 3 variants:

  • Qwen3-TTS (Custom Voice) - A model with predefined speakers | Demo
  • Qwen3-TTS (Voice Design) - A model capable of creating a voice based on a description | Demo
  • Qwen3-TTS (Voice Cloning) - A model capable of cloning a voice based on a reference audio file | Demo

11) We added the new HeartMuLa algorithm to the site. It is an advanced open-source family of multimodal foundation models (Apache 2.0 license) designed for high-quality music synthesis and audio processing. Unlike proprietary cloud services (such as Suno or Udio), HeartMuLa gives developers the ability to run it locally on their own hardware. The quality of the generated songs is quite good.

Official repository | Demos 1 | Demos 2 | Documentation

Current limitations: 
1) The model struggles to follow tags.
2) The model is computationally heavy and uses a lot of VRAM.


turbo@mvsep.com
