A set of models for removing the reverberation effect from music and vocals.
| Author | Architecture | Works with | SDR (no independent testing yet) | Link |
|---|---|---|---|---|
| FoxJoy | MDX-B | Full track | ~6.50 | |
| anvuew | MelRoformer | Only vocals | 7.56 | |
| anvuew | BSRoformer | Only vocals | 8.07 | |
| anvuew v2 | MelRoformer | Only vocals | --- | |
| Sucial | MelRoformer | Only vocals | 10.01 | |
| anvuew | BSRoformer | Only vocals (Room) | 13.74 | HF Link |
| anvuew | BSRoformer | Only vocals (Stereo) | 22.50 | HF Link |
Reverberation (reverb) is the physical process of gradual sound decay in an enclosed space after the sound source has stopped. Whereas a regular echo consists of distinct, separate copies of a sound (like shouting in the mountains: "Hello... hello... hello"), reverberation is a dense, continuous cloud of thousands of blended reflections from the walls, floor, ceiling, and other surfaces (like the sound of a clap in an empty cathedral or a stairwell).
In audio engineering, the reverb effect is used to place a dry (studio-recorded) sound into a virtual space and give it volume and depth.
## What does reverberation consist of?
Acoustically, this process can be divided into three stages:
- Direct Sound: The sound wave that reaches the listener or microphone in a straight line, without any reflections. This is the loudest and clearest signal.
- Early Reflections: The first echoes that bounce off the nearest surfaces and reach the ears a few milliseconds after the direct sound. They are what give our brain information about the size and shape of the room we are in.
- Reverb Tail (Late Reflections): A multitude of chaotic, intertwining reflections that bounce off surfaces again and again. They merge into a continuous hum and gradually lose energy (decay).
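As a rough numerical illustration, the three stages above can be sketched as a synthetic impulse response. This is a minimal NumPy sketch; every delay, gain, and RT60 value here is an invented example, not data from a real room:

```python
import numpy as np

sr = 16000                          # sample rate, Hz (illustrative)
n = int(0.5 * sr)                   # 0.5 s impulse response
ir = np.zeros(n)

# 1) Direct sound: a single unit impulse at t = 0 (loudest, clearest).
ir[0] = 1.0

# 2) Early reflections: a few discrete echoes in the first ~50 ms
#    (delays and gains are made up for the sketch).
for delay_ms, gain in [(12, 0.6), (23, 0.45), (41, 0.3)]:
    ir[int(delay_ms / 1000 * sr)] += gain

# 3) Reverb tail: dense noise under an exponential decay envelope.
rng = np.random.default_rng(0)
tail_start = int(0.05 * sr)
t = np.arange(n - tail_start) / sr
rt60 = 0.4                          # tail decays by 60 dB in 0.4 s
envelope = 10 ** (-3 * t / rt60)    # reaches -60 dB (0.001) at t = rt60
ir[tail_start:] += 0.25 * rng.standard_normal(n - tail_start) * envelope
```

Convolving a dry signal with an impulse response like this is essentially how convolution reverbs simulate a room.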
## Main parameters in reverb plugins
When you open a reverb plugin in a DAW (Digital Audio Workstation), you control the physical properties of this virtual room:
- Size / Room Size: Sets the volume of the virtual space (from a tiny vocal booth to a massive stadium).
- Decay / Reverb Time / RT60: The time (usually in seconds) it takes for the reverb tail to decay by 60 decibels, i.e. to become practically inaudible.
- Pre-Delay: A very important parameter that sets the pause (in milliseconds) between the direct sound and the onset of reverberation. Increasing pre-delay helps separate a vocal or instrument from the tail, preserving clarity while keeping the sense of a large space.
- Damping: Simulates sound absorption. In real rooms, soft surfaces (carpets, people, curtains) absorb high frequencies quickly, so a long reverb tail usually sounds more muffled than the direct signal.
- Mix / Dry/Wet: The ratio between the original dry signal (Dry) and the processed signal (Wet).
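To see how these parameters interact, here is a minimal sketch of a convolution-style reverb with pre-delay and a dry/wet mix. The function `apply_reverb`, the toy signals, and all numbers are illustrative assumptions, not any plugin's actual API:

```python
import numpy as np

sr = 16000  # sample rate, Hz (illustrative)

def apply_reverb(dry, ir, pre_delay_ms=20.0, mix=0.3):
    """Mix a dry signal with its reverberated copy (sketch).

    pre_delay_ms: gap between the direct sound and the reverb onset.
    mix: Dry/Wet ratio; 0.0 = fully dry, 1.0 = fully wet.
    """
    pre = int(pre_delay_ms / 1000 * sr)
    wet = np.convolve(dry, ir)[: len(dry)]                  # room response
    wet = np.concatenate([np.zeros(pre), wet])[: len(dry)]  # pre-delay
    return (1 - mix) * dry + mix * wet

# Toy dry signal: 1 s of a 440 Hz sine.
t = np.arange(sr) / sr
dry = np.sin(2 * np.pi * 440 * t)

# Toy reverb tail: 0.5 s of noise with an RT60 of 0.3 s.
rt60 = 0.3
tail = np.random.default_rng(1).standard_normal(sr // 2)
tail *= 10 ** (-3 * (np.arange(sr // 2) / sr) / rt60)

out = apply_reverb(dry, tail, pre_delay_ms=30, mix=0.25)
```

Lengthening `pre_delay_ms` pushes the tail away from the direct sound, which is why it preserves clarity; raising `mix` pushes the source deeper into the virtual room.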
## Why is reverb necessary in music mixing?
- Creating depth (staging): Reverb acts as the Z-axis (depth) of a mix. A loud, dry sound appears close to the listener (right in their face), while a quiet sound with a lot of reverb seems distant.
- Gluing the mix: If every instrument is recorded in a different deadened studio, the mix can sound disjointed. Sending them all to a common reverb bus (even in small amounts) places them in a single acoustic space.
- Artistic effect: Creating an ethereal, ambient, or epic atmosphere (for example, the Shimmer effect, where the reverb tail is also pitched up an octave).
## Why is it necessary to remove the reverb effect?
Removing reverberation (dereverberation) is the process of cleaning acoustic room reflections out of an audio signal to recover the original dry sound. Although reverb makes a sound beautiful and spacious, in many professional scenarios this effect turns into unwanted noise or a serious obstacle. Here are the main reasons why there is a need to "dry out" the sound:
- Music source separation: When extracting vocals or individual instruments from a mixed stereo track, reverb tails create a serious problem: they "eat into" the useful signal. Effective dereverberation yields a truly clean a cappella or instrument stem that sounds as if it had just been recorded in a studio rather than extracted from a concert hall.
- Automatic Speech Recognition (ASR) systems: Room echo and hum are the worst enemies of acoustic models. Reflections "smear" short consonants and phonemes. In demanding machine-learning tasks, such as building children's speech recognition models, where articulation is often unclear to begin with, reverberation catastrophically reduces transcription accuracy, so dereverberation is a critical preprocessing step for audio datasets.
- Sampling and remixing: A vocal sample or drum loop taken from an old recording already carries the space of the original mix. Adding it to your track and applying your own new reverb on top produces acoustic "mud" (reverb on reverb). To integrate someone else's sound into your mix architecture, it must first be cleaned.
- Video and film post-production (ADR & location sound): Actors' speech is often recorded with shotgun microphones right on set (for example, in an echoey empty room or a stairwell). For the dialogue to sound tight, intelligible, and studio-quality, the sound engineer needs to suppress the natural reflections of the location.
- Restoration and forensics: Recordings from surveillance cameras, hidden microphones, or dictaphones often contain so much room hum that the words become unintelligible. Suppressing the reverberation helps restore speech intelligibility.
## How does it work technologically?

In the past, sound engineers tried to combat the room with noise gates and transient shapers, which simply cut off the quiet tails of sounds. This was a crude method that often distorted the useful signal itself. Today, dereverberation is handled by AI models and neural networks trained to analyze the spectrogram, distinguish direct-signal patterns from reflection patterns, and mathematically subtract the latter without damaging the original.
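For intuition, here is a very crude classical baseline in the spirit of spectral subtraction (not one of the neural models listed above): late reverberation in each spectrogram frame is estimated as an exponentially decayed copy of an earlier frame and suppressed with a gain mask. The function `dereverb_mask` and all its parameter values are illustrative assumptions:

```python
import numpy as np

def dereverb_mask(x, sr=16000, frame=512, hop=256, rt60=0.5, delay_frames=4):
    """Toy single-channel dereverberation by spectral subtraction (sketch).

    Models the late reverb magnitude in frame n as a decayed copy of the
    magnitude delay_frames earlier, then suppresses it with a gain mask.
    """
    n_frames = 1 + (len(x) - frame) // hop
    win = np.hanning(frame)
    # Analysis: windowed FFT frames (a simple STFT).
    spec = np.stack([np.fft.rfft(win * x[i*hop : i*hop+frame])
                     for i in range(n_frames)])
    mag = np.abs(spec)
    # Late reverb energy decays ~60 dB over rt60 seconds.
    decay = 10 ** (-3 * (delay_frames * hop / sr) / rt60)
    late = np.zeros_like(mag)
    late[delay_frames:] = decay * mag[:-delay_frames]
    # Subtract the late-reverb estimate, flooring the gain at -20 dB.
    gain = np.maximum(1.0 - late / np.maximum(mag, 1e-8), 0.1)
    spec *= gain
    # Synthesis: overlap-add of the masked frames.
    y = np.zeros(len(x))
    for i in range(n_frames):
        y[i*hop : i*hop+frame] += win * np.fft.irfft(spec[i], n=frame)
    return y
```

The neural models in the table learn a far better version of this mask (or resynthesize the dry signal directly) instead of relying on a fixed exponential-decay assumption.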
