Remove Vocals
Point a neural network at any song and watch it peel the mix apart into isolated layers, the singing on one track, the drums on another, the bass on a...
Under the Hood
Why Use Remove Vocals?
Demucs, The AI Model That Redefined Stem Separation
Demucs is an open-source stem separation model developed at Meta's FAIR (Fundamental AI Research) lab, first published by Alexandre Défossez et al. in 2019 and refined through four major versions. The latest, Hybrid Transformer Demucs (HTDemucs), is reported by its authors to reach a Signal-to-Distortion Ratio (SDR) of 9.00 dB on the test set of MUSDB18-HQ, the standard evaluation dataset containing 150 professionally mixed songs, when trained with 800 additional songs. It runs the stereo mix through a deep U-Net with two branches, one working on the waveform and one on the spectrogram, joined by a Transformer. The results are dramatically cleaner than old-school phase-cancellation tricks that only worked on perfectly centered vocals.
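To make the two ideas above concrete, here is a minimal sketch in plain Python. It assumes the simplest definition of SDR (energy of the reference over energy of the error, in decibels), which is not necessarily the exact metric used in the published benchmark, and it uses toy sine waves as stand-ins for a vocal and a guitar. The function names are hypothetical.

```python
import math

def sdr_db(reference, estimate, eps=1e-12):
    """Simple SDR: energy of the reference over energy of the error, in dB."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (error + eps))

def center_cancel(left, right):
    """Old-school trick: subtract right from left so centered content cancels."""
    return [l - r for l, r in zip(left, right)]

# Toy signals (assumption): a "vocal" and a quieter "guitar" as sine waves.
n = 1000
vocal = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
guitar = [0.5 * math.sin(2 * math.pi * 13 * i / n) for i in range(n)]

# Case 1: vocal perfectly centered, guitar only in the left channel.
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)
centered_result = center_cancel(left, right)

# Case 2: vocal slightly off-center, so 20% of it survives the subtraction.
right_off = [0.8 * v for v in vocal]
off_center_result = center_cancel(left, right_off)

print(sdr_db(guitar, centered_result))    # very high: the vocal cancels out
print(sdr_db(guitar, off_center_result))  # roughly 8 dB: the vocal bleeds through
```

The second case is why the subtraction trick falls apart on real mixes: any vocal that is not identical in both channels leaks into the result.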
Four Individual Stems From a Single Upload
Most free tools give you two tracks: vocals and everything else. This tool separates the mix into four: vocals, drums, bass, and other instruments. That granularity lets a DJ isolate a drum break, a producer sample a bass line, a choir director study a vocal arrangement, and a guitarist extract the backing track, all from the same upload. Download them individually or grab the entire set as a ZIP.
Output in the Format That Matches Your Next Step
Choose MP3 when the stem is headed for a playlist, a quick share, or a karaoke app. Choose WAV when the stem is going into a DAW for mixing, layering, or mastering. Choose FLAC when you want lossless quality without the raw WAV file size. The format applies to all stems in the ZIP when you download the full set, so every track in your session stays consistent.
Cloud GPUs Handle the Heavy Math in Under a Minute
Stem separation is one of the most computationally expensive operations in audio processing. Running Demucs locally is practical only with a capable GPU; on a CPU alone, expect to wait several minutes per song. The server here runs the model on dedicated hardware that chews through a four-minute pop track in roughly thirty seconds. Longer songs or complex orchestral arrangements may push toward sixty seconds, but you never wait as long as you would on a consumer laptop.
What People Do With Separated Stems
Hosting Karaoke Night Without Buying Backing Tracks
The global karaoke market is valued at over $5 billion according to Grand View Research, yet commercial backing track libraries charge per-song or per-month subscriptions and still might not have the obscure deep cut your friend always wants to sing. Upload the original song here, select Instrumental, and you have a karaoke-ready track in sixty seconds or less. The vocal removal is clean enough for a living room party, a bar setup, or a corporate team-building event.
Building Remix Stems Without Contacting the Label
Bedroom producers and mashup artists need isolated elements to layer into new compositions. The full stem set (vocals, drums, bass, and other instruments) gives you the raw materials to rearrange, retune, re-tempo, and re-contextualize a track. Drop the stems into Ableton, FL Studio, or Logic and start experimenting immediately.
Studying a Vocal Performance Note by Note
Voice teachers and vocal coaches isolate the vocal stem to analyze pitch accuracy, vibrato patterns, breath support, and phrasing choices without the distraction of instrumentation. Students hear their favorite singer's technique in naked detail (every run, every scoop, every deliberate crack) and can model their practice sessions on what they hear.
Preparing Isolated Drum Loops for a DJ Set or Sample Pack
Dance music producers crave authentic drum grooves from genres they do not produce themselves. Separating the drum stem from a funk record, a bossa nova track, or a Motown classic gives you a rhythmic foundation to chop, loop, and layer under your own synths. DJs use isolated drums for transitions where they want rhythm without harmonic clash.
How It Works
Upload the Song You Want to Pull Apart
Drag a file from your music library, Downloads folder, or phone storage into the upload zone. The tool accepts MP3, WAV, FLAC, OGG, M4A, and AAC files up to one hundred megabytes. Higher-quality source files produce cleaner separations: a 320-kbps MP3 or a lossless WAV gives the neural network more spectral detail to work with than a 128-kbps stream rip.
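If you want to pre-check files before uploading, the limits above can be sketched as a small helper. This is an illustration only: the function name is hypothetical, the check looks at the file extension rather than the actual audio data, and it assumes "one hundred megabytes" means 100 × 1024 × 1024 bytes.

```python
import os

# Formats listed above; the extension check is an assumption about how
# a file might be screened, not a description of the site's own logic.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a", ".aac"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # assumption: 100 MB read as 100 MiB

def validate_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {extension or 'no extension'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file is larger than 100 MB"
    return True, "ok"

print(validate_upload("live_take.FLAC", 42 * 1024 * 1024))
print(validate_upload("concert.mp4", 10 * 1024 * 1024))
```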
Choose What You Want to Extract
Select the stem you are after: Instrumental strips the vocals and hands back everything else, perfect for karaoke. Vocals Only isolates the singing and removes all instruments, ideal for remixes and a cappella edits. Drums, Bass, or All Stems break the mix down further. The All Stems option downloads a ZIP containing every separated track in a single archive.
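For the curious, the same choices can be approximated with the open-source Demucs command-line tool. The sketch below only builds the command; the mapping from this page's option names to Demucs flags is an assumption for illustration and says nothing about how the server is actually configured. The flags themselves (`-n`, `--two-stems`, `--mp3`, `--flac`) are part of the Demucs CLI.

```python
def build_demucs_command(track_path, option="All Stems", fmt="wav"):
    """Build an argument list for the open-source `demucs` CLI."""
    # Hypothetical mapping from the page's options to --two-stems targets.
    # Instrumental and Vocals Only both split on vocals; you keep a
    # different output file from the pair in each case.
    two_stems_target = {
        "Instrumental": "vocals",
        "Vocals Only": "vocals",
        "Drums": "drums",
        "Bass": "bass",
    }
    command = ["demucs", "-n", "htdemucs"]
    if option in two_stems_target:
        command += ["--two-stems", two_stems_target[option]]
    elif option != "All Stems":
        raise ValueError(f"unknown option: {option}")

    if fmt == "mp3":
        command.append("--mp3")
    elif fmt == "flac":
        command.append("--flac")
    elif fmt != "wav":  # WAV is the CLI default, so it needs no flag
        raise ValueError(f"unknown format: {fmt}")

    command.append(track_path)
    return command

print(build_demucs_command("song.mp3", "Instrumental", "mp3"))
```

With Demucs installed, the list can be handed to `subprocess.run(command, check=True)`. A two-stem run on vocals writes a `vocals` file and a `no_vocals` file; the latter is the instrumental.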
Wait for the AI, Preview the Result, and Download
Processing takes thirty to sixty seconds depending on song length and server load. When it finishes, preview the separated audio directly in the browser to make sure the result meets your expectations. Then download in MP3, WAV, or FLAC, whichever format fits your workflow. The server purges both the original upload and the processed stems within sixty minutes.
Getting the Cleanest Separations
Start With the Highest-Quality Source You Can Find
A 320-kbps MP3 or a lossless WAV gives Demucs far more spectral detail to work with. Stream rips at 128 kbps throw away much of the high-frequency content the AI relies on for separation, so more of each instrument bleeds into the other stems. If you have the CD or a FLAC purchase, use that.
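A quick back-of-the-envelope calculation shows how much data each format keeps. The sketch assumes CD-quality audio (44.1 kHz, 16-bit, stereo) as the baseline. Bitrate is only a rough proxy for quality, since MP3 encoders choose what to discard perceptually, but the ratios give a sense of scale.

```python
def pcm_bitrate_kbps(sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000

def share_of_pcm(encoded_kbps, pcm_kbps):
    """Fraction of the uncompressed bitrate that an encoded file retains."""
    return encoded_kbps / pcm_kbps

cd_kbps = pcm_bitrate_kbps()
print(f"CD-quality PCM: {cd_kbps:.1f} kbps")  # 1411.2 kbps
for mp3_kbps in (128, 320):
    share = share_of_pcm(mp3_kbps, cd_kbps)
    print(f"{mp3_kbps} kbps MP3 keeps about {share:.1%} of that bitrate")
```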
Try All Stems Before Deciding What You Need
You came for the instrumental, but the isolated bass line might be exactly the sample you did not know you wanted. Grab the full ZIP and explore each stem. Unexpected creative ideas often come from hearing familiar songs in unfamiliar configurations.
Expect Artifacts on Heavily Reverbed Tracks
Songs where the vocal sits inside a wash of reverb or delay blur the boundary between voice and instrument. The AI will still separate them, but reverb tails may end up in both the vocal and instrumental stems. Dry, close-miked recordings separate most cleanly.
Layer the Separated Stems Back Together to Verify Quality
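Import all four stems into a DAW, line them up at the same start point, and play them simultaneously. If the sum sounds identical to the original, the separation preserved everything. If you hear phase issues or missing frequencies, the source quality may be the bottleneck, not the AI. Use WAV or FLAC stems for this check, because MP3 encoding introduces small differences of its own.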
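The same check can be done numerically with a "null test": sum the stems, subtract the sum from the original, and measure what is left. Below is a minimal sketch using plain Python lists of samples. The function names are hypothetical, the toy sine waves stand in for real audio, and real stems would need to be loaded with an audio library, sample-aligned, and the same length.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def residual_db(original, stems, eps=1e-12):
    """Level of (original - sum of stems) relative to the original, in dB."""
    summed = [sum(frame) for frame in zip(*stems)]
    residual = [o - s for o, s in zip(original, summed)]
    return 20.0 * math.log10((rms(residual) + eps) / (rms(original) + eps))

# Toy stems (assumption): four sine waves standing in for real audio.
n = 1000
def tone(freq):
    return [math.sin(2 * math.pi * freq * i / n) for i in range(n)]

vocals, drums, bass, other = tone(3), tone(7), tone(2), tone(11)
stems = [vocals, drums, bass, other]
original = [sum(frame) for frame in zip(*stems)]

print(residual_db(original, stems))      # far below -100 dB: nothing is missing
print(residual_db(original, stems[:2] + [other]))  # about -6 dB: the bass is missing
```

A strongly negative number means the stems add back up to the original. A residual only a few dB below the mix means something audible was lost or altered along the way.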