macOS Still Has No Volume Mixer, So I Built One

macOS has never had a proper per-app volume mixer.

On Windows, this is basic. You can lower Spotify, mute a browser, keep a meeting app loud, and leave the rest of the system untouched.

On macOS, you usually get two options:

change the entire system volume
hope each app has its own volume control

That always felt strange to me.

So I built Mimir, a macOS menu bar app that lets you control the volume of individual apps.

Full source code is on GitHub: github.com/ThalesBMC/Mimir

The interesting part is that this is now possible without kernel extensions, virtual audio drivers, or routing hacks.

Starting with macOS 14.2, Apple introduced the Core Audio Process Tap API. It lets you capture and process audio from individual apps directly.

This article is a technical breakdown of how I used it to build a per-app volume mixer in Swift.

The basic idea is simple: capture audio from a process, mute the original output, process the stream yourself, apply gain, and send it back to the output device.

The flow looks like this:

App audio output
-> Process Tap
-> Aggregate Device
-> IOProc callback
-> Default output device

The key detail is .mutedWhenTapped. When this mode is enabled, the app audio no longer goes directly to the output device. Instead, it is routed through your tap, where you can apply volume changes, mute it, or measure levels in real time.

First, you need to find the process AudioObjectID, because the tap API identifies processes by Core Audio object rather than by PID:

static func findProcessObjectID(for pid: pid_t) -> AudioObjectID? {
    var propertyAddress = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyProcessObjectList,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain
    )

    var propertySize: UInt32 = 0

    guard AudioObjectGetPropertyDataSize(
        AudioObjectID(kAudioObjectSystemObject),
        &propertyAddress,
        0,
        nil,
        &propertySize
    ) == noErr else {
        return nil
    }

    let count = Int(propertySize) / MemoryLayout<AudioObjectID>.size
    var objectList = [AudioObjectID](repeating: 0, count: count)

    guard AudioObjectGetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),
        &propertyAddress,
        0,
        nil,
        &propertySize,
        &objectList
    ) == noErr else {
        return nil
    }

    for objectID in objectList {
        var processPID: pid_t = 0

        var pidAddress = AudioObjectPropertyAddress(
            mSelector: kAudioProcessPropertyPID,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain
        )

        var pidSize = UInt32(MemoryLayout<pid_t>.size)

        let err = AudioObjectGetPropertyData(
            objectID,
            &pidAddress,
            0,
            nil,
            &pidSize,
            &processPID
        )

        if err == noErr, processPID == pid {
            return objectID
        }
    }

    return nil
}
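
As an alternative to scanning the whole process list, Core Audio also exposes a property that translates a PID in one call. The sketch below assumes kAudioHardwarePropertyTranslatePIDToProcessObject behaves as its name suggests when the PID is passed as qualifier data; the function name translatePID is made up for this example and is not part of Mimir.

```swift
import CoreAudio

/// Hypothetical helper: asks the HAL to translate a PID into a process object.
/// Returns nil if the call fails or the process is unknown to Core Audio.
func translatePID(_ pid: pid_t) -> AudioObjectID? {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyTranslatePIDToProcessObject,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain
    )

    // The PID travels as the qualifier; the result is the process object ID.
    var qualifier = pid
    var objectID = AudioObjectID(kAudioObjectUnknown)
    var size = UInt32(MemoryLayout<AudioObjectID>.size)

    let err = AudioObjectGetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),
        &address,
        UInt32(MemoryLayout<pid_t>.size),
        &qualifier,
        &size,
        &objectID
    )

    guard err == noErr, objectID != kAudioObjectUnknown else {
        return nil
    }
    return objectID
}
```

The list-scanning version is still useful when you want to enumerate every process that is currently producing audio, for example to populate the mixer UI.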

Once you have the process object, you can create a tap for that app:

@available(macOS 14.2, *)
func createProcessTap(for processObjectID: AudioObjectID) throws -> AudioObjectID {
    let tapDescription = CATapDescription(
        stereoMixdownOfProcesses: [processObjectID]
    )

    tapDescription.uuid = UUID()
    tapDescription.muteBehavior = .mutedWhenTapped

    var tapID: AudioObjectID = kAudioObjectUnknown
    let err = AudioHardwareCreateProcessTap(tapDescription, &tapID)

    guard err == noErr else {
        throw NSError(domain: NSOSStatusErrorDomain, code: Int(err))
    }

    return tapID
}

At this point, you have a Core Audio object representing the audio stream of a specific app. But you still cannot attach an IOProc directly to the tap. For that, the tap needs to be wrapped in an aggregate device together with the real output device:

func buildAggregateDescription(
    outputUID: String,
    tapUUID: UUID,
    name: String
) -> [String: Any] {
    [
        kAudioAggregateDeviceNameKey: name,
        kAudioAggregateDeviceUIDKey: "SoundManager-\(tapUUID.uuidString)",
        kAudioAggregateDeviceTapListKey: [[
            kAudioSubTapUIDKey: tapUUID.uuidString,
            kAudioSubTapDriftCompensationKey: true
        ]],
        kAudioAggregateDeviceSubDeviceListKey: [[
            kAudioSubDeviceUIDKey: outputUID
        ]],
        kAudioAggregateDeviceMainSubDeviceKey: outputUID,
        kAudioAggregateDeviceIsPrivateKey: true
    ]
}
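
The outputUID argument has to come from the current default output device. A minimal sketch of one way to read it, assuming the standard kAudioHardwarePropertyDefaultOutputDevice and kAudioDevicePropertyDeviceUID properties; the function name defaultOutputDeviceUID is hypothetical:

```swift
import CoreAudio
import Foundation

/// Hypothetical helper: returns the UID string of the default output device,
/// or nil if there is no output device or a property read fails.
func defaultOutputDeviceUID() -> String? {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDefaultOutputDevice,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain
    )

    var deviceID = AudioObjectID(kAudioObjectUnknown)
    var size = UInt32(MemoryLayout<AudioObjectID>.size)

    guard AudioObjectGetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),
        &address, 0, nil, &size, &deviceID
    ) == noErr, deviceID != kAudioObjectUnknown else {
        return nil
    }

    // The UID comes back as a CFString that the caller owns.
    address.mSelector = kAudioDevicePropertyDeviceUID
    var uid: Unmanaged<CFString>?
    size = UInt32(MemoryLayout<Unmanaged<CFString>?>.size)

    guard AudioObjectGetPropertyData(
        deviceID, &address, 0, nil, &size, &uid
    ) == noErr, let unwrapped = uid else {
        return nil
    }
    return unwrapped.takeRetainedValue() as String
}
```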

Then create the aggregate device:

var aggregateDeviceID: AudioObjectID = kAudioObjectUnknown

let err = AudioHardwareCreateAggregateDevice(
    description as CFDictionary,
    &aggregateDeviceID
)

guard err == noErr else {
    throw NSError(domain: NSOSStatusErrorDomain, code: Int(err))
}

One subtle detail: after AudioHardwareCreateAggregateDevice returns, the device may not be ready immediately. In practice, you should wait or poll before starting the IOProc, otherwise audio can fail silently or behave inconsistently.
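
A minimal sketch of the polling approach. The helper name waitUntilReady, the one-second timeout, and the 10 ms interval are all assumptions for illustration; the readiness check itself is passed in as a closure because this article does not pin down a single correct test.

```swift
import Foundation

/// Hypothetical helper: polls `isReady` until it returns true or the timeout
/// expires. Blocks the calling thread, so call it off the main thread and
/// never from the audio callback.
func waitUntilReady(
    timeout: TimeInterval = 1.0,
    interval: TimeInterval = 0.01,
    isReady: () -> Bool
) -> Bool {
    let deadline = Date().addingTimeInterval(timeout)
    while Date() < deadline {
        if isReady() {
            return true
        }
        Thread.sleep(forTimeInterval: interval)
    }
    // One last check in case the device became ready during the final sleep.
    return isReady()
}
```

One candidate for the closure (again an assumption, not something confirmed here) is reading kAudioDevicePropertyDeviceIsAlive on the aggregate device, or checking that it reports at least one stream, and only then starting the IOProc.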

The actual volume control happens inside the real-time audio callback:

AudioDeviceCreateIOProcIDWithBlock(
    &deviceProcID,
    aggregateDeviceID,
    queue
) { [weak self] _, inInputData, _, outOutputData, _ in
    self?.processAudio(inInputData, to: outOutputData)
}

The callback should stay boring and predictable: no allocations, no locks, no Objective-C calls, no logging, and no work that could block. It should only read samples, apply gain, and write the result:

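private func processAudio(
    _ input: UnsafePointer<AudioBufferList>,
    to output: UnsafeMutablePointer<AudioBufferList>
) {
    // Assumes a single interleaved stereo Float32 buffer on each side.
    guard input.pointee.mNumberBuffers > 0,
          output.pointee.mNumberBuffers > 0 else {
        return
    }

    let inBuffer = input.pointee.mBuffers
    let outBuffer = output.pointee.mBuffers

    // Never read or write past the smaller of the two buffers.
    let byteCount = Int(min(inBuffer.mDataByteSize, outBuffer.mDataByteSize))
    let frameCount = byteCount / (MemoryLayout<Float32>.size * 2)

    guard let inData = inBuffer.mData?.assumingMemoryBound(to: Float32.self),
          let outData = outBuffer.mData?.assumingMemoryBound(to: Float32.self) else {
        return
    }

    for frame in 0..<frameCount {
        // Advance the ramp once per frame so both channels share the same
        // gain and the time constant matches the sample rate.
        currentVolume += (targetVolume - currentVolume) * rampCoefficient
        outData[2 * frame] = inData[2 * frame] * currentVolume
        outData[2 * frame + 1] = inData[2 * frame + 1] * currentVolume
    }
}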

The ramp is important. Changing gain instantly can create audible clicks, so the gain is smoothed toward its target with a one-pole filter. With a 30 ms time constant, volume changes sound clean instead of abrupt:

let rampTimeSeconds: Float = 0.030

rampCoefficient = 1 - exp(
    -1 / (Float(sampleRate) * rampTimeSeconds)
)
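
To get a feel for what this coefficient does, the sketch below runs the same update rule outside the audio callback. The 48 kHz sample rate and the function name rampedGain are assumptions for illustration. After one time constant the gain has covered roughly 63% of the distance to its target, and after three it has covered roughly 95%.

```swift
import Foundation

// Same formula as above; 48 kHz is an assumed sample rate.
let sampleRate: Float = 48_000
let rampTimeSeconds: Float = 0.030
let rampCoefficient = 1 - exp(-1 / (sampleRate * rampTimeSeconds))

/// Hypothetical helper: applies the one-pole update `steps` times.
func rampedGain(from start: Float, to target: Float, steps: Int) -> Float {
    var current = start
    for _ in 0..<steps {
        current += (target - current) * rampCoefficient
    }
    return current
}

// Number of updates in one time constant (one update per frame).
let stepsPerTimeConstant = Int((sampleRate * rampTimeSeconds).rounded())

let afterOne = rampedGain(from: 0, to: 1, steps: stepsPerTimeConstant)
let afterThree = rampedGain(from: 0, to: 1, steps: 3 * stepsPerTimeConstant)
print(afterOne, afterThree)
```

This only holds if the update runs once per frame. Running it once per sample of an interleaved stereo stream would advance the ramp twice per frame and halve the effective ramp time.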

Output device changes are another important part of the implementation. When the user connects headphones or switches audio output, the aggregate devices need to be recreated using the new default output device. Otherwise, audio may stop working silently.

You can listen for changes to kAudioHardwarePropertyDefaultOutputDevice:

func startDeviceChangeListener() {
    var propertyAddress = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyDefaultOutputDevice,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain
    )

    AudioObjectAddPropertyListenerBlock(
        AudioObjectID(kAudioObjectSystemObject),
        &propertyAddress,
        queue
    ) { [weak self] _, _ in
        self?.handleDeviceChange()
    }
}

When the output device changes, the safest approach is to save state, invalidate the current taps, recreate everything against the new output device, and restore the previous volume and mute state:

func handleDeviceChange() {
    for (pid, tap) in activeTaps {
        tapStates[pid] = (
            volume: tap.volume,
            muted: tap.isMuted
        )
    }

    for tap in activeTaps.values {
        tap.invalidate()
    }

    activeTaps.removeAll()

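    for (pid, state) in tapStates {
        guard let tap = ProcessTapController(pid: pid) else {
            continue
        }

        tap.volume = state.volume
        tap.isMuted = state.muted

        do {
            try tap.activate()
            // Track the tap only once it is actually running.
            activeTaps[pid] = tap
        } catch {
            // Activation failed (for example, the process has exited); skip it.
            continue
        }
    }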
}
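
The invalidate() call is not shown in this article. A rough sketch of what such a teardown could look like, assuming the tap controller holds the three identifiers created earlier; the function name tearDownTap and the exact ordering are assumptions, not Mimir's actual implementation:

```swift
import CoreAudio

/// Hypothetical teardown: stop the IOProc first, then destroy the aggregate
/// device, then destroy the tap itself.
@available(macOS 14.2, *)
func tearDownTap(
    aggregateDeviceID: AudioObjectID,
    deviceProcID: AudioDeviceIOProcID?,
    tapID: AudioObjectID
) {
    if aggregateDeviceID != kAudioObjectUnknown {
        if let procID = deviceProcID {
            // Stop pulling audio before removing the callback.
            _ = AudioDeviceStop(aggregateDeviceID, procID)
            _ = AudioDeviceDestroyIOProcID(aggregateDeviceID, procID)
        }
        _ = AudioHardwareDestroyAggregateDevice(aggregateDeviceID)
    }

    if tapID != kAudioObjectUnknown {
        // Destroying the tap lets the app play directly to the output again.
        _ = AudioHardwareDestroyProcessTap(tapID)
    }
}
```

Error codes are ignored here for brevity; a real implementation would probably want to record them outside the audio thread.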

Modern apps also make process detection tricky. Browsers, Electron apps, and WebKit-based apps often play audio through helper processes. For a good user experience, those helpers should be grouped under the main app instead of showing duplicate or confusing sliders.
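
One possible grouping heuristic, sketched under the assumption that you already know each audio process's executable path: treat the outermost .app bundle in the path as the owner. The function names and the example paths are made up for illustration, and this is not necessarily how Mimir groups helpers.

```swift
import Foundation

/// Hypothetical helper: returns the outermost ".app" bundle path that
/// contains `executablePath`, or nil if the path is not inside an app bundle.
func owningAppBundlePath(forExecutableAt executablePath: String) -> String? {
    let components = executablePath.split(separator: "/")
    guard let index = components.firstIndex(where: { $0.hasSuffix(".app") }) else {
        return nil
    }
    return "/" + components[...index].joined(separator: "/")
}

/// Groups PIDs by owning app bundle so helpers share one slider with the
/// main app. Processes outside any bundle are keyed by their own path.
func groupByOwningApp(_ executablePaths: [pid_t: String]) -> [String: [pid_t]] {
    var groups: [String: [pid_t]] = [:]
    for (pid, path) in executablePaths {
        let key = owningAppBundlePath(forExecutableAt: path) ?? path
        groups[key, default: []].append(pid)
    }
    return groups.mapValues { $0.sorted() }
}

// Made-up paths for a browser with one nested helper process.
let grouped = groupByOwningApp([
    100: "/Applications/Example Browser.app/Contents/MacOS/Example Browser",
    101: "/Applications/Example Browser.app/Contents/Frameworks/Helpers/Example Browser Helper.app/Contents/MacOS/Example Browser Helper",
    200: "/usr/local/bin/exampleplayer"
])
print(grouped)
```

Here PIDs 100 and 101 end up under the same key, so the UI can show a single slider that fans out to one tap per helper process.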

In short, the Core Audio Process Tap API finally makes per-app volume control on macOS possible in a much cleaner way than older approaches.

The implementation comes down to:

Find the audio process.
Create a process tap.
Wrap it in an aggregate device.
Attach an IOProc.
Apply gain in real time.
Recreate everything when the output device changes.

The hard parts are not the main concept, but the details around permissions, tap lifecycle, helper processes, device changes, and staying real-time safe inside the audio callback.

That is how I built Mimir!

Source code: github.com/ThalesBMC/Mimir
