Methodology Paper · v1.0

The Science Behind Transientik Master

How a deterministic engine listens to a track, builds a structured picture of it, and turns that picture into safe, explainable mastering decisions.

Version: 1.0 · methodology
Date: May 2026
Reading: ~12 min
Status: Public summary · production engine in private beta
By: Transientik Labs
ABSTRACT

Keywords

  • EBU R128 / BS.1770-4
  • weighted percentile aggregation
  • adaptive band targets
  • decision tracing
  • deterministic DSP
  • frame-driven analysis
  • risk-aware loudness
  • explainable automation

Transientik Master treats mastering as a measurement problem before a creative one. A short listening pass captures one analysis frame every 100 ms; the buffer is then reduced into a typed track snapshot using weighted percentile statistics — so loud, tonal, perceptually relevant frames count more than quiet intros, ringouts and crossfades. A planning layer derives risk-aware loudness, high-frequency and stereo policies; a destination recipe turns those into concrete parameter targets, and every move is recorded as a decision with a reason. The result is a release-ready chain that can be inspected, justified and reproduced — bit-identical, run after run.

01 · Pipeline

Listen. Aggregate. Decide. Fine-tune.

Every wizard run flows through four well-defined stages. Each stage has a single job, a typed input and a typed output — there are no hidden side effects between them, and any stage can be inspected independently in tests.

  1. Listen: FFT · stereo · transient · LUFS
  2. Aggregate: weighted P75 + contrast
  3. Decide: planners + recipe + trace
  4. Fine-tune: 12 perceptual nudges
Fig. 1 — The four pipeline stages. Live analyzers keep running after capture, so the BEFORE / AFTER displays remain accurate while the user fine-tunes.
01
Listen
Four analyzers run in parallel on every audio block: a Hann-windowed FFT spectral analyzer with an eight-band split, a per-block stereo correlator, a dual-envelope transient detector and a fully spec-compliant EBU R128 / BS.1770-4 loudness meter (momentary, short-term, integrated, LRA, oversampled true peak). All four write into a lock-free results struct read by the rest of the engine.
02
Aggregate
During the listening phase, one frame snapshot is captured every 100 ms into a pre-reserved ring buffer. On commit, the buffer is reduced into a typed TrackKnowledge structure using weighted percentile statistics rather than arithmetic means — so a quiet intro can no longer dilute a chorus's mud severity halfway to zero.
03
Decide
Once the listening pass commits, the user picks a destination. A planning layer (Loudness, High-Frequency, Stereo Safety) derives risk-aware policies from the snapshot and that destination, then a per-destination recipe turns those policies into concrete parameter targets. Switching destination on the same captured snapshot is a free re-resolve — no second listening pass. Every rule writes a decision with a reason code into a structured trace.
04
Fine-tune
The user nudges twelve perceptual dimensions on top of the engine's result. The wizard's choice remains visible as a reference baseline at every step — one click restores it, on any control, at any time.
02 · Listening

Four analyzers, one shared snapshot.

Each analyzer has a narrow, testable job. They all write into the same lock-free result struct, so the rest of the engine sees one consistent view of the signal at every block.

Spectral analyzer

FFT · 8-band

Hann-windowed real FFT with an eight-band split calibrated against well-mastered references. Each band carries a target dB level, a tolerance, an asymmetric temporal follower and a soft severity curve. A long-window broadband EMA is used as the reference scale, so cutting one band cannot inflate every other band's ratio.

  • Per-band level, target delta and severity
  • Spectral tilt, flux and flatness
  • Per-band crest factor (peak / average tracker)
  • Per-band loudness range (P95 − P10 over a sliding histogram)
module · 01

Stereo analyzer

Pearson correlator

Block-rate Pearson correlator of the L/R channels. Below a small energy floor everything is forced to zero so silence cannot flash the radar to MONO or WIDE. A second instance runs after the chain to drive the AFTER side of every stereo display.

  • Correlation, width, balance
  • Phase / mono-compatibility risk
  • Pre / post-chain instances for BEFORE / AFTER
module · 02
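
A minimal sketch of a block-rate Pearson correlator with an energy floor, as described for the stereo analyzer. The floor value, the choice to apply it to per-channel variance and the function name are assumptions made for illustration; the engine's actual constants are not published.

```python
import math

# Hypothetical floor: the paper only says "a small energy floor".
ENERGY_FLOOR = 1e-9


def block_correlation(left, right):
    """Pearson correlation of one L/R block, forced to 0.0 below the floor."""
    n = len(left)
    if n == 0 or n != len(right):
        raise ValueError("left and right must be non-empty and equal length")
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    var_l = sum((x - mean_l) ** 2 for x in left) / n
    var_r = sum((x - mean_r) ** 2 for x in right) / n
    # Silence (or a silent channel) has no meaningful correlation: report 0
    # so a display cannot jump to MONO or WIDE on noise-floor data.
    if var_l < ENERGY_FLOOR or var_r < ENERGY_FLOOR:
        return 0.0
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right)) / n
    return cov / math.sqrt(var_l * var_r)


# One cycle of a sine as a test block.
sine = [0.5 * math.sin(2.0 * math.pi * k / 64) for k in range(64)]
inverted = [-x for x in sine]
silence = [0.0] * 64
```

Identical channels give a correlation near +1, polarity-inverted channels give a value near −1, and a silent block returns exactly 0.0 because of the floor.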

Transient analyzer

Dual envelope

Two envelope followers on the rectified mono mix — one fast, one slow. The difference between them is a per-sample transient strength signal, time-averaged into dullness confidence. A separate latched hit counter gives transient density per second.

  • Transient strength, dullness confidence
  • Transient density (hits / second)
  • Pure detector — independent from the corrective stage
module · 03
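
The dual-envelope idea can be sketched with two one-pole followers on the rectified signal. The time constants below (1 ms fast, 50 ms slow), the single shared coefficient for attack and release, and the clamp at zero are illustrative assumptions, not the engine's settings.

```python
import math


def one_pole_coeff(time_ms, sample_rate):
    """Smoothing coefficient for a one-pole follower with the given time constant."""
    return math.exp(-1.0 / (time_ms * 0.001 * sample_rate))


def transient_strength(mono, sample_rate, fast_ms=1.0, slow_ms=50.0):
    """Per-sample difference between a fast and a slow envelope follower.

    fast_ms and slow_ms are hypothetical values chosen for this sketch.
    """
    a_fast = one_pole_coeff(fast_ms, sample_rate)
    a_slow = one_pole_coeff(slow_ms, sample_rate)
    fast = 0.0
    slow = 0.0
    strength = []
    for sample in mono:
        rectified = abs(sample)
        fast = a_fast * fast + (1.0 - a_fast) * rectified
        slow = a_slow * slow + (1.0 - a_slow) * rectified
        # Only onsets count: the fast follower leading the slow one.
        strength.append(max(0.0, fast - slow))
    return strength


# Silence followed by an abrupt onset.
onset_signal = [0.0] * 100 + [1.0] * 200
onset_strength = transient_strength(onset_signal, 48000)
```

On the onset the fast follower rises well ahead of the slow one, so the difference spikes; on sustained material both followers converge and the difference decays toward zero.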

Loudness analyzer

BS.1770-4 / EBU R128

K-weighted, gated, fully spec-compliant. Momentary (400 ms), short-term (3 s), integrated (gated mean over a histogram), LRA (EBU Tech 3342) and oversampled true peak with a multi-stage equiripple FIR. A 250 ms warmup masks the filter-prime transient so the first readings are never garbage.

  • Momentary, short-term, integrated LUFS
  • Loudness range (LRA)
  • Oversampled true peak
  • Pre / post-chain instances
module · 04
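
For readers unfamiliar with BS.1770 gating, the two-stage gate behind the integrated value can be sketched as follows. The input is assumed to be a list of per-block mean-square energies that are already K-weighted and channel-summed; the sketch uses a plain list instead of the histogram the engine is described as using, and the function names are illustrative.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0  # absolute gate from BS.1770
RELATIVE_GATE_LU = -10.0    # relative gate offset from BS.1770


def loudness_from_energy(energy):
    """Block energy (K-weighted mean square) to loudness in LUFS."""
    return -0.691 + 10.0 * math.log10(energy)


def energy_from_loudness(lufs):
    """Inverse of loudness_from_energy."""
    return 10.0 ** ((lufs + 0.691) / 10.0)


def integrated_lufs(block_energies):
    """Two-stage gated mean over gating-block energies."""
    absolute_threshold = energy_from_loudness(ABSOLUTE_GATE_LUFS)
    stage1 = [z for z in block_energies if z > absolute_threshold]
    if not stage1:
        return float("-inf")  # nothing above the absolute gate
    # The relative gate sits 10 LU below the loudness of the stage-1 blocks.
    relative_lufs = loudness_from_energy(sum(stage1) / len(stage1)) + RELATIVE_GATE_LU
    relative_threshold = energy_from_loudness(relative_lufs)
    stage2 = [z for z in stage1 if z > relative_threshold]
    return loudness_from_energy(sum(stage2) / len(stage2))


# Half the blocks at -20 LUFS, half at -50 LUFS.
loud_block = energy_from_loudness(-20.0)
quiet_block = energy_from_loudness(-50.0)
mixed_take = [loud_block] * 10 + [quiet_block] * 10
```

In this example the quiet blocks pass the absolute gate but fall below the relative gate, so the integrated value stays at about −20 LUFS instead of the roughly −23 LUFS an ungated mean would report.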

All four analyzers are lock-free and allocation-free on the audio thread. They run on every audio block, in every host, with no bypass shortcut.

03 · Aggregation

Why a percentile beats the mean.

A track has loud sections and quiet ones, tonal moments and noisy fills, intros that say nothing and ringouts that say less. An arithmetic mean treats them all the same. A weighted P75 does not.

[Figure 2 plot: per-frame mud severity (0.00 to 1.00) against time. Arithmetic mean 0.31, diluted by quiet sections; weighted P75 0.62, tracking the loud / tonal sections. Frame weights: intro ≈ 0, chorus ≈ 1, ringout → 0.]
Fig. 2 — Frame-by-frame mud severity for a track with a quiet intro, a hot chorus and a long ringout. The arithmetic mean reads as mildly muddy. The weighted P75 reads as severely muddy — which is what the listener hears in the chorus.
  • 01 · Each captured frame carries a weight that depends on its loudness relative to the integrated value, its tonal density and its position in the take. Quiet intros and the trailing seconds of a ringout collapse to near-zero weight.
  • 02 · Severity scalars (mud, harsh, transient dullness, stereo issue) are reduced as the weighted P75 of the per-frame value: the level that only the top quarter of the take, measured by weight, exceeds. The loudest, most tonal frames therefore set the reading. The arithmetic mean is kept too, but only as a legacy view.
  • 03 · Loudness scalars (integrated LUFS, true peak, LRA) come straight from the spec-compliant meter; they are already gated and windowed to standard.
  • 04 · If the captured material is too short, the engine emits an explicit code and halves the corrective deltas instead of pretending the snapshot is reliable.
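
A small sketch makes the difference between the two reductions concrete. The weighted-percentile definition used here (smallest value whose cumulative weight share reaches q) is one common convention and is an assumption; the frame values and weights are invented for illustration and are not taken from Fig. 2.

```python
from typing import Sequence


def weighted_percentile(values: Sequence[float], weights: Sequence[float], q: float) -> float:
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0.0)
    total = sum(w for _, w in pairs)
    if total <= 0.0:
        raise ValueError("no frame carries weight")
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= q * total:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall at q = 1


def arithmetic_mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


# Hypothetical take: four intro frames, four chorus frames, two ringout frames.
severity = [0.05, 0.05, 0.10, 0.10, 0.60, 0.65, 0.62, 0.70, 0.10, 0.05]
weights = [0.0, 0.0, 0.1, 0.1, 1.0, 1.0, 1.0, 1.0, 0.1, 0.0]

mean_view = arithmetic_mean(severity)                  # about 0.30
p75_view = weighted_percentile(severity, weights, 0.75)  # 0.65
```

With these numbers the mean lands near 0.30 because six low-severity frames pull it down, while the weighted P75 reads 0.65, a value taken from the chorus frames that carry almost all of the weight.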
04 · Adaptive targets

Eight bands that move with your track.

Hard-coded EQ targets reward tracks that already sound like the reference and punish everything else. Transientik Master compares each band against an expected curve built per take from destination, spectral tilt, flatness density and loudness range, and only fires when the band falls outside its own tolerance window.

[Figure 3 plot: level in dB (−20 to 0) for the eight bands SUB, BASS, MUD, LOW-MID, MID, HARSH, AIR and SPARKLE, showing the expected curve (per take), the measured energy and the ±1.5 dB dead zone.]
Fig. 3 — Eight-band layout. The grey strip is the per-band tolerance window; the green dots show measured energy; the dashed line is the adaptive expected curve.
  • Expected dB per band = base curve + destination offset + tilt term + density term + LRA term.
  • Delta = measured − expected. Inside ±1.5 dB the band is logged as balanced and contributes no EQ move.
  • Outside the dead zone, an adaptive delta is applied — capped per destination by the EQ coherence pass.
  • A deliberately dark, deliberately bright or deliberately mid-forward track is no longer scored as broken.
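
The bullets above can be sketched as a pure function. The ±1.5 dB dead zone comes from the text; the field names, the example values and the correction rule outside the dead zone (a simple move back toward the expected level, clamped to a cap) are assumptions, since the adaptive delta itself is not disclosed.

```python
from dataclasses import dataclass

DEAD_ZONE_DB = 1.5  # stated in the paper


@dataclass(frozen=True)
class BandTerms:
    """Hypothetical container for the five terms of the expected curve."""
    base_db: float
    destination_offset_db: float
    tilt_term_db: float
    density_term_db: float
    lra_term_db: float


def expected_db(terms: BandTerms) -> float:
    return (terms.base_db + terms.destination_offset_db + terms.tilt_term_db
            + terms.density_term_db + terms.lra_term_db)


def band_move_db(measured_db: float, terms: BandTerms, cap_db: float) -> float:
    """EQ move for one band: 0.0 inside the dead zone, capped correction outside."""
    delta = measured_db - expected_db(terms)
    if abs(delta) <= DEAD_ZONE_DB:
        return 0.0  # logged as balanced, no EQ move
    # Assumed rule: correct toward the expected level, limited by the cap.
    return max(-cap_db, min(cap_db, -delta))


# Invented example terms: expected level works out to -9.0 dB.
mud_terms = BandTerms(-10.0, 1.0, -0.5, 0.0, 0.5)
```

With an expected level of −9.0 dB, a band measured at −8.0 dB stays untouched, while a band measured at −5.0 dB asks for a 4 dB cut that a 3 dB cap reduces to −3.0 dB.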
05 · Planning

Three planners between knowledge and parameters.

Between the snapshot and the parameter targets sits a small planning layer. Each planner is a pure function from (TrackKnowledge, Envelope) to a typed POD — no audio thread allocation, no I/O, no hidden state. Their outputs are then fed to the destination recipe.

[Figure 4 diagram: IN (TrackKnowledge, Destination envelope) → PLAN (LoudnessPlan, HighFrequencyPlan, StereoSafetyPlan) → OUT (Recipe rules, EQ coherence pass, Parameter targets).]
Fig. 4 — The planners sit between the snapshot and the recipe. An EQ coherence post-pass enforces five rules on the four EQ slots before the targets are written to the parameter tree.

LoudnessPlan

risk-aware target

Replaces the naive 'clamp(target − integrated, −6, +12)' gain with a risk-aware effective target. Dynamic, peak-risk and short-capture material is gently shaved; dense, low-LRA material on aggressive destinations is allowed a slightly higher target, within destination caps.

  • global crest, density score
  • limiter risk score
  • effective target LUFS
  • recommended manual gain (clamped per destination)
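
To contrast the two approaches, here is a hedged sketch. The naive clamp is quoted from the text; the risk-aware version is only one plausible shape, and every parameter in it (`limiter_risk`, `max_shave_db`, `gain_cap_db`) is a hypothetical stand-in for internals the paper does not disclose.

```python
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def naive_gain_db(target_lufs: float, integrated_lufs: float) -> float:
    """The naive rule quoted in the text."""
    return clamp(target_lufs - integrated_lufs, -6.0, 12.0)


@dataclass(frozen=True)
class LoudnessPlan:
    effective_target_lufs: float
    recommended_gain_db: float


def plan_loudness(target_lufs: float, integrated_lufs: float, limiter_risk: float,
                  max_shave_db: float, gain_cap_db: float) -> LoudnessPlan:
    """Pure function: the same inputs always give the same plan.

    Assumed rule: shave the target in proportion to a 0..1 risk score,
    then clamp the resulting gain to a per-destination cap.
    """
    risk = clamp(limiter_risk, 0.0, 1.0)
    effective = target_lufs - max_shave_db * risk
    gain = clamp(effective - integrated_lufs, -gain_cap_db, gain_cap_db)
    return LoudnessPlan(effective, gain)


example_plan = plan_loudness(-9.0, -16.0, 0.5, 2.0, 8.0)
```

In this example a risk score of 0.5 with a 2 dB maximum shave lowers the effective target from −9.0 to −10.0 LUFS, so the recommended gain is 6.0 dB instead of the 7.0 dB the naive rule would give.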

HighFrequencyPlan

coordinated HF policy

Coordinates DeMetallic, Presence, Hi-Mid and High EQ via three risk scores and four permission flags, instead of the legacy 'multiply Presence by a fixed factor' hack. Sibilant tracks lose Presence and the air-shelf boost; metallic tracks see DeMetallic prioritised over the high shelf.

  • metallic risk, sibilance risk, air need
  • Presence allowed / scale
  • high-shelf allowed
  • DeMetallic priority flag
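
One way to picture the coordination is a planner that turns risk scores into permission flags. The 0.5 threshold, the linear Presence scale and the choice to block the high shelf entirely on metallic material are assumptions for this sketch; the paper states the policy outcomes but not the rules that produce them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HighFrequencyPlan:
    presence_allowed: bool
    presence_scale: float
    high_shelf_allowed: bool
    demetallic_priority: bool


def plan_high_frequency(metallic_risk: float, sibilance_risk: float, air_need: float,
                        risk_threshold: float = 0.5) -> HighFrequencyPlan:
    """Derive HF permissions from three 0..1 scores (threshold is hypothetical)."""
    sibilant = sibilance_risk >= risk_threshold
    metallic = metallic_risk >= risk_threshold
    return HighFrequencyPlan(
        # Sibilant tracks lose Presence and the air-shelf boost.
        presence_allowed=not sibilant,
        presence_scale=0.0 if sibilant else 1.0 - sibilance_risk,
        # Assumed: DeMetallic taking priority means the shelf is withheld.
        high_shelf_allowed=(not sibilant) and (not metallic) and air_need > 0.0,
        demetallic_priority=metallic,
    )


sibilant_plan = plan_high_frequency(metallic_risk=0.1, sibilance_risk=0.8, air_need=0.6)
metallic_plan = plan_high_frequency(metallic_risk=0.9, sibilance_risk=0.0, air_need=0.6)
clean_plan = plan_high_frequency(metallic_risk=0.0, sibilance_risk=0.0, air_need=0.6)
```

Because every downstream rule reads the same plan, Presence and the high shelf cannot be boosted by one rule while another rule is trying to tame sibilance.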

StereoSafetyPlan

phase-aware width

Scores low-end width, balance and global stereo risk, then derives a mono-below frequency policy, a stereo-repair amount and a stereo-link policy that depends on destination. Vinyl always gets a mono low-end; soundtracks always stay link-safe.

  • low-end width risk, balance risk, global stereo risk
  • mono-below frequency policy
  • repair amount and bus stereo-link policy
  • widening guard
06 · Explainability

Every move is a decision. Every decision has a reason.

Every recipe rule writes a structured Decision into a DecisionTrace: which parameter changed, from what to what, with what confidence, on the strength of which signals. Skipped moves are recorded too — a 'no-op' is itself a decision and a reason.

OBSERVATION: mud contrast 3.4 dB · mud severity 0.61
CONDITION: contrast > 2.5 dB AND severity > 0.45
ACTION: EQ low-mid → −2.1 dB
REASON: MUD_CONTRAST_PERSISTENT · confidence 0.86
Fig. 5 — A single decision: which signals were observed, which condition fired, which action was taken, which reason was logged. Every entry in a DecisionTrace fits this shape.
  • 01 · If a parameter moved, the trace says why, and which signal triggered it.
  • 02 · If a parameter did not move when it could have, the trace says why not.
  • 03 · Reasons are typed enums, not free-text strings, so they survive round-trips, schema versions and tests.
  • 04 · A deterministic phrasebook turns the trace into readable Simple, Technical and Debug summaries: same input, same output, byte for byte.
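
The shape of a traced decision can be sketched with an enum and a frozen record. The numbers mirror the illustrative values shown in Fig. 5; the `MUD_WITHIN_TOLERANCE` code, the parameter id, the fixed cut amount and the phrase templates are hypothetical additions for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Reason(Enum):
    MUD_CONTRAST_PERSISTENT = "MUD_CONTRAST_PERSISTENT"  # named in Fig. 5
    MUD_WITHIN_TOLERANCE = "MUD_WITHIN_TOLERANCE"        # hypothetical skip code


@dataclass(frozen=True)
class Decision:
    parameter: str
    before: float
    after: float
    reason: Reason
    confidence: float


def mud_rule(contrast_db: float, severity: float, current_db: float) -> Decision:
    """Always returns a Decision, so a skipped move is recorded too."""
    if contrast_db > 2.5 and severity > 0.45:
        # Cut amount and confidence are the illustrative values from Fig. 5.
        return Decision("eq_low_mid_db", current_db, -2.1,
                        Reason.MUD_CONTRAST_PERSISTENT, 0.86)
    return Decision("eq_low_mid_db", current_db, current_db,
                    Reason.MUD_WITHIN_TOLERANCE, 1.0)


# Static phrasebook indexed by reason code: same decision, same text.
PHRASEBOOK = {
    Reason.MUD_CONTRAST_PERSISTENT:
        "{param}: {before:+.1f} dB to {after:+.1f} dB (persistent low-mid build-up)",
    Reason.MUD_WITHIN_TOLERANCE:
        "{param}: left at {before:+.1f} dB (low-mid within tolerance)",
}


def render(decision: Decision) -> str:
    return PHRASEBOOK[decision.reason].format(
        param=decision.parameter, before=decision.before, after=decision.after)


fired = mud_rule(3.4, 0.61, 0.0)
skipped = mud_rule(1.0, 0.20, 0.0)
```

Tests can then assert on `fired.reason` or `skipped.reason` directly, which is far less brittle than matching free-text log lines.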
07 · Problem Radar

Five axes, two polygons, one honest picture.

The Problem Radar shows what the engine sees on the source track and how hard the chain is working to address it. Each axis reads from 0 (centre, clean) to 1 (outer ring, severe). Two polygons are drawn on the same axes: a faint amber BEFORE outline sampled from the raw input, and a bright mint AFTER polygon driven by the controls currently engaged.

[Figure 6 radar: five axes (MUD, HARSH, CLIP, PHASE, WIDE) carrying two polygons.]
BEFORE · raw input
AFTER · plugin response
  • MUD: low-mid build-up
  • HARSH: upper-mid energy
  • CLIP: true-peak / clip risk
  • PHASE: low-end / mono safety
  • WIDE: stereo width budget
Fig. 6 — Pentagon radar. Amber outline = BEFORE: what the analyser hears on the raw input. Mint filled polygon = AFTER: how hard the chain is working on each axis. With the plugin bypassed or every control flat, the mint polygon sits exactly on top of the amber one — so any inset of mint inside amber is the plugin actively reducing that problem.
08 · Section consistency

Stable loops shouldn't read as inconsistent.

Long loops, drones and steady mixes used to read as inconsistent because small frame-to-frame jitter inflated the cross-window dispersion measure. Three layered fixes turned the indicator into a reliable signal.

[Figure 7 plot: dispersion (0.00 to 1.00) across 5 s windows for a stable loop and a verse / chorus take, comparing IQR (P75 − P25) with P90 − P10 against a trigger level of 0.55.]
Fig. 7 — IQR (P75 − P25) vs P90 − P10 across rolling windows. The IQR draws a tighter, more honest line on a stable loop, while still flaring on a real verse / chorus split.
  • Drop the first 5-second window as warmup — upstream EMA convergence and ring-buffer fill no longer corrupt the dispersion calculation.
  • Use the inter-quartile range (P75 − P25) as the robust spread instead of P90 − P10 — edge outliers no longer flag a stable loop as inconsistent.
  • Reset the analyzers on the rising edge into Listening, so cross-instance bias from a previously-loaded plugin instance cannot leak into the current take.
  • When inconsistency is genuine, positive EQ boosts and softclip are damped via a smooth curve. Safety cuts are never damped.
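
The first two fixes can be illustrated with a short sketch. The linear-interpolated percentile is one common definition and is an assumption here, as are the per-window values, which were invented to show a stable loop with two edge outliers.

```python
def percentile(sorted_values, q):
    """Linear-interpolated percentile of pre-sorted data, q in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def window_spreads(window_values, warmup_windows=1):
    """Return (IQR, P90 - P10) after dropping the warmup window(s)."""
    usable = sorted(window_values[warmup_windows:])
    iqr = percentile(usable, 0.75) - percentile(usable, 0.25)
    wide = percentile(usable, 0.90) - percentile(usable, 0.10)
    return iqr, wide


# Hypothetical per-window metric: warmup window first, then a steady loop
# with one low and one high outlier.
loop_windows = [0.90, 0.50, 0.49, 0.51, 0.50, 0.10, 0.50, 0.49, 0.51, 0.95, 0.50]
loop_iqr, loop_wide = window_spreads(loop_windows)
```

On this data the IQR stays around 0.015 while P90 − P10 is pulled up to roughly 0.10 by the two outliers, which is the behaviour the switch to the inter-quartile range is meant to avoid.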
09 · Determinism

Local, deterministic, inspectable.

Every stage of the pipeline runs locally on the host CPU. Every output of every stage is reproducible bit-for-bit given the same input. No generative model is invoked at any point.

No network

The engine performs no remote call at any point. The audio it processes never leaves the host machine.

No LLM

Insight prose is rendered from a static phrasebook indexed by reason codes. Same trace, same prose — including which alternate phrasing is selected, picked deterministically from the run's identifier.
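
Deterministic selection of an alternate phrasing can be sketched with a stable hash. The use of SHA-256, the key layout and the function name are assumptions; the point of the sketch is that Python's built-in `hash()` is salted per process for strings, so a cryptographic digest (or any fixed hash) is needed for byte-identical output across runs.

```python
import hashlib


def pick_phrase(alternates, run_id, reason_code):
    """Pick one alternate phrasing, stable for a given run id and reason code."""
    if not alternates:
        raise ValueError("alternates must not be empty")
    key = f"{run_id}:{reason_code}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # The first 8 bytes interpreted as an unsigned integer give a stable index.
    index = int.from_bytes(digest[:8], "big") % len(alternates)
    return alternates[index]


# Hypothetical phrase alternates for one reason code.
mud_phrases = [
    "Low-mid build-up was reduced.",
    "The low mids were cleaned up.",
    "Some low-mid weight was trimmed.",
]
chosen = pick_phrase(mud_phrases, "run-0001", "MUD_CONTRAST_PERSISTENT")
```

Calling `pick_phrase` again with the same run identifier and reason code returns the same sentence, on any machine and in any process.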

Lock-free audio thread

Every analyzer is allocation-free on the audio thread. Capture buffers are pre-reserved. The listening pass never blocks the host's processBlock.

Reproducible commits

Given the same captured frame buffer and the same destination, the resolved parameters and the decision trace are byte-identical across runs. The destination feeds in at decide-time, not at listen-time, so swapping destination on the same capture re-resolves deterministically against the same TrackKnowledge.

Auditable trace

Every move and every skip is logged with a typed reason code. The trace is what the test suite asserts against, and what the user sees in the analysis modal.

Replayable snapshots

TrackKnowledge is serialized into both project state and user presets, so a saved master can re-resolve against a different destination on a future session without a fresh listening pass — the same captured snapshot, deterministically reused.

10 · Boundaries

What we deliberately do not do.

Mastering tools collect a long list of features that sound impressive in marketing but quietly degrade the signal. This is what is intentionally out of scope for the v1 engine.

  • No source separation
    We do not split a stereo master into stems. The whole point of mastering is that it is the last stage on the bus.
  • No tempo or key detection
    Neither informs any mastering decision. They are not measured.
  • No 'AI generation'
    No generative model writes audio, parameters or prose. Every value the user sees came out of a function whose inputs and outputs are typed.
  • No hidden cloud step
    No part of the pipeline calls a remote service for analysis, decisions or rendering.
  • No reference-track matching
    Targets come from destination semantics and from the track itself, not from a third-party file the user uploaded. We do not pretend to copy a hit record onto a different song.
11 · Glossary

Working definitions.

TrackKnowledge
The typed snapshot the recipe reads. Loudness scalars, eight-band deltas, severities (P75), contrast heuristics, capture metadata.
Weighted P75
The 75th percentile of the per-frame value, weighted by frame loudness × frame tonality × position-in-take. Quiet intros, ringouts and crossfades collapse to near-zero weight.
Adaptive band delta
Measured band level minus an expected curve made from base + destination + tilt + density + LRA. A delta inside the ±1.5 dB dead zone counts as balanced.
Envelope
Per-destination tuple of LUFS target, ceiling, character index, softclip cap and slope, aggression, manual-gain cap and total-EQ cap.
Planner
A pure function from (TrackKnowledge, Envelope) to a typed POD. Three planners ship: Loudness, HighFrequency and StereoSafety.
DecisionTrace
The structured log of every recipe move and every recorded skip. Reason codes, parameter ids, before / after values, confidences, source signals.
Reason code
A typed enum value identifying why a decision was taken — survives schema bumps, tests, exports.
EQ coherence pass
A post-process over the four EQ slots that enforces five rules: adjacent-cut damping, low-weight preservation, presence / air stack guard, sibilance guard and a destination-limited total-EQ-move cap.
Closing

Built to be measured, not believed.

If something in this paper looked like a claim worth testing, that is the point. The engine ships with the trace turned on by default, the report visible in the modal and the baseline restorable on every control.

© Transientik Labs — methodology paper, public summary. Internal thresholds, formulas and constants are deliberately not disclosed.