
Sonifying Social Momentum: Building a Data-to-Audio Pipeline with Claude AI

How we turned an economics research paper into a cinematic audio piece using a 5-layer AI sonification pipeline — from PDF tables to MIDI, Max for Live, and Ableton.


What does it sound like when a single event ripples through a population, compounding week after week? That was the question behind our latest experiment at Sombra Audio — and the answer came from an unlikely source: an economics paper about movie ticket sales.

Listen: "Momentum Cascade" — the final sonification piece.

The Source: Social Spillovers in Cinema

The dataset comes from "Something to Talk About: Social Spillovers in Movie Consumption" by Gilchrist and Sands (2016), published in the Journal of Political Economy. The researchers used unexpected weather variation on movie opening weekends as a natural experiment to isolate how word-of-mouth drives ticket sales over time.

Their key finding: a positive weather shock on opening weekend doesn't just fade away. Each weekend of viewers generates its own echo of conversations, which attract more viewers, who generate more conversations. By week 6, a $1 shock to opening-weekend revenue has generated $2.14 in total revenue. The social multiplier compounds invisibly — unless you make it audible.

The Pipeline: 5 Layers from PDF to Sound

We built a reusable sonification pipeline using Claude AI as the orchestrator across five layers:

1. Data Extraction — Python scripts parse the PDF, pulling momentum coefficients, cumulative multipliers, viewership decay curves, and quality-split metrics into clean CSV.
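The extraction step itself is plain Python. A minimal sketch of the table-to-CSV stage is below; the intermediate row values are illustrative placeholders (only the endpoints, a momentum coefficient falling from 1.0 to 0.096 and a cumulative multiplier rising from 1.0 to 2.14, come from the post), and the column names are our own, not the pipeline's actual schema:

```python
import csv
import io

# Hypothetical rows as they might come out of the paper's coefficient table.
# Week 2-5 values are illustrative; the week-1 and week-6 endpoints match
# the figures quoted in this post.
RAW_ROWS = [
    # (week, momentum coefficient, cumulative multiplier)
    (1, 1.000, 1.00),
    (2, 0.480, 1.48),
    (3, 0.300, 1.72),
    (4, 0.190, 1.90),
    (5, 0.130, 2.04),
    (6, 0.096, 2.14),
]

def rows_to_csv(rows):
    """Serialize extracted table rows into the clean CSV the mapper consumes."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["week", "momentum_coef", "cum_multiplier"])
    writer.writerows(rows)
    return buf.getvalue()

csv_text = rows_to_csv(RAW_ROWS)
print(csv_text.splitlines()[0])  # week,momentum_coef,cum_multiplier
```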

2. Data-to-Music Mapping — This is the creative core. Rather than arbitrary aesthetic choices, every mapping is grounded in The Sonification Handbook (Hermann, Hunt & Neuhoff, 2011). Four musical layers each sonify a different data dimension:

| Layer | Data Dimension | Musical Expression |
| --- | --- | --- |
| Pad | Cumulative multiplier (1.0 → 2.14) | Chord density grows from 1 to 5 voices; polyphonic mass accumulates |
| Bass | Momentum coefficient (1.0 → 0.096) | Deep C2 initial shock rising to C3 as echoes weaken, with a sub-octave drone entering at week 3 |
| Echo | Week progression (1 → 6) | Melodic fragments multiply from 1 to 6, spreading through stereo space |
| Pulse | Temporal acceleration | Whole notes to eighth notes; an accelerating heartbeat building urgency |
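As a sketch of how two of these mappings work, the pad and bass layers reduce to simple linear interpolation between the endpoints above (function names here are ours, not the pipeline's actual API):

```python
def lerp(x, x0, x1, y0, y1):
    """Linearly map x from the range [x0, x1] onto [y0, y1]."""
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

def pad_voices(multiplier):
    # Cumulative multiplier 1.0 -> 2.14 maps to 1 -> 5 pad voices.
    return round(lerp(multiplier, 1.0, 2.14, 1, 5))

def bass_note(coef):
    # Momentum coefficient 1.0 -> 0.096 maps C2 (MIDI 36) up to C3 (MIDI 48).
    return round(lerp(coef, 1.0, 0.096, 36, 48))

print(pad_voices(1.0), pad_voices(2.14))  # 1 5
print(bass_note(1.0), bass_note(0.096))   # 36 48
```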

3. Transport — The pipeline generates a multi-track MIDI file (480 PPQN, 48 bars at 72 BPM in C natural minor) with CC automation at 32nd-note resolution. A real-time MIDI CC streamer via macOS IAC Driver provides live parameter control, and an OSC sender offers an alternative for custom setups.
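The timing math behind those transport numbers, assuming 4/4 time, works out cleanly:

```python
PPQN = 480          # pulses per quarter note, as in the pipeline's MIDI file
BPM = 72
BARS = 48
BEATS_PER_BAR = 4   # assuming 4/4

ticks_per_32nd = PPQN // 8                       # 60 ticks between CC points
total_ticks = BARS * BEATS_PER_BAR * PPQN        # 92,160 ticks overall
cc_events = total_ticks // ticks_per_32nd        # automation points per lane
duration_s = BARS * BEATS_PER_BAR * 60 / BPM     # piece length in seconds

print(ticks_per_32nd, cc_events, duration_s)  # 60 1536 160.0
```

So each CC automation lane carries 1,536 points over the 160-second (2:40) piece.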

4. Sound Engine — Ableton Live with Max for Live. A custom JavaScript engine reads transport position and outputs 8 interpolated automation values at 30 Hz, driving filter cutoff, reverb, delay feedback, stereo width, and more.
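The engine itself is JavaScript inside Max for Live, but the core idea, interpolating a breakpoint curve at the current transport position and only emitting changed values, can be sketched in Python (the cutoff curve below is hypothetical):

```python
def automation_value(breakpoints, beat):
    """Linearly interpolate an automation curve at the current transport beat.
    breakpoints: sorted list of (beat, value) pairs."""
    if beat <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (b0, v0), (b1, v1) in zip(breakpoints, breakpoints[1:]):
        if b0 <= beat <= b1:
            t = (beat - b0) / (b1 - b0)
            return v0 + t * (v1 - v0)
    return breakpoints[-1][1]

# Hypothetical filter-cutoff curve: ramp from 0.0 to 1.0 over 96 beats.
# The real engine evaluates 8 such curves at 30 Hz and, like this sketch
# implies, would skip sending a parameter whose value hasn't changed.
cutoff_curve = [(0.0, 0.0), (96.0, 1.0)]
print(automation_value(cutoff_curve, 48.0))  # midpoint of the ramp -> 0.5
```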

5. AI Enhancement — An optional layer for AudioCraft/MusicGen to generate ambient textures informed by the data characteristics. Framework ready, not yet deployed.

Why C Natural Minor at 72 BPM?

Nothing in this piece is arbitrary. C natural minor (Aeolian mode) has no leading tone, so the harmony never fully resolves, mirroring the research finding that social momentum never fully decays within the observation window. At 72 BPM, 48 bars of 4/4 run exactly 160 seconds, giving each of the six week-sections roughly 27 seconds to develop across the 2:40 total duration.

Velocity is data-driven too. The opening shock hits at velocity 100 (coefficient 1.0, highest statistical significance). By week 6, echoes have faded to velocity 40 (coefficient 0.096) — embedding the paper's confidence intervals directly into the performance dynamics.
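The velocity mapping is another straight linear rescale; a minimal sketch, with the function name and keyword defaults being our own:

```python
def coef_to_velocity(coef, lo=0.096, hi=1.0, v_lo=40, v_hi=100):
    """Map the momentum coefficient onto MIDI velocity: the opening shock
    (coef 1.0) hits at velocity 100, the week-6 echo (coef 0.096) at 40."""
    t = (coef - lo) / (hi - lo)
    return round(v_lo + t * (v_hi - v_lo))

print(coef_to_velocity(1.0), coef_to_velocity(0.096))  # 100 40
```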

The Arc: From Shock to Cascade

The piece unfolds in six sections, each mapping directly to a week of the research data:

Week 1 — The Shock. A single pad voice. Deep bass at C2. One isolated descending motif. Whole-note pulse. Sparse and ominous.

Week 2 — First Echo. Second voice enters the pad. Bass rises to Eb2, quieter now. Two melodic fragments start spreading. Half-note pulse stirs.

Weeks 3 through 6 — The Cascade. The pad thickens to five voices. Echo fragments multiply and saturate the stereo field. The pulse accelerates to eighth notes — relentless and urgent. A sub-octave drone adds weight beneath. By the final section, the listener is immersed in the accumulated mass of six weeks of compounding social momentum.

You don't need to know anything about econometrics to feel the build. That's the point.

What Claude AI Actually Did

Claude didn't just write the Python scripts. It operated as a creative collaborator across the entire pipeline:

  • Analyzed the research paper and identified which data dimensions carried the most narrative potential
  • Designed the mapping strategy grounded in sonification theory, suggesting which auditory parameters would best represent each data dimension
  • Generated the MIDI pipeline — pure Python with no external dependencies beyond mido for MIDI I/O
  • Built the Max for Live JavaScript engine with interpolation, change detection, and quadratic fade-in to prevent parameter spikes
  • Iterated on transport solutions when Ableton's MIDI mapping limitations required switching from baked-in CC to external IAC streaming
  • Documented every decision with theoretical justification, creating a reusable playbook for future sonifications

The pipeline is designed to be generalizable. Swap the CSV for climate data, stock prices, epidemiological curves, or neural signals — the 5-layer architecture holds.

Challenges and Lessons

Real-time MIDI CC streaming from Python has timing jitter that's acceptable for effects automation but not for pitch-critical parameters. The Max for Live device needed manual assembly — generating .amxd files programmatically doesn't survive Ableton's drag-and-drop loading. And delay feedback mapped to data density required hard caps at 80/127 to prevent self-oscillation.
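That last guard is a one-line clamp. The real cap lives in the Max for Live device, but the logic amounts to this sketch:

```python
def safe_feedback_cc(value, cap=80):
    """Clamp the data-driven delay-feedback CC to a hard cap (80/127 in our
    mapping) so dense data can never push the delay into self-oscillation."""
    return max(0, min(cap, int(value)))

print(safe_feedback_cc(127), safe_feedback_cc(42))  # 80 42
```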

The biggest lesson: sonification is composition with constraints. The data determines your material, but the mapping decisions — which dimension drives which parameter, what scale, what tempo — are artistic choices that require both theoretical grounding and musical intuition. Claude AI handles the systematic mapping; the human ear makes the final call.

What's Next

The "Momentum Cascade" piece is one data story from one paper. The pipeline is ready for more: the weather U-curve analysis from the same research, quality-split divergence between high and low-rated films, or entirely different datasets. We're also exploring an interactive version where listeners can adjust mapping parameters in real time — remixing the data as a performance instrument.

If you're interested in data-driven sound design or want to explore sonification for your own projects, the pipeline architecture and mapping strategies documented here are a solid starting point. The intersection of data science and audio production is still wide open.


The sonification pipeline was built using Claude AI, Python, Ableton Live, and Max for Live. The source research paper is freely available: Gilchrist & Sands, "Something to Talk About: Social Spillovers in Movie Consumption," Journal of Political Economy, 2016.

Published April 09, 2026