Testing AIVA and Suno: Bridging the Gap Between Sound and Emotion

Hands-on evaluation of AIVA and Suno using my own music: I created a David Bowie-inspired ambient composition in AIVA (parametric/symbolic generation) and a Norah Jones-style reinterpretation of my original song in Suno (text-to-audio synthesis), then used Moises for stem separation and iZotope Ozone for AI mastering to prep a studio demo. Both generation tools excel at ideation but fail at expressive nuance, revealing the semantic gap between surface-level sound and genuine musical meaning. The process exposes how dataset bias kills timbral depth and emotional realism. This analysis bridges technical concepts (MIR, spectral masking, latent representations) with practical workflow decisions: identifying where AI tools genuinely enhance production.
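
To make the "semantic gap" claim concrete, here is a minimal sketch (not part of the original workflow) of the kind of MIR comparison the analysis alludes to: pulling a few timbral descriptors from two generated renders with librosa. The file names are hypothetical stand-ins for the AIVA and Suno exports; low centroid variance and a narrow RMS range are crude proxies for the flat timbre described above.

```python
# Minimal timbral comparison of two generated renders (hypothetical file names).
# Spectral centroid tracks "brightness"; spectral flatness separates noise-like
# from tonal content; the RMS range is a rough stand-in for dynamic variation.
import numpy as np
import librosa

def timbral_summary(path):
    y, sr = librosa.load(path, sr=None, mono=True)
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)[0]
    flatness = librosa.feature.spectral_flatness(y=y)[0]
    rms_db = 20 * np.log10(librosa.feature.rms(y=y)[0] + 1e-9)
    return {
        "centroid_mean_hz": float(np.mean(centroid)),
        "centroid_std_hz": float(np.std(centroid)),   # low std ~ static timbre
        "flatness_mean": float(np.mean(flatness)),
        "rms_range_db": float(rms_db.max() - rms_db.min()),
    }

for name in ["aiva_ambient_render.wav", "suno_ballad_render.wav"]:  # hypothetical exports
    print(name, timbral_summary(name))
```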

Phil Conil - Berklee College of Music

Testing AI Mixing Tools: From 1930s Field Recordings to Bon Iver

I stress-tested iZotope Neutron (AI mixing) and Ozone (AI mastering) using my own tracks alongside intentionally weird references: 1930s Alan Lomax field recordings, robot voices, Bon Iver's compressed modern sound, and Bob Dylan's vintage open production. This revealed two distinct product design philosophies - supervisory control (Ozone: AI suggests, you refine) versus automated control (LANDR: AI decides for speed). The results show how reference-matching algorithms can nail one context and completely miss another. When AI tries to apply external sonic profiles without understanding the full mix, it breaks. The core insight: great AI music tools need different control levels for different users, and how algorithms apply their training matters as much as what they learned.
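
For readers who want to see why blind reference matching breaks, here is a deliberately naive sketch of the idea, not iZotope's or LANDR's actual algorithm: average the magnitude spectrum of a reference and of a target mix, then derive a per-band gain curve that pushes the target toward the reference profile. File names are hypothetical. Even this toy version exposes the failure mode: the curve only knows long-term spectral averages, not which instruments or production choices produced them, so a 1930s field recording and a Bon Iver master prescribe wildly different "corrections" for the same mix.

```python
# Naive spectral reference matching (hypothetical files): compare long-term
# average spectra and compute a per-band gain curve toward the reference.
import numpy as np
import librosa

N_FFT = 4096

def average_spectrum(path):
    y, sr = librosa.load(path, sr=44100, mono=True)
    S = np.abs(librosa.stft(y, n_fft=N_FFT))
    return S.mean(axis=1), sr  # long-term average magnitude per frequency bin

ref_spec, sr = average_spectrum("reference_track.wav")  # e.g. a Bon Iver or Dylan cut
tgt_spec, _ = average_spectrum("my_rough_mix.wav")      # hypothetical rough mix

eps = 1e-9
gain_db = 20 * np.log10((ref_spec + eps) / (tgt_spec + eps))
gain_db = np.clip(np.convolve(gain_db, np.ones(9) / 9, mode="same"), -12, 12)  # smooth, clamp

freqs = librosa.fft_frequencies(sr=sr, n_fft=N_FFT)
for f, g in zip(freqs[::256], gain_db[::256]):
    print(f"{f:8.0f} Hz -> {g:+5.1f} dB")
```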

Phil Conil - Berklee College of Music

The Training Data Problem: Why AI Gets Norah Jones Right but Fails on Nina Simone

I tested Moises AI stem separation on two piano-driven recordings: a 2016 Norah Jones studio session and a 1962 Nina Simone live performance captured in a single room. The contrast exposes a fundamental constraint in deep learning for audio - AI only separates what it's been trained on. Contemporary studio recordings with standard instrumentation? Near-perfect isolation. Vintage live recordings from single-room acoustic spaces? Bass and piano merge into mush. This analysis examines how dataset bias (DSD100), artifact generation, and bottom-up learning shape performance: even sophisticated neural networks can't separate sources they've never experienced.
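
As a point of reference for how mask-based separation works at all, here is a tiny sketch using librosa's harmonic/percussive separation, which applies soft masks to the spectrogram. Moises' learned models predict far richer masks, and this is not their implementation; the input file name is hypothetical. The principle, and the limit, are the same: when two sources occupy the same time-frequency cells, as an upright bass and a piano's left hand can in a single-room recording, no mask can cleanly pull them apart.

```python
# Mask-based separation in miniature: soft harmonic/percussive masks over an STFT.
import librosa
import soundfile as sf

y, sr = librosa.load("nina_simone_1962_live.wav", sr=None, mono=True)  # hypothetical file

S = librosa.stft(y)
H, P = librosa.decompose.hpss(S, margin=2.0)  # soft-masked harmonic / percussive components

sf.write("harmonic_part.wav", librosa.istft(H), sr)
sf.write("percussive_part.wav", librosa.istft(P), sr)
```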

Phil Conil - Berklee College of Music

Inside the Filter Bubble: How Streaming Traded Discovery for Comfort

Personalized streaming makes listeners feel understood until it starts to feel predictable. I examine how content-based and collaborative filtering systems, designed to deepen engagement, quietly narrow musical taste over time by rewarding familiarity over discovery. These algorithms create self-reinforcing feedback loops that reduce serendipity and turn active listeners into passive consumers - a shift with real business consequences: passive "lean-back" listeners don't buy tickets or merch, and they don't become lifelong fans. I explore the rise of "algorithm A&Rs" replacing human intuition with data, the cost of passive consumption, and the cultural risk of optimizing for engagement metrics over genuine connection. Spotify's personalization works almost too well: engagement spikes, but something crucial disappears. In the pursuit of comfort, streaming risks losing what makes music matter - the surprise of discovering something you didn't expect to love.
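
A toy example of the mechanism, with made-up play counts rather than any real catalogue: item-item collaborative filtering scores candidates by their similarity to what a user already plays, so each retraining round on the resulting listens narrows the pool further. This is a sketch of the general technique, not Spotify's system.

```python
# Toy item-item collaborative filtering on a tiny play-count matrix.
# Recommendations are driven by similarity to existing plays, so retraining
# on the plays it generates tightens the loop around familiar material.
import numpy as np

tracks = ["indie_folk_a", "indie_folk_b", "ambient_a", "free_jazz_a"]
plays = np.array([            # rows = users, cols = tracks, values = play counts
    [12, 9, 0, 0],
    [10, 11, 1, 0],
    [0, 1, 8, 0],
    [0, 0, 0, 5],
], dtype=float)

# Cosine similarity between track columns.
norms = np.linalg.norm(plays, axis=0, keepdims=True) + 1e-9
sim = (plays / norms).T @ (plays / norms)

user = plays[0]                # a heavy indie-folk listener
scores = sim @ user            # recommendation scores for every track
scores[user > 0] = -np.inf     # hide tracks the user already plays
ranked = [tracks[i] for i in np.argsort(scores)[::-1]]
print(ranked[:2])  # nearest-sounding catalogue first; the free jazz track scores zero
```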

Phil Conil - Berklee College of Music

The Hidden Cost of AI in Music: Sameness

AI mixing and mastering tools have democratized music production for bedroom artists, but there's a cost: algorithmic sameness. When thousands of producers use identical tools trained on the same datasets, optimizing for Spotify's algorithm rather than developing distinctive identities, the result is sonic homogenization. Access to tools doesn't equal artistic vision. I examine how true musical revolutions (punk's rebellion, hip-hop's DIY grit) have come from breaking rules rather than following algorithmic patterns, and why spatial audio might inspire the next creative rebellion.
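
One way to make "sonic homogenization" measurable, sketched here with placeholder file names rather than any real catalogue: summarize each track as a mean MFCC vector and look at how tightly those vectors cluster. Average pairwise similarity creeping toward 1.0 would be the numerical face of the sameness described above.

```python
# Crude homogenization metric over a set of tracks (placeholder file names):
# one mean MFCC vector per track, then the average pairwise cosine similarity.
import numpy as np
import librosa

paths = ["bedroom_pop_01.wav", "bedroom_pop_02.wav", "bedroom_pop_03.wav"]  # placeholders

vectors = []
for p in paths:
    y, sr = librosa.load(p, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    vectors.append(mfcc.mean(axis=1))          # timbral "fingerprint" of the track

X = np.array(vectors)
X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-9)
sim = X @ X.T
upper = sim[np.triu_indices(len(paths), k=1)]  # pairwise similarities, no self-pairs
print(f"average pairwise similarity: {upper.mean():.3f}")
```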

Phil Conil - Berklee College of Music

Data vs. Magic: Where Algorithms End and Music Begins

I explore the fundamental tension in AI music generation: parametric models (MIDI-based, structured control) versus non-parametric models (text-to-audio, spontaneous texture). Parametric systems like AIVA let you define musical building blocks - chords, tempo, key - and mirror traditional composition but feel rigid. Non-parametric tools like Suno feel more intuitive yet often miss the intended emotional mark. Drawing on Brian Eno's generative philosophy, Pharrell Williams' synesthetic production ("make it more purple"), and Herbie Hancock's observation that music "transcends language," I examine what's fundamentally missing: the wordless, intuitive telepathy musicians share when creating in real time. Hybrid systems combining structured logic with deep audio synthesis may offer a path forward, but the deeper question remains whether AI could ever capture what Miles Davis meant by "don't play what's there, play what's not there" - the magic that emerges when musicians stop thinking and the music itself begins to speak.
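
The contrast is easiest to see side by side. Below is a minimal sketch of the parametric side using pretty_midi, an illustration of symbolic control in general rather than AIVA's interface: every musical decision - key, tempo, voicing, duration - is an explicit parameter you set. On the non-parametric side, the whole specification collapses into a single prompt such as "wistful late-night ballad, brushed drums, warm upright piano", traded for control you no longer have.

```python
# Parametric control in practice: a four-bar i-VI-III-VII progression in A minor,
# written as explicit MIDI parameters (illustrative values, not AIVA output).
import pretty_midi

TEMPO = 72  # BPM, explicitly chosen
progression = [          # one chord per bar
    [57, 60, 64],        # A minor  (i)
    [53, 57, 60],        # F major  (VI)
    [48, 52, 55],        # C major  (III)
    [55, 59, 62],        # G major  (VII)
]

pm = pretty_midi.PrettyMIDI(initial_tempo=TEMPO)
piano = pretty_midi.Instrument(program=0)  # acoustic grand

seconds_per_bar = 4 * 60.0 / TEMPO
for bar, chord in enumerate(progression):
    start, end = bar * seconds_per_bar, (bar + 1) * seconds_per_bar
    for pitch in chord:
        piano.notes.append(pretty_midi.Note(velocity=70, pitch=pitch, start=start, end=end))

pm.instruments.append(piano)
pm.write("parametric_sketch.mid")
```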

Phil Conil - Berklee College of Music

What AI Can't Steal: Why Human Mess Makes Better Music

AI can analyze millions of chord progressions, but it will never understand why Charlie Parker works in the mountains and Frank Ocean doesn't - both tied to urban experience, but only one belongs in nature when you hear the birds chirping outside. That combination might never appear in any dataset. I explore what's fundamentally missing from AI training data: the accidents, memories, and cultural collisions that create music people actually care about. Drawing on Brian Eno's decades-long exploration of machines as creative partners (from his 1970s work on Oblique Strategies and Bowie's Low to present-day generative systems), and Bowie's concept of the "grey space" between artist and audience where meaning emerges, I examine how imperfection and unpredictability give music its emotional truth. AI tools can accelerate workflow and enhance technical processes throughout the music lifecycle, but they don't address the deeper question of meaning-making. Music that carries the weight of real life, the texture of imperfection, and the strangeness of lived experience requires insight from people who have actually lived something worth transforming into sound.

Phil Conil - Berklee College of Music

Why AI Music Feels Empty (And How Brain Data Might Fix It)

AI-generated music often sounds technically flawless but emotionally hollow. This analysis explores a promising multimodal research approach: combining EEG brainwave and eye-tracking data with MIR audio analysis to help AI understand what actually moves us when we listen. By measuring physiological responses (brain activity, pupil dilation) alongside musical features (tempo, harmony, timbre), researchers hope to bridge the gap between technical patterns and emotional meaning. I examine the proposed system's potential and its practical limitations - individual variation, expensive data collection, and the artificiality of lab conditions. The approach could make emotion-music connections more transparent, revealing which musical elements trigger physiological responses. But transparency about correlations isn't the same as understanding why. Knowing that certain frequencies trigger reactions won't explain why a Miles Davis B♭ has such power - some mysteries resist quantification.
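
To ground the proposal, here is a minimal sketch of the alignment step with entirely made-up data files: resample a pupil-dilation trace to the audio feature frame rate, then correlate it with frame-level features such as RMS energy and onset strength. The correlation is also exactly the ceiling noted above - it reports which features co-vary with arousal, not why the music moves anyone.

```python
# Align a (hypothetical) pupil-dilation trace with frame-level audio features
# and report Pearson correlations. Illustrative pipeline, not the cited study.
import numpy as np
import librosa
from scipy.stats import pearsonr

y, sr = librosa.load("stimulus_excerpt.wav", sr=22050, mono=True)  # hypothetical stimulus
hop = 512
rms = librosa.feature.rms(y=y, hop_length=hop)[0]
onset_env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop)

pupil_60hz = np.load("pupil_trace.npy")  # hypothetical eye-tracker export at 60 Hz
frame_times = librosa.frames_to_time(np.arange(len(rms)), sr=sr, hop_length=hop)
pupil = np.interp(frame_times, np.arange(len(pupil_60hz)) / 60.0, pupil_60hz)

n = min(len(pupil), len(rms), len(onset_env))
for name, feat in [("rms_energy", rms[:n]), ("onset_strength", onset_env[:n])]:
    r, p = pearsonr(feat, pupil[:n])
    print(f"{name}: r={r:+.2f}, p={p:.3f}")
```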

Phil Conil - Berklee College of Music

What Waveforms Actually Tell You About Music (And Why It Matters for AI)

Using MIR signal processing, I analyze the sonic signatures of two instruments I use in production - the Indian dilruba and Armenian duduk - examining their waveforms, spectrograms, and amplitude envelopes to reveal how physical gestures create acoustic fingerprints. The dilruba's bow catching strings creates sharp transients and sympathetic resonance, while the duduk's breath pressure produces an unusually strong second harmonic. I then compare amplitude envelopes from two New Orleans piano recordings: Jon Batiste's "St. James Infirmary Blues" (sparse phrases shaped by silence) and Professor Longhair's "Big Chief" (constant pulsing groove). This foundational analysis demonstrates how signal processing makes the invisible visible - translating intuitive musical understanding into quantifiable data essential for anyone working at the intersection of audio technology and music.
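
For anyone who wants to reproduce the three views described above, a short librosa sketch (the file name is a hypothetical stand-in for one of the recordings): waveform, dB-scaled spectrogram, and RMS amplitude envelope. Run on a duduk phrase versus a dilruba bow stroke, the transients and harmonic balance discussed above show up directly in the plots.

```python
# Three standard views of one recording: waveform, spectrogram (dB), RMS envelope.
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

y, sr = librosa.load("duduk_phrase.wav", sr=None, mono=True)  # hypothetical file

S_db = librosa.amplitude_to_db(np.abs(librosa.stft(y, n_fft=2048, hop_length=512)), ref=np.max)
rms = librosa.feature.rms(y=y, frame_length=2048, hop_length=512)[0]
times = librosa.frames_to_time(np.arange(len(rms)), sr=sr, hop_length=512)

fig, (ax0, ax1, ax2) = plt.subplots(3, 1, figsize=(10, 8), sharex=True)
librosa.display.waveshow(y, sr=sr, ax=ax0)                                   # waveform
ax0.set_title("waveform")
librosa.display.specshow(S_db, sr=sr, hop_length=512, x_axis="time", y_axis="log", ax=ax1)
ax1.set_title("spectrogram (dB)")
ax2.plot(times, rms)                                                         # amplitude envelope
ax2.set_title("RMS amplitude envelope")
plt.tight_layout()
plt.savefig("instrument_analysis.png")
```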

Phil Conil - Berklee College of Music
