Maybe AI Music Is Bad Actually

Craig Havighurst
11 min read · Apr 30, 2023

or The Sadness of Singing Robots

(cross-posted from https://stringtheories.substack.com/)

If you’re not awestruck and a little bit frightened by Artificial Intelligence right now, I’d suggest you’re not paying attention. While we’ve been taking advantage of various AI tools for years in voice command apps and content recommendation engines, new “generative AI” tools like ChatGPT (for text) and Midjourney (for graphic art) have sent us reeling into an uncertain and unsettling future. We should all be burning with questions about how these applications — which can seemingly produce original, quality content in seconds using simple voice prompts — will change how we work, learn and create.

But I want to talk about music, where the new tools appear to be less famous but more varied and prolific than in the other art forms. After years of waiting, it’s now urgent business to stake out a position on the ethics and boundaries of generative AI music, which I’ll abbreviate as GAIM. Much of the commentary I’ve seen so far has been tepid and agnostic, with an “only time will tell” kind of attitude, and that’s not enough. AI applications in music aren’t entirely useless or inherently evil, but we should start from a framework of skepticism. GAIM isn’t something we need, and I doubt it’s something anyone, creator or consumer, should want. Call me a conservative on this point in the sense William F. Buckley Jr. famously defined: “someone who stands athwart history, yelling STOP!, at a time when no one is inclined to do so.”

That’s certainly not the vibe one gets from some coverage of AI music tools. It’s almost all cheerleading. The top-ranked Google News story on the subject just now says that a service called Amper Music is a “great choice for content creators” while another called Soundful “leverages the power” of AI to generate original soundtrack music for podcasts or games. While this particular story does open with a note suggesting “using AI as a supplemental tool rather than as a replacement for human artists,” it offers not a whiff of warning about the costs of effort-free art-making. Nor does it recognize the stunning mediocrity of the results, a sample of which is here. Other examples of prompt-based music can be found in this paper from Google. I warn you, it’s mind-numbing stuff, but of course it will get more complex and convincing with time.

The AI music apps themselves are making extravagant claims. Here’s the lead copy from one called Boomy: “Make original songs in seconds, even if you’ve never made music before. Submit your songs to streaming platforms and get paid when people listen. Join a global community of artists empowered by Boomy AI.” Think of the hucksterism and delusion in this brief pitch. In reality, users of this platform are not “making songs” but having a sequence of sounds generated for them. Taking the next step, contemplate the hubris and petty greed of submitting electronic ditties generated by a web app to streaming services in hopes of a payout. And in that case, no, you’d not be “joining a community of artists” because prompting generative AI doesn’t make you an artist. This kind of marketing hype should be a reminder that our first ethical obligation in this strange new world is to reaffirm our respect for the labor, the experience, the humility and the humanity inherent in being a musician.

Of course, we’ve had digital tools to generate and manipulate sound for decades. We can map instrument sounds like “clarinet” or “church organ” onto melodies generated by another instrument or a voice. Other tools can generate harmonic backing parts in real time. Computers also can simulate the room acoustics of specific spaces, model various kinds of guitar amplifiers, or sweeten up mixes with simulated mastering. Such tools can be deployed tastefully or tastelessly, but they’re not ethically problematic. Such tools enhance or modify musical content made by people rather than making it for them. Moreover, these techniques became established gradually, and musicians tend to be quite transparent about how they’ve applied them to their recordings. One has to have certain kinds of musical aptitude to apply them competently. Generative AI is different in kind, not degree.

The most obvious concern around GAIM relates to attribution and plagiarism. Some people will pass off AI-made music as their own or as the product of a fictional artist/persona they control. Already, people with apparently nothing better to do have generated “songs” in the style of famous artists (ersatz Nirvana or AC/DC) using those artists’ works as their learning models, and boy are they awful. But they’ll get more convincing with time. We’ll soon see news stories revealing that some track with millions of streams on a service by a mysterious indie “artist” was actually generated by AI, and the media will spin that as “wow, isn’t it cool that we couldn’t tell?” But of course, converting an artist’s work into data and then corporealizing it without permission, or passing off AI works one didn’t compose and execute as original, is flagrantly wrong. Will that stop people from doing it? No, but musicians and producers ought to present a united front in rejecting and condemning it.

A related threat tracks with the fear of deepfake video or audio in the political realm. It’s become possible to convincingly replace the voice of one person with another’s, and recently a clip made the rounds with a remarkable simulation of Jay-Z’s voice performing a song he had nothing to do with. His longtime recording engineer quickly and appropriately sent up a warning that this wasn’t a good direction. Yet at the same time, we have this young woman standing on a TED stage, encouraging people to sing “as me” or ruminating on her own voice being “in a thousand different bands in multiple languages.” While she does make perfunctory nods to intellectual property and permission, she’s seeing the future through a narcissist’s eyes and flagrantly devaluing the meaning of musical performance. She calls voice simulation a “21st century corollary to sampling,” which it absolutely is not. It’s either theft or a weird and creepy kind of musical necrophilia.

But these are the easy cases. What about GAIM that’s presented honestly as what it is? What happens when AI inevitably can fabricate complete works that simulate melody, harmony, timbre and structure so convincingly that they evoke an emotional reaction in large numbers of people? What happens when I hear AI music that I, at some level, enjoy? How one responds depends in part on being prepared, which means seizing on this weird moment in human and technological history to refresh our ethos of music, to ask ourselves what music is for and what it means at a philosophical level. I welcome that, because it’s a conversation few people ever have with themselves in a world where everything can feel like amorphous “content.”

Definitions of music can be contested and esoteric, so I need to be clear about my own to make my argument about AI music meaningful. I put it this way: music is humans moving fellow humans through the esthetic manipulation of sound. “Moving” can mean emotional, intellectual, physical and spiritual. The “fellow human” may encompass a mass audience, a room of several listeners, or only one’s self, because a lot of music is made alone (I should know). But for me, human agency must be included because music’s value is manifested in both the making and the listening. A symphonic score or a recording on a shelf has potential energy, but were those never played again, they’d have no value, and we could say in a meaningful sense that some music had died.

Music is a medium of intentional communication and connection, with agency at two ends of a line. Music’s value derives from a yin-yang duality between the inspiration of the creators and the perception of the listeners. Multiple parties are involved in a transaction of energy, attention and empathy. With AI music, there’s no creator with whom to empathize and thus no energy to exchange. On that creation side, there are two layers, as we know from copyright law: the composition of a work and the audible manifestation of the work (the track), which I dare not call a sound recording, even though that’s how some will want it to be treated for their royalty streams. In real music, composing involves ingenuity and inspiration, while the track or performance implicates a matrix of factors such as arrangement, mix, phrasing and dynamics, all of which are supposed to reflect experience, expertise and labor.

So we’re asked to imagine a world in which certain works of “music” don’t come with any of that. They’re just manifested by a computer via some kind of prompt. What is this good for? Well, we know that passive listening is less meaningful than active listening, and the same logic should apply to creation. Passive creation is an oxymoron, but it’s at the heart of generative AI music. A tree falling in a forest with nobody to hear it does in fact make sound, because sound is an objective form of energy. A robot generating harmonious tunes through a speaker detected by another robot’s microphone does make sound, but it doesn’t make music in any important sense. A robot making music for us isn’t making music as an art form but a simulacrum of music made from math. It doesn’t matter if that sound is superficially groovy, tuneful or engaging. AI music is a category error and an existential sadness.

Another argument against AI music has to do with economics — with value, supply and demand. Even before the digital music revolution, we had an astronomical amount of good music in our world and a robust pipeline for new music, producing more than a busy and devoted person could listen to or understand in a lifetime. Then after 2000 came an unprecedented supply shock that led to tens of thousands of new musical works being added to streaming services every week. No disrespect to any given musical person or the desire to create and play music, but sheer ease of access — for all the good it does — has generated a staggering, unproductive supply of mediocre music in the marketplace. At a time when the industry and creators struggle with realizing their value in the market, we scarcely acknowledge the supply side and our historically unprecedented ratio of musical creators to listeners. Why would anyone try to usher into existence a machine that makes more mediocre music, indeed an infinite amount of it? (There’s a very real chance that AI music factories will decide it’s in their best interest to spawn countless tracks, creating a nightmare for the hardware and business value of the system we rely on to access music.)

Some musicians will counter that AI music will merely assist, enable and empower artists, in much the same way that Photoshop transformed the lives of graphics professionals, without destroying art itself. I’ve heard music producers compare AI to Auto-Tune. But that’s not a fair comparison. It misunderstands something about this moment, and it’s a point of view that devalues, morally and financially, the human beings who are in the music creation and appreciation business. The point of GAIM is not to enhance existing sound or creative impulses but to create, to comprehensively model and mimic human creativity, and thousands of engineers are driving relentlessly at this as if the creation of new music is a “problem” to solve. Think how absurd that sounds. Any time I hear a musician willing generative AI music into the mainstream, I see somebody proposing a kind of suicide pact. Musicians may say they want safeguards, but the designers of these systems, from text to art to audio, don’t understand the self-teaching systems they’re building. The only bulwark against flagrant abuse of the public trust is a music community with strong, widely held ethical principles.

So what would defensible, useful applications for AI music look like?

Already you can tap AI for royalty-free soundtrack music for podcasts and film, which is to say bland and generic moody rhythmic sound that doesn’t need to draw attention to itself. Okay, no harm or foul here, but there IS a more human-centered way to do the same thing. Be very aware of what you’re sacrificing when you fabricate a sonic experience, and be aware of the inherent deception at some level toward the audience of your larger work. I credit the music on my radio show/podcast, and I’d expect the same level of attribution for AI soundtracks. If you think it brings value to your production, and you didn’t make it, then don’t pretend you did.

Now let’s imagine another scenario. The noted ambient music creator/producer Brian Eno works with AI to develop a set of rules and a dataset that produces a continual stream of Eno-esque music for public plazas, malls, hotels, or hold music. Here we’d have a kind of syndication of Eno’s musical brain, implying some kind of control and licensing. This remains legitimate as long as it’s bounded and specific for a set of places and purposes. While I think that’s kind of skeezy and cheap, I can’t see anything wicked about it. Yet what’s to stop some random person from training an algorithm with all of Brian Eno’s music and then generating ambient soundscape music without attribution for their own benefit? We need laws that would impose crushing penalties on such invidious behavior and a social contract that regards that kind of extractive and exploitive misuse of somebody else’s musicality as theft.

Finally, an example that’s probably already happening here in Nashville in the famed writers’ rooms along Music Row. A songwriter asks AI to generate melodic ideas in the key of A with a melancholy mood as a “prompt.” The songwriter listens to the ideas and then works with one as the basis of a song. How do they present this song to their publisher or to artists in the A&R process? It’s a co-written song, but there’s no person out there to claim credit for the musical side of the work. So we need a code of ethics requiring transparency about the song’s origins. Co-writing credit isn’t only about ensuring all creative parties get paid. It’s also about creators acknowledging the boundaries of their contribution to the whole. The only safe path forward is a strong practice against hiding the use of GAIM in the creative path. Will cool music be made by humans in concert with AI? Absolutely. I just insist on knowing that’s what it is, as I know that it was Lennon AND McCartney (and George and Ringo), and I want the human partner in the creation to show the humility required of any ethical artist.

AI is already here, tempting everyone in music with a previously unimaginable menu of cheat codes. It’s incumbent on everyone who makes or cares about music to be intentional, reflective and open about how they use these new tools. We ought to be convening AI ethics conferences and congresses across the creative fields. I don’t see us moving fast enough, and I see too many marketplace players rushing toward something they don’t understand (“it’s just like Photoshop!”) for their own benefit. This is a watershed moment in history, and we’d better stand up for the inherent humanity of musicianship and musical communication, lest we determine by inaction that none of it really mattered at all.

https://stringtheories.substack.com/p/the-sadness-of-singing-robots

Craig Havighurst

Journalist, radio host and speaker in Nashville. Editorial Director for WMOT/Roots Radio. Host of The String.