Chris Meets Music Interview

Ed Newton-Rex: I Stand In The Library

Published on Monday 7 August 2023

Live From London is a digital festival originally created by the VOCES8 Foundation during the pandemic. It returns once again this weekend, presenting nine concerts that together feature six centuries of music, filmed in locations around the world.

Among the performances going live on Saturday is the premiere of ‘I Stand In The Library’, a new piece by composer and entrepreneur Ed Newton-Rex which includes words generated by artificial intelligence. Ed has been working in the generative AI space for more than a decade, first as founder of the company Jukedeck and now as VP Audio at Stability AI.

With generative AI becoming such a big talking point in the last year, I wanted to speak to Ed about what the technology is currently capable of, how he used it as part of the creative process for ‘I Stand In The Library’, and how the AI-generated words impacted the music he composed.

CC: Tell us about the process you went through in creating this piece.
ENR: I generated the text first, using OpenAI’s GPT-3. This was the state-of-the-art large language model at the time.

Once I had that, I wrote the music over a couple of months last year, in the manner I always do – sitting at the piano.

CC: Where did the idea come from to use AI to create the lyrics in this way?
ENR: In early 2022 I was looking for a new text to set to music.

I’ve been working in generative AI since 2010, so – when OpenAI’s GPT-3 was released to initial testers – I heard about it and signed up for early access.

In the early days lots of people were sharing poetry it had written them, some of which was surprisingly good – so it wasn’t a great leap to think perhaps I could use it to write a text for my piece.

CC: How much prompting of the AI and tweaking of the text did you do?
ENR: The initial prompt I wrote was “below is a poem about music and solitude”. That was it.

When I hit ‘enter’, it immediately came out with the line “I stand in the library, where a voice soars”, which I loved.

It wrote a few more lines, then paused in the middle of a sentence, as it had reached its token limit – ie its output length. I pressed ‘continue’ and it wrote some more.

I kept doing this, every so often telling it to rewrite something it had just written – I probably asked for a rewrite around ten times. So there was definitely some curation on my part – but it was pretty minimal.
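The continue-until-done workflow Ed describes can be sketched in code. This is a minimal illustration, not his actual process – he worked interactively in OpenAI's GPT-3 interface. The `generate` callable here is a hypothetical stand-in for a language-model completion call, passed in as a parameter so the continuation loop is self-contained:

```python
def continue_until_done(generate, prompt, max_rounds=10):
    """Feed the accumulated text back to the model until it returns
    nothing more, mimicking repeatedly pressing 'continue' when the
    model stops mid-sentence at its token limit."""
    text = prompt
    for _ in range(max_rounds):
        chunk = generate(text)
        if not chunk:  # model has nothing further to add
            break
        text += chunk
    return text

# Example with a stub "model" that emits two fixed chunks, then stops.
chunks = iter(["\nI stand in the library,", " where a voice soars", ""])
poem = continue_until_done(
    lambda text: next(chunks),
    "Below is a poem about music and solitude.",
)
```

In a real session the human stays in the loop, as Ed did – inspecting each chunk and occasionally asking for a rewrite rather than always appending.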

Before long, I had a text that felt complete, so I divided it into stanzas and got onto writing the music.

CC: How would you describe the musical element of the piece?
ENR: The generated text directly influenced the music in a major way.

The mention of a piano in the first stanza led me to write for choir and piano, something I haven’t done much before – I tend to write for unaccompanied choir.

The opening piano motif is meant to represent a half-remembered melody from the protagonist’s childhood, and the style of the piano part remains florid and melodic for most of the piece, often using only one hand.

This contrasts with the manner in which the piano is usually used when paired with choir, which tends towards more of an accompanying function. The choir and the piano in this piece are equal partners.

The choral part itself is in large part homophonic, making lots of use of clustered notes, which is a style I like to write in. But the text influenced this, too.

Reading the poem through, I was reminded of a poem by Christopher Smart, ‘Jubilate Agno’, which Benjamin Britten set to music in ‘Rejoice In The Lamb’.

Smart wrote it in a psychiatric hospital and this shows, with references to cats, mice and other oddities you don’t generally find in the choral canon. The AI-generated output had traces of this, I thought, particularly the final stanza, which begins “I am at war with the world”.

So I loosely modelled the piece on Britten’s work, inserting standalone solo numbers in the middle as he did.

CC: How did your tie up with VOCES8 come about?
ENR: I’ve known Barney Smith and VOCES8, the vocal ensemble he co-founded, for a while. They recorded a piece of mine, ‘This Marriage’, to commemorate the royal wedding in 2011.

When I’d finished ‘I Stand In The Library’, I sent the score to Barney and he said he thought it would be perfect for their Live From London series.

CC: You’ve been working with generative AI for quite some time – talk us through your involvement with this technology, past and present.
ENR: At university, I became fascinated with the question of why no one had successfully taught a computer to write music, and what would be possible if someone did.

So in 2010 I decided to work on the problem and founded Jukedeck, the first start-up in the AI music composition space. We did a fair amount to bring AI composition into the public consciousness, including having over a million pieces of music created using our system, putting on a big AI concert in Seoul, and winning a Cannes Innovation Lion.

We were acquired by Bytedance in 2019. I now work at another generative AI start-up, Stability AI, where I run our efforts in audio, which includes music.

CC: Why do you think generative AI – including music-making AI – has become such a big talking point in the last year?
ENR: Simply because the technology has suddenly come on so much. In 2022, almost overnight, people found themselves able to create any image they wanted just by describing it, using tools like Stable Diffusion and Midjourney. The same happened with text creation with ChatGPT this year.

There’s no reason to think the same won’t happen in music – and, sure enough, there have been some major advances already this year, with both Google and Meta demoing impressive music generation capabilities.

Then of course AI voice-swapping, the tech behind the ‘AI Drake’ songs and much more, came along.

While I wouldn’t call that generative AI, as it’s just taking a recorded vocal and converting it to the style of another singer, the very fact that we can now create ‘fake’ songs has made a lot of people realise that we may be closer to full-blown generative songs than was previously thought.

CC: Could generative AI have composed the musical element to this piece? If not, will it be able to in the future?
ENR: Not with current tech. Most state-of-the-art AI music generation systems today create short passages of audio – perhaps fifteen to sixty seconds – and tend to do much better at instrumental music than vocal music.

In terms of programs that create symbolic music – ie that compose scores, as opposed to producing audio – they work quite well for short scores in the style of specific composers, particularly for a single instrument like the piano, but the output tends to be very generic.

There is nothing yet that comes close to creating a structured fifteen-minute piece, setting text effectively, and creating something musically novel in the process – which is hopefully something I’ve managed with this piece, though that’s for the listener to decide.

But will it be able to in future? Absolutely – or at least it’s highly likely. In other modalities, like image and text, the speed with which generative AI systems have recently improved and achieved things that two years ago seemed impossible is astounding.

We should expect the same to happen in music.

CC: To what extent could generative AI replace human creators in the future?
ENR: It’s hard to say. While I used AI to generate a text here, it’s a one-off for me – I’m not going to use AI for all my future texts. Using AI here is about artistic exploration, rather than replacing poets.

But it’s hard to look at these systems and not conclude that there will be some impact on human creators. Exactly what that looks like – the balance between AI being used as a creative tool, as I’ve used it here, and being used as a replacement technology – remains to be seen.

CC: Other than your own projects, what other innovative ways have you seen creators use AI as a tool in their creative processes?
ENR: Back in 2020 I was one of the judges for the first international AI Song Contest.

There was one entry that really stood out – a musician who had traded phrases with an AI, passing the music back and forth between human and machine until the results were nothing like what they’d started with.

I really like this idea that AI can be used as a creative companion to lead us in directions we wouldn’t have taken otherwise.

CC: Tell us a little more about Live From London – what is it and how did you get involved with it?
ENR: Live From London is a digital festival.

It was set up by the VOCES8 Foundation as a response to the pandemic – initially as a way for concerts to continue in a remote-only world, allowing performers to generate income and audiences to see great music. Performances are streamed and online audience admission is ticketed.

Even though the pandemic is over, it goes from strength to strength – there have been ten festivals to date, with over 100 concerts and 250,000 tickets sold. I got involved through this piece – when I sent the score to Barney, he thought it would work well for the festival, and I’m thrilled that this will be the setting for its première.

Live From London begins on 12 Aug with a series of online concerts available on-demand. Ed’s piece is part of the concert called ‘To Sing Of Love’ from the VOCES8 Foundation Choir & Orchestra.