YU News

Student’s Musical DNA Project Cracks the Code Behind Music We Can’t Stop Playing

Benji Morris' project examined everything from chord progressions and background vocals to syllable complexity and repeated lyrical phrases. He even added contextual features, such as whether a song was performed on tour or played recently as a surprise song during a concert.

By Dave DeFusco

For most listeners, music streaming feels simple: press play, hear a song and let an app recommend what comes next. Behind every recommendation, however, is a mountain of data trying to predict what people want to hear. Benji Morris, a student in the Katz School’s M.S. in Data Analytics and Visualization, believes today’s music platforms are only scratching the surface.

His project, “Musical DNA,” aims to build a much deeper understanding of music by analyzing songs through hundreds of characteristics instead of the limited set commonly used by streaming services like Spotify.

“Spotify publishes a dataset on audio descriptors like tempo, valence and energy,” said Morris, who recently presented his work at a forum hosted by the Department of Graduate Computer Science and Engineering. “I looked at it and thought they were interesting but not specific. What do you mean when you’re looking at energy? I really wanted to break it down and get a much broader description of audio features.”

Spotify currently describes songs using a small collection of traits such as danceability, loudness and tempo. Morris found those measurements useful, but too shallow to explain why some songs become massive hits while others do not.

“The focus of the project was not necessarily a recommender system,” said Morris. “It was a prediction model. Could you predict how well a song is going to stream on Spotify?”

To answer that question, Morris built a large data pipeline that treated songs almost like scientific samples. Instead of relying on about 15 or 16 audio measurements, his system analyzed hundreds of features connected to lyrics, harmony, production style, instrumentation, structure and cultural context.

The project examined everything from chord progressions and background vocals to syllable complexity and repeated lyrical phrases. Morris even added contextual features, such as whether a song was performed on tour or played recently as a surprise song during a concert.

“There were features that I put in there that I never would have thought would meaningfully change the way the model predicted,” said Morris. “One of the last features I added was track number, whether a song was track one, track five or whatever. That made the prediction meaningfully closer.”

He also discovered that tour-related information unexpectedly mattered. “I added a tracker for whether a song had been played in the last 15 days as a surprise song,” said Morris. “I remember it improved the prediction somewhat, and I was surprised.”
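As a rough illustration of the kind of contextual feature Morris describes, a 15-day surprise-song flag might be computed as in the Python sketch below. The function name, the input format and the inclusive window boundaries are assumptions made for illustration; the article does not describe how Morris implemented his tracker.

```python
from datetime import date, timedelta


def played_as_surprise_recently(surprise_dates, as_of, window_days=15):
    """Return 1 if the song was a surprise song in the last `window_days` days, else 0.

    `surprise_dates` is a hypothetical list of dates on which the song was
    performed as a surprise song; `as_of` is the date being predicted for.
    """
    window_start = as_of - timedelta(days=window_days)
    # Count only performances inside the window, never ones after `as_of`.
    return int(any(window_start <= d <= as_of for d in surprise_dates))


# Hypothetical dates, not real tour data.
example_dates = [date(2024, 5, 1), date(2024, 6, 10)]
flag_recent = played_as_surprise_recently(example_dates, as_of=date(2024, 6, 20))
flag_stale = played_as_surprise_recently(example_dates, as_of=date(2024, 7, 15))
```

In this sketch, `flag_recent` is 1 because the June 10 performance falls within 15 days of June 20, while `flag_stale` is 0 because no performance falls within 15 days of July 15. A flag like this would become one more column alongside the audio and lyric features fed to a prediction model.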

To test the system, Morris used Taylor Swift as a case study because her catalog offered a rare combination of scale, diversity and streaming data. Her more than 300 songs span country, pop and folk music, while her re-recorded albums created a unique opportunity to study how fans respond to older music released again as something new.

“Some of these re-releases were outperforming like a new album would,” said Morris. “It says that when you’re an artist that big, with a career that long, fans will stream a song they’ve heard for 12 years as if it’s brand-new music.”

One of the project’s biggest findings was that lyrics mattered more than Morris expected, especially for Swift’s audience. “She’s known as being a lyricist before anything else,” he said, “and those were the types of features I commonly would see in the top five.”

Morris describes the project as treating songs “like a genome instead of a flat row of numbers.” In simple terms, that means recognizing music as the result of hundreds of creative decisions shaped by time, culture and audience behavior.

“A hit from the 1960s wouldn’t necessarily resonate the same way today,” said Morris. “Contextually, in the current cultural climate, that becomes such a key point.”

Beyond predicting streams, Morris believes systems like Musical DNA could eventually help artists make creative decisions. His model grouped songs into “archetypes,” such as “The Opener,” “Emotional Core” and “The Pop Hit,” based on shared musical traits.

“That’s where I felt like the project becomes much more meaningful,” said Morris. “Once you can predict how well a song is going to stream, you can start making creative decisions around that.”

In the future, Morris envisions streaming platforms evolving from passive music libraries into active creative advisors capable of helping artists decide which songs should become singles, where collaborations fit best or how albums should be sequenced. He also believes deeper musical analysis could transform how listeners understand their own tastes.

“Spotify Wrapped right now tells you your top songs and artists,” said Morris, “but something like this could tell you that you actually prefer guitar as a primary instrument or songs within a certain tempo range. Once you add more descriptive features, you can give people a much deeper understanding of what they really respond to in music.”
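A listener summary along those lines could, in principle, be derived by aggregating descriptive features over a play history. The Python sketch below is only an illustration under stated assumptions: the `primary_instrument` and `tempo_bpm` field names, the sample plays and the choice of the middle half of tempos as a "preferred range" are all hypothetical and do not come from Morris's project or from Spotify.

```python
from collections import Counter


def taste_profile(plays):
    """Summarize a play history into a favorite instrument and a tempo range.

    `plays` is a hypothetical list of dicts, one per play, each with
    `primary_instrument` and `tempo_bpm` keys.
    """
    if not plays:
        return None
    instruments = Counter(p["primary_instrument"] for p in plays)
    tempos = sorted(p["tempo_bpm"] for p in plays)
    n = len(tempos)
    # Crude "preferred range": roughly the middle half of the sorted tempos.
    low, high = tempos[n // 4], tempos[(3 * n) // 4]
    return {
        "primary_instrument": instruments.most_common(1)[0][0],
        "tempo_range_bpm": (low, high),
    }


# Hypothetical listening history, invented for this example.
sample_plays = [
    {"primary_instrument": "guitar", "tempo_bpm": 90},
    {"primary_instrument": "guitar", "tempo_bpm": 100},
    {"primary_instrument": "piano", "tempo_bpm": 120},
    {"primary_instrument": "guitar", "tempo_bpm": 124},
    {"primary_instrument": "guitar", "tempo_bpm": 128},
    {"primary_instrument": "piano", "tempo_bpm": 170},
]
profile = taste_profile(sample_plays)
```

For the invented history above, the profile reports guitar as the most common primary instrument and a tempo range of 100 to 128 beats per minute. The richer the per-song features, the more specific such a summary could become.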
