2 posts tagged with "CLIP"

CLIP Challenge Dataset

· 4 min read
Gerardo Roa
Cadenza Team Member
Trevor Cox
Cadenza Team Member

People with hearing loss can have difficulty hearing lyrics clearly and effortlessly. In speech technology, metrics that automatically evaluate intelligibility have driven improvements in speech enhancement through machine learning. We want to do the same for music lyrics. We are busy working on the infrastructure for the Cadenza Lyric Intelligibility Prediction Challenge (CLIP), which will launch at the start of September.

Lyric intelligibility prediction is an understudied topic, with only a couple of studies available. In speech, we're seeing advances driven by large language models, but the equivalent is not available for music. One reason for this is the lack of data in which audio is paired with listener intelligibility scores. This is the gap our new CLIP1 database will address.

Cadenza Lyric Intelligibility Prediction Challenge (CLIP)

· 2 min read
Gerardo Roa
Cadenza Team Member
Trevor Cox
Cadenza Team Member

Dear colleague,

It gives us great pleasure to pre-announce the next Cadenza Challenge for music processing and hearing difference. This autumn we will be running the Cadenza Lyric Intelligibility Prediction Challenge (CLIP). We're hoping this will be accepted as an ICASSP 2026 Grand Challenge.

The Challenge

To develop better music processing through machine learning, we need a reliable way to automatically evaluate the audio. For music with lyrics, we need a metric to evaluate the intelligibility of the sung words. The metric would come from a predictive model that takes audio as input and estimates the lyric intelligibility score that someone would achieve in a listening test.
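To make the idea in the paragraph above concrete, here is a minimal Python sketch of how a listening-test score and a predictor's error might be computed. It is an illustration under stated assumptions, not the challenge's official scoring: we assume the intelligibility score is the fraction of reference lyric words that a listener reports correctly, and that predictions are compared with measured scores using root-mean-square error. The function names `word_correct_rate` and `rmse` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_correct_rate(reference, response):
    """Fraction of reference words that also appear in the listener response.

    Assumed scoring rule (not from the challenge): an order-insensitive
    match where each reference word can be matched at most once.
    """
    ref_counts = Counter(tokenize(reference))
    resp_counts = Counter(tokenize(response))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(n, resp_counts[word]) for word, n in ref_counts.items())
    return matched / total


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured scores."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))
```

In this sketch, a predictive model would output one score between 0 and 1 per audio excerpt, and `rmse` would summarise how far those predictions fall from the scores listeners actually achieved.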

With the development of large language models and foundation models for speech and music, there is great potential to significantly improve on the current state of the art. The music will be in genres such as pop and rock. Some of it will be presented as-is; the rest will be passed through a hearing loss simulator to mimic listeners with hearing loss who are not wearing hearing aids.