ICASSP 2026 Cadenza Challenge: Predicting Lyric Intelligibility
The full release will be on the 1st of September.
Last update: 2025-08-18
Overview
Understanding the lyrics in music is key to music enjoyment [1]. However, people with hearing loss can have difficulty hearing lyrics clearly and effortlessly [2]. In speech technology, metrics that automatically evaluate intelligibility have driven improvements in speech enhancement. We want to do the same for music with lyrics.
Challenge entrants will be given thousands of audio segments of accompanied singing; their task is to predict the word correct rate measured in perceptual experiments, where normal-hearing listeners were asked to write down the lyrics they heard. All audio will be presented both as-is and processed through a hearing loss simulator for mild and moderate hearing loss, so diverse hearing characteristics are incorporated into the challenge.
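To make the prediction target concrete, here is a minimal sketch of how a word correct rate could be computed from a listener's transcription and the reference lyrics, using only the Python standard library. The challenge's official scoring rules (text normalisation, handling of punctuation and contractions, etc.) may differ, so treat this as illustrative only.

```python
from difflib import SequenceMatcher

def word_correct_rate(reference: str, transcription: str) -> float:
    """Fraction of reference words the listener reported correctly.

    Illustrative only: the challenge's official scoring (normalisation,
    punctuation handling, etc.) may differ from this sketch.
    """
    ref = reference.lower().split()
    hyp = transcription.lower().split()
    if not ref:
        return 0.0
    # Align the two word sequences and count matched words.
    matcher = SequenceMatcher(None, ref, hyp)
    hits = sum(block.size for block in matcher.get_matching_blocks())
    return hits / len(ref)

print(word_correct_rate("in my life I love you more",
                        "in my life I loved you"))  # ~0.714 (5 of 7 words)
```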
This challenge will:
- Develop intelligibility metrics that better reflect human perception of sung lyrics in popular Western music.
- Develop new knowledge about the intelligibility differences between spoken and sung language and intonation.
- Catalyse future work into enhancing lyric intelligibility to: (i) improve accessibility of music for listeners with hearing loss, and (ii) improve health and well-being.
What is Lyric Intelligibility?
Our sensory panel of hearing aid users with hearing loss describes lyric intelligibility as "how clearly and effortlessly the words in the music can be heard." Some songs are intrinsically less intelligible than others. Factors that can affect intelligibility include [1]:
- Vocal style and articulation.
- Song genre.
- Mixing and production techniques.
- Listener hearing ability.
Our dataset covers all these factors:
- It includes different singing styles, from rap to rock.
- It includes songs from a wide range of genres.
- It includes various mixes, some with the vocals prominent and others where the vocals are masked by the accompaniment.
- All samples are presented both as-is and processed through a hearing loss simulator to model different listener abilities (a crude illustration of audiogram-style processing follows the audio examples below).
[Audio examples: No Loss | Mild | Moderate]
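For intuition about what the simulator does, here is a crude, hedged sketch of audiogram-style processing: frequency-dependent attenuation applied in the frequency domain. Real hearing loss simulators, including the one used in this challenge, also model effects such as loudness recruitment and reduced frequency selectivity, and the audiogram values below are invented for the example.

```python
import numpy as np

def apply_audiogram(signal: np.ndarray, fs: int,
                    audiogram_freqs_hz, audiogram_loss_db) -> np.ndarray:
    """Attenuate a mono signal according to an audiogram (illustrative only).

    Real simulators (e.g. the one used in this challenge) also model loudness
    recruitment and spectral smearing; this sketch applies pure
    frequency-dependent gain.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    # Interpolate hearing loss (dB) across the spectrum, convert to gain.
    loss_db = np.interp(freqs, audiogram_freqs_hz, audiogram_loss_db)
    gain = 10.0 ** (-loss_db / 20.0)
    return np.fft.irfft(spectrum * gain, n=len(signal))

# Example: a made-up mild, sloping loss that worsens at high frequencies.
fs = 44100
t = np.arange(fs) / fs
music = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 4000 * t)
simulated = apply_audiogram(music, fs,
                            [250, 1000, 2000, 4000, 8000],
                            [10, 15, 20, 30, 40])
```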
Learning from Speech Intelligibility Prediction
Speech intelligibility prediction is an established area of research, and many different algorithms have been developed. In contrast, metrics for lyrics are rare (e.g. singing-adapted STOI [3]). Consequently, many techniques from speech could be adapted to music to create novel research. For instance, foundation models have made blind (non-intrusive) speech intelligibility metrics much more accurate; can they be adapted for music? Current speech metrics are unreliable for music because spoken and sung language and intonation differ. Also, sung vocals are typically embedded in a musical accompaniment, which has different characteristics from the independent noise background that speech metrics are designed to account for. These will pose interesting problems for challenge entrants to overcome.
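As one example of borrowing from speech technology, here is a hedged sketch of a naive non-intrusive estimate: transcribe the mixture with a pretrained ASR model and score the transcript against the reference lyrics using the word correct rate function from the earlier sketch. This assumes the openai-whisper package and a hypothetical file name; it is a starting point for discussion, not the challenge baseline, and ASR models trained on speech are known to degrade on sung vocals.

```python
import whisper  # pip install openai-whisper (assumption: not the official baseline)

def predict_intelligibility(audio_path: str, reference_lyrics: str) -> float:
    """Naive non-intrusive estimate: ASR transcript scored against the lyrics.

    Speech-trained ASR degrades on sung vocals, so this will correlate only
    loosely with human word correct rates; closing that gap is exactly what
    the challenge asks entrants to do.
    """
    model = whisper.load_model("base")
    transcript = model.transcribe(audio_path)["text"]
    # word_correct_rate is the alignment-based sketch defined earlier.
    return word_correct_rate(reference_lyrics, transcript)

# score = predict_intelligibility("segment_0001.wav",  # hypothetical file name
#                                 "in my life I love you more")
```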
What will be provided?
- A novel dataset of song excerpts paired with lyric intelligibility scores from listening tests.
- Software and a baseline system.
- A leaderboard via Eval.AI to allow entrants to track progress.
Some song excerpts will be passed through a hearing loss simulator, but entrants can take the audio as provided; no knowledge of hearing loss modelling is needed to take part in the challenge.
ICASSP
The top teams will be invited to submit papers for presentation at ICASSP 2026, 4-8 May 2026, Barcelona, Spain.
Expressing interest
To register and express your interest, please fill out the registration form, and sign up to our Google group for alerts and discussions about the challenges.
References
[1] Fine, P. A. and Ginsborg, J., 2014. Making myself understood: perceived factors affecting the intelligibility of sung text. Frontiers in Psychology, 5, 809.
[2] Greasley, A., Crook, H. and Fulford, R., 2020. Music listening and hearing aids: perspectives from audiologists and their patients. International Journal of Audiology, 59(9), pp.694-706.
[3] Sharma, B. and Wang, Y., 2019. Automatic evaluation of song intelligibility using singing adapted STOI and vocal-specific features. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, pp.319-331.