We have two major updates from the Cadenza project.
First, we completed the sensory panel study in which our participants with hearing loss developed audio quality scales for use in our listening experiments.
We presented our study findings at the International Conference on Music Perception and Cognition (ICMPC) in Tokyo, Japan, in August and at the Basic Auditory Science (BAS) conference in London in September.
Second, the entrants in the First Cadenza Challenge have now submitted systems they devised to improve the audio quality of various music samples.
These samples have been tailored for the hearing profiles of our listener panel participants and are currently being prepared for our online software.
We will be releasing the main online listening experiment over the coming days!
On 20th September, we ran a webinar for our listener panel. In the first part, we gave a short talk summarising the aims and results of the sensory panel study.
You can watch the recording of this event in the next video.
Our thanks to the challenge entrants for their submissions!
We are now reaching the end of our sensory evaluation study, in which a panel of twelve listeners who use hearing aids
have worked through online music listening tasks and three focus groups to reach a consensus on the important
perceptual attributes of music audio quality.
Starting from 373 unique terms that participants used to describe music audio quality, the panel worked through
a discussion process, as outlined in the image below:
At this stage in the work, we wanted to share the current state of these perceptual attributes and short definitions:
Overall Audio Quality
Perceived audio quality results from judgments of the sound of the music, in relation to a person’s expectations of how the music should ideally sound to them.
Clarity refers to how well you can hear and distinguish between the different instruments and elements within the music.
Harshness refers to an uncomfortable overemphasis of certain parts of the sound. It is most often heard in the treble, resulting in piercing, screechy or sharp sounds.
Distortion can be caused by artefacts that shouldn’t be present, e.g., noise, hiss, pops or crackles. It can also be caused by the pitches sounding wrong. Music with no distortion sounds like an authentic version of what was performed.
Spaciousness refers to how much you feel the music is ‘coloured’ by the performance space, and how much you can hear the reverberations and sense of space.
Treble strength refers to the perceived strength or prominence of sound qualities that are characterised by higher frequencies in the treble range, or similarly, sounds, instruments or voices with higher pitches.
Middle strength refers to the perceived strength or prominence of sound qualities that are characterised by middle frequencies found between the bass and treble ranges, or similarly, sounds, instruments or voices with pitches perceived as being between lower and higher pitches.
Bass strength refers to the perceived strength or prominence of sound qualities that are characterised by lower frequencies in the bass range, or similarly, sounds, instruments or voices with lower pitches.
Frequency balance refers to the perceived balance between treble (or higher pitch) and bass (or lower pitch) sounds.
Our next steps in the perceptual research will involve further data collection and testing of these attributes,
to understand which are the strongest predictors of overall audio quality. This is an important process as enhanced
music signals submitted by challenge entrants will be scored on these attributes by a listening panel,
so we first need to test whether each attribute is necessary.
As always, we would like to express our sincere gratitude and thanks to our sensory panel group for their
incredible commitment, motivation, and contribution to this research!
Here at Cadenza, our sensory evaluation work to define audio quality for hearing-impaired listeners is underway. We want to better understand how hearing-impaired listeners perceive audio quality in music, and to develop quality metrics that our listener panel will then use to rate the systems submitted by challenge entrants. Through careful listening tasks and group discussion, the sensory panel will arrive at a consensus about the important sound quality attributes and how these should be measured. You can find out more about this process on the Sensory Evaluation page.
The first task was an individual elicitation task in which we asked twelve listeners with hearing impairment to provide single-word perceptual terms to describe various music excerpts provided to them. This resulted in hundreds of unique attributes used to describe sound quality, of which 89 were used more than four times (see Fig. 1) and 87 were used two or three times (see Fig. 2). This provided the starting point for the first Focus group which sought to identify the most important attributes and to identify ways of grouping the attributes into perceptual dimensions that could be captured in the quality metric.
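For readers curious about how the elicited terms can be tallied, here is a minimal sketch in Python. It is not the project's analysis code: the word lists are invented for illustration, and the function names `tally_terms` and `bucket_terms` are hypothetical. It only shows one way to count how often each term was used and to split the terms into the two frequency bands described above.

```python
from collections import Counter

# Hypothetical elicitation responses: one list of single-word terms per listener.
# These words are invented for illustration and are NOT the study data.
responses = [
    ["clear", "harsh", "muddy", "bright", "clear"],
    ["clear", "tinny", "harsh", "warm"],
    ["muddy", "clear", "harsh", "clear", "bright"],
]


def tally_terms(responses):
    """Count how often each term appears across all listeners (case-insensitive)."""
    return Counter(term.strip().lower() for terms in responses for term in terms)


def bucket_terms(counts):
    """Split terms into the two frequency bands mentioned in the post.

    Terms used exactly once or exactly four times fall into neither band,
    mirroring the wording of the post rather than filling that gap.
    """
    frequent = sorted(t for t, n in counts.items() if n > 4)
    occasional = sorted(t for t, n in counts.items() if 2 <= n <= 3)
    return frequent, occasional


counts = tally_terms(responses)
frequent, occasional = bucket_terms(counts)
print(len(counts), "unique terms")
print("used more than four times:", frequent)
print("used two or three times:", occasional)
```

In practice, the terms would probably also need spelling variants merged before counting. That step is left out here because the post does not describe how it was handled.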
The outcome of Focus group one was a spatial mapping of the ways in which the panel grouped attributes together, which was used as a starting point for Focus group two (see Fig. 3). The panel then further discussed the meaning and grouping of different terms to reach consensus about the important dimensions. In February, we will carry out a third Focus group to arrive at the final dimensions, how these can be defined, and how they can be rated or scored in the challenges.
We would like to thank our Sensory panel group for their ongoing motivation and commitment to this challenging task!
Welcome to the new Cadenza webpage. We will be using this page to post the latest news about our forthcoming machine learning challenges and workshops, as well as posts discussing the tools and techniques that we are using in our baseline systems.