Task 1 Headphones
1. Training/development
The main training/development database is the MUSDB18-HQ. MUSDB18-HQ has 86 training songs and 14 validation songs.
You can supplement the training and validation data from the following sources:
- Bach10
- FMA-small
- MedleyDB version 1 and version 2
We leave it to you to decide how to use these as part of the training and validation sets. Note that some songs from MedleyDB are already part of the MUSDB18-HQ training set. For more information on augmenting and supplementing the training data, please see the rules.
2. Evaluation
- We will use the MUSDB18-HQ evaluation set, which is made up of 50 songs.
- You must process all of these songs in full.
- All the music will be used for HAAQI evaluation.
- We will then select a random 10-second sample from some of the pieces of music for listening panel evaluation.
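To make the idea of a random 10-second sample concrete, here is a minimal sketch of picking such an excerpt from a processed song. It is not the organisers' selection procedure; the function name `random_segment`, the 3-minute song length, and the fixed seed are illustrative assumptions. The 44.1 kHz rate matches MUSDB18-HQ audio.

```python
import random


def random_segment(num_samples, sample_rate, duration_s=10.0, rng=None):
    """Return (start, end) sample indices of a random excerpt.

    Hypothetical helper: if the signal is shorter than the requested
    duration, the whole signal is returned.
    """
    if rng is None:
        rng = random.Random()
    seg_len = int(round(duration_s * sample_rate))
    if num_samples <= seg_len:
        return 0, num_samples
    # randrange excludes the upper bound, so +1 allows the last valid start.
    start = rng.randrange(0, num_samples - seg_len + 1)
    return start, start + seg_len


SAMPLE_RATE = 44100                 # MUSDB18-HQ sample rate
song_length = SAMPLE_RATE * 180     # assumed 3-minute song
start, end = random_segment(song_length, SAMPLE_RATE, rng=random.Random(0))
print(f"excerpt: samples {start} to {end}")
```

Because the excerpt position is not known in advance, every part of each song needs to be processed to the same standard.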
3. Data file formats and naming conventions
3.1 Enhanced signals
There are nine output signals generated by the baseline enhancement algorithm:
- Eight enhanced output signals corresponding to the left and right channels of each stem (i.e., as submitted by the challenge entrants)
<Listener ID>/<Song Name>/<Listener ID>_<Song Name>_<Channel>_<Stem>.wav
- One enhanced output signal corresponding to the final remix
<Listener ID>/<Song Name>/<Listener ID>_<Song Name>_remix.wav
Where:
Listener ID
- ID of the listener panel member, e.g., L001 to L100 for initial pseudo-listeners, etc.
Song Name
- Track name from MUSDB18, e.g., One Minute Smile.
Channel
- left or right channel
Stem
- vocals, bass, drums or other
For example, for development listener ID L5011 and development song name One Minute Smile, the enhanced output is:
L5011
└── One Minute Smile
    ├── L5011_Actions - One Minute Smile_left_bass.wav
    ├── L5011_Actions - One Minute Smile_right_bass.wav
    ├── L5011_Actions - One Minute Smile_left_drums.wav
    ├── L5011_Actions - One Minute Smile_right_drums.wav
    ├── L5011_Actions - One Minute Smile_left_other.wav
    ├── L5011_Actions - One Minute Smile_right_other.wav
    ├── L5011_Actions - One Minute Smile_left_vocals.wav
    ├── L5011_Actions - One Minute Smile_right_vocals.wav
    └── L5011_Actions - One Minute Smile_remix.wav
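The naming convention can be sketched as a small helper that lists the nine expected paths for one listener and song. This is an illustration, not part of the baseline code: the function name `enhanced_paths` is hypothetical, the stem names are taken from the example file names, and the sketch applies the pattern literally, using the same song name for the folder and the file prefix.

```python
from pathlib import PurePosixPath

CHANNELS = ("left", "right")
STEMS = ("bass", "drums", "other", "vocals")  # as in the example file names


def enhanced_paths(listener_id, song_name):
    """Build the eight stem paths and the remix path (hypothetical helper)."""
    folder = PurePosixPath(listener_id) / song_name
    prefix = f"{listener_id}_{song_name}"
    paths = [
        folder / f"{prefix}_{channel}_{stem}.wav"
        for stem in STEMS
        for channel in CHANNELS
    ]
    paths.append(folder / f"{prefix}_remix.wav")
    return paths


paths = enhanced_paths("L5011", "Actions - One Minute Smile")
for path in paths:
    print(path)
```

Generating the paths from one function, rather than formatting them by hand in several places, makes it less likely that a stem or channel label is misspelled in a submission.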
3.2 Music metadata
Music metadata is stored in a single JSON file per dataset, with the following format.
[
  {
    "Track Name": "A Classic Education - NightOwl",
    "Genre": "Singer/Songwriter",
    "Source": "MedleyDB",
    "License": "CC BY-NC-SA",
    "Split": "train"
  }
]
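As a minimal sketch of how this metadata might be used, the snippet below parses a record in the format above and lists the tracks that come from a given source. This could help, for example, to spot MedleyDB songs that are already in the training split before adding MedleyDB as supplementary data. The helper name `tracks_from_source` is hypothetical, and the JSON is inlined here only to keep the example self-contained; in practice it would be read from the dataset's metadata file.

```python
import json

# Inline copy of the example record; normally loaded from the metadata file.
metadata_json = """
[
  {
    "Track Name": "A Classic Education - NightOwl",
    "Genre": "Singer/Songwriter",
    "Source": "MedleyDB",
    "License": "CC BY-NC-SA",
    "Split": "train"
  }
]
"""


def tracks_from_source(records, source, split=None):
    """Return track names from `source`, optionally limited to one split."""
    return [
        record["Track Name"]
        for record in records
        if record["Source"] == source
        and (split is None or record["Split"] == split)
    ]


records = json.loads(metadata_json)
medleydb_train = tracks_from_source(records, "MedleyDB", split="train")
print(medleydb_train)
```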