Task 1 Headphones
Data and baseline code can be downloaded from the download page, according to the challenge timeline.
The main training/development database is the MUSDB18-HQ. MUSDB18-HQ has 86 training songs and 14 validation songs.
You can supplement the training and validation data from the following sources:
- MedleyDB version 1 and version 2
We leave it to you to decide how to use these as part of the training and validation sets. Note that some songs from MedleyDB are already part of the training set in MUSDB18-HQ. For more information on augmenting and supplementing the training data, please see the rules.
- We will use the MUSDB18-HQ evaluation set, which is made up of 50 songs.
- You must process every one of these songs in full.
- All the music will be used for HAAQI evaluation.
- We will then select a random 10-second sample from some of the pieces of music for listening panel evaluation.
3. Data file formats and naming conventions
3.1 Enhanced signals
There are nine output signals generated by the baseline enhancement algorithm:
- Eight enhanced output signals corresponding to the left and right channels of each stem (i.e., as submitted by the challenge entrants)
<Listener ID>/<Song Name>/<Listener ID>_<Song Name>_<Channel>_<Stem>.wav
- One enhanced output signal corresponding to the final remix
<Listener ID>/<Song Name>/<Listener ID>_<Song Name>_remix.wav
Listener ID - ID of the listener panel member, e.g., L001 to L100 for initial
Song Name - Track name from MUSDB18, e.g., One Minute Smile.
Channel - left or right channel
Stem - vocals, bass, drums or other
For example, for development listener ID L5011 and development song name Actions - One Minute Smile, the enhanced output is:
└───One Minute Smile
├───L5011_Actions - One Minute Smile_left_bass.wav
├───L5011_Actions - One Minute Smile_right_bass.wav
├───L5011_Actions - One Minute Smile_left_drums.wav
├───L5011_Actions - One Minute Smile_right_drums.wav
├───L5011_Actions - One Minute Smile_left_other.wav
├───L5011_Actions - One Minute Smile_right_other.wav
├───L5011_Actions - One Minute Smile_left_vocals.wav
├───L5011_Actions - One Minute Smile_right_vocals.wav
└───L5011_Actions - One Minute Smile_remix.wav
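The naming convention above can be sketched in a few lines of Python, which may help when checking that a submission contains all nine files for a given listener and song. This is only an illustrative sketch: the function name enhanced_paths, the root argument, and the lowercase stem and channel labels (taken from the example listing) are assumptions, not part of the baseline code.

```python
from pathlib import Path
from typing import List

# Labels as they appear in the example listing above (assumed, not official).
CHANNELS = ("left", "right")
STEMS = ("bass", "drums", "other", "vocals")


def enhanced_paths(listener_id: str, song_name: str, root: Path = Path(".")) -> List[Path]:
    """Return the nine expected output paths for one listener and one song.

    Hypothetical helper: follows the pattern
    <Listener ID>/<Song Name>/<Listener ID>_<Song Name>_<Channel>_<Stem>.wav
    plus one <Listener ID>_<Song Name>_remix.wav file.
    """
    song_dir = root / listener_id / song_name
    # Eight stem signals: two channels for each of the four stems.
    paths = [
        song_dir / f"{listener_id}_{song_name}_{channel}_{stem}.wav"
        for stem in STEMS
        for channel in CHANNELS
    ]
    # One remix signal.
    paths.append(song_dir / f"{listener_id}_{song_name}_remix.wav")
    return paths


if __name__ == "__main__":
    for path in enhanced_paths("L5011", "Actions - One Minute Smile"):
        print(path)
```

Comparing this list against the files actually written to disk is one simple way to catch a missing stem or a misspelled channel label before submitting.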
3.2 Music metadata
Music data is stored in a single JSON file per dataset with the following format.
"Track Name":"A Classic Education - NightOwl",