{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "tC4jbs5yqzYw"
},
"source": [
"# Baseline CAD2 Task2\n",
"\n",
"## Rebalance Classical Ensemble\n",
"\n",
"```{image} ../_static/figures/classical_music.jpeg\n",
":alt: Danish String Quartet Sebastian Manz Klarinette Heidelberger Frühling 2013\n",
":class: bg-primary mb-1\n",
":width: 700px\n",
":align: center\n",
"```\n",
"Image by Port(u*o)s oder Phil Ortenau from Wikimedia Commons"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ffET3AfZFKPt"
},
"source": [
"In a pilot study, we found listeners with hearing loss liked the ability to rebalance the different instruments in an ensemble.\n",
"\n",
"In this second round of Cadenza Challenges, we are presenting a challenge where entrants need to process and rebalance the levesl of the instruments of an ensemble of 2 to 5 instruments.\n",
"\n",
"More details about the challenge can be found on the [Cadenza website](https://cadenzachallenge.org/docs/cadenza2/intro). \n",
"\n",
"This tutorial walks you through the process of running the Rebalance Classical Ensemble baseline using the shell interface."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pajpylpbFud6"
},
"source": [
"## Create the environment\n",
"\n",
"We first need to install the Clarity package. For this, we use the tag version of the challenge, **v0.6.1**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setting the Location of the Project\n",
"\n",
"For convenience, we are setting an environment variable with the location of the root working directory of the project. This variable will be used in various places throughout the tutorial. Please change this value to reflect where you have installed this notebook on your system."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import os\n",
"os.environ[\"NBOOKROOT\"] = os.getcwd()\n",
"os.environ[\"NBOOKROOT\"] = f\"{os.environ['NBOOKROOT']}/..\"\n",
"%cd {os.environ['NBOOKROOT']}"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zL0yLZUvFr9P",
"outputId": "7434b909-ac71-4f3f-c163-2d5870317e9c"
},
"source": [
"from IPython.display import clear_output\n",
"\n",
"import os\n",
"import sys\n",
"\n",
"print(\"Cloning git repo...\")\n",
"!git clone --depth 1 --branch v0.6.1 https://github.com/claritychallenge/clarity.git\n",
"\n",
"clear_output()"
],
"execution_count": 1,
"outputs": []
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Repository installed\n"
]
}
],
"source": [
"print(\"Installing the pyClarity...\\n\")\n",
"%cd clarity\n",
"%pip install -e .\n",
"\n",
"sys.path.append(f'{os.getenv(\"NBOOKROOT\")}/clarity')\n",
"\n",
"clear_output()\n",
"print(\"Repository installed\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%cd {os.environ['NBOOKROOT']}/clarity/recipes/cad2/task1\n",
"!pip install -r requirements.txt\n",
"clear_output()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8pXJVSt-F-NN"
},
"source": [
"## Get the demo data\n",
"\n",
"The next step is to download a demo data package that will help demonstrate the process. This package has the same structure as the official data package, so it will help you understand how the files are organized.\n",
"\n",
"Before continuing, it is recommended that you familiarize yourself with the data structure and content, which you can find on the [website](https://cadenzachallenge.org/docs/cadenza2/Lyric%20Intelligibility/lyric_data).\n",
"\n",
"Now, let's download the data..."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "TpG-bGs8Fzgl",
"outputId": "438666a1-ac21-4532-a340-baefee85202d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data installed\n"
]
}
],
"source": [
"%cd {os.environ['NBOOKROOT']}\n",
"!gdown 1UqiqwYJuyC1o-C14DpVL4QYncsOGCvHF\n",
"!tar -xf cad2_demo_data.tar.xz\n",
"\n",
"clear_output()\n",
"print(\"Data installed\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "a0gQKwx8GnfD"
},
"source": [
"## Changing working Directory\n",
"\n",
"Next, we change working directory to the location of the shell scripts we wish to run."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T14:51:40.107328Z",
"start_time": "2024-09-03T14:51:40.098930Z"
},
"colab": {
"base_uri": "https://localhost:8080/",
"height": 53
},
"id": "fopV37z6GSoO",
"outputId": "791191b3-6176-427c-bc62-63cd45196b57"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/gerardoroadabike/Extended/Projects/cadenza_tutorials/clarity/recipes/cad2/task2/baseline\n"
]
}
],
"source": [
"%cd {os.environ['NBOOKROOT']}/clarity/recipes/cad2/task2/baseline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's save the path to the dataset in `root_data`"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T14:51:44.722025Z",
"start_time": "2024-09-03T14:51:44.024071Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 8\n",
"drwxr-xr-x 3 gerardoroadabike gerardoroadabike 4096 Aug 22 15:35 audio\n",
"drwxr-xr-x 2 gerardoroadabike gerardoroadabike 4096 Sep 3 16:52 metadata\n"
]
}
],
"source": [
"root_data = f\"{os.environ['NBOOKROOT']}/cadenza_data_demo/cad2/task2\"\n",
"!ls -l {root_data}"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jWFEskB3HgGM"
},
"source": [
"## Running the Baseline\n",
"\n",
"The `enhancement` baseline employs a ConvTasNet model to separate the lyrics from the background accompaniment. This model was trained for causal and non-causal cases. The pre-trained models are stored on Huggingface. The causality is defined in the `config.yaml`.\n",
"\n",
"### The config parameters\n",
"\n",
"The parameters of the baseline are define in the `config.yaml` file.\n",
"\n",
"---\n",
"\n",
"First, it configures the paths to metadata, audio files and, location for the output files.\n",
"\n",
"```yaml\n",
"path:\n",
" root: ??? # Set to the root of the dataset\n",
" metadata_dir: ${path.root}/metadata\n",
" music_dir: ${path.root}/audio\n",
" gains_file: ${path.metadata_dir}/gains.json\n",
" listeners_file: ${path.metadata_dir}/listeners.valid.json\n",
" enhancer_params_file: ${path.metadata_dir}/compressor_params.valid.json\n",
" music_file: ${path.metadata_dir}/music.valid.json\n",
" scenes_file: ${path.metadata_dir}/scenes.valid.json\n",
" scene_listeners_file: ${path.metadata_dir}/scene_listeners.valid.json\n",
" exp_folder: ./exp # folder to store enhanced signals and final results\n",
"```\n",
"\n",
"* `path.root`: must be set to the location of the dataset.\n",
"* `exp_folder`: by default, the name of the folder it's using the causality parameter. But this can be cahnge according your requirements\n",
"\n",
"---\n",
"\n",
"The next parameters are the different sample rates\n",
"\n",
"```yaml\n",
"input_sample_rate: 44100 # sample rate of the input mixture\n",
"remix_sample_rate: 32000 # sample rate for the output remixed signal\n",
"HAAQI_sample_rate: 24000 # sample rate for computing HAAQI score\n",
"```\n",
"\n",
"The HAAQI sample rate is uses by HAAQI in the evaluation.\n",
"\n",
"---\n",
"The next parameters are related to the separation and how it will operate\n",
"\n",
"```yaml\n",
"separator:\n",
" force_redownload: True\n",
" add_residual: 0.1\n",
" causality: noncausal\n",
" device: ~\n",
" separation:\n",
" number_sources: 2\n",
" segment: 6.0\n",
" overlap: 0.1\n",
" sample_rate: ${input_sample_rate}\n",
"```\n",
"\n",
"* `separator.force_redownload`: whether to force redownload the model or not.\n",
"* `separator.add_residual`: percentage (value between 0 and 1) of the rest of instruments to add back to estimated instrument.\n",
"* `separator.causality`: this is where we set the causality.\n",
"* `separator.separation`: these parameters are used for separate large signals using fade and overlap.\n",
"\n",
"--- \n",
"The `enhancer` parameters are common parameters used by the multiband dynamic range compressor.\n",
"\n",
"```yaml\n",
"enhancer:\n",
" crossover_frequencies: [ 353.55, 707.11, 1414.21, 2828.43, 5656.85 ] # [250, 500, 1000, 2000, 4000] * sqrt(2)\n",
" attack: [ 11, 11, 14, 13, 11, 11 ]\n",
" release: [ 80, 80, 80, 80, 100, 100 ]\n",
" threshold: [ -30, -30, -30, -30, -30, -30 ]\n",
"```\n",
"\n",
"You are free to change the parameters if you believe it may improve the signal for the listener panel. However, the evaluation will use these parameters, and your changes may result in lower objective HAAQI scores, as this metric is based on the correlation between the enhanced and reference signals.\n",
"\n",
"---\n",
"\n",
"The last parameters set some of the evaluation configurations\n",
"\n",
"```yaml\n",
"evaluate:\n",
" set_random_seed: True\n",
" small_test: False\n",
" batch_size: 1 # Number of batches\n",
" batch: 0 # Batch number to evaluate\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Running enhance.py\n",
"\n",
"The process goes as:\n",
"1. Loading the different metadata files into dictionaries.\n",
"2. Load the causal or non-causal separation models into a dictionary using the method `load_separation_model()`.\n",
"3. Create an instance of a `MultibandCompressor`.\n",
"4. Load the scenes and listeners per scenes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"Then, the script processes one scene-listener pair at a time."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"The process of a scene goes as:\n",
"\n",
"1. Get the compressor parameters for the listener\n",
" \n",
"```Python\n",
"# Get the listener's compressor params\n",
"mbc_params_listener: dict[str, dict] = {\"left\": {}, \"right\": {}}\n",
"\n",
"for ear in [\"left\", \"right\"]:\n",
" mbc_params_listener[ear][\"release\"] = config.enhancer.release\n",
" mbc_params_listener[ear][\"attack\"] = config.enhancer.attack\n",
" mbc_params_listener[ear][\"threshold\"] = config.enhancer.threshold\n",
"mbc_params_listener[\"left\"][\"ratio\"] = enhancer_params[listener_id][\"cr_l\"]\n",
"mbc_params_listener[\"right\"][\"ratio\"] = enhancer_params[listener_id][\"cr_r\"]\n",
"mbc_params_listener[\"left\"][\"makeup_gain\"] = enhancer_params[listener_id][\n",
" \"gain_l\"\n",
"]\n",
"mbc_params_listener[\"right\"][\"makeup_gain\"] = enhancer_params[listener_id][\n",
" \"gain_r\"\n",
"]\n",
"```\n",
"\n",
"---\n",
"2. Get the instruments composing the mixture. \n",
"\n",
"```Python\n",
"source_list = {\n",
" f\"source_{idx}\": s[\"instrument\"].split(\"_\")[0]\n",
" for idx, s in enumerate(songs[song_name].values(), 1)\n",
" if \"Mixture\" not in s[\"instrument\"]\n",
"}\n",
"```\n",
"\n",
"---\n",
"3. Load the signal to process and select the requested segment\n",
"```Python\n",
"mixture_signal, mix_sample_rate = read_flac_signal(\n",
" filename=Path(config.path.music_dir) / songs[song_name][\"mixture\"][\"track\"]\n",
")\n",
"assert mix_sample_rate == config.input_sample_rate\n",
"\n",
"start = songs[song_name][\"mixture\"][\"start\"]\n",
"end = start + songs[song_name][\"mixture\"][\"duration\"]\n",
"mixture_signal = mixture_signal[\n",
" int(start * mix_sample_rate) : int(end * mix_sample_rate),\n",
" :,\n",
"]\n",
"```\n",
"\n",
"---\n",
"4. Estimate the stems of the mixture.\n",
" * model: dictionary with the separation models.\n",
" * signal: the original mixture with channels first.\n",
" * signal_sample_rate: sample rate of the original mixture.\n",
" * device: cpu or cuda\n",
" * sources_list: dictionary with the instruments in the mixture.\n",
" * listener: listener to process\n",
" * add_residual: percentage of the `rest of instruments` to add back to the estimated instrument.\n",
" \n",
"```Python\n",
"stems: dict[str, ndarray] = decompose_signal(\n",
" model=separation_models,\n",
" signal=mixture_signal.T, \n",
" signal_sample_rate=config.input_sample_rate,\n",
" device=device,\n",
" sources_list=source_list,\n",
" listener=listener,\n",
" add_residual=config.separator.add_residual,\n",
")\n",
"```\n",
"\n",
"---\n",
"5. Apply the gains.\n",
"The baseline cannot separate 2 lines of the same instruments, i.e., it cannot separate a `violin 1` and `violin 2` in the same mixture. Therefore, when 2 lines of the same instruments are present in the same mixture, the gains for each line of those instrument becaomes the average between them.\n",
" \n",
"Example:\n",
"> original_gains = {'violin 1': 3, 'violin 2': 10, 'viola': 6} \n",
"> violin_avg = (3 + 10) / 2 = 6.5 \n",
"> new_gains = {'violin 1': 6.5, 'violin 2': 6.5, 'viola': 6} \n",
"\n",
"```Python\n",
"# Apply gains to sources\n",
"gain_scene = check_repeated_source(gains[scene[\"gain\"]], source_list)\n",
"stems = apply_gains(stems, config.input_sample_rate, gain_scene)\n",
"```\n",
"\n",
"---\n",
"6. Remix back to stereo the estimated sources with the requested levels.\n",
"\n",
"```Python\n",
"# Downmix to stereo\n",
"enhanced_signal = remix_stems(stems)\n",
"```\n",
"\n",
"---\n",
"7. Adjust the level of the new mixture to -40 dB and apply the compressor\n",
"\n",
"```Python\n",
"# adjust levels to get roughly -40 dB before compressor\n",
"enhanced_signal = adjust_level(enhanced_signal, gains[scene[\"gain\"]])\n",
"\n",
"# Apply compressor\n",
"enhanced_signal = process_remix_for_listener(\n",
" signal=enhanced_signal,\n",
" enhancer=enhancer,\n",
" enhancer_params=mbc_params_listener,\n",
" listener=listener,\n",
")\n",
"```\n",
"\n",
"---\n",
"8. Save the enhanced signal in FLAC. These are saved in the directory `enhanced_signals` within the experiment path `path.exp_folder` defined in the `config.yaml`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can call the `enhance.py` script now. \n",
"When calling this script, mind that you are loading the correct files.\n",
"\n",
"In shell, we can call the enhancer using the demo data as:\n",
"```bash\n",
"python enhance.py \\\n",
" path.root={root_data} \\\n",
" 'path.listeners_file=${path.metadata_dir}/listeners.demo.json' \\\n",
" 'path.enhancer_params_file=${path.metadata_dir}/compressor_params.demo.json' \\\n",
" 'path.scenes_file=${path.metadata_dir}/scenes.demo.json' \\\n",
" 'path.scene_listeners_file=${path.metadata_dir}/scene_listeners.demo.json' \\\n",
" 'path.music_file=${path.metadata_dir}/music.demo.json'\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T14:58:38.876351Z",
"start_time": "2024-09-03T14:57:47.266135Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2024-09-05 10:18:36,071][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Bassoon_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 1.74MB/s]\n",
"model.safetensors: 100%|███████████████████| 26.4M/26.4M [00:00<00:00, 30.9MB/s]\n",
"[2024-09-05 10:18:38,013][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Cello_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 1.19MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 104MB/s]\n",
"[2024-09-05 10:18:39,108][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Clarinet_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 2.07MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 106MB/s]\n",
"[2024-09-05 10:18:40,283][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Flute_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 2.17MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 110MB/s]\n",
"[2024-09-05 10:18:41,247][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Oboe_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 2.00MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 104MB/s]\n",
"[2024-09-05 10:18:42,349][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Sax_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 2.09MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 108MB/s]\n",
"[2024-09-05 10:18:43,440][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Viola_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 2.12MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 111MB/s]\n",
"[2024-09-05 10:18:44,421][__main__][INFO] - Loading model cadenzachallenge/ConvTasNet_Violin_NonCausal\n",
"config.json: 100%|█████████████████████████████| 204/204 [00:00<00:00, 1.94MB/s]\n",
"model.safetensors: 100%|████████████████████| 26.4M/26.4M [00:00<00:00, 110MB/s]\n",
"[2024-09-05 10:18:46,617][__main__][INFO] - [001/002] Processing S50027: song op1_1_002 for listener L5008\n",
"[2024-09-05 10:20:31,040][clarity.utils.flac_encoder][WARNING] - Writing enhanced_signals/valid/S50027_L5008_remix.flac: 31 samples clipped\n",
"[2024-09-05 10:20:31,071][__main__][INFO] - [002/002] Processing S50084: song sq7123582_2_006 for listener L5079\n",
"/home/gerardoroadabike/anaconda3/envs/tutorials/lib/python3.11/site-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.\n",
" return _methods._mean(a, axis=axis, dtype=dtype,\n",
"/home/gerardoroadabike/anaconda3/envs/tutorials/lib/python3.11/site-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide\n",
" ret = ret.dtype.type(ret / rcount)\n",
"[2024-09-05 10:22:11,175][__main__][INFO] - Done!\n"
]
}
],
"source": [
"!python enhance.py path.root={root_data} path.listeners_file={root_data}/metadata/listeners.demo.json path.enhancer_params_file={root_data}/metadata/compressor_params.demo.json path.scenes_file={root_data}/metadata/scenes.demo.json path.scene_listeners_file={root_data}/metadata/scene_listeners.demo.json path.music_file={root_data}/metadata/music.demo.json"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T14:58:39.580882Z",
"start_time": "2024-09-03T14:58:38.880629Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"S50027_L5008_remix.flac S50084_L5079_remix.flac\n"
]
}
],
"source": [
"!ls {os.environ['NBOOKROOT']}/clarity/recipes/cad2/task2/baseline/exp/enhanced_signals/valid"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"Let's listen to these signals."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T15:08:06.071877Z",
"start_time": "2024-09-03T15:08:05.990435Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"S50084_L5079_remix.flac\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"S50027_L5008_remix.flac\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from pathlib import Path\n",
"from clarity.utils.flac_encoder import read_flac_signal\n",
"from clarity.utils.signal_processing import resample\n",
"import IPython.display as ipd\n",
"\n",
"audio_path = Path(os.environ['NBOOKROOT']) / \"clarity/recipes/cad2/task2/baseline/exp/enhanced_signals/valid\" \n",
"audio_files = [f for f in audio_path.glob('*') if f.suffix == '.flac']\n",
"\n",
"for file_to_play in audio_files:\n",
" signal, sample_rate = read_flac_signal(file_to_play)\n",
" signal = resample(signal, sample_rate, 16000)\n",
" print(file_to_play.name)\n",
" ipd.display(ipd.Audio(signal.T, rate=16000))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"### Running evaluate.py\n",
"Now that we have enhanced audios we can use the `evaluate.py` script to generate HAAQI scores for the signals. The evaluation should run with the same parameters as the enhancement"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T14:59:09.644066Z",
"start_time": "2024-09-03T14:58:39.819318Z"
},
"id": "BFvYiF15LJEu"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2024-09-05 10:59:15,986][__main__][INFO] - Evaluating from enhanced_signals directory\n",
"[2024-09-05 10:59:15,995][__main__][INFO] - [001/002] Evaluating S50027 for listener L5008\n",
"[2024-09-05 10:59:53,569][__main__][INFO] - [002/002] Evaluating S50084 for listener L5079\n",
"[2024-09-05 11:00:29,695][__main__][INFO] - Done!\n"
]
}
],
"source": [
"! python evaluate.py path.root={root_data} path.listeners_file={root_data}/metadata/listeners.demo.json path.enhancer_params_file={root_data}/metadata/compressor_params.demo.json path.scenes_file={root_data}/metadata/scenes.demo.json path.scene_listeners_file={root_data}/metadata/scene_listeners.demo.json path.music_file={root_data}/metadata/music.demo.json"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "K9yB6XdlOCRp"
},
"source": [
"The evaluation scores are save in the `path.exp_folder`/scores.csv"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"ExecuteTime": {
"end_time": "2024-09-03T14:59:09.923967Z",
"start_time": "2024-09-03T14:59:09.646968Z"
},
"id": "uEgjJQd6N655"
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
scene
\n",
"
song
\n",
"
listener
\n",
"
left_haaqi
\n",
"
right_haaqi
\n",
"
avg_haaqi
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
S50027
\n",
"
op1_1_002
\n",
"
L5008
\n",
"
0.696125
\n",
"
0.720061
\n",
"
0.708093
\n",
"
\n",
"
\n",
"
1
\n",
"
S50084
\n",
"
sq7123582_2_006
\n",
"
L5079
\n",
"
0.596429
\n",
"
0.578859
\n",
"
0.587644
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" scene song listener left_haaqi right_haaqi avg_haaqi\n",
"0 S50027 op1_1_002 L5008 0.696125 0.720061 0.708093\n",
"1 S50084 sq7123582_2_006 L5079 0.596429 0.578859 0.587644"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"pd.read_csv(f\"{os.environ['NBOOKROOT']}/clarity/recipes/cad2/task2/baseline/exp/scores.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The HAAQI scores are compute for the left and right ear and then save the averaged."
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}