# Prompt-based methods for Dialog State Tracking
Repository for my master's thesis at the University of Stuttgart (IMS).
Refer to the thesis proposal document for a detailed explanation of the thesis experiments.
## Dataset
The MultiWOZ 2.1 dataset is used for training and evaluating the baseline and prompt-based methods. MultiWOZ is a fully-labeled collection of human-human written conversations spanning multiple domains and topics. Only single-domain dialogues are used in this setup for training and testing. Each dialogue contains multiple turns and may also contain a booking sub-domain. Five domains (Hotel, Train, Restaurant, Attraction, Taxi) are used in the experiments; the other two domains are excluded because they only appear in the training set. Under few-shot settings, only a portion of the training data is used, to measure DST performance in a low-resource scenario; dialogues are randomly picked for each domain. The table below contains some statistics of the dataset and the data splits for the few-shot experiments.
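The per-domain random sampling described above could be sketched with standard shell tools. This is only an illustration, not the repository's actual preprocessing; the file names and the dummy dialogue IDs are hypothetical:

```shell
# Hypothetical sketch of building a 50-dpd split for one domain:
# generate dummy dialogue IDs, then sample 50 of them at random.
mkdir -p /tmp/dpd_demo
seq 1 500 | sed 's/^/hotel_/' > /tmp/dpd_demo/hotel_ids.txt

# shuf picks lines uniformly at random, without replacement
shuf -n 50 /tmp/dpd_demo/hotel_ids.txt > /tmp/dpd_demo/hotel_50dpd.txt

# the sampled split contains exactly 50 dialogue IDs
wc -l < /tmp/dpd_demo/hotel_50dpd.txt
```

Repeating this for each of the five domains would give the 250 dialogues of the 50-dpd split in the table above.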
| Data Split | # Dialogues | # Total Turns |
|---|---|---|
| 50-dpd | 250 | 1114 |
| 100-dpd | 500 | 2292 |
| 125-dpd | 625 | 2831 |
| 250-dpd | 1125 | 5187 |
| valid | 190 | 900 |
| test | 193 | 894 |
In the table above, the term "dpd" stands for "dialogues per domain"; for example, 50-dpd means 50 dialogues for each domain.
All training and testing data can be found in the /data/baseline/ folder.
## Environment Setup

### Baseline (SOLOIST) Environment Setup
Python 3.6 is required for training the baseline model. conda is used for creating environments.
#### Create a conda environment

Create an environment with a specific Python version (3.6):

```shell
conda create -n <env-name> python=3.6
```
#### Activate the conda environment

Activate the conda environment to install the requirements:

```shell
conda activate <env-name>
```
#### Deactivate the conda environment

After running all the experiments, deactivate the conda environment:

```shell
conda deactivate
```
#### Download and extract the SOLOIST pre-trained model

Download and extract the pre-trained model; it is used for fine-tuning both the baseline and the prompt-based methods. For more details about the pre-trained SOLOIST model, refer to the GitHub repo.

Download the archive, replacing /path/to/folder in the command below with a folder of your choice:

```shell
wget https://bapengstorage.blob.core.windows.net/soloist/gtg_pretrained.tar.gz \
  -P /path/to/folder/
```

Extract the downloaded archive:

```shell
tar -xvf /path/to/folder/gtg_pretrained.tar.gz
```
#### Clone the repository

Clone the repository to get the source code:

```shell
git clone https://git.pavanmandava.com/pavan/master-thesis.git
```

If your local copy is behind the remote, pull the latest changes:

```shell
git pull
```

Change into the repository directory:

```shell
cd master-thesis
```
#### Set environment variables

The next step is to set the environment variables that contain the paths to the pre-trained model, the saved models, and the output directories.

Edit the set_env.sh file (nano or vim can be used) and set the paths for:

- `PRE_TRAINED_SOLOIST` - path to the extracted pre-trained SOLOIST model
- `SAVED_MODELS_BASELINE` - path for saving the trained model checkpoints
- `OUTPUTS_DIR_BASELINE` - path for storing the belief state prediction outputs

```shell
nano set_env.sh
```
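A minimal set_env.sh might look like the following sketch. Only the variable names come from this repository; the paths, including the extracted folder name `gtg_pretrained`, are placeholders you should replace with your own:

```shell
# set_env.sh -- example contents (all paths are placeholder assumptions)
export PRE_TRAINED_SOLOIST=/path/to/folder/gtg_pretrained
export SAVED_MODELS_BASELINE=/path/to/saved_models
export OUTPUTS_DIR_BASELINE=/path/to/outputs
```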
Save the edited file and source it:

```shell
source set_env.sh
```
To unset the environment variables after the experiments, source the unset script (sourcing is required so the variables are removed from the current shell rather than a subshell):

```shell
source unset_env.sh
```