# Prompt-based methods for Dialog State Tracking
Repository for my master's thesis at the University of Stuttgart (IMS).
Refer to the thesis proposal document for a detailed explanation of the thesis experiments.
## Dataset
The MultiWOZ 2.1 dataset is used for training and evaluating the baseline and prompt-based methods. MultiWOZ is a fully-labeled collection of human-human written conversations spanning multiple domains and topics. Only single-domain dialogues are used in this setup for training and testing. Each dialogue contains multiple turns and may also contain a booking sub-domain. Five domains (Hotel, Train, Restaurant, Attraction, Taxi) are used in the experiments; the other two domains are excluded because they only appear in the training set. Under few-shot settings, only a portion of the training data is used, to measure DST performance in a low-resource scenario; dialogues are randomly picked for each domain. The table below contains some statistics of the dataset and the data splits for the few-shot experiments.
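The per-domain random sampling described above could be sketched with standard shell tools. This is only an illustration, not the repository's actual preprocessing; the file names and the dummy dialogue IDs are hypothetical:

```shell
# Hypothetical sketch of building a 50-dpd split for one domain:
# generate dummy dialogue IDs, then sample 50 of them at random.
mkdir -p /tmp/dpd_demo
seq 1 500 | sed 's/^/hotel_/' > /tmp/dpd_demo/hotel_ids.txt

# shuf picks lines uniformly at random, without replacement
shuf -n 50 /tmp/dpd_demo/hotel_ids.txt > /tmp/dpd_demo/hotel_50dpd.txt

# the sampled split contains exactly 50 dialogue IDs
wc -l < /tmp/dpd_demo/hotel_50dpd.txt
```

Repeating this for each of the five domains would give the 250 dialogues of the 50-dpd split in the table above.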
| Data Split | # Dialogues | # Total Turns |
|---|---|---|
| 50-dpd | 250 | 1114 |
| 100-dpd | 500 | 2292 |
| 125-dpd | 625 | 2831 |
| 250-dpd | 1125 | 5187 |
| valid | 190 | 900 |
| test | 193 | 894 |
In the table above, the term "dpd" stands for "dialogues per domain"; for example, 50-dpd means 50 dialogues for each domain.
All training and testing data can be found in the /data/baseline/ folder.
## Environment Setup

### Baseline (SOLOIST) Environment Setup
Python 3.6 is required for training the baseline model. conda is used for creating environments.
#### Create a conda environment

Create an environment with a specific Python version (3.6):

```shell
conda create -n <env-name> python=3.6
```
#### Activate the conda environment

Activate the conda environment to install the requirements:

```shell
conda activate <env-name>
```
#### Deactivate the conda environment

After running all the experiments, deactivate the conda environment:

```shell
conda deactivate
```
#### Download and extract the SOLOIST pre-trained model

Download and extract the pre-trained model; it is used for fine-tuning both the baseline and the prompt-based methods. For more details about the pre-trained SOLOIST model, refer to the GitHub repo.

Download the archive, replacing /path/to/folder in the command below with a folder of your choice:

```shell
wget https://bapengstorage.blob.core.windows.net/soloist/gtg_pretrained.tar.gz \
  -P /path/to/folder/
```

Extract the downloaded archive:

```shell
tar -xvf /path/to/folder/gtg_pretrained.tar.gz
```
#### Clone the repository

Clone the repository to get the source code:

```shell
git clone https://git.pavanmandava.com/pavan/master-thesis.git
```

If your local copy is behind the remote, pull the latest changes:

```shell
git pull
```

Change into the repository directory:

```shell
cd master-thesis
```
#### Set environment variables

The next step is to set the environment variables that contain the paths to the pre-trained model, the saved models, and the output directories.

Edit the set_env.sh file (nano or vim can be used) and set the paths for:

- `PRE_TRAINED_SOLOIST` - path to the extracted pre-trained SOLOIST model
- `SAVED_MODELS_BASELINE` - path for saving the trained model checkpoints
- `OUTPUTS_DIR_BASELINE` - path for storing the belief state prediction outputs

```shell
nano set_env.sh
```
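A minimal set_env.sh might look like the following sketch. Only the variable names come from this repository; the paths, including the extracted folder name `gtg_pretrained`, are placeholders you should replace with your own:

```shell
# set_env.sh -- example contents (all paths are placeholder assumptions)
export PRE_TRAINED_SOLOIST=/path/to/folder/gtg_pretrained
export SAVED_MODELS_BASELINE=/path/to/saved_models
export OUTPUTS_DIR_BASELINE=/path/to/outputs
```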
Save the edited file and source it:

```shell
source set_env.sh
```
To unset the environment variables after the experiments, source the unset script (sourcing is required so the variables are removed from the current shell rather than a subshell):

```shell
source unset_env.sh
```