To activate the conda environment for installing the requirements, run:
```shell
conda activate <env-name>
```
#### Deactivating the conda environment
After running all the experiments, deactivate the conda environment by running:
```shell
conda deactivate
```
The next step is to set the environment variables that contain the paths to the pre-trained model, saved models, and output directories.
Edit the [set_env.sh](set_env.sh) file (with `nano` or `vim`) and set the paths for:
- `PRE_TRAINED_SOLOIST` - Path to the extracted pre-trained SOLOIST model
- `SAVED_MODELS_BASELINE` - Path for saving the trained models at checkpoints
- `OUTPUTS_DIR_BASELINE` - Path for storing the outputs of belief state predictions
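For illustration, the edited set_env.sh would contain export lines along these lines (the paths below are placeholders, not actual locations from this repository):

```shell
# Illustrative contents of set_env.sh; all paths are placeholders,
# adjust them to your own setup
export PRE_TRAINED_SOLOIST=/data/models/pretrained-soloist
export SAVED_MODELS_BASELINE=/data/models/baseline-checkpoints
export OUTPUTS_DIR_BASELINE=/data/outputs/baseline
```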
Run the line below to unset the environment variables:
```shell
sh unset_env.sh
```
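For reference, unset_env.sh is expected to simply clear the variables that set_env.sh exports; a minimal sketch (not necessarily the actual file contents):

```shell
# Sketch of unset_env.sh: clear the variables exported by set_env.sh
unset PRE_TRAINED_SOLOIST
unset SAVED_MODELS_BASELINE
unset OUTPUTS_DIR_BASELINE
```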
## Baseline Experiments
SOLOIST ([Peng et al., 2021](https://arxiv.org/abs/2005.05298)), the baseline model for this thesis, is a task-oriented dialog system that uses transfer learning and machine teaching to build task bots at scale. SOLOIST follows the pre-train, fine-tune paradigm for building end-to-end dialog systems on top of GPT-2, a transformer-based auto-regressive language model. In the pre-training stage, SOLOIST is initialized with the 12-layer GPT-2 (117M parameters) and further trained on two task-oriented dialog corpora for the *belief state prediction* task. In the fine-tuning stage, the pre-trained SOLOIST is fine-tuned on the MultiWOZ 2.1 dataset to perform the belief state prediction task.
### Install the requirements
After following the environment setup steps in the previous [section](#environment-setup), install the required Python modules for baseline model training.
Change directory to `baseline` and install the requirements. Make sure the correct baseline conda environment is activated before installing.
```shell
cd baseline
pip install -r requirements.txt
```
### Train the baseline model
Train a separate model for each data split. Edit the [train_baseline.sh](baseline/train_baseline.sh) file to modify the training hyperparameters (learning rate, number of epochs). Use `CUDA_VISIBLE_DEVICES` to specify which CUDA device (GPU) to train on.
```shell
sh train_baseline.sh -d <data-split-name>
```
Pass the data split name to the `-d` flag. Possible values are: `50-dpd`, `100-dpd`, `125-dpd`, `250-dpd`
Example training command: `sh train_baseline.sh -d 50-dpd`
### Belief State Prediction
Choose a checkpoint of the saved baseline model to generate belief state predictions.
Set the `MODEL_CHECKPOINT` environment variable to the path of the chosen model checkpoint. The path should start from the `experiment-{datetime}` folder.
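For example (the experiment folder and checkpoint names below are illustrative placeholders, not actual checkpoints from this repository):

```shell
# Example: point MODEL_CHECKPOINT at a chosen checkpoint, starting
# from the experiment-{datetime} folder; names here are placeholders
export MODEL_CHECKPOINT=experiment-2022-05-01-10-30-00/checkpoint-4000
```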
The generated predictions are saved under the `OUTPUTS_DIR_BASELINE` folder. Some of the generated belief state predictions are uploaded to this repository and can be found under the [outputs](outputs) folder.
### Baseline Evaluation
The standard Joint Goal Accuracy (JGA) metric is used to evaluate the belief state predictions. For each turn, all predicted belief states are compared to the ground-truth states; the prediction is considered correct only if every predicted belief state, both slot and value, matches the ground truth.
Edit [evaluate.py](baseline/evaluate.py) to set the predictions output file before running the evaluation.