In the above table, the term "*dpd*" refers to "*dialogues per domain*". For example, *50-dpd* means *50 dialogues per domain*.
All the training and testing data can be found under the [/data/](data/) folder.
## Environment Setup
Python 3.6 is required for training the baseline model. Python 3.10 is required for training the prompt-based model. `conda` is used for creating the environments.
### Create conda environments
Create an environment for baseline training (Python 3.6 is **required**):
```shell
conda create -n <env-name> python=3.6
```
Create an environment for prompt-based methods (Python 3.10 is **required**):
```shell
conda create -n <prompt-env-name> python=3.10
```
#### Activate the conda environment
To activate the conda environment, run:
```shell
conda activate <env-name>
```
#### Deactivating the conda environment
To deactivate the conda environment (only after running all the experiments), run:
```shell
conda deactivate
```
#### Download and extract SOLOIST pre-trained model
Download and unzip the pre-trained model; it is used for fine-tuning the baseline and prompt-based methods. For more details about the pre-trained SOLOIST model, refer to the GitHub [repo](https://github.com/pengbaolin/soloist).
Download the zip file and replace `/path/to/folder` in the command below with a folder of your choice.
The next step is to set environment variables that contain the paths to the pre-trained model, saved models, and output directories.
Edit the [set_env.sh](set_env.sh) file (`nano` or `vim` can be used) and set the paths (as required) for the following:
`PRE_TRAINED_SOLOIST` - Path to the extracted pre-trained SOLOIST model
`SAVED_MODELS_BASELINE` - Path for saving the trained baseline models (fine-tuning) at checkpoints
`SAVED_MODELS_PROMPT` - Path for saving the trained prompt-based models (after each epoch)
`OUTPUTS_DIR_BASELINE` - Path for storing the baseline model outputs (belief state predictions)
`OUTPUTS_DIR_PROMPT` - Path for storing the prompt model outputs (generations)
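For reference, `set_env.sh` consists of `export` lines along these lines (the paths below are illustrative placeholders, not the repository's actual defaults):

```shell
# Illustrative paths -- point each variable at a folder of your choice
export PRE_TRAINED_SOLOIST=/path/to/folder/soloist_pretrained
export SAVED_MODELS_BASELINE=/path/to/folder/saved_models_baseline
export SAVED_MODELS_PROMPT=/path/to/folder/saved_models_prompt
export OUTPUTS_DIR_BASELINE=/path/to/folder/outputs_baseline
export OUTPUTS_DIR_PROMPT=/path/to/folder/outputs_prompt
```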
```shell
nano set_env.sh
```
Save the edited file and `source` it:
```shell
source set_env.sh
```
Run the line below to unset the environment variables:
```shell
sh unset_env.sh
```
Generate belief states by running the decode script:
```shell
sh decode_baseline.sh
```
The generated predictions are saved under the `OUTPUTS_DIR_BASELINE` folder. Some of the generated belief state predictions are uploaded to this repository and can be found under the [outputs](outputs) folder.
### Baseline Evaluation
Edit the [evaluate.py](baseline/evaluate.py) to set the predictions output file before running the evaluation:
```shell
python evaluate.py
```
### Results from baseline experiments
|data-split| JGA |
|--|:--:|
| 5-dpd | 9.06 |
| 125-dpd | 35.79 |
| 250-dpd | 40.38 |
## Prompt Learning Experiments
### Data
`create_dataset.py`
// TODO
### Install the requirements
After following the environment setup steps in the previous [section](#environment-setup), install the required python modules for prompt model training.
Change directory to `prompt-learning` and install the requirements. Make sure the correct prompt-learning `conda` environment is activated before installing the requirements.
```shell
cd prompt-learning
pip install -r requirements.txt
```
### Train the prompt model
Train a separate model for each data split. Edit the [train_prompting.sh](prompt-learning/train_prompting.sh) file to modify the default hyperparameters for training (learning rate, epochs).
```shell
sh train_prompting.sh -d <data-split-name>
```
Pass the data split name to the `-d` flag.
Possible values are: `5-dpd`, `10-dpd`, `50-dpd`, `100-dpd`, `125-dpd`, `250-dpd`
Example training command: `sh train_prompting.sh -d 50-dpd`
**Some `train_prompting.sh` flags**:
`--num_epochs 10` - Number of training epochs
`--learning_rate 5e-5` - Initial learning rate for the optimizer
`--with_inverse_prompt` - Use the inverse prompt while training
`--inverse_prompt_weight 0.1` - Weight of the inverse prompt in the loss function
**Note:** The defaults in `train_prompting.sh` are the best-performing values.
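Inside `train_prompting.sh`, the flags above are appended to the training command roughly as follows (a sketch only; the `train.py` entry point is an assumed placeholder, not necessarily the script's actual entry point):

```shell
# Hypothetical excerpt of train_prompting.sh -- edit these flags here
python train.py \
  --num_epochs 10 \
  --learning_rate 5e-5 \
  --with_inverse_prompt \
  --inverse_prompt_weight 0.1
```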
### Belief State Generations (Prompt Generation)
Now, the belief states can be generated by prompting. Choose a prompt fine-tuned model from the saved epochs and generate belief states by running the below script:
```shell
sh test_prompting.sh -m <tuned-prompt-model-path>
```
The `-m` argument takes the path of the saved model relative to the `SAVED_MODELS_PROMPT` env variable, with the following structure: `-m <data-split-name>/<experiment-folder>/<epoch-folder>`
The generated belief states (outputs) are saved under `OUTPUTS_DIR_PROMPT` folder. Some of the best outputs are uploaded to this repository and can be found under [outputs](outputs) folder.
### Prompting Evaluation
The standard Joint Goal Accuracy (JGA) is used to evaluate the belief predictions.
Edit the [evaluate.py](prompt-learning/evaluate.py) to set the predictions output file before running the evaluation
```shell
python evaluate.py
```
### Results from prompt-based belief state generations
|data-split| JGA* |
|--|:--:|
| 5-dpd | //TODO |
| 10-dpd | //TODO |
| 50-dpd | //TODO |
| 100-dpd | //TODO |
| 125-dpd | //TODO |
| 250-dpd | //TODO |
// TODO :: Add prompt-based outputs and results in the above table
## Multi-prompt Learning Experiments
### Prompt Ensemble
**Training**
Train a separate model for each data split. Edit the [train_prompting.sh](prompt-learning/train_prompting.sh) file and add `--with_prompt_ensemble` for training with multiple prompt functions.
// TODO :: Add more README for training and generating.
// WIP :: Prompt ensemble training
### Prompt Augmentation
Prompt Augmentation, sometimes called *demonstration learning*, provides a few additional *answered prompts* that demonstrate to the PLM how the actual prompt slot can be answered. The answered prompts are manually hand-picked, and experiments are performed on different sets of *answered prompts*.
Edit the [test_prompting.sh](prompt-learning/test_prompting.sh) file and add `--with_answered_prompts` flag for generating slots with answered prompts.
Generate belief states by running the below script: