Updated README (added Value Extraction)

main · Pavan Mandava, 3 years ago · parent b5ba56cb4c · commit 2d8e82ba51

@@ -0,0 +1,3 @@
# Analysis of results and outputs
// TODO

@@ -180,15 +180,6 @@ python evaluate.py
## Prompt Learning Experiments
### Data
The data for training the prompt learning model is available under the [data/prompt-learning](data/prompt-learning) directory.
`create_dataset.py` ([link](utils/create_dataset.py)) contains the scripts that convert/create the data for training the prompt-based model.
> **Note:**
> Running `create_dataset.py` can take some time, as it needs to download, install, and run the Stanford CoreNLP `stanza` package. The script downloads CoreNLP files of roughly 1 GB and needs a significant amount of RAM and processing power to run efficiently.
>
> All the data required for training the prompt-based model is already available under the [data](data) directory of this repo.
### Install the requirements
After following the environment setup steps in the previous [section](#environment-setup), install the Python modules required for training the prompt model.
@@ -198,6 +189,26 @@ cd prompt-learning
pip install -r requirements.txt
```
### Data
The data for training the prompt learning model is available under the [data/prompt-learning](data/prompt-learning) directory.
`create_dataset.py` ([link](utils/create_dataset.py)) contains the scripts that convert/create the data for training the prompt-based model.
### Value Extraction
Value candidates are extracted from the user's dialog history and used in the testing/inference phase: the extracted values are fed to the value-based prompt to generate slots at inference time. Stanford CoreNLP (the `stanza` package) is first used to extract POS tags and named entities, and a set of rules then selects values from them (a minimal sketch follows the list below):
- Adjectives (`JJ`) and adverbs (`RB`) are considered possible values
  - Example: *expensive*, *moderate*
- A preceding negator `not` is kept with the value
  - Example: *not important* (= don't care)
- Named entities (place names, times, dates/days, numbers) are kept
  - Example: *08:30*, *friday*
- A custom set of regex NER rules recognizes additional named entities
- Stop words and repeated candidate values are filtered out
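A minimal, hypothetical sketch of these rules, using stanza's neural English pipeline for POS tags and NER (the repo's `create_dataset.py` may instead go through stanza's CoreNLP client; the stop-word list, entity types, and regex rule below are illustrative assumptions, not the repo's actual configuration):
```python
# Illustrative sketch only: the stop-word list, entity types, and regex
# below are assumptions; create_dataset.py defines the actual rules.
import re

import stanza

stanza.download("en")  # one-time model download
nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,ner")

STOP_WORDS = {"not", "so", "very", "really", "just"}              # assumed
ENTITY_TYPES = {"GPE", "LOC", "FAC", "TIME", "DATE", "CARDINAL"}  # assumed
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}\b")  # one assumed regex NER rule, e.g. 08:30

def extract_value_candidates(utterance: str) -> list[str]:
    doc = nlp(utterance)
    candidates = []
    for sent in doc.sentences:
        for i, word in enumerate(sent.words):
            # Adjectives (JJ*) and adverbs (RB*) are possible values.
            if word.xpos and word.xpos.startswith(("JJ", "RB")):
                # Keep a preceding negator: "not important" = don't care.
                if i > 0 and sent.words[i - 1].text.lower() == "not":
                    candidates.append(f"not {word.text}")
                else:
                    candidates.append(word.text)
    # Named entities: place names, times, dates/days, numbers.
    candidates += [ent.text for ent in doc.ents if ent.type in ENTITY_TYPES]
    # Custom regex NER rules (here, just a clock-time pattern).
    candidates += TIME_RE.findall(utterance)
    # Filter stop words and repeated candidates, preserving order.
    seen, result = set(), []
    for cand in candidates:
        key = cand.lower()
        if key not in STOP_WORDS and key not in seen:
            seen.add(key)
            result.append(cand)
    return result

print(extract_value_candidates(
    "Find an expensive restaurant at 08:30 on friday, the area is not important."))
```
On the example utterance this should yield candidates along the lines of *expensive*, *08:30*, *friday*, and *not important*.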
> **Note:**
> Running `create_dataset.py` can take some time, as it needs to download, install, and run the Stanford CoreNLP `stanza` package. The script also downloads CoreNLP files of roughly 1 GB and needs a significant amount of RAM and processing power to run efficiently.
>
> All the data required for training the prompt-based model is already available under the [data](data) directory of this repo, so running this script is not required to reproduce the results.
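If the CoreNLP download is the bottleneck, the models can be fetched once up front. A small sketch, assuming `create_dataset.py` goes through stanza's CoreNLP client (the exact setup the script expects may differ):
```python
# One-time setup sketch; assumes stanza's CoreNLP client is used.
import stanza

stanza.install_corenlp()  # downloads the ~1GB CoreNLP distribution
stanza.download("en")     # English models for the neural pipeline, if needed
```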
### Train the prompt model
Train a separate model for each data split. Edit [train_prompting.sh](prompt-learning/train_prompting.sh) to change the default training hyperparameters (learning rate, epochs).
```shell
@@ -254,7 +265,7 @@ python evaluate.py -o path/to/outputs/file
| 125-dpd | 46.49 | 91.86 |
| 250-dpd | 47.06 | 92.08 |
> **Note:** All the generated output files for the above reported results are available in the repository. Check the [outputs/prompt-learning](outputs/prompt-learning) directory to see the output JSON files for each data split.
> **Note:** All the generated output files for the above reported results are available in this repository. Check the [outputs/prompt-learning](outputs/prompt-learning) directory to see the output JSON files for each data split.
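As a point of reference, a joint-accuracy-style metric over such output files could look like the sketch below; the per-turn `{"pred": ..., "gold": ...}` schema is an assumption, since the real format of the JSON files under [outputs/prompt-learning](outputs/prompt-learning) and the metrics computed by `evaluate.py` are not shown in this excerpt:
```python
# Generic sketch; inspect the files in outputs/prompt-learning for the
# actual schema that evaluate.py consumes and reports on.
import json

def joint_accuracy(path: str) -> float:
    """Fraction of turns whose full predicted belief state matches gold."""
    with open(path) as f:
        turns = json.load(f)
    return sum(t["pred"] == t["gold"] for t in turns) / len(turns)
```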
## Multi-prompt Learning Experiments
@@ -313,3 +324,7 @@ sh test_prompting.sh -m <tuned-prompt-model-path>
> **Note:** All the generated output files for the above reported results are available in this repository. Check the [outputs/multi-prompt](outputs/multi-prompt) directory to see the output JSON files for each data split.
## Analysis
Analyses of the results and belief state generations (outputs) can be found [here](ANALYSIS.md).
