The data for training the prompt-based model is available under the [data/prompt-learning](data/prompt-learning) directory.
`create_dataset.py` ([link](utils/create_dataset.py)) contains the scripts for converting and creating the data used to train the prompt-based model.
Value candidates are extracted from the user dialog history and are utilized in training the prompt-based model:

- Stop words and repeated candidate values are filtered out (see the sketch below)
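The filtering step can be pictured with a minimal sketch like the one below. The `filter_candidates` helper and the tiny stop-word list are hypothetical and purely illustrative; the actual extraction in `create_dataset.py` relies on Stanford CoreNLP via `stanza` (see the note below).

```python
from typing import Iterable, List

# Small illustrative stop-word list; the real script derives its candidates
# from CoreNLP/stanza output rather than a hard-coded set like this.
STOP_WORDS = {"a", "an", "the", "i", "you", "is", "are", "to", "of", "and", "or", "please"}

def filter_candidates(candidates: Iterable[str]) -> List[str]:
    """Drop stop words and repeated values, keeping the order of first occurrence."""
    seen = set()
    kept = []
    for cand in candidates:
        value = cand.strip().lower()
        if not value or value in STOP_WORDS:
            continue  # skip empty strings and stop words
        if value in seen:
            continue  # skip repeated candidate values
        seen.add(value)
        kept.append(value)
    return kept

# Example: candidate values pulled from a user's dialog history turns.
history_candidates = ["cheap", "the", "centre", "cheap", "Chinese", "please"]
print(filter_candidates(history_candidates))  # ['cheap', 'centre', 'chinese']
```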
> **Note:**
> Running `create_dataset.py` can take some time, as it needs to download, install, and run Stanford CoreNLP through the `stanza` package. The script also downloads CoreNLP files of roughly 1 GB and requires a significant amount of RAM and processing power to run efficiently.
>
> All the data required for training the prompt-based model is already available under the [data](data) directory of this repo, so running this script is not required to reproduce the results.
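For reference, the CoreNLP setup that `stanza` performs typically looks like the sketch below. This is not the repo's code: the install directory, annotators, memory, and timeout values are assumptions for illustration and may differ from what `create_dataset.py` actually uses.

```python
import os
import stanza
from stanza.server import CoreNLPClient

CORENLP_DIR = os.path.expanduser("~/stanza_corenlp")

# Download the CoreNLP distribution (~1 GB); this is the slow, disk-heavy step
# the note above warns about. It can be skipped if CoreNLP is already installed.
stanza.install_corenlp(dir=CORENLP_DIR)
os.environ["CORENLP_HOME"] = CORENLP_DIR  # let the client find the installation

# Start a CoreNLP server and annotate a sample utterance; the annotators,
# memory, and timeout values here are illustrative only.
with CoreNLPClient(annotators=["tokenize", "ssplit", "pos"],
                   memory="6G", timeout=30000, be_quiet=True) as client:
    document = client.annotate("I need a cheap restaurant in the centre of town.")
    tokens = [(tok.word, tok.pos) for sent in document.sentence for tok in sent.token]
    print(tokens)
```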