diff --git a/README.md b/README.md
index ca217bf..cc1b470 100644
--- a/README.md
+++ b/README.md
@@ -182,7 +182,7 @@ python evaluate.py
| 125-dpd | 35.79 |
| 250-dpd | **40.38** |

-![Baseline results](images/baseline_results.png)
+Baseline results

## Prompt Learning Experiments

@@ -266,7 +266,8 @@ python evaluate.py -o path/to/outputs/file
| Dataset | JGA (w = 0.1) | JGA* (w = 0.1) | JGA (w = 0.3) | JGA* (w = 0.3) | JGA (w = 0.5) | JGA* (w = 0.5) | JGA (w = 0.7) | JGA* (w = 0.7) |
| ------- | ------------- | -------------- | ------------- | -------------- | ------------- | -------------- | ------------- | -------------- |
| 5-dpd   | 30.66 | 71.04 | 31.67 | 73.19 | 30.77 | 72.85 | 29.98 | 70.93 |
| 10-dpd  | 42.65 | 86.43 | 41.18 | 83.48 | 40.05 | 80.77 | 40.38 | 85.18 |
| 50-dpd  | 47.06 | 91.63 | 46.49 | 91.18 | 47.04 | 91.18 | 46.27 | 90.05 |
| 100-dpd | 47.74 | 92.31 | 48.42 | 92.42 | 48.19 | 92.65 | 48.30 | 92.65 |
| 125-dpd | 46.49 | 91.86 | 46.15 | 91.18 | 46.83 | 91.74 | 46.15 | 90.95 |
| 250-dpd | 47.06 | 92.08 | 47.62 | 92.65 | 47.40 | 92.31 | 47.17 | 92.09 |

-![Prompt-based methods results](images/prompt_results.png)
+Prompt-based methods results
+

## Multi-prompt Learning Experiments

@@ -315,7 +316,8 @@ sh test_prompting.sh -m
| 250-dpd | 48.30 | 93.44 |

-![Prompt Ensembling results](images/ensemble_results.png)
+Prompt Ensembling results
+

### Prompt Augmentation

Prompt Augmentation, also called *demonstration learning*, provides a few additional *answered prompts* that demonstrate to the PLM how the actual prompt slot can be answered. The sample answered prompts are hand-crafted and hand-picked manually. Experiments are performed on different sets of *answered prompts*.

@@ -330,12 +332,12 @@ sh test_prompting.sh -m
| Data    | JGA (Sample 1) | JGA* (Sample 1) | JGA (Sample 2) | JGA* (Sample 2) |
| ------- | -------------- | --------------- | -------------- | --------------- |
| 5-dpd   | 26.02 | 58.60 | 27.60 | 59.39 |
| 10-dpd  | 33.26 | 70.14 | 34.95 | 77.94 |
| 50-dpd  | 38.80 | 71.38 | 39.77 | 74.55 |
| 100-dpd | 35.97 | 70.89 | 38.46 | 74.89 |
| 125-dpd | 36.09 | 73.08 | 36.18 | 76.47 |
| 250-dpd | 35.63 | 72.90 | 38.91 | 76.70 |

-![Prompt Augmentation results](images/demonstration_results.png)
+Prompt Augmentation results

### Comparison of all the results

-![Comparison of results](images/comparison_results.png)
+Comparison of results

## Analysis
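
To make the Prompt Augmentation paragraph in the diff above concrete, here is a minimal sketch of how a few hand-picked *answered prompts* could be prepended to the actual slot prompt before it is passed to the PLM. The function name, prompt template, and example slots are illustrative assumptions and are not taken from this repository's code.

```python
# Sketch of prompt augmentation ("demonstration learning"): a few hand-picked
# answered prompts are prepended to the actual prompt so the PLM can infer how
# the slot question should be answered. Templates and examples are hypothetical.

def build_augmented_prompt(demonstrations, dialogue_context, slot_question):
    """Prepend answered example prompts to the real, unanswered prompt."""
    parts = []
    for demo in demonstrations:
        parts.append(
            f"Context: {demo['context']}\n"
            f"Question: {demo['question']}\n"
            f"Answer: {demo['answer']}\n"
        )
    # The actual prompt is left unanswered; the PLM is expected to fill it in.
    parts.append(
        f"Context: {dialogue_context}\n"
        f"Question: {slot_question}\n"
        f"Answer:"
    )
    return "\n".join(parts)


if __name__ == "__main__":
    demos = [
        {
            "context": "I need a cheap hotel in the north.",
            "question": "What is the price range of the hotel?",
            "answer": "cheap",
        },
        {
            "context": "Book a table for two at an Italian place.",
            "question": "What type of food does the user want?",
            "answer": "italian",
        },
    ]
    prompt = build_augmented_prompt(
        demos,
        dialogue_context="I'm looking for a guesthouse in the east part of town.",
        slot_question="What area is the user looking for?",
    )
    print(prompt)
```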