master-thesis

Table of Contents

Analysis of results and outputs

Baseline (SOLOIST)
Prompt-based Methods

Value-based prompt
destination vs departure
Duplicate values
Multi-prompt methods

Value Extraction

Analysis of results and outputs

Baseline (SOLOIST)

The baseline SOLOIST model is fine-tuned on different data splits to evaluate its performance on the belief state prediction task under low-resource settings. The results show that the baseline performs well when fine-tuned on relatively large data samples but performs poorly with low-resource training data (especially 25 and 50 dialogs).

For the belief state prediction task, SOLOIST uses top-k and top-p sampling to generate the belief state slots and values. Because the baseline relies on open-ended generation, it is susceptible to generating arbitrary slot-value pairs that are not relevant to the dialog history. The example below shows the baseline model generating a slot-value pair that is not relevant to the user's goals while completely missing two correct slot-value pairs.

History	True belief states	Generated belief states
user: we need to find a guesthouse of moderate price. system: do you have any special area you would like to stay? or possibly a star request for the guesthouse? user: i would like it to have a 3 star rating.	type = guesthouse, pricerange = moderate, stars = 3	parking = yes, stars = 3
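
To make the decoding step mentioned above more concrete, here is a minimal sketch of top-k and top-p (nucleus) filtering followed by sampling. It is an illustration only, not SOLOIST's actual decoding code: the function names `top_k_top_p_filter` and `sample_token`, the dictionary-based representation, and the toy next-token distribution are all assumptions made for this example.

```python
import random


def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Restrict a next-token distribution by top-k and top-p, then renormalize.

    probs: dict mapping token -> probability (assumed to sum to 1).
    top_k: keep only the k most probable tokens (0 disables this filter).
    top_p: keep the smallest set of top-ranked tokens whose cumulative
           probability reaches top_p (1.0 disables this filter).
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]

    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def sample_token(probs, top_k=0, top_p=1.0, rng=random):
    """Draw one token at random from the filtered distribution."""
    filtered = top_k_top_p_filter(probs, top_k, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical distribution over the next token (toy numbers, not model output).
toy_probs = {"moderate": 0.6, "cheap": 0.2, "expensive": 0.15, "yes": 0.05}
filtered = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.7)
print(filtered)
print(sample_token(toy_probs, top_k=3, top_p=0.7))
```

In this toy case the filter keeps two candidates, so the sampled token is not guaranteed to be the most probable one. This randomness is one plausible reason why open-ended generation can drift toward slot-value pairs that the dialog history does not support.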
