add findings for last hypothesis and reference to three evaluation questions

hannes.kuchelmeister
2020-03-31 12:22:02 +02:00
parent a073d0e868
commit d08998986c


@@ -266,5 +266,8 @@ Random groups have less overall satisfaction with $tc = 85\%$ as seen in \autore
The data shows that a larger configuration store leads to more satisfied group members than recommendations that use a smaller store. The same holds for dissatisfaction, which is lower with a higher number of stored configurations. However, in some runs least misery has shown a slightly lower number with the larger store. This can be seen in \autoref{fig:Evaluation:HeteroSatisfactionIncrease} when comparing $74$ and $148$ stored configurations. Why this happens is not entirely clear, but one possible cause is that least misery only takes the worst-off member of the group into account. It is therefore possible that a second configuration, rated slightly worse by least misery, actually has a slight advantage over the configuration that least misery chooses. Such a second-best configuration can land in the second partition of the data, which then results in less unhappiness on average. \hyporef{hyp:Evaluation:StoreSizeBetterResults} is therefore mostly supported by the data, but it does not fully hold for least misery.
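As a minimal illustration of the least misery argument above, assume the common definitions of the aggregation strategies from the group recommender literature. The notation $r_{u}(c)$ for the rating of configuration $c$ by member $u$ of group $G$ is introduced only for this sketch and may differ from the definitions used elsewhere in this thesis, in particular for best average:
\[
  \mathit{LM}(c) = \min_{u \in G} r_{u}(c), \qquad
  \mathit{MUL}(c) = \prod_{u \in G} r_{u}(c), \qquad
  \mathit{AVG}(c) = \frac{1}{|G|} \sum_{u \in G} r_{u}(c).
\]
For hypothetical ratings $(0.5, 0.5, 0.5)$ of a configuration $a$ and $(0.4, 0.9, 0.9)$ of a configuration $b$, least misery prefers $a$ because $0.5 > 0.4$, whereas multiplication ($0.125 < 0.324$) and the average ($0.5 < 0.73$) both prefer $b$. Under these assumed values, the configuration that least misery ranks second is the better choice for two of the three members.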
\hyporef{hyp:Evaluation:AggregationStrategies} states that least misery performs worse than multiplication. For the change in satisfaction this holds across the board; for the change in dissatisfaction, however, it does not hold everywhere. \autoref{fig:Evaluation:HeteroSatisfactionIncrease} shows that least misery performs better than best average in terms of dissatisfaction reduction. However, in other cases it performs visibly worse. Also of note, multiplication performs best across the board. This supports the findings of \citeauthor{Masthoff2015} \cite[p. 755f]{Masthoff2015} and shows that the satisfaction model produces some results similar to those of online evaluations.
Returning to the questions in \autoref{sec:Evaluation:Questions}, this section has shown that for random and heterogeneous groups the recommender performs better than a dictator. The average satisfaction depends on the chosen parameters, but within the evaluated value range the recommender's decision satisfies more than two group members on average and can come close to three for a high number of stored configurations and for some group types. The number of stored finished configurations plays an important role in performance, but even with only a fraction of the configurations the recommender still yields good results.