replace figure by better ones and remove from appendix

hannes.kuchelmeister
2020-04-06 17:09:26 +02:00
parent 0b83b3e9fd
commit 0e1dc4bc0e
11 changed files with 45 additions and 69 deletions


@@ -262,42 +262,55 @@ During a group decision it is better to make one less person dissatisfied oppose
\subsection{Data Analysis}
\begin{figure}[p]
\centering
\includegraphics[width=1\textwidth]{./figures/60_evaluation/heterogeneous_combined__amount-1000__tc-70}
\caption{The satisfaction and dissatisfaction using the group recommender for heterogeneous groups with $tc = 70$.}
\label{fig:Evaluation:HeteroSatisfaction}
\end{figure}
\begin{figure}[p]
\centering
\includegraphics[width=1\textwidth]{./figures/60_evaluation/random_combined__amount-1000__tc-85}
\caption{The satisfaction and dissatisfaction using the group recommender for random groups with $tc = 85$.}
\label{fig:Evaluation:RandomSatisfaction}
\end{figure}
\begin{figure}[p]
\centering
\includegraphics[width=1\textwidth]{./figures/60_evaluation/homogeneous_combined__amount-1000__tc-94}
\caption{The satisfaction and dissatisfaction using the group recommender for homogeneous groups with $tc = 94$.}
\label{fig:Evaluation:HomoSatisfaction}
\end{figure}
This subsection keeps $tc$ fixed. It describes the satisfaction change and the total number of people satisfied with the recommender's decision as a function of the number of stored configurations. For clarity, not all graphs of the data are included. The missing graphs can be found in the appendix and are referenced in the text.
\autoref{fig:Evaluation:HeteroSatisfactionIncrease} shows the relationship between the change in satisfaction and dissatisfaction and the number of stored configurations. There are three graphs for each: one for multiplication, one for least misery and one for best average. The graphs for satisfaction resemble a logarithmic curve: the gain in satisfaction slows as the number of stored configurations grows. The change in satisfaction is always above zero, and more than three quarters of the maximum increase is already reached with around 25 stored configurations. Moreover, the curve for multiplication lies above all other curves for all parameters. Least misery shows the smallest change across all values. The minimum satisfaction change is $0$ for least misery and $0.1$ for best average and multiplication. The maximum is around $0.3$ for least misery, $0.4$ for best average and $0.5$ for multiplication.
\autoref{fig:Evaluation:HeteroSatisfaction} shows the relationship between the change in satisfaction and dissatisfaction and the number of stored configurations. There are three graphs for each: one for multiplication, one for least misery and one for best average. The graphs for satisfaction resemble a logarithmic curve: the gain in satisfaction slows as the number of stored configurations grows. The change in satisfaction is always above zero, and more than three quarters of the maximum increase is already reached with around 25 stored configurations. Moreover, the curve for multiplication lies above all other curves for all parameters. Least misery shows the smallest change across all values. The minimum satisfaction change is $0$ for least misery and $0.1$ for best average and multiplication. The maximum is around $0.3$ for least misery, $0.4$ for best average and $0.5$ for multiplication.
The dissatisfaction change is negative in all graphs. Multiplication reaches the lowest value and best average the highest. The gap between the three functions is smaller than for the satisfaction increase. Overall the curves are flatter: with 25 stored configurations the change already reaches close to five sixths of the minimum value. The highest dissatisfaction change is $-0.4$ for all strategies, while the lowest is around $-0.57$ for least misery, $-0.53$ for best average and $-0.63$ for multiplication.
The figures for homogeneous (\autoref{fig:Appendix:HomoSatisfactionIncrease}) and random groups (\autoref{fig:Appendix:RandomSatisfactionIncrease}) are in the appendix. The figures have a similar shape but their values and slopes vary. The satisfaction change for homogeneous groups is mostly negative, starting at $-2$, and only becomes positive for more than $100$ stored configurations, with a value of $0.04$. Multiplication and best average have higher values than least misery here too. Moreover, the dissatisfaction change is positive across the board with a value range of $[0,1]$.
Random groups, as seen in \autoref{fig:Appendix:RandomSatisfactionIncrease}, mostly have a positive change in satisfaction. Values range from $-0.55$ to $0.27$ for least misery, and from $-0.27$ and $-0.28$, respectively, to $0.74$ for best average and multiplication. The change is higher than the change for heterogeneous groups. Dissatisfaction changes similarly to heterogeneous groups, but the values for random groups reach a lower level. They range from $0$ to $-0.59$ for least misery. Multiplication and best average behave similarly: both range from around $-0.21$ down to $-0.84$ for best average and $-0.86$ for multiplication.
The figures for homogeneous (\autoref{fig:Evaluation:HomoSatisfaction}) and random groups (\autoref{fig:Evaluation:RandomSatisfaction}) have a similar shape but their values and slopes vary. The satisfaction change for homogeneous groups is mostly negative, starting at $-2$, and only becomes positive for more than $100$ stored configurations, with a value of $0.04$. Multiplication and best average have higher values than least misery here too. Moreover, the dissatisfaction change is positive across the board with a value range of $[0,1]$.
Random groups, as seen in \autoref{fig:Evaluation:RandomSatisfaction}, mostly have a positive change in satisfaction. Values range from $-0.55$ to $0.27$ for least misery, and from $-0.27$ and $-0.28$, respectively, to $0.74$ for best average and multiplication. The change is higher than the change for heterogeneous groups. Dissatisfaction changes similarly to heterogeneous groups, but the values for random groups reach a lower level. They range from $0$ to $-0.59$ for least misery. Multiplication and best average behave similarly: both range from around $-0.21$ down to $-0.84$ for best average and $-0.86$ for multiplication.
\begin{figure}
\centering
\includegraphics[width=1\textwidth]{./figures/60_evaluation/heterogeneous_happy_unhappy_increase_amount-1000__tc-70}
\caption{The satisfaction and dissatisfaction change using the group recommender for heterogeneous groups with $tc = 70$.}
\label{fig:Evaluation:HeteroSatisfactionIncrease}
\end{figure}
\autoref{fig:Evaluation:HeteroSatisfaction} also shows the total number of group members satisfied and dissatisfied with the recommender's decision. Satisfaction with the recommender's decision starts at $2.4$ and quickly reaches $2.65$ for least misery and $2.8$ for best average and multiplication. The highest value for multiplication is $2.89$. Dissatisfaction also quickly plateaus. Here the values for the different recommenders are closer together. They start between $0.74$ (least misery) and $0.78$ (best average) and go as low as $0.62$ for least misery, $0.66$ for best average and $0.56$ for multiplication.
\autoref{fig:Evaluation:HeteroSatisfactionTotal} shows the total number of group members satisfied and dissatisfied with the recommender's decision. The horizontal continuous black line shows the value for satisfaction and dissatisfaction with the dictator's decision. The graphs show the same curves as \autoref{fig:Evaluation:HeteroSatisfactionIncrease} but in absolute numbers. Satisfaction with the recommender's decision starts at $2.4$ and quickly reaches $2.65$ for least misery and $2.8$ for best average and multiplication. The highest value for multiplication is $2.89$. Dissatisfaction also quickly plateaus. Here the values for the different recommenders are closer together. They start between $0.74$ (least misery) and $0.78$ (best average) and go as low as $0.62$ for least misery, $0.66$ for best average and $0.56$ for multiplication.
As shown in \autoref{fig:Evaluation:HomoSatisfaction}, when looking at the total numbers the value range for homogeneous groups is much larger but the overall shape stays the same. Here satisfaction numbers go from $0.55$ to $2.95$. Least misery performs visibly worse than multiplication and best average, reaching only $2.7$. Dissatisfaction values range from $1.21$ to $0.01$ and the curves are hardly distinguishable visually, except that in the range $[25,50]$ least misery seems to have the highest number of dissatisfied group members.
As shown in \autoref{fig:Appendix:HomoSatisfactionTotal}, the value range for homogeneous groups is much larger but the overall shape stays the same. Here satisfaction numbers go from $0.55$ to $2.95$. Least misery performs visibly worse than multiplication and best average, reaching only $2.7$. Dissatisfaction values range from $1.21$ to $0.01$ and the curves are hardly distinguishable visually, except that in the range $[25,50]$ least misery seems to have the highest number of dissatisfied group members.
Random groups have less overall satisfaction with $tc = 85\%$, as seen in \autoref{fig:Evaluation:RandomSatisfaction} when looking at the total numbers. Satisfaction numbers start from $1.33$ (least misery), $1.61$ (best average) and $1.6$ (multiplication) and go up to $2.15$ for least misery and $2.62$ for best average and multiplication. The dissatisfaction numbers start at $1.5$ for least misery and $1.27$ for best average and multiplication and level off at $0.9$ (least misery), $0.65$ (best average) and $0.63$ (multiplication). There is a visibly large difference between least misery and the other two aggregation functions.
Random groups have less overall satisfaction with $tc = 85\%$, as seen in \autoref{fig:Appendix:RandomSatisfactionTotal}. Satisfaction numbers start from $1.33$ (least misery), $1.61$ (best average) and $1.6$ (multiplication) and go up to $2.15$ for least misery and $2.62$ for best average and multiplication. The dissatisfaction numbers start at $1.5$ for least misery and $1.27$ for best average and multiplication and level off at $0.9$ (least misery), $0.65$ (best average) and $0.63$ (multiplication). There is a visibly large difference between least misery and the other two aggregation functions.
\begin{figure}
\centering
\includegraphics[width=1\textwidth]{./figures/60_evaluation/heterogeneous_happy_unhappy_total_amount-1000__tc-70}
\caption{The average satisfaction and dissatisfaction with the recommender's decision for heterogeneous groups based on $tc = 70$.}
\label{fig:Evaluation:HeteroSatisfactionTotal}
\end{figure}
\subsection{Discussion}
With the data described, the focus now shifts to the remaining hypotheses that have not yet been evaluated.
\autoref{hyp:Evaluation:HomogenousMoreSatisfied} states that homogeneous groups have more satisfied members with regard to the dictator's and the group recommender's decision. \autoref{fig:Evaluation:tcCount} shows that this holds true for the dictator's decision, as in every instance satisfaction in homogeneous groups is higher than that of other groups. However, \autoref{fig:Evaluation:HeteroSatisfactionTotal}, \autoref{fig:Appendix:HomoSatisfactionTotal} and \autoref{fig:Appendix:RandomSatisfactionTotal} show that for satisfaction with the recommender's decision this does not hold when looking at the $tc$ values where the recommender performs best for each segment. There, the homogeneous group only reaches the highest level of satisfaction when the recommender has access to all stored configurations. With a decreasing number of stored configurations, both random groups and heterogeneous groups perform better. It is important to note that when the same $tc$ values are used, homogeneous groups have a higher number of satisfied people across the board.
\autoref{hyp:Evaluation:HomogenousMoreSatisfied} states that homogeneous groups have more satisfied members with regard to the dictator's and the group recommender's decision. \autoref{fig:Evaluation:tcCount} shows that this holds true for the dictator's decision, as in every instance satisfaction in homogeneous groups is higher than that of other groups. However, \autoref{fig:Evaluation:HeteroSatisfaction}, \autoref{fig:Evaluation:HomoSatisfaction} and \autoref{fig:Evaluation:RandomSatisfaction} show that for satisfaction with the recommender's decision this does not hold when looking at the $tc$ values where the recommender performs best for each segment. There, the homogeneous group only reaches the highest level of satisfaction when the recommender has access to all stored configurations. With a decreasing number of stored configurations, both random groups and heterogeneous groups perform better. It is important to note that when the same $tc$ values are used, homogeneous groups have a higher number of satisfied people across the board.
\autoref{hyp:Evaluation:HeterogenousBiggerSatisfactionIncrease} states that the increase in satisfaction should be bigger for more heterogeneous groups. However, \autoref{fig:Evaluation:HeteroSatisfactionIncrease}, \autoref{fig:Appendix:HomoSatisfactionIncrease} and \autoref{fig:Appendix:RandomSatisfactionIncrease} show that this is not the case. The recommendations for heterogeneous groups do cause a larger change in satisfaction than those for homogeneous groups, but random groups show a positive change of even higher magnitude. The decrease in dissatisfaction is also larger among random groups.
\autoref{hyp:Evaluation:HeterogenousBiggerSatisfactionIncrease} states that the increase in satisfaction should be bigger for more heterogeneous groups. However, \autoref{fig:Evaluation:HeteroSatisfaction}, \autoref{fig:Evaluation:HomoSatisfaction} and \autoref{fig:Evaluation:RandomSatisfaction} show that this is not the case. The recommendations for heterogeneous groups do cause a larger change in satisfaction than those for homogeneous groups, but random groups show a positive change of even higher magnitude. The decrease in dissatisfaction is also larger among random groups.
The data shows that a larger configuration database yields more satisfied group members than recommendations based on a smaller database. For dissatisfaction the inverse is seen: a larger configuration database causes the number of dissatisfied group members to drop compared to a small database. However, in some runs there have been instances of least misery that have seen a slight drop. This can be seen in \autoref{fig:Evaluation:HeteroSatisfactionIncrease} when comparing $74$ and $148$ stored configurations. Why this happens is not entirely clear, but one possible cause is that least misery only takes into account the worst performing member of the group. It is therefore possible that there is a second solution that is slightly worse in terms of its least misery score but has a slight advantage in terms of dissatisfaction. This second-best configuration may lie in the second database partition, which then results in less dissatisfaction on average. \autoref{hyp:Evaluation:StoreSizeBetterResults} is therefore supported by the data, but it does not fully hold up when looking at least misery.
The data shows that a larger configuration database yields more satisfied group members than recommendations based on a smaller database. For dissatisfaction the inverse is seen: a larger configuration database causes the number of dissatisfied group members to drop compared to a small database. However, in some runs there have been instances of least misery that have seen a slight drop. This can be seen in \autoref{fig:Evaluation:HeteroSatisfaction} when comparing $74$ and $148$ stored configurations. Why this happens is not entirely clear, but one possible cause is that least misery only takes into account the worst performing member of the group. It is therefore possible that there is a second solution that is slightly worse in terms of its least misery score but has a slight advantage in terms of dissatisfaction. This second-best configuration may lie in the second database partition, which then results in less dissatisfaction on average. \autoref{hyp:Evaluation:StoreSizeBetterResults} is therefore supported by the data, but it does not fully hold up when looking at least misery.
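A small hypothetical example may illustrate how least misery can pick a configuration that leaves more members dissatisfied. The scores, the configurations $c_1$ and $c_2$ and the dissatisfaction threshold of $0.4$ are invented for illustration and are not taken from the evaluation data; the satisfaction model used in the evaluation may define dissatisfaction differently. Assume three group members rate the two configurations as
\[
s(c_1) = (0.35,\ 0.35,\ 0.35), \qquad s(c_2) = (0.30,\ 0.90,\ 0.90).
\]
Least misery compares only the lowest score of each configuration:
\[
\min s(c_1) = 0.35 > \min s(c_2) = 0.30,
\]
so it would recommend $c_1$ whenever both configurations are available. If a member counts as dissatisfied below a score of $0.4$, then $c_1$ leaves all three members dissatisfied while $c_2$ leaves only one. Under these assumptions a smaller database that contains $c_2$ but not $c_1$ would lead to less dissatisfaction than a larger database that contains both. Best average ($0.35$ against $0.70$) and multiplication ($0.35^3 \approx 0.043$ against $0.30 \cdot 0.90 \cdot 0.90 = 0.243$) would both prefer $c_2$ in this example.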
\autoref{hyp:Evaluation:AggregationStrategies} states that least misery performs worse than multiplication. For the change in satisfaction this can be seen across the board; for the change in dissatisfaction, however, it is not true everywhere. \autoref{fig:Evaluation:HeteroSatisfactionIncrease} shows that least misery performs better than best average in terms of dissatisfaction reduction. In other cases, however, it performs visibly worse. Also of note is that multiplication performs best across the board. This supports the findings by \citeauthor{Masthoff2015} \cite[p. 755f]{Masthoff2015} and also shows that the satisfaction model produces results similar to those of online evaluations.
\autoref{hyp:Evaluation:AggregationStrategies} states that least misery performs worse than multiplication. For the change in satisfaction this can be seen across the board; for the change in dissatisfaction, however, it is not true everywhere. \autoref{fig:Evaluation:HeteroSatisfaction} shows that least misery performs better than best average in terms of dissatisfaction reduction. In other cases, however, it performs visibly worse. Also of note is that multiplication performs best across the board. This supports the findings by \citeauthor{Masthoff2015} \cite[p. 755f]{Masthoff2015} and also shows that the satisfaction model produces results similar to those of online evaluations.
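For reference, the three aggregation strategies compared here can be sketched in their common textbook form. The notation is an assumption made for this sketch: $G$ denotes the group, $c$ a configuration and $s_u(c)$ the score that member $u$ assigns to $c$. The scoring function used by the recommender may differ in detail.
\[
\mathrm{LM}(c) = \min_{u \in G} s_u(c), \qquad
\mathrm{AVG}(c) = \frac{1}{|G|} \sum_{u \in G} s_u(c), \qquad
\mathrm{MUL}(c) = \prod_{u \in G} s_u(c)
\]
In each case the configuration with the highest aggregated score would be recommended. Least misery depends only on the single lowest score, whereas best average and multiplication take every member's score into account. This may be one reason why least misery behaves differently from the other two strategies in the figures.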
Returning to \autoref{sec:Evaluation:Questions}, this section has shown that for random and heterogeneous groups the recommender performs better than a dictator. The average satisfaction depends on the chosen parameters, but for the chosen value range the average satisfaction with the recommender's decision lies above two and can reach close to three satisfied group members for a high number of stored configurations and for some group types. The number of stored finished configurations plays an important role in performance, but even with a fraction of the stored configurations the recommender still yields good results.


@@ -18,31 +18,3 @@
%% | / Example content |
%% ---------------------
\begin{figure}
\centering
\includegraphics[width=1\textwidth]{./figures/appendix/homogenous_happy_unhappy_increase_amount-1000__tc-94}
\caption{The satisfaction and dissatisfaction change using the group recommender for homogenous groups with $tc = 94$.}
\label{fig:Appendix:HomoSatisfactionIncrease}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1\textwidth]{./figures/appendix/homogenous_happy_unhappy_total_amount-1000__tc-94}
\caption{The average satisfaction and dissatisfaction with the recommender's decision for homogenous groups based on $tc = 94$.}
\label{fig:Appendix:HomoSatisfactionTotal}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1\textwidth]{./figures/appendix/radnom_happy_unhappy_increase_amount-1000__tc-85}
\caption{The satisfaction and dissatisfaction change using the group recommender for random groups with $tc = 85$.}
\label{fig:Appendix:RandomSatisfactionIncrease}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=1\textwidth]{./figures/appendix/radnom_happy_unhappy_total_amount-1000__tc-85}
\caption{The average satisfaction and dissatisfaction with the recommender's decision for random groups based on $tc = 85$.}
\label{fig:Appendix:RandomSatisfactionTotal}
\end{figure}