add reference for cross validation

2024-09-04 01:11:00 +02:00 · 2020-05-09 14:29:39 +02:00
parent f93a94d3a3
commit c41ee2b7f1
2 changed files with 11 additions and 1 deletion
--- a/30_Thesis/sections/60_evaluation.tex
+++ b/30_Thesis/sections/60_evaluation.tex
@@ -180,7 +180,7 @@ The natural group type for the use case is a heterogeneous group but to widen th

 \subsection{The Effect of Stored Finished Configurations}

-Another important component of the evaluation is the influence of stored finished configurations. When evaluating a subset of stored finished configurations it is important to avoid outliers. This is the reason why a process inspired by \emph{cross validation} \todo{referenz hinzufügen} is used. The configuration database is randomly ordered and sliced into sub-databases of the needed size. As an example, if the evaluated stored data size is 20, a configuration database containing 100 configurations is split into five sub-databases of size 20. Now the evaluation is carried out for each of the sub-databases and finally the average is determined. This avoids the random picking of a subset which either performs much better than most other possible combinations of databases or which performs much worse. This way the data is more aligned to the expected value.
+Another important component of the evaluation is the influence of stored finished configurations. When evaluating a subset of stored finished configurations it is important to avoid outliers. This is the reason why a process inspired by \emph{cross validation} \cite{kohaviStudyCrossValidationBootstrap1995} is used. The configuration database is randomly ordered and sliced into sub-databases of the needed size. As an example, if the evaluated stored data size is 20, a configuration database containing 100 configurations is split into five sub-databases of size 20. Now the evaluation is carried out for each of the sub-databases and finally the average is determined. This avoids the random picking of a subset which either performs much better than most other possible combinations of databases or which performs much worse. This way the data is more aligned to the expected value.

 \section{Hypotheses}
 \label{sec:Evaluation:Hypotheses}
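
The sub-database procedure described in the changed paragraph (shuffle the configuration database, slice it into sub-databases of the evaluated size, evaluate each one, average the results) could be sketched roughly as follows. This is only an illustration, not the thesis implementation: the function names, the `evaluate` callback and the fixed `seed` are hypothetical, and dropping configurations that do not fill a complete sub-database is an assumption the text does not state.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_sub_databases(
    configurations: Sequence[T], size: int, seed: int = 0
) -> List[List[T]]:
    """Randomly order the database and slice it into sub-databases of `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    shuffled = list(configurations)
    # A seeded generator keeps the random order reproducible (assumption).
    random.Random(seed).shuffle(shuffled)
    full_slices = len(shuffled) // size
    # Assumption: leftover configurations that do not fill a slice are dropped.
    return [shuffled[i * size:(i + 1) * size] for i in range(full_slices)]


def averaged_evaluation(
    configurations: Sequence[T],
    size: int,
    evaluate: Callable[[List[T]], float],
    seed: int = 0,
) -> float:
    """Evaluate every sub-database and return the average of the results."""
    sub_databases = split_into_sub_databases(configurations, size, seed)
    if not sub_databases:
        raise ValueError("database is smaller than the requested size")
    return mean(evaluate(sub_database) for sub_database in sub_databases)
```

With 100 configurations and a stored data size of 20, `split_into_sub_databases` yields five disjoint sub-databases, matching the example in the paragraph. The reported value is then the mean over those five evaluation runs rather than the result of a single randomly picked subset.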