From 8651ada7ba1e48bdee3ec97af4e1b1182ea78684 Mon Sep 17 00:00:00 2001 From: "hannes.kuchelmeister" Date: Wed, 18 Mar 2020 17:46:00 +0100 Subject: [PATCH] remove word 'we' from thesis --- 30_Thesis/sections/10_foundations.tex | 14 +++++++------- 30_Thesis/sections/40_concept.tex | 10 +++++----- 30_Thesis/sections/60_evaluation.tex | 6 +++--- 3 files changed, 15 insertions(+), 15 deletions(-) diff --git a/30_Thesis/sections/10_foundations.tex b/30_Thesis/sections/10_foundations.tex index e060da2..5ff91bf 100644 --- a/30_Thesis/sections/10_foundations.tex +++ b/30_Thesis/sections/10_foundations.tex @@ -10,7 +10,7 @@ Formally a configuration problem can be specified as a \emph{constraint satisfac \begin{equation} \label{eq:Foundations:ProductConfiguration:ConstraintSatisfactionProblem} CSP(V,\mathfrak{D},C), \end{equation} -where we have a set of \emph{variables} $V$ (which in this thesis will also be referred to as \emph{features}) with +where $V$ is a set of \emph{variables} (which in this thesis will also be referred to as \emph{features}) with \begin{equation} \label{eq:Foundations:ProductConfiguration:Variables} V = \{v_1, \dotsc, v_m\}, \end{equation} @@ -25,14 +25,14 @@ and \emph{constraints} $C$ that limit the solution space with \subsection{Configuration State} -We will define a \emph{configuration} $S$ as a tuple of variables (\autoref{eq:Foundations:ProductConfiguration:Variables}) and their corresponding domain value with +A \emph{configuration} $S$ will be defined as a tuple of variables (\autoref{eq:Foundations:ProductConfiguration:Variables}) and their corresponding domain value with \begin{equation} \label{eq:Foundations:ProductConfiguration:ConfigurationState} S = \{ (v_i,\ d) \ |\ v_i \in V \ \land \ d \in \mathfrak{D}(i),\ i=1,\dotsc,m \}. \end{equation} Essentially it is a set of variables and assigned values. \subsection{Finished Configuration} -To define a \emph{finished configuration} we first need to define what a valid configuration is. 
Therefore we define $is\_valid$ as +To define what a \emph{finished configuration} is, it is required to first define what it means for a configuration to be valid. Therefore $is\_valid$ is defined as \begin{equation} \label{eq:Foundations:ProductConfiguration:IsValid} is\_valid : S \to \{true, false\}; x \mapsto \begin{cases} @@ -44,13 +44,13 @@ with $solution\_space$ being the solution space of the corresponding constraint \begin{equation} \label{eq:Foundations:ProductConfiguration:FinishedConfiguration} S_F \subset S,\ where \ \forall v_i \in V (\exists (v_i, d) \in S_F : d \in \mathfrak{D}(i)) \land is\_valid(S_F). \end{equation} -In practice a finished configuration of a product (or solution) is something that is ready to be produced. For example if we are configuring a car, this means that the car could be produced in the specified way that is given by the finished configuration. +In practice a finished configuration of a product (or solution) is something that is ready to be produced. For example, if a car is being configured, this means that the car could be produced in the specified way that is given by the finished configuration. \section{Group-Based Product Configuration} \label{sec:Foundations:GroupBasedProductConfiguration} -Instead of a single person configuring a product, a group of people is configuring one product which can be useful in multi-stakeholder decisions. This setting needs mechanisms for describing the preferences of multiple people. Therefore we will add to our definitions, a set of users $U$ with +Instead of a single person configuring a product, a group of people is configuring one product which can be useful in multi-stakeholder decisions. This setting needs mechanisms for describing the preferences of multiple people. 
Therefore a set of users $U$ is added to the definitions, with \begin{equation}\label{eq:Foundations:ProductConfiguration:Users} U = \{1, \dotsc, n\}, \end{equation} @@ -91,7 +91,7 @@ In collaborative filtering a users rating for unknown items is predicted by find Collaborative Filtering can not only be done using users, it can also be item-based. Hereby the similarity between items is used for a recommendation and not similar users \cite{ricciRecommenderSystemsHandbook2015}. In the context of configuration the similarity to other historic configurations can be used which makes it an item based approach. -\autoref{tab:Foundations:RecommenderSystem:MoviePreferences} shows an example rating matrix. A simple user-based way to calculate a rating would be to use a k-nearest neighbour (kNN) algorithm and then take the average of those ratings. Using this method with $k := 2$ and euclidean distance our closest neighbours are \textit{Lucy} and \textit{Diane} therefore giving us a predicted rating of $4$. If we use an item based approach instead, we will try to find similar items based on the users rating. An example of similar items here would be \textit{Forest Gump} and \textit{Wall-E} as John and Lucy each have given them the sane rating and Eric's rating is off by one. Using again kNN with $k := 2$ we find that \textit{Forest Gump} and \textit{Wall-E} are the most similar to \textit{Titanic} thereby having a predicted rating of $4.5$. +\autoref{tab:Foundations:RecommenderSystem:MoviePreferences} shows an example rating matrix. A simple user-based way to calculate a rating would be to use a k-nearest neighbour (kNN) algorithm and then take the average of those ratings. Using this method with $k := 2$ and euclidean distance our closest neighbours are \textit{Lucy} and \textit{Diane} therefore giving us a predicted rating of $4$. If an item-based approach is used instead, the aim is to find similar items based on the users' ratings. 
An example of similar items here would be \textit{Forest Gump} and \textit{Wall-E} as John and Lucy each have given them the same rating and Eric's rating is off by one. Using kNN again with $k := 2$ it is found that \textit{Forest Gump} and \textit{Wall-E} are the most similar to \textit{Titanic} thereby having a predicted rating of $4.5$. However this simple similarity and prediction function does not take into account different distances. For example Lucy's ratings are more similar compared to Eric's than Diane's but Diane's and Lucy's rating is valued the same amount. \todo[inline]{ @@ -109,7 +109,7 @@ Our movie example (see \autoref{tab:Foundations:RecommenderSystem:MoviePreferenc \subsection{Content-Based Filtering} For the content-based filtering approach, items and users are assigned to categories. Based on consumption and rating of items a user will have implicit ratings for categories. Predictions are now made based on a categories of the new item \cite[~ pp. 10, 11]{felfernigDecisionTasksBasic2018}. -Using our example from \autoref{tab:Foundations:RecommenderSystem:MoviePreferences} and using an additional category matrix (see \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}) we can derive a rating matrix per category (using the average rating of the user of each movie contained in this category). The result can be seen in \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringProfiles}. To predict Eric's rating of Titanic we now can use the categories of \textit{Titanic} and average out Eric's implicit rating per category. Titanic is only in the category romance and as Eric's rating of \textit{Forest Gump} is $5$ the prediction is a rating of $5$. Categories don't have to be the genre, they could be any kind of data about a movie. 
+Using the example from \autoref{tab:Foundations:RecommenderSystem:MoviePreferences} and using an additional category matrix (see \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}) a rating matrix per category can be derived (using the average rating of the user of each movie contained in this category). The result can be seen in \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringProfiles}. To predict Eric's rating of Titanic, the categories of \textit{Titanic} and averages of Eric's implicit rating per category are used. Titanic is only in the category romance and as Eric's rating of \textit{Forest Gump} is $5$ the prediction is a rating of $5$. Categories don't have to be the genre, they could be any kind of data about a movie. \begin{table} \centering diff --git a/30_Thesis/sections/40_concept.tex b/30_Thesis/sections/40_concept.tex index a50c8f2..5ce0601 100644 --- a/30_Thesis/sections/40_concept.tex +++ b/30_Thesis/sections/40_concept.tex @@ -51,7 +51,7 @@ The used characteristics and attributes are shown in \autoref{fig:Concept:Forest \begin{figure} \begin{mdframed}[frametitle={Example for Forest Use Case}] - In this example we have a small group of users. The use case is a piece of forest and variables are for example harvesting activity, which trees to grow and accessibility for people. + In this example there is a small group of users. The use case is a piece of forest and variables are for example harvesting activity, which trees to grow and accessibility for people. \begin{align} \begin{split} V = \{ & \textit{Heimisch}, \textit{Klimaresilient}, \textit{Verwertbar}, \textit{Ernteaufwand}, \\ @@ -136,9 +136,9 @@ This thesis will use multiple scoring functions. Among those are ones for least \subsubsection{Preference Scoring} -All of the aggregation functions mentioned in \autoref{subsec:Concept:SolutionGeneration:ScoringFunction} use one preference per user per product. 
Therefore to use them in as is a score for the whole configuration per user has to be calculated. We propose to use the difference from the selected feature compared to the average rating of all characteristics. This approach includes all preferences of a user meaning a preference is also seen relative to others. +All of the aggregation functions mentioned in \autoref{subsec:Concept:SolutionGeneration:ScoringFunction} use one preference per user per product. Therefore, to use them as is, a score for the whole configuration per user has to be calculated. I propose to use the difference from the selected feature compared to the average rating of all characteristics. This approach includes all preferences of a user meaning a preference is also seen relative to others. -As an example we could have feature +As an example a feature could be \begin{equation} F = \text{ClimateResilientTrees}, \end{equation} with characteristics @@ -149,7 +149,7 @@ preferences \begin{equation} P_1 = \{(\text{low}, 0), (\text{medium},0.6), (\text{high},0.9) \} \end{equation} -and the configuration we want to rate +and the configuration to be rated \begin{equation} S_F = \{(\text{ClimateResilientTrees}, \text{high})\}. \end{equation} @@ -160,7 +160,7 @@ A second user with preferences \end{equation} on the other hand results in a feature score of $0.9-0.3=0.6$. For this user characteristic \emph{high} is of higher importance. -As we would like to keep our scores as percentages and not in the interval $[-1,1]$ a normalisation is applied by adding one and dividing by two. Therefore our respective scores are $0.7$ for user one and $0.95$ for user two. A configuration usually consists of more than one feature therefore we take the average rating over all features to get the score one user gives to a configuration. Based on that score the in \autoref{subsec:Concept:SolutionGeneration:ScoringFunction} mentioned aggregation functions can be used. 
+As scores should be kept as percentages and not in the interval $[-1,1]$ a normalisation is applied by adding one and dividing by two. Therefore the respective scores are $0.7$ for user one and $0.95$ for user two. A configuration usually consists of more than one feature, therefore an average rating over all features is taken to get the score one user gives to a configuration. Based on that score, the aggregation functions mentioned in \autoref{subsec:Concept:SolutionGeneration:ScoringFunction} can be used. \subsubsection{Cofiguration Change Penalty} diff --git a/30_Thesis/sections/60_evaluation.tex b/30_Thesis/sections/60_evaluation.tex index 4a82602..a9b3c59 100644 --- a/30_Thesis/sections/60_evaluation.tex +++ b/30_Thesis/sections/60_evaluation.tex @@ -3,7 +3,7 @@ In this chapter the prototype is evaluated in terms of its functionality and its properties. -We will generate all possible valid configurations for one use case i.e. generate all possible valid configurations for the forest use case. +All possible valid configurations will be generated for one use case i.e. all possible valid configurations for the forest use case. Generate groups with preferences (explicit preferences) and configuration state (which would be for example the currently existing forest). @@ -26,7 +26,7 @@ When comparing a group to individual scores, a member of the group is randomly c % see: https://medium.com/@george.drakos62/how-to-select-the-right-evaluation-metric-for-machine-learning-models-part-1-regrression-metrics-3606e25beae0 or https://en.wikipedia.org/wiki/Error_metric \subsection{Satisfaction} -As a metric on overall satisfaction within the group we propose a threshold metric that defines a user as satisfied if his score is above a threshold of 60\% and as unsatisfied with a score of less than 40\%. Now we can measure group satisfaction by amount of members being satisfied, neutral and unsatisfied. 
+As a metric on overall satisfaction within the group a threshold metric is proposed that defines a user as satisfied if his score is above a threshold of 65\% and as unsatisfied with a score of less than 35\%. Now group satisfaction can be measured by the amount of members being satisfied, neutral and unsatisfied. \subsection{Group Score} The group score metric is to simply take the score the recommender has given to a group. This score can be compared with other configurations' score. @@ -47,7 +47,7 @@ The group score metric is to simply take the score the recommender has given to \section{Generating Data} \label{sec:Evaluation:GeneratingGroups} -For the forest use case, the idea is that there are multiple types of user profiles. Each group profile is represented by a neutral, negative or positive attitude to an attribute value. Now during data generation the attitude is converted to a preference using a normal distribution. \autoref{fig:Evaluation:DataGeneration} shows how we convert the user profile to preferences. +For the forest use case, the idea is that there are multiple types of user profiles. Each group profile is represented by a neutral, negative or positive attitude to an attribute value. Now during data generation the attitude is converted to a preference using a normal distribution. \autoref{fig:Evaluation:DataGeneration} shows how the user profile can be converted to preferences. \pgfplotsset{height=5cm,width=\textwidth,compat=1.8} \pgfmathdeclarefunction{gauss}{2}{%