Files
bachelor_thesis/30_Thesis/sections/10_foundations.tex
2020-04-09 11:22:54 +02:00

261 lines
17 KiB
TeX

\chapter{Foundations}
\label{ch:Foundations}
\section{Product Configuration}
\label{sec:Foundations:ProductConfiguration}
Product configuration is a process consisting of a series of decision tasks whereby a product is constructed of components which interact with each other. During a configuration process no new components are created. Their interplay and specification is defined beforehand \cite[~ pp. 42, 43]{sabinProductConfigurationFrameworksa1998}. This section defines product configuration analogous to \citeauthor{felfernigOpenConfiguration2014} \cite{felfernigOpenConfiguration2014}.
Formally a configuration problem can be specified as a \emph{constraint satisfaction problem (CSP)} \cite{tsangFoundationsConstraintSatisfaction1993} as
\begin{equation} \label{eq:Foundations:ProductConfiguration:ConstraintSatisfactionProblem}
CSP(V,\mathfrak{D},C),
\end{equation}
where $V$ is a set of \emph{variables} (which in this thesis will also be referred to as \emph{features}) with
\begin{equation} \label{eq:Foundations:ProductConfiguration:Variables}
V = \{v_1, \dotsc, v_m\},
\end{equation}
a \emph{domain mapping} $\mathfrak{D}$ that maps variable to their corresponding domain values $D$ (which in this thesis will be also referred to as \emph{characteristics}) i.e. the values that can be assigned to each variable
\begin{equation}\label{eq:Foundations:ProductConfiguration:DomainMapping}
\mathfrak{D} : V \to D; x \mapsto \mathfrak{D}(x) \qquad where \ D = \{d_1, \dotsc, d_o\},
\end{equation}
and \emph{constraints} $C$ that limit the solution space with
\begin{equation}\label{eq:Foundations:ProductConfiguration:Constraints}
C = \{c_1, \cdots, c_k\}.
\end{equation}
\subsection{Configuration State}
A \emph{configuration} $S$ will be defined as a tuple of variables (\autoref{eq:Foundations:ProductConfiguration:Variables}) and their corresponding domain value with
\begin{equation} \label{eq:Foundations:ProductConfiguration:ConfigurationState}
S = \{ (v_i,\ d) \ |\ v_i \in V \ \land \ d \in \mathfrak{D}(i),\ i=1,\dotsc,m \}.
\end{equation}
Essentially it is a set of variables and assigned values.
\subsection{Finished Configuration}
To define what a \emph{finished configuration} is, it is required to first define what it means for a configuration to be valid. Therefore $is\_valid$ is defined as
\begin{equation} \label{eq:Foundations:ProductConfiguration:IsValid}
is\_valid : S \to \{true, false\}; x \mapsto
\begin{cases}
true, & S \in solution\_space \\
false, & \text{otherwise}
\end{cases},
\end{equation}
with $solution\_space$ being the solution space of the corresponding constraint satisfaction problem. A \emph{finished configuration} $S_F$ is a configuration that contains all variables and is a valid configuration:
\begin{equation} \label{eq:Foundations:ProductConfiguration:FinishedConfiguration}
S_F \subset S,\ where \ \forall v_i \in V (\exists (v_i, d) \in S_F : d \in \mathfrak{D}(i)) \land is\_valid(S_F).
\end{equation}
In practice a finished configuration of a product (or solution) is something that is ready to be produced. For example if a care is being configured, this means that the car could be produced in the specified way that is given by the finished configuration.
\section{Group-Based Product Configuration}
\label{sec:Foundations:GroupBasedProductConfiguration}
Instead of a single person configuring a product, a group of people is configuring one product which can be useful in multi-stakeholder decisions. This setting needs mechanisms for describing the preferences of multiple people. Therefore to the definitions there will be added a set of users $U$ with
\begin{equation}\label{eq:Foundations:ProductConfiguration:Users}
U = \{1, \dotsc, n\},
\end{equation}
a users \emph{utility function} that maps a domain value to a utility value and is only known to the user
\begin{equation}
\begin{split}
u_i(d_j), \qquad \text{where}\ & d_j \in \mathfrak{D}(j),\\
& 1 <= j <= m, \\
& 1 <= i <= n .
\end{split}
\end{equation}
\citeauthor{raabKollaborativeProduktkonfigurationEchtzeit2019}'s \cite{raabKollaborativeProduktkonfigurationEchtzeit2019} approach, which this thesis extends, is to treat the group configuration the same as one shared configuration and to sync the selection of attributes across clients.
\section{Recommender System}
\label{sec:Foundations:RecommenderSystem}
A recommender system is a system that gives individualized recommendations to users to guide them through a large space of objects \cite[~ p. 331]{burkeHybridRecommenderSystems2002}.
There are several approaches to recommender systems presented in \cite{felfernigGroupRecommenderSystems2018}, these are: collaborative filtering, \todo[inline]{critiquing-based recommendation,} constraint-based recommendation, content-based filtering and hybrid recommendation.
\begin{table}
\centering
\begin{tabular}{ l | c | c | c | c | c }
& The Matrix & Titanic & Die Hard & Forest Gump & Wall-E \\ \hline
John & 5 & 1 & & 2 & 2 \\
Lucy & 1 & 5 & 2 & 5 & 5 \\
Eric & 2 & ? & 3 & 5 & 4 \\
Diane & 4 & 3 & 5 & 3 & \\
\end{tabular}
\caption{An example showing users ratings for movies by \citeauthor{ningComprehensiveSurveyNeighborhoodBased2015} \cite{ningComprehensiveSurveyNeighborhoodBased2015}.}
\label{tab:Foundations:RecommenderSystem:MoviePreferences}
\end{table}
\subsection{Collaborative Filtering}
In collaborative filtering a users rating for unknown items is predicted by finding similar users who have rated it. Their rating is used as prediction
\cite[~ pp. 7, 8]{felfernigDecisionTasksBasic2018}.
Collaborative Filtering can not only be done using users, it can also be item-based. Hereby the similarity between items is used for a recommendation and not similar users \cite{ricciRecommenderSystemsHandbook2015}. In the context of configuration the similarity to other historic configurations can be used which makes it an item based approach.
\autoref{tab:Foundations:RecommenderSystem:MoviePreferences} shows an example rating matrix. A simple user-based way to calculate a rating would be to use a k-nearest neighbour (kNN) algorithm and then take the average of those ratings. Using this method with $k := 2$ and euclidean distance our closest neighbours are \textit{Lucy} and \textit{Diane} therefore giving us a predicted rating of $4$. If an item-based approach is used instead, it will be tried to find similar items based on the user's rating. An example of similar items here would be \textit{Forest Gump} and \textit{Wall-E} as John and Lucy each have given them the sane rating and Eric's rating is off by one. Using again kNN with $k := 2$ it is found that \textit{Forest Gump} and \textit{Wall-E} are the most similar to \textit{Titanic} thereby having a predicted rating of $4.5$.
However this simple similarity and prediction function does not take into account different distances. For example Lucy's ratings are more similar compared to Eric's than Diane's but Diane's and Lucy's rating is valued the same amount.
\todo[inline]{
Critiquing-Based Recommendation %\subsection{Critiquing-Based Recommendation}
Items are recommended and a user can then critique on attributes of the recommendation. Based on that a similar item which does not have those critiques can be recommended. User preferences are implicitly collected this way \cite{knijnenburgEachHisOwn2011}.
With a critique based approach Eric sees a suggestion of watching \textit{The Island} and its attributes. He then can say that he finds this movie has too much action. The critique based recommender will now present a movie that has similar attributes as \textit{The Island} but with less action. For example \textit{Titanic} could be the next suggestion.
}
\subsection{Constraint-Based Recommendation}
Hereby filter rules are defined which filter out items that don't fulfil specified rules. A user models their requirements with these rules and thereby gets a list of recommended items. This approach requires deep knowledge about a product because it needs a detailed description of features \cite[~ p. 12]{felfernigDecisionTasksBasic2018}.
Our movie example (see \autoref{tab:Foundations:RecommenderSystem:MoviePreferences}) requires additional information for example about plot structure, pacing, length and other attributes of the movie. Now the user could enter filtering criteria, e.g. that the movie should be no longer than 120 minutes, be categorized as action or thriller and have a fast pacing. The system will only recommend movies that fit into these categories.
\subsection{Content-Based Filtering}
For the content-based filtering approach, items and users are assigned to categories. Based on consumption and rating of items a user will have implicit ratings for categories. Predictions are now made based on a categories of the new item \cite[~ pp. 10, 11]{felfernigDecisionTasksBasic2018}.
Using the example from \autoref{tab:Foundations:RecommenderSystem:MoviePreferences} and using an additional category matrix (see \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}) it a rating matrix per category can be derived (using the average rating of the user of each movie contained in this category). The result can be seen in \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringProfiles}. To predict Eric's rating of Titanic, the categories of \textit{Titanic} and averages of Eric's implicit rating per category are used. Titanic is only in the category romance and as Eric's rating of \textit{Forest Gump} is $5$ the prediction is a rating of $5$. Categories don't have to be the genre, they could be any kind of data about a movie.
\begin{table}
\centering
\begin{tabular}{ l | c | c | c | c | c }
& The Matrix & Titanic & Die Hard & Forest Gump & Wall-E \\ \hline
Action & x & & x & & \\
Sci-Fi & x & & & & \\
Thriller & & & x & & \\
Romance & & x & & x & \\
Family & & & & x & x \\
\end{tabular}
\caption{Showing example categories for movies in \autoref{tab:Foundations:RecommenderSystem:MoviePreferences}.}
\label{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}
\end{table}
\begin{table}
\centering
\begin{tabular}{ l | c | c | c | c | c }
& Action & Sci-Fi & Thriller & Romance & Family \\ \hline
John & 5 & 5 & & 1.5 & 2 \\
Lucy & 1.5 & 1 & 2 & 5 & 5 \\
Eric & 2.5 & 2 & 3 & 5 & 4.5 \\
Diane & 4.5 & 4 & 5 & 3 & 3 \\
\end{tabular}
\caption{User profiles generated from categories and rating from \autoref{tab:Foundations:RecommenderSystem:MoviePreferences} and \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}.}
\label{tab:Foundations:RecommenderSystem:ContentBasedFilteringProfiles}
\end{table}
\subsubsection{Advantages over Collaborative Filtering}
\begin{itemize}
\item No cold start problem for items
\item No grey sheep problem as not dependent on similar groups having existed before.
\item Domain knowledge is existent
\item No issues with data sparsity as item description is given by product structure
\item No reliance on preferences that would result in a comparison space that is too large
\item No dependence of historic group preference accuracy
\end{itemize}
\todo[inline]{
Advantages over Critique-Based Recommendation %\subsubsection{Advantages over Critique-Based Recommendation
}
\subsubsection{Advantages over Constrained-Based Recommendation}
\begin{itemize}
\item Configuration state does not cause absence of recommendations
\item Expendable to also support constraints
\item No need to handle inconsistencies explicitly
\end{itemize}
\begin{table}
\begin{center}
\begin{tabularx}{\columnwidth}{X|X|X}
& advantages & disadvantages \\
\hline
Collaborative Filtering
& \begin{itemize}
\item Serendipity of results
\item Automatic learning of market segments
\item No domain knowledge required
\end{itemize}
& \begin{itemize}
\item Cold start problem for users and items
\item Grey sheep problem
\item Quality based on rating quality
\item Data sparsity
\item Privacy not guaranteed
\end{itemize} \\
\hline
Content-Based Filtering
& \begin{itemize}
\item No community required
\item User independent
\item Transparent
\item No item cold start
\item Simplicity
\item Robust
\item Stable to constant influx of new users
\end{itemize}
& \begin{itemize}
\item Overspecialisation
\item No serendipity
\item User cold start problem
\item Requires domain knowledge
\end{itemize} \\
\hline
\todo[inline]{Critique-Based Recommendation } & \todo[inline]{ todo: positives } & \todo[inline]{ todo: negatives } \\
Constraint-Based Recommendation
& \begin{itemize}
\item Transparent
\item Good for non discrete values
\end{itemize}
& \begin{itemize}
\item Inconsistent constraints
\item No results
\end{itemize} \\
\end{tabularx}
\caption{A description of the advantages and disadvantages of common recommendation techniques \cite{richthammerSituationAwarenessRecommender2018, shokeenStudyFeaturesSocial2019,hahslerRecommenderlabFrameworkDeveloping2015, aminiDiscoveringImpactKnowledge2011, suSurveyCollaborativeFiltering2009}}
\label{tab:Foundations:RecommenderComparison}
\end{center}
\end{table}
\subsection{Hybrid Recommendation}
A hybrid recommender combines different recommendation approaches to use the strengths of each individual one and to reduce effects of weaknesses \cite{burkeHybridRecommenderSystems2002}.
\section{Group Recommender System}
\label{sec:Foundations:GroupRecommenderSystem}
A group recommender system is a recommender system aimed at making recommendations for a group instead of a single user. To make recommendations group members preferences have to be aggregated. This can be done by either aggregating single user recommendations or by merging preferences of each user into a group preference model. Based on the resulting preference model recommendation strategies as described in \autoref{sec:Foundations:RecommenderSystem} can be used to generate recommendations \cite{jamesonRecommendationGroups2007}.
For a group recommender system additional definitions are needed. The attitude of a user is represented by their preferences $P$ which is directly related to the utility a user has from a domain value being present in the configuration. Let
\begin{gather} \label{tab:Foundations:GroupRecommenderSystem:Preferences}
P = \{ P_1, \dotsc, P_n\},\ \text{where} \\
P_i = \{(d,\ u_i(d)) \ | \ \forall d \in \mathfrak{D}(i),\ i=1,\dotsc,m \} \notag
\end{gather}
\todo[inline]{example of a group recommender}
\todo[inline]{go more into detail about preference aggregation}
\section{Base Recommender System}
\label{sec:Foundations:BaseSystem}
\citeauthor{raabKollaborativeProduktkonfigurationEchtzeit2019}'s \cite{raabKollaborativeProduktkonfigurationEchtzeit2019} extends CAS Merlin Configurator in his thesis to allow simultaneous configuration. The extended architecture is shown in \autoref{fig:DesignImplementation:CollaborativeConfiguratorMerlin}.
He only makes changes to M.Customer which is renamed to M.Collab-Customer and introduces a new component M.Collab.
\begin{figure}
\centering
\includegraphics[width=0.6\textwidth]{./figures/50_design_and_implementation/MerlinCollaborativeConfigurator.pdf}
\caption{Architecture of Collaborative Configurator Merlin \cite[Fig. 4.3]{raabKollaborativeProduktkonfigurationEchtzeit2019}}
\label{fig:DesignImplementation:CollaborativeConfiguratorMerlin}
\end{figure}
\begin{description}
\item[M.Core] provides the base of the configurator. It checks the configuration against all rules in the database, provides possible alternatives if a change invalidates other parts of a configuration. The system relies on a CSP solver for valida tion and suggestion of alternatives.
\item[M.Model] is the editor to create products and rules. These rules can then be uploaded to M.Core.
\item[M.Collab] is a node.js server application that communicates with M.Core via REST-API and with M.Collab-Customer via WebSocket. It sits in between M.Collab-Customer and M.Core and handles all processing regarding collaborative configuration.
\item[M.Collab-Customer] a modified version of M.Customer that does all communication via WebSocket and does communicate with M.Collab instead of M.Core. M.Customer is the customer facing component. It allows a customer to configure a product or solution.
\end{description}
\FloatBarrier