bachelor_thesis/25_Outline/sections/20_foundations.tex

\chapter{Foundations}
\label{ch:Foundations}

\section{Product Configuration}
\label{sec:Foundations:ProductConfiguration}

Product configuration is a process consisting of a series of decision tasks whereby a product is constructed of components which interact with each other. During a configuration process no new components are created. Their interplay and specification is defined beforehand \cite[~ pp. 42, 43]{sabinProductConfigurationFrameworksa1998}. This section defines product configuration analogous to \citeauthor{felfernigOpenConfiguration2014} \cite{felfernigOpenConfiguration2014}.

Formally a configuration problem can be specified as a \emph{constraint satisfaction problem (CSP)} \cite{tsangFoundationsConstraintSatisfaction1993} as
\begin{equation} \label{eq:Foundations:ProductConfiguration:ConstraintSatisfactionProblem}
    CSP(V,\mathfrak{D},C),
\end{equation}
where we have a set of \emph{variables} $V$ (which in this thesis will also be referred to as \emph{features}) with
\begin{equation} \label{eq:Foundations:ProductConfiguration:Variables}
    V = \{v_1, \dotsc, v_m\},
\end{equation}
a \emph{domain mapping} $\mathfrak{D}$ that maps variable to their corresponding domain values $D$ (which in this thesis will be also referred to as \emph{characteristics}) i.e. the values that can be assigned to each variable
\begin{equation}\label{eq:Foundations:ProductConfiguration:DomainMapping}
    \mathfrak{D} : V \to D; x \mapsto \mathfrak{D}(x) \qquad where \ D = \{d_1, \dotsc, d_o\},
\end{equation}
and \emph{constraints} $C$ that limit the solution space with
\begin{equation}\label{eq:Foundations:ProductConfiguration:Constraints}
    C = \{c_1, \cdots, c_k\}.
\end{equation}

\subsection{Configuration State}

We will define a \emph{configuration} $S$ as a tuple of variables (\autoref{eq:Foundations:ProductConfiguration:Variables}) and their corresponding domain value with
\begin{equation} \label{eq:Foundations:ProductConfiguration:ConfigurationState}
    S = \{ (v_i,\ d) \ |\ v_i \in V \ \land \ d \in \mathfrak{D}(i),\ i=1,\dotsc,m \}.
\end{equation}
Essentially it is a set of variables and assigned values.

\subsection{Finished Configuration}
To define a \emph{finished configuration} we first need to define what a valid configuration is. Therefore we define $is\_valid$ as
\begin{equation} \label{eq:Foundations:ProductConfiguration:IsValid}
    is\_valid : S \to \{true, false\}; x \mapsto
    \begin{cases}
        true, & S \in solution\_space \\
        false, & \text{otherwise}
    \end{cases},
\end{equation}
with $solution\_space$ being the solution space of the corresponding constraint satisfaction problem. A \emph{finished configuration} $S_F$ is a configuration that contains all variables and is a valid configuration:
\begin{equation} \label{eq:Foundations:ProductConfiguration:FinishedConfiguration}
    S_F \subset S,\ where \ \forall v_i \in V (\exists (v_i, d) \in S_F : d \in \mathfrak{D}(i)) \land is\_valid(S_F).
\end{equation}
In practice a finished configuration of a product (or solution) is something that is ready to be produced. For example if we are configuring a car, this means that the car could be produced in the specified way that is given by the finished configuration.


\section{Group-Based Product Configuration}
\label{sec:Foundations:GroupBasedProductConfiguration}

Instead of a single person configuring a product, a group of people is configuring one product which can be useful in multi-stakeholder decisions. This setting needs mechanisms for describing the preferences of multiple people. Therefore we will add to our definitions, a set of users $U$ with
\begin{equation}\label{eq:Foundations:ProductConfiguration:Users}
    U = \{1, \dotsc, n\},
\end{equation}
a users \emph{utility function} that maps a domain value to a utility value and is only known to the user
\begin{equation}
    \begin{split}
        u_i(d_j), \qquad \text{where}\ & d_j \in  \mathfrak{D}(j),\\
        & 1 <= j <= m, \\
        & 1 <= i <= n .
    \end{split}
\end{equation}
\citeauthor{raabKollaborativeProduktkonfigurationEchtzeit2019}'s approach, which this thesis extends, is to treat the group configuration the same as one shared configuration and to sync the selection of attributes across clients \cite{raabKollaborativeProduktkonfigurationEchtzeit2019}.

\section{Recommender System}
\label{sec:Foundations:RecommenderSystem}

A recommender system is a system that gives individualized recommendations to users to guide them through a large space of objects \cite[~ p. 331]{burkeHybridRecommenderSystems2002}.

There are several approaches to recommender systems presented in \cite{felfernigGroupRecommenderSystems2018}, these are: collaborative filtering, constraint-based recommendation, content-based filtering and hybrid recommendation.

\begin{table}
    \centering
    \begin{tabular}{ l | c | c | c | c | c }
        & The Matrix & Titanic & Die Hard & Forest Gump & Wall-E \\ \hline
         John  & 5 & 1 &   & 2 & 2 \\
         Lucy  & 1 & 5 & 2 & 5 & 5 \\
         Eric  & 2 & ? & 3 & 5 & 4 \\
         Diane & 4 & 3 & 5 & 3 &   \\
    \end{tabular}
    \caption{An example showing users ratings for movies by \citeauthor{ningComprehensiveSurveyNeighborhoodBased2015} \cite{ningComprehensiveSurveyNeighborhoodBased2015}.}

    \label{tab:Foundations:RecommenderSystem:MoviePreferences}
\end{table}

\subsection{Collaborative Filtering}
In collaborative filtering a users rating for unknown items is predicted by finding similar users who have rated it. Their rating is used as prediction
\cite[~ pp. 7, 8]{felfernigDecisionTasksBasic2018}.

Collaborative Filtering can not only be done using users, it can also be item-based. Hereby the similarity between items is used for a recommendation and not similar users \cite{ricciRecommenderSystemsHandbook2015}. In the context of configuration the similarity to other historic configurations can be used which makes it an item based approach.

\autoref{tab:Foundations:RecommenderSystem:MoviePreferences} shows an example rating matrix. A simple user-based way to calculate a rating would be to use a k-nearest neighbour (kNN) algorithm and then take the average of those ratings. Using this method with $k := 2$ and euclidean distance our closest neighbours are \textit{Lucy} and \textit{Diane} therefore giving us a predicted rating of $4$.  If we use an item based approach instead, we will try to find similar items based on the users rating. An example of similar items here would be \textit{Forest Gump} and \textit{Wall-E} as John and Lucy each have given them the sane rating and Eric's rating is off by one. Using again kNN with $k := 2$ we find that \textit{Forest Gump} and \textit{Wall-E} are the most similar to \textit{Titanic} thereby having a predicted rating of $4.5$.
However this simple similarity and prediction function does not take into account different distances. For example Lucy's ratings are more similar compared to Eric's than Diane's but Diane's and Lucy's rating is valued the same amount.

\subsection{Constraint-Based Recommendation}
Hereby filter rules are defined which filter out items that don't fulfil specified rules. A user models their requirements with these rules and thereby gets a list of recommended items. This approach requires deep knowledge about a product because it needs a detailed description of features  \cite[~ p. 12]{felfernigDecisionTasksBasic2018}.

Our movie example (see \autoref{tab:Foundations:RecommenderSystem:MoviePreferences}) requires additional information for example about plot structure, pacing, length and other attributes of the movie. Now the user could enter filtering criteria, e.g. that the movie should be no longer than 120 minutes, be categorized as action or thriller and have a fast pacing. The system will only recommend movies that fit into these categories.

\subsection{Content-Based Filtering}
For the content-based filtering approach, items and users are assigned to categories. Based on consumption and rating of items a user will have implicit ratings for categories. Predictions are now made based on a categories of the new item \cite[~ pp. 10, 11]{felfernigDecisionTasksBasic2018}.

Using our example from \autoref{tab:Foundations:RecommenderSystem:MoviePreferences} and using an additional category matrix (see \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}) we can derive a rating matrix per category (using the average rating of the user of each movie contained in this category). The result can be seen in \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringProfiles}. To predict Eric's rating of Titanic we now can use the categories of \textit{Titanic} and average out Eric's implicit rating per category. Titanic is only in the category romance and as Eric's rating of \textit{Forest Gump} is $5$ the prediction is a rating of $5$. Categories don't have to be the genre, they could be any kind of data about a movie.

\begin{table}
    \centering
    \begin{tabular}{ l | c | c | c | c | c }
        & The Matrix & Titanic & Die Hard & Forest Gump & Wall-E \\ \hline
         Action  & x &  & x  &  &  \\
         Sci-Fi  & x &  &  &  &  \\
         Thriller  &  & & x &  &  \\
         Romance & & x & & x & \\
         Family & & & & x & x \\
    \end{tabular}
    \caption{Showing example categories for movies in \autoref{tab:Foundations:RecommenderSystem:MoviePreferences}.}

    \label{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}
\end{table}

\begin{table}
    \centering
    \begin{tabular}{ l | c | c | c | c | c }
        & Action & Sci-Fi & Thriller & Romance & Family \\ \hline
        John  & 5 & 5 & & 1.5 & 2 \\
        Lucy  & 1.5 & 1 & 2 & 5 & 5 \\
        Eric  & 2.5 & 2 & 3 & 5 & 4.5 \\
        Diane & 4.5 & 4 & 5 & 3 & 3  \\
    \end{tabular}
    \caption{User profiles generated from categories and rating from \autoref{tab:Foundations:RecommenderSystem:MoviePreferences} and \autoref{tab:Foundations:RecommenderSystem:ContentBasedFilteringCategories}.}

    \label{tab:Foundations:RecommenderSystem:ContentBasedFilteringProfiles}
\end{table}

\subsubsection{Advantages over Collaborative Filtering}
\begin{itemize}
    \item No cold start problem for items
    \item No grey sheep problem as not dependent on similar groups having existed before.
    \item Domain knowledge is existent
    \item No issues with data sparsity as item description is given by product structure
    \item No reliance on preferences that would result in a comparison space that is too large
    \item No dependence of historic group preference accuracy
\end{itemize}

\subsubsection{Advantages over Constrained-Based Recommendation}

\begin{itemize}
    \item Configuration state does not cause absence of recommendations
    \item Expendable to also support constraints
    \item No need to handle inconsistencies explicitly
\end{itemize}


\begin{table}
    \begin{center}
        \begin{tabularx}{\columnwidth}{X|X|X}
            & advantages & disadvantages \\
            \hline
            Collaborative Filtering
            &   \begin{itemize}
                    \item Serendipity of results
                    \item Automatic learning of market segments
                    \item No domain knowledge required
                \end{itemize}
            &   \begin{itemize}
                    \item Cold start problem for users and items
                    \item Grey sheep problem
                    \item Quality based on rating quality
                    \item Data sparsity
                    \item Privacy not guaranteed
                \end{itemize} \\
            \hline
            Content-Based Filtering
            &   \begin{itemize}
                    \item No community required
                    \item User independent
                    \item Transparent
                    \item No item cold start
                    \item Simplicity
                    \item Robust
                    \item Stable to constant influx of new users
                \end{itemize}
            &   \begin{itemize}
                    \item Overspecialisation
                    \item No serendipity
                    \item User cold start problem
                    \item Requires domain knowledge
                \end{itemize} \\
            \hline
            Constraint-Based Recommendation
            &   \begin{itemize}
                    \item Transparent
                    \item Good for non discrete values
                \end{itemize}
            &   \begin{itemize}
                    \item Inconsistent constraints
                    \item No results
                \end{itemize} \\
        \end{tabularx}
        \caption{A description of the advantages and disadvantages of common recommendation techniques \cite{richthammerSituationAwarenessRecommender2018, shokeenStudyFeaturesSocial2019,hahslerRecommenderlabFrameworkDeveloping2015, aminiDiscoveringImpactKnowledge2011, suSurveyCollaborativeFiltering2009}}
        \label{tab:Foundations:RecommenderComparison}
    \end{center}
\end{table}

\subsection{Hybrid Recommendation}
A hybrid recommender combines different recommendation approaches to use the strengths of each individual one and to reduce effects of weaknesses \cite{burkeHybridRecommenderSystems2002}.

\section{Group Recommender System}
\label{sec:Foundations:GroupRecommenderSystem}

A group recommender system is a recommender system aimed at making recommendations for a group instead of a single user. To make recommendations group members preferences have to be aggregated. This can be done by either aggregating single user recommendations or by merging preferences of each user into a group preference model. Based on the resulting preference model recommendation strategies as described in \autoref{sec:Foundations:RecommenderSystem} can be used to generate recommendations \cite{jamesonRecommendationGroups2007}.

For a group recommender system additional definitions are needed. The attitude of a user is represented by their preferences $P$ which is directly related to the utility a user has from a domain value being present in the configuration. Let
\begin{gather} \label{tab:Foundations:GroupRecommenderSystem:Preferences}
    P = \{ P_1, \dotsc, P_n\},\ \text{where} \\
    P_i = \{(d,\ u_i(d)) \ | \ \forall d \in \mathfrak{D}(i),\ i=1,\dotsc,m \} \notag
\end{gather}


\FloatBarrier