Getting to Know Your Neighbors (KYN). Explaining Item Similarity in Nearest Neighbors Collaborative Filtering Recommendations

Joanna Misztal-Radecka and Bipin Indurkhya

misztalradecka@agh.edu.pl, bipin.indurkhya@uj.edu.pl

https://doi.org/10.1145/3386392.3397599

The popular neighborhood-based Collaborative Filtering (CF) recommendation techniques are mostly black-box systems whose outputs are not easy to interpret. In this work, our goal is to provide human-interpretable explanations of item-based collaborative filtering recommendations and of the underlying data distribution. We propose the Know Your Neighbors (KYN) algorithm: a model-agnostic approach to explaining similarity-based CF recommendations at both the local and the global level.

Motivation

The traditional way to explain collaborative filtering recommendations [3] (“we recommend X because you liked Y”) does not provide an intuition about what makes the items similar. Hence, the main research question addressed by this work is: how can we provide human-understandable explanations for neighborhood-based CF recommendations?

Figure 1. An example of an explanation for an item-based CF recommendation: a traditional form (left) and the KYN approach (right).

Proposed approach

The Know Your Neighbors approach provides:

  • Model-agnostic explanations of neighborhood-based collaborative recommendations,
  • Local and global interpretability based on descriptive item features.
Figure 2. Main stages of the KYN algorithm: 1. black-box CF recommender, 2. item embeddings, 3. embedding clustering, 4. cluster prediction based on descriptive item features, 5. cluster prediction explanation.

The KYN method (Figure 2) consists of the following stages (a minimal code sketch follows the list):

  1. A Collaborative Filtering algorithm is trained to capture latent behavioral patterns from the user-item interaction matrix.
  2. Item embeddings are extracted from the recommender's latent representations.
  3. Unsupervised clustering is applied to construct item similarity clusters.
  4. The item-cluster assignments are used as target labels for a proxy multi-class classifier, with the descriptive item features as input attributes.
  5. An explanation module provides the item similarity explanation based on the classifier's predictions.
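
A minimal sketch of these stages in Python, using scikit-learn components as stand-ins for the paper's models; the interaction and metadata matrices below are synthetic placeholders, and all parameter values are illustrative assumptions rather than the paper's settings:

    # Minimal KYN pipeline sketch (synthetic placeholder data).
    import numpy as np
    from sklearn.decomposition import NMF
    from sklearn.cluster import KMeans
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(0)
    interactions = rng.integers(0, 2, size=(500, 200)).astype(float)   # users x items
    item_features = rng.integers(0, 2, size=(200, 30)).astype(float)   # items x metadata flags

    # Stages 1-2: train a CF model and extract one latent vector per item.
    nmf = NMF(n_components=16, init="nndsvda", max_iter=300).fit(interactions)
    item_embeddings = nmf.components_.T                                # items x latent factors

    # Stage 3: cluster items in the latent space into similarity neighborhoods.
    clusters = KMeans(n_clusters=10, n_init=10).fit_predict(item_embeddings)

    # Stage 4: proxy classifier predicting cluster membership from item metadata.
    proxy = GradientBoostingClassifier().fit(item_features, clusters)

    # Stage 5: the explanation module interprets the proxy's predictions
    # (e.g. with SHAP); tree feature importances already give a global hint.
    print(proxy.feature_importances_)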

Experimental Setup

  • Datasets: MovieLens (movies) and Deskdrop (articles),
  • Collaborative Filtering representations: Item2Vec [1] and Non-Negative Matrix Factorization (NMF),
  • Unsupervised clustering methods: k-means, agglomerative clustering, HDBSCAN [2],
  • Cluster prediction: logistic regression (LR), Gradient-Boosting Trees (GBT),
  • Model explanation: SHAP [4] (see the sketch after this list),
  • 2D visualization: UMAP [5].
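
To illustrate the SHAP step, a hedged sketch using the model-agnostic KernelExplainer on the proxy classifier; `proxy` and `item_features` are the placeholder objects from the pipeline sketch above, and the sample sizes are arbitrary:

    # Sketch: model-agnostic SHAP explanation of the proxy classifier.
    import shap

    background = shap.sample(item_features, 50)               # small background set for speed
    explainer = shap.KernelExplainer(proxy.predict_proba, background)
    shap_values = explainer.shap_values(item_features[:20])   # one array per cluster
    # The mean |SHAP| value per feature describes a cluster globally;
    # a single item's values explain its neighborhood assignment locally.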

Evaluation tasks

Clustering quality evaluation

Metrics: Silhouette score (values close to 1 indicate perfect clustering) and Davies-Bouldin index (D-B index; values close to 0 indicate the best clustering).
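
Both metrics are available in scikit-learn; a short sketch reusing the placeholder `item_embeddings` and `clusters` from the pipeline sketch above:

    from sklearn.metrics import silhouette_score, davies_bouldin_score

    sil = silhouette_score(item_embeddings, clusters)      # in [-1, 1], higher is better
    dbi = davies_bouldin_score(item_embeddings, clusters)  # >= 0, lower is better
    print(f"Silhouette: {sil:.3f}, D-B index: {dbi:.3f}")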

Figure 3. Results of clustering quality evaluation for different algorithms.
  • HDBSCAN gives the best cluster quality for NMF (both datasets) and for Item2Vec (MovieLens).
  • The NMF representation yields a larger number of clusters with higher quality than Item2Vec.
  • Applying UMAP for embedding dimensionality reduction improves cluster quality (see the sketch below).
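
A sketch of the UMAP-then-HDBSCAN combination; the `umap-learn` and `hdbscan` packages are assumed, and the parameter values are illustrative guesses rather than the paper's settings:

    import umap      # umap-learn package
    import hdbscan

    # Reduce the CF embeddings first, then cluster in the reduced space.
    reduced = umap.UMAP(n_components=2, random_state=42).fit_transform(item_embeddings)
    labels = hdbscan.HDBSCAN(min_cluster_size=5).fit_predict(reduced)  # -1 marks noise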

Cluster prediction evaluation

Metrics: Balanced accuracy (ACC) and Matthews correlation coefficient (MCC).
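
A sketch of this evaluation with scikit-learn, again on the placeholder data from the pipeline sketch; the train/test split ratio is an assumption:

    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import balanced_accuracy_score, matthews_corrcoef
    from sklearn.model_selection import train_test_split

    X_tr, X_te, y_tr, y_te = train_test_split(item_features, clusters,
                                              test_size=0.2, random_state=0)
    preds = GradientBoostingClassifier().fit(X_tr, y_tr).predict(X_te)
    print("Balanced ACC:", balanced_accuracy_score(y_te, preds))
    print("MCC:", matthews_corrcoef(y_te, preds))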

Figure 4. Results of the cluster-assignment prediction evaluation for k-means clusters with different datasets and embedding representations.
  • The GBT model performs better than LR, but the difference is not significant.
  • Cluster prediction metrics are higher for Item2Vec on Deskdrop and for NMF on MovieLens.
  • Cluster prediction accuracy is high only if there is enough metadata to explain the collaborative space.

Qualitative analysis of neighborhood explanation

Local explanations for the neighborhood of the movie Toy Story (MovieLens dataset):

  • NMF: “children”, “cartoon”, “1990s”
  • Item2Vec: “1990s”, “children”, “family”

Global explanations for Deskdrop dataset (most descriptive features):

  • NMF: “Portuguese”, “English”, “Google”
  • Item2Vec: “August”, “September”, “July”
Figure 5. UMAP 2D mapping of item embeddings with ten clusters, and the most important descriptive features for the MovieLens dataset with Item2Vec (left) and NMF (right) embeddings.
  • Item2Vec embedding clusters are mostly described by the time dimension (year of production or publication date), possibly due to the sequential characteristics of the algorithm.
  • For the NMF representation, other features (such as the item language or genre) are important.

Results summary

  • HDBSCAN with UMAP dimensionality reduction gives the best clustering quality for most of the experimental settings.
  • The analysis of neighborhood descriptions reveals possible algorithmic biases:
    • Item2Vec: possible temporal trends and presentation biases related to highlighting the newest items,
    • NMF: possible cultural biases (language separation).

Future Work

  • Incorporating other types of descriptive features related to user and context.
  • Experiments with different collaborative filtering algorithms (including deep learning approaches) on datasets from different domains to detect the possible biases they may be prone to.

References

  1. Oren Barkan and Noam Koenigstein. 2016. Item2Vec: Neural Item Embedding for Collaborative Filtering. CoRR abs/1603.04259 (2016). arXiv:1603.04259 http://arxiv.org/abs/1603.04259
  2. Ricardo J. G. B. Campello, Davoud Moulavi, and Joerg Sander. 2013. Density-Based Clustering Based on Hierarchical Density Estimates. In Advances in Knowledge Discovery and Data Mining, Jian Pei, Vincent S. Tseng, Longbing Cao, Hiroshi Motoda, and Guandong Xu (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 160–172.
  3. Yongfeng Zhang and Xu Chen. 2018. Explainable Recommendation: A Survey and New Perspectives. CoRR abs/1804.11192 (2018). arXiv:1804.11192 http://arxiv.org/abs/1804.11192
  4. Scott M Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., 4765–4774. http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf
  5. Leland McInnes, John Healy, and James Melville. 2018. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:stat.ML/1802.03426