Date of Award

Spring 5-2017

Embargo Period

7-27-2017

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Mathematical Sciences

Advisor(s)

Ramamoorthi Ravi

Abstract

Collaborative filtering approaches have produced some of the most accurate and personalized recommender systems to date by mining for similarities in large-scale datasets. However, despite their strong performance on accuracy-based metrics, researchers have shown that such algorithms tend to exaggerate biases inherent in the data, such as item popularity or the affinity of users for certain kinds of content. Meanwhile, recommender systems have grown in importance and become an integral part of the internet ecosystem, with many users interacting with many recommender systems daily on e-commerce sites, social networks and apps. The biases in recommender systems therefore critically affect a company’s bottom line, user satisfaction levels and public image, making it imperative to develop recommendation diversification methods that explicitly counteract them. In this thesis we make three key contributions to the growing field of sales diversity, which aims to reduce the popularity biases inherent in many collaborative-filtering-based recommender systems. First, we consider the problem of making item-item recommendations, with the goal of redundantly linking from popular items to less popular items in order to bring them more exposure on the web. Next, we turn to the setting of user-item recommendations, where we develop a metric we call “discrepancy” to measure the distance between the recommendation distribution desired by a business and the distribution obtained by the recommender system, together with algorithms that reduce discrepancy while maintaining high recommendation quality. Lastly, we turn our attention to item catalogs and user bases where items and users are clustered into disjoint or overlapping subgroups, and develop metrics to quantify the recommendation diversity experienced by both the users and the items.
Our approaches to all three of these problems are unified by a framework of subgraph selection, the use of network flow problems for modeling, and a focus on providing either exact polynomial-time algorithms or efficient approximation algorithms with concrete performance guarantees. This stands in contrast with existing approaches, most of which are reranking-based heuristics for which no performance guarantees can be given. In each of these settings, we augment our theoretical findings with an empirical evaluation on real-life datasets from online retailers or standard recommender system datasets provided by Netflix and the MovieLens group, and show that our methods provide superior sales diversity value when compared with competing approaches.
