Recommending relevant and personalized content to users is crucial for media services providers, such as news, video or music streaming platforms. Indeed, effective recommender systems improve user experience and engagement on the platform by helping users navigate massive amounts of content, enjoy their favorite videos or songs, and discover new ones that they might like. As a consequence, significant efforts have been made to transfer promising research on these aspects to industrial-scale applications.

In particular, many global mobile apps and websites, notably from the music streaming industry, currently leverage swipeable carousels to display recommended content on their homepages. These carousels, also referred to as sliders or shelves, consist of ranked lists of items or cards (albums, artists, playlists…). A few cards are initially displayed to the users, who can click on them or swipe on the screen to see some of the additional cards from the carousel.

Carousels on Deezer

Selecting and ranking the most relevant cards to display is a challenging task, as the catalog is usually significantly larger than the number of available slots in a carousel, and as users have different preferences. While carousel personalization is close to slate recommendation and learning-to-rank settings, it also requires handling user feedback to adaptively improve the recommended content via online learning strategies, and accounting for the fact that some cards from the carousel might not be seen by users due to the swipeable structure.

In this paper, we model carousel personalization as a multi-armed bandit problem with multiple plays. Within our proposed framework, we account for important characteristics of real-world swipeable carousels, notably by considering that media services providers have access to contextual information on user preferences, that they might not know which cards from a carousel are actually seen by users, and that feedback data from carousels might not be available in real time.
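To make the setting above more concrete, here is a minimal Python sketch of a bandit with multiple plays, where each card is an arm and several cards are selected per round. It is an illustration under stated assumptions, not the implementation from the paper: the class `EpsilonGreedyCarousel`, the helper `assumed_seen`, and all parameter names are hypothetical, and epsilon-greedy is used only because it is one of the simplest bandit policies. Since the provider might not know which cards were seen, the sketch assumes that the first `n_initial` cards are always seen and that a click on a later card implies all cards before it were seen too. Feedback is processed in batches rather than in real time. Contextual information is left out for brevity.

```python
import random


def assumed_seen(carousel, clicked, n_initial):
    """Return the cards assumed to have been seen by the user.

    Assumption (hypothetical, for illustration): the first n_initial cards
    are always seen, and every card up to the last clicked one is seen too.
    """
    last_click = max(
        (pos for pos, card in enumerate(carousel) if card in clicked),
        default=-1,
    )
    return carousel[:max(n_initial, last_click + 1)]


class EpsilonGreedyCarousel:
    """Epsilon-greedy policy selecting n_slots cards (arms) per round."""

    def __init__(self, n_cards, n_slots, epsilon=0.1, seed=0):
        self.n_cards = n_cards
        self.n_slots = n_slots
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.displays = [0] * n_cards  # times each card was assumed seen
        self.clicks = [0] * n_cards    # times each card was clicked

    def _estimate(self, card):
        # Optimistic estimate (1.0) for cards that were never assumed seen.
        if self.displays[card] == 0:
            return 1.0
        return self.clicks[card] / self.displays[card]

    def recommend(self):
        """Return a ranked list of n_slots distinct cards."""
        if self.rng.random() < self.epsilon:
            # Explore: a uniformly random carousel.
            return self.rng.sample(range(self.n_cards), self.n_slots)
        # Exploit: cards with the highest estimated click rates.
        ranked = sorted(range(self.n_cards), key=self._estimate, reverse=True)
        return ranked[:self.n_slots]

    def update_batch(self, batch, n_initial):
        """Update estimates from delayed feedback.

        batch is a list of (carousel, clicked) pairs collected since the
        previous update, where clicked is a set of card ids.
        """
        for carousel, clicked in batch:
            for card in assumed_seen(carousel, clicked, n_initial):
                self.displays[card] += 1
                self.clicks[card] += int(card in clicked)
```

A round then consists of calling `recommend()` for each user session, logging the displayed carousel and the clicked cards, and calling `update_batch()` once the logged feedback becomes available. Cards beyond the assumed-seen prefix are left untouched, so they are not penalized for clicks they never had a chance to receive.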

Focusing on music streaming applications, we show the effectiveness of our approach by addressing a large-scale carousel-based playlist recommendation task on the global mobile app Deezer.

Cumulative Regrets of Bandit Policies for Playlist Recommendation

Along with this paper, we publicly release large-scale datasets of user preferences for curated playlists on Deezer, and an open-source environment to recreate comparable learning problems. The code is available on GitHub and the datasets are available on Zenodo.

Datasets

This paper was published in the proceedings of the 14th ACM Conference on Recommender Systems (RecSys 2020), and was shortlisted among the “Best Short Paper Candidates”.