Build your own music recommender by modeling internet radio streams
In the Internet music scene, where recommendation technology is key for navigating huge collections, large market players enjoy a considerable advantage. Accessing a wider pool of user feedback leads to an increasingly more accurate analysis of user tastes, effectively creating a "rich get richer" effect. This work aims at significantly lowering the entry barrier for creating music recommenders, through a paradigm coupling a public data source and a new collaborative filtering (CF) model. We claim that Internet radio stations form a readily available resource of abundant fresh human signals on music through their playlists, which are essentially cohesive sets of related tracks. In a way, our models rely on the knowledge of a diverse group of experts in lieu of the commonly used wisdom of crowds. Over several weeks, we aggregated publicly available playlists of thousands of Internet radio stations, resulting in a dataset encompassing millions of plays, and hundreds of thousands of tracks and artists. This provides the large scale ground data necessary to mitigate the cold start problem of new items at both mature and emerging services. Furthermore, we developed a new probabilistic CF model, tailored to the Internet radio resource. The success of the model was empirically validated on the collected dataset. Moreover, we tested the model at a cross-source transfer learning manner 膒 the same model trained on the Internet radio data was used to predict behavior of Yahoo! Music users. This demonstrates the ability to tap the Internet radio signals in other music recommendation setups. Based on encouraging empirical results, our hope is that the proposed paradigm will make quality music recommendation accessible to all interested parties in the community.