Recommender Systems: Key Machine Learning Papers

Aug 24, 2025

A curated collection of influential and recent machine learning papers focused on recommender systems. This page aims to provide researchers and practitioners with easy access to foundational and cutting-edge work in the field.

Foundational Papers

| Title | Authors | Year | Link | Focus | Key Contribution |
| --- | --- | --- | --- | --- | --- |
| Collaborative Filtering for Implicit Feedback Datasets | Yifan Hu, Yehuda Koren, Chris Volinsky | 2008 | PDF | Implicit data modeling | Proposes a matrix factorization model tailored to implicit feedback datasets, which are much more common in practice than explicit ratings. |
| Blockbusters and Wallflowers: Speeding up Diverse and Accurate Recommendations with Random Walks | Fabian Christoffel, Bibek Paudel, Chris Newell, Abraham Bernstein | 2015 | PDF | Accuracy vs. diversity | Uses random walks on the user-item graph to generate recommendations that are both accurate and diverse, particularly for long-tail items. |
| Metadata Embeddings for User and Item Cold-start Recommendations | Maciej Kula | 2015 | PDF | Cold-start problem | Bridges content-based information and collaborative filtering to handle the cold-start problem by learning metadata embeddings that can predict latent user/item vectors. |
| Embarrassingly Shallow Autoencoders for Sparse Data | Harald Steck | 2019 | PDF | Efficient collaborative filtering | Introduces EASE, a linear autoencoder with no hidden layer and a closed-form solution, designed for sparse implicit feedback data. |
| Recency Aware Collaborative Filtering for Next Basket Recommendation | Guglielmo Faggioli, Mirko Polato, Fabio Aiolli | 2020 | PDF | Recency and frequency | Improves collaborative filtering for next-basket recommendation by accounting for the recency and frequency of past purchases. |
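As a rough illustration of the implicit-feedback idea in the Hu, Koren, and Volinsky entry above, the sketch below turns raw interaction counts into a binary preference and a confidence weight (confidence grows linearly with the count), then evaluates the confidence-weighted squared loss. The function names, the toy data, and the values of `alpha` and `reg` are assumptions made for this example, not the paper's code or tuned settings.

```python
def implicit_targets(counts, alpha):
    """Map raw interaction counts r_ui to preferences and confidences.

    preference p_ui = 1 if r_ui > 0 else 0
    confidence c_ui = 1 + alpha * r_ui
    """
    prefs = [[1.0 if r > 0 else 0.0 for r in row] for row in counts]
    confs = [[1.0 + alpha * r for r in row] for row in counts]
    return prefs, confs


def weighted_loss(counts, user_vecs, item_vecs, alpha=40.0, reg=0.1):
    """Confidence-weighted squared error plus L2 regularization."""
    prefs, confs = implicit_targets(counts, alpha)
    loss = 0.0
    for u, x_u in enumerate(user_vecs):
        for i, y_i in enumerate(item_vecs):
            pred = sum(a * b for a, b in zip(x_u, y_i))  # dot product x_u . y_i
            loss += confs[u][i] * (prefs[u][i] - pred) ** 2
    sq_norms = sum(v * v for vec in user_vecs for v in vec)
    sq_norms += sum(v * v for vec in item_vecs for v in vec)
    return loss + reg * sq_norms


# Toy example (hypothetical data): 2 users, 2 items, 1 latent factor.
counts = [[0, 3], [1, 0]]
user_vecs = [[1.0], [0.0]]
item_vecs = [[0.0], [1.0]]
print(weighted_loss(counts, user_vecs, item_vecs, alpha=2.0, reg=0.5))  # 4.0
```

Note that the loss sums over every user-item pair, including unobserved ones, which is what distinguishes this setup from explicit-rating factorization. A practical implementation would minimize it with alternating least squares rather than evaluate it naively as done here.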

Deep Learning for Recommendations

| Title | Authors | Year | Link | Focus | Key Contribution |
| --- | --- | --- | --- | --- | --- |
| Learning Deep Structured Semantic Models for Web Search using Clickthrough Data | Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, Larry Heck | 2013 | PDF | Semantic matching | Introduces the Deep Structured Semantic Model (DSSM), a two-tower architecture that maps queries and documents into a common low-dimensional semantic space. |
| Wide & Deep Learning for Recommender Systems | Heng-Tze Cheng et al. | 2016 | PDF | Hybrid architecture | Jointly trains a wide linear model and a deep neural network to combine memorization and generalization in recommendations. |
| Deep & Cross Network for Ad Click Predictions | Ruoxi Wang, Bin Fu, Gang Fu, Mingliang Wang | 2017 | PDF | Feature interactions | Proposes the Deep & Cross Network (DCN), which adds a cross network that explicitly models bounded-degree feature interactions without manual feature engineering. |
| DeepFM: An End-to-End Wide & Deep Learning Framework for CTR Prediction | Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He, Zhenhua Dong | 2018 | PDF | Feature interactions | Introduces DeepFM, which combines a Factorization Machine (FM) component that learns low-order feature interactions with a deep neural network that learns high-order interactions. |
| AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks | Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang, Jian Tang | 2019 | PDF | Interaction learning | Introduces AutoInt, which uses multi-head self-attention (as in Transformers) to learn feature interactions automatically, without manual design. |
| LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation | Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, Meng Wang | 2020 | PDF | Graph-based CF | Proposes LightGCN, which simplifies graph convolutional networks (GCNs) for collaborative filtering by keeping only neighborhood aggregation and dropping feature transformation and nonlinear activation. |
| DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems | Ruoxi Wang, Rakesh Shivanna, Derek Z. Cheng, Sagar Jain, Dong Lin, Lichan Hong, Ed H. Chi | 2020 | PDF | Feature interactions | Improves DCN with a more expressive cross layer and a cost-efficient mixture of low-rank cross layers, and shares practical lessons from web-scale deployment. |
| Self-Attentive Sequential Recommendation | Wang-Cheng Kang, Julian McAuley | 2018 | PDF | Sequential recommendation | Introduces SASRec, a self-attention-based model for sequential recommendation, inspired by the Transformer architecture, that captures both short- and long-term dependencies in user behavior sequences. |
| BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer | Fei Sun, Jun Liu, Jian Wu, et al. | 2019 | PDF | Bidirectional context | Applies a BERT-style bidirectional Transformer to user behavior sequences, trained with a masked-item (Cloze) objective instead of left-to-right next-item prediction. |
| Context-Aware Sequential Model for Multi-Behaviour Recommendation | Shereen Elsayed, Ahmed Rashed, Lars Schmidt-Thieme | 2023 | PDF | Contextual sequential models | Proposes the Context-Aware Sequential Model (CASM), which uses context-aware multi-head self-attention. |
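To make the "explicit feature interactions" idea in the Deep & Cross Network entry above more concrete, here is a minimal sketch of the original DCN cross layer, x_{l+1} = x_0 (x_l · w_l) + b_l + x_l, written with plain Python lists. The helper names and the toy weights are assumptions for illustration. In a real model the weights and biases are learned, and the cross network runs alongside a deep network.

```python
def cross_layer(x0, xl, w, b):
    """One DCN (v1) cross layer: x_{l+1} = x0 * (xl . w) + b + xl."""
    s = sum(w_i * x_i for w_i, x_i in zip(w, xl))  # scalar projection xl . w
    return [x0_i * s + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


def cross_network(x0, weights, biases):
    """Stack cross layers; each layer raises the interaction degree by one."""
    x = list(x0)
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x


# Toy example with hand-picked (not learned) parameters.
x0 = [1.0, 2.0]
weights = [[1.0, 1.0], [1.0, 1.0]]
biases = [[0.0, 0.0], [0.0, 0.0]]
print(cross_network(x0, weights, biases))  # [16.0, 32.0]
```

Each layer multiplies the original input `x0` by a scalar derived from the current state, so stacking L layers yields polynomial interactions up to degree L + 1 while adding only two vectors of parameters per layer. The residual term `+ xl` preserves the lower-order terms.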

Practical Tips and Tricks

| Title | Authors | Year | Link | Focus | Key Contribution |
| --- | --- | --- | --- | --- | --- |
| Deep Residual Learning for Image Recognition | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | 2015 | PDF | Fixing vanishing gradients | Introduces residual blocks, which use skip connections to help train very deep networks effectively. |
| Densely Connected Convolutional Networks | Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger | 2016 | PDF | Parameter efficiency | Proposes DenseNet, which connects each layer to all subsequent layers, improving parameter efficiency and accuracy while mitigating the vanishing gradient problem. |
| Gaussian Error Linear Units (GELUs) | Dan Hendrycks, Kevin Gimpel | 2016 | PDF | Activation function | Proposes GELU, a smooth activation that weights each input by the standard Gaussian CDF, x·Φ(x), rather than gating it by sign as ReLU does. |
| Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems | Benjamin Coleman, Wang-Cheng Kang, Matthew Fahrbach, et al. | 2023 | PDF | Efficient embedding | Proposes Unified Embedding, with three major benefits: simplified feature configuration, strong adaptation to dynamic data distributions, and compatibility with modern hardware. |
| On Embeddings for Numerical Features in Tabular Deep Learning | Yury Gorishniy, Ivan Rubachev, Artem Babenko | 2022 | PDF | Embedding numerical features | Explores piecewise linear encoding and trainable periodic encoding for numerical features. |
| SMMR: Sampling-Based MMR Reranking for Faster, More Diverse, and Balanced Recommendations and Retrieval | Kiryl Liakhnovich, Oleg Lashinin, Andrei Babkin | 2025 | PDF | Performant re-ranking | Proposes Sampled Maximal Marginal Relevance (SMMR), a sampling-based extension of MMR that introduces randomness into item selection to improve relevance-diversity trade-offs. |
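For context on the re-ranking entry above, the following sketch shows classic greedy Maximal Marginal Relevance (MMR), the baseline that SMMR extends. It does not implement the sampling step described in the SMMR paper. The function name, the `lam` trade-off parameter, and the toy scores are assumptions made for this example.

```python
def mmr_rerank(relevance, similarity, k, lam=0.7):
    """Greedy MMR: repeatedly pick the item maximizing
    lam * relevance - (1 - lam) * (max similarity to already selected items).
    """
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy

        # Iterate in sorted order so ties resolve to the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy example: items 0 and 1 are near-duplicates, item 2 is different.
relevance = [0.9, 0.85, 0.3]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(mmr_rerank(relevance, similarity, k=2, lam=0.5))  # [0, 2]
print(mmr_rerank(relevance, similarity, k=2, lam=1.0))  # [0, 1]
```

With `lam=1.0` the procedure reduces to sorting by relevance. Lowering `lam` penalizes candidates that resemble items already chosen, which is why the near-duplicate item 1 loses to the less relevant but dissimilar item 2 in the first call.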