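Paper Weekly | Latest Research in Recommender Systems, Including Papers from Top Venues Such as SIGIR, KDD, and ACL

This issue features 15 recommender-system papers released last week (July 17 to July 23). The main research directions include recommender systems based on large language models, fairness in recommendation, conversational recommendation, a multilingual multi-locale shopping recommendation dataset, architecture search for recommendation, and multi-behavior recommendation.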

1. Enhancing Job Recommendation through LLM-based Generative Adversarial Networks

2. Language-Enhanced Session-Based Recommendation with Decoupled Contrastive Learning

3. TREA: Tree-Structure Reasoning Schema for Conversational Recommendation, ACL2023

4. Evaluating and Enhancing Robustness of Deep Recommendation Systems Against Hardware Errors

5. Impatient Bandits: Optimizing Recommendations for the Long-Term Without Delay, KDD2023

6. Amazon-M2: A Multilingual Multi-locale Shopping Session Dataset for Recommendation and Text Generation, KDDCup2023

7. Measuring Item Global Residual Value for Fair Recommendation, SIGIR2023

8. Leveraging Recommender Systems to Reduce Content Gaps on Peer Production Platforms

9. Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data

10. Imposing Consistency Properties on Blackbox Systems with Applications to SVD-Based Recommender Systems

11. Efficient and Joint Hyperparameter and Architecture Search for Collaborative Filtering, KDD2023

12. Sharpness-Aware Graph Collaborative Filtering, SIGIR2023

13. Learning from Hierarchical Structure of Knowledge Graph for Recommendation, TOIS2023

14. An Admissible Shift-Consistent Method for Recommender Systems

15. Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation, TOIS2023

1. Enhancing Job Recommendation through LLM-based Generative Adversarial Networks

Yingpeng Du, Di Luo, Rui Yan, Hongzhi Liu, Yang Song, Hengshu Zhu, Jie Zhang

https://arxiv.org/abs/2307.10747

Recommending suitable jobs to users is a critical task in online recruitment platforms, as it can enhance users' satisfaction and the platforms' profitability. However, existing job recommendation methods encounter challenges such as the low quality of users' resumes, which hampers their accuracy and practical effectiveness. With the rapid development of large language models (LLMs), utilizing the rich external knowledge encapsulated within them, as well as their powerful capabilities of text processing and reasoning, is a promising way to complete users' resumes for more accurate recommendations. However, directly leveraging LLMs to enhance recommendation results is not a one-size-fits-all solution, as LLMs may suffer from fabricated generation and few-shot problems, which degrade the quality of resume completion. In this paper, we propose a novel LLM-based approach for job recommendation. To alleviate the limitation of fabricated generation for LLMs, we extract accurate and valuable information beyond users' self-description, which helps the LLMs better profile users for resume completion. Specifically, we not only extract users' explicit properties (e.g., skills, interests) from their self-description but also infer users' implicit characteristics from their behaviors for more accurate and meaningful resume completion. Nevertheless, some users still suffer from few-shot problems, which arise due to scarce interaction records, leading to limited guidance for the models in generating high-quality resumes. To address this issue, we propose aligning unpaired low-quality with high-quality generated resumes by Generative Adversarial Networks (GANs), which can refine the resume representations for better recommendation results. Extensive experiments on three large real-world recruitment datasets demonstrate the effectiveness of our proposed method.

2. Language-Enhanced Session-Based Recommendation with Decoupled Contrastive Learning

Zhipeng Zhang, Piao Tong, Yingwei Ma, Qiao Liu, Xujiang Liu, Xu Luo

https://arxiv.org/abs/2307.10650

Session-based recommendation techniques aim to capture dynamic user behavior by analyzing past interactions. However, existing methods heavily rely on historical item ID sequences to extract user preferences, leading to challenges such as popularity bias and cold-start problems. In this paper, we propose a hybrid multimodal approach for session-based recommendation to address these challenges. Our approach combines different modalities, including textual content and item IDs, leveraging the complementary nature of these modalities using CatBoost. To learn universal item representations, we design a language representation-based item retrieval architecture that extracts features from the textual content utilizing pre-trained language models. Furthermore, we introduce a novel Decoupled Contrastive Learning method to enhance the effectiveness of the language representation. This technique decouples the sequence representation and item representation space, facilitating bidirectional alignment through dual-queue contrastive learning. Simultaneously, the momentum queue provides a large number of negative samples, further improving the effectiveness of contrastive learning. Our approach yielded competitive results, securing a 5th place ranking in KDD CUP 2023 Task 1. We have released the source code and pre-trained models associated with this work.

3. TREA: Tree-Structure Reasoning Schema for Conversational Recommendation, ACL2023

Wendi Li, Wei Wei, Xiaoye Qu, Xian-Ling Mao, Ye Yuan, Wenfeng Xie, Dangyang Chen

https://arxiv.org/abs/2307.10543

Conversational recommender systems (CRS) aim to timely trace the dynamic interests of users through dialogues and generate relevant responses for item recommendations. Recently, various external knowledge bases (especially knowledge graphs) are incorporated into CRS to enhance the understanding of conversation contexts. However, recent reasoning-based models heavily rely on simplified structures such as linear structures or fixed-hierarchical structures for causality reasoning, hence they cannot fully figure out sophisticated relationships among utterances with external knowledge. To address this, we propose a novel Tree structure Reasoning schEmA named TREA. TREA constructs a multi-hierarchical scalable tree as the reasoning structure to clarify the causal relationships between mentioned entities, and fully utilizes historical conversations to generate more reasonable and suitable responses for recommended results. Extensive experiments on two public CRS datasets have demonstrated the effectiveness of our approach.

4. Evaluating and Enhancing Robustness of Deep Recommendation Systems Against Hardware Errors

Dongning Ma, Xun Jiao, Fred Lin, Mengshi Zhang, Alban Desmaison, Thomas Sellinger, Daniel Moore, Sriram Sankar

https://arxiv.org/abs/2307.10244

Deep recommendation systems (DRS) heavily depend on specialized HPC hardware and accelerators to optimize energy, efficiency, and recommendation quality. Despite the growing number of hardware errors observed in large-scale fleet systems where DRS are deployed, the robustness of DRS has been largely overlooked. This paper presents the first systematic study of DRS robustness against hardware errors. We develop Terrorch, a user-friendly, efficient and flexible error injection framework on top of the widely-used PyTorch. We evaluate a wide range of models and datasets and observe that the DRS robustness against hardware errors is influenced by various factors from model parameters to input characteristics. We also explore 3 error mitigation methods including algorithm based fault tolerance (ABFT), activation clipping and selective bit protection (SBP). We find that applying activation clipping can recover up to 30% of the degraded AUC-ROC score, making it a promising mitigation method.
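
To make the activation clipping idea from this abstract more concrete, here is a minimal NumPy sketch. It is not the Terrorch API and not the authors' code: the toy ReLU layer, the `flip_bit` helper, and the choice of a clipping bound profiled from error-free calibration inputs are all assumptions made for illustration. It flips the top exponent bit of one float32 weight, which turns a small weight into an enormous one, and then shows that clamping activations bounds the damage.

```python
import numpy as np

def flip_bit(value, bit):
    """Flip one bit (0 = least significant) of a float32 value."""
    as_int = np.array([value], dtype=np.float32).view(np.uint32)
    as_int ^= np.uint32(1 << bit)
    return as_int.view(np.float32)[0]

def layer(weights, x, clip_max=None):
    """Toy ReLU linear layer; optionally clamp activations to [0, clip_max]."""
    out = np.maximum(weights @ x, 0.0)
    if clip_max is not None:
        out = np.minimum(out, clip_max)
    return out

rng = np.random.default_rng(0)
weights = rng.uniform(-0.5, 0.5, size=(4, 8)).astype(np.float32)
weights[0, 0] = 0.25  # fixed value so the injected error is easy to reason about

# Profile the largest activation seen on error-free (hypothetical) calibration inputs.
calibration = rng.uniform(0.0, 1.0, size=(8, 100)).astype(np.float32)
clip_max = float(layer(weights, calibration).max())

# Inject a single hardware-style error: flip the top exponent bit of one weight.
faulty = weights.copy()
faulty[0, 0] = flip_bit(faulty[0, 0], 30)  # 0.25 becomes 2**126

x = calibration[:, 0]
clean = layer(weights, x)
corrupted = layer(faulty, x)                      # one activation explodes
mitigated = layer(faulty, x, clip_max=clip_max)   # bounded by the profiled range
```

Clipping does not restore the correct value of the affected unit; it only keeps the error from propagating as an extreme activation, which is the intuition behind using it as a cheap mitigation.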

5. Impatient Bandits: Optimizing Recommendations for the Long-Term Without Delay, KDD2023

Thomas M. McDonald, Lucas Maystre, Mounia Lalmas, Daniel Russo, Kamil Ciosek

https://arxiv.org/abs/2307.09943

Recommender systems are a ubiquitous feature of online platforms. Increasingly, they are explicitly tasked with increasing users' long-term satisfaction. In this context, we study a content exploration task, which we formalize as a multi-armed bandit problem with delayed rewards. We observe that there is an apparent trade-off in choosing the learning signal: Waiting for the full reward to become available might take several weeks, hurting the rate at which learning happens, whereas measuring short-term proxy rewards reflects the actual long-term goal only imperfectly. We address this challenge in two steps. First, we develop a predictive model of delayed rewards that incorporates all information obtained to date. Full observations as well as partial (short or medium-term) outcomes are combined through a Bayesian filter to obtain a probabilistic belief. Second, we devise a bandit algorithm that takes advantage of this new predictive model. The algorithm quickly learns to identify content aligned with long-term success by carefully balancing exploration and exploitation. We apply our approach to a podcast recommendation problem, where we seek to identify shows that users engage with repeatedly over two months. We empirically validate that our approach results in substantially better performance compared to approaches that either optimize for short-term proxies, or wait for the long-term outcome to be fully realized.
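
The two-step recipe in this abstract (a belief that absorbs both early and delayed signals, then a bandit that acts on it) can be sketched in a much-simplified form. The code below is an illustrative assumption, not the paper's model: it treats the short-term proxy and the delayed full outcome as independent noisy Gaussian observations of the same long-term reward, updates a scalar Gaussian belief per arm with a Kalman-style rule, and uses Thompson sampling for exploration. The reward values, noise levels, and delay are hypothetical.

```python
import numpy as np

class GaussianBelief:
    """Scalar Gaussian belief over an arm's long-term reward."""

    def __init__(self, mean=0.0, var=1.0):
        self.mean, self.var = mean, var

    def update(self, observation, noise_var):
        # Conjugate Gaussian update: noisier observations move the belief less.
        gain = self.var / (self.var + noise_var)
        self.mean += gain * (observation - self.mean)
        self.var *= 1.0 - gain

def thompson_pick(beliefs, rng):
    """Sample one plausible reward per arm and play the best sample."""
    samples = [rng.normal(b.mean, np.sqrt(b.var)) for b in beliefs]
    return int(np.argmax(samples))

rng = np.random.default_rng(0)
true_reward = [0.2, 0.5, 0.8]            # hypothetical long-term reward per show
beliefs = [GaussianBelief() for _ in true_reward]
pulls = [0] * len(true_reward)
pending = []                              # (arrival_round, arm, full_outcome)
DELAY, PARTIAL_NOISE, FULL_NOISE = 20, 1.0, 0.05

for t in range(300):
    # Fold in delayed full outcomes that have arrived by this round.
    while pending and pending[0][0] <= t:
        _, arrived_arm, outcome = pending.pop(0)
        beliefs[arrived_arm].update(outcome, FULL_NOISE)

    arm = thompson_pick(beliefs, rng)
    pulls[arm] += 1

    # The short-term proxy is available immediately but is noisy.
    proxy = true_reward[arm] + rng.normal(0.0, np.sqrt(PARTIAL_NOISE))
    beliefs[arm].update(proxy, PARTIAL_NOISE)

    # The full outcome only becomes visible DELAY rounds later.
    full = true_reward[arm] + rng.normal(0.0, np.sqrt(FULL_NOISE))
    pending.append((t + DELAY, arm, full))
```

The point of the sketch is that the learner never has to choose between the two signals: the noisy proxy sharpens the belief right away, and the more reliable delayed outcome refines it once it arrives.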

6. Amazon-M2: A Multilingual Multi-locale Shopping Session Dataset for Recommendation and Text Generation, KDDCup2023

Wei Jin, Haitao Mao, Zheng Li, Haoming Jiang, Chen Luo, Hongzhi Wen, Haoyu Han, Hanqing Lu, Zhengyang Wang, Ruirui Li, Zhen Li, Monica Xiao Cheng, Rahul Goutam, Haiyang Zhang, Karthik Subbian, Suhang Wang, Yizhou Sun, Jiliang Tang, Bing Yin, Xianfeng Tang

https://arxiv.org/abs/2307.09688

Modeling customer shopping intentions is a crucial task for e-commerce, as it directly impacts user experience and engagement. Thus, accurately understanding customer preferences is essential for providing personalized recommendations. Session-based recommendation, which utilizes customer session data to predict their next interaction, has become increasingly popular. However, existing session datasets have limitations in terms of item attributes, user diversity, and dataset scale. As a result, they cannot comprehensively capture the spectrum of user behaviors and preferences. To bridge this gap, we present the Amazon Multilingual Multi-locale Shopping Session Dataset, namely Amazon-M2. It is the first multilingual dataset consisting of millions of user sessions from six different locales, where the major languages of products are English, German, Japanese, French, Italian, and Spanish. Remarkably, the dataset can help us enhance personalization and understanding of user preferences, which can benefit various existing tasks as well as enable new tasks. To test the potential of the dataset, we introduce three tasks in this work: (1) next-product recommendation, (2) next-product recommendation with domain shifts, and (3) next-product title generation. With the above tasks, we benchmark a range of algorithms on our proposed dataset, drawing new insights for further research and practice. In addition, based on the proposed dataset and tasks, we hosted a competition in the KDD CUP 2023 and have attracted thousands of users and submissions. The winning solutions and the associated workshop can be accessed at our website https://kddcup23.github.io/

7. Measuring Item Global Residual Value for Fair Recommendation, SIGIR2023

Jiayin Wang, Weizhi Ma, Chumeng Jiang, Min Zhang, Yuan Zhang, Biao Li, Peng Jiang

https://arxiv.org/abs/2307.08259

In the era of information explosion, numerous items emerge every day, especially in feed scenarios. Due to the limited system display slots and user browsing attention, various recommendation systems are designed not only to satisfy users' personalized information needs but also to allocate items' exposure. However, recent recommendation studies mainly focus on modeling user preferences to present satisfying results and maximize user interactions, while paying little attention to developing item-side fair exposure mechanisms for rational information delivery. This may lead to serious resource allocation problems on the item side, such as the Snowball Effect. Furthermore, unfair exposure mechanisms may hurt recommendation performance. In this paper, we call for a shift of attention from modeling user preferences to developing fair exposure mechanisms for items. We first conduct empirical analyses of feed scenarios to explore exposure problems between items with distinct uploaded times. This points out that unfair exposure caused by the time factor may be the major cause of the Snowball Effect. Then, we propose to explicitly model item-level customized timeliness distribution, Global Residual Value (GRV), for fair resource allocation. This GRV module is introduced into recommendations with the designed Timeliness-aware Fair Recommendation Framework (TaFR). Extensive experiments on two datasets demonstrate that TaFR achieves consistent improvements with various backbone recommendation models. By modeling item-side customized Global Residual Value, we achieve a fairer distribution of resources and, at the same time, improve recommendation performance.

8. Leveraging Recommender Systems to Reduce Content Gaps on Peer Production Platforms

Mo Houtti, Isaac Johnson, Loren Terveen

https://arxiv.org/abs/2307.08669

Peer production platforms like Wikipedia commonly suffer from content gaps. Prior research suggests recommender systems can help solve this problem, by guiding editors towards underrepresented topics. However, it remains unclear whether this approach would result in less relevant recommendations, leading to reduced overall engagement with recommended items. To answer this question, we first conducted offline analyses (Study 1) on SuggestBot, a task-routing recommender system for Wikipedia, then did a three-month controlled experiment (Study 2). Our results show that presenting users with articles from underrepresented topics increased the proportion of work done on those articles without significantly reducing overall recommendation uptake. We discuss the implications of our results, including how ignoring the article discovery process can artificially narrow recommendations. We draw parallels between this phenomenon and the common issue of "filter bubbles" to show how any platform that employs recommender systems is susceptible to it.

9. Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data

Qi-Wei Wang, Hongyu Lu, Yu Chen, Da-Wei Zhou, De-Chuan Zhan, Ming Chen, Han-Jia Ye

https://arxiv.org/abs/2307.07509

The Click-Through Rate (CTR) prediction task is critical in industrial recommender systems, where models are usually deployed on dynamic streaming data in practical applications. Such streaming data in real-world recommender systems face many challenges, such as distribution shift, temporal non-stationarity, and systematic biases, which bring difficulties to the training and utilizing of recommendation models. However, most existing studies approach the CTR prediction as a classification task on static datasets, assuming that the train and test sets are independent and identically distributed (a.k.a. the i.i.d. assumption). To bridge this gap, we formulate the CTR prediction problem in streaming scenarios as a Streaming CTR Prediction task. Accordingly, we propose dedicated benchmark settings and metrics to evaluate and analyze the performance of the models in streaming data. To better understand the differences compared to traditional CTR prediction tasks, we delve into the factors that may affect the model performance, such as parameter scale, normalization, regularization, etc. The results reveal the existence of the "streaming learning dilemma", whereby the same factor may have different effects on model performance in the static and streaming scenarios. Based on the findings, we propose two simple but inspiring methods (i.e., tuning key parameters and exemplar replay) that significantly improve the effectiveness of the CTR models in the new streaming scenario. We hope our work will inspire further research on streaming CTR prediction and help improve the robustness and adaptability of recommender systems.
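
The abstract does not spell out how exemplar replay is implemented, so the following is only a generic sketch of the idea under stated assumptions: a fixed-size buffer filled by reservoir sampling (a choice made here, not taken from the paper) keeps a uniform sample of past examples, and each incoming streaming batch is mixed with a few replayed exemplars before training. `ReplayBuffer` and `training_batches` are hypothetical names.

```python
import random

class ReplayBuffer:
    """Fixed-size exemplar store filled by reservoir sampling (Algorithm R)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            slot = self._rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = example

    def sample(self, k):
        return self._rng.sample(self.items, min(k, len(self.items)))

def training_batches(stream, buffer, replay_size):
    """Yield each streaming batch augmented with replayed past exemplars."""
    for batch in stream:
        yield batch + buffer.sample(replay_size)
        for example in batch:
            buffer.add(example)

# Toy stream: five "days" of four (day, index) examples each.
stream = [[(day, i) for i in range(4)] for day in range(5)]
buffer = ReplayBuffer(capacity=6)
batches = list(training_batches(stream, buffer, replay_size=2))
```

In this toy run the first batch has nothing to replay, while every later batch carries two exemplars drawn from earlier days, which is what lets a model keep seeing older data as the stream drifts.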

10. Imposing Consistency Properties on Blackbox Systems with Applications to SVD-Based Recommender Systems

Tung Nguyen, Jeffrey Uhlmann

https://arxiv.org/abs/2307.08760

In this paper we discuss pre- and post-processing methods to induce desired consistency and/or invariance properties in blackbox systems, e.g., AI-based. We demonstrate our approach in the context of blackbox SVD-based matrix-completion methods commonly used in recommender system (RS) applications. We provide empirical results showing that enforcement of unit-consistency and shift-consistency, which have provable RS-relevant properties relating to robustness and fairness, also lead to improved performance according to generic RMSE and MAE performance metrics, irrespective of the initial chosen hyperparameter.
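
One way to picture the pre- and post-processing idea is a wrapper that removes a per-row offset before calling the blackbox and restores it afterwards. The sketch below is a simplified illustration and not the authors' construction: it handles only row shifts, it assumes every row has at least one observed entry, and `svd_complete` is a hypothetical stand-in for an SVD-based completion routine. With this wrapper, adding a constant to all of a user's observed ratings shifts that user's completed row by the same constant.

```python
import numpy as np

def svd_complete(matrix, mask, rank=1, iters=50):
    """Stand-in blackbox: iterative truncated-SVD imputation of unobserved entries."""
    filled = np.where(mask, matrix, 0.0)
    for _ in range(iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
        filled = np.where(mask, matrix, low_rank)
    return filled

def row_shift_consistent(blackbox, matrix, mask):
    """Subtract each row's observed mean, run the blackbox, then add the mean back."""
    counts = mask.sum(axis=1, keepdims=True)
    row_mean = np.where(mask, matrix, 0.0).sum(axis=1, keepdims=True) / counts
    return blackbox(matrix - row_mean, mask) + row_mean

# Toy ratings; zeros mark unobserved entries.
ratings = np.array([[5.0, 3.0, 0.0, 1.0],
                    [4.0, 0.0, 0.0, 1.0],
                    [1.0, 1.0, 0.0, 5.0],
                    [0.0, 1.0, 5.0, 4.0]])
mask = ratings > 0
shift = np.array([[1.0], [-2.0], [0.5], [3.0]])  # a different offset per user

base = row_shift_consistent(svd_complete, ratings, mask)
shifted = row_shift_consistent(svd_complete, ratings + shift, mask)
```

The property holds because the shifted input is mapped to the same centered matrix before the blackbox ever sees it. The unwrapped stand-in generally offers no such guarantee, since its zero-filled starting point treats the shifted and unshifted ratings differently.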

11. Efficient and Joint Hyperparameter and Architecture Search for Collaborative Filtering, KDD2023

Yan Wen, Chen Gao, Lingling Yi, Liwei Qiu, Yaqing Wang, Yong Li

https://arxiv.org/abs/2307.11004

Automated Machine Learning (AutoML) techniques have recently been introduced to design Collaborative Filtering (CF) models in a data-specific manner. However, existing works either search architectures or hyperparameters while ignoring the fact they are intrinsically related and should be considered together. This motivates us to consider a joint hyperparameter and architecture search method to design CF models. However, this is not easy because of the large search space and high evaluation cost. To solve these challenges, we reduce the space by screening out useless hyperparameter choices through a comprehensive understanding of individual hyperparameters. Next, we propose a two-stage search algorithm to find proper configurations from the reduced space. In the first stage, we leverage knowledge from subsampled datasets to reduce evaluation costs; in the second stage, we efficiently fine-tune top candidate models on the whole dataset. Extensive experiments on real-world datasets show better performance can be achieved compared with both hand-designed and previous searched models. Besides, ablation and case studies demonstrate the effectiveness of our search framework.

12. Sharpness-Aware Graph Collaborative Filtering, SIGIR2023

Huiyuan Chen, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Junpeng Wang, Vivian Lai, Mahashweta Das, Hao Yang

https://arxiv.org/abs/2307.08910

Graph Neural Networks (GNNs) have achieved impressive performance in collaborative filtering. However, GNNs tend to yield inferior performance when the distributions of training and test data are not aligned well. Also, training GNNs requires optimizing non-convex neural networks with an abundance of local and global minima, which may differ widely in their performance at test time. Thus, it is essential to choose the minima carefully. Here we propose an effective training schema, called gSAM, under the principle that flatter minima have a better generalization ability than sharper ones. To achieve this goal, gSAM regularizes the flatness of the weight loss landscape by forming a bi-level optimization: the outer problem conducts the standard model training while the inner problem helps the model jump out of the sharp minima. Experimental results show the superiority of our gSAM.
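
The bi-level structure described here matches the general sharpness-aware minimization (SAM) recipe, which can be sketched on a toy problem. This is generic SAM with its usual one-step approximation of the inner problem, not gSAM itself and not tied to any graph model; the quadratic loss, learning rate, and radius `rho` are assumptions chosen for illustration.

```python
import numpy as np

def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware update using the one-step inner approximation."""
    grad = grad_fn(weights)
    # Inner problem: step to the approximate worst-case point within an L2 ball of radius rho.
    perturbation = rho * grad / (np.linalg.norm(grad) + 1e-12)
    sharp_grad = grad_fn(weights + perturbation)
    # Outer problem: ordinary gradient step, but using the gradient at the perturbed point.
    return weights - lr * sharp_grad

# Toy loss 0.5 * w^T A w with one sharp direction (10) and one flat direction (1).
A = np.diag([10.0, 1.0])
loss_fn = lambda w: 0.5 * w @ A @ w
grad_fn = lambda w: A @ w

weights = np.array([1.0, 1.0])
start_loss = loss_fn(weights)
for _ in range(50):
    weights = sam_step(weights, grad_fn)
final_loss = loss_fn(weights)
```

Because the update uses the gradient at a deliberately worsened point, it penalizes regions where the loss rises quickly nearby, which is the mechanism that biases training toward flatter minima.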

13. Learning from Hierarchical Structure of Knowledge Graph for Recommendation, TOIS2023

Yingrong Qin, Chen Gao, Shuangqing Wei, Yue Wang, Depeng Jin, Jian Yuan, Lin Zhang, Dong Li, Jianye Hao, Yong Li

https://dl.acm.org/doi/10.1145/3595632

Knowledge graphs (KGs) can help enhance recommendation, especially for the data-sparsity scenario with limited user-item interaction data. Due to the strong power of representation learning of graph neural networks (GNNs), recent works of KG-based recommendation deploy GNN models to learn from both knowledge graph and user-item bipartite interaction graph. However, these works have not well considered the hierarchical structure of knowledge graph, leading to sub-optimal results. Despite the benefit of hierarchical structure, leveraging it is challenging since the structure is always partly-observed. In this work, we first propose to reveal unknown hierarchical structures with a supervised signal detection method and then exploit the hierarchical structure with disentangling representation learning. We conduct experiments on two large-scale datasets, of which the results well verify the superiority and rationality of the proposed method. Further experiments of ablation study with respect to key model designs have demonstrated the effectiveness and rationality of our proposed model. The code is available at https://github.com/tsinghua-fib-lab/HIKE.

14. An Admissible Shift-Consistent Method for Recommender Systems

Tung Nguyen, Jeffrey Uhlmann

https://arxiv.org/abs/2307.08857

In this paper, we propose a new constraint, called shift-consistency, for solving matrix/tensor completion problems in the context of recommender systems. Our method provably guarantees several key mathematical properties: (1) satisfies a recently established admissibility criterion for recommender systems; (2) satisfies a definition of fairness that eliminates a specific class of potential opportunities for users to maliciously influence system recommendations; and (3) offers robustness by exploiting provable uniqueness of missing-value imputation. We provide a rigorous mathematical description of the method, including its generalization from matrix to tensor form to permit representation and exploitation of complex structural relationships among sets of user and product attributes. We argue that our analysis suggests a structured means for defining latent-space projections that can permit provable performance properties to be established for machine learning methods.

15. Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation, TOIS2023

Chang Meng, Ziqi Zhao, Wei Guo, Yingxue Zhang, Haolun Wu, Chen Gao, Dong Li, Xiu Li, Ruiming Tang

https://dl.acm.org/doi/10.1145/3606369

Multi-types of behaviors (e.g., clicking, carting, purchasing, etc.) widely exist in most real-world recommendation scenarios, which are beneficial to learn users’ multi-faceted preferences. As dependencies are explicitly exhibited by the multiple types of behaviors, effectively modeling complex behavior dependencies is crucial for multi-behavior prediction. The state-of-the-art multi-behavior models learn behavior dependencies indistinguishably with all historical interactions as input. However, different behaviors may reflect different aspects of user preference, which means that some irrelevant interactions may play as noises to the target behavior to be predicted. To address the aforementioned limitations, we introduce multi-interest learning to the multi-behavior recommendation. More specifically, we propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning (CKML) framework to learn shared and behavior-specific interests for different behaviors. CKML introduces two advanced modules, namely Coarse-grained Interest Extracting (CIE) and Fine-grained Behavioral Correlation (FBC), which work jointly to capture fine-grained behavioral dependencies. CIE uses knowledge-aware information to extract initial representations of each interest. FBC incorporates a dynamic routing scheme to further assign each behavior among interests. Empirical results on three real-world datasets verify the effectiveness and efficiency of our model in exploiting multi-behavior data.
