Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient
Abstract: Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because of its nature of learning from dynamic interactions and planning for long-run performance.
RL can be applied to IRS because it learns from dynamic interactions and plans for long-term performance.
As IRS is always with thousands of items to recommend (i.e., thousands of actions), most existing RL-based methods, however, fail to handle such a large discrete action space problem and thus become inefficient. The existing work that tries to deal with the large discrete action space problem by utilizing the deep deterministic policy gradient framework suffers from the inconsistency between the continuous action representation (the output of the actor network) and the real discrete action.
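There are a lot of items to recommend. To make RL usable for recommender systems, people often adopt the DDPG framework, but with DDPG there is a gap between the real action and the action output by the actor network (the output is usually mapped to the closest item by cosine similarity or Euclidean distance).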
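To make the mismatch concrete, here is a minimal sketch of the nearest-item mapping step. It is not the code of any particular DDPG-based recommender: the function name `nearest_item`, the toy 2-D embeddings, and the actor output `proto` are all made up for illustration.

```python
import numpy as np

def nearest_item(proto_action, item_embeddings, metric="euclidean"):
    """Map a continuous actor output to the index of the closest item."""
    if metric == "euclidean":
        dists = np.linalg.norm(item_embeddings - proto_action, axis=1)
        return int(np.argmin(dists))
    if metric == "cosine":
        norms = np.linalg.norm(item_embeddings, axis=1) * np.linalg.norm(proto_action)
        sims = item_embeddings @ proto_action / norms
        return int(np.argmax(sims))
    raise ValueError(f"unknown metric: {metric}")

# Toy item embeddings (hypothetical values, one row per item).
items = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [3.0, 3.0]])

# Hypothetical continuous action produced by the actor network.
proto = np.array([0.9, 0.6])

euclid_choice = nearest_item(proto, items, "euclidean")
cosine_choice = nearest_item(proto, items, "cosine")
print(euclid_choice, cosine_choice)
```

In this toy case the two metrics pick different items for the same actor output, and neither item's embedding equals `proto`. The action that is actually executed is therefore not the one the actor produced, which is the inconsistency the abstract refers to.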
To avoid such inconsistency and achieve high efficiency and recommendation effectiveness, in this paper,
We resolve the inconsistency between the two and improve efficiency.
we propose a Tree-structured Policy Gradient Recommendation (TPGR) framework, where a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a certain leaf of the tree.
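In other words, we use a hierarchical clustering tree. Put simply, you walk down the tree level by level and the leaf nodes are the actions (items). At each level a policy, trained with policy gradient, picks which child to go to next, until a leaf is reached.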
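The root-to-leaf walk can be sketched roughly as follows. This is only a toy illustration under several assumptions, not the paper's implementation: the tree is complete with exactly `branching ** depth` leaves, leaf position doubles as the item id (no clustering is done here), and each non-leaf node's policy is a random linear layer followed by a softmax. The names `TreePolicy` and `sample` are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

class TreePolicy:
    """Toy balanced tree: one softmax policy per non-leaf node."""

    def __init__(self, branching, depth, state_dim, seed=0):
        self.c = branching
        self.depth = depth
        self.num_items = branching ** depth  # assumption: tree is exactly full
        rng = np.random.default_rng(seed)
        # Number of non-leaf nodes: 1 + c + c^2 + ... + c^(depth-1).
        num_internal = sum(branching ** lvl for lvl in range(depth))
        # Stand-in for trained policy networks: random linear weights.
        self.weights = rng.normal(size=(num_internal, branching, state_dim))

    def _node_index(self, level, pos):
        # Non-leaf nodes are stored level by level, left to right.
        offset = sum(self.c ** lvl for lvl in range(level))
        return offset + pos

    def sample(self, state, rng):
        """Walk from the root to a leaf; return (item, path, log_prob)."""
        pos, path, log_prob = 0, [], 0.0
        for level in range(self.depth):
            node = self._node_index(level, pos)
            probs = softmax(self.weights[node] @ state)
            child = int(rng.choice(self.c, p=probs))
            log_prob += float(np.log(probs[child]))
            path.append(child)
            pos = pos * self.c + child  # position within the next level
        return pos, path, log_prob

policy = TreePolicy(branching=3, depth=3, state_dim=4)
rng = np.random.default_rng(1)
state = np.ones(4)  # hypothetical user-state vector
item, path, log_prob = policy.sample(state, rng)
print(item, path, log_prob)
```

With 27 items, each pick in this sketch evaluates 3 softmax outputs at each of 3 levels, so 9 scores instead of 27. The summed `log_prob` of the path is the quantity a policy-gradient update (for example REINFORCE) would weight by the return.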
Extensive experiments on carefully-designed environments based on two real-world datasets demonstrate that our model provides superior recommendation performance and significant efficiency improvement over state-of-the-art methods.
In short, the experiments show that the method works very well.
Let's first take a look at the model diagram.
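That's it. If you want to get into recommender-system research but don't know how to write the code, you can check out my GitHub page, or RecBole from Renmin University of China.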
https://github.com/xingkongxiaxia/Sequential_Recommendation_System Current mainstream recommendation algorithms, implemented in PyTorch
https://github.com/xingkongxiaxia/tensorflow_recommend_system I also have TensorFlow-based code
https://github.com/RUCAIBox/RecBole RecBole (more than 60 recommendation algorithms of various types)
Stars are welcome.