
2021-01-20 10:33:29

Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient

Abstract: Reinforcement learning (RL) has recently been introduced to interactive recommender systems (IRS) because of its nature of learning from dynamic interactions and planning for long-run performance.


As IRS is always with thousands of items to recommend (i.e., thousands of actions), most existing RL-based methods, however, fail to handle such a large discrete action space problem and thus become inefficient. The existing work that tries to deal with the large discrete action space problem by utilizing the deep deterministic policy gradient framework suffers from the inconsistency between the continuous action representation (the output of the actor network) and the real discrete action.

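There are a lot of items to recommend. To make RL usable in a recommender system, people often adopt the DDPG framework, but DDPG introduces a gap between the real action and the action output by the network (the output is usually mapped to the nearest item by cosine similarity or Euclidean distance).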
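To make the gap concrete, here is a minimal sketch of the nearest-item mapping step described above. The item names, embeddings, and the `proto_action` vector are made up for illustration and do not come from the paper. The point is only that the actor's continuous output is not itself an item, and that the item it gets mapped to can change with the similarity measure.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical item embeddings (assumption: 2-D only to keep the example readable).
item_embeddings = {
    "item_a": [1.0, 0.0],
    "item_b": [0.0, 1.0],
    "item_c": [3.0, 3.0],
}

# Hypothetical continuous output of an actor network; it matches no real item.
proto_action = [0.9, 0.6]

# Map the continuous output to a real (discrete) item in two common ways.
nearest_by_cosine = max(
    item_embeddings,
    key=lambda name: cosine_similarity(proto_action, item_embeddings[name]),
)
nearest_by_euclid = min(
    item_embeddings,
    key=lambda name: euclidean_distance(proto_action, item_embeddings[name]),
)

print(nearest_by_cosine, nearest_by_euclid)
```

In this toy setup the two measures pick different items for the same network output, so the action that is actually executed is not the one the network produced.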

To avoid such inconsistency and achieve high efficiency and recommendation effectiveness, in this paper, we propose a Tree-structured Policy Gradient Recommendation (TPGR) framework, where a balanced hierarchical clustering tree is built over the items and picking an item is formulated as seeking a path from the root to a certain leaf of the tree.

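In other words, the method uses a hierarchical clustering tree. Put plainly, you walk down from the top one level at a time, and the leaf nodes at the bottom are the actions. At each level a policy (trained with policy gradient) chooses which node to go to on the next level, until the last level is reached.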
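The top-down walk can be sketched roughly as follows. This is not the authors' implementation: the tree here is built by splitting a list evenly instead of by real clustering, and each non-leaf node holds fixed random logits as a stand-in for a policy network conditioned on the user state. All function names are hypothetical.

```python
import math
import random

def build_balanced_tree(items, c, rng):
    """Split items into at most c nearly equal groups, recursively.
    Assumption: a plain even split stands in for hierarchical clustering."""
    if len(items) == 1:
        return {"item": items[0]}  # leaf node = one action (item)
    k = min(c, len(items))
    size, extra = divmod(len(items), k)
    children, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        children.append(build_balanced_tree(items[start:end], c, rng))
        start = end
    # Stand-in for a policy network: fixed random logits, one per child.
    logits = [rng.gauss(0.0, 1.0) for _ in children]
    return {"children": children, "logits": logits}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_item(root, rng):
    """Walk from the root to a leaf, sampling one child per level."""
    node, path, log_prob = root, [], 0.0
    while "children" in node:
        probs = softmax(node["logits"])
        idx = rng.choices(range(len(probs)), weights=probs)[0]
        path.append(idx)
        log_prob += math.log(probs[idx])  # path probability = product of level choices
        node = node["children"][idx]
    return node["item"], path, log_prob

rng = random.Random(0)
items = list(range(27))          # 27 hypothetical item ids
tree = build_balanced_tree(items, c=3, rng=rng)
item, path, log_prob = sample_item(tree, rng)
print(item, path, log_prob)
```

With 27 items and 3 children per node, every item is reached in exactly 3 choices among 3 options, and the log-probability of the chosen item is the sum of the per-level log-probabilities, which is the quantity a policy-gradient update would need.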

Extensive experiments on carefully-designed environments based on two real-world datasets demonstrate that our model provides superior recommendation performance and significant efficiency improvement over state-of-the-art methods.



All in all, the idea is fairly original: breaking the choice down level by level shrinks the action space, which is pretty neat.
Number of policy gradients
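As a rough back-of-the-envelope check on that count, the sketch below computes the size of a full tree with `c` children per node. It assumes one policy per non-leaf node; if one policy were shared per level instead, the count would simply equal the depth. The function name and the example numbers (10,000 items, 10 children per node) are illustrative choices, not figures from the paper.

```python
def tree_stats(num_items, c):
    """Depth, non-leaf node count, and outputs scored per recommendation
    for a full c-ary tree with at least num_items leaves (requires c >= 2)."""
    depth = 1
    while c ** depth < num_items:
        depth += 1
    # Non-leaf nodes of a full c-ary tree: 1 + c + ... + c^(depth-1).
    non_leaf_nodes = (c ** depth - 1) // (c - 1)
    # One choice among c children at each of the depth levels.
    outputs_per_step = c * depth
    return depth, non_leaf_nodes, outputs_per_step

depth, non_leaf_nodes, outputs_per_step = tree_stats(num_items=10_000, c=10)
print(depth, non_leaf_nodes, outputs_per_step)
```

Under these assumptions, picking one of 10,000 items takes 4 choices among 10 options each, so 40 outputs are scored per recommendation instead of 10,000, at the cost of keeping 1,111 small policies.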


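https://github.com/xingkongxiaxia/Sequential_Recommendation_System Current mainstream recommendation algorithms, implemented in PyTorch

https://github.com/xingkongxiaxia/tensorflow_recommend_system I also have a TensorFlow-based version of the code

https://github.com/RUCAIBox/RecBole RecBole (more than 60 recommendation algorithms of various types)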

