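Contrastive learning is widely used in sequential recommendation to mitigate the sparsity of interaction data. However, existing contrastive methods cannot guarantee that the positive (or negative) sequences obtained by random augmentation (or sequence sampling) of a given anchor user sequence remain semantically similar (or dissimilar) to it. When the positive and negative sequences turn out to be false positives and false negatives, respectively, recommendation performance can degrade. This paper addresses the problem by proposing Explanation Guided Augmentations (EGA) and the Explanation Guided Contrastive Learning for Sequential Recommendation (EC4SRec) framework. The key idea of EGA is to use explanation methods to determine the importance of each item in a user sequence and to derive positive and negative sequences accordingly. EC4SRec then combines self-supervised and supervised contrastive learning over the positive and negative sequences generated by the EGA operations, improving sequence representation learning and yielding more accurate recommendations.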
2. Preliminaries
2.1 Problem Definition
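Let $U$ and $V$ denote the sets of users and items, respectively. The interaction sequence of user $u$ is written as

$$s_u = [v_1^u, \ldots, v_{|s_u|}^u].$$

The goal of sequential recommendation is to predict, from the known interaction sequence, the item the user is most likely to interact with at the next step:

$$v_{*}^{u} = \arg\max_{v \in V} P\left(v_{\left|s_{u}\right|+1}^{u} = v \mid s_{u}\right)$$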
$$
\begin{array}{c}
P\left(s_{u_{k}}\right) = \frac{\operatorname{util}\left(s_{u_{k}}\right)}{\sum_{s_{u_{j}} \in S_{u}^{c}} \operatorname{util}\left(s_{u_{j}}\right)} \\
\operatorname{util}\left(s_{u_{k}}\right) = \frac{\left|s_{u} \cap s_{u_{k}}\right|}{\left|s_{u} \cup s_{u_{k}}\right|} \sum_{v \in s_{u} \cap s_{u_{k}}} \operatorname{score}(v)
\end{array}
$$
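The sampling probability above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: sequences are treated as item sets when taking the intersection and union, `score` is a hypothetical dictionary holding the explanation-derived importance of each item in the anchor sequence, and the uniform fallback for the zero-overlap case is not part of the formula.

```python
def util(anchor, candidate, score):
    """Jaccard overlap of the two item sets times the summed score of shared items."""
    a, c = set(anchor), set(candidate)
    shared, union = a & c, a | c
    if not union:
        return 0.0
    return len(shared) / len(union) * sum(score[v] for v in shared)


def sampling_probs(anchor, candidates, score):
    """Normalize util over the candidate set S_u^c to obtain P(s_{u_k})."""
    utils = [util(anchor, c, score) for c in candidates]
    total = sum(utils)
    if total == 0:
        # Assumption (not in the formula): sample uniformly if nothing overlaps.
        return [1.0 / len(candidates)] * len(candidates)
    return [x / total for x in utils]


# Toy data (hypothetical): one anchor sequence, its item scores, two candidates.
anchor = ["a", "b", "c"]
score = {"a": 0.5, "b": 0.3, "c": 0.2}
candidates = [["a", "b", "d"], ["c", "e"]]
probs = sampling_probs(anchor, candidates, score)
```

The first candidate shares the two most important items with the anchor, so it receives most of the probability mass.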
$$\mathcal{L}_{EC4SRec(SSL)} = \sum_{u \in U_{B}} \mathcal{L}_{rec}\left(s_{u}\right) + \lambda_{cl+} \mathcal{L}_{cl+}\left(s_{u}\right) + \lambda_{cl-} \mathcal{L}_{cl-}\left(s_{u}\right)$$
$$\mathcal{L}_{rec}\left(s_{u}\right) = -\log \frac{\exp\left(\operatorname{sim}\left(h_{u}, h_{v_{*}^{u}}\right)\right)}{\exp\left(\operatorname{sim}\left(h_{u}, h_{v_{*}^{u}}\right)\right) + \sum_{v^{-} \in V^{-}} \exp\left(\operatorname{sim}\left(h_{u}, h_{v^{-}}\right)\right)}$$
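As a rough illustration of the recommendation loss, the sketch below evaluates it for a single sequence. It assumes that `sim` is a plain dot product and that embeddings are Python lists; both are choices made here for the example, since the text only names `sim` abstractly.

```python
import math


def dot(x, y):
    """Dot product, used here as an assumed similarity function."""
    return sum(a * b for a, b in zip(x, y))


def rec_loss(h_u, h_target, h_negatives, sim=dot):
    """-log of the softmax weight of the target item against sampled negatives."""
    pos = math.exp(sim(h_u, h_target))
    neg = sum(math.exp(sim(h_u, h_n)) for h_n in h_negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings (hypothetical): the target is aligned with the sequence
# representation, the single negative item is orthogonal to it.
loss = rec_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

The loss shrinks as the target item scores higher than the negatives and grows when a negative item is more similar to the sequence representation than the target.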
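Let $A^{+} = \{a_{ecrop+}, a_{emask+}, a_{erord+}\}$ and $A^{-} = \{a_{ecrop-}, a_{emask-}\}$ denote the sets of positive and negative augmentation operations. To compute $\mathcal{L}_{cl+}(s_u)$, two operations $a_i$ and $a_j$ ($a_i \neq a_j$) are sampled from $A^{+}$ and applied to $s_u$, producing two positive views $s_u^{a_i}$ and $s_u^{a_j}$. The same procedure is repeated for all other user sequences. The set of positive views of user $u$ is denoted $S_u^{+} = \{s_u^{a_i}, s_u^{a_j}\}$, and $S^{+}$ is the set of positive views over all users. To pull the representations of $s_u^{a_i}$ and $s_u^{a_j}$ together and push them away from the representations of other users' views, the loss is defined as

$$\mathcal{L}_{cl+}\left(s_{u}\right) = -\log \frac{\exp\left(\operatorname{sim}\left(h_{u}^{a_{i}}, h_{u}^{a_{j}}\right)\right)}{\exp\left(\operatorname{sim}\left(h_{u}^{a_{i}}, h_{u}^{a_{j}}\right)\right) + \sum_{s_{u^{\prime}}^{a} \in S^{+} - S_{u}^{+}} \exp\left(\operatorname{sim}\left(h_{u}^{a_{i}}, h_{u^{\prime}}^{a}\right)\right)}$$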
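The positive-view loss can be sketched over a toy batch as follows. The assumptions are that the two positive views of each user have already been encoded into embeddings (the augmentation operations themselves are not implemented here), that `sim` is a dot product, and that `pos_views` is a hypothetical dictionary from user id to the pair of view embeddings.

```python
import math


def dot(x, y):
    """Dot product, used here as an assumed similarity function."""
    return sum(a * b for a, b in zip(x, y))


def cl_pos_loss(user, pos_views, sim=dot):
    """Contrast a user's two positive views against other users' positive views."""
    h_i, h_j = pos_views[user]
    pos = math.exp(sim(h_i, h_j))
    # Negatives: every positive view of every other user, i.e. S^+ - S_u^+.
    neg = sum(
        math.exp(sim(h_i, h))
        for other, views in pos_views.items()
        if other != user
        for h in views
    )
    return -math.log(pos / (pos + neg))


# Toy batch (hypothetical embeddings for two users, two views each).
pos_views = {
    "u1": ([1.0, 0.0], [0.9, 0.1]),
    "u2": ([0.0, 1.0], [0.1, 0.9]),
}
loss_u1 = cl_pos_loss("u1", pos_views)
```

Because the views of `u1` point in nearly the same direction and those of `u2` are almost orthogonal to them, the positive pair dominates the denominator and the loss stays moderate.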
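$S_u^{-}$ is obtained in the same way. A user's negative view should be close to the negative views of other users and far from the positive views of all users. $S^{-}$ denotes the set of negative views obtained after repeating the above operation for all users. The loss is given below, where $h$ denotes a sequence representation:

$$\mathcal{L}_{cl-}\left(s_{u}\right) = -\frac{1}{\left|S^{-}\right| - 1} \sum_{s_{u^{\prime}}^{a} \in S^{-} - \left\{s_{u}^{a-}\right\}} \log \frac{\exp\left(\operatorname{sim}\left(h_{u}^{a-}, h_{u^{\prime}}^{a}\right)\right)}{\sum_{s \in S^{+} \cup \left\{s_{u^{\prime}}^{a}\right\}} \exp\left(\operatorname{sim}\left(h_{u}^{a-}, h\right)\right)}$$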
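A minimal sketch of the negative-view loss follows. It assumes that all negative views in the batch are stored in one flat list `neg_views`, all positive views in `pos_views`, that at least two negative views exist, and that `sim` is a dot product; these names and the data layout are illustrative only.

```python
import math


def dot(x, y):
    """Dot product, used here as an assumed similarity function."""
    return sum(a * b for a, b in zip(x, y))


def cl_neg_loss(anchor_idx, neg_views, pos_views, sim=dot):
    """Average, over the other negative views, of the log-ratio in L_cl-."""
    h_a = neg_views[anchor_idx]
    others = [h for k, h in enumerate(neg_views) if k != anchor_idx]
    # Mass of all positive views S^+, shared by every term of the sum.
    pos_mass = sum(math.exp(sim(h_a, h)) for h in pos_views)
    total = 0.0
    for h_o in others:
        num = math.exp(sim(h_a, h_o))
        # Denominator ranges over S^+ plus the current negative view only.
        total += math.log(num / (pos_mass + num))
    return -total / len(others)  # len(others) == |S^-| - 1


# Toy batch (hypothetical): three negative views, one positive view.
neg_views = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
pos_views = [[0.0, 1.0]]
loss = cl_neg_loss(0, neg_views, pos_views)
```

Each term rewards the anchor negative view for being closer to another negative view than to any positive view, which matches the stated goal of clustering negative views away from positive ones.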
3.4.2 Explanation-Guided Supervised Contrastive Learning
This model extends DuoRec to use explanation-guided augmentation. The loss is given below, where $h_u^{ertrl+}$ is the representation of the sequence obtained through the ertrl+ augmentation.

$$\mathcal{L}_{EC4SRec(SL)} = \sum_{u \in U_{B}} \mathcal{L}_{rec}\left(s_{u}\right) + \lambda \mathcal{L}_{sl+}\left(s_{u}\right)$$

$$
\begin{array}{l}
\mathcal{L}_{sl+}\left(s_{u}\right) = \\
\quad -\left(\log \frac{\exp\left(\operatorname{sim}\left(h_{u}, h_{u}^{ertrl+}\right) / \tau\right)}{\exp\left(\operatorname{sim}\left(h_{u}, h_{u}^{ertrl+}\right) / \tau\right) + \sum_{s^{-} \in S_{u}^{-}} \exp\left(\operatorname{sim}\left(h_{u}, h^{-}\right) / \tau\right)}\right. \\
\left.\quad + \log \frac{\exp\left(\operatorname{sim}\left(h_{u}^{ertrl+}, h_{u}\right) / \tau\right)}{\exp\left(\operatorname{sim}\left(h_{u}^{ertrl+}, h_{u}\right) / \tau\right) + \sum_{s^{-} \in S_{u}^{-}} \exp\left(\operatorname{sim}\left(h_{u}^{ertrl+}, h^{-}\right) / \tau\right)}\right)
\end{array}
$$
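The symmetric structure of the supervised loss may be easier to see in code. The sketch below assumes a dot-product `sim`, list-valued embeddings, and a hypothetical argument `h_ret` standing for $h_u^{ertrl+}$; how the retrieved sequence and the negatives are actually produced is outside its scope.

```python
import math


def dot(x, y):
    """Dot product, used here as an assumed similarity function."""
    return sum(a * b for a, b in zip(x, y))


def sl_pos_loss(h_u, h_ret, h_negs, tau=1.0, sim=dot):
    """Two-sided temperature-scaled contrastive loss between h_u and h_ret."""

    def one_side(anchor, positive):
        pos = math.exp(sim(anchor, positive) / tau)
        neg = sum(math.exp(sim(anchor, h) / tau) for h in h_negs)
        return math.log(pos / (pos + neg))

    # First term anchors on the original sequence, second on the retrieved one.
    return -(one_side(h_u, h_ret) + one_side(h_ret, h_u))


# Toy embeddings (hypothetical): anchor and retrieved view agree, one negative.
loss = sl_pos_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_sharp = sl_pos_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], tau=0.5)
```

With a smaller temperature the gap between the positive pair and the negative is amplified, so `loss_sharp` comes out lower than `loss` on this toy input.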
3.4.3 Combination
Combining the two parts above gives the final loss:

$$
\begin{array}{l}
\mathcal{L}_{\text{EC4SRec}} = \\
\quad \sum_{u \in U_{B}} \mathcal{L}_{rec}\left(s_{u}\right) + \lambda_{cl+} \mathcal{L}_{cl+}\left(s_{u}\right) + \lambda_{cl-} \mathcal{L}_{cl-}\left(s_{u}\right) + \lambda_{sl+} \mathcal{L}_{sl+}\left(s_{u}\right)
\end{array}
$$
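The combined objective is a weighted sum of per-sequence terms over the batch. A minimal sketch, assuming the four loss values have already been computed for each user and are passed in as hypothetical dictionaries with keys `rec`, `cl_pos`, `cl_neg` and `sl_pos`:

```python
def ec4srec_loss(batch_terms, lam_cl_pos, lam_cl_neg, lam_sl_pos):
    """Sum the weighted loss terms over all users in the batch U_B."""
    total = 0.0
    for t in batch_terms:
        total += (
            t["rec"]
            + lam_cl_pos * t["cl_pos"]
            + lam_cl_neg * t["cl_neg"]
            + lam_sl_pos * t["sl_pos"]
        )
    return total


# Toy per-user loss values and weights (all hypothetical).
batch_terms = [
    {"rec": 1.0, "cl_pos": 2.0, "cl_neg": 3.0, "sl_pos": 4.0},
    {"rec": 0.5, "cl_pos": 1.0, "cl_neg": 1.0, "sl_pos": 1.0},
]
total = ec4srec_loss(batch_terms, lam_cl_pos=0.1, lam_cl_neg=0.2, lam_sl_pos=0.3)
```

Setting `lam_sl_pos` to zero leaves only the recommendation and self-supervised terms, and setting both `lam_cl_pos` and `lam_cl_neg` to zero leaves only the recommendation and supervised terms.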