矩阵分解ALS及ALS-WR计算方法

Published: by Creative Commons Licence

工业级推荐系统中经常用到ALS(alternating least squares)来进行矩阵分解。本文对 论文Collaborative Filtering for Implicit Feedback Datasets中提到的ALS计算方法作补充数学说明。

矩阵分解

假设有$m$个User和$n$个item,所以评分矩阵为$R$

ALS希望找到两个比较低维度的矩阵($X$和$Y$)来逼近这个评分矩阵R。

矩阵分解公式: $\mathrm{R}{m \times n} \approx X{m \times k} \cdot Y_{n \times k}^{T}$ ($k$为隐因子数量)

其中:

$R$用户-物品评分矩阵

$X_{m \times k}=\left[\begin{array}{ccc}{x_{11}} & {\cdots} & {x_{1 k}} \ {\vdots} & {\ddots} & {\vdots} \ {x_{m 1}} & {\cdots} & {x_{m k}}\end{array}\right]=\left[\begin{array}{c}{x_{1}} \ {\vdots} \ {x_{m}}\end{array}\right]$ 用户-隐含特征矩阵

$Y_{n \times k}=\left[\begin{array}{ccc}{y_{11}} & {\cdots} & {y_{1 k}} \ {\vdots} & {\ddots} & {\vdots} \ {y_{n 1}} & {\cdots} & {y_{n k}}\end{array}\right]=\left[\begin{array}{c}{y_{1}} \ {\vdots} \ {y_{n}}\end{array}\right]$ 物品-隐含特征矩阵

ALS

代价函数(Loss): $\operatorname{Loss}(X, Y)=\sum_{i, u}\left(r_{u i}-\boldsymbol{x}{u} y{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left|\boldsymbol{x}{u}\right|^{2}+\sum{i}\left|\boldsymbol{y}_{i}\right|^{2}\right)$ ($λ$: 正规化系数)

计算步骤

  1. 初始化$X$, $Y$

  2. Fix $Y$, For u=1,2,……m, 计算 $x_u$

  3. Fix $X$, For i=1,2……n, 计算 $y_i$

  4. 重复2, 3. 更新RMSE

对$x_u$进行偏微分, 计算$x_u$

\[\begin{array}{l}{\frac{\partial \operatorname{loss}(X, Y)}{\partial \boldsymbol{x}_{u}}=0} \\ {\Rightarrow \frac{\partial \sum_{i, u}\left(r_{u i}-\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left\|\boldsymbol{x}_{u}\right\|^{2}+\sum_{i}\left\|\boldsymbol{y}_{i}\right\|^{2}\right)}{\partial \boldsymbol{x}_{u}}=0} \\ {\Rightarrow-2 \sum_{i}\left(r_{u i}-\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right) \boldsymbol{y}_{i}+\lambda\left(2 \boldsymbol{x}_{u}\right)=0} \\ {\Rightarrow-\sum_{i} r_{u i} \boldsymbol{y}_{i}+\sum_{i} x_{u} \boldsymbol{y}_{i}^{T} \boldsymbol{y}_{i}+\lambda \boldsymbol{x}_{u}=0} \\ {\Rightarrow\left(\sum_{i} \boldsymbol{y}_{i}^{T} \boldsymbol{y}_{i}+\lambda I\right) \boldsymbol{x}_{u}=\sum_{i} r_{u i} \boldsymbol{y}_{i}} \\ {\Rightarrow x_{u} = \left(\sum_{i} y_{i}^{T} \boldsymbol{y}_{i}+\lambda I\right)^{-1} \sum_{i} r_{u i} \boldsymbol{y}_{i}}\end{array}\]

ALS-WR

偏好(Preference): $p_{u i}=\left{\begin{array}{ll}{1} & {r_{u i}>0} \ {0} & {r_{u i}=0}\end{array}\right.$

置信度(Confidence): $c_{u i}=1+\alpha r_{u i}$

代价函数(Loss): $\operatorname{Loss}{A L S-W R}(X, Y)=\sum{i, u} c_{u i}\left(p_{u i}-\boldsymbol{x}{u} y{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left|x_{u}\right|^{2}+\sum_{i}\left|y_{i}\right|^{2}\right)$

计算步骤和ALS基本一样,这里只对$x_u$的推导做说明。

对$x_u$进行偏微分, 计算$x_u$

\[\begin{array}{l}{\frac{\partial \operatorname{Loss}_{A L S-W R}(X, Y)}{\partial \boldsymbol{x}_{u}}=0} \\ {\Rightarrow \quad \sum_{i, u} c_{u i}\left(p_{u i}-\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left\|\boldsymbol{x}_{u}\right\|^{2}+\sum_{i}\left\|\boldsymbol{y}_{i}\right\|^{2}\right)} \\ {\Rightarrow \frac{\partial}{\partial \boldsymbol{x}_{u}}\left\{\sum_{i, u} c_{u i}\left(p_{u i}^{2}-2 p_{u i} \boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}+\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{T}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)\right)\right\}+2 \lambda \boldsymbol{x}_{u}=0} \\ {\Rightarrow \frac{\partial}{\partial \boldsymbol{x}_{u}}\left\{\sum_{i, u}\left(c_{u i} p_{u i}^{2}-2 c_{u i} p_{u i} \boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}+c_{u i}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{T}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)\right)\right\}+2 \lambda \boldsymbol{x}_{u}=0} \\ {\Rightarrow \frac{\partial}{\partial \boldsymbol{x}_{u}}\left\{\sum_{i, u}\left(c_{u i} p_{u i}^{2}-2 c_{u i} p_{u i} \boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}+c_{u i}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{T}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)\right)\right\}+2 \lambda \boldsymbol{x}_{u}=0}\end{array}\] \[\begin{array}{l}{\Rightarrow \frac{\partial}{\partial x_{u}}\left\{\sum_{t, u} c_{u i} p_{u i}^{2}-2 \sum_{t, u} c_{u i} p_{u i} x_{u} y_{i}^{T}+\sum_{t, u} c_{u i}\left(x_{u} y_{i}^{T}\right)^{T}\left(x_{u} y_{i}^{Y}\right)\right\}+2 \lambda x_{u}} \\ {\Rightarrow \quad=0} \\ {\left.\Rightarrow-2 \sum_{i} c_{u i} p_{u i} y_{i}^{T}+2 \sum_{i} c_{u i}\left(x_{u} y_{i}^{T}\right) y_{i}\right\}+2 \lambda x_{u}=0} \\ {\Rightarrow-Y^{T} C^{u} p_{u}+x_{u} Y^{T} C^{u} Y+\lambda x_{u}=0} \\ {\Rightarrow\left(Y^{T} C^{u} Y+\lambda I\right) X=Y^{T} C^{u} p_{u}} \\ {\Rightarrow x_{u}=\left(Y^{T} C^{u} Y+\lambda I\right)^{-1} Y^{T} C^{u} p_{u}}\end{array}\]