矩阵分解ALS及ALS-WR计算方法
工业级推荐系统中经常用到ALS(alternating least squares)
来进行矩阵分解。本文对 论文Collaborative Filtering for Implicit Feedback Datasets中提到的ALS
计算方法作补充数学说明。
矩阵分解
假设有$m$个User
和$n$个item
,所以评分矩阵为$R$
ALS
希望找到两个比较低维度的矩阵($X$和$Y$)来逼近这个评分矩阵R。
矩阵分解公式: $\mathrm{R}{m \times n} \approx X{m \times k} \cdot Y_{n \times k}^{T}$ ($k$为隐因子数量)
其中:
$R$用户-物品评分矩阵
$X_{m \times k}=\left[\begin{array}{ccc}{x_{11}} & {\cdots} & {x_{1 k}} \ {\vdots} & {\ddots} & {\vdots} \ {x_{m 1}} & {\cdots} & {x_{m k}}\end{array}\right]=\left[\begin{array}{c}{x_{1}} \ {\vdots} \ {x_{m}}\end{array}\right]$ 用户-隐含特征矩阵
$Y_{n \times k}=\left[\begin{array}{ccc}{y_{11}} & {\cdots} & {y_{1 k}} \ {\vdots} & {\ddots} & {\vdots} \ {y_{n 1}} & {\cdots} & {y_{n k}}\end{array}\right]=\left[\begin{array}{c}{y_{1}} \ {\vdots} \ {y_{n}}\end{array}\right]$ 物品-隐含特征矩阵
ALS
代价函数(Loss): $\operatorname{Loss}(X, Y)=\sum_{i, u}\left(r_{u i}-\boldsymbol{x}{u} y{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left|\boldsymbol{x}{u}\right|^{2}+\sum{i}\left|\boldsymbol{y}_{i}\right|^{2}\right)$ ($λ$: 正规化系数)
计算步骤:
-
初始化$X$, $Y$
-
Fix $Y$, For u=1,2,……m, 计算 $x_u$
-
Fix $X$, For i=1,2……n, 计算 $y_i$
-
重复2, 3. 更新
RMSE
对$x_u$进行偏微分, 计算$x_u$
\[\begin{array}{l}{\frac{\partial \operatorname{loss}(X, Y)}{\partial \boldsymbol{x}_{u}}=0} \\ {\Rightarrow \frac{\partial \sum_{i, u}\left(r_{u i}-\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left\|\boldsymbol{x}_{u}\right\|^{2}+\sum_{i}\left\|\boldsymbol{y}_{i}\right\|^{2}\right)}{\partial \boldsymbol{x}_{u}}=0} \\ {\Rightarrow-2 \sum_{i}\left(r_{u i}-\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right) \boldsymbol{y}_{i}+\lambda\left(2 \boldsymbol{x}_{u}\right)=0} \\ {\Rightarrow-\sum_{i} r_{u i} \boldsymbol{y}_{i}+\sum_{i} x_{u} \boldsymbol{y}_{i}^{T} \boldsymbol{y}_{i}+\lambda \boldsymbol{x}_{u}=0} \\ {\Rightarrow\left(\sum_{i} \boldsymbol{y}_{i}^{T} \boldsymbol{y}_{i}+\lambda I\right) \boldsymbol{x}_{u}=\sum_{i} r_{u i} \boldsymbol{y}_{i}} \\ {\Rightarrow x_{u} = \left(\sum_{i} y_{i}^{T} \boldsymbol{y}_{i}+\lambda I\right)^{-1} \sum_{i} r_{u i} \boldsymbol{y}_{i}}\end{array}\]ALS-WR
偏好(Preference): $p_{u i}=\left{\begin{array}{ll}{1} & {r_{u i}>0} \ {0} & {r_{u i}=0}\end{array}\right.$
置信度(Confidence): $c_{u i}=1+\alpha r_{u i}$
代价函数(Loss): $\operatorname{Loss}{A L S-W R}(X, Y)=\sum{i, u} c_{u i}\left(p_{u i}-\boldsymbol{x}{u} y{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left|x_{u}\right|^{2}+\sum_{i}\left|y_{i}\right|^{2}\right)$
计算步骤和ALS
基本一样,这里只对$x_u$的推导做说明。
对$x_u$进行偏微分, 计算$x_u$
\[\begin{array}{l}{\frac{\partial \operatorname{Loss}_{A L S-W R}(X, Y)}{\partial \boldsymbol{x}_{u}}=0} \\ {\Rightarrow \quad \sum_{i, u} c_{u i}\left(p_{u i}-\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{2}+\lambda\left(\sum_{u}\left\|\boldsymbol{x}_{u}\right\|^{2}+\sum_{i}\left\|\boldsymbol{y}_{i}\right\|^{2}\right)} \\ {\Rightarrow \frac{\partial}{\partial \boldsymbol{x}_{u}}\left\{\sum_{i, u} c_{u i}\left(p_{u i}^{2}-2 p_{u i} \boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}+\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{T}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)\right)\right\}+2 \lambda \boldsymbol{x}_{u}=0} \\ {\Rightarrow \frac{\partial}{\partial \boldsymbol{x}_{u}}\left\{\sum_{i, u}\left(c_{u i} p_{u i}^{2}-2 c_{u i} p_{u i} \boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}+c_{u i}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{T}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)\right)\right\}+2 \lambda \boldsymbol{x}_{u}=0} \\ {\Rightarrow \frac{\partial}{\partial \boldsymbol{x}_{u}}\left\{\sum_{i, u}\left(c_{u i} p_{u i}^{2}-2 c_{u i} p_{u i} \boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}+c_{u i}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)^{T}\left(\boldsymbol{x}_{u} \boldsymbol{y}_{i}^{T}\right)\right)\right\}+2 \lambda \boldsymbol{x}_{u}=0}\end{array}\] \[\begin{array}{l}{\Rightarrow \frac{\partial}{\partial x_{u}}\left\{\sum_{t, u} c_{u i} p_{u i}^{2}-2 \sum_{t, u} c_{u i} p_{u i} x_{u} y_{i}^{T}+\sum_{t, u} c_{u i}\left(x_{u} y_{i}^{T}\right)^{T}\left(x_{u} y_{i}^{Y}\right)\right\}+2 \lambda x_{u}} \\ {\Rightarrow \quad=0} \\ {\left.\Rightarrow-2 \sum_{i} c_{u i} p_{u i} y_{i}^{T}+2 \sum_{i} c_{u i}\left(x_{u} y_{i}^{T}\right) y_{i}\right\}+2 \lambda x_{u}=0} \\ {\Rightarrow-Y^{T} C^{u} p_{u}+x_{u} Y^{T} C^{u} Y+\lambda x_{u}=0} \\ {\Rightarrow\left(Y^{T} C^{u} Y+\lambda I\right) X=Y^{T} C^{u} p_{u}} \\ {\Rightarrow x_{u}=\left(Y^{T} C^{u} Y+\lambda I\right)^{-1} Y^{T} C^{u} p_{u}}\end{array}\]