### Recommender System

methods:
• Collaborative filtering: find a group of users presenting similar behavior and use group's behavior to predict. NMF, nearest neighbors, etc.
• Content-Based
• Hybrid

#### Matrix Factorization

Modeling
• https://datajobs.com/data-science-repo/recommender-systems-%5bnetflix%5d.pdf
• user $u$ rated item $i$ with a rating $r_{ui}$, and we assume $r_{u,i}=q^T_ip_u$. $q_i p_u$ are typically short vectors representing items/users in latent space.
• therefore the cost function $L = \sum_{u,i} \{(r_{u,i} - q^T_ip_u)^2 + \lambda(||q_i||^2 + ||p_u||^2)\}$ for all known (u,i) pairs.
SGD Optimizing
• take the derivative of $L$ wrt $q_i$: $\frac{dL}{dq_i} = -p_u(r_{u,i} - q^T_ip_u) + \lambda q_i$, so the updating rule for $q_i$ is $q_i = q_i +\gamma(p_u(r_{u,i} - q^T_ip_u)-\lambda q_i)$.
• We have the error term $e_{u,i} = r_{u,i} - q^T_ip_u$
• and the updating rules are $q_i = q_i +\gamma(p_ue_{u,i}-\lambda q_i)$, and $p_u = p_u +\gamma(q_ie_{u,i}-\lambda p_u)$
Alternative Least Square

#### evaluation

• rmse (root mean squared error)

#### Implementation

• Since matrix is extremely sparse, when structing the data, only ratings (as well as its user/item) should be stored in memory.

#### Test runs

Some results I got on Movielens 1M data (used Java. all took only a few seconds to finish):
• plain NMF: initial value for P/Q: 1.0. gamma = 0.005, lambda = 0.00001 #iters: 100 K=20 rmse: 0.8399908248055877
• plain NMF: initial value for P/Q: 1.0. gamma = 0.005, lambda = 0.00001 #iters: 100 K=40 rmse: 0.865461767601043
• plain NMF: initial value for P/Q: 0.1. gamma = 0.005, lambda = 0.00001 #iters: 100 K=20 rmse: 0.831252749229948
• plain NMF: initial value for P/Q: 0.1. gamma = 0.0005, lambda = 0.000001 #iters: 100 K=20 rmse: 0.8208844660311445
• plain NMF: initial value for P/Q: 0.1. gamma = 0.0005, lambda = 0.000001 #iters: 200 K=20 rmse: 0.8182223706268318
• plain NMF: initial value for P/Q: 0.1. gamma = 0.0005, lambda = 0.000001 #iters: 1000 K=20 rmse: 0.8168452801151522
• plain NMF: initial value for P/Q: 0.1. gamma = 0.00005, lambda = 0.0000001 #iters: 1000 K=20 rmse: 0.8184550998312972
• plain NMF: initial value for P/Q: 0.1. gamma = 0.00005, lambda = 0.0000001 #iters: 10000 K=20 rmse: 0.8144844568711578