# Periodic Mobility Model

## Disclaimer

**This post is a giant wall of text and formulas. I am not sure it is 100% correct, so please read it with caution.**

“Friendship and Mobility: User movement in location-based social networks” is an interesting paper about user movement. I think its intuition is quite clear and simple, but it is (maybe) the first paper that separates the locations of a user into two clusters named *Home* and *Work*. The data analysis in this paper is very nice. Building on it, the authors propose a model named the *Periodic Mobility Model*, aka *PMM*, to predict the *Home* and *Work* locations of users. The whole process is as follows:

- First, given the time of day, the model determines the current cluster of the user (i.e. *Home* or *Work*).
- Second, based on that cluster, the location of the user is drawn.
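The two steps above can be sketched as a small sampler. This is only an illustration under stated assumptions: the temporal weight is taken to be Gaussian-shaped in the hour of day (ignoring the wrap-around of the 24-hour clock), the spatial draw uses a single isotropic spread instead of a full covariance, and every parameter value in `CLUSTERS` is made up.

```python
import math
import random


def temporal_weight(t, prior, tau, sigma):
    """Gaussian-shaped weight of one cluster at hour t (assumed form)."""
    norm = math.sqrt(2 * math.pi * sigma ** 2)
    return prior / norm * math.exp(-((t - tau) ** 2) / (2 * sigma ** 2))


def cluster_probs(t, clusters):
    """Step 1: probability of each cluster given the time of day."""
    weights = {
        name: temporal_weight(t, c["prior"], c["tau"], c["sigma"])
        for name, c in clusters.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def sample_checkin(t, clusters, rng=random):
    """Step 2: pick a cluster, then draw a location around its center."""
    probs = cluster_probs(t, clusters)
    names = list(probs)
    name = rng.choices(names, weights=[probs[n] for n in names])[0]
    c = clusters[name]
    lat = rng.gauss(c["center"][0], c["spread"])
    lon = rng.gauss(c["center"][1], c["spread"])
    return name, (lat, lon)


# Hypothetical parameters, chosen only for illustration.
CLUSTERS = {
    "Home": {"prior": 0.5, "tau": 3.0, "sigma": 3.0,
             "center": (37.77, -122.42), "spread": 0.01},
    "Work": {"prior": 0.5, "tau": 13.0, "sigma": 2.5,
             "center": (37.79, -122.40), "spread": 0.01},
}
```

Calling `cluster_probs(13.0, CLUSTERS)` puts almost all of the mass on *Work*, while `cluster_probs(3.0, CLUSTERS)` favors *Home*; `sample_checkin` then draws a location near the chosen center.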

Before going to the formulas, let me introduce the notation:

- \(c_i = k\) means the check-in \(i\) is in cluster \(k\). In the original paper, \(k \in \{1, 2\}\) because there are only two clusters.
- \(l_i\) is the location (latitude and longitude) of check-in \(i\).
- \(t_i\) is the time of check-in \(i\).
- \(x_i\) represents the check-in \(i\). Formally, it is the pair of location and time of the check-in, \(x_i = (l_i, t_i)\).

## The first attempt

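The joint distribution of each check-in is the product of a temporal term, which gives the probability of the cluster at time \(t_i\), and a spatial term, which gives the probability of the location within that cluster.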
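As a sketch of this factorization, assume the Gaussian-shaped temporal weight \(N_k(t)\) of the original paper and a Gaussian density \(\mathcal{N}\) over location. Any constant rescaling of the exponent is shared by all clusters, so it can be absorbed into \(\sigma_k\) and cancels in the ratio. Under these assumptions:

\[
P(l_i, c_i = k \mid t_i) = \frac{N_k(t_i)}{\sum_j N_j(t_i)} \, \mathcal{N}(l_i \mid \mu_k, \Sigma_k),
\qquad
N_k(t) = \frac{P_{c_k}}{\sqrt{2\pi\sigma_k^2}} \exp\!\left(-\frac{(t - \tau_k)^2}{2\sigma_k^2}\right).
\]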

Taking the *log* turns this product into a sum of temporal and spatial terms.
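Assuming the check-in probability is a temporal weight \(N_k(t_i)\) normalized over clusters, times a Gaussian density \(\mathcal{N}\) over location, the log of one check-in's term would read:

\[
\log P(l_i, c_i = k \mid t_i) = \log N_k(t_i) - \log \sum_j N_j(t_i) + \log \mathcal{N}(l_i \mid \mu_k, \Sigma_k).
\]

The middle term is the troublesome one: it is the log of a sum that involves the temporal parameters of every cluster.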

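Following the EM algorithm, we must calculate the expected complete-data log-likelihood \(Q\), where the expectation is taken over the posterior probability of each cluster given the check-in.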
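A sketch of these two quantities, writing \(\gamma_{ik}\) for the posterior probability of cluster \(k\) for check-in \(i\) and \(n\) for the number of check-ins (both symbols are notation introduced here), and assuming a normalized temporal weight \(N_k(t)\) and a Gaussian density \(\mathcal{N}\) over location:

\[
Q = \sum_{i=1}^{n} \sum_{k} \gamma_{ik} \left[ \log N_k(t_i) - \log \sum_j N_j(t_i) + \log \mathcal{N}(l_i \mid \mu_k, \Sigma_k) \right],
\]

\[
\gamma_{ik} = P(c_i = k \mid l_i, t_i) = \frac{N_k(t_i)\, \mathcal{N}(l_i \mid \mu_k, \Sigma_k)}{\sum_j N_j(t_i)\, \mathcal{N}(l_i \mid \mu_j, \Sigma_j)}.
\]

The normalizer \(\sum_j N_j(t_i)\) cancels in \(\gamma_{ik}\), but it does not cancel in \(Q\).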

The updates for \(\mu_k\) and \(\Sigma_k\) are the same as in a Gaussian mixture model, so we have the posterior-weighted mean and covariance of the check-in locations.
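Written out in the standard Gaussian mixture form, with \(\gamma_{ik}\) denoting the posterior probability that check-in \(i\) belongs to cluster \(k\) (a symbol introduced here for convenience):

\[
\mu_k = \frac{\sum_i \gamma_{ik}\, l_i}{\sum_i \gamma_{ik}},
\qquad
\Sigma_k = \frac{\sum_i \gamma_{ik}\, (l_i - \mu_k)(l_i - \mu_k)^\top}{\sum_i \gamma_{ik}}.
\]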

We then differentiate \(Q\) with respect to the other parameters, \(P_{c_k}\), \(\tau_k\) and \(\sigma_k\).

Setting these derivatives equal to 0 does not give a closed-form solution because \(N_k(t_i)\) contains all the other parameters. Moreover, there are some constraints that we need to respect. First, \(P_{c_k}\) must be positive and its sum over *k* must equal 1. In other words, it belongs to the simplex. Second, \(\tau_k\) is a time of day, so it must lie between 0 and 24. Finally, \(\sigma_k\) must be positive as well.
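To see where the coupling comes from, here is a sketch of one of these derivatives. It assumes \(N_k(t) = \frac{P_{c_k}}{\sqrt{2\pi\sigma_k^2}} \exp\!\left(-\frac{(t - \tau_k)^2}{2\sigma_k^2}\right)\), writes \(\gamma_{ik}\) for the posterior probability of cluster \(k\) for check-in \(i\), and writes \(\pi_k(t) = N_k(t) / \sum_j N_j(t)\) for the normalized temporal weight (the last two symbols are introduced here):

\[
\frac{\partial Q}{\partial \tau_k} = \sum_i \bigl( \gamma_{ik} - \pi_k(t_i) \bigr) \frac{t_i - \tau_k}{\sigma_k^2}.
\]

Because \(\pi_k(t_i)\) itself depends on \(\tau_k\), \(\sigma_k\) and \(P_{c_k}\) of every cluster, setting this expression to 0 leaves a nonlinear equation rather than an explicit update.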

Optimizing \(Q\) under these constraints is a pain in the neck. Moreover, I could not find a closed-form solution. This is quite odd because the authors claimed that they could derive one.

## Another look at the model

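Instead, we can treat the model as a mixture in which the joint distribution of a check-in and its cluster is the cluster prior \(P_{c_k}\) times a Gaussian over the check-in time and a Gaussian over the location.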

This viewpoint does not follow the original intuition of the model. In this modification, the user chooses a cluster first (*Home* or *Work*) and then, from this cluster, chooses the time and location of the check-in.
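A sketch of the joint distribution under this reading, assuming the time of day is treated as a plain Gaussian variable (no wrap-around at 24 hours):

\[
P(x_i, c_i = k) = P_{c_k}\, \mathcal{N}(t_i \mid \tau_k, \sigma_k^2)\, \mathcal{N}(l_i \mid \mu_k, \Sigma_k),
\qquad
\mathcal{N}(t \mid \tau_k, \sigma_k^2) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\!\left(-\frac{(t - \tau_k)^2}{2\sigma_k^2}\right).
\]

Unlike the first attempt, there is no normalizer over clusters here, which is what makes the parameters separable.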

For the first (expectation) step of EM, we compute the posterior probability of each cluster for every check-in.
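Assuming the mixture form in which a cluster prior \(P_{c_k}\) multiplies a Gaussian density over time and one over location, this posterior, written \(\gamma_{ik}\) here, would be:

\[
\gamma_{ik} = P(c_i = k \mid x_i) = \frac{P_{c_k}\, \mathcal{N}(t_i \mid \tau_k, \sigma_k^2)\, \mathcal{N}(l_i \mid \mu_k, \Sigma_k)}{\sum_j P_{c_j}\, \mathcal{N}(t_i \mid \tau_j, \sigma_j^2)\, \mathcal{N}(l_i \mid \mu_j, \Sigma_j)}.
\]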

We have one constraint, \(\sum_k P_{c_k} = 1\), so we apply a Lagrange multiplier \(\lambda\).

To optimize \(P_{c_k}\), we take the derivative of the resulting objective and set it to 0. We then use the constraint to infer the value of \(\lambda\) and plug it back in to find the final value of \(P_{c_k}\).
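These steps can be sketched as follows, writing \(\gamma_{ik}\) for the posterior probability of cluster \(k\) for check-in \(i\) and \(n\) for the number of check-ins (notation introduced here), and assuming \(Q = \sum_i \sum_k \gamma_{ik} \left[ \log P_{c_k} + \log \mathcal{N}(t_i \mid \tau_k, \sigma_k^2) + \log \mathcal{N}(l_i \mid \mu_k, \Sigma_k) \right]\):

\[
\frac{\partial}{\partial P_{c_k}} \left[ Q + \lambda \Bigl( 1 - \sum_j P_{c_j} \Bigr) \right] = \frac{\sum_i \gamma_{ik}}{P_{c_k}} - \lambda = 0
\quad\Longrightarrow\quad
P_{c_k} = \frac{1}{\lambda} \sum_i \gamma_{ik}.
\]

Summing over \(k\) and using \(\sum_k \gamma_{ik} = 1\) gives \(\lambda = n\), hence

\[
P_{c_k} = \frac{1}{n} \sum_{i=1}^{n} \gamma_{ik}.
\]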

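We do the same to optimize \(\tau_k\): take the derivative of \(Q\) with respect to \(\tau_k\) and set it to 0. The result is a posterior-weighted average of the check-in times \(t_i\).

Note that with this formula \(\tau_k\) stays between 0 and 24, because it is a weighted average of times that all lie in that range.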
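As a sketch, with \(\gamma_{ik}\) denoting the posterior probability of cluster \(k\) for check-in \(i\) and assuming a plain Gaussian density over time:

\[
\frac{\partial Q}{\partial \tau_k} = \sum_i \gamma_{ik}\, \frac{t_i - \tau_k}{\sigma_k^2} = 0
\quad\Longrightarrow\quad
\tau_k = \frac{\sum_i \gamma_{ik}\, t_i}{\sum_i \gamma_{ik}}.
\]

The weights \(\gamma_{ik} / \sum_i \gamma_{ik}\) are non-negative and sum to 1, so \(\tau_k\) is a convex combination of the \(t_i\).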

The update rule for \(\sigma_k^2\) is the posterior-weighted variance of the check-in times around \(\tau_k\).
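A sketch of the derivation, again with \(\gamma_{ik}\) as the posterior probability of cluster \(k\) for check-in \(i\) and a plain Gaussian density over time as the working assumption:

\[
\frac{\partial Q}{\partial \sigma_k^2} = \sum_i \gamma_{ik} \left[ -\frac{1}{2\sigma_k^2} + \frac{(t_i - \tau_k)^2}{2\sigma_k^4} \right] = 0
\quad\Longrightarrow\quad
\sigma_k^2 = \frac{\sum_i \gamma_{ik}\, (t_i - \tau_k)^2}{\sum_i \gamma_{ik}}.
\]

Being a weighted average of squares, this value is automatically non-negative, so the positivity constraint on \(\sigma_k\) needs no special handling.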

The update for \(\mu_k\) is the posterior-weighted mean of the check-in locations, exactly as in a Gaussian mixture model.

Last but not least, \(\Sigma_k\) is the posterior-weighted covariance of the check-in locations around \(\mu_k\).
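To make the update rules of this section concrete, here is a minimal EM sketch in plain Python. It is not the authors' implementation. It assumes the mixture reading above (cluster prior times a Gaussian over time and a 2-D Gaussian over location), ignores the wrap-around of the 24-hour clock, and adds a small `eps` to the variances to keep them positive, which is not part of the derivation. The function names, the synthetic data, and the starting values are all made up for illustration.

```python
import math
import random


def gauss1d(t, tau, var):
    """Density of a 1-D Gaussian with mean tau and variance var."""
    return math.exp(-(t - tau) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def gauss2d(l, mu, S):
    """Density of a 2-D Gaussian; S is the symmetric matrix ((a, b), (b, d))."""
    (a, b), (_, d) = S
    det = a * d - b * b
    dx, dy = l[0] - mu[0], l[1] - mu[1]
    # Quadratic form with the inverse of a symmetric 2x2 matrix.
    quad = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def e_step(times, locs, params):
    """Posterior cluster probabilities gamma[i][k] for every check-in."""
    gamma = []
    for t, l in zip(times, locs):
        w = [p["P"] * gauss1d(t, p["tau"], p["var"]) * gauss2d(l, p["mu"], p["S"])
             for p in params]
        total = sum(w)  # assumed non-zero, i.e. no numerical underflow
        gamma.append([wk / total for wk in w])
    return gamma


def m_step(times, locs, gamma, k, eps=1e-6):
    """Weighted updates for cluster k; eps keeps the variances positive."""
    g = [row[k] for row in gamma]
    nk = sum(g)
    tau = sum(gi * t for gi, t in zip(g, times)) / nk
    var = sum(gi * (t - tau) ** 2 for gi, t in zip(g, times)) / nk + eps
    mx = sum(gi * l[0] for gi, l in zip(g, locs)) / nk
    my = sum(gi * l[1] for gi, l in zip(g, locs)) / nk
    sxx = sum(gi * (l[0] - mx) ** 2 for gi, l in zip(g, locs)) / nk + eps
    syy = sum(gi * (l[1] - my) ** 2 for gi, l in zip(g, locs)) / nk + eps
    sxy = sum(gi * (l[0] - mx) * (l[1] - my) for gi, l in zip(g, locs)) / nk
    return {"P": nk / len(times), "tau": tau, "var": var,
            "mu": (mx, my), "S": ((sxx, sxy), (sxy, syy))}


def fit(times, locs, params, iters=50):
    """Alternate E and M steps for a fixed number of iterations."""
    for _ in range(iters):
        gamma = e_step(times, locs, params)
        params = [m_step(times, locs, gamma, k) for k in range(len(params))]
    return params


# Synthetic check-ins: one cluster around 7h near (0, 0), one around 14h near (1, 1).
random.seed(0)
times = ([random.gauss(7, 1.0) for _ in range(200)]
         + [random.gauss(14, 1.5) for _ in range(200)])
locs = ([(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(200)]
        + [(random.gauss(1, 0.1), random.gauss(1, 0.1)) for _ in range(200)])
init = [
    {"P": 0.5, "tau": 6.0, "var": 4.0, "mu": (0.2, 0.2), "S": ((1.0, 0.0), (0.0, 1.0))},
    {"P": 0.5, "tau": 15.0, "var": 4.0, "mu": (0.8, 0.8), "S": ((1.0, 0.0), (0.0, 1.0))},
]
params = fit(times, locs, init)
```

After fitting, `params[0]` and `params[1]` hold the estimated prior, peak time, time variance, center and covariance of each cluster. Every update is a ratio of weighted sums, so each iteration costs one pass over the check-ins per cluster.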