PageRank

From TechWiki

To summarize, PageRank can be either computed iteratively or algebraically employing matrices. Alternatively, the power method can be employed.

Contents

Iterative

In the former case, at t = 0, an initial probability distribution is assumed, usually

PR(p_i; 0) = \frac{1}{N}.

At each time step, the computation, as detailed above, yields

PR(p_i;t+1) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR (p_j; t)}{L(p_j)},

or in matrix notation

\mathbf{R}(t+1) = d \mathcal{M}\mathbf{R}(t) + \frac{1-d}{N} \mathbf{1},       (*)

where \mathbf{R}_i(t)=PR(p_i; t) and \mathbf{1} is the column vector of length N containing only ones.

The matrix \mathcal{M} is defined as

\mathcal{M}_{ij} = \begin{cases} 1 /L(p_j) , & \mbox{if }j\mbox{ links to }i\ \\ 0, & \mbox{otherwise} \end{cases}

i.e.,

\mathcal{M} :=  (K^{-1} A)^t,

where A denotes the adjacency matrix of the graph and K is the diagonal matrix with the outdegrees in the diagonal.

The computation ends when for some small ε

|\mathbf{R}_i(t+1) - \mathbf{R}_i(t)| < \epsilon,

i.e., when convergence is assumed.

Algebraic

In the latter case, for t \to \infty (i.e., in the steady state), the above equation (*) reads

\mathbf{R} = d \mathcal{M}\mathbf{R} + \frac{1-d}{N} \mathbf{1}.       (**)

The solution is given by

\mathbf{R} =  (\mathbf{I}-d \mathcal{M})^{-1}  \frac{1-d}{N}  \mathbf{1},

with the identity matrix \mathbf{I}.

The solution exists and is unique for 0 < d < 1. This can be seen noting that \mathcal{M} is by construction a stochastic matrix and hence has an eigenvalue equal to one because of the Perron Frobenius theorem.

Power Method

If the matrix \mathcal{M} is a transition probability, i.e., column-stochastic with no columns consisting of just zeros and \mathbf{R} is a probability distribution (i.e., |\mathbf{R}|=1), Eq. (**) is equivalent to

\mathbf{R} = \left( d \mathcal{M} + \frac{1-d}{N} \mathbf{I} \right)\mathbf{R} =: \widehat \mathcal{M} \mathbf{R}.       (***)

Hence PageRank \mathbf{R} is the principal eigenvector of \widehat\mathcal{M}. A fast and easy way to compute this is using the power method: starting with an arbitrary vector x(0), the operator \widehat\mathcal{M} is applied in succession, i.e.,

x(t+1) = \widehat\mathcal{M} x(t),

until

| x(t + 1) − x(t) | < ε.

Note that in Eq. (***) the matrix on the right-hand side in the parenthesis can be interpreted as

\frac{1-d}{N} \mathbf{I} = (1-d)\mathbf{P} \mathbf{1}^t,

where \mathbf{P} is an initial probability distribution. In the current case

\mathbf{P} := \frac{1}{N} \mathbf{1}.

Finally, if \mathcal{M} has columns with only zero values, they should be replaced with the initial probability vector \mathbf{P}. In other words

\mathcal{M}^\prime := \mathcal{M} + \mathcal{D},

where the matrix \mathcal{D} is defined as

\mathcal{D} := \mathbf{D} \mathbf{P}^t,

with

\mathbf{D}_i = \begin{cases} 1, & \mbox{if }L(p_i)=0\ \\ 0, & \mbox{otherwise} \end{cases}

In this case, the above two computations using \mathcal{M} only give the same PageRank if their results are normalized:

\mathbf{R}_{\textrm{power}} = \frac{\mathbf{R}_{\textrm{iterative}}}{|\mathbf{R}_{\textrm{iterative}}|} =  \frac{\mathbf{R}_{\textrm{algebraic}}}{|\mathbf{R}_{\textrm{algebraic}}|}.

Efficiency

Depending on the framework used to perform the computation, the exact implementation of the methods, and the required accuracy of the result, the computation time of the three methods can vary greatly. Usually if the computation has to be performed many times (i.e., for growing networks) or the network size is large, the algebraic computation is slower and memory hungry due to the inversion of the matrix, and the power method is the most efficient.