# PageRank

### From TechWiki

To summarize, PageRank can be either computed iteratively or algebraically employing matrices. Alternatively, the power method can be employed.

## Contents |

#### Iterative

In the former case, at *t* = 0, an initial probability distribution is assumed, usually

- .

At each time step, the computation, as detailed above, yields

- ,

or in matrix notation

- , (*)

where
and is the column vector of length *N* containing only ones.

The matrix is defined as

i.e.,

- ,

where
*A* denotes the adjacency matrix of the graph and *K* is the diagonal matrix with the outdegrees in the diagonal.

The computation ends when for some small ε

- ,

i.e., when convergence is assumed.

#### Algebraic

In the latter case, for (i.e., in the steady state), the above equation (*) reads

- . (**)

The solution is given by

- ,

with the identity matrix .

The solution exists and is unique for 0 < *d* < 1. This can be seen noting that is by construction a stochastic matrix and hence has an eigenvalue equal to one because of the Perron Frobenius theorem.

#### Power Method

If the matrix is a transition probability, i.e., column-stochastic with no columns consisting of just zeros and is a probability distribution (i.e., ), Eq. (**) is equivalent to

- . (***)

Hence PageRank is the principal eigenvector of . A fast and easy
way to compute this is using the power method: starting with an arbitrary vector *x*(0), the operator is applied in succession, i.e.,

- ,

until

- |
*x*(*t*+ 1) −*x*(*t*) | < ε.

Note that in Eq. (***) the matrix on the right-hand side in the parenthesis can be interpreted as

- ,

where is an initial probability distribution. In the current case

- .

Finally, if has columns with only zero values, they should be replaced with the initial probability vector . In other words

- ,

where the matrix is defined as

- ,

with

In this case, the above two computations using only give the same PageRank if their results are normalized:

- .

#### Efficiency

Depending on the framework used to perform the computation, the exact implementation of the methods, and the required accuracy of the result, the computation time of the three methods can vary greatly. Usually if the computation has to be performed many times (i.e., for growing networks) or the network size is large, the algebraic computation is slower and memory hungry due to the inversion of the matrix, and the power method is the most efficient.