Implementing PageRank using MapReduce
Posted by Nick D. on Stack Overflow
Published on 2011-02-17T13:03:56Z
Indexed on 2011/02/17 23:25 UTC
Hello,
I'm trying to get my head around a theoretical issue with implementing PageRank in MapReduce.
I have the following simple scenario with three nodes: A B C.
The adjacency list is:
A { B, C }
B { A }
The PageRank of B, for example, is:
PR(B) = (1-d)/N + d ( PR(A) / C(A) )
d = damping factor
N = total number of pages in the graph (3 here)
PR(A) = PageRank of A, the only page linking to B
C(A) = number of outgoing links from page A
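To make the formula concrete, here is a small sketch that evaluates it for B. The damping factor d = 0.85, the uniform starting rank of 1/3 for A, and reading N as the total number of pages (3 here, as in the usual normalized form of the formula) are assumptions for illustration, not values from the post. The function name page_rank is hypothetical.

```python
def page_rank(d, n, incoming):
    """Evaluate (1-d)/N + d * sum(PR(T) / C(T)) over the pages T linking in.

    incoming: list of (rank, out_degree) pairs, one per page linking to the target.
    """
    return (1 - d) / n + d * sum(rank / out_degree for rank, out_degree in incoming)

# B has one incoming link, from A, and A has two outgoing links (B and C).
pr_b = page_rank(d=0.85, n=3, incoming=[(1 / 3, 2)])
print(round(pr_b, 4))  # 0.1917
```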
I am fine with the overall scheme and how the mapper and reducer would work, but I cannot see how C(A) would be known at the time the reducer does its calculation. When the reducer computes the PageRank of B by aggregating B's incoming links, how will it know the number of outgoing links from each of those pages? Does this require a lookup in some external data source?
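One common way to structure this, sketched below in plain Python rather than on a real MapReduce framework, is to have the mapper do the division. The mapper for A receives A's adjacency list, so it knows C(A) as the length of that list and can emit PR(A) / C(A) to each target. The reducer then only sums what it receives. The names mapper, reducer and iterate, the damping factor 0.85, the uniform starting ranks, and N as the total page count are all assumptions for illustration. The rank held by C, which has no outgoing links, is simply not redistributed in this sketch.

```python
from collections import defaultdict

D = 0.85  # damping factor (assumed value)

def mapper(node, rank, out_links):
    # Re-emit the adjacency list so the next iteration still has the graph.
    yield node, ("links", out_links)
    # C(node) is len(out_links), known here without any external lookup.
    for target in out_links:
        yield target, ("contrib", rank / len(out_links))

def reducer(node, values, n):
    out_links, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload
        else:
            total += payload  # each payload is already PR(T) / C(T)
    return (1 - D) / n + D * total, out_links

def iterate(graph):
    # graph: {node: (rank, out_links)}; a dict stands in for the shuffle phase.
    grouped = defaultdict(list)
    for node, (rank, out_links) in graph.items():
        for key, value in mapper(node, rank, out_links):
            grouped[key].append(value)
    n = len(graph)
    return {key: reducer(key, values, n) for key, values in grouped.items()}

graph = {"A": (1 / 3, ["B", "C"]), "B": (1 / 3, ["A"]), "C": (1 / 3, [])}
ranks = iterate(graph)
```

After one iteration, ranks["B"][0] matches the formula above with PR(A) = 1/3 and C(A) = 2, and each entry still carries its adjacency list for the next pass.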