Finding most Important Node(s) in a Directed Graph
- by Srikar Appal
I have a large (˜ 20 million nodes) directed Graph with in-edges & out-edges. I want to figure out which parts of of the graph deserve the most attention. Often most of the graph is boring, or at least it is already well understood. The way I am defining "attention" is by the concept of "connectedness" i.e. How can i find the most connected node(s) in the graph?
In what follows, One can assume that nodes by themselves have no score, the edges have no weight & they are either connected or not.
This website suggest some pretty complicated procedures like n-dimensional space, Eigen Vectors, graph centrality concepts, pageRank etc. Is this problem that complex?
Can I not do a simple Breadth-First Traversal of the entire graph where at each node I figure out a way to find the number of in-edges. The node with most in-edges is the most important node in the graph. Am I missing something here?