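<p>This post covers algorithms commonly used with graph databases and how to call them in <code>Neo4j</code>. It is practice-oriented, so the theory behind each algorithm is only touched on briefly.</p>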
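1. Common algorithms in graph databases
-
Pathfinding & Search
Typically used to find the shortest path between nodes. Common algorithms include:
- Dijkstra - edge weights must be non-negative
- Floyd - edge weights may be negative (provided there is no negative cycle); works on directed and undirected graphs
- Bellman-Ford
- SPFA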
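As a rough illustration of the first entry in this list, here is a minimal Dijkstra sketch in Python using a binary heap. The `dijkstra` function and the `toy_graph` data are hypothetical and not from the original post; the code only shows the idea and is not how Neo4j implements it.

```python
import heapq

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    graph maps node -> list of (neighbor, weight); weights must be >= 0.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry: a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            if weight < 0:
                raise ValueError("Dijkstra requires non-negative weights")
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist

# Hypothetical toy graph (assumption, not taken from the article)
toy_graph = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("c", 5)],
    "a": [("c", 1)],
    "c": [],
}
distances = dijkstra(toy_graph, "s")
print(distances)
```

The non-negative weight check reflects the restriction noted for Dijkstra; graphs with negative weights need Bellman-Ford, SPFA, or Floyd instead.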
-
Centrality
Typically used to compute the centrality of nodes in a graph, that is, to find the more important nodes. There are many centrality measures, for example:
- Degree Centrality
- Weighted Degree Centrality
- Betweenness Centrality
- Closeness Centrality
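To make one of these measures concrete, the sketch below computes closeness centrality for an unweighted, undirected graph with breadth-first search. It assumes the common definition (number of reachable nodes divided by the sum of hop distances to them); the normalization Neo4j uses may differ, and `path_graph` is a hypothetical example.

```python
from collections import deque

def closeness_centrality(adjacency):
    """Closeness = (number of reachable nodes) / (sum of hop distances to them)."""
    scores = {}
    for source in adjacency:
        # Breadth-first search yields hop distances in an unweighted graph
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores

# Hypothetical path graph a - b - c (assumption, not from the article)
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
closeness = closeness_centrality(path_graph)
print(closeness)
```

The middle node `b` is one hop from everything else, so it gets the highest score.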
-
Community Detection
Used to find groups of nodes that are densely connected to each other, similar to clustering algorithms.
- Strongly Connected Components
- Weakly Connected Components (Union Find)
- Label Propagation
- Louvain Modularity
- Triangle Count and Average Clustering Coefficient
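The "Union Find" in the Weakly Connected Components entry refers to the disjoint-set data structure. A minimal sketch of how it groups nodes into components, ignoring edge direction, might look like this. The function name and the `u1`..`u5` data are hypothetical.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into components, treating every edge as undirected."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent pointers to the root, halving the path on the way
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two sets

    components = {}
    for node in nodes:
        components.setdefault(find(node), set()).add(node)
    return list(components.values())

# Hypothetical users and relationships (assumption, not from the article)
nodes = ["u1", "u2", "u3", "u4", "u5"]
edges = [("u1", "u2"), ("u3", "u2"), ("u4", "u5")]
components = weakly_connected_components(nodes, edges)
print(components)
```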
2. Pathfinding algorithms
-
Shortest Path
MATCH (start:Loc {name:"A"}), (end:Loc {name:"F"})
CALL algo.shortestPath.stream(start, end, "cost")
YIELD nodeId, cost
MATCH (other:Loc)
WHERE id(other) = nodeId
RETURN other.name AS name, cost
-
Single Source Shortest Path
MATCH (n:Loc {name:"A"})
CALL algo.shortestPath.deltaStepping.stream(n, "cost", 3.0)
YIELD nodeId, distance
MATCH (destination) WHERE id(destination) = nodeId
RETURN destination.name AS destination, distance
-
All Pairs Shortest Path
CALL algo.allShortestPaths.stream("cost", {nodeQuery:"Loc", defaultValue:1.0})
YIELD sourceNodeId, targetNodeId, distance
WITH sourceNodeId, targetNodeId, distance
WHERE algo.isFinite(distance) = true
MATCH (source:Loc) WHERE id(source) = sourceNodeId
MATCH (target:Loc) WHERE id(target) = targetNodeId
WITH source, target, distance WHERE source <> target
RETURN source.name AS source, target.name AS target, distance
ORDER BY distance DESC
LIMIT 10
-
Minimum Weight Spanning Tree
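// Note: this query expects Place nodes with an id property and LINK relationships,
// unlike the Loc/ROAD sample data in this section; adjust the names to match your data.
MATCH (n:Place {id:"D"})
CALL algo.spanningTree.minimum("Place", "LINK", "cost", id(n),
{write:true, writeProperty:"MINST"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis, computeMillis, writeMillis, effectiveNodeCount;
-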
CASE
MERGE (a:Loc {name:"A"})
MERGE (b:Loc {name:"B"})
MERGE (c:Loc {name:"C"})
MERGE (d:Loc {name:"D"})
MERGE (e:Loc {name:"E"})
MERGE (f:Loc {name:"F"})
MERGE (a)-[:ROAD {cost:50}]->(b)
MERGE (a)-[:ROAD {cost:50}]->(c)
MERGE (a)-[:ROAD {cost:100}]->(d)
MERGE (b)-[:ROAD {cost:40}]->(d)
MERGE (c)-[:ROAD {cost:40}]->(d)
MERGE (c)-[:ROAD {cost:80}]->(e)
MERGE (d)-[:ROAD {cost:30}]->(e)
MERGE (d)-[:ROAD {cost:80}]->(f)
MERGE (e)-[:ROAD {cost:40}]->(f);
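To sanity-check the shortest-path queries against this sample graph without a database, the distances can be computed by hand. The sketch below runs the Floyd (Floyd-Warshall) all-pairs algorithm in Python over the same nine `ROAD` edges. It assumes the relationships are followed only in their stored direction, which may not match how a given Neo4j procedure is configured; the `roads` list and `floyd_warshall` function are illustrative names.

```python
import math

# The ROAD relationships from the sample data, as (source, target, cost)
roads = [
    ("A", "B", 50), ("A", "C", 50), ("A", "D", 100),
    ("B", "D", 40), ("C", "D", 40), ("C", "E", 80),
    ("D", "E", 30), ("D", "F", 80), ("E", "F", 40),
]

def floyd_warshall(edges):
    """All-pairs shortest distances for a directed, weighted edge list."""
    nodes = sorted({n for a, b, _ in edges for n in (a, b)})
    dist = {a: {b: (0 if a == b else math.inf) for b in nodes} for a in nodes}
    for a, b, cost in edges:
        dist[a][b] = min(dist[a][b], cost)
    # Allow each node k in turn to act as an intermediate stop
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

all_pairs = floyd_warshall(roads)
print(all_pairs["A"]["F"])  # cheapest route found is A -> B -> D -> E -> F (or via C)
```

Under the directed assumption, nothing leads back from `F`, so `all_pairs["F"]["A"]` stays infinite; that is the kind of pair the `algo.isFinite` filter in the all-pairs query is meant to drop.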
3. Centrality algorithms
-
PageRank
CALL algo.pageRank.stream("Page", "LINKS",
{iterations:20})
YIELD nodeId, score
MATCH (node) WHERE id(node) = nodeId
RETURN node.name AS page, score
ORDER BY score DESC
-
Degree Centrality
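Degree centrality simply counts the relationships attached to each node. The original post gives no query for it, so here is a small Python sketch of the idea, with in-degree, out-degree, and their sum. The `follows` edge list and the user names are hypothetical.

```python
from collections import Counter

def degree_centrality(edges):
    """Count incoming, outgoing, and total relationships per node."""
    out_degree = Counter(src for src, _ in edges)
    in_degree = Counter(dst for _, dst in edges)
    nodes = set(out_degree) | set(in_degree)
    return {
        n: {
            "in": in_degree[n],
            "out": out_degree[n],
            "total": in_degree[n] + out_degree[n],
        }
        for n in nodes
    }

# Hypothetical FOLLOWS edges as (follower, followed) pairs (assumption)
follows = [("alice", "bob"), ("carol", "bob"), ("bob", "alice")]
degrees = degree_centrality(follows)
print(degrees["bob"])
```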
-
Betweenness Centrality
CALL algo.betweenness.stream("User", "MANAGES", {direction:"out"})
YIELD nodeId, centrality
MATCH (user:User) WHERE id(user) = nodeId
RETURN user.id AS user, centrality
ORDER BY centrality DESC;
-
Closeness Centrality
CALL algo.closeness.stream("Node", "LINK")
YIELD nodeId, centrality
MATCH (n:Node) WHERE id(n) = nodeId
RETURN n.id AS node, centrality
ORDER BY centrality DESC
LIMIT 20;
-
CASE
MERGE (home:Page {name:"Home"})
MERGE (about:Page {name:"About"})
MERGE (product:Page {name:"Product"})
MERGE (links:Page {name:"Links"})
MERGE (a:Page {name:"Site A"})
MERGE (b:Page {name:"Site B"})
MERGE (c:Page {name:"Site C"})
MERGE (d:Page {name:"Site D"})
MERGE (home)-[:LINKS]->(about)
MERGE (about)-[:LINKS]->(home)
MERGE (product)-[:LINKS]->(home)
MERGE (home)-[:LINKS]->(product)
MERGE (links)-[:LINKS]->(home)
MERGE (home)-[:LINKS]->(links)
MERGE (links)-[:LINKS]->(a)
MERGE (a)-[:LINKS]->(home)
MERGE (links)-[:LINKS]->(b)
MERGE (b)-[:LINKS]->(home)
MERGE (links)-[:LINKS]->(c)
MERGE (c)-[:LINKS]->(home)
MERGE (links)-[:LINKS]->(d)
MERGE (d)-[:LINKS]->(home)
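For intuition about what PageRank does with this sample graph, the following Python sketch runs a plain power iteration over the same `LINKS` edges. It assumes a damping factor of 0.85 and the normalized formulation in which scores sum to 1, so the absolute numbers need not match what the Neo4j procedure reports; only the relative ordering is meant to be comparable.

```python
# The LINKS relationships from the sample data, as (source page, target page)
links = [
    ("Home", "About"), ("About", "Home"), ("Product", "Home"), ("Home", "Product"),
    ("Links", "Home"), ("Home", "Links"), ("Links", "Site A"), ("Site A", "Home"),
    ("Links", "Site B"), ("Site B", "Home"), ("Links", "Site C"), ("Site C", "Home"),
    ("Links", "Site D"), ("Site D", "Home"),
]

def pagerank(edges, damping=0.85, iterations=20):
    """Power-iteration PageRank; damping=0.85 is an assumed default."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue  # dangling nodes are ignored in this sketch
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

scores = pagerank(links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])
```

Every other page links back to `Home`, so it ends up with the largest score.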
4. Community detection algorithms
-
Strongly Connected Components
CALL algo.scc.stream("User","FOLLOWS")
YIELD nodeId, partition
MATCH (u:User) WHERE id(u) = nodeId
RETURN u.id AS name, partition
-
Weakly Connected Components (Union Find)
CALL algo.unionFind.stream("User", "FRIEND", {})
YIELD nodeId, setId
MATCH (u:User) WHERE id(u) = nodeId
RETURN u.id AS user, setId
-
Label Propagation
CALL algo.labelPropagation.stream("User", "FOLLOWS",
{direction: "OUTGOING", iterations: 10})
-
Louvain Modularity
CALL algo.louvain.stream("User", "FRIEND", {})
YIELD nodeId, community
MATCH (user:User) WHERE id(user) = nodeId
RETURN user.id AS user, community
ORDER BY community;
-
Triangle Count and Average Clustering Coefficient
CALL algo.triangle.stream("Person","KNOWS")
YIELD nodeA,nodeB,nodeC
MATCH (a:Person) WHERE id(a) = nodeA
MATCH (b:Person) WHERE id(b) = nodeB
MATCH (c:Person) WHERE id(c) = nodeC
RETURN a.id AS nodeA, b.id AS nodeB, c.id AS nodeC
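The query above lists triangles; the related clustering coefficient measures how many of a node's neighbor pairs are themselves connected. A minimal Python sketch of both quantities on an undirected graph follows. The `knows` adjacency data and the `p1`..`p4` identifiers are hypothetical.

```python
from itertools import combinations

def triangles_and_clustering(adjacency):
    """Per-node triangle count and local clustering coefficient."""
    triangles = {}
    coefficient = {}
    for node, neighbors in adjacency.items():
        # A triangle through `node` is a pair of its neighbors that know each other
        closed = sum(
            1 for a, b in combinations(sorted(neighbors), 2) if b in adjacency[a]
        )
        degree = len(neighbors)
        possible = degree * (degree - 1) / 2
        triangles[node] = closed
        coefficient[node] = closed / possible if possible else 0.0
    return triangles, coefficient

# Hypothetical undirected KNOWS graph as adjacency sets (assumption)
knows = {
    "p1": {"p2", "p3"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p3"},
}
triangles, coefficient = triangles_and_clustering(knows)
total_triangles = sum(triangles.values()) // 3  # each triangle is seen from 3 nodes
average_clustering = sum(coefficient.values()) / len(coefficient)
print(total_triangles, average_clustering)
```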
5. References
- Neo4j in deep
- Official documentation: Comprehensive-Guide-to-Graph-Algorithms-in-Neo4j-ebook
Original post: https://chenson.cc/2018/08/18/%E5%9B%BE%E6%95%B0%E6%8D%AE%E5%BA%93-Neo4j-%E5%B8%B8%E7%94%A8%E7%AE%97%E6%B3%95/
</div>