community_detection
This query module enables using the Louvain method (opens in a new tab)[1] for community detection, using the Grappolo (opens in a new tab) parallel implementation.
The Louvain algorithm belongs to the modularity maximization family of community detection algorithms. Each node is initially assigned to its own community, and the algorithm uses a greedy heuristic to search for the community partition with the highest modularity score by merging previously obtained communities.
The algorithm is suitable for large-scale graphs as it runs in O(nlogn) time on a graph with n nodes. Further performance gains are obtained by parallelization using a distance-1 graph coloring heuristic, and a graph coarsening algorithm that aims to preserve communities.
[^1] Fast unfolding of communities in large networks (opens in a new tab), Blondel et al.
Trait | Value |
---|---|
Module type | algorithm |
Implementation | C++ |
Graph direction | undirected |
Relationship weights | weighted / unweighted |
Parallelism | parallel |
Procedures
You can execute this algorithm on graph projections, subgraphs or portions of the graph.
get(weight, coloring, min_graph_shrink, community_alg_threshold, coloring_alg_threshold)
Computes graph communities using the Louvain method.
Input:
weight: string (default=null)
➡ Specifies the default edge weight. If not set, the algorithm uses theweight
edge attribute when present and otherwise treats the graph as unweighted.coloring: boolean (default=False)
➡ If set, use the graph coloring heuristic for effective parallelization.min_graph_shrink: integer (default=100000)
➡ The graph coarsening optimization stops upon shrinking the graph to this many nodes.community_alg_threshold: double (default=0.000001)
➡ Controls how long the algorithm iterates. When the gain in modularity goes below the threshold, iteration is over. Valid values are between 0 and 1 (exclusive).coloring_alg_threshold: double (default=0.01)
➡ If coloring is enabled, controls how long the algorithm iterates. When the gain in modularity goes below this threshold, a final iteration is performed using thecommunity_alg_threshold
value. Valid values are between 0 and 1 (exclusive); this parameter's value should be higher thancommunity_alg_threshold
.
Output:
node: Vertex
➡ Graph node.community_id: integer
➡ Community ID. Defaults to -1 if the node does not belong to any community.
Usage:
CALL community_detection.get()
YIELD node, community_id;
get_subgraph(subgraph_nodes, subgraph_relationships, weight, coloring, min_graph_shrink, community_alg_threshold, coloring_alg_threshold)
Computes graph communities over a subgraph using the Louvain method.
Input:
subgraph_nodes: List[Node]
➡ List of nodes in the subgraph.subgraph_relationships: List[Relationship]
➡ List of relationships in the subgraph.weight: str (default=null)
➡ Specifies the default relationship weight. If not set, the algorithm uses theweight
relationship attribute when present and otherwise treats the graph as unweighted.coloring: bool (default=False)
➡ If set, use the graph coloring heuristic for effective parallelization.min_graph_shrink: int (default=100000)
➡ The graph coarsening optimization stops upon shrinking the graph to this many nodes.community_alg_threshold: double (default=0.000001)
➡ Controls how long the algorithm iterates. When the gain in modularity goes below the threshold, iteration is over. Valid values are between 0 and 1 (exclusive).coloring_alg_threshold: double (default=0.01)
➡ If coloring is enabled, controls how long the algorithm iterates. When the gain in modularity goes below this threshold, a final iteration is performed using thecommunity_alg_threshold
value. Valid values are between 0 and 1 (exclusive); this parameter's value should be higher thancommunity_alg_threshold
.
Output:
node: Vertex
➡ Graph node.community_id: int
➡ Community ID. Defaults to -1 if the node does not belong to any community.
Usage:
MATCH (a)-[e]-(b)
WITH COLLECT(a) AS nodes, COLLECT (e) AS relationships
CALL community_detection.get_subgraph(nodes, relationships)
YIELD node, community_id;
Example
Input graph
Load commands
MERGE (a: Node {id: 0}) MERGE (b: Node {id: 1}) CREATE (a)-[r: Relation]->(b);
MERGE (a: Node {id: 0}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b);
MERGE (a: Node {id: 1}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b);
MERGE (a: Node {id: 2}) MERGE (b: Node {id: 3}) CREATE (a)-[r: Relation]->(b);
MERGE (a: Node {id: 3}) MERGE (b: Node {id: 4}) CREATE (a)-[r: Relation]->(b);
MERGE (a: Node {id: 3}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b);
MERGE (a: Node {id: 4}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b);
Running command
CALL community_detection.get()
YIELD node, community_id
RETURN node.id AS node_id, community_id
ORDER BY node_id;
Results
┌─────────────────────────┬─────────────────────────┐
│ node_id │ community_id │
├─────────────────────────┼─────────────────────────┤
│ 0 │ 1 │
│ 1 │ 1 │
│ 2 │ 1 │
│ 3 │ 2 │
│ 4 │ 2 │
│ 5 │ 2 │
└─────────────────────────┴─────────────────────────┘