
Understanding Memgraph Through Dependency Graphs
Memgraph is a large C++ system with multiple dependency layers, and each layer reveals a different part of how the system behaves under change.
In this article, we look at three complementary graph views: a CMake target graph, a Ninja build graph, and a Conan package graph. Together, they provide a practical map of architecture, build execution, and third-party dependency risk.
At a high level, each graph is a different lens:
- CMake: What does our target-level architecture and coupling look like?
- Ninja: Which build-step paths dominate incremental rebuild cost?
- Conan: Where is third-party package influence most concentrated?
Why This Is Useful
Large C++ systems are difficult to reason about from source files alone. Graph analysis makes hidden structure visible: how connected the system is, which nodes have outsized influence, where chokepoints appear, and where cycles may indicate tighter coupling than expected.
How the Graphs Are Generated (High Level)
The workflow is straightforward: export each graph in Graphviz DOT format, convert the DOT output to Cypher, import it into Memgraph, then run graph algorithms.
Note: All diagrams in this post are SVG files, so you can download them and explore them locally.
1) CMake Graph (Target-Level Architecture)
CMake captures target-level relationships between libraries, executables, and supporting build targets. During the configure/generate phase, CMake can export a Graphviz view as a DOT graph. That DOT output is converted to Cypher and imported into Memgraph for analysis.
Legend (CMake node colors):
- ● Executable
- ● Static library
- ● Shared library
- ● Module library
- ● Interface library
- ● Object library
- ● External library
- ● Custom target
- → Interface dependency edge
- → Private dependency edge
- → General dependency edge
- The main memgraph node is shown at a larger size.
2) Ninja Graph (Build Execution Mechanics)
Ninja captures concrete build-step dependencies such as compile, link, and generated-file steps. After configuration, Ninja has a concrete plan in build.ninja, and ninja -t graph emits this dependency structure as DOT. As with the CMake graph, we convert DOT to Cypher and import it into Memgraph.
Legend (Ninja node colors):
- ● Ninja target
- ● Ninja rule
- ● Ninja build step
- ● Ninja phony rule
- ● Ninja artifact
- → Input dependency edge
- → Build dependency edge
- → Other/unspecified edge
- The main memgraph node is shown at a larger size.
3) Conan Graph (Third-Party Packages)
Conan captures package-level relationships during dependency resolution. We generate a DOT representation with conan graph info --format dot, then convert it to Cypher and load it the same way for consistent analysis in Memgraph.
Legend (Conan node colors):
- ● Memgraph root package
- ● Direct dependencies
- ● Transitive dependencies
- → Package dependency edge
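The three export steps described in sections 1 to 3 can be sketched as a small helper that assembles the commands. This is a minimal sketch under stated assumptions: the directory names and output file names are hypothetical, the Conan invocation assumes Conan 2 with the recipe in the source directory, and the exact options used for Memgraph's own builds are not given in this article.

```python
import subprocess
from pathlib import Path


def export_commands(source_dir, build_dir, out_dir):
    """Build the command lines for the three DOT exports.

    All paths are hypothetical placeholders chosen for illustration.
    """
    cmake_dot = Path(out_dir) / "cmake.dot"
    return {
        # CMake writes the target graph to the file given via --graphviz.
        "cmake": ["cmake", "-S", source_dir, "-B", build_dir, "-G", "Ninja",
                  f"--graphviz={cmake_dot}"],
        # Ninja prints the build graph as DOT on stdout.
        "ninja": ["ninja", "-C", build_dir, "-t", "graph"],
        # Conan prints the package graph as DOT on stdout (Conan 2 syntax assumed).
        "conan": ["conan", "graph", "info", source_dir, "--format=dot"],
    }


def capture_dot(command, destination):
    """Run a command that prints DOT on stdout and save the output to a file."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    Path(destination).write_text(result.stdout)
```

Only the Ninja and Conan commands need `capture_dot`, since CMake writes its DOT file directly during configuration.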
From DOT to Analysis
After generating the DOT files, nodes and edges are normalized into a property-graph model and emitted as Cypher statements. The graph is imported into Memgraph, where we run algorithms such as connected components, bridges, PageRank, betweenness, and cycle detection.
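As an illustration of the DOT-to-Cypher step, here is a minimal sketch that handles only the simplest DOT form, quoted `"a" -> "b"` edge lines. It is not the converter used for this article: real CMake and Ninja DOT output also carries node IDs, labels, and attributes that this sketch ignores. The `DotNode` label is a hypothetical name; `DOT_EDGE` and `display_name` are taken from the query shown later in this post.

```python
import re

# Matches edges written as "source" -> "target"; attributes are ignored.
EDGE_RE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"')


def cypher_string(value):
    """Quote a Python string as a double-quoted Cypher string literal."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def dot_to_cypher(dot_text):
    """Emit one MERGE per node, then one MATCH/MERGE per edge."""
    edges = EDGE_RE.findall(dot_text)
    nodes = sorted({name for edge in edges for name in edge})
    statements = [
        f"MERGE (:DotNode {{display_name: {cypher_string(name)}}});"
        for name in nodes
    ]
    for src, dst in edges:
        statements.append(
            f"MATCH (a:DotNode {{display_name: {cypher_string(src)}}}), "
            f"(b:DotNode {{display_name: {cypher_string(dst)}}}) "
            "MERGE (a)-[:DOT_EDGE]->(b);"
        )
    return statements
```

Using `MERGE` rather than `CREATE` keeps the import idempotent, so re-running the same statements does not duplicate nodes or edges.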
Key Findings
Snapshot Metrics
| Graph | Nodes | Edges | Bridges | Articulation points | WCC | SCC | Is DAG | Avg degree |
|---|---|---|---|---|---|---|---|---|
| CMake | 1140 | 2960 | 77 | 44 | 16 | 1124 | False | 2.60 |
| Ninja | 2298 | 4537 | 259 | 40 | 1 | 2298 | True | 1.97 |
| Conan | 62 | 139 | 22 | 8 | 1 | 62 | True | 2.24 |
Table caption: WCC = weakly connected components, SCC = strongly connected components, and DAG = directed acyclic graph.
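To make the column definitions concrete, the sketch below computes the same kinds of metrics on a toy graph in plain Python. It is an illustration only, not the Memgraph procedures used to produce the table. The "Avg degree" values in the table are consistent with edges divided by nodes (for example 2960 / 1140 ≈ 2.60), so the sketch uses that definition.

```python
from collections import defaultdict


def count_wcc(nodes, edges):
    """Count weakly connected components with a small union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path as we walk up
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def count_scc(nodes, edges):
    """Count strongly connected components with iterative Kosaraju."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for a, b in edges:
        fwd[a].append(b)
        rev[b].append(a)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: each DFS tree on the reversed graph is one SCC.
    seen, count = set(), 0
    for start in reversed(order):
        if start in seen:
            continue
        count += 1
        seen.add(start)
        todo = [start]
        while todo:
            node = todo.pop()
            for nxt in rev[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return count


def snapshot(nodes, edges):
    """Return the table-style metrics for one directed graph."""
    scc = count_scc(nodes, edges)
    return {
        "wcc": count_wcc(nodes, edges),
        "scc": scc,
        # Acyclic iff every SCC is a single node and there are no self-loops.
        "is_dag": scc == len(nodes) and all(a != b for a, b in edges),
        "avg_degree": round(len(edges) / len(nodes), 2),
    }
```

For a toy graph with a three-node cycle `a -> b -> c -> a` and a separate edge `d -> e`, `snapshot` reports two weakly connected components, three strongly connected components, and `is_dag` as `False`, which is the same pattern the CMake row shows at a larger scale.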
Quick interpretation:
- CMake is not a DAG while Ninja is, which suggests architecture-level cycles can exist even when build execution flow remains acyclic.
- CMake has 1124 SCCs across 1140 nodes, so most SCCs are trivial single-node components. The gap of 16 means there are at most 16 multi-node SCCs, which together contain between 17 and 32 targets.
- Bridge and articulation counts indicate structural chokepoints: they are useful when intentional around stable boundaries, but risky when concentrated in frequently changing areas.
- Ninja and Conan each form a single weakly connected component, which suggests those layers are highly connected and have limited true modular isolation.
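The chokepoint notions in the list above can be illustrated with a short sketch. A bridge is an edge whose removal disconnects the graph, and an articulation point is a node whose removal does so. The sketch assumes edge direction is ignored for this analysis and uses a recursive depth-first search, which is fine for toy inputs but would need an iterative rewrite for graphs with thousands of nodes. It is not the implementation used to produce the table.

```python
from collections import defaultdict


def bridges_and_articulation_points(edges):
    """Find bridges and articulation points, treating edges as undirected."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # self-loops cannot be bridges
            adj[a].add(b)
            adj[b].add(a)

    disc, low = {}, {}  # discovery time and lowest reachable discovery time
    bridges, cut_nodes = set(), set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        children = 0
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                # Back edge: node can reach an earlier-discovered node.
                low[node] = min(low[node], disc[nxt])
            else:
                children += 1
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                if low[nxt] > disc[node]:
                    bridges.add(frozenset((node, nxt)))
                if parent is not None and low[nxt] >= disc[node]:
                    cut_nodes.add(node)
        # A DFS root is an articulation point only with 2+ DFS children.
        if parent is None and children > 1:
            cut_nodes.add(node)

    for node in list(adj):
        if node not in disc:
            dfs(node, None)
    return bridges, cut_nodes
```

On a triangle `a`, `b`, `c` with a tail `c - d - e`, the two tail edges are bridges and `c` and `d` are articulation points, while the triangle edges are not bridges because each has an alternative route.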
What this means in practice:
The system is mostly one large connected structure (Ninja and Conan each form a single weakly connected component, and CMake splits into only 16), so changes can have wider-than-expected ripple effects. CMake is the clearest architecture lens, and its non-DAG signal indicates cycle hotspots that deserve extra care during refactors. Ninja is the strongest build-performance lens, where central nodes are mostly linker/build-mechanics nodes and therefore good optimization targets. Conan is compact and package-focused, making it useful for third-party risk/governance (for example AWS/curl/OpenSSL), especially during dependency upgrades.
Top-Central Nodes (Examples)
To keep comparisons meaningful, the rankings below come from homogeneous projections: same node type and edge type per graph (CMake targets with Dependency, Ninja targets with BuildDependency, Conan packages with Requires). This avoids mixed-type artifacts where infrastructure/helper nodes can dominate rankings, and makes the results easier to interpret at the target/package level. For CMake, we therefore show both all projected targets and a second view excluding shipped query modules. For Ninja, this projection shifts focus away from low-level linker/dyndep internals toward target-level build-order chokepoints.
PageRank (Top 5)
| Rank | Conan | CMake (all projected targets) | CMake (excluding query modules) | Ninja |
|---|---|---|---|---|
| 1 | aws-c-common/0.12.5 | mg-utils (mg::utils) | mg-utils (mg::utils) | cmake_object_order_depends_target_mg-utils |
| 2 | zlib/1.3.1 | community_detection_online | memgraph | cmake_object_order_depends_target_mg-communication |
| 3 | openssl/3.0.18 | convert | gflags | cmake_object_order_depends_target_mg-storage-v2 |
| 4 | m4/1.4.19 | pagerank_online | nlohmann_json (nlohmann_json::nlohmann_json) | cmake_object_order_depends_target_mg-settings |
| 5 | cmake/4.2.0 | vector_search | fmt::fmt | cmake_object_order_depends_target_mg-parameters |
Betweenness Centrality (Top 5)
| Rank | Conan | CMake | Ninja |
|---|---|---|---|
| 1 | aws-sdk-cpp/1.11.692 | mg-utils (mg::utils) | cmake_object_order_depends_target_mg-utils |
| 2 | libcurl/8.17.0 | mg-storage-v2 (mg::storage) | cmake_object_order_depends_target_mg-settings |
| 3 | aws-c-cal/0.9.8 | mg-communication | cmake_object_order_depends_target_mg-requests |
| 4 | openssl/3.0.18 | mg-rpc (mg::rpc) | cmake_object_order_depends_target_mg-kvstore |
| 5 | aws-c-io/0.23.2 | mg-dbms | cmake_object_order_depends_target_mg-slk |
These tables show three different "centers of gravity" in the same system. At a high level, PageRank highlights nodes with broad global influence (many important nodes depend on them, directly or transitively), while betweenness centrality highlights bridge-like chokepoints that sit on many shortest paths between other nodes. For engineering decisions, that means high-PageRank nodes are where changes can create wide transitive ripple effects, and high-betweenness nodes are where changes can fragment workflows or create bottlenecks if interfaces shift.
In CMake, the "excluding query modules" PageRank view and projected betweenness both rank mg-utils first, and betweenness additionally surfaces core architecture chokepoints such as mg-storage-v2, mg-communication, and mg-rpc. In Ninja, both rankings are led by target-level build-order chokepoints (the cmake_object_order_depends_target_* nodes, with mg-utils first in each), which are practical optimization points for reducing rebuild fanout. In Conan, they land on package hubs in the AWS/curl/OpenSSL ecosystem, where upgrade and compatibility risk is concentrated.
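The difference between the two centrality measures is easier to see on a toy graph. The sketch below implements PageRank by power iteration and betweenness with Brandes' algorithm for unweighted directed graphs. The node names are hypothetical, edges point from a dependent to its dependency, and the damping factor of 0.85 is a common default rather than a setting confirmed by this article.

```python
from collections import deque


def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; dangling nodes spread their rank uniformly."""
    out = {n: [] for n in nodes}
    for a, b in edges:
        out[a].append(b)
    size = len(nodes)
    rank = {n: 1.0 / size for n in nodes}
    for _ in range(iterations):
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping) / size + damping * dangling / size for n in nodes}
        for n in nodes:
            for m in out[n]:
                new[m] += damping * rank[n] / len(out[n])
        rank = new
    return rank


def betweenness(nodes, edges):
    """Brandes' algorithm: unnormalized betweenness on a directed graph."""
    out = {n: [] for n in nodes}
    for a, b in edges:
        out[a].append(b)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        visited = []
        preds = {n: [] for n in nodes}
        sigma = {n: 0 for n in nodes}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            visited.append(v)
            for w in out[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {n: 0.0 for n in nodes}
        while visited:
            w = visited.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical toy dependency graph: dependent -> dependency.
TOY_NODES = ["tests", "app", "storage", "rpc", "utils"]
TOY_EDGES = [("tests", "app"), ("app", "storage"), ("app", "rpc"),
             ("storage", "utils"), ("rpc", "utils")]
```

On this toy graph, `utils` has the highest PageRank because everything eventually depends on it, while `app` has the highest betweenness because every path from `tests` to the lower layers passes through it. That is the same split between "widely depended upon" and "sits in the middle" that the tables show.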
Practical Engineering Takeaways
When modifying Memgraph code, treat central targets such as mg-utils, mg-storage-v2, and mg-query as high blast-radius areas. In incremental builds, interface and header changes in these layers can trigger much wider rebuild and relink cascades than implementation-only .cpp changes. In this snapshot, traversing the Ninja graph from CXX_DYNDEP__mg-utils_Release reaches 966 downstream build nodes, including 26 downstream NinjaTarget nodes. A query along these lines counts the downstream targets:

```cypher
MATCH (src {display_name: "CXX_DYNDEP__mg-utils_Release"})-[:DOT_EDGE*1..]->(d)
WHERE d.node_kind = "NinjaTarget"
RETURN count(DISTINCT d) AS downstream_targets;
```

A practical workflow is to use CMake to estimate architecture-level impact first, then use Ninja to confirm real rebuild impact. Changes around bridge or chokepoint targets should be treated as higher risk and tested more thoroughly.
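The downstream traversal used for the Ninja impact numbers can also be sketched outside the database as a breadth-first search. This mirrors the variable-length `DOT_EDGE*1..` pattern: every node reachable in one or more hops counts, and the source itself is included only if it lies on a cycle. The node names and the `kinds` mapping below are hypothetical toy data, not values from the Memgraph build graph.

```python
from collections import deque


def downstream(edges, source):
    """Return all nodes reachable from source in one or more hops."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def downstream_of_kind(edges, kinds, source, wanted_kind):
    """Filter the downstream set by node kind, like the WHERE clause."""
    return {n for n in downstream(edges, source) if kinds.get(n) == wanted_kind}


# Hypothetical toy build graph.
TOY_EDGES = [("dyndep", "utils.o"), ("utils.o", "libutils.a"),
             ("libutils.a", "app"), ("libutils.a", "tests")]
TOY_KINDS = {"utils.o": "NinjaArtifact", "libutils.a": "NinjaArtifact",
             "app": "NinjaTarget", "tests": "NinjaTarget"}
```

Calling `downstream_of_kind(TOY_EDGES, TOY_KINDS, "dyndep", "NinjaTarget")` returns the two toy targets, which corresponds to the `count(DISTINCT d)` result in the database query.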
When optimizing build speed, start with Ninja chokepoints and linker-heavy nodes. The most reliable wins usually come from reducing rebuild fanout and relink scope: move unstable implementation details out of widely included public headers, prefer forward declarations where practical, split broad utility modules, break large monolithic targets into smaller components, and avoid top-level executable relinks for small internal changes. In parallel, keep interfaces stable at target boundaries so internal edits do not invalidate large downstream portions of the graph.
What a Coding Agent Can Do With These Graphs
Because this data is in Memgraph, we can now programmatically ask impact-oriented questions before making changes: Which targets are downstream of this node? Which nodes are central chokepoints right now? Which build-step paths dominate relink cost? Which third-party packages have high structural influence? A coding agent can use those answers to produce better-scoped change plans: prioritize downstream tests, focus optimization work on high-impact Ninja nodes, and flag higher-risk Conan package upgrades for stricter compatibility validation.
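One way an agent might turn these questions into actions is sketched below: a helper that builds a parameterized version of the downstream query, and a helper that maps the returned targets to test suites. This is a hypothetical sketch, not an existing Memgraph or agent API. The `tests_by_target` mapping and its suite names are invented for illustration, and sending the query to the database is left to whichever client the agent already uses.

```python
def downstream_query(display_name, node_kind=None):
    """Build a parameterized Cypher query and its parameters.

    Uses the DOT_EDGE relationship and the display_name/node_kind
    properties that appear in the query shown earlier in this post.
    """
    lines = ["MATCH (src {display_name: $name})-[:DOT_EDGE*1..]->(d)"]
    params = {"name": display_name}
    if node_kind is not None:
        lines.append("WHERE d.node_kind = $kind")
        params["kind"] = node_kind
    lines.append("RETURN DISTINCT d.display_name AS downstream;")
    return "\n".join(lines), params


def prioritize_tests(downstream_targets, tests_by_target):
    """Collect test suites for affected targets, without duplicates.

    tests_by_target is a hypothetical mapping from target name to a
    list of test suite names; order of first appearance is preserved.
    """
    ordered = []
    for target in downstream_targets:
        for suite in tests_by_target.get(target, []):
            if suite not in ordered:
                ordered.append(suite)
    return ordered
```

Passing values as query parameters instead of formatting them into the query text avoids quoting problems with target names that contain special characters.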
Wrapping Up
No single dependency graph is enough on its own. CMake shows architecture and coupling, Ninja shows execution and rebuild behavior, and Conan shows where third-party package risk is concentrated. For Memgraph, this layered approach turns dependency analysis from an abstract exercise into a practical workflow for safer refactors and faster builds.