Memgraph or Neo4j: Analyzing Write Speed Performance
Imagine creating a node in a database or reading a node from the database. Have you ever wondered which operation is more complex? Databases handle write and read operations differently, and each comes with its own technical challenges. In this analysis, we delve into the technical complexities of write operations in Memgraph and Neo4j, examine the impact of data storage, and present benchmark results that highlight Memgraph's performance advantage for write-heavy applications.
Technical Complexity of Write Operations
From a technical perspective, write operations are more complex than read operations. Several factors contribute to this complexity, primarily around maintaining ACID properties (Atomicity, Consistency, Isolation, Durability). When performing a simple write operation, here’s what you need to consider:
- Concurrency control - Managing locks to prevent simultaneous writes to the same node property by different threads.
- Data integrity - Enforcing data integrity with triggers and constraints during write operations.
- Durability - Ensuring data is not lost in case of a power outage. This requires mechanisms like write-ahead logging and disaster recovery.
- Performance - Maintaining high performance to execute write operations within milliseconds across various data scales.
Read operations, in contrast, are easier to manage and optimize due to fewer technical challenges.
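To make the list above concrete, here is a minimal sketch of what a single write can involve in a toy in-memory store. It is an illustration under stated assumptions, not how Memgraph or Neo4j implement writes: the `TinyStore` class, its JSON log format, and the uniqueness rule on `node_id` are all hypothetical.

```python
import json
import os
import tempfile
import threading


class TinyStore:
    """Toy in-memory node store with a write-ahead log (illustration only)."""

    def __init__(self, wal_path):
        self._lock = threading.Lock()  # concurrency control: one writer at a time
        self._nodes = {}               # in-memory state
        self._wal_path = wal_path

    def create_node(self, node_id, properties):
        with self._lock:
            # Data integrity: a hypothetical uniqueness constraint on node_id.
            if node_id in self._nodes:
                raise ValueError(f"node {node_id} already exists")
            # Durability: append the change to the log and force it to disk
            # before applying it to the in-memory state.
            with open(self._wal_path, "a", encoding="utf-8") as wal:
                wal.write(json.dumps({"id": node_id, "props": properties}) + "\n")
                wal.flush()
                os.fsync(wal.fileno())
            self._nodes[node_id] = properties

    def read_node(self, node_id):
        # In this toy model a read is a plain dictionary lookup.
        return self._nodes.get(node_id)

    @classmethod
    def recover(cls, wal_path):
        """Rebuild the in-memory state by replaying the log after a crash."""
        store = cls(wal_path)
        with open(wal_path, encoding="utf-8") as wal:
            for line in wal:
                record = json.loads(line)
                store._nodes[record["id"]] = record["props"]
        return store


if __name__ == "__main__":
    wal_file = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyStore(wal_file)
    store.create_node(1, {"name": "Piano"})
    print(TinyStore.recover(wal_file).read_node(1))
```

Even in this simplified form, the write path takes a lock, checks a constraint, and waits for the disk, while the read path does a single lookup.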
Impact of Data Storage on Performance
The performance of read and write operations depends significantly on the data storage medium. The famous table “Latency Numbers Every Programmer Should Know” highlights this:
- Reading 1 MB sequentially from disk: 20,000 microseconds (20 ms)
- Reading 1 MB sequentially from SSD: 1,000 microseconds (1 ms)
- Reading 1 MB sequentially from memory: 250 microseconds (0.25 ms)
This table shows that sequential reads from RAM are 80 times faster than from a traditional spinning disk and 4 times faster than from an SSD. The table lists only read performance, but similar principles apply to write operations. Typically, modern SSDs have a read/write throughput in the range of single-digit GB/second. In contrast, RAM can achieve a throughput in the tens of GB/second. However, these figures can vary significantly depending on the specific hardware models, RAM generations, and implementation details.
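The ratios between the storage media follow directly from the three latency figures quoted above. A short sketch of the arithmetic, where the dictionary and function names are only illustrative:

```python
# Time to read 1 MB sequentially, in microseconds (figures quoted above).
READ_1MB_US = {"disk": 20_000, "ssd": 1_000, "memory": 250}


def slowdown_vs_memory(medium):
    """How many times slower `medium` is than RAM for a 1 MB sequential read."""
    return READ_1MB_US[medium] / READ_1MB_US["memory"]


for medium in ("disk", "ssd"):
    print(f"memory is {slowdown_vs_memory(medium):.0f}x faster than {medium}")
```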
Storage Modes: Neo4j and Memgraph
Neo4j employs a hybrid storage model, using both disk and RAM. In contrast, Memgraph supports multiple storage modes, with the most performant being the IN_MEMORY_TRANSACTIONAL mode, where all data resides in RAM. This results in much faster read and write operations compared to disk-based storage.
Benchmark Results: Memgraph Outperforms Neo4j
Benchmark tests clearly demonstrate that Memgraph's write performance surpasses Neo4j's.
The results showed that both edge writes and node writes in Memgraph are significantly faster than those in Neo4j when using the default benchgraph presets. This performance difference is due to Memgraph's native use of RAM for storage, while Neo4j primarily uses disk.
As a result, Memgraph is an excellent choice for applications with frequently changing graphs and write-heavy operations.
Latency performance follows a similar pattern.
P99 latency is the time within which 99% of queries complete, so it reflects how the slowest 1% of queries behave in both Neo4j and Memgraph. Memgraph exhibits lower and more stable write latency than Neo4j.
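For readers who want to compute P99 from their own measurements, here is a minimal sketch using the nearest-rank definition of a percentile. Benchmarking tools may use a different interpolation method, and the sample latencies below are made up for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 1 ms, 2 ms, ..., 100 ms.
latencies_ms = list(range(1, 101))
print("P50:", percentile(latencies_ms, 50), "ms")
print("P99:", percentile(latencies_ms, 99), "ms")
```

With this definition, a single slow outlier among 100 samples does not move P99, which is why tail latency and one-off slow runs are worth examining separately.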
If you have doubts about these benchmarks, you can start an empty Neo4j Docker instance, connect to the container shell, use the cypher-shell to connect to the database, and run a simple write query. You will observe similar results.
For example, this is what you get when running CREATE (:Object {name: "Piano"}); twice:
root@ff5408b31b9c:/var/lib/neo4j/bin# cypher-shell
Connected to Neo4j using Bolt protocol version 5.4 at neo4j://localhost:7687.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
@neo4j> CREATE (:Object {name: "Piano"});
0 rows
ready to start consuming query after 94 ms, results consumed after another 0 ms
Added 1 nodes, Set 1 properties, Added 1 labels
@neo4j> CREATE (:Object {name: "Piano"});
0 rows
ready to start consuming query after 8 ms, results consumed after another 0 ms
Added 1 nodes, Set 1 properties, Added 1 labels
@neo4j>
Creating a node in Neo4j takes 94 ms on the first run and 8 ms on the second.
The first run is typically much slower than subsequent runs because of caching and other warm-up work in the database, which P99 latency measurements do not account for. This effect is especially noticeable in JVM-based systems.
You can replicate these tests on Memgraph as well. Start the Memgraph container, connect to the container shell, use mgconsole to connect to the database, and run the query. You should see similar patterns, with the first run being slightly slower and subsequent runs being faster.
root@aa301cdb6f9a:/usr/lib/memgraph# mgconsole --fit-to-screen
mgconsole 1.4
Connected to 'memgraph://127.0.0.1:7687'
Type :help for shell usage
Quit the shell by typing Ctrl-D(eof) or :quit
memgraph> CREATE (:Object {name: "Piano"});
Empty set (round trip in 0.007 sec)
1 labels have been created.
1 nodes have been created.
memgraph> CREATE (:Object {name: "Piano"});
Empty set (round trip in 0.000 sec)
1 labels have been created.
1 nodes have been created.
memgraph>
For Memgraph, creating a node takes 7 ms on the first run and less than 1 ms on subsequent runs.
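If you want to separate the first (cold) run from later (warm) runs in your own measurements, a small timing harness is enough. The sketch below is an assumption-laden stand-in: `fake_query` only simulates a one-off setup cost with `time.sleep`, and in a real test you would replace it with a function that sends the query to the database through a client of your choice.

```python
import time


def time_runs(operation, runs=5):
    """Call `operation` `runs` times and return (first_ms, later_ms_list)."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms[0], timings_ms[1:]


# Stand-in workload (hypothetical): the first call pays a one-off setup cost.
_cache = {}


def fake_query():
    if "plan" not in _cache:
        time.sleep(0.05)   # pretend to compile and cache a query plan
        _cache["plan"] = True
    time.sleep(0.001)      # pretend to execute the query


cold_ms, warm_ms = time_runs(fake_query)
print(f"first run: {cold_ms:.1f} ms, best later run: {min(warm_ms):.1f} ms")
```

Reporting the cold run separately keeps it from distorting averages while still making the warm-up cost visible.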
Large-Scale Write Performance
The query above is a simple one, but it illustrates what happens on a larger scale. Let's test this by creating 100,000 nodes.
root@a3c3b810c54f:/var/lib/neo4j/bin# cypher-shell
Connected to Neo4j using Bolt protocol version 5.4 at neo4j://localhost:7687.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
@neo4j> UNWIND range(1, 100000) AS id
CREATE (:Person {name: toString(id)});
0 rows
ready to start consuming query after 3787 ms, results consumed after another 0 ms
Added 100000 nodes, Set 100000 properties, Added 100000 labels
@neo4j>
It took 3.8 seconds for Neo4j to create 100k nodes.
Creating 100k nodes with Memgraph:
root@ae975247997a:/usr/lib/memgraph# mgconsole --fit-to-screen
mgconsole 1.4
Connected to 'memgraph://127.0.0.1:7687'
Type :help for shell usage
Quit the shell by typing Ctrl-D(eof) or :quit
memgraph> UNWIND range(1, 100000) AS id
-> CREATE (:Person {name: toString(id)});
Empty set (round trip in 0.339 sec)
100000 labels have been created.
100000 nodes have been created.
Memgraph takes about 340 milliseconds, which makes it roughly 11 times faster than Neo4j in this run. That's a substantial performance advantage.
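The two timings reported in the shell outputs above can be turned into a speedup factor and an approximate write throughput. This is simple arithmetic on a single run, so treat it as a rough indication only. Note also that the two shells report slightly different measures: cypher-shell prints the time until the query is ready to be consumed, while mgconsole prints the round-trip time.

```python
NODES = 100_000
NEO4J_MS = 3787     # cypher-shell: "ready to start consuming query after 3787 ms"
MEMGRAPH_MS = 339   # mgconsole: "round trip in 0.339 sec"


def nodes_per_second(nodes, elapsed_ms):
    """Approximate write throughput for a single bulk insert."""
    return nodes / (elapsed_ms / 1000.0)


speedup = NEO4J_MS / MEMGRAPH_MS
print(f"Neo4j:    {nodes_per_second(NODES, NEO4J_MS):,.0f} nodes/s")
print(f"Memgraph: {nodes_per_second(NODES, MEMGRAPH_MS):,.0f} nodes/s")
print(f"Speedup:  {speedup:.1f}x")
```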
Conclusion
The benchmarks reveal a consistent pattern: Memgraph's write performance is superior to Neo4j's. The actual performance will vary depending on the environment and configuration, but Memgraph's in-memory storage provides a clear edge for write-heavy applications.
Refer to detailed Memgraph benchmarking results here.
Try It Yourself
Experiment with different configurations and run your own benchmarks to see the results in your environment. For any questions or debates, join our Discord community and let’s chat!