Neo4j vs Memgraph - How to choose a graph database?

by
Ivan Despot

Introduction to graph databases

Graph databases are gaining traction across a variety of applications and industry sectors. From fraud detection and supply chain optimization to machine learning and artificial intelligence, graph analytics enable developers to build a new generation of applications centered around network analysis.

In this article, we compare two leading graph databases, Memgraph and Neo4j, to help you choose the best graph analytics platform for your needs. Although comparisons usually focus on performance benchmarks, there are many other crucial factors to consider when choosing a database for your business. With that in mind, we’ll start by looking at some of the similarities between Memgraph and Neo4j before moving on to the differentiating factors between the two graph databases.

What is Neo4j?

Neo4j is an ACID-compliant transactional native graph database. It is a disk-based system mainly implemented in Java and has been publicly available since 2007. It’s the most widely utilized graph solution out there.

What is Memgraph?

Memgraph is a platform designed for graph computations on streaming data. It’s powered by a high-performance, ACID-compliant transactional native graph database. It’s engineered from the ground-up leveraging an in-memory first, durable, and redundant architecture and a C/C++ implementation to deliver the unique capability of supporting both transactional and analytical workloads.

General Differences Between Neo4j and Memgraph

Feature | Neo4j | Memgraph
Initial release | 2007 | 2017
License | AGPLv3 / Commercial | BSL / Apache 2 / Commercial
Written in | Java | C++
Data model | Labeled Property Graph | Labeled Property Graph
Data storage | On-disk | In-memory
Source code | GitHub | GitHub
Hosted cloud service | Neo4j Aura | Memgraph Cloud

Both solutions have been around for some time now. Neo4j has the longer track record, while Memgraph is implemented in C++, which gives it direct control over memory and avoids garbage-collection pauses. The main difference between the two is the storage engine: Memgraph uses an in-memory storage engine, while Neo4j implements a traditional on-disk storage solution.

The main difference: On-disk vs in-memory storage

There are many differences between on-disk and in-memory storage, but the choice depends primarily on your use case and your requirements. On-disk storage is the default choice if you are storing a large number of objects that don’t need to be retrieved very often, i.e., if you need a system of record and a general-purpose graph storage solution. In that case, Neo4j is a strong fit.

On the other hand, Memgraph has implemented a fully in-memory solution that focuses on stream processing and real-time computations that need to be executed in the shortest possible timeframe. So, if you have a large graph that needs to be analyzed frequently and you want to avoid the latency of disk access, then Memgraph is the way to go.

Technical features

Feature | Neo4j | Memgraph
ACID transactions | Yes | Yes
Replication | Yes | Yes
Query language | Cypher | Cypher
Drivers & clients | .Net, Clojure, Elixir, Go, Groovy, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala | .Net, C, C++, Go, Haskell, Java, JavaScript, PHP, Python, Ruby, Scala
Triggers | Yes | Yes
Concurrency | Yes | Yes
Durability | Yes | Yes
Bolt protocol support | Yes | Yes
Backups | Yes | Yes
Streaming platform integrations | Apache Kafka, Redpanda | Apache Kafka, Redpanda, Apache Pulsar
Query execution plans | Yes | Yes
Authentication and authorization | Yes | Yes
Data encryption in transit | Yes | Yes
Data science library | GDS | MAGE
Custom procedures | Java | Python, C, C++, Rust [1]

[1] You can write the procedures in any programming language which can work with C and can be compiled to the ELF shared library format.

There are, of course, many more features that Neo4j and Memgraph implement, but we will focus on those that are necessary for stream processing. Memgraph is not primarily a graph database, but rather a tool for performing graph analytics and algorithms on real-time data.

Drivers & clients

A broad range of drivers in many different programming languages is available for both solutions. While Memgraph maintains only a few in-house drivers that it develops and supports (C, C++, Python, Rust), most Neo4j drivers can also be used with Memgraph, because both solutions use the Bolt protocol, the labeled property graph model and the Cypher query language.
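Because both databases speak Bolt, one client code path can target either of them. The following is a minimal sketch, assuming the official `neo4j` Python driver is installed and a server is listening on the default Bolt port 7687. The `ENDPOINTS` table, the credentials in it, and the helper names `connection_settings` and `run_query` are placeholders made up for this illustration.

```python
# Hypothetical connection settings: adjust host, port and credentials
# to your own deployment. Both entries use the default Bolt port because
# they are meant as alternatives, not as two servers on one machine.
ENDPOINTS = {
    "neo4j": {"uri": "bolt://localhost:7687", "auth": ("neo4j", "password")},
    "memgraph": {"uri": "bolt://localhost:7687", "auth": ("", "")},
}


def connection_settings(target):
    """Return the (uri, auth) pair for the chosen database."""
    try:
        entry = ENDPOINTS[target]
    except KeyError:
        raise ValueError(f"unknown target: {target!r}") from None
    return entry["uri"], entry["auth"]


def run_query(target, query, parameters=None):
    """Run one Cypher query over Bolt and return the rows as dictionaries."""
    # Imported lazily so this module loads even without the driver installed.
    from neo4j import GraphDatabase

    uri, auth = connection_settings(target)
    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            result = session.run(query, parameters or {})
            return [record.data() for record in result]
```

With a server running, a call such as `run_query("memgraph", "MATCH (n) RETURN count(n) AS nodes")` would go through the same driver code as the Neo4j variant; only the connection settings differ.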

Streaming platform integrations

Memgraph includes connectors out of the box for Apache Kafka and Apache Pulsar with a few more on the way in future releases. Neo4j also offers a Kafka Connect plugin that brings streaming support to the whole ecosystem. Memgraph has also been tested with Redpanda, a high-performance Kafka alternative.

Neo4j GDS & Memgraph MAGE

Neo4j GDS or Graph Data Science is a library that provides efficiently implemented, parallel versions of common graph algorithms for Neo4j, exposed as Cypher procedures. It contains many of the most popular graph algorithms out there and you can use it to perform complex graph analysis tasks.

Memgraph MAGE or Memgraph Advanced Graph Extensions is an open-source library for running graph algorithms exposed as Cypher procedures. It focuses on real-time analysis and implements a few online algorithms like PageRank, community detection and node2vec. These algorithms are suited for streaming data that needs to be processed incrementally whenever a new node or relationship is created, or existing ones are updated.
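To make the idea of an online algorithm concrete, here is a small sketch in plain Python. It is not MAGE code and does not implement any of the algorithms named above; it swaps in a simpler technique, union-find, to keep connected components up to date as relationships arrive, so each new relationship costs a small update instead of a full recomputation. The class and method names are invented for this example, and the sketch handles additions only, not deletions.

```python
class IncrementalComponents:
    """Tracks connected components as relationships arrive one at a time."""

    def __init__(self):
        self._parent = {}   # node -> parent node in the union-find forest
        self._count = 0     # current number of components

    def add_node(self, node):
        # A previously unseen node starts out as its own component.
        if node not in self._parent:
            self._parent[node] = node
            self._count += 1

    def _find(self, node):
        # Walk up to the root, halving the path along the way.
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def add_relationship(self, start, end):
        # Only the two affected components are touched by the update.
        self.add_node(start)
        self.add_node(end)
        root_a, root_b = self._find(start), self._find(end)
        if root_a != root_b:
            self._parent[root_a] = root_b
            self._count -= 1

    def same_component(self, a, b):
        if a not in self._parent or b not in self._parent:
            return False
        return self._find(a) == self._find(b)

    @property
    def component_count(self):
        return self._count
```

A stream consumer could call `add_relationship` for every incoming edge and read `component_count` at any time without rescanning the graph.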

Custom procedures in Neo4j and Memgraph

Neo4j and Cypher can be extended with user-defined procedures and functions. Neo4j itself provides and utilizes custom procedures. Many of the monitoring, introspection and security features available through the Neo4j Browser are implemented using these custom procedures. However, given that Neo4j is implemented in Java, the custom procedures and functions also depend on a Java API.

Memgraph is mainly focused on the Python ecosystem and community. While the core engine is implemented in C++ to ensure the best resource utilization and performance, custom procedures (called query modules in Memgraph) can be implemented in multiple programming languages and, most importantly, in Python as well. These procedures can contain graph algorithms, utility tools, custom APIs, or whatever else you can come up with. You can call them from the Cypher query language like you would any other query and combine them with other features such as streams or triggers.
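As an illustration, a Python query module might look roughly like the sketch below. It assumes Memgraph's `mgp` Python API (the `@mgp.read_proc` decorator, `mgp.ProcCtx` and `mgp.Record`), which is only importable inside Memgraph, so the import is guarded. The procedure name `label_counts`, the helper `count_labels` and the file name used afterwards are hypothetical.

```python
from collections import Counter


def count_labels(label_sets):
    """Count how many nodes carry each label (plain Python, no database needed)."""
    counts = Counter()
    for labels in label_sets:
        counts.update(labels)
    return dict(counts)


try:
    import mgp  # Provided by Memgraph at runtime; absent in a normal interpreter.
except ImportError:
    mgp = None

if mgp is not None:

    @mgp.read_proc
    def label_counts(ctx: mgp.ProcCtx) -> mgp.Record(label=str, total=int):
        """Read-only procedure that yields one record per node label."""
        label_sets = (
            [label.name for label in vertex.labels]
            for vertex in ctx.graph.vertices
        )
        return [
            mgp.Record(label=name, total=total)
            for name, total in count_labels(label_sets).items()
        ]
```

Assuming the file is saved as `label_stats.py` in Memgraph's query modules directory, the procedure would be called from Cypher as `CALL label_stats.label_counts() YIELD label, total RETURN label, total;`. Keeping the counting logic in a plain function makes it testable without a running database.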

What is the best graph database for your use case?

If performance and cost are not crucial factors, then Neo4j is your best bet. However, if you need a faster and more optimized alternative, then you should go with Memgraph. While both solutions offer a broad range of features, you will need to decide depending on your specific use case which one to choose.

As already mentioned, Neo4j is a pioneer among graph databases and graph technologies in general. It is perfect for Java-oriented developers and for static data storage that doesn’t rely on frequent write operations.

Memgraph focuses on stream processing, real-time graph analytics and caters more to Python, C++ and Rust developers. If you need to run complex graph algorithms and traversals often and expect the results in the shortest amount of time, Memgraph is the way to go.

Whatever solution you end up choosing, you can always share your project on our Discord server. You can also ask us about different use cases, technical problems, or anything else graph-related.

If you want to try Memgraph, check out our Memgraph Demo on Playground (no installation or registration needed). Explore our guides, samples and references on Memgraph Docs, and if you have any questions, join our growing Community and share your projects with us.
