Frequently asked questions
Memgraph is an open-source in-memory graph database built for teams that expect fast, advanced analytical insights - as compatible with your current infrastructure as Neo4j, but up to 120x faster. Memgraph is powered by a query engine built in C/C++ to handle real-time use cases at enterprise scale. Memgraph supports strongly-consistent ACID transactions and uses the standardized Cypher query language over the Bolt protocol for structuring, manipulating, and exploring data.
When data is stored on disk, the computer has to physically read it from the disk and transfer it to the RAM before it can be processed. This process is relatively slow because it involves several physical processes, such as seeking the right location on the disk and waiting for the data to be read. Writing the data is also much slower for the same reasons.
Storing data in the computer's RAM eliminates the need for these physical processes, and data can be accessed and added almost instantly.
Therefore, in-memory graph databases are ideal for applications requiring fast data processing, real-time analytics, and quick response times.
Memgraph is best suited for use cases with complex data relationships that require real-time processing and high scalability.
In relational databases, complex data relationships arise when data from different tables is related or somehow interconnected. Because data is spread across multiple tables, querying it requires hopping from one table to another and combining them with slow and resource-intensive join operations.
The complexity of join operations can increase exponentially as the number of tables grows and as the links between tables are no longer neatly structured following a clearly set pattern. It is no longer sufficient to join just two or three tables; queries may need to hop through seven or more tables to find the correct link between the data and gain valuable analytics.
Examples of complex data include deep hierarchical relationships, such as parent-child relationships, or many-to-many relationships between different tables.
Memgraph is designed to be a high-performance graph database, and it typically outperforms many other graph databases in terms of speed and scalability. Key factors contributing to Memgraph's performance are its in-memory architecture and a performant query engine written in C++. Memgraph also offers a variety of tools and features to help optimize query performance, including label and label-property indexes and a custom visualization library. Check our benchmark comparing Memgraph and Neo4j.
Although Memgraph is an in-memory database, the data is persistent and durable. This means data will not be lost upon reboot.
Memgraph uses two mechanisms to ensure the durability of stored data and make disaster recovery possible:
- write-ahead logging (WAL)
- periodic snapshot creation
Each database modification is recorded in a log file before being written to the database. Therefore, the log file contains all steps needed to reconstruct the DB’s most recent state. Memgraph also periodically takes snapshots during runtime to write the entire data storage to the drive. On startup, the database state is recovered from the most recent snapshot file. The timestamp of the snapshot is compared with the latest update recorded in the WAL file and, if the snapshot is less recent, the state of the DB will be completely recovered using the WAL file.
If you are using Memgraph with Docker, be sure to specify a volume for data persistence.
Memgraph ensures high availability by using replication. Replication involves replicating data from one MAIN instance to one or several REPLICA instances. If a MAIN instance fails, another REPLICA instance can be upgraded and serve as the MAIN instance, thus ensuring continuous data availability.
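As a sketch of how replication is configured (the instance name, address, and port below are illustrative), a REPLICA instance is first assigned its role, and the MAIN instance then registers it:

```cypher
// On the instance that will serve as a REPLICA (port is illustrative):
SET REPLICATION ROLE TO REPLICA WITH PORT 10000;

// On the MAIN instance, register the replica under a chosen name:
REGISTER REPLICA replica_1 SYNC TO "172.17.0.3:10000";

// Verify the registered replicas:
SHOW REPLICAS;
```

Replicas can be registered as SYNC or ASYNC, depending on whether the MAIN instance should wait for their confirmation when committing.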
Memgraph is designed to utilize all available CPU cores on a machine to process queries and perform other operations in parallel, significantly improving performance and reducing query response times.
To run Memgraph on-premise, you need an Intel Xeon, AMD Opteron/Epyc, ARM (including Apple M1 or Amazon Graviton) server or desktop processor, at least 1 GB of RAM and disk, and at least 1 vCPU. We recommend using a server processor, at least 16 GB of ECC RAM, the same amount of disk storage, and at least 8 vCPUs or 4 physical cores.
We recommend twice as many GB of RAM as the data size. If you have 8 GB of data, we recommend having at least 16 GB of RAM. Of course, the actual memory needs depend on the complexity of executed queries. The more graph objects a query needs to return as a result, the more RAM will be required. To calculate the Memgraph RAM instance requirements based on your data, check out how Memgraph uses memory.
Memgraph vertically scales effortlessly up to 1B nodes and 10B edges. The only limit is the size of your RAM. We recommend twice as many GB of RAM as the data size. If you have 8 GB of data, we recommend having at least 16 GB of RAM. Of course, the actual memory needs depend on the complexity of executed queries. The more graph objects a query needs to return as a result, the more RAM will be required.
There are three official Docker images for Memgraph:
- `memgraph/memgraph` - the most basic Memgraph database instance.
- `memgraph/memgraph-mage` - a Memgraph database instance together with all the newest MAGE modules and graph algorithms.
- `memgraph/memgraph-platform` - the Memgraph database, Memgraph Lab, mgconsole and MAGE. Once started, mgconsole opens in the terminal, while Memgraph Lab is available in the browser.
The MAGE graph algorithm library includes NVIDIA cuGraph GPU-powered graph algorithms. To use them, you need a specific kind of memgraph-mage image, so check the images available on Docker Hub.
It is not necessary to define any data schema to import data. Data will be imported into the database regardless of the number of properties and their types. You can enforce property uniqueness and existence.
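For example, existence and uniqueness can be enforced with constraints (the label and property names below are illustrative):

```cypher
// Ensure every Person node has a name property:
CREATE CONSTRAINT ON (n:Person) ASSERT EXISTS (n.name);

// Ensure email values are unique across Person nodes:
CREATE CONSTRAINT ON (n:Person) ASSERT n.email IS UNIQUE;
```

Queries that would violate a constraint are rejected, so the data stays consistent without a full schema.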
You can try running queries on preloaded datasets in Memgraph Playground. If you need help with Cypher queries, check out the Cypher manual. We also offer data modeling and Cypher e-mail courses, or you can watch one of our webinars. You can even deep dive into code with Memgraph's CTO in Code with Buda. For all other questions and help, feel free to join our community.
Yes, Memgraph offers a free 30-day Memgraph Enterprise Trial. Send a request via the form.
Does Memgraph offer professional services such as data modelling, development, integration and similar?
It depends on the scope of the project and the requirements. Contact us for more information.
Currently, the fastest way to import data is from a CSV file with the LOAD CSV clause. The LOAD CSV clause imports between 100K and 350K nodes per second and between 60K and 80K edges per second. To achieve this import speed, indexes have to be set up appropriately.
Other import methods include importing data from JSON and CYPHERL files, or connecting to a data stream.
CSV files can be imported in on-premise instances using the LOAD CSV clause.
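A minimal sketch of such an import (the file path, header names, and label are illustrative):

```cypher
// Create an index first so lookups during and after import stay fast:
CREATE INDEX ON :Person(id);

// Import nodes from a CSV file with a header row:
LOAD CSV FROM "/path/to/people.csv" WITH HEADER AS row
CREATE (:Person {id: row.id, name: row.name});
```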
Local JSON files and files on a remote address can be imported in on-premise instances using the json_util module from the MAGE library. On a Cloud instance, data from JSON files can be imported only from a remote address.
A CYPHERL file contains the Cypher queries necessary for creating nodes and relationships.
You can export data to JSON or CYPHERL files. Query results can be exported to a CSV file.
Data can be exported to a JSON file from on-premise instances using the export_util module from the MAGE library. The same module can be used to export query results to a CSV file.
A CYPHERL file contains the Cypher queries necessary for creating nodes and relationships, and you can export it via Memgraph Lab.
Yes, you can connect your instance to Kafka, Redpanda or Pulsar streams and ingest data. You will need to write a transformation module that will instruct Memgraph on how to transform the incoming messages and consume them correctly.
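Once the transformation module is loaded, connecting to a stream takes only a few queries; the stream name, topic, module name, and broker address below are illustrative:

```cypher
// Create a stream that consumes the "ratings" topic and passes messages
// to the transformation procedure my_module.my_transformation:
CREATE KAFKA STREAM ratings_stream
  TOPICS ratings
  TRANSFORM my_module.my_transformation
  BOOTSTRAP_SERVERS "localhost:9092";

// Start consuming messages:
START STREAM ratings_stream;
```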
No, data is not automatically indexed during import. You need to create label or label-property indexes manually once the import is finished.
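For example (label and property names are illustrative):

```cypher
// Label index:
CREATE INDEX ON :Person;

// Label-property index:
CREATE INDEX ON :Person(name);
```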
You can create logically separated graphs within the same instance by using different labels. Each node can have multiple labels, and the cost of a label is 8 B per label (but the memory is allocated dynamically, so 3 labels take up as much memory as 4, and 5-7 labels take as much space as 8, etc.). You can use the same technique to store multilayer networks.
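As a sketch, an extra label can act as a graph identifier (the label names below are illustrative):

```cypher
// Tag nodes with both a type label and a "graph" label:
CREATE (:User:GraphA {name: "Alice"});
CREATE (:User:GraphB {name: "Bob"});

// Query only one logical graph:
MATCH (u:User:GraphA) RETURN u.name;
```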
You can use Memgraph Lab, a visual user interface that enables you to:
- visualize graph data using the Orb library
- write and execute Cypher queries
- import and export data
- view and optimize query performance
- develop query modules in Python
- manage connections to streams.
Replication should not in any way affect the performance of your database instance.
You can check storage information by running the SHOW STORAGE INFO; query, which provides information about the number of stored nodes and relationships and memory usage.
Where the log is saved and how you can access it depends on how you've started Memgraph, so check the documentation about accessing logs.
You can check the logs using Memgraph Lab (the visual interface), which listens to logs on port 7444. You can also connect to this WebSocket port from your own system and listen to the logs there.
Log level and location can be modified using configuration flags.
You don't need to know Cypher to query the database. You can use GQLAlchemy, an Object Graph Mapper (OGM). An OGM provides a developer-friendly workflow for writing object-oriented notation to communicate with a graph database. Instead of writing Cypher queries, you can write Python code, which the OGM will automatically translate into Cypher queries. It supports both Memgraph and Neo4j.
For easy browsing of documentation for versions between Memgraph 2.0 and 2.10.1, you can use the documentation archive.
For comprehensive documentation spanning versions from 1.3.0 to 2.10.1, refer to the archived GitHub repository.
Although we tried to implement openCypher query language as closely to the language reference as possible, we made some changes that can enhance the user experience. You can find the differences listed in the Cypher manual.
Yes, you can expand the Cypher query language with custom procedures grouped in query modules. Modules can be written in C/C++ and Python (which also has a mock API). For more details, check out the documentation on query modules.
Memgraph Advanced Graph Extensions (MAGE) is an open-source repository that contains graph algorithms and utility modules. It encourages developers to share innovative and useful query modules (custom Cypher procedures) the whole community can benefit from. It corresponds to APOC in Neo4j, except it's free and open source.
The MAGE library also includes dynamic algorithms specially designed for analyzing real-time data, NetworkX and igraph integrations, an Elasticsearch synchronization module, and NVIDIA GPU-powered algorithms. Check the full list of modules, and if there is a specific procedure you can't find in the MAGE library which you would like to use, please let us know.
Query modules are collections of custom Cypher procedures that extend the basic functionalities of the Cypher query language. Each query module consists of procedures connected by a common theme (for example, community detection). The MAGE graph library gathers a number of implemented graph algorithms and utility modules. Still, if you need a specific procedure unavailable in MAGE, you can implement it using the Python or C/C++ API and contribute it to the library, or contact us.
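Procedures from query modules are invoked from Cypher with the CALL clause; for example, MAGE's PageRank implementation can be called as follows (the name property is illustrative):

```cypher
CALL pagerank.get()
YIELD node, rank
RETURN node.name AS name, rank
ORDER BY rank DESC
LIMIT 10;
```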
Memgraph Lab is a lightweight and intuitive visual user interface that enables you to:
- write and execute Cypher queries and algorithms
- visualize graph data using the Orb library
- import and export data
- generate data schema
- view and optimize query performance
- develop custom procedures in Python
- manage stream connections.
No, Memgraph Lab can connect only to a running Memgraph instance.
Yes, you can customize the visual appearance of your graph results by using the Graph Style Script (GSS) language. You can add images to nodes and change their shape, size, and color, as well as change the line style and thickness of relationships. For a complete list of available features, consult the GSS reference guide.
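A minimal Graph Style Script sketch, assuming nodes carry a name property (the property names and values below are illustrative):

```
@NodeStyle {
  size: 6
  color: #DD2222
  label: Property(node, "name")
}

@EdgeStyle {
  width: 2
  color: #999999
}
```

The @NodeStyle and @EdgeStyle directives apply to all nodes and relationships; consult the GSS reference guide for filtered directives and the full property list.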
The instance will be stopped for the next 7 days. If you wish to continue using it, add a payment method.
That is the initial limit for new users. If you want to create more projects, let us know.
Yes, it is. You can find detailed instructions in Memgraph Cloud documentation.
I've created a project with 2GB RAM, but Memgraph Lab says there is only 1.54GB available. Why is that so?
A part of the RAM is allocated to the operating system and other services, which usually take 13-15% of the total RAM. Approximate free RAM is:
- 1GB RAM Memgraph Cloud project has about 860 MB free RAM
- 2GB RAM Memgraph Cloud project has about 1.60 GB free RAM
- 4GB RAM Memgraph Cloud project has about 3.40 GB free RAM
- 8GB RAM Memgraph Cloud project has about 6.7 GB free RAM
- 16GB RAM Memgraph Cloud project has about 14 GB free RAM
- 32GB RAM Memgraph Cloud project has about 28 GB free RAM
I've created a new project, and when I try to connect to the instance, I get an error: Unable to connect.
Upon creating a project, Memgraph loads all the MAGE algorithms, which takes some time. Wait 30 seconds, and then try to connect again.
When you pause your project, usually the IP stays the same, but sometimes your IP can be released and a new one will be allocated. You can always check the IP in the connection details.
If you have forgotten your project password, we can't help you. We don't have a way of finding out or resetting your project password.
You can connect to an instance running within the Memgraph Cloud project via Memgraph Lab, a visual interface, mgconsole, command-line interface, or one of many drivers. You can find detailed instructions in Memgraph Cloud documentation.
A project is backed up by creating a snapshot with Amazon EBS. You cannot create a snapshot if you are using a 14-day free trial version of Memgraph Cloud. You can find detailed instructions in Memgraph Cloud documentation.
Yes, Memgraph Cloud runs on AWS.
No, at the moment, Memgraph Cloud is not available on the Google Cloud Platform.