Data migration

Memgraph supports multiple data sources for importing data into a running instance. Whether your data is structured in files, relational databases, or other graph databases, Memgraph provides the flexibility to integrate and analyze your data efficiently.

Memgraph supports file system imports like CSV files, offering efficient and structured data ingestion. However, if you want to migrate directly from another data source, you can use the migrate module from Memgraph MAGE for a direct transition without the need to convert your data into intermediate files.

To learn the prerequisites for importing data into Memgraph, check the import best practices.

File types

CSV files

CSV files provide a simple and efficient way to import tabular data into Memgraph using the LOAD CSV clause.

Following the LOAD CSV import guide, you can implement your own CSV import through Memgraph Lab or a simple database driver.
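
As a rough illustration of the database driver route, the sketch below reads a CSV file with Python's standard library and sends the rows in batches through an UNWIND query. The `run_query` callable, the `Person` label, the column names, the sample rows, and the batch size are all assumptions made for illustration, not part of Memgraph's API; in practice `run_query` would wrap whichever client library you use.

```python
import csv
import io
from typing import Callable, Iterable, Iterator

# Hypothetical query: the label and property names are illustrative only.
IMPORT_QUERY = (
    "UNWIND $rows AS row "
    "CREATE (:Person {id: row.id, name: row.name, age: row.age})"
)


def batched(rows: Iterable[dict], size: int) -> Iterator[list]:
    """Yield lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def import_csv(source, run_query: Callable[[str, dict], None],
               batch_size: int = 1000) -> int:
    """Read CSV rows from a file-like object and send them in batches.

    `run_query(query, params)` is an assumed callable wrapping your driver.
    Returns the number of rows sent.
    """
    reader = csv.DictReader(source)
    # CSV values arrive as strings, so numeric columns are converted explicitly.
    typed = (
        {"id": int(r["id"]), "name": r["name"], "age": int(r["age"])}
        for r in reader
    )
    total = 0
    for batch in batched(typed, batch_size):
        run_query(IMPORT_QUERY, {"rows": batch})
        total += len(batch)
    return total


# Stand-in for a real driver call; it only records what would be sent.
sent = []
sample = io.StringIO("id,name,age\n1,John,18\n2,Mark,25\n3,Ana,31\n")
count = import_csv(sample, lambda q, p: sent.append((q, p)), batch_size=2)
```

With a Bolt-compatible Python driver, `run_query` could be something like `lambda q, p: session.run(q, p)`. Batching keeps each transaction small, and the right batch size depends on your data.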

JSON files

Memgraph supports importing data from JSON files through the json_util and import_util modules, so structured and semi-structured data can be loaded efficiently.

Check out the JSON import guide.

Cypherl file

A Cypherl file is a simple format that contains one Cypher command per line. It is usually divided into two parts:

CREATE statements for nodes, followed by
MATCH ... MATCH ... CREATE statements for relationships

Node creation in a Cypherl file looks similar to these queries:

CREATE (:Person {id: 1, name: "John", age: 18});
CREATE (:Person {id: 2, name: "Mark", age: 25});

Relationship creation in a Cypherl file looks similar to this query:

MATCH (p1:Person {id: 1}) MATCH (p2:Person {id: 2}) CREATE (p1)-[:KNOWS]->(p2);

If you have existing Cypherl files defining your graph structure, Memgraph can execute them directly to populate your database. See the Cypherl import guide.
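
If your data is not yet in Cypherl form, a file with the two-part layout described above can be generated with a short script. The following is a minimal sketch, assuming simple property values (numbers and strings) and reusing the `Person` and `KNOWS` names from the examples; the helper names `node_line`, `relationship_line`, and `to_cypherl` are hypothetical.

```python
import json


def node_line(label: str, props: dict) -> str:
    """Render one CREATE statement for a node."""
    # json.dumps quotes strings and leaves numbers bare, which matches
    # Cypher literals for the simple values assumed here.
    body = ", ".join(f"{key}: {json.dumps(value)}" for key, value in props.items())
    return f"CREATE (:{label} {{{body}}});"


def relationship_line(label: str, key: str, start, end, rel_type: str) -> str:
    """Render one MATCH ... MATCH ... CREATE statement between existing nodes."""
    return (
        f"MATCH (a:{label} {{{key}: {json.dumps(start)}}}) "
        f"MATCH (b:{label} {{{key}: {json.dumps(end)}}}) "
        f"CREATE (a)-[:{rel_type}]->(b);"
    )


def to_cypherl(nodes: list, relationships: list) -> str:
    """Emit all node statements first, then all relationship statements."""
    lines = [node_line("Person", node) for node in nodes]
    lines += [
        relationship_line("Person", "id", start, end, "KNOWS")
        for start, end in relationships
    ]
    return "\n".join(lines) + "\n"


people = [
    {"id": 1, "name": "John", "age": 18},
    {"id": 2, "name": "Mark", "age": 25},
]
knows = [(1, 2)]
cypherl = to_cypherl(people, knows)
```

Writing `cypherl` to a file gives one statement per line, with every node created before any relationship that refers to it.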

Database management systems

DuckDB

Memgraph is fully compatible with all data source systems supported by DuckDB. Using our DuckDB migration module, you can load data from any DuckDB-supported source, run analytical queries directly on it, and import the results into Memgraph.

List of DuckDB supported data sources

MySQL

Memgraph can integrate with MySQL databases, enabling seamless migration and real-time synchronization. Learn more about MySQL data import.

PostgreSQL

Memgraph can integrate with PostgreSQL databases using the PostgreSQL migration module.

SQL Server

For enterprises using Microsoft SQL Server, Memgraph supports data import using ETL processes or direct queries. Read the SQL Server import documentation.

StarRocks

Memgraph supports migration from StarRocks through its compatibility with the MySQL migration module. Since StarRocks implements a MySQL-compatible protocol, you can use the same module to connect and migrate data into Memgraph.

OracleDB

Memgraph can extract and import data from OracleDB, leveraging SQL queries or external connectors. Find more details in the OracleDB import guide.

Graph databases

Neo4j

Memgraph offers a dedicated Neo4j migration module that allows users to easily transition from Neo4j. By connecting to an existing Neo4j instance, you can migrate your entire graph to Memgraph using a single Cypher query.

Step-by-step guide to migrate from Neo4j using a single Cypher query

Memgraph

The fastest way to migrate from one Memgraph instance to another is by using snapshot recovery, which leverages Memgraph’s proprietary durability files for efficient data backup and restore. This process can be parallelized during startup for even faster recovery.

If you only want to migrate a portion of your data, Memgraph also provides a Memgraph migration module that allows selective migration between Memgraph instances.

We also provide guides for migrating from Neo4j, either by using a CSV file or by using a single Cypher query.

RPC protocols

Arrow Flight

Memgraph supports migration from any system that uses Arrow Flight as its RPC protocol for fast and efficient data transfer. The Arrow Flight migration module enables you to stream data directly into Memgraph.

Streaming sources

Apache Kafka

Memgraph offers native integration with Apache Kafka, enabling real-time data streaming directly into the graph. It includes built-in Kafka consumers for easy setup and processing of incoming data streams.

For more advanced setups, users can integrate with Kafka Connect or build custom consumers using Memgraph client libraries available in multiple programming languages.

Apache Pulsar

Memgraph offers native integration with Apache Pulsar, enabling real-time data streaming directly into the graph. It includes built-in Pulsar consumers for easy setup and processing of incoming data streams.

For more advanced setups, users can build custom consumers using Memgraph client libraries available in multiple programming languages.

Apache Spark

Memgraph is compatible with the Neo4j Spark Connector, which makes it easy to integrate Spark-powered pipelines with Memgraph. You can use this connector to process large-scale data in Spark and ingest the results directly into Memgraph for real-time graph analytics.

Step-by-step guide to migrate with Apache Spark

Dremio

Memgraph supports migration from the Dremio query engine using the Arrow Flight RPC protocol, which enables high-performance data transfer from data lakes and lakehouses. You can use Dremio to query formats like Apache Iceberg and stream the results directly into Memgraph.

Guide to migrate Iceberg tables from a data lake using Dremio

Data platforms

ServiceNow

Memgraph supports migration from ServiceNow through its dedicated migration module. By connecting to the ServiceNow REST API, the module lets you retrieve data and ingest it directly into Memgraph for further analysis and visualization.

Storage services

Amazon S3

Memgraph supports data migration from Amazon S3 and other S3-compatible storage services. Using the S3 migration module, you can load structured data stored in cloud object storage directly into Memgraph for graph-based processing and analysis.

Other

Data from an application or a program

Memgraph offers a wide range of client libraries that can be used to connect directly to the platform and import data.

Parquet, ORC or IPC/Feather/Arrow file

If you are a Python user, you can import a Parquet, ORC or IPC/Feather/Arrow file into Memgraph using GQLAlchemy.

Parquet file migration is also supported through the DuckDB migration module.

NetworkX, PyG or DGL graph

If you are a Python user, you can import a NetworkX, PyG or DGL graph into Memgraph using GQLAlchemy.
