Version: Unreleased 🚧

CSV Import Tool

CSV is a universal and versatile data format used to store large quantities of data. Every Memgraph installation includes a CSV import tool called mg_import_csv. Use it for the initial bulk ingestion of data into the database. After ingestion, the importer creates a snapshot that the database uses to recover its state on the next startup.

If you are already familiar with the Neo4j bulk import tool, using mg_import_csv should be easy. The CSV import tool is fully compatible with the Neo4j CSV format. If you already have a pipeline set up for Neo4j, you only need to replace neo4j-admin import with mg_import_csv.

info

For more detailed information about the CSV Import Tool, check our Reference guide.

How to use the CSV Import Tool?

note

If you installed Memgraph through Docker Hub and did not change the image tag, use memgraph/memgraph-platform as the image name in place of memgraph.

If you installed Memgraph using Docker, you will need to run the importer using the following command:

docker run -v mg_lib:/var/lib/memgraph -v mg_import:/import-data --entrypoint=mg_import_csv memgraph/memgraph-platform

caution

This command is incomplete because it does not specify the files to import. Running it as is results in the error "The --nodes flag is required!". You can find a complete example below.

For information on other options, run:

docker run --entrypoint=mg_import_csv memgraph/memgraph-platform --help

Below, you can find two examples of how to use the CSV Import Tool depending on the complexity of your data:

caution

Importing CSV data with mg_import_csv should be a one-time operation performed before running Memgraph. In other words, do not use this tool to import data into an already running Memgraph instance.
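
One way to respect this constraint is to check, before importing, that no running container has the data volume mounted. The following is a minimal sketch, assuming the mg_lib volume name used in this guide; the helper names volume_in_use and preflight are hypothetical and not part of Memgraph or Docker.

```shell
# Succeeds (exit 0) when at least one running container mounts the volume.
volume_in_use() {
  # docker ps lists running containers only; -q prints just their IDs.
  [ -n "$(docker ps -q --filter "volume=$1")" ]
}

# Hypothetical pre-flight helper: refuse to continue while the volume is in use.
preflight() {
  if volume_in_use "$1"; then
    echo "A running container uses $1; stop Memgraph before importing." >&2
    return 1
  fi
  echo "$1 is not in use; mg_import_csv can run."
}
```

Calling preflight mg_lib before the importer and continuing only when it succeeds keeps the import a strictly offline step.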


Examples

One type of nodes and relationships

Let's import a simple dataset.

Store the following in people_nodes.csv:

id:ID(PERSON_ID),name:string,:LABEL
100,Daniel,Person
101,Alex,Person
102,Sarah,Person
103,Mia,Person
104,Lucy,Person
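
The import command later in this example also reads people_relationships.csv, whose contents are not shown here. The sketch below writes the node file from above plus a hypothetical relationship file, assuming the Neo4j-style :START_ID, :END_ID and :TYPE header fields; the rows and the IS_FRIENDS_WITH type are illustrative assumptions, not data from this guide. It then counts relationship endpoints that do not match any node ID.

```shell
# Node file, copied from the example above.
cat > people_nodes.csv <<'EOF'
id:ID(PERSON_ID),name:string,:LABEL
100,Daniel,Person
101,Alex,Person
102,Sarah,Person
103,Mia,Person
104,Lucy,Person
EOF

# Hypothetical relationship file: rows and relationship type are assumptions.
cat > people_relationships.csv <<'EOF'
:START_ID(PERSON_ID),:END_ID(PERSON_ID),:TYPE
100,101,IS_FRIENDS_WITH
100,102,IS_FRIENDS_WITH
101,103,IS_FRIENDS_WITH
102,104,IS_FRIENDS_WITH
EOF

# Count relationship endpoints that are missing from the node file.
# Fields are split on plain commas, so quoted fields containing commas
# are not handled by this check.
missing=$(awk -F, '
  NR == FNR { if (FNR > 1) ids[$1] = 1; next }   # first file: collect node IDs
  FNR > 1   { if (!($1 in ids)) n++; if (!($2 in ids)) n++ }
  END       { print n + 0 }
' people_nodes.csv people_relationships.csv)

echo "dangling endpoints: $missing"
```

A non-zero count usually points to a typo in an ID or to a row that references the wrong ID space.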

Now, you can import the dataset using the CSV Import Tool.

caution

Your existing snapshot and WAL data will be considered obsolete, and Memgraph will load the new dataset.
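
Because the import supersedes existing data, it may be worth archiving the data volume first. The following is a sketch of a common Docker volume backup pattern, not a Memgraph feature; the function name backup_volume and the archive name are hypothetical, and the busybox image is assumed to be available.

```shell
# Archive a named Docker volume into a tar file in the current directory.
# Usage: backup_volume <volume-name> <archive-name>
backup_volume() {
  # Mount the volume read-only and write the archive to the host directory.
  docker run --rm \
    -v "$1":/source:ro \
    -v "$(pwd)":/backup \
    busybox tar cf "/backup/$2" -C /source .
}
```

For example, backup_volume mg_lib mg_lib_backup.tar would archive the volume used throughout this guide.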

If you are using Docker, first copy the CSV files to a location the Docker container can see:

docker container create --user memgraph --name mg_import_helper -v mg_import:/import-data busybox
docker cp people_nodes.csv mg_import_helper:/import-data
docker cp people_relationships.csv mg_import_helper:/import-data
docker rm mg_import_helper

Then, run the importer with the following:

docker run -v mg_lib:/var/lib/memgraph -v mg_import:/import-data \
--entrypoint=mg_import_csv memgraph/memgraph-platform \
--nodes /import-data/people_nodes.csv \
--relationships /import-data/people_relationships.csv

Next time you run Memgraph, the dataset will be loaded:

docker run -p 7687:7687 -v mg_lib:/var/lib/memgraph memgraph/memgraph-platform

Multiple types of nodes and relationships

The previous example showcased a simple graph with one node type and one relationship type. For more complex graphs, the procedure is similar. Let's define the following dataset:

Add the following to people_nodes.csv:

id:ID(PERSON_ID),name:string,age:int,city:string,:LABEL
100,Daniel,30,London,Person
101,Alex,15,Paris,Person
102,Sarah,17,London,Person
103,Mia,25,Zagreb,Person
104,Lucy,21,Paris,Person
105,Adam,23,New York,Person
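
The import command further down also reads restaurants_nodes.csv and restaurants_relationships.csv, whose contents are not reproduced here. As a rough sketch of what a second node type could look like, the files below use a separate ID space; the REST_ID name, the restaurant rows, and the ATE_AT type are all hypothetical. A small check then confirms that every row has as many fields as its header.

```shell
# Hypothetical restaurant nodes in their own ID space (REST_ID is an assumed name).
cat > restaurants_nodes.csv <<'EOF'
id:ID(REST_ID),name:string,:LABEL
200,Bistro One,Restaurant
201,Corner Cafe,Restaurant
EOF

# Hypothetical relationships from people (PERSON_ID) to restaurants (REST_ID).
cat > restaurants_relationships.csv <<'EOF'
:START_ID(PERSON_ID),:END_ID(REST_ID),:TYPE
100,200,ATE_AT
103,201,ATE_AT
105,200,ATE_AT
EOF

# Succeeds when every data row has the same number of fields as the header.
# Splits on plain commas, so quoted fields containing commas are not handled.
check_columns() {
  awk -F, 'NR == 1 { want = NF; next } NF != want { bad++ } END { exit (bad > 0) }' "$1"
}

for f in restaurants_nodes.csv restaurants_relationships.csv; do
  if check_columns "$f"; then
    echo "$f: ok"
  else
    echo "$f: inconsistent column count"
  fi
done
```

Keeping each node type in its own ID space lets the same numeric ID appear in both files without ambiguity.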

After preparing the files above, you can import the dataset using the CSV Import Tool.

If you are using Docker, first copy the CSV files to a location the Docker container can see:

docker container create --user memgraph --name mg_import_helper -v mg_import:/import-data busybox
docker cp people_nodes.csv mg_import_helper:/import-data
docker cp people_relationships.csv mg_import_helper:/import-data
docker cp restaurants_nodes.csv mg_import_helper:/import-data
docker cp restaurants_relationships.csv mg_import_helper:/import-data
docker rm mg_import_helper

Then, run the importer with the following command:

docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
--entrypoint=mg_import_csv memgraph/memgraph-platform \
--nodes /import-data/people_nodes.csv \
--nodes /import-data/restaurants_nodes.csv \
--relationships /import-data/people_relationships.csv \
--relationships /import-data/restaurants_relationships.csv
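
As the number of files grows, the repeated --nodes and --relationships flags can be assembled by a small helper. This is a sketch only: build_import_command is a hypothetical function, and the rule that maps file names ending in _nodes.csv or _relationships.csv to flags is an assumption based on the file names used in this guide. It prints the command instead of running it, so the result can be reviewed first.

```shell
# Print (not run) the importer command for a set of CSV files.
# File names must not contain spaces, since the command is built as one string.
build_import_command() {
  cmd="docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data"
  cmd="$cmd --entrypoint=mg_import_csv memgraph/memgraph-platform"
  for f in "$@"; do
    case "$f" in
      *_nodes.csv)         cmd="$cmd --nodes /import-data/$f" ;;
      *_relationships.csv) cmd="$cmd --relationships /import-data/$f" ;;
      *) echo "skipping unrecognized file: $f" >&2 ;;
    esac
  done
  printf '%s\n' "$cmd"
}

build_import_command people_nodes.csv restaurants_nodes.csv \
  people_relationships.csv restaurants_relationships.csv
```

The printed line carries the same flags and paths as the command above, in the order the files were passed.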

Next time you run Memgraph, the dataset will be loaded:

docker run -p 7687:7687 -v mg_lib:/var/lib/memgraph memgraph/memgraph-platform