import_util

Module for importing data from different formats. Currently, this module supports only the import of JSON and graphML file formats.

| Trait | Value |
| --- | --- |
| Module type | util |
| Implementation | Python |
| Parallelism | sequential |

Procedures

json(path)

Input:

  • path: string ➡ Path to the JSON file that is being imported.

Usage:

The JSON file you're importing needs to be structured the same as the JSON file that the export_util.json() procedure generates. The generated JSON file is a list of objects representing nodes or relationships. If the object is a node, then it looks like this:

{
    "id": 4000,
    "labels": [
        "City"
    ],
    "properties": {
        "id": 0,
        "name": "Amsterdam"
    },
    "type": "node"
}

The id key holds Memgraph's internal node ID. The labels key holds the node's labels as a list. The properties key holds the node's properties as key-value pairs. Each node needs to have the value of type set to "node".

On the other hand, if the object is a relationship, then it is structured like this:

{
    "end": 4052,
    "id": 7175,
    "label": "CloseTo",
    "properties": {
        "eu_border": true
    },
    "start": 4035,
    "type": "relationship"
}

The start and end keys hold the internal IDs of the relationship's start and end nodes. Each relationship also has its own internal ID, exported as the value of the id key. A relationship can have only one type, which is exported as the value of the label key. Properties are again key-value pairs, and the value of type needs to be set to "relationship".
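The layout described above can be checked before an import with a short script. The following is a minimal sketch in Python, not part of import_util: the helper name check_import_objects is hypothetical, and treating every key shown above as required is an assumption, since this page does not say which keys the procedure tolerates as missing.

```python
import json

# Keys shown in the node and relationship examples above.
# Treating all of them as required is an assumption of this sketch.
NODE_KEYS = {"id", "labels", "properties", "type"}
REL_KEYS = {"id", "start", "end", "label", "properties", "type"}


def check_import_objects(objects):
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    for index, obj in enumerate(objects):
        kind = obj.get("type")
        if kind == "node":
            missing = NODE_KEYS - obj.keys()
        elif kind == "relationship":
            missing = REL_KEYS - obj.keys()
        else:
            problems.append(f"object {index}: unknown type {kind!r}")
            continue
        if missing:
            problems.append(f"object {index}: missing keys {sorted(missing)}")
    return problems


# Two well-formed objects from the text above, plus one node without "properties".
sample = json.loads("""
[
  {"id": 4000, "labels": ["City"],
   "properties": {"id": 0, "name": "Amsterdam"}, "type": "node"},
  {"id": 7175, "start": 4035, "end": 4052, "label": "CloseTo",
   "properties": {"eu_border": true}, "type": "relationship"},
  {"id": 1, "labels": ["City"], "type": "node"}
]
""")

for problem in check_import_objects(sample):
    print(problem)
```

Running the same check on a file loaded with json.load() gives a quick sanity test before the file is copied to the Memgraph host.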

The path you have to provide as procedure argument depends on how you started Memgraph.

If you ran Memgraph with Docker, you need to save the JSON file inside the Docker container. We recommend saving the JSON file inside the /usr/lib/memgraph/query_modules directory.

You can call the procedure by running the following query:

CALL import_util.json(path);

where path is the path to the JSON file inside the /usr/lib/memgraph/query_modules directory in the running Docker container (e.g., /usr/lib/memgraph/query_modules/import.json).

You can copy the JSON file to the running Docker container with the docker cp command:

docker cp /path_to_local_folder/import.json <container_id>:/usr/lib/memgraph/query_modules/import.json

cypher(path)

Imports the Cypher queries from the given path by just running the queries.

Input:

  • path: string ➡ path to the Cypher file that needs to be imported.

This procedure is not fully functional and was created for testing. If you want to import Cypher, use the Memgraph Lab import feature, which is fully functional and faster.

graphml(path, config)

Input:

  • path: string ➡ path to the graphML file that is being imported.
  • config: Map (default={}) ➡ configuration parameters explained below.

Parameters:

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| readLabels | Bool | False | Create node labels by using the value of the labels property. |
| defaultRelationshipType | String | "RELATED" | The default relationship type to use when none is specified in the import file. |
| storeNodeIds | Bool | False | Store node's id attribute as a property. |
| source | Map | | A map with two keys: label and id. The label is mandatory, while the id's default value is id. This allows the import of relationships if the source node is absent in the file. It will search for a source node with a specific label and a property equal to the map's id value. The value of that property should be equal to the relationship's source node ID. For example, with a config map {source: {id: 'serial_number', label: 'Device'}} and an edge defined as <edge id="e0" source="n0" target="n1" label="CONNECT"><data key="label">CONNECT</data></edge>, if node "n0" doesn't exist, it will search for a source node (:Device {serial_number: "n0"}). |
| target | Map | | A map with two keys: label and id. The label is mandatory, while the id's default value is id. This allows the import of relationships if the target node is absent in the file. It will search for a target node with a specific label and a property equal to the map's id value. The value of that property should be equal to the relationship's target node ID. For example, with a config map {target: {id: 'serial_number', label: 'Device'}} and an edge defined as <edge id="e0" source="n0" target="n1" label="CONNECT"><data key="label">CONNECT</data></edge>, if node "n1" doesn't exist, it will search for a target node (:Device {serial_number: "n1"}). |
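To make the source and target fallback concrete, here is a small Python sketch that derives the node pattern from a config map and an edge endpoint ID. It only illustrates the lookup rule described in the table; the function name fallback_pattern is hypothetical and this is not the module's actual implementation.

```python
def fallback_pattern(endpoint_config, node_id):
    """Build the node pattern searched for when an edge endpoint is absent from the file.

    endpoint_config is the value of the "source" or "target" config key.
    """
    label = endpoint_config["label"]               # mandatory key
    id_property = endpoint_config.get("id", "id")  # property name, defaults to "id"
    return f'(:{label} {{{id_property}: "{node_id}"}})'


config = {
    "source": {"id": "serial_number", "label": "Device"},
    "target": {"label": "Device"},  # no "id" key, so the default property name is used
}

# Edge from the table: <edge id="e0" source="n0" target="n1" label="CONNECT">
print(fallback_pattern(config["source"], "n0"))
print(fallback_pattern(config["target"], "n1"))
```

The first call reproduces the pattern from the table, (:Device {serial_number: "n0"}); the second shows the effect of leaving out the id key.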

Output:

  • status: string ➡ success if no errors are generated.

Usage:

The path you have to provide as procedure argument depends on how you started Memgraph.

If you ran Memgraph with Docker, you need to save the graphML file inside the Docker container. We recommend saving the graphML file inside the /usr/lib/memgraph/query_modules directory.

You can call the procedure by running the following query:

CALL import_util.graphml(path, config) YIELD status RETURN status;

where path is the path to the graphML file inside the /usr/lib/memgraph/query_modules directory in the running Docker container (e.g., /usr/lib/memgraph/query_modules/import.graphml) and config is the optional configuration map described above.

:::info You can copy the graphML file from your local file system to the running Docker container using the docker cp command. :::

Example - Importing JSON file to create a database

Input file

Below is the content of the import.json file.

  • If you're using Memgraph with Docker, then you have to save the import.json file in the /usr/lib/memgraph/query_modules directory inside the running Docker container.

  • If you're using Memgraph on Ubuntu, Debian, RPM package or WSL, then you have to save the import.json file in the local /users/my_user/import_folder directory.

[
    {
        "id": 6114,
        "labels": [
            "Person"
        ],
        "properties": {
            "name": "Anna"
        },
        "type": "node"
    },
    {
        "id": 6115,
        "labels": [
            "Person"
        ],
        "properties": {
            "name": "John"
        },
        "type": "node"
    },
    {
        "id": 6116,
        "labels": [
            "Person"
        ],
        "properties": {
            "name": "Kim"
        },
        "type": "node"
    },
    {
        "end": 6115,
        "id": 21120,
        "label": "IS_FRIENDS_WITH",
        "properties": {},
        "start": 6114,
        "type": "relationship"
    },
    {
        "end": 6116,
        "id": 21121,
        "label": "IS_FRIENDS_WITH",
        "properties": {},
        "start": 6114,
        "type": "relationship"
    },
    {
        "end": 6116,
        "id": 21122,
        "label": "IS_MARRIED_TO",
        "properties": {},
        "start": 6115,
        "type": "relationship"
    }
]
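A file with this structure can also be produced programmatically. The sketch below rebuilds the same six objects in Python using only the standard library; the helper names node, relationship and save are hypothetical, and the IDs are simply the ones used in this example.

```python
import json


def node(node_id, labels, properties):
    """Build a node object in the layout shown above."""
    return {"id": node_id, "labels": labels, "properties": properties, "type": "node"}


def relationship(rel_id, start, end, label, properties=None):
    """Build a relationship object in the layout shown above."""
    return {
        "id": rel_id,
        "start": start,
        "end": end,
        "label": label,
        "properties": properties or {},
        "type": "relationship",
    }


def save(objects, path):
    """Write the objects to a JSON file at the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(objects, handle, indent=4, sort_keys=True)


people = {6114: "Anna", 6115: "John", 6116: "Kim"}
objects = [node(node_id, ["Person"], {"name": name}) for node_id, name in people.items()]
objects += [
    relationship(21120, 6114, 6115, "IS_FRIENDS_WITH"),
    relationship(21121, 6114, 6116, "IS_FRIENDS_WITH"),
    relationship(21122, 6115, 6116, "IS_MARRIED_TO"),
]

# sort_keys=True yields the same alphabetical key order as the file above.
text = json.dumps(objects, indent=4, sort_keys=True)
print(text)
```

Calling save(objects, "import.json") writes the file to the current directory, from where it can be moved to the location Memgraph reads from.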

Running command

If you're using Memgraph with Docker, then the following Cypher query will create a graph database from the provided JSON file:

CALL import_util.json("/usr/lib/memgraph/query_modules/import.json");

If you're using Memgraph on Ubuntu, Debian, RPM package or WSL, then the following Cypher query will create a graph database from the provided JSON file:

CALL import_util.json("/users/my_user/import_folder/import.json");

Created database

After you import the import.json file, you get the following graph database:

Example - Importing graphML file to create a database

Below is the content of the import.graphml file.

  • If you're using Memgraph with Docker, then you have to save the import.graphml file in the /usr/lib/memgraph/query_modules directory inside the running Docker container.

  • If you're using Memgraph on Ubuntu, Debian, RPM package or WSL, then you have to save the import.graphml file in the local /users/my_user/import_folder directory.

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="labels" for="node" attr.name="labels" attr.type="string"/>
<key id="title" for="node" attr.name="title" attr.type="string"/>
<key id="released" for="node" attr.name="released" attr.type="int"/>
<key id="program_creators" for="node" attr.name="program_creators" attr.type="string" attr.list="string"/>
<key id="name" for="node" attr.name="name" attr.type="string"/>
<key id="portrayed_by" for="node" attr.name="portrayed_by" attr.type="string"/>
<key id="label" for="edge" attr.name="label" attr.type="string"/>
<key id="seasons" for="edge" attr.name="seasons" attr.type="string" attr.list="int"/>
<graph id="G" edgedefault="directed">
<node id="n0" labels=":TVShow"><data key="labels">:TVShow</data><data key="title">Stranger Things</data><data key="released">2016</data><data key="program_creators">["Matt Duffer", "Ross Duffer"]</data></node>
<node id="n1" labels=":Character"><data key="labels">:Character</data><data key="name">Eleven</data><data key="portrayed_by">Millie Bobby Brown</data></node>
<node id="n2" labels=":Character"><data key="labels">:Character</data><data key="name">Joyce Byers</data><data key="portrayed_by">Winona Ryder</data></node>
<node id="n3" labels=":Character"><data key="labels">:Character</data><data key="name">Jim Hopper</data><data key="portrayed_by">David Harbour</data></node>
<node id="n4" labels=":Character"><data key="labels">:Character</data><data key="name">Mike Wheeler</data><data key="portrayed_by">Finn Wolfhard</data></node>
<node id="n5" labels=":Character"><data key="labels">:Character</data><data key="name">Dustin Henderson</data><data key="portrayed_by">Gaten Matarazzo</data></node>
<node id="n6" labels=":Character"><data key="labels">:Character</data><data key="name">Lucas Sinclair</data><data key="portrayed_by">Caleb McLaughlin</data></node>
<node id="n7" labels=":Character"><data key="labels">:Character</data><data key="name">Nancy Wheeler</data><data key="portrayed_by">Natalia Dyer</data></node>
<node id="n8" labels=":Character"><data key="labels">:Character</data><data key="name">Jonathan Byers</data><data key="portrayed_by">Charlie Heaton</data></node>
<node id="n9" labels=":Character"><data key="labels">:Character</data><data key="name">Will Byers</data><data key="portrayed_by">Noah Schnapp</data></node>
<node id="n10" labels=":Character"><data key="labels">:Character</data><data key="name">Steve Harrington</data><data key="portrayed_by">Joe Keery</data></node>
<node id="n11" labels=":Character"><data key="labels">:Character</data><data key="name">Max Mayfield</data><data key="portrayed_by">Sadie Sink</data></node>
<node id="n12" labels=":Character"><data key="labels">:Character</data><data key="name">Robin Buckley</data><data key="portrayed_by">Maya Hawke</data></node>
<node id="n13" labels=":Character"><data key="labels">:Character</data><data key="name">Erica Sinclair</data><data key="portrayed_by">Priah Ferguson</data></node>
<edge id="e0" source="n1" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e1" source="n2" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e2" source="n3" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e3" source="n4" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e4" source="n5" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e5" source="n6" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e6" source="n7" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e7" source="n8" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e8" source="n9" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e9" source="n10" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[1, 2, 3, 4]</data></edge>
<edge id="e10" source="n11" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[2, 3, 4]</data></edge>
<edge id="e11" source="n12" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[3, 4]</data></edge>
<edge id="e12" source="n13" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="seasons">[2, 3, 4]</data></edge>
</graph>
</graphml>
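Before importing, it can help to confirm that the file parses and that every edge points at a node defined in the file. Edges that do not are the case the source and target config keys are meant for. The following Python sketch uses only the standard library and a trimmed excerpt of the file above; the function name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

# GraphML elements live in this namespace, so lookups need a prefix mapping.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize(graphml_text):
    """Count nodes and edges and list edges whose endpoints are not in the file."""
    root = ET.fromstring(graphml_text)
    node_ids = {n.get("id") for n in root.iterfind(".//g:node", NS)}
    edges = root.findall(".//g:edge", NS)
    dangling = [
        e.get("id")
        for e in edges
        if e.get("source") not in node_ids or e.get("target") not in node_ids
    ]
    return {"nodes": len(node_ids), "edges": len(edges), "dangling_edges": dangling}


# Trimmed excerpt of import.graphml: three nodes and two edges.
excerpt = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
<key id="labels" for="node" attr.name="labels" attr.type="string"/>
<key id="label" for="edge" attr.name="label" attr.type="string"/>
<graph id="G" edgedefault="directed">
<node id="n0" labels=":TVShow"><data key="labels">:TVShow</data></node>
<node id="n1" labels=":Character"><data key="labels">:Character</data></node>
<node id="n2" labels=":Character"><data key="labels">:Character</data></node>
<edge id="e0" source="n1" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data></edge>
<edge id="e1" source="n2" target="n0" label="ACTED_IN"><data key="label">ACTED_IN</data></edge>
</graph>
</graphml>"""

summary = summarize(excerpt)
print(summary)
```

For a file on disk, read its contents as bytes and pass them to summarize(); ET.fromstring accepts both bytes and text.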

If you're using Memgraph with Docker, then the following Cypher query will create a graph database from the provided graphML file:

CALL import_util.graphml("/usr/lib/memgraph/query_modules/import.graphml", {readLabels: true}) 
YIELD status RETURN status;

If you're using Memgraph on Ubuntu, Debian, RPM package or WSL, then the following Cypher query will create a graph database from the provided graphML file:

CALL import_util.graphml("/users/my_user/import_folder/import.graphml", {readLabels: true}) 
YIELD status RETURN status;

After you import the import.graphml file, you get the following graph database: