QueryingSchema

Schema

Schema queries are designed to retrieve information about the structure of the data within the database, for example, node labels and properties associated, as well as relationship types and properties.

Run-time schema tracking

Please use a snapshot to migrate a pre v2.21 datasets with properties on edges. When schema is tracked, recovering using WAL files generated pre v2.21 will incur a performance penalty as crucial data is missing, which leads to long scan times.

To use run-time schema tracking, start Memgraph with the --schema-info-enabled configuration flag set to True. Once enabled, every change to the data layout will be tracked, and a derived schema will be automatically updated. Note that enabling this feature may result in a slight performance decrease; for more details, refer to Performance Tips.

What Schema Information is Tracked?

The runtime schema tracks all uniquely defined nodes and edges:

  • Nodes are defined by their labels.
  • Edges are defined by their type, and the labels of their source and destination nodes.

For each unique object (node or edge), the schema tracks its count and categorizes stored properties by key and type.

Querying the Schema

You can query the current schema by running SHOW SCHEMA INFO. This returns a JSON representation of the schema, divided into six sections:

  • nodes - contains a set of uniquely identified nodes
  • edges - contains a set of uniquely identified edges
  • node_constraints - list of all currently defined node constraints
  • node_indexes - list of all currently defined node indexes
  • edge_indexes - list of all currently defined edge indexes
  • enums - list of all currently defined enums and their values

Exceptions

Exceptions that might be thrown while executing SHOW SCHEMA INFO query:

  • Show schema info is not supported for OnDisk. - Please use InMemory Transactional or Analytical mode instead.
  • Show schema info cannot be executed in multicommand transactions. - Show schema must be executed in an implicit (or auto-commit) transaction.
  • SchemaInfo disabled. - the flag --schema-info-enabled must be set to True when starting Memgraph.

JSON definition

The schema JSON output is structured as follows:

{
  "nodes": [{
    "labels": [""], "count": 1,
    "properties": [{
      "key": "property_key_0", "count": 1, "filling_factor": 100.00,
      "types": [{ "type": "", "count": 1}]
    }]
  }],
  "edges": [{
    "type": "", "start_node_labels": [""], "end_node_labels": [""], "count": 1,
    "properties": [...]
  }],
  "node_indexes": [
      { "labels": [""], "properties": [""], "count": 0,
        "type": "label|label+property|label+property_point|label_text|label+property_text" }
  ],
  "edge_indexes": [
      { "edge_type": "", "properties": [""],  "count": 0,
        "type": "edge_type|edge_type+property|edge_type+property_point|label_text|label+property_text" }
  ],
  "node_constraints": [
      { "type": "unique|existence|data_type", "labels": [""], "properties": [""], "data_type": "" }
  ],
  "enums": [
      { "name": "", "values": [""] }
  ]
}

Supported property types:

  • String
  • Boolean
  • Integer
  • Float
  • List
  • Map
  • Duration
  • Date
  • LocalTime
  • LocalDateTime
  • ZonedDateTime
  • Enum::Name
  • Point2D
  • Point3D

Empty graph example

Executing SHOW SCHEMA INFO on an empty graph will return the following result:

{
  "nodes": [],
  "edges": [],
  "node_indexes": [],
  "edge_indexes": [],
  "node_constraints": [],
  "enums": []
}

Node JSON examples

The following query adds a node without any edges, labels or properties.

CREATE ();

The nodes section of this simple schema is:

"nodes":[
  {
    "labels":[],
    "count":1,
    "properties":[]
  }
]

Adding a node with a single label:

CREATE (:L);

updates the nodes section of the schema to:

"nodes":[
  {
    "labels":[],
    "count":1,
    "properties":[]
  },
  {
    "labels":["L"],
    "count":1,
    "properties":[]
  }
]

Adding properties to the previously defined node:

MATCH (n:L) SET n.p = 1;

will result in the nodes section of the schema:

"nodes":[
  {
    "labels":[],
    "count":1,
    "properties":[]
  },
  {
    "labels":["L"],
    "count":1,
    "properties":[
      {
        "key":"p",
        "count":1,
        "filling_factor":100.00,
        "types":[
          {"type":"Integer", "count":1}
        ]
      }
    ]
  }
]

Now consider the following queries:

MATCH (n) DELETE n;
CREATE (:A:B);
CREATE (:A:B);
CREATE (:A:B);
CREATE (:A:B);
CREATE (:A:B{prop1:1});
CREATE (:A:B{prop1:"str"});
CREATE (:A:B{prop1:true});
CREATE (:A:B{prop1:"str2", prop2:10.0});
CREATE (:A:B{prop1:false,  prop2:20.0});
CREATE (:A:B{prop1:false,  prop2:30.0});

The resulting nodes section of the schema will look like this:

"nodes":[
  {
    "labels":["A", "B"],
    "count":10,
    "properties":[
      {
        "key":"prop1",
        "count":6,
        "filling_factor":60.00,
        "types":[
          {"type":"Boolean", "count":3},
          {"type":"Integer", "count":1},
          {"type":"String", "count":2}
        ]
      },
      {
        "key":"prop2",
        "count":3,
        "filling_factor":30.00,
        "types":[{"type":"Float", "count":3}]
      }
    ]
  }
]

Edge JSON examples

Consider the following queries:

CREATE (:C:D)-[:EDGE]->();
CREATE (:C:D)-[:EDGE{edge_prop:1}]->();
CREATE (:C:D)-[:EDGE{edge_prop:2}]->();

The resulting edges section of the schema would look like this:

"edges":[
  {
    "type":":EDGE",
    "start_node_labels":["C", "D"],
    "end_node_labels":[],
    "count":3,
    "properties":[
      {
        "key":"edge_prop",
        "count":2,
        "filling_factor":66.67,
        "types":[{"type":"Integer", "count":2}]
      }
    ]
  }
]

Performance Considerations

Enabling --schema-info-enabled incurs a performance cost because additional work is required after each query is committed (in transactional mode) or when data is modified (in analytical mode).

Performance Tips

  • Label Changes on Nodes with Edges: Changing labels on a node with existing edges can invalidate large parts of the schema. To avoid blocking queries during schema updates, define node labels before adding edges. Queries such as CREATE (:A)-[:EDGE]->(:B) are also fine.
  • Edge Property Type Changes: Modifying the type of an edge’s property may require scanning large portions of the graph. To avoid this, define edges and their properties in the same transaction and keep property types stable.
  • Recovery Using WALs: Recovering edges with properties via Write-Ahead Log (WAL) files can cause a significant performance hit. To mitigate this, use snapshots instead. Starting from v2.21, this issue has been alleviated, when using WAL files created by v2.21+.

Schema metadata

Schema metadata is a light weight run-time schema tracking. To enable schema metadata queries, start Memgraph with the --storage-enable-schema-metadata configuration flag set to True. The flag facilitates the utilization of a specialized cache designed to store specific metadata related to the database.

Information about node labels

The SHOW NODE_LABELS INFO; query provides a list of node labels that currently exist in the database or have existed at some point.

If the database contains the following data:

CREATE (:Dog {name: "Rex", age: 5})-[l:LOVES {how_much: "very"}]->(:Human:Owner {name: "Carl", age: 90});

The SHOW NODE_LABELS INFO; will return all the labels:

SHOW NODE_LABELS INFO;

Result:

+-------------+
| node labels |
+-------------+
| "Owner"     |
| "Human"     |
| "Dog"       |
+-------------+

Information about relationship types

The SHOW EDGE_TYPES INFO; query provides a list of relationship types that currently exist in the database or have existed at some point.

If the database contains the following data:

CREATE (:Dog {name: "Rex", age: 5})-[l:LOVES {how_much: "very"}]->(:Human:Owner {name: "Carl", age: 90});

The SHOW EDGE_TYPES INFO; will return all the types:

SHOW EDGE_TYPES INFO;

Result:

+------------+
| edge types |
+------------+
| "LOVES"    |
+------------+

node_type_properties()

The schema.node_type_properties() procedure returns schema information about nodes and their properties in the graph.

Output:

  • nodeType: string ➡ Concatenated node labels separated by a :.
  • nodeLabels: List[string] ➡ A list of node labels.
  • mandatory: boolean ➡ Returns True if every node with a given node type (defined by nodeType or nodeLabels) possesses the listed property (propertyName), and False otherwise.
  • propertyName: string ➡ Property name.
  • propertyTypes: string ➡ Property type.

Usage:

To get the information about nodes and properties, run the following query:

CALL schema.node_type_properties()
YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes;

rel_type_properties()

This schema.rel_type_properties() procedure returns schema information about relationships and their properties in the graph.

Output:

  • relType: string ➡ The type of the relationship.
  • mandatory: boolean ➡ Returns True if every relationship with a given relationship type (defined by relType) possesses the listed property (propertyName), and False otherwise.
  • propertyName: string ➡ Property name.
  • propertyTypes: string ➡ Property type.

Usage:

To get the information about relationships and properties, run the following query:

CALL schema.rel_type_properties()
YIELD relType, mandatory, propertyName, propertyTypes;

assert()

The schema.assert() is used to ensure indices and constraints exist in Memgraph.

Input:

  • indices_map: (mgp.Map, required): Requires a map of key-value pairs where the key is a string representing a label and the value is a list containing properties as strings. If the list of properties is empty, an index will be created only on the label. If the list is not empty, indexes on the properties belonging to that label will be created. To create indexes on both the label and properties, add an empty string to the list of properties, for example, ["", "prop1", "prop2", ...].
  • unique_constraints: (mgp.Map, required): Requires a map of key-value pairs where the key is a string representing a label and the value is a list of lists containing properties as strings. List of lists is required since unique constraints can be created on a combination of multiple properties.
  • existence_constraints: (mgp.Map, optional): Requires a map of key-value pairs where the key is a string representing a label and the value is a list containing properties as strings. For each property, one existence constraint is created.
  • drop_existing: (bool, optional, default=true): If true, all existing indices and constraints will be dropped. If the user asserts existing indexes or constraints, they won’t be dropped and re-created.

Output:

  • action: string - The modification made to an index or a constraint.
  • label: string - Label used for modifying index or constraint.
  • key: string - Properties used for modifying index or constraint. When a label index is modified, an empty string is returned. When a unique constraint is modified, where key ia a list of properties a stringified list is returned.
  • keys: List[string] - List of properties used to modify indexes or constraints.
  • unique: bool - When true the record refers to unique constraint, when false it reffers to existence constraint and indices.

Usage:

Create initial indices and constraints in the database, using the following queries:

CREATE INDEX ON :Entity;
CREATE CONSTRAINT ON (p:Person) ASSERT EXISTS (p.address);
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;

The following query will modify indexes and constraints by creating an index on the :Person label, and a label-property index on :Person(id) property. Also, a unique constraints will be created on (Person.name, Person.surname) and (Person.id). The index on the :Entity label will be dropped along with the existence constraint on :Person(id) property.

CALL schema.assert({Person: ["", "id"]}, {Person: [["name", "surname"], ["id"]]}, {}, true)
YIELD action, key, keys, label, unique
RETURN action, key, keys, label, unique;

Results:

+-----------------------------------------------------------------------------+
| action     | key               | keys                | label       | unique |
+-----------------------------------------------------------------------------+
| "Created"  | ""                | []                  | "Person"    | false  |
+-----------------------------------------------------------------------------+
| "Created"  | "id"              | ["id"]              | "Person"    | false  |
+-----------------------------------------------------------------------------+
| "Dropped"  | ""                | []                  | "Entity"    | false  |
+-----------------------------------------------------------------------------+
| "Dropped"  | "address"         | ["address"]         | "Person"    | false  |
+-----------------------------------------------------------------------------+
| "Created"  | "[name, surname]" | ["name", "surname"] | "Person"    | true   |
+-----------------------------------------------------------------------------+
| "Kept"     | "id"              | ["id"]              | "Person"    | true   |
+-----------------------------------------------------------------------------+

Delete all node indexes

⚠️

The schema.assert() procedure will not drop edge-type and point indexes. Our plan is to update it, and you can track the progress on our GitHub.

The assert() procedure can be used to delete all indexes and constraints. By providing empty indices_map, unique_constraints and existence_constraints as well as drop_existing set to true, ensure that there are no indexes or constraints in the database. Here is the query for deleting all indexes and constraints:

CALL schema.assert({}, {}, {}, true)
YIELD action, key, keys, label, unique
RETURN action, key, keys, label, unique;

Example

Create a graph using the following query:

CREATE (:Dog {name: "Rex", age: 5})-[l:LOVES {how_much: "very"}]->(:Human:Owner {name: "Carl", age: 90})
CREATE (:Dog)-[r:RUNS_AND_PLAYS_IN {speed: 100, duration: "5 hours" }]->(:Park)
CREATE (:Bird)-[f:FLIES_TO]->(:Sky);

Call the procedure to get information about the nodes:

CALL schema.node_type_properties()
YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes;

Result:

+--------------------+--------------------+--------------------+--------------------+--------------------+
| nodeType           | nodeLabels         | mandatory          | propertyName       | propertyTypes      |
+--------------------+--------------------+--------------------+--------------------+--------------------+
| ":`Sky`"           | ["Sky"]            | false              | ""                 | ""                 |
| ":`Park`"          | ["Park"]           | false              | ""                 | ""                 |
| ":`Human`:`Owner`" | ["Human", "Owner"] | false              | "age"              | "Int"              |
| ":`Human`:`Owner`" | ["Human", "Owner"] | false              | "name"             | "String"           |
| ":`Bird`"          | ["Bird"]           | false              | ""                 | ""                 |
| ":`Dog`"           | ["Dog"]            | false              | "age"              | "Int"              |
| ":`Dog`"           | ["Dog"]            | false              | "name"             | "String"           |
+--------------------+--------------------+--------------------+--------------------+--------------------+

Call the procedure to get information about the relationships:

CALL schema.rel_type_properties()
YIELD relType, mandatory, propertyName, propertyTypes;

Results:

+------------------------+------------------------+------------------------+------------------------+
| relType                | mandatory              | propertyName           | propertyTypes          |
+------------------------+------------------------+------------------------+------------------------+
| ":`FLIES_TO`"          | false                  | ""                     | ""                     |
| ":`RUNS_AND_PLAYS_IN`" | false                  | "duration"             | "String"               |
| ":`RUNS_AND_PLAYS_IN`" | false                  | "speed"                | "Int"                  |
| ":`LOVES`"             | true                   | "how_much"             | "String"               |
+------------------------+------------------------+------------------------+------------------------+