Schema
Schema queries are designed to retrieve information about the structure of the data within the database, for example, node labels and properties associated, as well as relationship types and properties.
Run-time schema tracking
Please use a snapshot to migrate a pre v2.21 datasets with properties on edges. When schema is tracked, recovering using WAL files generated pre v2.21 will incur a performance penalty as crucial data is missing, which leads to long scan times.
To use run-time schema tracking, start Memgraph with the --schema-info-enabled
configuration flag set to True
. Once enabled, every change to the data layout will be tracked, and a derived schema will be automatically updated. Note that enabling this feature may result in a slight performance decrease; for more details, refer to Performance Tips.
What Schema Information is Tracked?
The runtime schema tracks all uniquely defined nodes and edges:
- Nodes are defined by their labels.
- Edges are defined by their type, and the labels of their source and destination nodes.
For each unique object (node or edge), the schema tracks its count and categorizes stored properties by key and type.
Querying the Schema
You can query the current schema by running SHOW SCHEMA INFO
. This returns a JSON representation of the schema, divided into six sections:
nodes
- contains a set of uniquely identified nodesedges
- contains a set of uniquely identified edgesnode_constraints
- list of all currently defined node constraintsnode_indexes
- list of all currently defined node indexesedge_indexes
- list of all currently defined edge indexesenums
- list of all currently defined enums and their values
Exceptions
Exceptions that might be thrown while executing SHOW SCHEMA INFO
query:
Show schema info is not supported for OnDisk.
- Please use InMemory Transactional or Analytical mode instead.Show schema info cannot be executed in multicommand transactions.
- Show schema must be executed in an implicit (or auto-commit) transaction.SchemaInfo disabled.
- the flag--schema-info-enabled
must be set toTrue
when starting Memgraph.
JSON definition
The schema JSON output is structured as follows:
{
"nodes": [{
"labels": [""], "count": 1,
"properties": [{
"key": "property_key_0", "count": 1, "filling_factor": 100.00,
"types": [{ "type": "", "count": 1}]
}]
}],
"edges": [{
"type": "", "start_node_labels": [""], "end_node_labels": [""], "count": 1,
"properties": [...]
}],
"node_indexes": [
{ "labels": [""], "properties": [""], "count": 0,
"type": "label|label+property|label+property_point|label_text|label+property_text" }
],
"edge_indexes": [
{ "edge_type": "", "properties": [""], "count": 0,
"type": "edge_type|edge_type+property|edge_type+property_point|label_text|label+property_text" }
],
"node_constraints": [
{ "type": "unique|existence|data_type", "labels": [""], "properties": [""], "data_type": "" }
],
"enums": [
{ "name": "", "values": [""] }
]
}
Supported property types:
- String
- Boolean
- Integer
- Float
- List
- Map
- Duration
- Date
- LocalTime
- LocalDateTime
- ZonedDateTime
- Enum::Name
- Point2D
- Point3D
Empty graph example
Executing SHOW SCHEMA INFO
on an empty graph will return the following result:
{
"nodes": [],
"edges": [],
"node_indexes": [],
"edge_indexes": [],
"node_constraints": [],
"enums": []
}
Node JSON examples
The following query adds a node without any edges, labels or properties.
CREATE ();
The nodes section of this simple schema is:
"nodes":[
{
"labels":[],
"count":1,
"properties":[]
}
]
Adding a node with a single label:
CREATE (:L);
updates the nodes section of the schema to:
"nodes":[
{
"labels":[],
"count":1,
"properties":[]
},
{
"labels":["L"],
"count":1,
"properties":[]
}
]
Adding properties to the previously defined node:
MATCH (n:L) SET n.p = 1;
will result in the nodes section of the schema:
"nodes":[
{
"labels":[],
"count":1,
"properties":[]
},
{
"labels":["L"],
"count":1,
"properties":[
{
"key":"p",
"count":1,
"filling_factor":100.00,
"types":[
{"type":"Integer", "count":1}
]
}
]
}
]
Now consider the following queries:
MATCH (n) DELETE n;
CREATE (:A:B);
CREATE (:A:B);
CREATE (:A:B);
CREATE (:A:B);
CREATE (:A:B{prop1:1});
CREATE (:A:B{prop1:"str"});
CREATE (:A:B{prop1:true});
CREATE (:A:B{prop1:"str2", prop2:10.0});
CREATE (:A:B{prop1:false, prop2:20.0});
CREATE (:A:B{prop1:false, prop2:30.0});
The resulting nodes section of the schema will look like this:
"nodes":[
{
"labels":["A", "B"],
"count":10,
"properties":[
{
"key":"prop1",
"count":6,
"filling_factor":60.00,
"types":[
{"type":"Boolean", "count":3},
{"type":"Integer", "count":1},
{"type":"String", "count":2}
]
},
{
"key":"prop2",
"count":3,
"filling_factor":30.00,
"types":[{"type":"Float", "count":3}]
}
]
}
]
Edge JSON examples
Consider the following queries:
CREATE (:C:D)-[:EDGE]->();
CREATE (:C:D)-[:EDGE{edge_prop:1}]->();
CREATE (:C:D)-[:EDGE{edge_prop:2}]->();
The resulting edges section of the schema would look like this:
"edges":[
{
"type":":EDGE",
"start_node_labels":["C", "D"],
"end_node_labels":[],
"count":3,
"properties":[
{
"key":"edge_prop",
"count":2,
"filling_factor":66.67,
"types":[{"type":"Integer", "count":2}]
}
]
}
]
Performance Considerations
Enabling --schema-info-enabled
incurs a performance cost because additional work is required after each query is committed (in transactional mode) or when data is modified (in analytical mode).
Performance Tips
- Label Changes on Nodes with Edges: Changing labels on a node with existing edges can invalidate large parts of the schema. To avoid blocking queries during schema updates, define node labels before adding edges. Queries such as
CREATE (:A)-[:EDGE]->(:B)
are also fine. - Edge Property Type Changes: Modifying the type of an edge's property may require scanning large portions of the graph. To avoid this, define edges and their properties in the same transaction and keep property types stable.
- Recovery Using WALs: Recovering edges with properties via Write-Ahead Log (WAL) files can cause a significant performance hit. To mitigate this, use snapshots instead. Starting from v2.21, this issue has been alleviated, when using WAL files created by v2.21+.
Schema metadata
Schema metadata is a light weight run-time schema tracking. To enable schema metadata queries, start Memgraph with the --storage-enable-schema-metadata
configuration flag set to True
. The flag facilitates the utilization of a specialized cache designed to store specific metadata related to the database.
Information about node labels
The SHOW NODE_LABELS INFO;
query provides a list of node labels that currently exist in the database or have existed at some point.
If the database contains the following data:
CREATE (:Dog {name: "Rex", age: 5})-[l:LOVES {how_much: "very"}]->(:Human:Owner {name: "Carl", age: 90});
The SHOW NODE_LABELS INFO;
will return all the labels:
SHOW NODE_LABELS INFO;
Result:
+-------------+
| node labels |
+-------------+
| "Owner" |
| "Human" |
| "Dog" |
+-------------+
Information about relationship types
The SHOW EDGE_TYPES INFO;
query provides a list of relationship types that currently exist in the database or have existed at some point.
If the database contains the following data:
CREATE (:Dog {name: "Rex", age: 5})-[l:LOVES {how_much: "very"}]->(:Human:Owner {name: "Carl", age: 90});
The SHOW EDGE_TYPES INFO;
will return all the types:
SHOW EDGE_TYPES INFO;
Result:
+------------+
| edge types |
+------------+
| "LOVES" |
+------------+
Schema-related procedures
You can execute these procedures on graph projections, subgraphs or portions of the graph using the project
function.
node_type_properties()
The schema.node_type_properties()
procedure returns schema information about nodes and their properties in the graph.
Output:
nodeType: string
➡ Concatenated node labels separated by a:
.nodeLabels: List[string]
➡ A list of node labels.mandatory: boolean
➡ ReturnsTrue
if every node with a given node type (defined by nodeType or nodeLabels) possesses the listed property (propertyName), andFalse
otherwise.propertyName: string
➡ Property name.propertyTypes: string
➡ Property type.
Usage:
To get the information about nodes and properties, run the following query:
CALL schema.node_type_properties()
YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes;
rel_type_properties()
This schema.rel_type_properties()
procedure returns schema information about relationships and their properties in the graph.
Output:
relType: string
➡ The type of the relationship.mandatory: boolean
➡ ReturnsTrue
if every relationship with a given relationship type (defined by relType) possesses the listed property (propertyName), andFalse
otherwise.propertyName: string
➡ Property name.propertyTypes: string
➡ Property type.
Usage:
To get the information about relationships and properties, run the following query:
CALL schema.rel_type_properties()
YIELD relType, mandatory, propertyName, propertyTypes;
assert()
The schema.assert()
is used to ensure indices and constraints exist in Memgraph.
Input:
indices_map: (mgp.Map, required)
: Requires a map of key-value pairs where the key is a string representing a label and the value is a list containing properties as strings. If the list of properties is empty, an index will be created only on the label. If the list is not empty, indexes on the properties belonging to that label will be created. To create indexes on both the label and properties, add an empty string to the list of properties, for example,["", "prop1", "prop2", ...]
.unique_constraints: (mgp.Map, required)
: Requires a map of key-value pairs where the key is a string representing a label and the value is a list of lists containing properties as strings. List of lists is required since unique constraints can be created on a combination of multiple properties.existence_constraints: (mgp.Map, optional)
: Requires a map of key-value pairs where the key is a string representing a label and the value is a list containing properties as strings. For each property, one existence constraint is created.drop_existing: (bool, optional, default=true)
: If true, all existing indices and constraints will be dropped. If the user asserts existing indexes or constraints, they won't be dropped and re-created.
Output:
action: string
- The modification made to an index or a constraint.label: string
- Label used for modifying index or constraint.key: string
- Properties used for modifying index or constraint. When a label index is modified, an empty string is returned. When a unique constraint is modified, where key ia a list of properties a stringified list is returned.keys: List[string]
- List of properties used to modify indexes or constraints.unique: bool
- Whentrue
the record refers to unique constraint, whenfalse
it reffers to existence constraint and indices.
Usage:
Create initial indices and constraints in the database, using the following queries:
CREATE INDEX ON :Entity;
CREATE CONSTRAINT ON (p:Person) ASSERT EXISTS (p.address);
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;
The following query will modify indexes and constraints by creating an index on
the :Person
label, and a label-property index on :Person(id)
property. Also,
a unique constraints will be created on (Person.name, Person.surname)
and
(Person.id)
. The index on the :Entity
label will be dropped along with
the existence constraint on :Person(id)
property.
CALL schema.assert({Person: ["", "id"]}, {Person: [["name", "surname"], ["id"]]}, {}, true)
YIELD action, key, keys, label, unique
RETURN action, key, keys, label, unique;
Results:
+-----------------------------------------------------------------------------+
| action | key | keys | label | unique |
+-----------------------------------------------------------------------------+
| "Created" | "" | [] | "Person" | false |
+-----------------------------------------------------------------------------+
| "Created" | "id" | ["id"] | "Person" | false |
+-----------------------------------------------------------------------------+
| "Dropped" | "" | [] | "Entity" | false |
+-----------------------------------------------------------------------------+
| "Dropped" | "address" | ["address"] | "Person" | false |
+-----------------------------------------------------------------------------+
| "Created" | "[name, surname]" | ["name", "surname"] | "Person" | true |
+-----------------------------------------------------------------------------+
| "Kept" | "id" | ["id"] | "Person" | true |
+-----------------------------------------------------------------------------+
Delete all node indexes
The schema.assert()
procedure will not drop edge-type and point indexes.
Our plan is to update it, and you can track the progress on our GitHub (opens in a new tab).
The assert()
procedure can be used to delete all indexes and constraints. By
providing empty indices_map
, unique_constraints
and existence_constraints
as
well as drop_existing
set to true
, ensure that there are no indexes or
constraints in the database. Here is the query for deleting all indexes and constraints:
CALL schema.assert({}, {}, {}, true)
YIELD action, key, keys, label, unique
RETURN action, key, keys, label, unique;
Example
Create a graph using the following query:
CREATE (:Dog {name: "Rex", age: 5})-[l:LOVES {how_much: "very"}]->(:Human:Owner {name: "Carl", age: 90})
CREATE (:Dog)-[r:RUNS_AND_PLAYS_IN {speed: 100, duration: "5 hours" }]->(:Park)
CREATE (:Bird)-[f:FLIES_TO]->(:Sky);
Call the procedure to get information about the nodes:
CALL schema.node_type_properties()
YIELD nodeType, nodeLabels, mandatory, propertyName, propertyTypes;
Result:
+--------------------+--------------------+--------------------+--------------------+--------------------+
| nodeType | nodeLabels | mandatory | propertyName | propertyTypes |
+--------------------+--------------------+--------------------+--------------------+--------------------+
| ":`Sky`" | ["Sky"] | false | "" | "" |
| ":`Park`" | ["Park"] | false | "" | "" |
| ":`Human`:`Owner`" | ["Human", "Owner"] | false | "age" | "Int" |
| ":`Human`:`Owner`" | ["Human", "Owner"] | false | "name" | "String" |
| ":`Bird`" | ["Bird"] | false | "" | "" |
| ":`Dog`" | ["Dog"] | false | "age" | "Int" |
| ":`Dog`" | ["Dog"] | false | "name" | "String" |
+--------------------+--------------------+--------------------+--------------------+--------------------+
Call the procedure to get information about the relationships:
CALL schema.rel_type_properties()
YIELD relType, mandatory, propertyName, propertyTypes;
Results:
+------------------------+------------------------+------------------------+------------------------+
| relType | mandatory | propertyName | propertyTypes |
+------------------------+------------------------+------------------------+------------------------+
| ":`FLIES_TO`" | false | "" | "" |
| ":`RUNS_AND_PLAYS_IN`" | false | "duration" | "String" |
| ":`RUNS_AND_PLAYS_IN`" | false | "speed" | "Int" |
| ":`LOVES`" | true | "how_much" | "String" |
+------------------------+------------------------+------------------------+------------------------+