cross_database
The cross_database module lets you reach into another database from a running Cypher query and stream
its rows into Memgraph. Use it to migrate data, build hybrid OLTP/graph pipelines, or join graph data
with rows fetched on-demand from a relational/Bolt/object-store source.
The migrate module is deprecated as of Memgraph 3.11 and has been replaced by cross_database.
Existing migrate.* calls keep working via the aliases shipped
with Memgraph, but new code should call cross_database.* directly. The previous migrate.memgraph()
procedure has been replaced by the more general cross_database.bolt().
| Trait | Value |
|---|---|
| Module type | util |
| Implementation | Python |
| Parallelism | sequential |
When running multiple cross-database calls against the same source, avoid repeating the config map in every call.
Use server-side parameters to store the connection config once
and reference it as $config across all your queries:
SET GLOBAL PARAMETER pg_config = {user: 'memgraph', password: 'password', host: 'localhost', database: 'demo_db'};
CALL cross_database.postgresql('users', $pg_config) YIELD row CREATE (u:User {id: row.id});
CALL cross_database.postgresql('orders', $pg_config) YIELD row CREATE (o:Order {id: row.id});Backwards compatibility with migrate
Every procedure listed below has a migrate.* alias preserved from earlier versions, so existing queries
keep working unchanged. The aliases are wired up through Memgraph’s callable-mapping mechanism
(/etc/memgraph/apoc_compatibility_mappings.json, enabled by default), so no configuration is required.
| Pre-3.11 name | New name |
|---|---|
migrate.memgraph | cross_database.bolt |
migrate.neo4j | cross_database.neo4j |
migrate.mysql | cross_database.mysql |
migrate.postgresql | cross_database.postgresql |
migrate.sql_server | cross_database.sql_server |
migrate.oracle_db | cross_database.oracle_db |
migrate.s3 | cross_database.s3 |
migrate.arrow_flight | cross_database.arrow_flight |
migrate.duckdb | cross_database.duckdb |
migrate.servicenow | cross_database.servicenow |
If you mix old and new names within the same transaction, treat them as the same procedure — calling
migrate.postgresql and cross_database.postgresql with identical arguments will trigger the
same-parameters guard.
You can inspect the active mapping at runtime with:
SHOW QUERY CALLABLE MAPPINGS;Type conversion
For Bolt-based sources (bolt, neo4j), primitives (Boolean, Integer, Float, String, Null),
lists, maps, the calendar-aware temporal types (Date, LocalTime, LocalDateTime, DateTime), and
spatial points (Point2d, Point3d, with their srid preserved) pass through cleanly. The cases worth
knowing about:
| Source type | Result | Notes |
|---|---|---|
Duration with months > 0 | flattened | Months are coerced to 30 days each. This differs from native Neo4j behavior, where months are preserved and date + P1M stays calendar-aware. See the warning below. |
Time (zoned) | unsupported | Memgraph has no zoned-time type. Convert to LocalTime on the source side, or carry the offset as a separate column. |
| Enum | unsupported | Memgraph enums can’t cross the Bolt boundary — the call raises an error. |
| Node / Relationship / Path | unsupported | Streaming raw graph objects is not supported. Return their properties() / labels() instead. |
The 30-days-per-month flattening means a duration like P1M becomes 30 days on the Memgraph side, and
P1Y becomes 360 days. Native Neo4j keeps months separate from days, so date + duration('P1M') stays
calendar-aware there; after crossing the Bolt boundary into Memgraph that calendar information is gone.
If you need exact calendar arithmetic, fetch the months and days components as separate columns and
reconstruct them on the Memgraph side.
Same-parameters guard
cross_database deliberately rejects two concurrent calls with the same (query, config, params) inside
a single transaction. Doing so would race on the underlying connection and yield duplicate rows. The error
message is:
Cross database module with these parameters is already running.
Please wait for it to finish before starting a new one.If you intentionally need two reads from the same source in one transaction, vary the query (for example
add a different LIMIT, alias, or comment) so the cache key differs.
Procedures
bolt()
cross_database.bolt() queries any Bolt-compatible database — Memgraph itself, another Memgraph instance,
or Neo4j — and streams the rows back. This replaces the old migrate.memgraph() procedure and is the
recommended way to read from any Bolt source.
Input:
label_or_rel_or_query: str➡ A label name ((:Label)), a relationship type ([:REL_TYPE]), or a plain Cypher query. When a label/relationship shorthand is used,cross_database.bolt()synthesizes the matchingMATCH … RETURN labels/properties …query for you.config: mgp.Map➡ Connection parameters. Notable keys:host(defaultlocalhost),port(default7687),username,password,database,uri_scheme(defaultbolt).config_path: str(optional) ➡ Path to a JSON file containing connection parameters; values in the file override values inconfig.params: mgp.Nullable[mgp.Any](optional, defaultNone) ➡ AMapof Cypher parameters passed to the remote query (e.g.{val: 42}for a query using$val).
Output:
row: mgp.Map➡ The result table as a stream of rows.- When fetching with the
(:Label)syntax, each row haslabelsandproperties. - When fetching with the
[:REL_TYPE]syntax, each row hasfrom_labels,to_labels,from_properties,to_properties, andedge_properties. - When passing a plain Cypher query, row keys match the columns the remote query returns.
- When fetching with the
Usage:
Retrieve nodes of a certain label and recreate them locally
CALL cross_database.bolt('(:Person)', {host: 'localhost', port: 7687})
YIELD row
WITH row.labels AS labels, row.properties AS props
CREATE (n:labels) SET n += props;Pass query parameters
CALL cross_database.bolt(
'MATCH (u:User) WHERE u.id = $id RETURN u.name AS name',
{host: 'localhost', port: 7687},
'',
{id: 42}
)
YIELD row
RETURN row.name AS name;Connect to Neo4j with explicit credentials
CALL cross_database.bolt(
'MATCH (n) RETURN count(n) AS cnt',
{host: 'neo4j-host', port: 7687, username: 'neo4j', password: 'secret'}
)
YIELD row
RETURN row.cnt AS cnt;neo4j()
cross_database.neo4j() is a thin convenience wrapper around bolt() that defaults the
credentials to Neo4j’s stock neo4j / password. Use it when you only need to override the host/port.
Input:
Same as bolt(). If username / password are missing from config, they default to
neo4j / password.
Output:
Same as bolt().
Usage:
CALL cross_database.neo4j('(:Person)', {host: 'neo4j-host', port: 7687})
YIELD row
WITH row.labels AS labels, row.properties AS props
CREATE (n:labels) SET n += props;mysql()
Query MySQL and stream rows back. The result table is converted into a stream that can be used to create graph structures.
Input:
table_or_sql: str➡ A table name (a single word — automatically expanded toSELECT * FROM <name>) or a full SQL query.config: mgp.Map➡ Connection parameters (as inmysql.connector.connect).config_path: str(optional) ➡ Path to a JSON file with connection parameters.params: mgp.Nullable[mgp.Any](optional, defaultNone) ➡ Query parameters. Accepts aListfor%s-style placeholders or aMapfor%(name)s-style placeholders (both supported bymysql.connector).
Output:
row: mgp.Map➡ The result table as a stream of rows.
Usage:
Retrieve and inspect data
CALL cross_database.mysql('example_table', {user: 'memgraph',
password: 'password',
host: 'localhost',
database: 'demo_db'})
YIELD row
RETURN row
LIMIT 5000;Create nodes from migrated data
CALL cross_database.mysql('SELECT id, name, age FROM users', {user: 'memgraph',
password: 'password',
host: 'localhost',
database: 'demo_db'})
YIELD row
CREATE (u:User {id: row.id, name: row.name, age: row.age});postgresql()
Query PostgreSQL and stream rows back.
Input:
table_or_sql: str➡ Table name or SQL query.config: mgp.Map➡ Connection parameters (as inpsycopg2.connect).config_path: str(optional) ➡ Path to a JSON file with connection parameters.params: mgp.Nullable[mgp.Any](optional, defaultNone) ➡ Query parameters as aList(psycopg2 uses positional%splaceholders). Passing aMapraises aTypeError.
Output:
row: mgp.Map➡ The result table as a stream of rows.
Usage:
Retrieve and inspect data
CALL cross_database.postgresql('example_table', {user: 'memgraph',
password: 'password',
host: 'localhost',
database: 'demo_db'})
YIELD row
RETURN row
LIMIT 5000;Establish relationships between orders and customers
CALL cross_database.postgresql('SELECT order_id, customer_id FROM orders', {user: 'memgraph',
password: 'password',
host: 'localhost',
database: 'retail_db'})
YIELD row
MATCH (o:Order {id: row.order_id}), (c:Customer {id: row.customer_id})
CREATE (c)-[:PLACED]->(o);sql_server()
Query SQL Server and stream rows back.
Input:
table_or_sql: str➡ Table name or SQL query.config: mgp.Map➡ Connection parameters (as inpyodbc.connect).config_path: str(optional) ➡ Path to a JSON file with connection parameters.params: mgp.Nullable[mgp.Any](optional, defaultNone) ➡ Query parameters as aList.
Output:
row: mgp.Map➡ The result table as a stream of rows.
Usage:
CALL cross_database.sql_server('SELECT id, name, role FROM employees', {user: 'memgraph',
password: 'password',
host: 'localhost',
database: 'company_db'})
YIELD row
CREATE (e:Employee {id: row.id, name: row.name, role: row.role});oracle_db()
Query Oracle DB and stream rows back.
Input:
table_or_sql: str➡ Table name or SQL query.config: mgp.Map➡ Connection parameters (as inoracledb.connect).config_path: str(optional) ➡ Path to a JSON file with connection parameters.params: mgp.Nullable[mgp.Any](optional, defaultNone) ➡ Query parameters. Accepts aListfor positional placeholders or aMapfor:name-style named placeholders (both supported byoracledb).
Output:
row: mgp.Map➡ The result table as a stream of rows.
Usage:
CALL cross_database.oracle_db('SELECT id, name FROM companies', {user: 'memgraph',
password: 'password',
host: 'localhost',
database: 'business_db'})
YIELD row
MERGE (c:Company {id: row.id})
SET c.name = row.name;s3()
Read a CSV file directly from AWS S3 and stream its rows into Memgraph.
Input:
file_path: str➡ S3 path in the forms3://bucket-name/path/to/file.csv.config: mgp.Map➡ AWS credentials; all keys optional. Missing keys fall back to the corresponding environment variables:aws_access_key_id(env:AWS_ACCESS_KEY_ID)aws_secret_access_key(env:AWS_SECRET_ACCESS_KEY)region_name(env:AWS_REGION)aws_session_token(env:AWS_SESSION_TOKEN)
config_path: str(optional) ➡ Path to a JSON file with AWS credentials.
Output:
row: mgp.Map➡ Each CSV row as a{column_name: value}map.
Usage:
CALL cross_database.s3('s3://my-bucket/employees.csv', {aws_access_key_id: 'your-key',
aws_secret_access_key: 'your-secret',
region_name: 'eu-central-1'})
YIELD row
CREATE (e:Employee {id: row.id, name: row.name, position: row.position});arrow_flight()
Connect to any data source that speaks the Arrow Flight RPC protocol (for example, Dremio) and stream rows in.
Input:
query: str➡ Query against the data source.config: mgp.Map➡ Connection parameters (as inpyarrow.flight.connect). Notable keys:host,port,username,password.config_path: str(optional) ➡ Path to a JSON file with connection parameters.
Output:
row: mgp.Map➡ The result table as a stream of rows.
Usage:
CALL cross_database.arrow_flight('SELECT id, name, age FROM users', {username: 'memgraph',
password: 'password',
host: 'localhost',
port: '12345'})
YIELD row
CREATE (u:User {id: row.id, name: row.name, age: row.age});duckdb()
Connect to DuckDB and use it as a proxy to query the data sources DuckDB supports. DuckDB is run in-memory with no persistence — it’s only used to proxy to underlying sources.
Input:
query: str➡ Table name or SQL query.setup_queries: mgp.Nullable[List[str]](optional) ➡ Queries executed beforequery, used to attach or configure the source DuckDB will proxy to (e.g.INSTALL httpfs; LOAD httpfs;).
Output:
row: mgp.Map➡ The result table as a stream of rows.
Usage:
CALL cross_database.duckdb(
'SELECT * FROM read_csv_auto(''s3://my-bucket/users.csv'')',
['INSTALL httpfs;', 'LOAD httpfs;']
)
YIELD row
CREATE (u:User {id: row.id, name: row.name});servicenow()
Pull data from the ServiceNow REST API.
The endpoint must return JSON of the form {"results": [...]}.
Input:
endpoint: str➡ The full ServiceNow URL, including any query parameters.config: mgp.Map➡ Connection parameters. Notable keys:username,password(passed through torequests.get).config_path: str(optional) ➡ Path to a JSON file with connection parameters.params: mgp.Nullable[mgp.Any](optional, defaultNone) ➡ Additional URL query parameters, forwarded torequests.get(endpoint, params=...). Unlike the SQL backends, these are HTTP query string parameters, not SQL placeholders.
Output:
row: mgp.Map➡ Each element ofresultsas a structured dictionary.
Usage:
CALL cross_database.servicenow('http://my_endpoint/api/data', {})
YIELD row
CREATE (e:Employee {id: row.id, name: row.name, position: row.position});