Backup and restore
Memgraph uses snapshots and write-ahead logs (WAL) to ensure the durability of the stored data. Learn how to safely back up and restore your data.
Create backup
Follow these steps to create a database backup:
Create a snapshot
If necessary, create a snapshot of the current database state by running the following query in mgconsole or Memgraph Lab:
CREATE SNAPSHOT;
The snapshot is saved in the snapshots directory of the data directory (/var/lib/memgraph).
Lock the data directory
Durability files are deleted when certain events are triggered, for example, when the maximum number of snapshots is exceeded.
To disable this behavior while you copy the files, run the following query in mgconsole or Memgraph Lab:
LOCK DATA DIRECTORY;
Copy files
Copy snapshot files (from the snapshots directory) and any additional WAL files (from the wal directory) to a backup location.
If you’ve just created a snapshot file, there is no need to back up WAL files.
To copy the snapshot files from a Docker container, first check the container ID by running docker ps, then run the following command:
docker cp <CONTAINER ID>:/var/lib/memgraph/snapshots/<snapshot_file> <snapshot_file>
Unlock the data directory
Run the following query in mgconsole or Memgraph Lab to unlock the directory:
UNLOCK DATA DIRECTORY;
Memgraph will then delete the files that should have been deleted while the directory was locked, and will allow future deletion of durability files.
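The four steps above can be scripted. The following is a minimal Python sketch, not an official tool: run_query is a hypothetical callable that sends one Cypher query to Memgraph (for example, a thin wrapper around a Bolt driver session), and the function name and directory arguments are assumptions made for illustration. It assumes the script can read the data directory directly, and it unlocks the directory in a finally block so that a failed copy does not leave it locked.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def backup_durability_files(
    run_query: Callable[[str], object],
    data_dir: Path,
    backup_dir: Path,
) -> List[Path]:
    """Create a snapshot, lock the data directory, copy files, then unlock."""
    run_query("CREATE SNAPSHOT;")
    run_query("LOCK DATA DIRECTORY;")
    copied: List[Path] = []
    try:
        # Copy snapshot files and any WAL files into matching subdirectories.
        for subdir in ("snapshots", "wal"):
            source = data_dir / subdir
            if not source.is_dir():
                continue
            target = backup_dir / subdir
            target.mkdir(parents=True, exist_ok=True)
            for path in sorted(source.iterdir()):
                if path.is_file():
                    shutil.copy2(path, target / path.name)
                    copied.append(target / path.name)
    finally:
        # Always unlock, even if copying failed.
        run_query("UNLOCK DATA DIRECTORY;")
    return copied
```

A call could look like backup_durability_files(run_query, Path("/var/lib/memgraph"), Path("/backups/today")), where the backup location is a placeholder.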
Restore data
To restore a snapshot in Memgraph, run the following command from an already running instance:
RECOVER SNAPSHOT "/path/to/snapshot";
By default, snapshots are stored in the local directory:
/var/lib/memgraph/snapshots/
If your snapshot is stored elsewhere, Memgraph will attempt to copy it to the local snapshot directory. Ensure the file has the necessary permissions to be moved. If not, you might encounter the following error:
Failed to copy snapshot over to local snapshots directory.
Use an absolute path when specifying the snapshot location. If you provide a relative path, it must be relative to the Memgraph execution path.
Before modifying the local data directory, Memgraph will move all existing WALs and snapshots to a hidden directory in the format:
.old_<time-since-epoch>
If the instance is not freshly started, add the FORCE flag to your command:
RECOVER SNAPSHOT "/path/to/snapshot" FORCE;
This will clear all existing data before applying the snapshot.
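When the restore is driven from a script, it can help to build the query in one place. The helper below is a hypothetical sketch (the function name is not part of Memgraph). It rejects relative paths as a design choice, since those resolve against the Memgraph execution path, and it assumes backslash escaping is acceptable inside the quoted path.

```python
from pathlib import PurePosixPath


def recover_snapshot_query(snapshot_path: str, force: bool = False) -> str:
    """Build a RECOVER SNAPSHOT query for an absolute snapshot path."""
    if not PurePosixPath(snapshot_path).is_absolute():
        # Relative paths resolve against the Memgraph execution path,
        # so this sketch rejects them to avoid surprises.
        raise ValueError(f"expected an absolute path, got {snapshot_path!r}")
    # Escape backslashes and double quotes so the path stays one string literal
    # (assumption: backslash escapes are accepted in the quoted path).
    escaped = snapshot_path.replace("\\", "\\\\").replace('"', '\\"')
    query = f'RECOVER SNAPSHOT "{escaped}"'
    if force:
        # FORCE is needed when the instance is not freshly started.
        query += " FORCE"
    return query + ";"
```

For example, recover_snapshot_query("/path/to/snapshot", force=True) produces the FORCE form of the command shown above.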
In order to query the snapshots currently present in the local data directory, execute the query:
SHOW SNAPSHOTS;
Its results contain the path to the file, the logical timestamp, the physical timestamp and the file size.
As of Memgraph v3.5, the SHOW SNAPSHOTS query does not return information about the next scheduled snapshot.
A dedicated query has been added for that purpose:
SHOW NEXT SNAPSHOT;
If the periodic snapshot background job is active, the result contains the path and the time at which the next snapshot will be created.
If you are using a Memgraph version older than v2.22, follow these steps to restore data from a backup:
Empty the wal directory
If you want to restore data only from the snapshot file, ensure that the wal directory is empty:
- Find the container ID using the docker ps command, then enter the container using:
docker exec -it CONTAINER_ID bash
- Go to the /var/lib/memgraph/wal directory and run rm *
Stop the instance
Run the following command:
docker stop CONTAINER_ID
Start the instance
You can start the instance with the backed up files in two ways.
Option 1
You can start the instance by adding a -v ~/snapshots:/var/lib/memgraph/snapshots flag to the docker run command, where ~/snapshots is the path to the directory containing the backed-up snapshot, for example:
docker run -p 7687:7687 -p 7444:7444 -v ~/snapshots:/var/lib/memgraph/snapshots memgraph/memgraph
If you want to restore from both WAL and snapshot files, start the instance by adding the -v ~/snapshots:/var/lib/memgraph/snapshots -v ~/wal:/var/lib/memgraph/wal flags to the docker run command, where ~/snapshots is the path to the backed-up snapshot directory and ~/wal is the path to the backed-up wal directory, for example:
docker run -p 7687:7687 -p 7444:7444 -v ~/snapshots:/var/lib/memgraph/snapshots -v ~/wal:/var/lib/memgraph/wal memgraph/memgraph
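If the instance is started from a script, the same command can be assembled programmatically. This is a small sketch under stated assumptions: docker_run_command is a hypothetical helper name, and the ports, container paths and image name are copied from the commands above. Because subprocess does not go through a shell, the helper expands ~ itself.

```python
import os
from typing import List, Optional


def docker_run_command(snapshots_dir: str, wal_dir: Optional[str] = None) -> List[str]:
    """Build the docker run arguments that mount backed-up durability files."""
    command = ["docker", "run", "-p", "7687:7687", "-p", "7444:7444"]
    mounts = [(snapshots_dir, "/var/lib/memgraph/snapshots")]
    if wal_dir is not None:
        # Mount the WAL directory only when WAL files should be restored too.
        mounts.append((wal_dir, "/var/lib/memgraph/wal"))
    for host_dir, container_dir in mounts:
        # No shell is involved, so "~" has to be expanded here.
        host_path = os.path.expanduser(host_dir)
        command += ["-v", f"{host_path}:{container_dir}"]
    command.append("memgraph/memgraph")
    return command
```

The result can be passed to subprocess.run(docker_run_command("~/snapshots", "~/wal"), check=True).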
Option 2
The other option is to copy the backed-up snapshot file into the snapshots directory after creating the container, and then start the database. The commands look like this:
docker create -p 7687:7687 -p 7444:7444 -v snapshots:/var/lib/memgraph/snapshots --name memgraphDB memgraph/memgraph
tar -cf - sample_snapshot_file | docker cp -a - memgraphDB:/var/lib/memgraph/snapshots
The sample_snapshot_file is the snapshot file you want to use to restore the data. Because of how Docker handles file ownership, you need to use tar to copy the file through STDIN into the non-running container. This allows you to change the ownership of the file to the memgraph user inside the container.
After that, start the database with:
docker start -a memgraphDB
The -a flag attaches to the container’s output so you can see the logs.
Once Memgraph is started, change the snapshot directory ownership to the memgraph user by running the following command:
docker exec -it -u 0 memgraphDB bash -c "chown memgraph:memgraph /var/lib/memgraph/snapshots"
Otherwise, Memgraph will not be able to write the future snapshot files and will fail.
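The tar-over-STDIN step of Option 2 can also be done from Python. The sketch below is illustrative only: both function names are hypothetical, and the container name and target directory defaults are taken from the commands above. It builds the tar archive in memory and pipes it to docker cp, mirroring the tar -cf - ... | docker cp -a - ... pipeline. It does not change ownership, so the chown step above still applies.

```python
import io
import subprocess
import tarfile
from pathlib import Path


def snapshot_tar_bytes(snapshot_file: Path) -> bytes:
    """Pack one snapshot file into an in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # Store only the file name so it lands directly in the target directory.
        archive.add(str(snapshot_file), arcname=snapshot_file.name)
    return buffer.getvalue()


def copy_snapshot_into_container(
    snapshot_file: Path,
    container: str = "memgraphDB",
    target_dir: str = "/var/lib/memgraph/snapshots",
) -> None:
    """Pipe the archive to `docker cp -a -`, as in the shell pipeline above."""
    subprocess.run(
        ["docker", "cp", "-a", "-", f"{container}:{target_dir}"],
        input=snapshot_tar_bytes(snapshot_file),
        check=True,
    )
```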
Database dump
The database dump contains a record of the database state in the form of Cypher queries. It’s equivalent to the SQL dump in relational DBs. Database dump preserves nodes, relationships, indexes, constraints and triggers.
You can run the queries constituting the dump to recreate the state of the DB as it was at the time of the dump.
To dump the Memgraph DB, run the following query:
DUMP DATABASE;
If you are using Memgraph Lab, you can dump the database, that is, the queries to recreate it, to a CYPHERL file in the Import & Export section of the Lab.
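Outside of Memgraph Lab, the rows returned by DUMP DATABASE can be saved to a CYPHERL file with a few lines of code. The following sketch assumes each returned row has already been reduced to a single query string that fits on one line; write_cypherl is a hypothetical helper name, and fetching the rows from Memgraph is left to whatever client you use.

```python
from pathlib import Path
from typing import Iterable


def write_cypherl(queries: Iterable[str], output_file: Path) -> int:
    """Write dump queries to a CYPHERL file, one query per line."""
    count = 0
    with output_file.open("w", encoding="utf-8") as handle:
        for query in queries:
            statement = query.strip()
            if not statement:
                # Skip empty rows instead of writing blank lines.
                continue
            if not statement.endswith(";"):
                statement += ";"
            handle.write(statement + "\n")
            count += 1
    return count
```

The function returns the number of queries written, which gives a quick sanity check against the number of rows in the dump.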
Storage modes
Memgraph can work in the IN_MEMORY_ANALYTICAL, IN_MEMORY_TRANSACTIONAL or ON_DISK_TRANSACTIONAL storage mode.
Memgraph always starts in the IN_MEMORY_TRANSACTIONAL mode, in which it uses periodic snapshots and write-ahead logging as durability mechanisms and also allows creating manual snapshots.
In the IN_MEMORY_ANALYTICAL mode, Memgraph offers neither periodic snapshots nor write-ahead logging. Users can create a snapshot with the CREATE SNAPSHOT; Cypher query. During snapshot creation, other transactions are prevented from starting until the snapshot is complete.
In the ON_DISK_TRANSACTIONAL mode, durability is provided by RocksDB, since it keeps its own WAL files.
Memgraph persists the metadata used in the implementation of the on-disk storage.
Backup in multi-tenancy (Enterprise)
When running Memgraph with multi-tenancy, every database other than the default database (named memgraph) has its own associated database UUID. You can inspect the database UUID by running the SHOW STORAGE INFO command and reading the value under the database_uuid key.
The default data directory location for a specific database is /var/lib/memgraph/<database_uuid>/.
The default database memgraph does not follow this directory structure; its data files are located directly under /var/lib/memgraph.
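The directory rule above can be captured in a small helper, which is useful when a backup script has to locate the files of each tenant. This is a sketch based only on the layout described here; database_data_dir is a hypothetical name, and the UUID is expected to come from the database_uuid value of SHOW STORAGE INFO.

```python
from pathlib import PurePosixPath
from typing import Optional

DEFAULT_DATABASE = "memgraph"


def database_data_dir(
    database_name: str,
    database_uuid: Optional[str] = None,
    data_root: str = "/var/lib/memgraph",
) -> PurePosixPath:
    """Return the data directory for a database, following the layout above."""
    root = PurePosixPath(data_root)
    if database_name == DEFAULT_DATABASE:
        # The default database keeps its files directly under the data root.
        return root
    if not database_uuid:
        raise ValueError("a non-default database needs its database_uuid")
    return root / database_uuid
```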
The manual snapshot backup flow should look like this:
Create snapshots inside Memgraph
Create a snapshot for every database (or let Memgraph create them automatically through periodic snapshots).
Perform backup
Back up the snapshot of every database to a third-party location. Currently, you are encouraged to perform the backup yourself with tools such as rclone.
When performing recovery, copy the snapshot to Memgraph
When recovering a specific database, copy the snapshot to any data location:
- if the snapshot is copied to a location outside the database data directory, ensure the snapshot has the permissions needed for Memgraph to copy it into the database data directory
- if the snapshot is copied to a location inside the database data directory, there should be no permission issues, because nothing has to be copied from a source to a target directory
Position to specific database
Point the database driver interacting with Memgraph at the target database using USE DATABASE <database_name>
Recover the snapshot
Execute RECOVER SNAPSHOT <path_to_snapshot> FORCE
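The last two steps of the flow can be sketched as follows. As before, run_query is a hypothetical callable that sends one query over a single client session; both queries must go through the same session so that RECOVER SNAPSHOT runs against the selected database. The function name and the example values are illustrative, and the sketch does no escaping of the database name or path.

```python
from typing import Callable


def recover_tenant_snapshot(
    run_query: Callable[[str], object],
    database_name: str,
    snapshot_path: str,
) -> None:
    """Select a database, then recover it from the given snapshot."""
    # Position the session on the target database first.
    run_query(f"USE DATABASE {database_name};")
    # FORCE clears the existing data of that database before applying the snapshot.
    run_query(f'RECOVER SNAPSHOT "{snapshot_path}" FORCE;')
```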
Best practices
Memgraph can restore snapshots using multiple threads. To enable multi-threaded restoration of a snapshot, make sure the following flags are set:
--storage-parallel-schema-recovery=true
--storage-recovery-thread-count=<number_of_cores>
where number_of_cores is the number of CPU cores used to parallelize the restoration process.
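If the startup command is generated by a script, the two flags can be derived from the machine it runs on. This is a minimal sketch: parallel_recovery_flags is a hypothetical helper, and using every core reported by os.cpu_count() is an assumption that may not suit a host shared with other workloads.

```python
import os
from typing import List, Optional


def parallel_recovery_flags(thread_count: Optional[int] = None) -> List[str]:
    """Return the startup flags that enable multi-threaded snapshot recovery."""
    if thread_count is None:
        # os.cpu_count() can return None, so fall back to a single thread.
        thread_count = os.cpu_count() or 1
    if thread_count < 1:
        raise ValueError("thread_count must be at least 1")
    return [
        "--storage-parallel-schema-recovery=true",
        f"--storage-recovery-thread-count={thread_count}",
    ]
```

The returned list can be appended to the arguments used to start Memgraph.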