Manage Memgraph in Linux native environment
Learn the best practices for managing your Memgraph instance in a native Linux environment. This guide covers the most common tasks you might need to perform while configuring and running Memgraph natively on Linux.
Links in this guide lead to other documentation pages that cover each topic in more detail.
Install Memgraph on Linux
To install Memgraph on Linux, you need to download the appropriate package from the Memgraph download page. The installation process is straightforward, and it is covered in the Memgraph installation guide.
For a native Linux deployment, you can only use the base Memgraph package. Currently, no Memgraph MAGE package is available for Linux. To run Memgraph MAGE on Linux, you can use the Docker container.
Once Memgraph is installed, you can use systemd to manage the Memgraph service. Here is an example of how to start, stop, and restart the Memgraph service, and how to check its status:
sudo systemctl start memgraph
sudo systemctl stop memgraph
sudo systemctl restart memgraph
sudo systemctl status memgraph
Connect to Memgraph
By default, Memgraph listens on port 7687
for the Bolt protocol. That means you
can connect to Memgraph using any of the client libraries
or the mgconsole CLI tool.
Memgraph pre-installs the mgconsole
CLI tool, so you can connect to Memgraph
by running the following command:
mgconsole --host=127.0.0.1 --port=7687
You can use mgconsole in interactive or non-interactive mode to set up Memgraph and run setup scripts. To connect to Memgraph from a remote machine, change the --host parameter to the IP address of the machine running Memgraph.
The same goes for client libraries: you need to provide the proper IP address and port number. The client needs to be in the same network as the Memgraph instance, or the instance needs to be exposed to the internet.
If you are unable to connect to a remote Memgraph instance, keep in mind that your firewall environment may be blocking the connection to Memgraph.
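To tell a connectivity problem apart from a query problem, you can send a trivial query through mgconsole and look only at the exit status. The following is a minimal sketch, assuming that mgconsole exits with a non-zero status when it cannot connect. The can_connect function and the MGCONSOLE override variable are illustrative names, not part of Memgraph.

```shell
# Command used to reach Memgraph (hypothetical override variable).
MGCONSOLE="${MGCONSOLE:-mgconsole}"

# can_connect HOST [PORT]
# Sends a trivial query and reports whether the instance answered.
can_connect() {
    host="$1"
    port="${2:-7687}"   # 7687 is the default Bolt port
    if echo "RETURN 1;" | $MGCONSOLE --host="$host" --port="$port" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
        return 1
    fi
}

# Example usage:
#   can_connect 127.0.0.1
#   can_connect <REMOTE_IP_ADDRESS> 7687
```

If the check reports unreachable from a remote machine but reachable on the Memgraph host itself, the firewall or network configuration is the most likely cause.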
System configuration
Before running Memgraph, please check the system configuration guidelines, especially the
vm.max_map_count
parameter setting.
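A small check like the one below can be run before starting the service. It is only a sketch: the check_max_map_count function is a hypothetical helper, and the required value must be taken from the system configuration guidelines, since this guide does not state it.

```shell
# check_max_map_count CURRENT REQUIRED
# Succeeds when CURRENT is at least REQUIRED; both are plain integers.
check_max_map_count() {
    current="$1"
    required="$2"
    if [ "$current" -ge "$required" ]; then
        echo "ok: vm.max_map_count=$current"
    else
        echo "too low: vm.max_map_count=$current, need at least $required"
        return 1
    fi
}

# Example usage (replace <REQUIRED_VALUE> with the value from the guidelines):
#   check_max_map_count "$(sysctl -n vm.max_map_count)" <REQUIRED_VALUE>
# To raise the value until the next reboot:
#   sudo sysctl -w vm.max_map_count=<REQUIRED_VALUE>
```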
Configure Memgraph
Memgraph has its configuration settings persisted in the memgraph.conf
file
located in the /etc/memgraph
folder. Some of the most commonly configured
settings are:
--log-level=WARNING
--query-execution-timeout-sec=600
--storage-snapshot-interval-sec=300
--storage-snapshot-retention-count=3
--storage-mode=IN_MEMORY_TRANSACTIONAL
Of course, other settings are as important, so make sure you get familiar with them as well.
To change the configuration settings, you need to open and edit the
memgraph.conf
file and restart the Memgraph service.
Flags that are set explicitly override the default settings in the memgraph.conf file. You can also provide an additional configuration file by passing its path via --file-flag or the MEMGRAPH_CONFIG environment variable. Values set in that file override the values in the default memgraph.conf file.
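When a flag is set in both the default file and an additional file, it helps to see which value wins. The sketch below assumes both files use the one-flag-per-line --name=value form shown in the list above and treats the last file that sets a flag as the winner. The effective_flag function is a hypothetical helper and does not query the running instance.

```shell
# effective_flag NAME FILE...
# Prints the value of --NAME from the last readable file that sets it.
effective_flag() {
    name="$1"
    shift
    for file in "$@"; do
        # Skip files that are missing or unreadable.
        [ -r "$file" ] && cat "$file"
    done | grep -E "^--${name}=" | tail -n 1 | sed "s/^--${name}=//"
}

# Example usage:
#   effective_flag log-level /etc/memgraph/memgraph.conf "$MEMGRAPH_CONFIG"
```

Listing the default file first and the additional file last mirrors the override order described above.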
Secure the database
In the community version of Memgraph, you can create a user and its password for
security. To achieve that, set MEMGRAPH_USER
and MEMGRAPH_PASSWORD
environment variables. When you set values for those variables, a user will be
created if it does not exist, and no one except that user will be able to
access the database.
To authenticate, provide the correct username and password. For example, if you
want to run Memgraph’s CLI mgconsole
on localhost, run the following command:
mgconsole --host=127.0.0.1 --port=7687 --username=user --password=pass
In the Enterprise version of Memgraph, you can achieve a higher level of security with role-based access control (RBAC) and fine-grained access control. Within fine-grained access control, Memgraph offers label-based access control.
To set it up, first enable Memgraph Enterprise, and then run the necessary Cypher queries to set up privileges properly.
You can add an additional layer of security by enabling SSL encryption.
Data persistency
Data persistence is a crucial aspect of running a database. You want your data to survive if anything goes wrong. Memgraph stores different types of information in different places.
Here are the locations of the different types of data that Memgraph uses:
- Configuration files:
/etc/memgraph
- Logs:
/var/log/memgraph
- User-related data:
/usr/lib/memgraph
- Graph data:
/var/lib/memgraph
Depending on what you want to back up, consider copying any of the listed folders to a safe location.
Backup data
Memgraph uses snapshots and WAL files for durability. They enable the database to recover the graph in case of a crash. The snapshots and WAL files folders are good candidates for backup, and both are part of the /var/lib/memgraph folder.
If you want to restore the data, all you need to do is start a new Memgraph
instance with different data directories by using the --data-directory
flag
and pointing it to a backup folder.
Also, if you are using query modules and custom configuration files, make sure to back them up as well. The query modules are stored in the /usr/lib/memgraph folder.
Copy the data persistence folder
The easiest and most reliable way to back up the graph data is to copy the whole /var/lib/memgraph folder.
First, run the following query in Memgraph to lock the data directory so that no changes happen during the backup process:
LOCK DATA DIRECTORY;
After that, copy the data persistency folder onto your local file system:
cp -rp /var/lib/memgraph /path/to/my/local-folder
Notice that the -p
flag is used to preserve the file ownership and
permissions.
Finally, unlock the data directory by running the following query in Memgraph:
UNLOCK DATA DIRECTORY;
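The three steps above can be combined into one script. This is a sketch under a few assumptions: queries are piped to mgconsole the same way as elsewhere in this guide, the lock stays in place between separate mgconsole invocations, and the script runs with enough privileges to read the data directory. The function names and the MGCONSOLE override variable are illustrative.

```shell
# Command used to send queries (hypothetical override variable).
MGCONSOLE="${MGCONSOLE:-mgconsole}"

# run_query QUERY: pipes a single query into mgconsole.
run_query() {
    echo "$1" | $MGCONSOLE
}

# backup_data_directory SOURCE DESTINATION
# Locks the data directory, copies it, and unlocks it again.
backup_data_directory() {
    src="$1"
    dest="$2"
    run_query "LOCK DATA DIRECTORY;" || return 1
    status=0
    # -p preserves file ownership and permissions.
    cp -rp "$src" "$dest" || status=$?
    # Unlock even when the copy failed, so the instance is not left locked.
    run_query "UNLOCK DATA DIRECTORY;" || return 1
    return "$status"
}

# Example usage:
#   backup_data_directory /var/lib/memgraph /path/to/my/local-folder
```

Unlocking after a failed copy is a deliberate choice: a partial backup can be retried, while a data directory that stays locked needs manual attention.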
Dump database
Another approach is to dump the database into a file. This is useful if you want to move data between Memgraph versions that might use incompatible data formats for snapshots or WAL files.
To dump the database, you can use the mgconsole
CLI tool. Here is an example
of how to dump the database into a file:
echo "DUMP DATABASE;" | mgconsole --output-format=cypherl > data.cypherl
Keep in mind that the database is dumped in the cypherl format and that the entire graph is stored in the single data.cypherl file. As a result, the import process can take a while.
You should prefer copying the data persistency folder when possible over dumping the database since it is significantly faster.
Restore data
How you restore data depends on the backup method you used.
If you have copied the data persistency folder, you can simply copy the folder
back to the custom location and start the Memgraph instance with the
--data-directory
flag pointing to the restored folder.
The backup data folder should be owned by the Memgraph user, so make sure the folder is accessible for read and write operations by the Memgraph user.
Putting the backup folder back under the /var/lib/ path is recommended since it helps you avoid issues with ownership.
If you have dumped the database, you can import the data back into Memgraph using the mgconsole CLI tool or the client library. Make sure you read the best practices for such import.
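Importing a dump with mgconsole can be sketched as follows, assuming mgconsole executes the queries it reads from standard input, in the same way the dump example pipes a query into it. The import_cypherl function and the MGCONSOLE override variable are illustrative names.

```shell
# Command used to send queries (hypothetical override variable).
MGCONSOLE="${MGCONSOLE:-mgconsole}"

# import_cypherl FILE
# Feeds a dump produced by DUMP DATABASE back into Memgraph.
import_cypherl() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    $MGCONSOLE < "$file"
}

# Example usage:
#   import_cypherl data.cypherl
```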
Upgrading Memgraph
Keep in mind that between different versions of Memgraph, we might break the compatibility of the configurations, snapshots, and WAL files. This is done very rarely, but it is useful to keep an eye on the release notes to see if there are any breaking changes.
Set up a cluster
To create a cluster, replicate data across several instances. Setting up replication means running a couple of Memgraph instances on different machines and connecting them to each other. One of the instances will be the MAIN instance and others will be either SYNC or ASYNC replicas.
To set up the replication cluster, start Memgraph instances on different machines with the --replication-restore-state-on-startup flag set to true.
All instances are MAIN when they start. Because only one instance can be MAIN, in a three-instance cluster the other two must be demoted to the REPLICA role. To do that, run the following query on the second and third instances (REPLICA instances):
SET REPLICATION ROLE TO REPLICA WITH PORT 10000;
Once the replica instances are in the REPLICA role, you can connect them to the MAIN instance by running the following queries on the MAIN instance:
REGISTER REPLICA REP1 SYNC TO "<IP_ADDRESS_REP1>";
REGISTER REPLICA REP2 ASYNC TO "<IP_ADDRESS_REP2>";
If you have trouble connecting, check your firewall and network settings.
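If you register replicas from a provisioning script, a small helper can build the query and reject anything other than the two replication modes used above. This is a sketch: register_replica_query is a hypothetical helper that only prints the query, and sending it to the MAIN instance by piping it into mgconsole is an assumption.

```shell
# register_replica_query NAME MODE ADDRESS
# Prints a REGISTER REPLICA query; MODE must be SYNC or ASYNC.
register_replica_query() {
    name="$1"
    mode="$2"
    address="$3"
    case "$mode" in
        SYNC|ASYNC) ;;
        *)
            echo "mode must be SYNC or ASYNC" >&2
            return 1
            ;;
    esac
    printf 'REGISTER REPLICA %s %s TO "%s";\n' "$name" "$mode" "$address"
}

# Example usage, run against the MAIN instance:
#   register_replica_query REP1 SYNC "<IP_ADDRESS_REP1>" | mgconsole
#   register_replica_query REP2 ASYNC "<IP_ADDRESS_REP2>" | mgconsole
# Afterwards, list the registered replicas:
#   echo "SHOW REPLICAS;" | mgconsole
```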
That’s it. A replication cluster with one MAIN, one SYNC REPLICA, and one ASYNC REPLICA instance is now set up. To learn more about replication in Memgraph, refer to our replication docs.
Setting up the replication cluster is great if you need to replicate data, add load balancing, or improve availability. Still, to achieve high availability, you need to manage automatic failover yourself. Memgraph Enterprise includes a high availability feature that eases the management of the Memgraph cluster. In that case, the cluster consists of MAIN instances, REPLICA instances, and COORDINATOR instances, which manage the cluster state using the Raft protocol.
Logging
At any point in your Memgraph instance lifecycle, you might need to check the logs, either to debug an issue or to monitor performance. Memgraph logs are stored in the /var/log/memgraph folder unless the default location is changed with the --log-file flag. Log files are typically named in the format memgraph_year-month-day.log.
You can control the log levels as described in the logs
configuration page. If you are setting up the
production environment, you should consider setting the log level to INFO
or
WARNING
to avoid the log files growing too large.
If you are experiencing some issues or you have trouble setting up the Memgraph
instance, consider setting the log level to TRACE
to get more information
about the issue.
The best way to monitor logs is to attach the logs directly to your terminal as you debug the issue. You can do that by running the following command:
tail -f /var/log/memgraph/memgraph_$(date +"%Y-%m-%d").log
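The command above expects a log file for today's date to exist. As an alternative, the sketch below picks the newest file by name, relying on the memgraph_year-month-day.log format, in which alphabetical order matches date order. The latest_log function is a hypothetical helper.

```shell
# latest_log [DIRECTORY]
# Prints the path of the most recent Memgraph log file, or nothing if none exist.
latest_log() {
    dir="${1:-/var/log/memgraph}"
    # Names embed the date as year-month-day, so the last name is the newest.
    ls "$dir"/memgraph_*.log 2>/dev/null | sort | tail -n 1
}

# Example usage:
#   tail -f "$(latest_log)"
```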
Where to next?
Each topic covered in this guide has its own dedicated documentation page that covers it in more detail. Consider reading those pages to get a better understanding of the topic. If things are unclear or you need help with Linux and similar topics, feel free to reach out to us by joining our Discord community.