Deploy Memgraph on Azure Virtual Machine (VM)
This guide will show you how to deploy Memgraph on an Azure Virtual Machine. The guide will cover only the specific bits that are different from the general deployment guide you can find for native Linux or Docker deployments.
This guide assumes you have an Azure account and are familiar with the Azure portal. If you are not, follow the Azure Virtual Machine quickstart documentation first.
Creating a new Azure VM
The first step is to have a VM running on Azure where you will deploy Memgraph. Since Memgraph works well with Linux distributions, Linux-based VMs are recommended. If you need a Windows-based VM, you can run Memgraph with Docker on a Windows VM.
Below are some guidelines to consider when creating a new Azure VM for Memgraph.
Picking the OS
During VM creation, you must pick the OS for the VM. If you install Memgraph natively, you must pick a Linux distribution that Memgraph supports.
Memgraph supports multiple Linux distributions. Packages for the supported distributions and versions can be downloaded from the Memgraph download page.
Running Memgraph natively brings some speed improvements compared to the Docker version of Memgraph. However, deploying Memgraph with Docker is more straightforward because the image comes with built-in Memgraph MAGE algorithms. Memgraph MAGE contains graph algorithms and utility modules written in C++, Python and Rust. If you decide to run Memgraph natively, you need to build MAGE from source, which requires manual work. For native deployment, check the guide on how to build Memgraph MAGE algorithms from source. If you are trying out Memgraph for the first time and running your own benchmarks against it, Docker is the recommended way to run Memgraph because it is the fastest way to get started.
If you are going to use the Memgraph Docker image, pick the Linux distribution you are most familiar with.
Memgraph in Docker can be deployed on both x86 and ARM architectures. All native distributions work on x86. Some native distributions also work on ARM (Debian, Ubuntu), and some do not (CentOS, Fedora). Check the direct download links for detailed information.
Picking the VM type
When creating the VM, you need to pick the VM type. If you run Memgraph in IN_MEMORY_TRANSACTIONAL (default mode) or in IN_MEMORY_ANALYTICAL storage mode, all data is stored in RAM, so first calculate how much memory you will need. There is also an easy-to-use calculator for approximating memory usage. A good rule of thumb is for your server to have double the memory that your storage was calculated to need. You can go with less if you do not have a demanding workload. An example of a demanding workload is a query that traverses half of the graph and returns half of the graph.
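The rule of thumb above can be sketched as a small calculation. This is an illustrative Python helper, not part of Memgraph: the function name `recommended_ram_gb` and the `headroom_factor` parameter are hypothetical, and the input is whatever storage estimate you obtained from your own calculation.

```python
def recommended_ram_gb(estimated_storage_gb: float, headroom_factor: float = 2.0) -> float:
    """Return the RAM (in GB) to provision for an in-memory storage mode.

    estimated_storage_gb: your own estimate of how much memory the graph needs.
    headroom_factor: 2.0 mirrors the "double the memory" rule of thumb;
                     a lighter workload may justify a smaller (assumed) factor.
    """
    if estimated_storage_gb < 0:
        raise ValueError("estimated_storage_gb must be non-negative")
    if headroom_factor < 1.0:
        raise ValueError("headroom_factor below 1.0 leaves no room for the data itself")
    return estimated_storage_gb * headroom_factor


if __name__ == "__main__":
    # A graph estimated at 48 GB calls for roughly 96 GB of RAM.
    print(recommended_ram_gb(48))  # prints 96.0
```

The result is only a starting point for choosing an instance size; the actual requirement depends on your queries.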
Azure has specialized instances for memory-optimized workloads which are designed for in-memory databases, data analytics, and other memory-intensive applications. They provide a good cost per GB of RAM and performance for Memgraph. Once you know how much memory you need, you can pick the instance from the memory-optimized instance pools based on your data needs and budget.
Instances vary in supported architectures, number of CPU cores, network bandwidth, block storage, etc. All hardware specs typically scale with the instance size. Memgraph is not demanding on the rest of the hardware specs as long as there is sufficient RAM. The basic E instance family is a good starting point for Memgraph.
If you are running Memgraph in ON_DISK_TRANSACTIONAL storage mode, you need to consider the instances optimized for storage.
System configuration
Before running Memgraph, please check the system configuration guidelines, especially the vm.max_map_count parameter setting.
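One way to verify this setting on the VM is to read it from the Linux `/proc` filesystem. The following is a minimal sketch in Python. The helper names are hypothetical, and the `REQUIRED` value is only a placeholder: take the actual value for your VM from the system configuration guidelines.

```python
from pathlib import Path
from typing import Optional

# Standard Linux location of the current vm.max_map_count value.
MAX_MAP_COUNT_PATH = Path("/proc/sys/vm/max_map_count")


def read_max_map_count(path: Path = MAX_MAP_COUNT_PATH) -> Optional[int]:
    """Return the current vm.max_map_count, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def is_max_map_count_sufficient(current: Optional[int], required: int) -> bool:
    """True when the value was readable and is at least `required`."""
    return current is not None and current >= required


if __name__ == "__main__":
    REQUIRED = 262144  # placeholder (assumption), not an official recommendation
    current = read_max_map_count()
    if is_max_map_count_sufficient(current, REQUIRED):
        print(f"vm.max_map_count={current} meets the assumed minimum of {REQUIRED}")
    else:
        print(f"vm.max_map_count={current} is below the assumed minimum of {REQUIRED}")
```

The script only reads the value. Changing it is done with `sysctl` as described in the system configuration guidelines.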
Network Setup
When creating the VM, you need to set up network access and inbound port rules. By default, Memgraph uses port 7687 for the Bolt protocol. You need to open this port for TCP traffic to allow connections to Memgraph.
If you change the default Bolt port, update the inbound port rules accordingly.
If you deploy Memgraph for Replication or High Availability, the ports for replication and cluster management must also be open. In a replication configuration, each instance needs port 10000 open for replication. In a high availability configuration, each data instance needs port 10000 open for management and port 20000 open for replication. Each coordinator instance needs port 12000 open.
All instances need to have open port 7687 for the Bolt protocol.
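The port requirements above can be summarized as a lookup table, which may help when scripting inbound rules. This is only a sketch: the role names and the `inbound_tcp_ports` helper are hypothetical, and the port numbers are the ones listed in this section, so adjust them if your deployment configures different ports.

```python
from typing import Dict, List

# Ports per instance role, as listed in this guide (role names are illustrative).
ROLE_PORTS: Dict[str, Dict[str, int]] = {
    "standalone": {"bolt": 7687},
    "replication": {"bolt": 7687, "replication": 10000},
    "ha_data": {"bolt": 7687, "management": 10000, "replication": 20000},
    "ha_coordinator": {"bolt": 7687, "coordinator": 12000},
}


def inbound_tcp_ports(role: str, bolt_port: int = 7687) -> List[int]:
    """Return the sorted TCP ports to open for an instance with the given role.

    bolt_port overrides the default Bolt port if you changed it.
    """
    if role not in ROLE_PORTS:
        raise KeyError(f"unknown role: {role!r}")
    ports = dict(ROLE_PORTS[role])
    ports["bolt"] = bolt_port
    return sorted(set(ports.values()))


if __name__ == "__main__":
    for role in ROLE_PORTS:
        print(role, inbound_tcp_ports(role))
```

Each returned port corresponds to one inbound TCP rule on the VM's network security group.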
Set up the storage
When creating the VM, you need to set up the disk storage. By default, Memgraph stores all data in RAM, but to persist data between restarts it uses disk storage for snapshots, configuration, etc.
It is recommended to use the Premium SSD for storage. Faster storage (SSDs) can lead to speedier snapshot creation and recovery times, which can be important on bigger scales (billion-sized graphs). Still, it is not critical for the Memgraph operating performance. Magnetic storage can also be used on smaller scales. For an overview of the available disk types, see https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types
The storage size depends on the amount of data you are going to store in your Memgraph instance and the number of active snapshots you want to keep alive. Memgraph will periodically create snapshots of the data and store them on disk. As the new snapshot is created, the oldest one is deleted. The number of snapshots you want to keep alive is configurable in the Memgraph configuration file.
The recommendation is to provision disk storage equal to double the size of the data you will store in Memgraph. If you are going to store 100GB of data in Memgraph, you should have at least 200GB of disk storage.
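To check an attached disk against this recommendation, you could compare its total capacity with twice the expected data size. The sketch below uses Python's standard `shutil.disk_usage`. The helper names are hypothetical, and the path you check should be wherever your Memgraph data directory is mounted (the demo uses the current directory only as a stand-in).

```python
import shutil

BYTES_PER_GIB = 1024 ** 3


def required_disk_gb(data_gb: float, factor: float = 2.0) -> float:
    """Disk size to provision: `factor` times the data size (2x per this guide)."""
    return data_gb * factor


def has_enough_disk(path: str, data_gb: float, factor: float = 2.0) -> bool:
    """True if the filesystem holding `path` is at least the recommended size.

    Sizes are treated as GiB, which is how Azure managed disks are sized.
    """
    total_bytes = shutil.disk_usage(path).total
    return total_bytes >= required_disk_gb(data_gb, factor) * BYTES_PER_GIB


if __name__ == "__main__":
    print(required_disk_gb(100))  # prints 200.0, matching the example above
    print(has_enough_disk(".", data_gb=100))
```

If you keep more snapshots than the default, consider raising `factor` accordingly; the right value depends on your snapshot settings.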
Installing Memgraph
Depending on the way you want to deploy Memgraph, native or via Docker, you need to follow the steps below:
If you will use the Memgraph Docker image, you need to have Docker installed on your Azure VM.
After Docker is installed, you can pull the Memgraph image and run it:
docker run -p 7687:7687 memgraph/memgraph:latest
This will run Memgraph on the default port 7687. You should be able to connect to it via Memgraph Lab or via client libraries using the Bolt protocol. If you experience issues connecting to Memgraph remotely, make sure that port 7687 is open in the inbound rules of your Azure VM.
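A quick way to tell a closed port apart from a client-side problem is to test plain TCP reachability from your own machine. The following is a minimal sketch using Python's standard `socket` module. The function name `is_port_open` is hypothetical, and a successful result only shows that the port accepts TCP connections, not that Bolt authentication or queries will succeed.

```python
import socket
import sys


def is_port_open(host: str, port: int = 7687, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python check_bolt_port.py <vm-public-ip-or-hostname>")
    else:
        host = sys.argv[1]
        state = "reachable" if is_port_open(host) else "NOT reachable"
        print(f"{host}:7687 is {state}")
```

If the port is not reachable while the container is running, the inbound port rule on the VM is the first thing to check.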
For more information on how to install Memgraph via Docker, you can follow the getting started guide.
Manage Memgraph deployment
Once Memgraph is installed and running on your Azure VM, managing it is identical to managing a native Linux installation or a Docker container, as described in the general guidelines.
Depending on which one you are using, follow the Linux or Docker deployment guide for more information on how to manage the Memgraph deployment.
Where to next?
Memgraph also supports deployment in the Kubernetes environment. If you are interested in deploying Memgraph in Kubernetes, you can follow the Memgraph Kubernetes installation guide.
To discuss Azure deployment and similar topics, join our Discord community.
Schedule a 30-min session with our engineers to discuss how Memgraph fits with your architecture. Our engineers are highly experienced in helping companies of all sizes to integrate and get the most out of Memgraph in their projects. Talk to us about data modeling, optimizing queries, defining infrastructure requirements or migrating from your existing graph database. No nonsense or sales pitch, just tech.