Kubernetes

Memgraph can be deployed on Kubernetes. The easiest way to do that is with Helm, the package manager for Kubernetes. Helm uses a packaging format called charts. A chart is a collection of files that describe a related set of Kubernetes resources.

Currently, we have prepared and released the following charts, all covered on this page:

  • Memgraph standalone
  • Memgraph high availability (HA)
  • Memgraph Lab

The Helm charts are published on Artifact Hub. For details on the implementation of the Helm charts, check Memgraph Helm charts repository.

Due to numerous possible use cases and deployment setups via Kubernetes, the provided Helm charts are a starting point you can modify according to your needs. This page will highlight some of the specific parts of the Helm charts that you might want to adjust.

Memgraph standalone Helm chart

Memgraph is a stateful application (database), hence the Helm chart for standalone Memgraph is configured to deploy Memgraph as a Kubernetes StatefulSet workload.

It will deploy a single Memgraph instance in a single pod.

Typically, when deploying a stateful application like Memgraph, a StatefulSet workload is used to ensure that each pod has a unique identity and a stable network identity. When deploying Memgraph, it is also necessary to define a PersistentVolumeClaim to store the data directory (/var/lib/memgraph). This enables the data to persist even if the pod is restarted or deleted.

Storage configuration

By default, the Helm chart will create a PersistentVolumeClaim (PVC) for storage and logs. If no storage class is defined for the PVC, the default storage class available in the cluster is used. The storage class can be configured in the values.yaml file. To avoid losing your data, make sure the storage class uses the Retain reclaim policy. If you delete a PersistentVolumeClaim whose storage class does not use the Retain reclaim policy, the underlying PersistentVolume will be deleted and your data will be lost.

An example of a storage class for AWS EBS volumes:

storageClass:
  name: "gp2"
  provisioner: "kubernetes.io/aws-ebs"
  storageType: "gp2"
  fsType: "ext4"
  reclaimPolicy: "Retain"
  volumeBindingMode: "Immediate"

The default template for a storage class is part of the Helm chart and can be found in the repository.

More details on the configuration options can be found in the configuration section.

Secrets

The Helm chart allows you to use Kubernetes secrets to store Memgraph credentials. By default, the secrets are disabled. If you want to use secrets, you can enable them in the values.yaml file.

The secrets are wired to the MEMGRAPH_USER and MEMGRAPH_PASSWORD environment variables.
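
As a minimal sketch, assuming the default secret name and keys from the configuration table below (memgraph-secrets, USER and PASSWORD), you could first create the secret (the username and password values here are placeholders):

kubectl create secret generic memgraph-secrets \
  --from-literal=USER=memgraph \
  --from-literal=PASSWORD=<your-password>

and then enable it in your values.yaml:

secrets:
  enabled: true
  name: memgraph-secrets
  userKey: USER
  passwordKey: PASSWORD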

System configuration

The Helm chart will set the Linux kernel vm.max_map_count parameter to 262144 by default to ensure Memgraph won’t run into issues with memory mapping.

The vm.max_map_count parameter is a kernel parameter that specifies the maximum number of memory map areas a process may have. This change will be applied to all nodes in the cluster. If you want to disable this feature, you can set sysctlInitContainer.enabled to false in the values.yaml file.
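
For reference, this behavior maps to the following values.yaml fragment, shown here with the chart’s defaults from the configuration table below:

sysctlInitContainer:
  enabled: true
  maxMapCount: 262144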

Installing Memgraph standalone Helm chart

To include a standalone Memgraph instance in your Kubernetes cluster, you need to add the repository and install Memgraph.

The steps below will work in the Minikube environment, but you can also use them in other Kubernetes environments with minor adjustments.

Add the repository

Add the Memgraph Helm chart repository to your local Helm setup by running the following command:

helm repo add memgraph https://memgraph.github.io/helm-charts

Make sure to update the repository to fetch the latest Helm charts available:

helm repo update

Install Memgraph

To install the Memgraph Helm chart, run the following command:

helm install <release-name> memgraph/memgraph

Replace <release-name> with the name of the release you chose.

Access Memgraph

Once Memgraph is installed, you can access it using the provided services and endpoints, for example through client libraries, the command-line interface mgconsole, or the visual user interface Memgraph Lab.
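
For a quick local check, one option is to port-forward the Bolt port and connect with mgconsole. The service name below is an assumption (it typically matches the release name); verify it with kubectl get svc:

kubectl port-forward svc/<release-name> 7687:7687
mgconsole --host 127.0.0.1 --port 7687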

Configuration options

The following table lists the configurable parameters of the Memgraph chart and their default values.

| Parameter | Description | Default |
| --- | --- | --- |
| image.repository | Memgraph Docker image repository | memgraph/memgraph |
| image.tag | Specific tag for the Memgraph Docker image. Overrides the image tag whose default is the chart version. | "" (defaults to chart’s app version) |
| image.pullPolicy | Image pull policy | IfNotPresent |
| useImagePullSecrets | Override the default imagePullSecrets | false |
| imagePullSecrets | Specify image pull secrets | - name: regcred |
| replicaCount | Number of Memgraph instances to run. Note: no replication or HA support. | 1 |
| affinity.nodeKey | Key for node affinity (Preferred) | "" |
| affinity.nodeValue | Value for node affinity (Preferred) | "" |
| nodeSelector | Constrain which nodes your Memgraph pod is eligible to be scheduled on, based on the labels on the nodes. Left empty by default. | {} |
| service.type | Kubernetes service type | ClusterIP |
| service.enableBolt | Enable Bolt protocol | true |
| service.boltPort | Bolt protocol port | 7687 |
| service.boltProtocol | Protocol used by Bolt | TCP |
| service.enableWebsocketMonitoring | Enable WebSocket monitoring | false |
| service.websocketPortMonitoring | WebSocket monitoring port | 7444 |
| service.websocketPortMonitoringProtocol | Protocol used by WebSocket monitoring | TCP |
| service.enableHttpMonitoring | Enable HTTP monitoring | false |
| service.httpPortMonitoring | HTTP monitoring port | 9091 |
| service.httpPortMonitoringProtocol | Protocol used by HTTP monitoring | http |
| service.annotations | Annotations to add to the service | {} |
| persistentVolumeClaim.createStorageClaim | Enable creation of a Persistent Volume Claim for storage | true |
| persistentVolumeClaim.storageClassName | Storage class name for the persistent volume claim | "" |
| persistentVolumeClaim.storageSize | Size of the persistent volume claim for storage | 10Gi |
| persistentVolumeClaim.existingClaim | Use an existing Persistent Volume Claim | memgraph-0 |
| persistentVolumeClaim.storageVolumeName | Name of an existing Volume to create a PVC for | "" |
| persistentVolumeClaim.createLogStorage | Enable creation of a Persistent Volume Claim for logs | true |
| persistentVolumeClaim.logStorageClassName | Storage class name for the persistent volume claim for logs | "" |
| persistentVolumeClaim.logStorageSize | Size of the persistent volume claim for logs | 1Gi |
| memgraphConfig | List of strings defining Memgraph configuration settings | ["--also-log-to-stderr=true"] |
| secrets.enabled | Enable the use of Kubernetes secrets for Memgraph credentials | false |
| secrets.name | The name of the Kubernetes secret containing Memgraph credentials | memgraph-secrets |
| secrets.userKey | The key in the Kubernetes secret for the Memgraph user; the value is passed to the MEMGRAPH_USER env | USER |
| secrets.passwordKey | The key in the Kubernetes secret for the Memgraph password; the value is passed to the MEMGRAPH_PASSWORD env | PASSWORD |
| memgraphEnterpriseLicense | Memgraph Enterprise License | "" |
| memgraphOrganizationName | Organization name for Memgraph Enterprise License | "" |
| statefulSetAnnotations | Annotations to add to the stateful set | {} |
| podAnnotations | Annotations to add to the pod | {} |
| resources | CPU/Memory resource requests/limits. Left empty by default. | {} |
| tolerations | A toleration is applied to a pod and allows the pod to be scheduled on nodes with matching taints. Left empty by default. | [] |
| serviceAccount.create | Specifies whether a service account should be created | true |
| serviceAccount.annotations | Annotations to add to the service account | {} |
| serviceAccount.name | The name of the service account to use. If not set and create is true, a name is generated. | "" |
| container.terminationGracePeriodSeconds | Grace period for pod termination | 1800 |
| probes.liveliness.initialDelaySeconds | Initial delay for liveliness probe | 10 |
| probes.liveliness.periodSeconds | Period seconds for liveliness probe | 60 |
| probes.liveliness.failureThreshold | Failure threshold for liveliness probe | 3 |
| probes.readiness.initialDelaySeconds | Initial delay for readiness probe | 10 |
| probes.readiness.periodSeconds | Period seconds for readiness probe | 30 |
| probes.readiness.failureThreshold | Failure threshold for readiness probe | 3 |
| probes.startup.initialDelaySeconds | Initial delay for startup probe | 10 |
| probes.startup.periodSeconds | Period seconds for startup probe | 10 |
| probes.startup.failureThreshold | Failure threshold for startup probe | 30 |
| nodeSelectors | Node selectors for the pod. Left empty by default. | {} |
| customQueryModules | List of custom query modules that should be mounted to the Memgraph pod | [] |
| sysctlInitContainer.enabled | Enable the init container to set sysctl parameters | true |
| sysctlInitContainer.maxMapCount | Value for vm.max_map_count to be set by the init container | 262144 |
| storageClass.name | Name of the StorageClass | "memgraph-generic-storage-class" |
| storageClass.provisioner | Provisioner for the StorageClass | "" |
| storageClass.storageType | Type of storage for the StorageClass | "" |
| storageClass.fsType | Filesystem type for the StorageClass | "" |
| storageClass.reclaimPolicy | Reclaim policy for the StorageClass | Retain |
| storageClass.volumeBindingMode | Volume binding mode for the StorageClass | Immediate |

To change the default chart values, provide your own values.yaml file during the installation:

helm install <resource-name> memgraph/memgraph -f values.yaml

Default chart values can also be changed by setting the values of appropriate parameters:

helm install <resource-name> memgraph/memgraph --set <flag1>=<value1>,<flag2>=<value2>,...

Memgraph starts with the --also-log-to-stderr=true flag, meaning the logs are also written to the standard error output, so you can access them using the kubectl logs command. To modify other Memgraph database settings, update the memgraphConfig parameter. It should be a list of strings defining the values of Memgraph configuration settings. For example, this is how you can define the memgraphConfig parameter in your values.yaml:

memgraphConfig: 
  - "--also-log-to-stderr=true"
  - "--log-level=TRACE"

For all available database settings, refer to the Configuration settings reference guide.

Memgraph high availability Helm chart

A Helm chart for deploying Memgraph in a high availability (HA) setup. This Helm chart requires a Memgraph Enterprise license.

By default, the Memgraph HA cluster includes 3 coordinators and 2 data instances. Since multiple Memgraph instances are used, it is advised to use multiple worker nodes in Kubernetes, ideally so that each Memgraph instance runs on its own node. The size of the nodes on which data pods reside depends on the computing power and memory you need for your data. Coordinator nodes can be smaller; machines that meet basic requirements (8-16 GB of RAM) are enough.

Installing the Memgraph HA Helm chart

To include a Memgraph HA cluster as part of your Kubernetes cluster, you need to add the repository and install Memgraph.

Add the repository

Add the Memgraph Helm chart repository to your local Helm setup by running the following command:

helm repo add memgraph https://memgraph.github.io/helm-charts

Make sure to update the repository to fetch the latest Helm charts available:

helm repo update

Install Memgraph HA

Since Memgraph HA requires an Enterprise license, you need to provide the license and organization name during the installation.

helm install <release-name> memgraph/memgraph-high-availability --set env.MEMGRAPH_ENTERPRISE_LICENSE=<your-license>,env.MEMGRAPH_ORGANIZATION_NAME=<your-organization-name>

Replace <release-name> with a name of your choice for the release and set the Enterprise license.

Changing the default chart values

To change the default chart values, run the command with the specified set of flags:

helm install <resource-name> memgraph/memgraph-high-availability --set <flag1>=<value1>,<flag2>=<value2>,...

Or you can modify a values.yaml file and override the desired values:

helm install <resource-name> memgraph/memgraph-high-availability -f values.yaml

Upgrade Helm chart

To upgrade the Helm chart, you can use:

helm upgrade <release-name> memgraph/memgraph-high-availability --set <flag1>=<value1>,<flag2>=<value2>

Again, it is possible to use both --set and a values.yaml file to set configuration options.
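
For example, the values.yaml variant of the upgrade looks like this:

helm upgrade <release-name> memgraph/memgraph-high-availability -f values.yaml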

Uninstall Helm chart

Uninstallation is done with:

helm uninstall <release-name>

Uninstalling the chart won’t trigger the deletion of persistent volume claims (PVCs), which means that even if the reclaim policy of the default storage class is set to ‘Delete’ (as it often is), the data on the persistent volumes (PVs) won’t be lost. However, we advise users to set the reclaim policy to ‘Retain’, as suggested in the Storage chapter.
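
One way to check which reclaim policy your volumes currently use:

kubectl get pv -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy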

Security context

All instances are started as a StatefulSet with one pod each. The pod has two or three containers, depending on whether sysctlInitContainer.enabled is set. The init container is used to set permissions on volume mounts; it runs as the root user with the CHOWN capability and without privileged access. The memgraph-coordinator container is the one that actually runs the Memgraph image. Its process runs as the non-root memgraph user without any Linux capabilities, and privilege escalation is disabled.

High availability storage

The Memgraph HA chart always uses persistent volume claims (PVCs) for storing data and logs. By default, they request 1Gi, so you will most likely need to set this parameter to a higher value. The access mode defaults to ReadWriteOnce (RWO) and can be changed to one of: ReadWriteOnce, ReadOnlyMany, ReadWriteMany or ReadWriteOncePod. All PVCs use the default storage class unless you set the libStorageClassName and logStorageClassName parameters, which specify the storage classes used for storing data and logs on data and coordinator instances. Default storage classes usually have a reclaim policy set to Delete, which means that if you delete the PVCs, the attached PVs will also get deleted. We advise users to set the reclaim policy to Retain. One of the ways to achieve this is to run the following script after starting your cluster:

#!/bin/bash
 
# Get all Persistent Volume names
PVS=$(kubectl get pv --no-headers -o custom-columns=":metadata.name")
 
# Loop through each PV and patch it
for pv in $PVS; do
  echo "Patching PV: $pv"
  kubectl patch pv $pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
done
 
echo "All PVs have been patched."

Network configuration

All instances that are part of the Memgraph HA cluster use the internal ClusterIP network to communicate among themselves. By default, the management port is opened on 10000 on all instances, the replication port on data instances is opened on 20000, and the coordinator port on coordinators is opened on 12000. You can change this configuration by specifying ports.managementPort, ports.replicationPort and ports.coordinatorPort.
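
These defaults correspond to the following values.yaml fragment:

ports:
  managementPort: 10000
  replicationPort: 20000
  coordinatorPort: 12000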

External network configuration

Since Memgraph HA relies on client-side routing, the resolution of all DNS entries happens outside the internal ClusterIP network. Because of that, one more type of network is needed for clients accessing instances inside the cluster. The following options are available:

  • IngressNginx: Using only one load balancer, you can access all instances from outside the cluster.
  • NodePort: The cheapest option because it opens a port on each node. For this to work, you need to enable public IPs on the nodes.
  • LoadBalancer: This option uses one load balancer for each instance in the cluster. A load balancer is mostly interesting here because it comes with a public IP out of the box.

For coordinators, there is an additional option: CommonLoadBalancer. In this scenario, a single load balancer sits in front of all coordinators. Compared to the LoadBalancer option, this saves the cost of two load balancers, since you usually don’t need to address a specific coordinator when using Memgraph. The default Bolt port is opened on 7687, but you can change it by setting ports.boltPort.
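
As a sketch, exposing the coordinators behind a single load balancer and the data instances behind per-instance load balancers could look like the flags below. The parameter paths follow the configuration table on this page (externalAccess.*); note that the IngressNginx example later on this page uses externalAccessConfig.*, so verify the exact path against your chart version:

helm install <release-name> memgraph/memgraph-high-availability --set \
env.MEMGRAPH_ENTERPRISE_LICENSE=<license>,env.MEMGRAPH_ORGANIZATION_NAME=<organization>,\
externalAccess.coordinator.serviceType=CommonLoadBalancer,externalAccess.dataInstance.serviceType=LoadBalancer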

Read more in the guide for installing Memgraph HA with IngressNginx.

Node affinity

Since Memgraph HA deploys multiple pods, you can control how the pods are distributed across the nodes in the cluster.

The Memgraph HA Helm chart provides the following node affinity options:

  • default: The default affinity tries to schedule data pods and coordinator pods on nodes where no other pod with the same role is running. If there is no such node, the pods will still be scheduled on the same node, and the deployment will not fail.
  • unique: Enabled by setting affinity.unique to true. This option tries to deploy the data pods and coordinator pods on different nodes in the cluster so that each pod is on a unique node. If there are not enough nodes, the deployment will fail.
  • parity: Enabled by setting affinity.parity to true. This option tries to deploy the data pods and coordinator pods with at most one coordinator and one data pod per node. If there are not enough nodes, the deployment will fail. Coordinators are scheduled first; after that, data pods look for nodes that already host a coordinator.
  • nodeSelection: Enabled by setting affinity.nodeSelection to true. This option tries to deploy the data pods and coordinator pods on nodes with specific labels. You can set the labels with the affinity.dataNodeLabelValue and affinity.coordinatorNodeLabelValue parameters. If all labeled nodes are already occupied by pods with the same role, the deployment will fail.

When using nodeSelection affinity, make sure the nodes are properly labeled. The default key for the role label is role, and the default values are data-node and coordinator-node. Labels can be added to nodes using the kubectl label nodes <node-name> <key>=<value> command, as shown below. Here is an example of how to deploy a Memgraph HA cluster in AKS.
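
For example, with two hypothetical worker nodes named worker-1 and worker-2 and the default label key and values:

kubectl label nodes worker-1 role=data-node
kubectl label nodes worker-2 role=coordinator-node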

Sysctl options

You can use the sysctlInitContainer configuration parameter to increase vm.max_map_count, which is necessary for high memory loads in Memgraph.

Authentication

By default, there is no user or password configured on the Memgraph instances. You can use the secrets configuration parameters to create a user with a password.
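
These are the same secrets.* parameters as in the standalone chart. Assuming a secret named memgraph-secrets already exists, you could enable it at install time by adding the following flags next to the license flags shown earlier:

--set secrets.enabled=true,secrets.name=memgraph-secrets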

Setting up the cluster

Although there are many configuration options you can use to set up an HA cluster (especially for networking), the setup process is always the same:

1. Provision the cluster

2. Label nodes based on the affinity strategy

3. Install memgraph/memgraph-high-availability chart using helm install

4. Install any other auxiliary resources, e.g., ingress-nginx if you are using that type of external network configuration.

5. Connect instances to form the cluster

The last step, connecting the instances, needs to be done manually in order to provide the instances with their external addresses; otherwise, client-side routing wouldn’t work. To connect the instances, use the following queries:

ADD COORDINATOR 1 WITH CONFIG {"bolt_server": "<bolt-server-coord1>", "management_server": "memgraph-coordinator-1.default.svc.cluster.local:10000", "coordinator_server": "memgraph-coordinator-1.default.svc.cluster.local:12000"};
ADD COORDINATOR 2 WITH CONFIG {"bolt_server": "<bolt-server-coord2>", "management_server": "memgraph-coordinator-2.default.svc.cluster.local:10000", "coordinator_server": "memgraph-coordinator-2.default.svc.cluster.local:12000"};
ADD COORDINATOR 3 WITH CONFIG {"bolt_server": "<bolt-server-coord3>", "management_server": "memgraph-coordinator-3.default.svc.cluster.local:10000", "coordinator_server": "memgraph-coordinator-3.default.svc.cluster.local:12000"};
REGISTER INSTANCE instance_0 WITH CONFIG {"bolt_server": "<bolt-server-instance0>", "management_server": "memgraph-data-0.default.svc.cluster.local:10000", "replication_server": "memgraph-data-0.default.svc.cluster.local:20000"};
REGISTER INSTANCE instance_1 WITH CONFIG {"bolt_server": "<bolt-server-instance1>", "management_server": "memgraph-data-1.default.svc.cluster.local:10000", "replication_server": "memgraph-data-1.default.svc.cluster.local:20000"};
SET INSTANCE instance_1 TO MAIN;

Note that the only part you need to change in the template above is the Bolt server address. Its value depends on the type of external network you use to access the instances from outside the cluster.
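
As a hypothetical illustration, if instance_0 is exposed through a per-instance LoadBalancer whose external IP is 34.10.2.5 (a placeholder) and the Bolt port is the default 7687, its registration would look like:

REGISTER INSTANCE instance_0 WITH CONFIG {"bolt_server": "34.10.2.5:7687", "management_server": "memgraph-data-0.default.svc.cluster.local:10000", "replication_server": "memgraph-data-0.default.svc.cluster.local:20000"};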

Using Memgraph HA chart with IngressNginx resource

The most cost-friendly way to manage a Memgraph HA cluster in Kubernetes is using an IngressNginx controller. This controller is capable of routing TCP traffic at the Bolt protocol level to the Kubernetes Memgraph services. To achieve this, it uses only a single LoadBalancer, which means there is only a single external IP for connecting to the cluster; users can connect to any coordinator or data instance through its distinct Bolt port. The only step required is to install the Memgraph HA chart:

helm install mem-ha-test ./charts/memgraph-high-availability --set \
env.MEMGRAPH_ENTERPRISE_LICENSE=<license>,\
env.MEMGRAPH_ORGANIZATION_NAME=<organization>,affinity.nodeSelection=true,\
externalAccessConfig.dataInstance.serviceType=IngressNginx,externalAccessConfig.coordinator.serviceType=IngressNginx

The chart will also install IngressNginx automatically with all required configuration.

Configuration options

The following table lists the configurable parameters of the Memgraph HA chart and their default values.

| Parameter | Description | Default |
| --- | --- | --- |
| image.repository | Memgraph Docker image repository | memgraph/memgraph |
| image.tag | Specific tag for the Memgraph Docker image. Overrides the image tag whose default is the chart version. | 2.22.0 |
| image.pullPolicy | Image pull policy | IfNotPresent |
| env.MEMGRAPH_ENTERPRISE_LICENSE | Memgraph Enterprise license | <your-license> |
| env.MEMGRAPH_ORGANIZATION_NAME | Organization name | <your-organization-name> |
| storage.libPVCSize | Size of the storage PVC | 1Gi |
| storage.libStorageClassName | The name of the storage class used for storing data | "" |
| storage.libStorageAccessMode | Access mode used for lib storage | ReadWriteOnce |
| storage.logPVCSize | Size of the log PVC | 1Gi |
| storage.logStorageClassName | The name of the storage class used for storing logs | "" |
| storage.logStorageAccessMode | Access mode used for log storage | ReadWriteOnce |
| externalAccess.coordinator.serviceType | IngressNginx, NodePort, CommonLoadBalancer or LoadBalancer | NodePort |
| externalAccess.dataInstance.serviceType | IngressNginx, NodePort or LoadBalancer | NodePort |
| ports.boltPort | Bolt port used on coordinator and data instances | 7687 |
| ports.managementPort | Management port used on coordinator and data instances | 10000 |
| ports.replicationPort | Replication port used on data instances | 20000 |
| ports.coordinatorPort | Coordinator port used on coordinators | 12000 |
| affinity.unique | Schedule pods on different nodes in the cluster | false |
| affinity.parity | Schedule pods on the same node with a maximum of one coordinator and one data node | false |
| affinity.nodeSelection | Schedule pods on nodes with specific labels | false |
| affinity.roleLabelKey | Label key for node selection | role |
| affinity.dataNodeLabelValue | Label value for data nodes | data-node |
| affinity.coordinatorNodeLabelValue | Label value for coordinator nodes | coordinator-node |
| data | Configuration for data instances | See data section |
| coordinators | Configuration for coordinator instances | See coordinators section |
| sysctlInitContainer.enabled | Enable the init container to set sysctl parameters | true |
| sysctlInitContainer.maxMapCount | Value for vm.max_map_count to be set by the init container | 262144 |
| secrets.enabled | Enable the use of Kubernetes secrets for Memgraph credentials | false |
| secrets.name | The name of the Kubernetes secret containing Memgraph credentials | memgraph-secrets |
| secrets.userKey | The key in the Kubernetes secret for the Memgraph user; the value is passed to the MEMGRAPH_USER env | USER |
| secrets.passwordKey | The key in the Kubernetes secret for the Memgraph password; the value is passed to the MEMGRAPH_PASSWORD env | PASSWORD |

For the data and coordinators sections, each item in the list has the following parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| id | ID of the instance | 0 for data, 1 for coordinators |
| args | List of arguments for the instance | See args section |

The args section contains a list of arguments for the instance.
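
As a hypothetical values.yaml fragment (the IDs follow the defaults above; the --log-level flag is just an example of a Memgraph setting you might pass):

data:
  - id: 0
    args:
      - "--log-level=TRACE"

coordinators:
  - id: 1
    args:
      - "--log-level=TRACE"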

For all available database settings, refer to the configuration settings docs.

Memgraph Lab Helm chart

A Helm chart for deploying Memgraph Lab on Kubernetes.

Installing the Memgraph Lab Helm chart

To install the Memgraph Lab Helm chart, follow the steps below:

helm install <release-name> memgraph/memgraph-lab

Replace <release-name> with a name of your choice for the release.

Changing the default chart values

To change the default chart values, run the command with the specified set of flags:

helm install <resource-name> memgraph/memgraph-lab --set <flag1>=<value1>,<flag2>=<value2>,...

Or you can modify a values.yaml file and override the desired values:

helm install <resource-name> memgraph/memgraph-lab -f values.yaml

Configuration options

The following table lists the configurable parameters of the Memgraph Lab chart and their default values.

| Parameter | Description | Default |
| --- | --- | --- |
| image.repository | Memgraph Lab Docker image repository | memgraph/memgraph-lab |
| image.tag | Specific tag for the Memgraph Lab Docker image. Overrides the image tag whose default is the chart version. | "" (defaults to chart’s app version) |
| image.pullPolicy | Image pull policy | IfNotPresent |
| replicaCount | Number of Memgraph Lab instances to run | 1 |
| service.type | Kubernetes service type | ClusterIP |
| service.port | Kubernetes service port | 3000 |
| service.targetPort | Kubernetes service target port | 3000 |
| service.protocol | Protocol used by the service | TCP |
| service.annotations | Annotations to add to the service | {} |
| podAnnotations | Annotations to add to the pod | {} |
| resources | CPU/Memory resource requests/limits. Left empty by default. | {} |
| serviceAccount.create | Specifies whether a service account should be created | true |
| serviceAccount.annotations | Annotations to add to the service account | {} |
| serviceAccount.name | The name of the service account to use. If not set and create is true, a name is generated. | "" |

Memgraph Lab can be further configured with environment variables in your values.yaml file.

env:
  - name: QUICK_CONNECT_MG_HOST
    value: memgraph
  - name: QUICK_CONNECT_MG_PORT
    value: "7687"
  - name: KEEP_ALIVE_TIMEOUT_MS
    value: "65000"

If you added an Nginx Ingress service or a web server as a reverse proxy, update the following proxy timeout settings to avoid potential timeouts:

proxy_read_timeout X;
proxy_connect_timeout X;
proxy_send_timeout X;

where X is the number of seconds the connection (request query) can stay alive. Additionally, set the Memgraph Lab KEEP_ALIVE_TIMEOUT_MS environment variable to a higher value to ensure that Memgraph Lab stays connected to Memgraph when running queries that take longer than 65 seconds.
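
If your reverse proxy is the ingress-nginx controller, these directives map to Ingress annotations. A sketch, assuming ingress-nginx and a 300-second budget (adjust the value to your longest-running queries):

metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"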

Refer to the Memgraph Lab documentation for details on how to connect to and interact with Memgraph.