Debugging Memgraph by yourself
Why do we give users the capability of debugging Memgraph?
In the process of enhancing Memgraph’s performance and reliability, user-driven debugging plays a crucial role. Given the complex nature of software environments, it is often challenging for our team to address all issues simultaneously. Moreover, bugs can exhibit behaviors that are difficult to replicate on different systems.
By conducting initial debugging, users can provide us with valuable diagnostic data from their specific environment, which is essential for reproducing and pinpointing the root cause of issues. To facilitate this, we have equipped our container with a suite of user-friendly tools designed to assist in the debugging process.
These tools not only empower users to identify and report problems more accurately but also expedite the resolution process, enhancing overall system stability and performance.
Which Memgraph image can I use to debug Memgraph?
Memgraph offers Docker images specifically tailored for debugging, with Memgraph built in the RelWithDebInfo mode. These containers are equipped with crucial debugging tools such as perf, gdb, and pgrep, along with other useful apt packages.
This setup is intended to empower users to conduct thorough investigations into issues such as slow performance, system hangs, or unusually high memory usage. Although the RelWithDebInfo build of Memgraph comes with some performance degradation (~10% slower overall), this approach contributes to a more stable and efficient system in the long run.
After all issues have been identified and you move to production, you are encouraged to switch back to Release mode with the default Memgraph images, so you regain the extra performance from the now fast and reliable system.
Memgraph in the RelWithDebInfo mode can be downloaded from DockerHub with the following command:
docker image pull memgraph/memgraph:<memgraph_version>-relwithdebinfo
If you are using Memgraph MAGE, the following command should do:
docker image pull memgraph/memgraph-mage:<MAGE_version>-memgraph-<memgraph_version>-relwithdebinfo
where memgraph_version and MAGE_version are the respective versions of the image. All images built in the RelWithDebInfo mode have the suffix -relwithdebinfo.
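For example, to pull the same debug image that is used later in this guide (tag availability depends on which versions are published), the command would be:
docker image pull memgraph/memgraph:3.2.0-relwithdebinfo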
I have the Memgraph image, is there anything specific I should do to run Memgraph?
Yes, the Memgraph container needs to be run in --privileged mode. Privileged mode is an option of the Docker ecosystem, not Memgraph, and it allows the gdb and perf tools to run properly.
Below, we can see an example command of how to run a Memgraph container in the privileged mode:
docker container run --name mg --privileged -p 7687:7687 -p 9091:9091 memgraph --log-level=TRACE --also-log-to-stderr
I have run the Memgraph container, where should I perform the debugging?
All debugging is performed inside the container. To enter the container, you need to execute the following command.
docker container exec -it -u root mg bash
The -u root option enables root privileges inside the container, which are necessary for running the debugging tools.
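As a quick sanity check, you can verify that the debugging tools are available inside the container:
gdb --version
perf --version
pgrep memgraph   # should print the PID of the Memgraph process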
What are all the debugging capabilities I am equipped with?
Memgraph supports the following debug capabilities:
- Attaching GDB to Memgraph and inspecting threads
- Generating a core dump after Memgraph crashes
- Using perf to identify performance bottlenecks

We will go through each of them in detail.
1. Attaching GDB to Memgraph
GDB and pgrep are already installed in the Memgraph container that ships with the debug symbols.
Since Memgraph is already running in the container on port 7687, you can attach GDB to the running Memgraph process with the following command:
gdb -p $(pgrep memgraph)
Most likely, the Memgraph process will have PID 1, but to be certain, we use pgrep.
List of useful commands when in GDB
| Name | Description |
|---|---|
| CTRL + C | Pauses execution. |
| c | Continues execution. |
| info thread | Lists all executing threads. |
| t <x> | Switches to the thread with number x. |
| bt | Prints the backtrace of a thread. |
| bt full | Prints the backtrace with extra information. |
| frame <x> | Switches to the frame with number x in the backtrace. |
| list | Prints the 10 source lines above and below the current line of the frame. |
| up | Goes 1 frame up in the backtrace. |
| down | Goes 1 frame down in the backtrace. |
| info locals | Prints local variables of the current frame. |
| info args | Prints function arguments of the current frame. |
| print $local_var | Prints the value of the variable $local_var. |
Most commonly used commands.
In Memgraph, we usually first want to see all the threads currently running. We do that by issuing:
info thread
After identifying a thread whose code could belong to the Memgraph repository, we can switch to it with the command
t <x>
where x is the specific thread number.
The backtrace of that thread can then be printed with the command
bt
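Putting it together, a typical inspection session might look like this (the thread and frame numbers are placeholders for whatever shows up in your own output):
(gdb) info thread    # list all threads
(gdb) t 2            # switch to a thread that runs Memgraph code
(gdb) bt             # print its backtrace
(gdb) frame 3        # jump to a frame of interest
(gdb) info locals    # inspect local variables in that frame
(gdb) c              # let Memgraph continue running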
2. Generating a core dump with Memgraph
Generating a core dump with a plain Docker image
In order to generate a core dump, you need to perform a few steps both on the host and inside the container.
- Setting no size limit on the core dump
First, remove any limit on the size of the core dump that can be generated. This is done with the following command:
ulimit -c unlimited
- Mounting the correct volume
When Memgraph crashes, we want the resulting core dump file to end up on our host system, so when starting the container we will provide the appropriate volume. Additionally, don't forget to set the --privileged flag as noted in the previous sections.
docker container run --name mg --privileged -v /home/user/cores:/tmp/cores -p 7687:7687 -p 9091:9091 memgraph:2.16.0_17_050d5c985 --log-level=TRACE --also-log-to-stderr
- Setting up the container for core dump generation
Additionally, the following commands will need to be executed inside the container after it has started, so that a correct core dump can be generated.
ulimit -c unlimited
mkdir -p /tmp/cores
chmod a+rwx /tmp/cores
echo "/tmp/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern
When Memgraph crashes, a core dump will be generated, and you will see it on the host system if you have mounted the volume correctly.
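As a quick sanity check before waiting for a crash, you can confirm inside the container that the settings took effect:
ulimit -c                            # should print "unlimited"
cat /proc/sys/kernel/core_pattern    # should print /tmp/cores/core.%e.%p.%h.%t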
- Starting Memgraph again and inspecting the core dump with GDB
The container will need to be started again, since we want the same debug symbols to be present, and using an identical image is the cleanest way to achieve that. However, we don't need the Memgraph process at port 7687 this time, so we will ignore it.
You will need to copy the core dump file into the container with the docker cp command.
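For example, assuming the dump landed in the mounted directory on the host and the container is named mg (the file name below is hypothetical; use whatever name was actually generated):
docker cp /home/user/cores/core.memgraph.1.myhost.1700000000 mg:/core.memgraph.file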
After logging into the container with the root credentials:
docker container exec -it -u root mg bash
we will execute GDB with the core dump file provided:
gdb /usr/lib/memgraph/memgraph --core=/core.memgraph.file
where core.memgraph.file is the name of your core dump file. You may need to set the appropriate permissions on the core dump file. You can check the list of useful GDB commands in the sections above.
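If GDB complains that it cannot read the core dump, relaxing the file permissions inside the container is usually enough, for example:
chmod a+r /core.memgraph.file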
To find out more about setting core dumps, you can check this article.
Generating a core dump with Docker Compose
The setup with Docker Compose is similar to Docker. You will need to bind the volume, run Memgraph in privileged mode, and make sure you set no size limit on the generated core dump.
Below we can see an example Docker Compose file which can generate a core dump:
services:
memgraph:
image: memgraph:2.16.0_17_050d5c985
container_name: mg
privileged: true
ports:
- "7687:7687"
- "7444:7444"
- "9091:9091"
volumes:
      - /home/user/cores:/tmp/cores
command: ["--log-level=TRACE", "--also-log-to-stderr=true"]
ulimits:
core:
hard: -1
soft: -1
lab:
image: memgraph/lab:latest
container_name: memgraph-lab
ports:
- "3000:3000"
depends_on:
- memgraph
environment:
- QUICK_CONNECT_MG_HOST=memgraph
- QUICK_CONNECT_MG_PORT=7687
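Assuming the Compose file above is saved as docker-compose.yml, the stack can be brought up with:
docker compose up -d
After a crash, the generated core dump should appear in the directory mounted on the host.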
3. Using perf to identify performance bottlenecks
Profiling with perf is the most common operation when Memgraph is hanging or performing slowly.
The next steps provide instructions on how to check which parts of Memgraph are stalling during query execution, so the information can be used to improve the system.
Prior to running the perf instructions, you will need to bind the Memgraph binary to the local filesystem. You can start Memgraph with the volume bound like this:
docker container run --name mg -p 7687:7687 --privileged -v memgraph-binary:/usr/lib/memgraph <memgraph_image> --log-level=TRACE --also-log-to-stderr
- Perfing Memgraph
Inside the container, we will need to profile the system with perf. That is done using the following command:
perf record -p $(pgrep memgraph) --call-graph dwarf sleep 5
The command will profile the Memgraph process for 5 seconds; of course, you can tweak the number yourself. It will generate a file called perf.data in the directory inside the container where you ran the command.
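If you only want a quick text-based summary without leaving the container, perf can also display the recorded data directly; run this in the same directory that contains perf.data:
perf report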
- Installing hotspot as the GUI tool on the host
On the host, we will need to install a GUI tool called hotspot that will help us generate a flamegraph. We can install hotspot on the host machine by issuing the command:
apt install hotspot
If your machine does not support APT, please check the hotspot repo and follow the installation steps.
- Transferring perf.data to the host
After we have recorded the perf data, we need to get it from the container to the host. That is done with the following command on the host:
docker cp <container_id>:<path_to_perf.data> .
where container_id is the Memgraph container ID, and path_to_perf.data is the absolute path in the container to the generated perf.data file.
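For example, if the container is named mg and perf.data was recorded in /usr/lib/memgraph inside it (both assumptions based on the commands above), the copy could look like this:
docker cp mg:/usr/lib/memgraph/perf.data .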
- Linking Memgraph's debug symbols in the container to the host
For hotspot to be able to identify the debug symbols and draw the flamegraph, the path to the debug symbols on the host must be exactly the same as in the container. In the container it is the /usr/lib/memgraph/memgraph binary, so we will need to make a symbolic link from the container volume to the host system:
ln -s <path_to_docker_volume_debug_symbols_binary> /usr/lib/memgraph/memgraph
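The host path of the memgraph-binary volume used above can be found with docker volume inspect; a minimal sketch of the linking step, assuming the default Docker data root, could look like this:
docker volume inspect memgraph-binary   # note the "Mountpoint" field
sudo mkdir -p /usr/lib/memgraph
sudo ln -s /var/lib/docker/volumes/memgraph-binary/_data/memgraph /usr/lib/memgraph/memgraph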
- Starting hotspot
If you did everything correctly, starting hotspot with
hotspot perf.data
should show you a flamegraph similar to the one in the picture below.
Debugging Memgraph under Kubernetes (k8s)
General commands
To begin with, the master of all kubectl commands is:
kubectl get all
Managing nodes:
kubectl get nodes --show-labels # Show all nodes and their labels.
kubectl get nodes -o wide # Show additional information about the nodes.
kubectl top nodes # Get the current memory usage.
Managing pods:
kubectl get pods --show-labels # Show all pods and their labels.
kubectl get pods -o wide # Inspect how pods get scheduled.
kubectl describe pod <pod-name> # Inspect pod config (args, envs, ...).
kubectl get pod <pod-name> -o yaml # Get pod yaml config.
kubectl exec -it <pod-name> -- /bin/bash # Log in to a running pod.
kubectl logs <pod-name> # Get logs for a running pod.
kubectl logs memgraph-data-0-0 | tail -n 100 # Filter last logs from a running pod.
kubectl logs --previous <pod-name> # Get logs from a crashed pod.
kubectl logs <pod-name> -c <container-name> # Get logs from a specific pod, e.g., debugging init containers.
kubectl cp <pod-name>:<pod-path> . # Copy files (e.g., logs) from a running pod.
kubectl get events --all-namespaces --sort-by='.metadata.creationTimestamp' # List all events by creation time.
kubectl get events --namespace <namespace-name> # List all events in the given namespace.
kubectl port-forward <pod-name> <host-port>:<pod-port> # Forward/connect port on host to the pod port.
kubectl cluster-info dump # Dump current cluster state to stdout.
kubectl get statefulsets # Show all StatefulSets.
kubectl get pvc # Get all PersistentVolumeClaims.
kubectl get pvc -l app=<statefulset-name> # Get the PersistentVolumeClaims for the StatefulSet.
Debugging Memgraph pods
To use gdb inside a Kubernetes pod, the container must run in privileged mode. To run any given container in privileged mode, the k8s cluster itself needs to have an appropriate configuration.
Below is an example of how to start a privileged kind cluster.
Create a privileged kind cluster
First, create a new debug-cluster.yaml config file with allow-privileged enabled.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
image: kindest/node:v1.31.0
extraPortMappings:
- containerPort: 80
hostPort: 8080
protocol: TCP
kubeadmConfigPatches:
- |
kind: ClusterConfiguration
kubeletConfiguration:
extraArgs:
allow-privileged: "true"
# To inspect the cluster run `kubectl get pods -n kube-system`.
# If any of the pods is in the CrashLoopBackOff status, try running `kubectl
# logs <pod-name> -n kube-system` to get the error message.
To start the cluster, execute the following command:
kind create cluster --name <cluster-name> --config debug-cluster.yaml
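To verify that the cluster came up, you can query it through the kubectl context that kind creates (kind prefixes cluster names with kind-):
kubectl cluster-info --context kind-<cluster-name>
kubectl get nodes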
Deploy a debug pod
Once the cluster is up and running, create a new debug-pod.yaml file with the following content:
apiVersion: v1
kind: Pod
metadata:
name: debug-pod
spec:
containers:
- name: my-container
image: memgraph/memgraph:3.2.0-relwithdebinfo # Use the latest, but make sure it's the relwithdebinfo one!
securityContext:
runAsUser: 0 # Runs the container as root.
privileged: true
capabilities:
add: ["SYS_PTRACE"]
allowPrivilegeEscalation: true
command: ["sleep"]
args: ["infinity"]
stdin: true
tty: true
To get the pod up and running and open a shell inside it run:
kubectl apply -f debug-pod.yaml
kubectl exec -it debug-pod -- bash
Once you are in the pod execute:
apt-get update && apt-get install -y gdb
su memgraph
gdb --args ./memgraph <memgraph-flags>
run
Once you have Memgraph up and running under gdb, run your workload (insert data, write queries, etc.). When you manage to recreate the issue, use the gdb commands to pinpoint the exact problem.
Delete the debug pod
To delete the debug pod run:
kubectl delete pod debug-pod
The official k8s documentation on how to debug running pods is quite detailed.
Handling core dumps
When Memgraph crashes, for example due to a segmentation fault (SIGSEGV), core dumps can provide invaluable insight for debugging. The Memgraph Helm charts provide an easy way to enable persistent core dump storage using the createCoreDumpsClaim option.
To enable core dumps, create a values.yaml file with at least the following setting:
createCoreDumpsClaim: true
Feel free to copy the values file from the helm-charts repository as a base, since additional required fields may be missing from a minimal config.
This instructs the Helm chart to create a PersistentVolumeClaim (PVC) to store core dumps generated by the Memgraph process.
Important configuration notes
By default, the storage size is 10GiB. Core dumps can be as large as your node's total RAM, so it's recommended to set this explicitly by adjusting coreDumpsStorageSize in the values.yaml file.
Make sure to use the relwithdebinfo image of Memgraph by setting image.tag, also in the values.yaml file.
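Putting these notes together, a minimal values.yaml could be created like this (the storage size and image tag below are only examples; pick values that match your node and the Memgraph version you run):
cat > values.yaml <<'EOF'
createCoreDumpsClaim: true
coreDumpsStorageSize: 32Gi
image:
  tag: 3.2.0-relwithdebinfo
EOF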
Run the following command to install Memgraph with the debugging configuration:
helm install my-release memgraph/memgraph -f values.yaml
The core dumps are written to a mounted volume inside the container (the default is /var/core/memgraph; it's possible to tweak that by changing coreDumpsMountPath in values.yaml). You can use kubectl exec or kubectl cp to access the files for post-mortem analysis.
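For example, assuming the default mount path, copying a dump from the Memgraph pod to the current directory could look like this (the pod and core file names are placeholders; list the directory first to find the actual file name):
kubectl exec <pod-name> -- ls /var/core/memgraph
kubectl cp <pod-name>:/var/core/memgraph/<core-file> ./<core-file>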
If your k8s cluster runs under a major cloud provider and you want to store the dumps in S3, probably the best repo to check out is the core-dump-handler.
Specific cloud provider instructions
The k8s quick reference is an amazing set of commands!