Debugging Memgraph by yourself

Why do we give users the capability to debug Memgraph?

In the process of enhancing Memgraph's performance and reliability, user-driven debugging plays a crucial role. Given the complex nature of software environments, it is often challenging for our team to address all issues simultaneously. Moreover, bugs can exhibit behaviors that are difficult to replicate on different systems.

By conducting initial debugging, users can provide us with valuable diagnostic data from their specific environment, which is essential for reproducing and pinpointing the root cause of issues. To facilitate this, we have equipped our container with a suite of user-friendly tools designed to assist in the debugging process.

These tools not only empower users to identify and report problems more accurately but also expedite the resolution process, enhancing overall system stability and performance.

Which Memgraph image can I use to debug Memgraph?

Memgraph offers Docker images specifically tailored for debugging, with Memgraph built in the RelWithDebInfo mode. These containers are equipped with crucial debugging tools such as perf, gdb, and pgrep, along with other useful apt packages.

This setup is intended to empower users to conduct thorough investigations into issues such as slow performance, system hangs, or unusually high memory usage. Although the RelWithDebInfo build mode of Memgraph comes with some performance degradations (~10% slower performance overall), this approach contributes to more stable and efficient system performance in the long run.

After all the issues have been identified and you move to production, you are encouraged to switch back to Release mode, using the default Memgraph images, to regain the extra performance of the now fast and reliable system.

Memgraph in the RelWithDebInfo mode can be downloaded from DockerHub with the following command:

docker image pull memgraph/memgraph:<memgraph_version>-relwithdebinfo

If you are using Memgraph MAGE, the following command should do:

docker image pull memgraph/memgraph-mage:<MAGE_version>-memgraph-<memgraph_version>-relwithdebinfo

where memgraph_version and MAGE_version are the respective versions of the image. All images in the RelWithDebInfo build mode have the suffix -relwithdebinfo.
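As a concrete example, assuming hypothetical versions 2.16.0 for Memgraph and 1.16 for MAGE (the exact tags available on DockerHub may differ), the commands would look like this:

```shell
# Plain Memgraph with debug symbols (version tag is an assumed example)
docker image pull memgraph/memgraph:2.16.0-relwithdebinfo

# Memgraph MAGE with debug symbols (version tags are assumed examples)
docker image pull memgraph/memgraph-mage:1.16-memgraph-2.16.0-relwithdebinfo
```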

I have the Memgraph image, is there anything specific I should do to run Memgraph?

Yes, the Memgraph container needs to be run in --privileged mode. Privileged mode is an option of the Docker ecosystem, not of Memgraph; it allows the gdb and perf tools to run properly.

Below, we can see an example command of how to run a Memgraph container in the privileged mode:

docker container run --name mg --privileged -p 7687:7687 -p 9091:9091 <memgraph_image> --log-level=TRACE --also-log-to-stderr

I have run the Memgraph container, where should I perform the debugging?

All debugging is performed inside the container. To enter the container, you need to execute the following command.

docker container exec -it -u root mg bash

The -u root flag enables root privileges inside the container, which are necessary for running the debugging tools.

What are all the debugging capabilities I am equipped with?

Memgraph supports the following debug capabilities:

  1. Attaching GDB to a running Memgraph process and inspecting threads
  2. Generating a core dump after Memgraph crashes
  3. Using perf to identify performance bottlenecks

We will go through each of them in detail.

1. Attaching GDB to Memgraph

GDB and pgrep come preinstalled in the Memgraph container that ships the debug symbols. Since Memgraph is already running there on port 7687, you can attach GDB to the running Memgraph process with the following command:

gdb -p $(pgrep memgraph)

Most likely, the Memgraph process will have PID 1, but to be certain, we use pgrep.

List of useful commands when in GDB

| Name | Description |
| --- | --- |
| CTRL + C | Pause execution. |
| c | Continue execution. |
| info threads | List all executing threads. |
| t <x> | Switch to the thread with number x. |
| bt | Print the backtrace of the current thread. |
| bt full | Print the backtrace with extra information. |
| frame <x> | Move to frame number x in the backtrace. |
| list | Print the 10 lines of source code above and below the current frame. |
| up | Go 1 frame up in the backtrace. |
| down | Go 1 frame down in the backtrace. |
| info locals | Print the local variables of the current frame. |
| info args | Print the function arguments of the current frame. |
| print <var> | Print the value of the variable <var>. |

Most commonly used commands.

In Memgraph, we usually first want to see all the threads currently running. We do that by issuing:

info threads

Once you identify a thread whose code could belong to the Memgraph repository, you can issue the command

t <x>

where x is the specific thread number.

Seeing the backtrace can be done with the command

bt
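The inspection flow above can also be scripted non-interactively with GDB's batch mode, which is handy for collecting a quick snapshot to attach to a bug report:

```shell
# Dump the thread list and all thread backtraces from the running
# Memgraph process in one shot, without an interactive session
gdb -p $(pgrep memgraph) -batch \
    -ex "info threads" \
    -ex "thread apply all bt"
```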

2. Generating a core dump with Memgraph

Generating a core dump with a plain Docker image

In order to generate a core dump, you need to perform a few steps on the host and in the container.

  1. Setting no size limit on the core dump

Initially, you will need to remove any limit on the size of the core dump that can be generated. This is done with the following command:

ulimit -c unlimited

  2. Mounting the correct volume

When Memgraph crashes, we want the resulting core dump file to be available on the host system, so when starting the container, we provide the appropriate volume. Additionally, don't forget to set the --privileged flag as noted in the previous sections.

docker container run --name mg --privileged -v /home/user/cores:/tmp/cores -p 7687:7687 -p 9091:9091 memgraph:2.16.0_17_050d5c985 --log-level=TRACE --also-log-to-stderr

  3. Setting up the container for core dump generation

Additionally, the following commands will need to be executed in the container after it has started, to be able to generate a correct core dump:

ulimit -c unlimited
mkdir -p /tmp/cores
chmod a+rwx /tmp/cores
echo "/tmp/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern

When Memgraph crashes, a core dump will be generated, and you will see it on the host system if you have mounted the volume correctly.
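Before forcing or waiting for a crash, it can be worth verifying inside the container that the setup commands above took effect:

```shell
# All three should reflect the in-container setup commands above
ulimit -c                          # expected: unlimited
cat /proc/sys/kernel/core_pattern  # expected: /tmp/cores/core.%e.%p.%h.%t
ls -ld /tmp/cores                  # expected: drwxrwxrwx ...
```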

  4. Starting Memgraph again and inspecting the core dump with GDB

The container will need to be started again, since we want the same debug symbols to be present, and using an identical image is the most reliable way to achieve that. However, we don't need the Memgraph process on port 7687 now, so we will ignore it.

You will need to copy the core dump file into the container with the docker cp command.
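For example, assuming the container is again named mg and the core dump file is called core.memgraph.file (both names taken from the surrounding examples), the copy could look like this:

```shell
# Copy the core dump from the mounted host directory into the container root
docker cp /home/user/cores/core.memgraph.file mg:/core.memgraph.file
```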

After logging into the container with the root credentials:

docker container exec -it -u root mg bash

we will execute GDB with the core dump file provided

gdb /usr/lib/memgraph/memgraph --core=/core.memgraph.file

where core.memgraph.file is the name of your core dump file. You may need to set appropriate permissions on the core dump file. You can check the list of useful GDB commands in the sections above.

To find out more about setting up core dumps, you can check this article.

Generating a core dump with Docker Compose

The setup with Docker Compose is similar to Docker. You will need to bind the volume, run Memgraph in privileged mode, and make sure you set no size limit on the generated core dump.

Below we can see an example Docker Compose file which can generate a core dump:

services:
  memgraph:
    image: memgraph:2.16.0_17_050d5c985
    container_name: mg
    privileged: true
    ports:
      - "7687:7687"
      - "7444:7444"
      - "9091:9091"
    volumes:
      - /home/user/cores:/tmp/cores
    command: ["--log-level=TRACE", "--also-log-to-stderr=true"]
    ulimits:
      core:
        hard: -1
        soft: -1
 
  lab:
    image: memgraph/lab:latest
    container_name: memgraph-lab
    ports:
      - "3000:3000"
    depends_on:
      - memgraph
    environment:
      - QUICK_CONNECT_MG_HOST=memgraph
      - QUICK_CONNECT_MG_PORT=7687
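Assuming the file above is saved as docker-compose.yml, the stack can be started and the core dump limit checked with something like:

```shell
# Start Memgraph and Lab in the background
docker compose up -d

# Verify that the core dump size limit is unlimited inside the container
docker exec -u root mg bash -c "ulimit -c"   # expected: unlimited
```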

3. Using perf to identify performance bottlenecks

Profiling with perf is the most common operation when Memgraph is hanging or performing slowly.

The next steps provide instructions on how to check which parts of Memgraph are stalling during query execution, so the information can be used to improve the system.

Prior to running perf, you will need to bind the Memgraph binary to the local filesystem. You can start Memgraph with the volume bound like this:

docker container run --name mg -p 7687:7687 --privileged -v memgraph-binary:/usr/lib/memgraph <memgraph_image> --log-level=TRACE --also-log-to-stderr
  1. Profiling Memgraph with perf

Inside the container, we will need to record a profile of the system. That is done using the following command:
perf record -p $(pgrep memgraph) --call-graph dwarf sleep 5

The command profiles the Memgraph process for 5 seconds; you can tweak the duration yourself. It generates a file called perf.data in the directory inside the container where you ran the command.

  2. Installing hotspot as the GUI tool on the host

On the host, we will need to install a GUI tool called hotspot, which will help us generate a flamegraph. We can install hotspot on the host machine by issuing the command:
apt install hotspot

If your machine does not support APT, please check the hotspot repo and follow the installation steps there.

  3. Transferring perf.data to the host

After we have recorded the data, we need to get the perf information from the container to the host. That is done with the following command on the host:
docker cp <container_id>:<path_to_perf.data> .

where container_id is the Memgraph container ID, and path_to_perf.data is the absolute path to the generated perf.data file in the container.

  4. Linking Memgraph's debug symbols in the container to the host

For hotspot to be able to identify the debug symbols and draw the flamegraph, the path to the debug symbols must be exactly the same as in the container. In the container that is the /usr/lib/memgraph/memgraph binary, so we will need to make a symbolic link from the Docker volume to the host system:
ln -s <path_to_docker_volume_debug_symbols_binary> /usr/lib/memgraph/memgraph
  5. Starting hotspot

If you did everything correctly, by starting hotspot
hotspot perf.data

you should be able to see a flamegraph similar to the one in the picture below.
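Putting the steps above together, the whole perf workflow can be sketched end to end; the container name, the volume name memgraph-binary, and the default Docker volume path under /var/lib/docker/volumes are assumptions carried over from the earlier examples and may differ on your system:

```shell
# 1. Inside the container: record 5 seconds of profiling data
docker exec -u root mg bash -c \
  "cd /tmp && perf record -p \$(pgrep memgraph) --call-graph dwarf sleep 5"

# 2. On the host: fetch the recorded data
docker cp mg:/tmp/perf.data .

# 3. On the host: expose the debug symbols at the same path as in the container
sudo mkdir -p /usr/lib/memgraph
sudo ln -s /var/lib/docker/volumes/memgraph-binary/_data/memgraph \
  /usr/lib/memgraph/memgraph

# 4. Open the flamegraph
hotspot perf.data
```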