Run Link Prediction or Node Classification Algorithms and Write Custom Procedures in C++ With MAGE 1.4

By Antonio Filipovic

4 min read · December 6, 2022

In the new release of Memgraph’s open-source graph extension library MAGE, we focused on supporting graph machine learning. MAGE 1.4 now enables you to classify graph nodes and predict new relationships using the node classification and link prediction algorithms.

We also wanted to make MAGE more accessible to the C++ community, so we created a C++ API for the Memgraph database. Writing graph algorithms in C++ now comes close to working in Python, since you no longer need to manage memory by hand or work through unnecessary interfaces.

If you are familiar with the igraph library, you’ll be happy to hear that we integrated it into MAGE. The newly added k-means algorithm will also help you cluster your data.

Link prediction

Link prediction predicts new relationships in a graph and can generalize to nodes that were unseen until inference time. Inside the module, you can choose between GraphSAGE and GAT. The module is built on the DGL implementation, supports many different logging metrics, and can store models after a certain number of epochs.

One example of what you can do with the link prediction algorithm is to recommend new services for customers by using a query similar to this one:

MATCH (n:Customer {id: "1658"})
MATCH (s:Service)
WITH collect(s) AS services, n
CALL link_prediction.recommended_vertex(n, services, 6) YIELD *
RETURN score, recommendation;
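
To give a feel for what a "score per candidate" recommendation does, here is a minimal Python sketch. It is not the MAGE module: the module relies on trained GraphSAGE or GAT models, while this sketch swaps in a plain dot product followed by a sigmoid, assumes the node embeddings already exist, and keeps only the best k candidates. The function name `recommend`, the toy vectors, and the reading of the third argument as "number of recommendations" are all assumptions made for illustration.

```python
import math


def recommend(source_vec, candidates, k):
    """Return the k highest-scoring (score, name) pairs for one source node.

    Hypothetical helper: scores each candidate with sigmoid(dot product).
    """
    scored = []
    for name, vec in candidates.items():
        dot = sum(a * b for a, b in zip(source_vec, vec))
        score = 1.0 / (1.0 + math.exp(-dot))  # squash into (0, 1)
        scored.append((score, name))
    scored.sort(reverse=True)  # highest score first
    return scored[:k]


# Toy, made-up embeddings for one customer and three services.
customer = [1.0, 0.0]
services = {
    "streaming": [2.0, 0.0],
    "roaming": [-1.0, 0.5],
    "cloud": [0.5, 1.0],
}

top = recommend(customer, services, 2)
for score, name in top:
    print(f"{name}: {score:.3f}")
```

In a real setup, the embeddings and the scoring function would come from the trained model rather than from hand-written vectors.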

Node classification

Node classification determines the labels of samples (represented as nodes) by looking at the labels of their neighbors. It is motivated by homophily, which means "love of sameness" and comes from the sociological theory that similar things tend to group together. The module supports different layer types, loading and storing models, and much more.
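
The homophily intuition can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how the MAGE module works internally (the module trains a model with configurable layer types): a node is given the most common label among its already-labeled neighbors. The function `predict_label` and the toy graph are hypothetical.

```python
from collections import Counter


def predict_label(node, neighbors, labels):
    """Guess a node's label as the most common label among its labeled neighbors.

    Returns None when the node has no labeled neighbors.
    """
    votes = Counter(
        labels[n] for n in neighbors.get(node, ()) if n in labels
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Toy, made-up graph: adjacency lists and a few known labels.
neighbors = {
    "u1": ["u2", "u3", "u4"],
    "u5": [],
}
labels = {"u2": "fraud", "u3": "fraud", "u4": "legit"}

print(predict_label("u1", neighbors, labels))
print(predict_label("u5", neighbors, labels))
```

A learned model goes further than this vote by also using node features and multi-hop neighborhoods, but the underlying assumption is the same: connected nodes tend to share labels.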

With node classification, you can work on fraud prediction by using a query like the one below to determine if a certain user is a fraudster or not:

MATCH (n {id:1658})
CALL node_classification.predict(n) YIELD *
RETURN predicted_class;

C++ API designed for humans

The new C++ API is designed for humans, not robots. We followed best practices to reduce unnecessary cognitive load: the components have simple and consistent interfaces, common use cases require fewer user actions, and the API comes with developer guides and extensive documentation.

Memory management is probably the main pain point in C++ development. The new C++ API automatically manages the memory used by graph data, saving you time that would otherwise be spent debugging and writing repetitive code.

igraph support is here

The igraphalg module provides a comprehensive set of thin wrappers around some of the algorithms present in the igraph package. The wrapper functions can create an igraph-compatible graph-like object that can stream the native database graph directly, significantly lowering memory usage.

With this version, MAGE supports NetworkX integration, cuGraph for running graph algorithms on CUDA devices, and now igraph. If you need something else, feel free to drop us a comment on GitHub.

k-means clustering to group examples

And last but not least, the k-means algorithm clusters the given data by trying to separate samples into n groups of equal variance, minimizing a criterion known as the within-cluster sum-of-squares. Find out more about this algorithm in the documentation.

You can use this algorithm when you already have embeddings but no clustering yet. For example, feel free to combine it with Node2Vec or the online version of Node2Vec.
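
To make the criterion concrete, here is a minimal pure-Python sketch of the classic k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) together with the within-cluster sum-of-squares it tries to reduce. It is a sketch for illustration, not the implementation shipped in MAGE; the function names, the fixed starting centroids, and the toy points are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, initial_centroids, iterations=10):
    """Alternate assignment and update steps a fixed number of times."""
    centroids = [list(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = [sum(dim) / len(members)
                                for dim in zip(*members)]
    return centroids, assignment


def wcss(points, centroids, assignment):
    """Within-cluster sum-of-squares for a given clustering."""
    return sum(squared_distance(p, centroids[a])
               for p, a in zip(points, assignment))


# Toy, made-up 2D points forming two obvious groups.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centroids, assignment = kmeans(points, [(0, 0), (10, 0)])
print(centroids, assignment, wcss(points, centroids, assignment))
```

Production implementations usually pick the starting centroids more carefully and stop once the assignments no longer change, rather than running a fixed number of iterations.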

What next?

If any of the new features makes your use case easier, update MAGE to version 1.4. Feel free to leave a comment, report an issue, or give us a star on GitHub to support our work. We are also always open to discussion and advice, so drop by our Discord server and stay informed on everything graph-algorithm-related!