Embedding methods serve to map graph entities into n-dimensional vectors. The goal of such an approach is to map closely related entities to vectors with a high degree of similarity according to the chosen method of similarity estimation.
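To make "a high degree of similarity" a bit more concrete, here is a minimal sketch that assumes cosine similarity is the chosen measure (other measures, such as the dot product or Euclidean distance, would work too). The node names and the 2-dimensional vectors are made up for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: alice and bob are assumed to be closely related,
# carol is assumed to sit in a different part of the graph.
embeddings = {
    "alice": [0.9, 0.1],
    "bob": [0.8, 0.2],
    "carol": [-0.7, 0.6],
}

sim_alice_bob = cosine_similarity(embeddings["alice"], embeddings["bob"])
sim_alice_carol = cosine_similarity(embeddings["alice"], embeddings["carol"])
print(sim_alice_bob, sim_alice_carol)
```

A good embedding should give related pairs such as alice and bob a noticeably higher score than unrelated pairs.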
Node2Vec is one of the most popular embedding methods. It is based on random walks: nodes that are likely to appear together in the same random walk are mapped to nearby points in n-dimensional space. The method was introduced by Aditya Grover and Jure Leskovec of Stanford University in their paper "node2vec: Scalable Feature Learning for Networks".
The vectors themselves are optimized with a well-known machine learning technique, gradient descent. The resulting embeddings can then be used for downstream tasks such as node classification or link prediction, both of which are common applications of graph embeddings.
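To give a feel for what "optimizing the vectors with gradient descent" can look like, here is a toy sketch. It assumes a logistic loss over node pairs, loosely in the spirit of skip-gram with negative sampling: pairs that co-occur in a walk get label 1, sampled non-neighbors get label 0. The node names, starting vectors, learning rate and number of iterations are arbitrary choices for illustration, not values taken from the paper or from MAGE.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgd_step(node_vec, context_vec, label, lr=0.1):
    """One gradient descent step on the logistic loss; updates both vectors in place."""
    # Derivative of the loss with respect to the dot product of the two vectors.
    gradient = sigmoid(dot(node_vec, context_vec)) - label
    for i in range(len(node_vec)):
        n_i, c_i = node_vec[i], context_vec[i]
        node_vec[i] = n_i - lr * gradient * c_i
        context_vec[i] = c_i - lr * gradient * n_i

# Hypothetical setup: "a" co-occurs with "b" in walks, "c" is a negative sample.
emb = {"a": [0.1, 0.0]}
ctx = {"b": [0.0, 0.1], "c": [0.1, 0.0]}

for _ in range(500):
    sgd_step(emb["a"], ctx["b"], label=1.0)  # pull co-occurring pair together
    sgd_step(emb["a"], ctx["c"], label=0.0)  # push the negative pair apart

print(sigmoid(dot(emb["a"], ctx["b"])), sigmoid(dot(emb["a"], ctx["c"])))
```

After training, the predicted co-occurrence probability for the positive pair should be high and the one for the negative pair low, which is exactly the property the embeddings are supposed to encode.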
Illustration of how graph embeddings can be mapped to 2D space. Boundaries between classes are more visible than in a graph.
Node2Vec is implemented within the MAGE project. Be sure to check it out in the link above. ☝️
As already mentioned, link prediction refers to the task of predicting missing links or links that are likely to occur in the future. In this tutorial, we will make use of the MAGE spell called node2vec. We will also use Memgraph to store the data and gqlalchemy to connect to it from a Python application. The dataset will be similar to the one used in the paper "Graph Embedding Techniques, Applications, and Performance: A Survey".
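Before touching any real data, it may help to see the core idea of embedding-based link prediction in a few lines. The sketch below assumes we already have one embedding per node and simply ranks every pair of nodes that is not yet connected by cosine similarity. The function name `rank_missing_links`, the toy embeddings and the edge list are hypothetical and are not part of MAGE or gqlalchemy.

```python
import math
from itertools import combinations

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_missing_links(embeddings, existing_edges):
    """Return (score, u, v) for every unconnected node pair, best candidates first."""
    known = {frozenset(edge) for edge in existing_edges}
    candidates = [
        (cosine(embeddings[u], embeddings[v]), u, v)
        for u, v in combinations(sorted(embeddings), 2)
        if frozenset((u, v)) not in known
    ]
    return sorted(candidates, reverse=True)

# Toy data, made up for illustration.
embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.8, 0.3],
    "d": [-1.0, 0.2],
}
existing_edges = [("a", "b"), ("c", "d")]

ranked = rank_missing_links(embeddings, existing_edges)
for score, u, v in ranked:
    print(f"{u}-{v}: {score:.3f}")
```

The pairs at the top of the list are the ones we would predict as missing or future links. In practice the cut-off (top k, or a similarity threshold) is something to tune on held-out edges.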
For node2vec, the paper's authors came up with a brilliant idea: "We define a flexible notion of a node’s network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods."
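A rough sketch of such a biased random walk is shown below. It assumes an unweighted, undirected graph stored as a dictionary of neighbor sets, and it uses the two parameters from the node2vec paper: the return parameter `p` and the in-out parameter `q`. The function name `biased_walk` and the toy graph are our own, and this is a simplified illustration rather than the implementation used in MAGE.

```python
import random

def biased_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order random walk of at most `length` nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break  # dead end, stop early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to where we came from
            elif candidate in graph[previous]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move farther away from it
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Toy undirected graph, made up for illustration.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}

walk = biased_walk(graph, start="a", length=8, p=0.5, q=2.0, rng=random.Random(42))
print(walk)
```

With a small `p` and a large `q`, the walk tends to stay near its starting point and samples the local neighborhood. Flipping them (large `p`, small `q`) makes the walk venture farther out. This is the knob that lets node2vec explore "diverse neighborhoods".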
Node2Vec is such a versatile method that it is easily integrated into various solutions. The biggest bonus of having such a method is the ability to use it in downstream prediction/classification tasks.
In a network of users and products, node2vec can be used to extract hidden insights about customers' shopping behavior, which in turn enables targeted advertisements and other recommendations for the user.
Knowledge graphs can be both complex and extremely large. Exploring such a graph and splitting it into logical units is a difficult task. Mapping its nodes to vectors with node2vec makes domain exploration easier, since closely related knowledge nodes end up close together in vector space.