Analyze Supply Chain with Graph Notebook and Memgraph
The Graph Notebook is an open-source Python package that provides an easy way to interact with graph databases from Jupyter Notebook, which also makes it easy to present and share the results of your queries with others. Recently, we tried it out with Memgraph to create and query a supply chain dataset, and we really liked what we saw. Because of that, we decided to contribute to the repository by adding Memgraph support.
Install and run Graph Notebook
The Graph Notebook contributors provide a thorough description of how to install the Graph Notebook library. Besides having Python 3.7.x-3.10.11 on your system, here are the exact steps to take in order to run it with Jupyter Lab and Memgraph as an openCypher-compatible endpoint:
- Install Memgraph Platform
docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 -e MEMGRAPH="--bolt-server-name-for-init=Neo4j/" memgraph/memgraph-platform
- Install Graph Notebook library
pip install graph-notebook
- Install Jupyter Lab
pip install "jupyterlab>=3,<4"
- Start Jupyter Lab
python -m graph_notebook.start_jupyterlab --jupyter-dir ~/notebook/destination/dir
Create a new notebook
To create a new notebook, click on the plus symbol (new launcher) and then on the Python3 button under the Notebook section.
That will open up a new blank notebook to play with. The prerequisite of running the Cypher queries in the database is, of course, to have an actual database running. The Docker command you ran earlier started Memgraph Platform, which runs the Memgraph database and MAGE (graph algorithms library) on localhost:7687, and Memgraph Lab (visual user interface) on localhost:3000. For additional instructions on setting up and running Memgraph locally, refer to the Memgraph documentation.
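Before running any cells, it can help to confirm that something is actually listening on the ports published by the Docker command. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is made up for this example, and the check only proves that a TCP port accepts connections, not that the service behind it is Memgraph.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the Docker command above: Bolt and Memgraph Lab.
    for name, port in [("Memgraph (Bolt)", 7687), ("Memgraph Lab", 3000)]:
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on localhost:{port} is {state}")
```

If either port is reported as not reachable, the container is most likely not running or the ports were not published.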
Graph Notebook uses magic functions to communicate with the database, and the most important ones to know about in this case are %%graph_notebook_config and %%oc bolt. The %%graph_notebook_config function helps us set up the configuration properly, and the %%oc bolt is the option that must be specified when submitting queries to the Bolt endpoint.
In order for the connection between the notebook and Memgraph to work, set the following configuration:
%%graph_notebook_config
{
"host": "localhost",
"port": 7687,
"ssl": false
}
Since host and port are set to proper values, you are ready to run your first Cypher query in Memgraph via the Bolt protocol! Let’s run the simplest query of counting nodes just to ensure the connection is working correctly:
%%oc bolt
MATCH (n)
RETURN count(n);
As expected, the returned result is zero.
Another way of checking what’s currently stored in Memgraph is by opening Memgraph Lab on localhost:3000. The Quick Connect screen shows that Memgraph is running, and you can click on Connect now to connect to the database. Currently, the database is empty, so let’s first get some data inside, and then you’ll be able to explore and analyze it.
Analyze the supply chain
It is always easier to learn about new tools through an interesting example. Let’s create a database that consists of a simple supply chain via Graph Notebook. To import the data, run multiple CREATE queries with the %%oc bolt magic function.
%%oc bolt
CREATE (sup1:Supplier {id: 1, name: "Supplissimus", centrality: 0.027920624240525559})
CREATE (sup2:Supplier {id: 2, name: "Supplionis", centrality: 0.002840909090909091})
CREATE (sup3:Supplier {id: 3, name: "MegaSupplies", centrality: 0.055822172619047615})
CREATE (sup4:Supplier {id: 4, name: "Supplies4you", centrality: 0})
CREATE (ing1:Ingredient {id: 1, name: "Ingredient 1", centrality: 0.0042365042365042358})
CREATE (ing2:Ingredient {id: 2, name: "Ingredient 2", centrality: 0.077438394705712787})
CREATE (ing3:Ingredient {id: 3, name: "Ingredient 3", centrality: 0.025363208468374868})
CREATE (ing4:Ingredient {id: 4, name: "Ingredient 4", centrality: 0.036831658149140731})
CREATE (ing5:Ingredient {id: 5, name: "Ingredient 5", centrality: 0.018939393939393933})
CREATE (ing6:Ingredient {id: 6, name: "Ingredient 6", centrality: 0.018939393939393933})
CREATE (ing7:Ingredient {id: 7, name: "Ingredient 7", centrality: 0.018939393939393933})
CREATE (ing8:Ingredient {id: 8, name: "Ingredient 8", centrality: 0.066602827149702143})
CREATE (ing9:Ingredient {id: 9, name: "Ingredient 9", centrality: 0.076719345469345446})
CREATE (ing10:Ingredient {id: 10, name: "Ingredient 10", centrality: 0.13523010455818119})
CREATE (pro1:Product {id: 1, name: "Intermediate product 1", centrality: 0.075849577597110474})
CREATE (pro2:Product {id: 2, name: "Intermediate product 2", centrality: 0.30307542895342809})
CREATE (pro3:Product {id: 3, name: "Intermediate product 3", centrality: 0.27450054057784318})
CREATE (pro4:Product {id: 4, name: "Intermediate product 4", centrality: 0.12564154013699291})
CREATE (pro5:Product {id: 5, name: "Intermediate product 5", centrality: 0.018604622671718259})
CREATE (pro6:FinalProduct:Product {id: 6, name: "Final product 1", centrality: 0.02814078282828282})
CREATE (pro7:FinalProduct:Product {id: 7, name: "Final product 2", centrality: 0.035353535353535366})
CREATE (pro8:FinalProduct:Product {id: 8, name: "Final product 3", centrality: 0.1539119291441273})
CREATE (shi1:Shipping {id: 1, name: "Shipping point 1", centrality: 0.0066761363636363633})
CREATE (shi2:Shipping {id: 2, name: "Shipping point 2", centrality: 0})
CREATE (rec1:Recipe {id: 1, name: "Recipe for product 1", centrality: 0.077470165525264201})
CREATE (rec2:Recipe {id: 2, name: "Recipe for product 2", centrality: 0.15612639008415902})
CREATE (rec3:Recipe {id: 3, name: "Recipe for product 3", centrality: 0.27750650680338179})
CREATE (rec4:Recipe {id: 4, name: "Recipe for product 4", centrality: 0.072996207394185345})
CREATE (rec5:Recipe {id: 5, name: "Recipe for product 5", centrality: 0.051091351458998513})
CREATE (rec6:Recipe {id: 6, name: "Recipe for final product 1", centrality: 0.23304036135039422})
CREATE (rec7:Recipe {id: 7, name: "Recipe for final product 2", centrality: 0.24386567715587651})
CREATE (rec8:Recipe {id: 8, name: "Recipe for final product 3 - variant 1", centrality: 0.088413170560519616})
CREATE (rec9:Recipe {id: 9, name: "Recipe for final product 3 - variant 2", centrality: 0.18098001437059097})
CREATE (rec10:Recipe {id: 10, name: "Recipe for final product 3 - variant 3", centrality: 0.082068494800692962})
CREATE (sup1)-[:SUPPLIES]->(ing1)
CREATE (sup1)-[:SUPPLIES]->(ing2)
CREATE (sup1)-[:SUPPLIES]->(ing3)
CREATE (sup1)-[:SUPPLIES]->(ing4)
CREATE (sup2)-[:SUPPLIES]->(ing5)
CREATE (sup2)-[:SUPPLIES]->(ing6)
CREATE (sup2)-[:SUPPLIES]->(ing7)
CREATE (sup3)-[:SUPPLIES]->(ing8)
CREATE (sup3)-[:SUPPLIES]->(ing9)
CREATE (sup4)-[:SUPPLIES]->(ing10)
CREATE (pro1)-[:FORMS {quantity: 30}]->(rec1)
CREATE (pro2)-[:FORMS {quantity: 50}]->(rec1)
CREATE (pro2)-[:FORMS {quantity: 100}]->(rec2)
CREATE (pro2)-[:FORMS {quantity: 50}]->(rec10)
CREATE (pro3)-[:FORMS {quantity: 80}]->(rec1)
CREATE (pro3)-[:FORMS {quantity: 200}]->(rec2)
CREATE (pro4)-[:FORMS {quantity: 150}]->(rec2)
CREATE (pro4)-[:FORMS {quantity: 70}]->(rec10)
CREATE (pro5)-[:FORMS {quantity: 10}]->(rec3)
CREATE (pro6)-[:FORMS {quantity: 90}]->(rec3)
CREATE (pro7)-[:FORMS {quantity: 100}]->(rec3)
CREATE (pro8)-[:FORMS {quantity: 200}]->(rec3)
CREATE (ing9)-[:FORMS {quantity: 300}]->(rec4)
CREATE (ing9)-[:FORMS {quantity: 80}]->(rec5)
CREATE (ing10)-[:FORMS {quantity: 120}]->(rec4)
CREATE (ing10)-[:FORMS {quantity: 5}]->(rec5)
CREATE (ing10)-[:FORMS {quantity: 100}]->(rec9)
CREATE (ing1)-[:FORMS {quantity: 15}]->(rec6)
CREATE (ing2)-[:FORMS {quantity: 25}]->(rec6)
CREATE (ing2)-[:FORMS {quantity: 65}]->(rec7)
CREATE (ing2)-[:FORMS {quantity: 100}]->(rec9)
CREATE (ing3)-[:FORMS {quantity: 35}]->(rec6)
CREATE (ing3)-[:FORMS {quantity: 120}]->(rec7)
CREATE (ing4)-[:FORMS {quantity: 130}]->(rec7)
CREATE (ing4)-[:FORMS {quantity: 140}]->(rec8)
CREATE (ing5)-[:FORMS {quantity: 85}]->(rec8)
CREATE (pro6)-[:SHIPS_WITH]->(shi1)
CREATE (pro7)-[:SHIPS_WITH]->(shi1)
CREATE (pro8)-[:SHIPS_WITH]->(shi2)
CREATE (rec1)-[:PRODUCES {quantity: 1}]->(pro1)
CREATE (rec2)-[:PRODUCES {quantity: 1}]->(pro2)
CREATE (rec3)-[:PRODUCES {quantity: 1}]->(pro3)
CREATE (rec4)-[:PRODUCES {quantity: 1}]->(pro4)
CREATE (rec5)-[:PRODUCES {quantity: 1}]->(pro5)
CREATE (rec6)-[:PRODUCES {quantity: 1}]->(pro6)
CREATE (rec7)-[:PRODUCES {quantity: 1}]->(pro7)
CREATE (rec8)-[:PRODUCES {quantity: 1}]->(pro8)
CREATE (rec9)-[:PRODUCES {quantity: 1}]->(pro8)
CREATE (rec10)-[:PRODUCES {quantity: 1}]->(pro8)
To ensure that data is imported to Memgraph, head back to the Memgraph Lab at localhost:3000 and check the node and relationship count. There should be 34 nodes and 49 relationships in the database. The Graph Schema tab in Memgraph Lab allows us to see how data is structured in the database, and in that way, it is easier to query the data or see if there is incorrect data imported.
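The expected totals can be cross-checked by tallying the CREATE statements per label and per relationship type. The sketch below simply hard-codes counts read off the import cell, so it checks the arithmetic and does not query the database.

```python
# Counts read off the CREATE statements in the import cell above.
nodes_per_label = {
    "Supplier": 4,
    "Ingredient": 10,
    "Product": 8,  # 5 intermediate products + 3 that also carry the FinalProduct label
    "Shipping": 2,
    "Recipe": 10,
}
relationships_per_type = {
    "SUPPLIES": 10,
    "FORMS": 26,
    "SHIPS_WITH": 3,
    "PRODUCES": 10,
}

total_nodes = sum(nodes_per_label.values())
total_relationships = sum(relationships_per_type.values())
print(total_nodes, total_relationships)
```

Both sums agree with the 34 nodes and 49 relationships that Memgraph Lab should report.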
Explore the graph
Let’s run a couple of interesting queries from the Graph Notebook to analyze the supply chain.
Get ingredients provided by the supplier
Since a graph database can be the ultimate source of truth between different data sources, it makes sense if all the information about suppliers is stored in Memgraph.
From there, you can query, for example, which ingredients are supplied by the supplier Supplissimus. In order to get that information, you need to run the following cell from the Graph Notebook:
%%oc bolt
MATCH (s:Supplier {name:"Supplissimus"})-[r:SUPPLIES]->(i:Ingredient)
RETURN i;
Here are the results:
This means that the supplier Supplissimus supplies ingredients 1, 2, 3, and 4.
Check the dependencies of the product
To determine what happens before the :FinalProduct with the ID 6 gets produced, you can run the graph_util.ancestors procedure that captures all the nodes from which a path to the destination node (FinalProduct) exists. Here is the query:
%%oc bolt
MATCH (f:FinalProduct {id:6})
CALL graph_util.ancestors(f) YIELD ancestors
UNWIND ancestors AS ancestor
RETURN ancestor;
And here are the results:
Hence, before the final product with the ID 6 (Final product 1) can be produced, you need to have the recipe for final product 1, ingredients 1, 2, and 3, as well as the supplier Supplissimus available. The previous procedure has yielded all the preceding nodes, but on its own that doesn't tell you much, since you don't know how they are connected.
To connect the nodes, you can use another MAGE extension procedure called graph_util.connect_nodes, which will connect the nodes with corresponding relationships between them.
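To make the two ideas concrete, here is a plain-Python sketch of what "ancestors" and "connecting nodes" mean on a hand-copied slice of the dataset. It is not how MAGE implements graph_util.ancestors or graph_util.connect_nodes; the function names ancestors and connect and the EDGES list are assumptions made for this illustration.

```python
from collections import deque

# Hand-copied slice of the dataset, using the variable names from the import cell.
EDGES = [
    ("sup1", "SUPPLIES", "ing1"),
    ("sup1", "SUPPLIES", "ing2"),
    ("sup1", "SUPPLIES", "ing3"),
    ("sup1", "SUPPLIES", "ing4"),
    ("ing1", "FORMS", "rec6"),
    ("ing2", "FORMS", "rec6"),
    ("ing3", "FORMS", "rec6"),
    ("ing4", "FORMS", "rec7"),
    ("rec6", "PRODUCES", "pro6"),
    ("rec7", "PRODUCES", "pro7"),
]


def ancestors(edges, target):
    """Return every node from which a directed path to `target` exists."""
    parents = {}
    for source, _, destination in edges:
        parents.setdefault(destination, set()).add(source)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def connect(edges, nodes):
    """Keep only the relationships whose two endpoints are both in `nodes`."""
    return [edge for edge in edges if edge[0] in nodes and edge[2] in nodes]


found = ancestors(EDGES, "pro6")
subgraph = connect(EDGES, found | {"pro6"})
print(sorted(found))
for source, rel_type, destination in subgraph:
    print(f"({source})-[:{rel_type}]->({destination})")
```

On this slice, ing4, rec7, and pro7 are left out because no path leads from them to pro6, which is the same filtering effect described above for Final product 1.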
This helps you detect which parts of the supply chain affect the production of a certain product, so you can minimize the risk of not delivering that product on time.
Here is the visualization of the result in Memgraph Lab:
Check possible products for production
You might look at the production from the supplier's view and find out how many products or operations in the supply chain are affected by them. In case the supplier is unavailable for some reason, this information could be helpful to minimize the risk.
Similar to ancestors, you use the procedure graph_util.descendants, which yields all the nodes to which a path exists from the source node (supplier Supplissimus in this case).
%%oc bolt
MATCH (s:Supplier {name: "Supplissimus"})
CALL graph_util.descendants(s) YIELD descendants
UNWIND descendants AS descendant
RETURN descendant;
Here are the results:
Again, to make it clear, let’s connect the nodes in Memgraph Lab, using the graph_util.connect_nodes procedure.
Obviously, the supplier Supplissimus affects the delivery of many products in this supply chain, so whoever is responsible for delivering the products on time will have to take good care of Supplissimus.
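The reach of each supplier can also be estimated outside the database. The sketch below copies the direction of every relationship from the import cell into a plain adjacency dictionary and runs a breadth-first traversal; the names GRAPH, FINAL_PRODUCTS, and descendants are assumptions for this example, and the traversal is a generic reachability search rather than the MAGE implementation.

```python
from collections import deque

# Outgoing relationships per node, hand-copied from the import cell
# (relationship types and quantities are dropped, only direction is kept).
GRAPH = {
    "sup1": ["ing1", "ing2", "ing3", "ing4"],
    "sup2": ["ing5", "ing6", "ing7"],
    "sup3": ["ing8", "ing9"],
    "sup4": ["ing10"],
    "ing1": ["rec6"],
    "ing2": ["rec6", "rec7", "rec9"],
    "ing3": ["rec6", "rec7"],
    "ing4": ["rec7", "rec8"],
    "ing5": ["rec8"],
    "ing9": ["rec4", "rec5"],
    "ing10": ["rec4", "rec5", "rec9"],
    "pro1": ["rec1"],
    "pro2": ["rec1", "rec2", "rec10"],
    "pro3": ["rec1", "rec2"],
    "pro4": ["rec2", "rec10"],
    "pro5": ["rec3"],
    "pro6": ["rec3", "shi1"],
    "pro7": ["rec3", "shi1"],
    "pro8": ["rec3", "shi2"],
    "rec1": ["pro1"],
    "rec2": ["pro2"],
    "rec3": ["pro3"],
    "rec4": ["pro4"],
    "rec5": ["pro5"],
    "rec6": ["pro6"],
    "rec7": ["pro7"],
    "rec8": ["pro8"],
    "rec9": ["pro8"],
    "rec10": ["pro8"],
}
FINAL_PRODUCTS = {"pro6", "pro7", "pro8"}


def descendants(graph, source):
    """Return every node reachable from `source` by following edge direction."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


for supplier in ("sup1", "sup2", "sup3", "sup4"):
    reach = descendants(GRAPH, supplier)
    affected = sorted(reach & FINAL_PRODUCTS)
    print(supplier, len(reach), affected)
```

On this edge list, sup1 (Supplissimus) reaches 20 nodes, including all three final products, which is a rough way to compare how much of the chain each supplier can disrupt.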
Getting the order of execution with topological sort
There are cases when some operations can't start before others finish, which blocks the delivery of a product until every process or job it depends on is done.
In order to detect such bottlenecks, graph theory offers topological sort. It orders the nodes (jobs, operations, or products) so that the ones with no unmet dependencies come first, followed by those that can start only after the earlier ones have finished.
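As a rough illustration of the idea, the sketch below applies Kahn's algorithm in plain Python to a small slice of the dataset. The slice was hand-picked to be acyclic, which a topological order requires, and the names DEPENDENCIES and topological_sort are assumptions for this example rather than anything taken from MAGE.

```python
from collections import deque

# Hand-picked acyclic slice of the dataset; each pair reads "left must come before right".
DEPENDENCIES = [
    ("rec4", "pro4"),
    ("pro4", "rec10"),
    ("rec10", "pro8"),
    ("rec9", "pro8"),
]


def topological_sort(edges):
    """Kahn's algorithm: repeatedly emit nodes whose prerequisites are all emitted."""
    successors, in_degree = {}, {}
    for before, after in edges:
        successors.setdefault(before, []).append(after)
        in_degree.setdefault(before, 0)
        in_degree[after] = in_degree.get(after, 0) + 1

    # Nodes with no incoming edges have no prerequisites and can go first.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors.get(node, ()):
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # If some nodes were never emitted, they sit on a cycle.
    if len(order) != len(in_degree):
        raise ValueError("graph contains a cycle, no topological order exists")
    return order


order = topological_sort(DEPENDENCIES)
print(order)
```

Every node appears after all of the nodes it depends on, so pro8 can only show up once rec9, rec10, and everything upstream of them have been listed.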
For sorting the nodes topologically, you use the graph_util.topological_sort procedure.
%%oc bolt
MATCH p=(r:Recipe)-[*bfs]->(f:FinalProduct)
WITH project(p) AS graph
CALL graph_util.topological_sort(graph) YIELD sorted_nodes
UNWIND sorted_nodes AS nodes
RETURN nodes.name;
Again, the results:
If you read it in the reverse order, the final product 3 can’t be produced without the recipe for the final product 3 (one of the variants), and one of those recipes lists intermediate product 2 in the ingredient list. To get to the intermediate product 2, you need the recipe for product 2, and so on.
You can also explore those connections visually, on a graph, in Memgraph Lab:
Besides the ability to visualize your data in a pretty way, Memgraph Lab also offers you the possibility to save the queries you run in a query collection. In that way, it’s quite similar to the Graph Notebook.
Conclusion
It is always fun to find new open-source tools working with various graph databases. Seeing Graph Notebook work with Memgraph out of the box was quite pleasing, and that kind of experience always makes it more rewarding to contribute. If you are committed to creating your own Graph Notebook, feel free to share it with us in our Discord community or contribute directly to the Graph Notebook repository.