
How to build Agentic RAG with PageRank using LlamaIndex?
In our previous blog posts, we first showed how to build a simple single-agent RAG system using LlamaIndex and Memgraph. Then in the second part, we expanded that concept into a multi-agent setup, enabling more powerful workflows with multiple agents collaborating on tasks like data retrieval and calculations.
Now, we’re taking it a step further by integrating graph algorithms into the mix. In this post, you’ll learn how to create an agentic RAG system that runs Memgraph's PageRank inside a multi-agent workflow.
In this example, we'll create a multi-agent workflow using LlamaIndex and Memgraph to perform graph-based querying and computation. We'll explore how to:
- Set up Memgraph as a graph store and create a sample dataset.
- Use LlamaIndex to define function agents for retrieval and arithmetic operations.
- Implement a retriever agent to run the PageRank algorithm and extract ranked nodes.
- Use a calculator agent to process numerical data from retrieved nodes.
- Design an AgentWorkflow that integrates retrieval and computation for automated query execution.
By the end, we'll have a system capable of retrieving graph-based data and performing calculations dynamically.
Prerequisites
- Make sure you have Docker running in the background.
- Run Memgraph. The easiest way to run Memgraph is using the following commands:
  For Linux/macOS:
  curl https://install.memgraph.com | sh
  For Windows:
  iwr https://windows.memgraph.com | iex
- Install the necessary dependencies:
pip install llama-index llama-index-graph-stores-memgraph python-dotenv neo4j
Environment setup
Create a .env file that contains your OpenAI API key and the environment variables needed to connect to your Memgraph instance. If you haven't created a Memgraph user, the authentication values default to empty strings:
OPENAI_API_KEY=sk-proj-...
URI=bolt://localhost:7687
AUTH_USER=""
AUTH_PASS=""
Create the script
Let's first load our .env file and set the LLM model we want to use. In this example, we're using OpenAI's GPT-4 model.
from dotenv import load_dotenv
load_dotenv()
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
# settings
Settings.llm = OpenAI(model="gpt-4", temperature=0)
Connect to Memgraph
In this section, we'll establish a connection to Memgraph using the environment variables for authentication and connection details.
- Retrieve environment variables: the script fetches the URI, AUTH_USER and AUTH_PASS values from the environment using os.getenv(). These values determine how the script connects to the Memgraph database.
- Set up authentication: the credentials (AUTH_USER, AUTH_PASS) are combined into a tuple (AUTH) to be used for authentication.
- Create a Memgraph connection: a connection to Memgraph is established using GraphDatabase.driver(URI, auth=AUTH).
This setup ensures that the script can interact with your Memgraph instance.
import os
from neo4j import GraphDatabase
from llama_index.graph_stores.memgraph import MemgraphPropertyGraphStore
URI = os.getenv("URI")
AUTH_USER = os.getenv("AUTH_USER")
AUTH_PASS = os.getenv("AUTH_PASS")
AUTH = (AUTH_USER, AUTH_PASS)
driver = GraphDatabase.driver(URI, auth=AUTH)
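Before moving on, it can help to confirm that the connection details are correct. This is an optional sketch; verify_connectivity() is part of the neo4j Python driver installed in the prerequisites:
# Optional: fail fast if the URI or credentials are wrong.
driver.verify_connectivity()
print("Connected to Memgraph at", URI)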
Define calculator tools
Next, define addition and subtraction tools for calculations and a calculator agent. The role of the agent in this case will be to perform basic arithmetic operations with access to the defined tools.
from llama_index.core.tools import FunctionTool
from llama_index.core.agent.workflow import FunctionAgent
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b
# Create agent configs
calculator_agent = FunctionAgent(
    name="calculator",
    description="Performs basic arithmetic operations",
    system_prompt="You are a calculator assistant.",
    tools=[
        FunctionTool.from_defaults(fn=add),
        FunctionTool.from_defaults(fn=subtract),
    ],
    llm=OpenAI(model="gpt-4"),
)
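If you'd like to check the agent in isolation before wiring up the full workflow, recent LlamaIndex versions let you run a FunctionAgent directly. A minimal sketch; the exact entry point may differ in older versions:
import asyncio

async def test_calculator():
    # Ask the calculator agent a simple question; it should call the add/subtract tools.
    response = await calculator_agent.run(user_msg="What is 12 + 7 - 3?")
    print(response)

asyncio.run(test_calculator())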
Next, define a function to execute Cypher queries and implement a PageRank retrieval tool. The retriever agent is responsible for running the PageRank algorithm and retrieving ranked nodes using the defined tool.
def execute_query(query: str):
    """Runs a given Cypher query inside a session."""
    with driver.session() as session:
        return session.execute_read(lambda tx: list(tx.run(query)))

def run_pagerank():
    """Executes the PageRank algorithm."""
    query = "CALL pagerank.get() YIELD node, rank RETURN node, rank ORDER BY rank DESC LIMIT 5"
    return execute_query(query)
pagerank_tool = FunctionTool.from_defaults(
    fn=run_pagerank,
    name="pagerank_tool",
    description="Runs the PageRank algorithm and retrieves ranked nodes."
)

retriever_agent = FunctionAgent(
    name="retriever",
    description="Manages data retrieval",
    system_prompt="You have the ability to run the PageRank algorithm.",
    tools=[
        pagerank_tool,
    ],
    llm=OpenAI(model="gpt-4"),
    memory=None,
)
Create the dataset
Now, let's create a small dataset in Memgraph consisting of 10 nodes, each with a weight property. The nodes are connected through LINKS_TO relationships, forming a structured graph. To create your graph, run the following Cypher query in your Memgraph instance:
CREATE (n1:Node {id: 1, weight: 1.2}), (n2:Node {id: 2, weight: 2.5}),
       (n3:Node {id: 3, weight: 0.8}), (n4:Node {id: 4, weight: 1.7}),
       (n5:Node {id: 5, weight: 3.0}), (n6:Node {id: 6, weight: 2.2}),
       (n7:Node {id: 7, weight: 1.0}), (n8:Node {id: 8, weight: 2.8}),
       (n9:Node {id: 9, weight: 1.5}), (n10:Node {id: 10, weight: 2.0}),
       (n1)-[:LINKS_TO]->(n2), (n1)-[:LINKS_TO]->(n3),
       (n2)-[:LINKS_TO]->(n4), (n3)-[:LINKS_TO]->(n4),
       (n4)-[:LINKS_TO]->(n5), (n5)-[:LINKS_TO]->(n6),
       (n6)-[:LINKS_TO]->(n7), (n7)-[:LINKS_TO]->(n8),
       (n8)-[:LINKS_TO]->(n9), (n9)-[:LINKS_TO]->(n10),
       (n10)-[:LINKS_TO]->(n1), (n3)-[:LINKS_TO]->(n6),
       (n4)-[:LINKS_TO]->(n9), (n7)-[:LINKS_TO]->(n2),
       (n8)-[:LINKS_TO]->(n5);
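With the dataset in place, you can optionally call the retrieval function directly to confirm that PageRank returns results before handing it to an agent. A small sketch, assuming the MAGE pagerank module is available in your Memgraph instance (which should be the case if you started Memgraph with the commands from the prerequisites):
# Optional sanity check: print the five highest-ranked nodes directly.
for record in run_pagerank():
    node = record["node"]
    print(f"Node {node['id']}: rank={record['rank']:.4f}, weight={node['weight']}")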
Memgraph graph store
We'll now establish a connection to Memgraph using MemgraphPropertyGraphStore from LlamaIndex. This allows us to store and retrieve structured data efficiently, enabling graph-based querying for retrieval-augmented generation (RAG) pipelines.
from llama_index.graph_stores.memgraph import MemgraphPropertyGraphStore
graph_store = MemgraphPropertyGraphStore(
    username="",  # Your Memgraph username, default is ""
    password="",  # Your Memgraph password, default is ""
    url="bolt://localhost:7687",  # Connection URL for Memgraph
)
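The store isn't used by the PageRank workflow below, but to illustrate where it fits, here's a sketch of querying the same graph through the LlamaIndex store object. It assumes structured_query() behaves as in other LlamaIndex property graph stores, running plain Cypher and returning the result rows:
# A sketch: run Cypher through the LlamaIndex graph store object.
result = graph_store.structured_query(
    "MATCH (n:Node) RETURN count(n) AS node_count"
)
print(result)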
Creating and running the workflow
Finally, let's create an AgentWorkflow that ties together the previously defined agents, including the calculator and retriever agents. The workflow runs the PageRank algorithm, retrieves nodes, and sums their weight properties using the addition tool.
We define an async function to execute the workflow, sending a user query that asks it to run the PageRank algorithm and then, using the addition tool, add up the weight properties of the returned nodes.
from llama_index.core.agent.workflow import (
AgentWorkflow,
FunctionAgent,
ReActAgent,
)
import asyncio
# Create and run the workflow
workflow = AgentWorkflow(
    agents=[calculator_agent, retriever_agent], root_agent="retriever"
)
# Define an async function to run the workflow
async def run_workflow():
    response = await workflow.run(user_msg="Run the PageRank algorithm and, using the addition tool, add all of the weight properties of the returned nodes.")
    print(response)
# Run the async function using asyncio
asyncio.run(run_workflow())
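Note that asyncio.run() raises an error in environments that already run an event loop, such as Jupyter notebooks; there, await the coroutine directly:
# In a notebook, replace asyncio.run(run_workflow()) with:
await run_workflow()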
Conclusion
This post builds on the agentic RAG foundation we introduced in part one and expanded in part two. By adding PageRank into the agent workflow, we've demonstrated how to bring graph intelligence into agentic systems, enabling more meaningful and relevant data processing.
What we’ve shown here is just the beginning. Combining graph algorithms with multi-agent LLM systems opens up possibilities for context-aware, automated workflows. As always, we encourage you to experiment further, explore our ai-demos repo and keep building smarter GenAI pipelines with Memgraph and LlamaIndex.
Additional resources
Check out some of our previous webinars with interesting LLM use cases, such as: