
How to build a single-agent RAG system with LlamaIndex?
As AI-powered applications evolve, combining knowledge graphs with retrieval-augmented generation (RAG) can greatly improve how data is retrieved and processed. Memgraph, a fast and efficient graph database, works seamlessly with LlamaIndex, a framework designed to optimize information retrieval for large language models (LLMs). In this example, we build a single-agent GraphRAG system using LlamaIndex and Memgraph, integrating RAG with graph-based querying and tool-using agents. We'll explore how to:
- Set up Memgraph as a graph store for structured knowledge retrieval.
- Use LlamaIndex to create a Property Graph Index and perform Memgraph's vector search on embedded data.
- Implement an agent that uses tools for both arithmetic operations and semantic retrieval.
Prerequisites
- Make sure you have Docker running in the background.
- Run Memgraph. The easiest way is by using the following commands:
  For Linux/macOS:
  curl https://install.memgraph.com | sh
  For Windows:
  iwr https://windows.memgraph.com | iex
- Install the necessary dependencies:
  %pip install llama-index llama-index-graph-stores-memgraph python-dotenv
Environment setup
Create a .env file that contains your OpenAI API key:
OPENAI_API_KEY=sk-proj-...
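Optionally, you can confirm the key is picked up before wiring up the rest of the script. A minimal check, assuming the .env file sits in the working directory:
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY not found in environment"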
Create the script
Let's first load our .env file and set the LLM model we want to use. In this example, we're using OpenAI's gpt-4 model.
from dotenv import load_dotenv

load_dotenv()

from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

# Use GPT-4 with deterministic output as the default LLM for all LlamaIndex components
Settings.llm = OpenAI(model="gpt-4", temperature=0)
Define calculator tools
Next, define addition and multiplication tools for calculations and wrap them with the FunctionTool class.
from llama_index.core.tools import FunctionTool

# function tools
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

multiply_tool = FunctionTool.from_defaults(fn=multiply)

def add(a: float, b: float) -> float:
    """Add two numbers and return the sum."""
    return a + b

add_tool = FunctionTool.from_defaults(fn=add)
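Before handing the tools to an agent, you can optionally call them directly as a sanity check: a FunctionTool is invoked with call(), which returns a ToolOutput whose raw_output field holds the function's return value.
# Quick sanity check: call the tools directly
print(multiply_tool.call(a=2.0, b=3.0).raw_output)  # 6.0
print(add_tool.call(a=2.0, b=3.0).raw_output)       # 5.0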
Load the dataset
Besides the basic operations, we also want to create a RAG pipeline and perform retrieval on a dataset of our choice. In this example, we're using the Wikipedia page about the 2023 Canadian federal budget, converted to a PDF file and stored in the data directory. Let's load that dataset:
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader("./data").load_data()
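Optionally, a quick look at what was loaded can confirm the PDF was parsed before we build the index. With the default PDF reader, each page typically becomes one Document:
# Inspect the loaded documents
print(f"Loaded {len(documents)} document(s)")
print(documents[0].metadata)  # e.g. file name, path, page label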
Memgraph graph store
We'll now establish a connection to Memgraph, using MemgraphPropertyGraphStore from LlamaIndex. This allows us to store and retrieve structured data efficiently, enabling graph-based querying for RAG pipelines.
from llama_index.graph_stores.memgraph import MemgraphPropertyGraphStore

graph_store = MemgraphPropertyGraphStore(
    username="",  # Your Memgraph username, default is ""
    password="",  # Your Memgraph password, default is ""
    url="bolt://localhost:7687"  # Connection URL for Memgraph
)
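To confirm the connection works, you can run a trivial Cypher query through the store's structured_query method (part of LlamaIndex's property graph store interface). This assumes Memgraph is reachable at bolt://localhost:7687:
# A trivial Cypher round-trip to verify connectivity
print(graph_store.structured_query("RETURN 1 AS ok"))  # e.g. [{'ok': 1}]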
Create a knowledge graph in Memgraph
This section builds a Property Graph Index using PropertyGraphIndex from LlamaIndex. This index allows us to store and retrieve structured knowledge in a graph database (Memgraph) while leveraging OpenAI embeddings for semantic search.
from llama_index.core import PropertyGraphIndex
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
from llama_index.embeddings.openai import OpenAIEmbedding

index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"),
    kg_extractors=[
        SchemaLLMPathExtractor(
            llm=OpenAI(model="gpt-4", temperature=0.0)
        )
    ],
    property_graph_store=graph_store,
    show_progress=True,
)
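Once the index is built, it can also be used as a plain retriever to see which graph facts match a question, without LLM answer synthesis. A minimal sketch:
# Retrieve matching graph facts without synthesizing an answer
retriever = index.as_retriever(include_text=False)
nodes = retriever.retrieve("What was the total amount of the 2023 Canadian federal budget?")
for node in nodes:
    print(node.text)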
RAG Pipeline: query engine and retrieval agent
Let's now set up a Retrieval-Augmented Generation (RAG) pipeline. The pipeline enables efficient data retrieval from a structured knowledge base (Memgraph) and provides contextual responses using OpenAI's GPT-4.
First, we convert the Property Graph Index into a query engine, allowing structured queries over the indexed data.
query_engine = index.as_query_engine()

# Smoke test: query the indexed data directly
response = query_engine.query(
    "What was the total amount of the 2023 Canadian federal budget?"
)
print(response)
Creating and running the agent
Let's now create a RAG agent that can retrieve budget data and perform calculations. First, we define budget_tool, which provides facts from the 2023 Canadian federal budget Wikipedia page that was turned into a PDF file (if you'd like to follow along, visit our Jupyter Notebook example, which provides the PDF file). Then, we create a ReActAgent that combines this tool with the calculation tools, allowing it to both fetch information and handle math operations. Finally, we ask the agent: "What is the total amount of the 2023 Canadian federal budget multiplied by 3?" and print the response to see it work step by step.
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

# Expose the RAG pipeline as a tool the agent can call
budget_tool = QueryEngineTool.from_defaults(
    query_engine,
    name="canadian_budget_2023",
    description="A RAG engine with some basic facts about the 2023 Canadian federal budget."
)

# Create the agent with the calculator and retrieval tools
agent = ReActAgent.from_tools([multiply_tool, add_tool, budget_tool], verbose=True)

# Query the agent
response = agent.chat("What is the total amount of the 2023 Canadian federal budget multiplied by 3? Go step by step, using a tool to do any math.")
print(response)
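Because ReActAgent keeps conversational memory, follow-up questions can refer to earlier results. A hypothetical follow-up that reuses the add tool:
# "that result" resolves against the agent's chat history
followup = agent.chat("Now add 100 to that result, using a tool.")
print(followup)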
Conclusion
By integrating LlamaIndex with Memgraph, developers can build more powerful, knowledge-graph-aware AI applications. Whether you need a single-agent RAG system, a multi-agent workflow, or enhanced query relevance using PageRank, these examples provide a solid foundation. These are simple examples, but they show how to start building something much more powerful.
Additional resources
Check out our ai-demos repository for more detailed examples, or watch some of our previous webinars covering interesting LLM use cases, such as: