How to build a single-agent RAG system with LlamaIndex


By Matea Pesic
8 min read · April 8, 2025

As AI-powered applications continue to evolve, combining knowledge graphs with retrieval-augmented generation (RAG) can greatly improve how data is retrieved and processed. Memgraph, a fast and efficient graph database, works seamlessly with LlamaIndex, a framework designed to optimize information retrieval for large language models (LLMs). In this example, we build a single-agent GraphRAG system using LlamaIndex and Memgraph, integrating RAG with graph-based querying and tool-using agents. We'll explore how to:

  • Set up Memgraph as a graph store for structured knowledge retrieval.
  • Use LlamaIndex to create a Property Graph Index and perform Memgraph's vector search on embedded data.
  • Implement an agent that uses tools for both arithmetic operations and semantic retrieval.

Prerequisites

  1. Make sure you have Docker running in the background.

  2. Run Memgraph. The easiest way to run Memgraph is by using the following commands:

For Linux/macOS:

curl https://install.memgraph.com | sh

For Windows:

iwr https://windows.memgraph.com | iex

  3. Install the necessary dependencies:
%pip install llama-index llama-index-graph-stores-memgraph python-dotenv
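
Before moving on, you can optionally check that Memgraph is reachable on the default Bolt port. This is a minimal sketch that assumes the neo4j Python driver (pulled in as a dependency of the Memgraph integration) and a Memgraph instance with no authentication configured:

from neo4j import GraphDatabase

# Connect to Memgraph over Bolt; empty credentials match a default setup
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("", ""))
with driver.session() as session:
    print(session.run("RETURN 'Memgraph is up' AS status").single()["status"])
driver.close()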

Environment setup

Create .env file that contains your OpenAI API key:

OPENAI_API_KEY=sk-proj-...

Create the script

Let's first load our .env file and set the LLM we want to use. In this example, that's OpenAI's gpt-4 model.

from dotenv import load_dotenv
load_dotenv()
 
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
 
# settings
Settings.llm = OpenAI(model="gpt-4", temperature=0)
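
To confirm the API key and model are picked up correctly, you can fire off a one-off completion as a sanity check (the prompt here is arbitrary):

# sanity check: one-off completion with the configured LLM
print(Settings.llm.complete("Reply with the single word: ready"))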

Define calculator tools

Next, define addition and multiplication functions for calculations and wrap them in LlamaIndex's FunctionTool class so the agent can call them.

from llama_index.core.tools import FunctionTool
 
# function tools
def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product"""
    return a * b
 
multiply_tool = FunctionTool.from_defaults(fn=multiply)
 
def add(a: float, b: float) -> float:
    """Add two numbers and return the sum"""
    return a + b
 
add_tool = FunctionTool.from_defaults(fn=add)
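
Before handing these tools to an agent, you can call them directly to confirm they behave as expected; FunctionTool's call method returns a ToolOutput whose raw_output holds the function's return value:

# invoke the tools directly; raw_output is the wrapped return value
print(multiply_tool.call(2, 3).raw_output)  # 6
print(add_tool.call(2, 3).raw_output)       # 5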

Load the dataset

Besides the basic arithmetic operations, we also want to create a RAG pipeline and perform retrieval over a dataset of our choice. In this example, we're using the Wikipedia page about the 2023 Canadian federal budget, converted to PDF and stored in the data directory. Let's load that dataset:

from llama_index.core import SimpleDirectoryReader
 
documents = SimpleDirectoryReader("./data").load_data()
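
It's worth verifying what the reader produced; by default, SimpleDirectoryReader yields one Document per PDF page, each carrying the extracted text and metadata such as the file name:

# inspect the loaded documents
print(f"Loaded {len(documents)} document(s)")
print(documents[0].metadata)    # file name, page label, ...
print(documents[0].text[:200])  # first 200 characters of extracted text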

Memgraph graph store

We'll now establish a connection to Memgraph, using MemgraphPropertyGraphStore from LlamaIndex. This allows us to store and retrieve structured data efficiently, enabling graph-based querying for retrieval-augmented generation (RAG) pipelines.

from llama_index.graph_stores.memgraph import MemgraphPropertyGraphStore
 
graph_store = MemgraphPropertyGraphStore(
    username="",  # Your Memgraph username, default is ""
    password="",  # Your Memgraph password, default is ""
    url="bolt://localhost:7687"  # Connection URL for Memgraph
)
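
As a quick connectivity check, the property graph store can execute raw Cypher through its structured_query method; here we simply count the nodes currently in the database (likely zero on a fresh instance):

# run raw Cypher through the store to verify the connection
print(graph_store.structured_query("MATCH (n) RETURN count(n) AS node_count"))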

Create a knowledge graph in Memgraph

This section builds a Property Graph Index using PropertyGraphIndex from LlamaIndex. This index allows us to store and retrieve structured knowledge in a graph database (Memgraph) while leveraging OpenAI embeddings for semantic search.

from llama_index.core import PropertyGraphIndex
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
from llama_index.embeddings.openai import OpenAIEmbedding
 
index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"),
    kg_extractors=[
        SchemaLLMPathExtractor(
            llm=OpenAI(model="gpt-4", temperature=0.0)
        )
    ],
    property_graph_store=graph_store,
    show_progress=True,
)
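
Once the index is built, you can also pull raw graph context without generating an answer. as_retriever returns the matching triplets that the query engine would otherwise hand to the LLM:

# retrieve graph context directly, without LLM synthesis
retriever = index.as_retriever(include_text=False)
nodes = retriever.retrieve("What was the 2023 Canadian federal budget?")
for node in nodes:
    print(node.text)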

RAG pipeline: query engine and retrieval agent

Let's now set up a Retrieval-Augmented Generation (RAG) pipeline. The pipeline enables efficient data retrieval from a structured knowledge base (Memgraph) and provides contextual responses using OpenAI's GPT-4.

First, we convert the Property Graph Index into a query engine, allowing structured queries over the indexed data.

query_engine = index.as_query_engine()
 
# smoke test
response = query_engine.query(
    "What was the total amount of the 2023 Canadian federal budget?"
)
print(response)
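
The response object also exposes the source nodes that grounded the answer, which is handy for debugging retrieval quality:

# inspect which chunks were retrieved to support the answer
for node in response.source_nodes:
    print(node.score, node.text[:100])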

Creating and running the agent

Let's now create a RAG agent that can retrieve budget data and perform calculations. First, we define budget_tool, which exposes the query engine over the 2023 Canadian federal budget PDF (if you'd like to follow along, our Jupyter Notebook example provides the PDF file). Then, we create a ReActAgent that combines this tool with the calculation tools, allowing it to both fetch information and handle math operations. Finally, we ask the agent: "What is the total amount of the 2023 Canadian federal budget multiplied by 3?" and print the response to see it work step by step.

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool
 
# RAG pipeline as a tool
budget_tool = QueryEngineTool.from_defaults(
    query_engine,
    name="canadian_budget_2023",
    description="A RAG engine with some basic facts about the 2023 Canadian federal budget."
)
 
# Create the agent with tools
agent = ReActAgent.from_tools([multiply_tool, add_tool, budget_tool], verbose=True)
 
# Query the agent
response = agent.chat("What is the total amount of the 2023 Canadian federal budget multiplied by 3? Go step by step, using a tool to do any math.")
 
print(response)
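
Since ReActAgent keeps conversational memory, you can also ask a follow-up that refers back to the previous answer, and the agent will reuse its tools as needed (the follow-up question here is just illustrative):

# follow-up question resolved against the agent's chat memory
followup = agent.chat("Now add 1,000,000 to that result, using a tool.")
print(followup)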

Conclusion

By integrating LlamaIndex with Memgraph, developers can build more powerful, knowledge-graph-aware AI applications. Whether you need a single-agent RAG system, a multi-agent workflow, or enhanced query relevance using PageRank, this example provides a solid foundation, a simple starting point for building something much more powerful.

Additional resources

Check out our ai-demos repository for more detailed examples, or watch some of our previous webinars covering interesting LLM use cases.
