Integrations
Memgraph offers several integrations with popular AI frameworks to help you customize and build your own GenAI application from scratch. Below are some of the libraries integrated with Memgraph.
LlamaIndex
LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models. Currently, Memgraph’s integration supports creating a knowledge graph from unstructured data and querying it with natural language. You can follow the example in the LlamaIndex docs or go through the quick start below.
Installation
To install LlamaIndex and the Memgraph graph store, run:
pip install llama-index llama-index-graph-stores-memgraph
Environment setup
Before you get started, make sure you have Memgraph running in the background.
To use Memgraph as the underlying graph store for LlamaIndex, define your graph store by providing the credentials used for your database:
from llama_index.graph_stores.memgraph import MemgraphPropertyGraphStore
username = "" # Enter your Memgraph username (default "")
password = "" # Enter your Memgraph password (default "")
url = "" # Specify the connection URL, e.g., 'bolt://localhost:7687'
graph_store = MemgraphPropertyGraphStore(
    username=username,
    password=password,
    url=url,
)
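Instead of hard-coding the credentials, you may prefer to read them from environment variables. The following is a minimal standard-library sketch of that approach. The helper name `load_memgraph_settings` is hypothetical, and the variable names `MEMGRAPH_URI`, `MEMGRAPH_USERNAME`, and `MEMGRAPH_PASSWORD` are a convention borrowed from the LangChain examples further down this page, not something LlamaIndex reads on its own.

```python
import os
from urllib.parse import urlparse


def load_memgraph_settings(env=None):
    """Return url, username and password for a Memgraph connection.

    Hypothetical helper: reads MEMGRAPH_URI, MEMGRAPH_USERNAME and
    MEMGRAPH_PASSWORD, falling back to a local instance without auth.
    """
    env = os.environ if env is None else env
    url = env.get("MEMGRAPH_URI", "bolt://localhost:7687")
    parsed = urlparse(url)
    # Fail early on an obviously malformed URL instead of at connect time.
    if not parsed.scheme or not parsed.hostname:
        raise ValueError(f"Invalid Memgraph connection URL: {url!r}")
    return {
        "url": url,
        "username": env.get("MEMGRAPH_USERNAME", ""),
        "password": env.get("MEMGRAPH_PASSWORD", ""),
    }


settings = load_memgraph_settings({})
print(settings["url"])
```

Because the dictionary keys match the keyword arguments shown above, the result could be passed along as `MemgraphPropertyGraphStore(**settings)`.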
Additionally, a working OpenAI key is required:
import os
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>" # Replace with your OpenAI API key
Dataset
For the dataset, we’ll use a text about Charles Darwin stored in the /data/charles_darwin/charles.txt file:
Charles Robert Darwin was an English naturalist, geologist, and biologist,
widely known for his contributions to evolutionary biology. His proposition that
all species of life have descended from a common ancestor is now generally
accepted and considered a fundamental scientific concept. In a joint publication
with Alfred Russel Wallace, he introduced his scientific theory that this
branching pattern of evolution resulted from a process he called natural
selection, in which the struggle for existence has a similar effect to the
artificial selection involved in selective breeding. Darwin has been described
as one of the most influential figures in human history and was honoured by
burial in Westminster Abbey.
from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader("./data/charles_darwin/").load_data()
The data is now loaded into the documents variable, which we’ll pass as an argument in the next step: index creation and graph construction.
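If the dataset file does not exist yet, the directory layout the reader expects can be prepared with a few lines of standard-library Python. This is only a convenience sketch; `write_dataset` is a hypothetical helper, and it assumes the `data/charles_darwin/` folder lives relative to the directory you run the script from.

```python
from pathlib import Path


def write_dataset(text, base_dir="."):
    """Write text to <base_dir>/data/charles_darwin/charles.txt and return the path."""
    target = Path(base_dir) / "data" / "charles_darwin" / "charles.txt"
    # Create the nested directories if they are missing.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Normalize surrounding whitespace and end the file with a newline.
    target.write_text(text.strip() + "\n", encoding="utf-8")
    return target
```

Calling `write_dataset(text)` with the Charles Darwin paragraph above creates the file that `SimpleDirectoryReader("./data/charles_darwin/")` then picks up.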
Graph construction
LlamaIndex provides multiple graph constructors. In this example, we’ll use the SchemaLLMPathExtractor, which lets you either predefine the schema or use the one the LLM provides without explicitly defining entities.
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
index = PropertyGraphIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(model_name="text-embedding-ada-002"),
    kg_extractors=[
        SchemaLLMPathExtractor(
            llm=OpenAI(model="gpt-4", temperature=0.0),
        )
    ],
    property_graph_store=graph_store,
    show_progress=True,
)
In the image below, you can see how the text was transformed into a knowledge graph and stored in Memgraph.
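To make the idea of a predefined schema more concrete, the standard-library sketch below filters extracted (subject, relation, object) triples against a table of allowed connections. It only illustrates the concept and is not how SchemaLLMPathExtractor is implemented; the entity labels, relation names, and the `filter_triples` helper are all hypothetical.

```python
# Allowed relations per (source label, target label); a hypothetical schema.
SCHEMA = {
    ("PERSON", "PERSON"): {"COLLABORATED_WITH"},
    ("PERSON", "THEORY"): {"INTRODUCED"},
    ("PERSON", "PLACE"): {"BURIED_IN"},
}


def filter_triples(triples, schema=SCHEMA):
    """Keep only triples whose relation is allowed between the two labels."""
    kept = []
    for (subj, subj_label), relation, (obj, obj_label) in triples:
        allowed = schema.get((subj_label, obj_label), set())
        if relation in allowed:
            kept.append((subj, relation, obj))
    return kept


extracted = [
    (("Charles Darwin", "PERSON"), "COLLABORATED_WITH", ("Alfred Russel Wallace", "PERSON")),
    (("Charles Darwin", "PERSON"), "INTRODUCED", ("Natural selection", "THEORY")),
    # THEORY -> PLACE is not covered by the schema, so this triple is dropped.
    (("Natural selection", "THEORY"), "BURIED_IN", ("Westminster Abbey", "PLACE")),
]

for triple in filter_triples(extracted):
    print(triple)
```

Without a predefined schema there is no such filter, so whatever entities and relations the LLM proposes end up in the graph.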
Querying
Labeled property graphs can be queried in several ways to retrieve nodes and paths, and LlamaIndex lets you combine several node retrieval methods at once.
If no sub-retrievers are provided, the defaults are LLMSynonymRetriever and VectorContextRetriever (if supported).
As of the latest update, LlamaIndex uses Memgraph’s vector search feature in the background to enhance retrieval. Running vector similarity searches over the embeddings stored in the graph makes querying faster and more accurate, which leads to more precise, context-aware answers.
query_engine = index.as_query_engine(include_text=True)
response = query_engine.query("Who did Charles Robert Darwin collaborate with?")
print(str(response))
In the image below, you can see what’s happening under the hood to get the answer.
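As a rough picture of how a keyword-style retriever and a vector retriever can be combined, consider the toy example below. It is a sketch under heavy assumptions: the embeddings are made-up three-dimensional vectors, and `keyword_retrieve`, `vector_retrieve`, and `combined_retrieve` are hypothetical names. LlamaIndex and Memgraph work with real embeddings and indexes instead.

```python
import math

# Toy node store: name -> made-up embedding (real embeddings have far more dimensions).
NODES = {
    "Charles Darwin": [0.9, 0.1, 0.0],
    "Alfred Russel Wallace": [0.8, 0.2, 0.1],
    "Westminster Abbey": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_retrieve(keywords):
    """Return node names that contain any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [name for name in NODES if any(k in name.lower() for k in lowered)]


def vector_retrieve(query_embedding, top_k=2):
    """Return the top_k node names ranked by cosine similarity."""
    ranked = sorted(
        NODES,
        key=lambda name: cosine_similarity(NODES[name], query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


def combined_retrieve(keywords, query_embedding, top_k=2):
    """Merge both result lists, keeping order and dropping duplicates."""
    merged = keyword_retrieve(keywords) + vector_retrieve(query_embedding, top_k)
    return list(dict.fromkeys(merged))


print(combined_retrieve(["abbey"], [1.0, 0.0, 0.0]))
```

The keyword match contributes a node the vector search would have ranked last, which is the practical reason for running more than one retriever.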
Demos
If you’d like to take it one step further, explore how Memgraph and LlamaIndex work together in real-world applications with these interactive demos:
- Single-agent GraphRAG system: Learn how to build an agent-powered graph retrieval-augmented generation (RAG) system using Memgraph and LlamaIndex.
- Multi-agent GraphRAG system: Dive into a more advanced setup with multiple agents collaborating in a GraphRAG system.
LangChain
LangChain is a framework for developing applications powered by large language models (LLMs).
Memgraph’s LangChain integration provides the Memgraph toolkit for building agentic applications, knowledge graph construction from unstructured data, and MemgraphQAChain for querying with natural language.
Recently, we migrated the Memgraph LangChain integration to a repository under the Memgraph organization for easier management.
Memgraph toolkit
The LangGraph framework enables users to build agentic applications. Memgraph now offers a toolkit for building agents that can autonomously interact with the Memgraph database.
Currently, the Memgraph toolkit supports the following tools:
- QueryMemgraphTool: Tool for executing Cypher queries on Memgraph.
We have just started building the Memgraph toolkit. If you’re interested in having more tools, please open an issue on our repository or contribute by opening a pull request.
Installation
Before starting to write code, make sure you have installed the required packages in your environment:
pip install langchain langchain-openai langchain-memgraph langgraph
Don’t forget to install langgraph, as it is a prerequisite for using the Memgraph toolkit.
Environment setup
Make sure you have Memgraph running in the background.
After that, you can instantiate Memgraph in your Python code. This object holds the connection to the running Memgraph instance.
import os
from getpass import getpass

from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent
from langchain_memgraph import MemgraphToolkit
from langchain_memgraph.graphs.memgraph import Memgraph

# Set up the Memgraph connection.
url = os.getenv("MEMGRAPH_URI", "bolt://localhost:7687")
username = os.getenv("MEMGRAPH_USERNAME", "")
password = os.getenv("MEMGRAPH_PASSWORD", "")
db = Memgraph(
    url=url, username=username, password=password, refresh_schema=False
)

# Set up the LLM and the Memgraph toolkit for the ReAct agent.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter API key for OpenAI: ")
llm = init_chat_model("gpt-4o-mini", model_provider="openai")
toolkit = MemgraphToolkit(db=db, llm=llm)
The refresh_schema is initially set to False because there is still no data in the database, and we want to avoid unnecessary database calls. The code above also initializes the LLM chat model from OpenAI and gets the toolkit for Memgraph.
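The effect of deferring the schema load can be pictured with a small stand-in class. This is only a sketch of the idea: `LazySchemaGraph` and its `run_query` callable are hypothetical and do not reflect how the langchain-memgraph Memgraph class is implemented, and the schema query string is a placeholder.

```python
class LazySchemaGraph:
    """Stand-in for a graph connection that can skip the initial schema load."""

    def __init__(self, run_query, refresh_schema=True):
        self._run_query = run_query
        self.schema = None
        if refresh_schema:
            self.refresh_schema()

    def refresh_schema(self):
        # One round trip to the database per refresh (placeholder query).
        self.schema = self._run_query("<schema query>")


calls = []


def fake_run_query(query):
    """Record each call instead of contacting a database."""
    calls.append(query)
    return {"node_labels": ["Character"]}


graph = LazySchemaGraph(fake_run_query, refresh_schema=False)
print(len(calls))  # no schema call has been made at this point
graph.refresh_schema()
print(graph.schema)
```

The trade-off is that the schema stays empty until `refresh_schema()` is called explicitly, so it has to be refreshed once data has been loaded.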
Agent setup
After setting up Memgraph and the toolkit, you can create an agent that will use the toolkit to solve a particular problem:
agent_executor = create_react_agent(
    llm,
    toolkit.get_tools(),
    prompt="You will get a cypher query, try to execute it on the Memgraph database.",
)
This is a simple example of an agent using a tool that executes a Cypher query.
Running agent
Now, we can create a node in the database and run the agent:
query = """
    CREATE (c:Character {name: 'Jon Snow', house: 'Stark', title: 'King in the North'})
"""
# Insert the node, then refresh the schema so it reflects the new data.
db.query(query)
db.refresh_schema()

example_query = "MATCH (n) WHERE n.name = 'Jon Snow' RETURN n"
events = agent_executor.stream(
    {"messages": [("user", example_query)]},
    stream_mode="values",
)

last_event = None
for event in events:
    last_event = event
    event["messages"][-1].pretty_print()

print(last_event)
The agent will autonomously pick a tool and use the toolkit to solve the requested problem.
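With stream_mode="values", each streamed event carries the full list of messages so far, which is why the loop reads event["messages"][-1] to get the newest one. The sketch below mimics that shape with plain dictionaries and tuples so it can run without an LLM or a database. `fake_stream` is a hypothetical stand-in for the agent, and the message contents are invented for illustration.

```python
def fake_stream(user_query):
    """Yield growing state snapshots, similar in shape to stream_mode="values"."""
    messages = [("user", user_query)]
    yield {"messages": list(messages)}
    messages.append(("ai", "calling a query tool"))
    yield {"messages": list(messages)}
    messages.append(("tool", "[{'n': {'name': 'Jon Snow'}}]"))
    yield {"messages": list(messages)}
    messages.append(("ai", "Found one node named Jon Snow."))
    yield {"messages": list(messages)}


last_event = None
for event in fake_stream("MATCH (n) WHERE n.name = 'Jon Snow' RETURN n"):
    last_event = event
    role, content = event["messages"][-1]  # newest message in this snapshot
    print(f"{role}: {content}")

# The final snapshot holds the whole conversation, ending with the answer.
final_answer = last_event["messages"][-1][1]
```

Keeping a reference to the last event, as the original loop does, gives you the complete conversation once streaming finishes.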
Querying unstructured data
You can follow the example in the LangChain docs or go through the quick start below.
Installation
To install all the required packages, run:
pip install langchain langchain-openai langchain-memgraph langchain-experimental
Environment setup
Before you get started, make sure you have Memgraph running in the background.
Then, instantiate Memgraph in your Python code. This object holds the connection to the running Memgraph instance. Make sure to set up all the environment variables properly.
import os
from langchain_openai import ChatOpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_memgraph.graphs.graph_document import Document
from langchain_memgraph.chains.graph_qa import MemgraphQAChain
from langchain_memgraph.graphs.memgraph import Memgraph
url = os.environ.get("MEMGRAPH_URI", "bolt://localhost:7687")
username = os.environ.get("MEMGRAPH_USERNAME", "")
password = os.environ.get("MEMGRAPH_PASSWORD", "")
graph = Memgraph(
    url=url, username=username, password=password, refresh_schema=False
)
The refresh_schema is initially set to False because there is still no data in the database and we want to avoid unnecessary database calls.
To interact with the LLM, you must configure it. Here is how you can set the API key as an environment variable for OpenAI:
os.environ["OPENAI_API_KEY"] = "your-key-here"
Graph construction
For the dataset, we’ll use the following text about Charles Darwin:
text = """
Charles Robert Darwin was an English naturalist, geologist, and biologist,
widely known for his contributions to evolutionary biology. His proposition that
all species of life have descended from a common ancestor is now generally
accepted and considered a fundamental scientific concept. In a joint
publication with Alfred Russel Wallace, he introduced his scientific theory that
this branching pattern of evolution resulted from a process he called natural
selection, in which the struggle for existence has a similar effect to the
artificial selection involved in selective breeding. Darwin has been
described as one of the most influential figures in human history and was
honoured by burial in Westminster Abbey.
"""
To construct the graph, first initialize LLMGraphTransformer with the desired LLM, then convert the document into a graph structure.
llm = ChatOpenAI(temperature=0, model_name="gpt-4-turbo")
llm_transformer = LLMGraphTransformer(llm=llm)
documents = [Document(page_content=text)]
graph_documents = llm_transformer.convert_to_graph_documents(documents)
The graph structure in the GraphDocument format can be forwarded to the add_graph_documents() procedure to import it into Memgraph:
# Make sure the database is empty
graph.query("STORAGE MODE IN_MEMORY_ANALYTICAL")
graph.query("DROP GRAPH")
graph.query("STORAGE MODE IN_MEMORY_TRANSACTIONAL")
# Create KG
graph.add_graph_documents(graph_documents)
The add_graph_documents() procedure transforms the list of graph_documents into appropriate Cypher queries and executes them in Memgraph.
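To give a feel for that translation, here is a simplified standard-library sketch that turns nodes and relationships into parameterized Cypher MERGE statements. The `Node` and `Relationship` dataclasses and the `to_cypher` function are hypothetical simplifications; the queries that add_graph_documents() actually generates may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    type: str


@dataclass(frozen=True)
class Relationship:
    source: Node
    target: Node
    type: str


def to_cypher(nodes, relationships):
    """Return a list of (query, params) pairs that recreate the graph."""
    statements = []
    for node in nodes:
        # Labels go into the query text; only property values are sent as parameters.
        statements.append((f"MERGE (n:`{node.type}` {{id: $id}})", {"id": node.id}))
    for rel in relationships:
        query = (
            f"MATCH (a:`{rel.source.type}` {{id: $source_id}}), "
            f"(b:`{rel.target.type}` {{id: $target_id}}) "
            f"MERGE (a)-[:`{rel.type}`]->(b)"
        )
        statements.append((query, {"source_id": rel.source.id, "target_id": rel.target.id}))
    return statements


darwin = Node("Charles Robert Darwin", "Person")
wallace = Node("Alfred Russel Wallace", "Person")
collab = Relationship(darwin, wallace, "COLLABORATION")

for query, params in to_cypher([darwin, wallace], [collab]):
    print(query, params)
```

Using MERGE rather than CREATE in this sketch means that running the import twice does not duplicate nodes or relationships.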
In the image below, you can see how the text was transformed into a knowledge graph and stored in Memgraph.
For additional options, check the full guide on the LangChain docs.
Querying
Finally, you can query the knowledge graph:
chain = MemgraphQAChain.from_llm(
    ChatOpenAI(temperature=0),
    graph=graph,
    model_name="gpt-4-turbo",
    allow_dangerous_requests=True,
)
print(chain.invoke("Who did Charles Robert Darwin collaborate with?")["result"])
Here is the result:
MATCH (:Person {id: "Charles Robert Darwin"})-[:COLLABORATION]->(collaborator)
RETURN collaborator;
Alfred Russel Wallace
In the image below, you can see what’s happening under the hood to get the answer.