TrustGraph and Memgraph: Knowledge Retrieval for Complex Industries

By Memgraph

6 min read · December 23, 2024

Data silos stall innovation. In industries like aerospace and law, disconnected information—from compliance docs to technical standards—buries critical insights. Innovation slows, risk aversion grows, and progress halts.

At a recent Memgraph Community Call, the founders of TrustGraph, Daniel Davis and Mark Adams, showed how their open-source framework turns this challenge into an opportunity. By combining AI agents with knowledge graphs stored in Memgraph, TrustGraph transforms unstructured chaos into connected knowledge, enabling smarter decisions at scale.

Watch the full webinar recording: Using AI Agents to Make Sense of the UK Law at Scale.

The Problem: Too Much Data, Not Enough Structure

Heavily regulated industries face unique challenges:

Tens of thousands of pages of documentation, often interlinked and inconsistent.
Fragmented data silos that make it nearly impossible to see the big picture.
Risk aversion, which stifles innovation.

As Mark Adams noted: “No one person can understand it all, and that’s why we see so little innovation in these fields.”

The Solution: TrustGraph and Memgraph

TrustGraph is an open-source framework built to:

Extract critical information from unstructured datasets.
Transform that data into a graph format for better insights.
Query it using AI-driven GraphRAG workflows.

During the demo, Memgraph handled thousands of graph nodes and edges from UK legislation, offering lightning-fast insights into highly connected legal texts.

Talking Point 1: Introduction to TrustGraph

TrustGraph was conceived as end-to-end infrastructure for working with unstructured data using GraphRAG. Founders Daniel and Mark drew on their experience with knowledge graphs and enterprise AI systems, particularly with scaling issues. They found that existing tools and frameworks were suitable for demos but inadequate for production at scale in enterprise environments.

Based on this need, TrustGraph was designed to overcome reliability, latency, and scalability limitations in the current AI ecosystem.

In short, what’s TrustGraph?

It’s scalable data engineering for AI. Built on Apache Pulsar and containers, it handles data extraction, retrieval, and transformation for GraphRAG and AI agents. It provides out-of-the-box tools designed for production-grade knowledge systems, with scalability, error recovery, and observability baked in.

Talking Point 2: Can AI Handle Large and Complex Data for Document Q&A?

TrustGraph’s founders tackled a massive problem: making sense of thousands of pages of complex technical documents, like safety engineering standards. The key isn’t just ingestion—it’s structuring the data so AI can understand it.

Enter chunking. Breaking documents into smaller, manageable pieces dramatically improves performance and accuracy, but it requires a robust pipeline to handle the volume. The real issue isn’t hallucination—it’s misinformation, where AI overlooks critical but indirect connections.
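
As a rough illustration of chunking, the sketch below splits text into overlapping word-based windows. The function name chunk_text and the size and overlap parameters are hypothetical and are not part of TrustGraph's API; a production pipeline would more likely chunk by tokens or by document structure.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk, which is one simple way to reduce missed connections between neighboring passages.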

TrustGraph solves this with knowledge graphs. Using RDF schemas, it organizes extracted data into a connected framework, making it easier to query and manage complex datasets across domains.
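
To make the idea of a connected framework concrete, here is a minimal sketch that stores extracted facts as subject-predicate-object triples and follows them across several hops. The Triple type, the helper names, and the sample facts ("Act A", "Standard X") are invented placeholders for illustration, not TrustGraph's data model or real legislation.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def build_index(triples):
    """Map each subject to its outgoing (predicate, object) pairs."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.subject].append((triple.predicate, triple.obj))
    return index


def reachable(index, start, max_hops=2):
    """Return entities reachable from `start` within `max_hops` edges (BFS)."""
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for _predicate, target in index.get(node, []):
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    seen.discard(start)
    return seen


# Placeholder facts, invented purely for this example.
facts = [
    Triple("Act A", "replaces", "Act B"),
    Triple("Act B", "references", "Standard X"),
]
```

With these placeholder facts, a two-hop traversal from "Act A" reaches "Standard X" even though no single statement links them directly, which is the kind of indirect connection that is easy to miss when chunks are queried in isolation.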

Talking Point 3: TrustGraph Architecture

TrustGraph is built on Apache Pulsar’s Pub/Sub architecture, ensuring reliability and scalability. The architecture enables real-time processing, fault tolerance, and parallel workflows. This design supports fast ingestion and processing of unstructured data while minimizing latency and system downtime.
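
The publish/subscribe pattern can be sketched in-process with Python's standard library. This is only a stand-in for the idea: it does not use the Pulsar client, and the names run_stage and process_all are hypothetical. A real broker such as Pulsar adds durable storage and message acknowledgement, which this toy version lacks.

```python
import queue
import threading


def run_stage(inbox, outbox, handler, workers=2):
    """Start workers that consume from `inbox` and publish results to `outbox`."""
    def worker():
        while True:
            message = inbox.get()
            if message is None:  # sentinel value: stop this worker
                break
            try:
                outbox.put(handler(message))
            except Exception as exc:
                # Report the failure as a message instead of crashing the worker.
                outbox.put(("error", message, str(exc)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    return threads


def process_all(messages, handler, workers=2):
    """Push messages through one stage and collect everything it published."""
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = run_stage(inbox, outbox, handler, workers)
    for message in messages:
        inbox.put(message)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    results = []
    while not outbox.empty():
        results.append(outbox.get())
    return results
```

Because workers run in parallel, results may arrive in any order. Decoupling stages through queues is what lets each stage scale and fail independently of the others.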

Talking Point 4: Demo - AI Meets UK Law

The live demo shows how to set up TrustGraph and demonstrates its capabilities by processing UK legislation from 2024, sourced from legislation.gov.uk, and visualizing relationships within the dataset. This is an interesting use case because the data is highly connected and legal language differs from everyday natural language, so it is always interesting to see how LLMs process it. The texts are complex: they contain many internal references, and some acts are written to replace older acts.

Talking Point 5: Memgraph

With Memgraph as one of the options for graph storage and querying, the demo highlights the speed and efficiency of queries and visualizations over 140,000 graph embeddings and 200,000 graph edges. The visualizations are done in Memgraph Lab. Real-world applications include legal AI tools, compliance, and cybersecurity. Memgraph’s flexibility aligns with TrustGraph’s modular, containerized architecture.

Daniel and Mark mention Memgraph MAGE in the context of exploring analytical techniques to gain insights from knowledge graphs, particularly when no predefined ontology is available. These algorithms can help identify significant nodes, paths, and clusters within the graph.
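
One common way to identify significant nodes without an ontology is PageRank, which is among the algorithms MAGE provides. The pure-Python version below is a simplified sketch of the technique for illustration, not MAGE's implementation; the function name and default parameters are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Estimate node importance from directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for source in nodes:
            # A node with no outgoing edges spreads its rank over all nodes.
            targets = out_links[source] or nodes
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

In a legislation graph, a provision that many others reference would tend to receive a high score, which makes it a reasonable starting point for exploration.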

Q&A

We’ve compiled the questions and answers from the community call Q&A session.

Note that we’ve paraphrased them slightly for brevity. For complete details, watch the entire video.

Can you explain GraphRAG querying and what happens in the background?

Answer: GraphRAG querying involves converting knowledge graph data into a format LLMs can process. The system uses a templated prompt, where the "knowledge statements" are represented in Cypher format (subject, predicate, object). Cypher notation works well with most LLMs, outperforming RDF formats like Turtle, which often yield inconsistent results. This process ensures the query integrates seamlessly with the graph, extracting relevant information for the LLM to process effectively. While putting the prompt together was straightforward, restructuring triples for compatibility remains a key challenge.
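
A minimal sketch of that prompt assembly might look like the following. The template wording, the function names, and the exact Cypher-style pattern are assumptions made for illustration; the call did not show TrustGraph's actual prompt.

```python
import json
import re

# Hypothetical template; the real wording used by TrustGraph may differ.
PROMPT_TEMPLATE = (
    "Use only the knowledge statements below to answer the question.\n\n"
    "Knowledge statements:\n{statements}\n\n"
    "Question: {question}\n"
)


def _node(name):
    """Render an entity as a Cypher-style node pattern with a quoted name."""
    return "({name: " + json.dumps(name) + "})"


def to_cypher_pattern(subject, predicate, obj):
    """Render one (subject, predicate, object) triple in Cypher-style notation."""
    relation = re.sub(r"\W+", "_", predicate.strip()).upper()
    return f"{_node(subject)}-[:{relation}]->{_node(obj)}"


def build_prompt(triples, question):
    """Fill the template with one knowledge statement per line."""
    statements = "\n".join(to_cypher_pattern(*triple) for triple in triples)
    return PROMPT_TEMPLATE.format(statements=statements, question=question)
```

For a placeholder triple such as ("Act A", "replaces", "Act B"), to_cypher_pattern produces ({name: "Act A"})-[:REPLACES]->({name: "Act B"}), which keeps the direction of the relationship explicit for the LLM.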

What are the biggest challenges in the extraction process?

Answer: The biggest challenge is balancing generalization and customization. TrustGraph is designed to be use-case agnostic, so users can adapt it to specific needs. However, small changes in prompts can significantly impact performance, and poorly chosen terms like "summary" or "fact" can lead to biased or incomplete extractions. Another issue is dealing with noise: TrustGraph extracts as much information as possible, which can include duplicates or irrelevant data. This approach ensures nothing important is missed, but it requires careful graph querying to identify what's truly useful. The system errs on the side of over-extraction to maximize flexibility, leaving the refinement to the querying process.
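
One simple form of query-time refinement is collapsing duplicate triples that differ only in letter case or spacing. The helper below is a hypothetical sketch of that idea, not something TrustGraph is known to ship.

```python
def dedupe_triples(triples):
    """Drop triples that match an earlier one after normalizing case and spaces."""
    seen = set()
    unique = []
    for subject, predicate, obj in triples:
        key = tuple(
            " ".join(part.lower().split()) for part in (subject, predicate, obj)
        )
        if key not in seen:
            seen.add(key)
            unique.append((subject, predicate, obj))  # keep first spelling seen
    return unique
```

This only catches near-identical strings. Recognizing that two differently worded statements express the same fact would need something stronger, such as entity resolution or embedding similarity.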

Conclusion

TrustGraph and Memgraph are transforming how regulated industries approach unstructured data. By combining the flexibility of knowledge graphs with the power of AI agents, they’re breaking down silos and paving the way for smarter, faster innovation.

If you’re navigating the complexities of legal texts, compliance workflows, or any data-heavy domain, this duo might just be the game-changer you need.

Memgraph Academy

If you are new to the GraphRAG scene, check out a few short and easy-to-follow lessons from our subject matter experts. For free. Start with: