From Contribution to Impact: LangChain Docs and Memgraph's LLM Story

By Oleksandr Ichenskyi

11 min read | September 26, 2023

In the ever-evolving software development landscape, open-source contributions have emerged as a powerful catalyst for innovation and positive change. This journey into the world of LangChain and Memgraph, two formidable players in artificial intelligence and data management, is a testament to the transformative impact that open-source collaboration can achieve.

Why LangChain?

LangChain is not just another Python library; it has changed how software built on LLMs is developed. By simplifying communication with LLMs, it opens new possibilities for natural language processing and understanding. Thanks to its open-source approach and an active community of developers who improve the technology daily, the project is one of the successful examples of community-driven development today.

The ethos of open-source contribution extends far beyond code and libraries. It is about embracing a collaborative spirit that transcends borders and empowers individuals to shape the future of technology. These ideas seem distant to many people, as they did to me before this story began, but open source is far more friendly and accessible than it looks, both for professionals looking for an interesting project and for beginners looking for invaluable new skills. Any interested developer, or simply a person interested in technology, can contribute and gain not only hard skills but also soft skills, by communicating with people and working together on real-world problems.

The power of open-source contribution

What does it truly mean to make an impact in the world of open-source software development? Beyond the lines of code and the technical intricacies lies a profound force that drives progress and innovation. Lately, too much emphasis has been placed on writing code - code, code, and more code. While everyone is trying, and rightly, to improve their technical skills and gain new knowledge, many people forget that technology is created for people and not vice versa. It is very difficult not to lose yourself in all this technical abundance and remember that communication with other people, both in your personal and professional life, is still the key to growth and development, which cannot be replaced by any library or framework.

Open-source contribution is not merely about writing code or fixing bugs; it's about shaping the digital landscape we all inhabit. Every line of code you contribute, every bug you help squash, and every idea you share contributes to a collective effort that can change the world. It's about democratizing technology, making it accessible to a broader audience, and fostering an inclusive community where diverse perspectives converge.

Democratizing technology: Memgraph and LangChain

Contributing to open-source projects like LangChain and Memgraph involves more than just understanding the codebase; it's about actively participating in the development process. One of the most common ways to contribute is by making a pull request (PR). A PR is essentially a proposal for changes you'd like to introduce to the project. These changes can add a new feature, fix bugs, or improve the documentation, which should not be underestimated in the software development world. Before opening your first PR, it is very important to familiarize yourself with the contribution guidelines, which are found in almost every open-source project. For example, LangChain's guidelines have several important points about formatting and spell-checking before proposing changes. Such rules save developers time and keep existing conventions and coding styles intact. If you're already familiar with Git, creating a PR will seem very easy, but if you're not, the following article makes a very good start for your Git work.

To create a PR, you will need to fork the repository and commit your changes there. It is good practice to create a new branch for your changes, although for a small contribution you can usually skip this step. Ensure that your changes are well-documented, thoroughly tested, and aligned with the project's coding standards and contribution guidelines. Once ready, submit your PR on GitHub using the PR section on the project page, select the branch you want to merge into (most likely the project's main branch), and engage in discussions with maintainers and other contributors to refine your work. And don't forget that communication takes time, especially when working with people from all over the world. Making a successful pull request is not just about code; it's about communication, collaboration, and a commitment to improving the project.
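The steps above can be sketched as the Git commands a contributor would typically run after forking a repository on GitHub. This is only a minimal illustration, not LangChain's prescribed workflow: the fork URL, branch name, and commit message are hypothetical placeholders, and the helper function `pr_workflow_commands` is invented for this example. It only builds and prints the commands, so nothing is executed against a real repository.

```python
import shlex


def pr_workflow_commands(fork_url, branch, commit_message):
    """Return, in order, the git commands for a fork-based pull request.

    All arguments are placeholders supplied by the caller; nothing is run.
    """
    return [
        # 1. Clone your own fork (created beforehand on GitHub).
        ["git", "clone", fork_url],
        # 2. Inside the cloned directory, create a branch for your changes.
        ["git", "checkout", "-b", branch],
        # 3. Stage and commit your edits.
        ["git", "add", "--all"],
        ["git", "commit", "-m", commit_message],
        # 4. Push the branch to your fork; the PR is then opened on GitHub.
        ["git", "push", "--set-upstream", "origin", branch],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    commands = pr_workflow_commands(
        "https://github.com/your-user/langchain.git",
        "docs-memgraph-example",
        "Add Memgraph documentation notebook",
    )
    for command in commands:
        # shlex.join quotes arguments so each line can be pasted into a shell.
        print(shlex.join(command))
```

After the push, GitHub offers to open a pull request from the new branch of your fork against the original project, which is where the discussion with maintainers begins.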

Good documentation: a myth or a lifesaver?

Imagine a scenario where you're eager to integrate LangChain and Memgraph into your project. You've found the perfect library, but you're stuck, baffled by cryptic error messages, and struggling to understand how to make it work. This is where good documentation steps in as your guiding light. Well-documented software saves time by providing clear instructions and examples, saves money by reducing the need for extensive support and troubleshooting, and saves effort by enabling developers to get up to speed quickly. It's an investment that pays dividends in the long run.

Creating effective documentation is an art. It's not just about listing functions and parameters; it's about crafting a narrative that guides users through every step of their journey. Whether you're a developer or a user, understanding how to leverage well-crafted documentation is key to success in the open-source world.

With these thoughts in mind, I began to study how LangChain works from the inside and how its contribution guidelines and documentation are structured. I was pleasantly surprised to see that the part of the documentation I was interested in was generated from .ipynb files, because this meant that I could write a Jupyter notebook, a format I was already familiar with, and get ready-made, convenient documentation!

To be honest, I rarely liked the process of writing documentation: a dry description of functionality, possible problems, and caveats... Of course, there are quite convenient tools for automatically generating documentation, such as Swagger (which, by the way, is used in our BOR), but sometimes they are not enough to describe some rather important aspects of the application that should be described somewhere. Considering that I forget how my code works after a couple of days, I shouldn't expect anyone else to understand it. And here, a creative approach comes in very handy, because writing stories is much more interesting than writing documentation. Guided by this logic, I wrote a small draft about how to run and initialize Memgraph, with some example natural language queries. At first, I thought it was interesting and quite detailed. But after some of my friends and colleagues looked at my documentation at my request, I was surprised by how much I had missed. After all, I thought I had described everything! As it turned out, I had not.

In writing documentation, as in writing code, communication is important. Put yourself in the position of a person who has never touched your technology and who has no idea what it does or how it works. Would everything be clear to them after reading your documentation? Answer this question for yourself, and feel free to ask your friends to criticize your work. This is the only way to understand what you missed and what a potential user may be missing. Such thoughts led me to the successful merge of my PR, which now lives on as a documentation page here and, I hope, helps and will help other developers like you.

Benefits of using LangChain with Memgraph

In the fast-paced world of software development, speed matters. Memgraph, as an in-memory, high-performance graph database, is built for speed and efficiency. Paired with LangChain, it not only accelerates your development process but also helps your applications deliver fast results to end users. We had the opportunity to try Memgraph on our projects, such as ODIN and RUNE, and we were not disappointed. Even I, as a developer more involved in the frontend, was able to contribute and very easily play with graph databases and their visualization using Memgraph Lab.

Knowledge graphs are the future of data representation and analysis. They provide a holistic view of interconnected data, allowing for complex insights and powerful data-driven applications. LangChain and Memgraph make it easier than ever to create and utilize knowledge graphs. This, coupled with the recent addition of the Memgraph interface to LangChain, was our main motivation to become pioneers in using LLMs with Memgraph. The results were impressive, both in terms of performance and response accuracy, as detailed in this community call.

Knowledge graphs have a wide range of applications across various domains, from healthcare to finance and beyond. By integrating LangChain with Memgraph, you unlock the potential to create intelligent applications that leverage knowledge graphs for smarter decision-making. With some caution, this can already be used today in applications of any scale that process data using LLMs, thanks to the simple scalability of workloads using Docker (both Memgraph itself and the LLMs used). Why with some caution? It's a long story of GPT-3.5 fighting against the Cypher query language, a fight that ended in favor of GPT-4, which not many people have access to. Still, we hope that this will change soon and these technologies will be available to everyone to fully unleash the power of their imagination.

Top 3 reasons to use Memgraph in your next LLM project

If you had told me a year ago that I would be explaining why graph databases are cool, I wouldn't have believed it. But now, with two interesting projects synthesizing LangChain and Memgraph under my belt, I'd like to tell you why you should give it a try. And if I am not convincing enough, feel free to check out this amazing article about the graph tech impact on the data management industry.

Sustainable scalability

Memgraph's architecture is designed to handle large-scale data as efficiently as it handles small amounts of data. As your project grows, be it an LLM project or anything else, you can seamlessly scale your database infrastructure without accumulating technical debt. This scalability ensures that your application remains responsive and adaptable to the evolving needs of your users (or your project manager).

Optimized query performance

Memgraph's query execution engine is optimized for complex graph queries, making it an ideal partner for LangChain's natural language processing capabilities. In college, I didn't like writing complex SQL queries; they seemed too contrived and boring to me. But when I discovered Cypher, especially in conjunction with Memgraph, it felt like a breath of fresh air, and the well-made visualization fueled my motivation to work with graph databases, no matter how complex my queries were. Your LLMs can now retrieve insights from knowledge graphs quickly, enhancing user experiences and reducing latency.

Community-driven development

Memgraph and LangChain are both nurtured by vibrant and collaborative open-source communities. By integrating Memgraph into your LangChain projects, you become part of a dynamic ecosystem of developers, data scientists, and AI enthusiasts. This community support means you're never alone in overcoming challenges and can benefit from shared knowledge and experience.

To be honest, writing this documentation was my first contribution to open source. The new experience scared me, and I was stunned by the number of people, details, and little things involved. But seeing how well the contribution guidelines were written and how responsive the members of this community were inspired me not only to make this documentation even better but also to try my hand at working with other open-source communities, which I recommend that you do too.

Concluding thoughts

For a long time, when starting my career in software development, the world of open source seemed to me like distant, unattainable magic, where some miracles happen that give us, ordinary developers, access to cool and effective solutions.

This whole long adventure was my first contribution to an open-source project. To be honest, although I am very glad that LangChain became that project, I am sure that other open-source communities will welcome you with open arms too. Yes, it's still just creating code, writing documentation, and working on a product, but the collaborative atmosphere and constant support from both maintainers and other developers give you a second wind for working on the projects that interest you personally. In addition, this is a great opportunity to exchange experience with more experienced developers, and simply with people who have their own fresh views on both the world and development and who inspire developers like you and me every day.

Thank you for going through this whole journey with me. I hope it has inspired you to go and do something wonderful for such a sincere and welcoming open-source community!