RUNE — Our Journey to Creating a GitHub LLM Analytics Tool: Intern’s Perspective
GitHub has become more than just a platform for version control and code collaboration; it's the epicenter of innovation in modern technology. In an era where code is more than just a set of instructions, the ability to understand, navigate, and explore vast codebases has become a crucial skill for developers and project management worldwide. This is where RUNE (Repository Understanding, Navigation, and Exploration) steps in, serving as a bridge between the expansive world of code hosted on GitHub and advanced generative LLMs.
In my development workflow, I use GitHub daily. It is a convenient and accessible tool not only for storing code but also for sharing it with other developers, colleagues, and friends (sometimes GitHub Pages are a lifesaver). This is especially important given the diverse community of developers from all over the world that I get to work with. In addition, I have always been highly impressed by open-source projects that rely solely on the goodwill of developers and contributors, which you can easily support too with pull requests and many other features!
Like many developers, I don’t like doing two things: writing documentation and fixing bugs (which I myself sometimes call features). These thoughts led my colleague and me to the idea for a project that automates routine work with boilerplate code (and not only that!). Using the generative abilities of LLMs, our goal was to ease developers' pain with the problems described above and, more generally, to improve the quality of the developer experience.
In the realm of software development, developer experience (often called DX) is the linchpin of productivity. The smoother and more intuitive the development environment, the more empowered programmers are to focus on the heart of their work: crafting impeccable code for solving complex real-world problems and making the world a better place. RUNE is not just another tool; it's a catalyst for enhancing developer experience, redefining the way programmers interact with their code repositories using the magic of generative language models.
Enhancing your workflow with LLM assistance
In a world where data is abundant, harnessing the power of Large Language Model assistance can be a game-changer. RUNE takes this concept and weaves it into the fabric of the developer's daily workflow, offering a unique blend of code enhancement and natural language interactions.
At its core, RUNE is designed to simplify the complexities of working with code. It empowers you, a developer, to seamlessly navigate your codebases, understand structures of any complexity, and improve the quality of your work. With the assistance of LLMs and the power of graphs, you can now communicate with your code as if you were having a conversation with a colleague. No more sifting through endless lines of code or grappling with cryptic error messages — RUNE streamlines the development process and fosters a more intuitive connection between developers, their code, and the project’s structure.
RUNE, RUNE, and RUNE again... What is this? How did I end up in Norse mythology? There are two answers to this question: MAGE and ODIN. I have always liked abbreviations (especially recursive ones) in project names, and the strongest argument in their favor was MAGE (Memgraph Advanced Graph Extensions). Thus ODIN, which you can read more about here, and RUNE (Repository Understanding, Navigation, and Exploration) were born almost simultaneously.
The history of developer workflows has been punctuated by shifts in technology and methodology, from the early days of command-line interfaces to the rise of integrated development environments (IDEs). RUNE represents the next evolution in this journey, where the boundaries between human and machine interaction blur, and code comprehension becomes more accessible than ever before.
RUNE under the hood
To truly appreciate the significance of RUNE, we must venture under the hood and examine the technical foundations that enable its remarkable capabilities. RUNE is not just a web application; it's a fusion of cutting-edge technologies carefully orchestrated to deliver a seamless developer experience.
At the heart of RUNE is a robust tech stack that powers its web interface and AI interactions. React serves as the frontend framework, offering a dynamic, responsive, and reusable user interface, along with TypeScript, which brings type safety and improved development workflows to the project, ensuring code quality and reliability.
Behind the scenes, LangChain, an open-source Python framework designed for communicating with Large Language Models (LLMs), forms the backbone of RUNE's natural language interactions. These are implemented in BOR, the backend for ODIN and RUNE, which has its own fascinating and very detailed article written by my colleague, Patrik. This technology enables developers to converse with their code, ask questions, and request improvements, all in plain language.
The Memgraph database, renowned for its in-memory, high-performance capabilities, plays a pivotal role in visualizing the GitHub repository structure. It transforms complex code hierarchies into interactive, intuitive graphs, allowing developers to explore their codebases effortlessly. BOR, the backend for ODIN and RUNE, is the invisible link between all parts of the application: it uses LangChain in conjunction with Memgraph to provide the frontend with all the data it needs to function properly, enabling lightweight but reliable client-server communication.
Although this description sounds somewhat dry, the project was by its nature quite complex, with a fairly long list of requirements (which we came up with during one of the many brainstorming sessions), so we chose the technologies carefully, and our communication and development process was refined over time. RUNE is not only about a beautiful and convenient user interface (which, of course, is present); first of all, it is about a complex application architecture in which all the above components are harmoniously combined. If even one part of the project breaks (like the forced use of GPT-3.5 instead of GPT-4 in our case), all the other parts quickly lose their meaning, and users lose interest.
Although the application follows a standard client-server architecture, it also required additional functionality for visualization and for requests to third-party APIs. So the frontend uses quite a few modern development approaches, including React Hooks and optimization techniques such as memoization, along with modern graph visualization libraries such as the incredibly powerful Cytoscape, which includes TypeScript support. On the other side is Patrik’s slice of the cake, which combines Memgraph (which I, as a frontend developer, found incredibly easy and pleasant to work with; I sincerely recommend playing with Memgraph Lab) and LangChain, using the recently added interface for communication with LLMs.
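To make the memoization idea mentioned above a bit more concrete, here is a minimal, framework-free sketch in TypeScript. It is not taken from the RUNE codebase: the `memoize` helper and the `pathDepth` function are hypothetical names used only for illustration. Inside React components, the `useMemo` hook applies the same principle by caching a computed value between renders.

```typescript
// Generic single-argument memoizer: results are cached by argument value.
function memoize<A, R>(fn: (arg: A) => R): (arg: A) => R {
  const cache = new Map<A, R>();
  return (arg: A): R => {
    if (cache.has(arg)) {
      // Cache hit: skip the computation entirely.
      return cache.get(arg) as R;
    }
    const result = fn(arg);
    cache.set(arg, result);
    return result;
  };
}

// Hypothetical stand-in for an expensive step, e.g. preparing data for a graph view.
let calls = 0;
function pathDepth(path: string): number {
  calls += 1; // count real computations so the caching effect is visible
  return path.split("/").filter((segment) => segment.length > 0).length;
}

const memoDepth = memoize(pathDepth);
memoDepth("src/components/Graph.tsx"); // computed once: 3 segments
memoDepth("src/components/Graph.tsx"); // served from the cache, pathDepth is not called again
```

The trade-off is the usual one: memory is spent on the cache in exchange for skipping repeated work, which pays off when the same inputs recur often, as they tend to do when a graph view re-renders.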
For me, as an average web developer, this project opened my eyes to many aspects of structuring and organizing code, as well as to the benefits of technologies such as TypeScript (2 + "2" = "22", hello) and Styled Components, which made my style sheets look much better. I'm sure this project could be better in many ways and use more advanced design patterns, so if you want to make this story even more epic, GitHub issues are just waiting for you to discover them!
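For anyone who has not been bitten by the `2 + "2"` surprise yet, here is a small sketch of what TypeScript changes. The `add` and `toNumber` helpers are hypothetical and exist only for this example; they are not part of RUNE.

```typescript
// In plain JavaScript, 2 + "2" silently evaluates to the string "22".
// With type annotations, the same mistake is rejected at compile time.
function add(a: number, b: number): number {
  return a + b;
}

// add(2, "2"); // compile-time error: a string is not assignable to a number parameter

// Values arriving from outside (form fields, JSON) are typed as unknown,
// so they have to be narrowed explicitly before being used as numbers.
function toNumber(value: unknown): number {
  const n = typeof value === "number" ? value : Number(value);
  if (Number.isNaN(n)) {
    throw new TypeError(`Not a number: ${String(value)}`);
  }
  return n;
}

const rawInput: unknown = "2";
const sum = add(2, toNumber(rawInput)); // 4, not "22"
```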
Not your boring chat-with-data
RUNE isn't just another tool in the developer's arsenal; it represents a paradigm shift in how we perceive GitHub repositories and engage with code. By offering a fresh perspective on code structure and facilitating direct conversations about code, RUNE transcends the boundaries of traditional development tools and takes them to a whole new level.
One of RUNE's standout features is its ability to provide a fresh look at GitHub repository structures. Instead of deciphering complex directory trees and file hierarchies, you are presented with an interactive graph visualization. Nodes represent files and folders, offering an intuitive and visually appealing way to navigate codebases. This shift from the mundane to the visual not only streamlines code exploration but also makes it more accessible, particularly for newcomers to a project.
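As a rough illustration of how files and folders can become graph nodes, here is a minimal sketch that turns a flat list of file paths into nodes and edges. It is an assumption-laden example, not RUNE's actual data model (in RUNE the repository graph lives in Memgraph): the `buildRepoGraph` function and the `kind` field are hypothetical, and the `{ data: { id, source, target } }` shape simply mirrors the element JSON that Cytoscape accepts.

```typescript
type NodeKind = "folder" | "file";

interface GraphNode {
  data: { id: string; label: string; kind: NodeKind };
}

interface GraphEdge {
  data: { id: string; source: string; target: string };
}

// Assumes every input path points to a file, so the last segment is always a file
// and every earlier segment is a folder.
function buildRepoGraph(paths: string[]): { nodes: GraphNode[]; edges: GraphEdge[] } {
  const nodes = new Map<string, GraphNode>();
  const edges = new Map<string, GraphEdge>();

  for (const path of paths) {
    const parts = path.split("/").filter((part) => part.length > 0);
    let parentId: string | null = null;

    for (let i = 0; i < parts.length; i++) {
      // The full path prefix is the node id, so folders shared by several files are created once.
      const id = parts.slice(0, i + 1).join("/");
      const kind: NodeKind = i === parts.length - 1 ? "file" : "folder";
      if (!nodes.has(id)) {
        nodes.set(id, { data: { id, label: parts[i], kind } });
      }
      if (parentId !== null) {
        // Connect each folder to its direct child.
        const edgeId = `${parentId}->${id}`;
        if (!edges.has(edgeId)) {
          edges.set(edgeId, { data: { id: edgeId, source: parentId, target: id } });
        }
      }
      parentId = id;
    }
  }

  return { nodes: Array.from(nodes.values()), edges: Array.from(edges.values()) };
}

const repoGraph = buildRepoGraph([
  "src/App.tsx",
  "src/components/Graph.tsx",
  "README.md",
]);
// repoGraph.nodes: src, src/App.tsx, src/components, src/components/Graph.tsx, README.md
// repoGraph.edges: src->src/App.tsx, src->src/components, src/components->src/components/Graph.tsx
```

Using the path prefix as the node id keeps the sketch deterministic and makes deduplication trivial; a real implementation would more likely take ids from the database.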
Perhaps the most groundbreaking aspect of RUNE is its capability to facilitate direct interactions with code. Developers can now have dynamic conversations with their codebase, seeking explanations, improvements, and insights into code chunks or whole files from the repository in natural language. This real-time, bidirectional communication fosters a deeper understanding of code, promotes collaboration among developers, and accelerates the troubleshooting process. It's a novel approach that transforms code from a static artifact into a living, breathing entity that responds to human inquiries.
Although there is a prompt bar here, familiar from the now-classic chat-with-your-data applications, the main emphasis in our application is still on interactive visualization and real-time code analysis. At the click of a button, the LLM will conduct a full analysis of the file or selected piece of code and give valuable advice on improving it or fixing existing bugs (surprisingly, it doesn’t even try to turn them into features!). Taking just a question from you as input, the LLM gives you an answer that is often indistinguishable from a human one. Behind all this is a complex structure (much more complex than that of this project), which includes a huge number of interrelated steps. If you want to understand how everything works in RUNE, in addition to the already cited article about BOR, I also recommend reading the following article from LangChain, which is the guarantor of all the magic that happens here. It is also worth mentioning the community call from Memgraph, thanks to which it became possible to elegantly combine these two intriguing technologies.
When developing RUNE, we first thought about how it would make sense for developers to use this tool. Based on our previous experience and workflows, we asked ourselves: what are we missing? How do we use an LLM so that it does not do our work for us, but helps us avoid repetitive, routine actions? I'm still not sure whether we were able to answer this question and build something truly useful, but one thing I do know is that LLMs will sooner or later become every developer's second right hand, and RUNE can be your first step in this direction, boosting your learning and productivity.
Adapting to the use of LLMs, not only in your developer workflow but even in your daily life, is essential in today's fast-paced, technology-driven world. LLMs not only streamline code comprehension and troubleshooting but also empower developers to harness the full potential of their codebase. As software development continues to evolve, the integration of LLMs becomes more than a choice; it's a strategic advantage that can significantly enhance productivity, collaboration, and code quality, positioning developers at the forefront of innovation in the digital age.
By using RUNE, you'll gain access to a powerful set of features that can turbocharge your development workflow. With a fresh-looking visualization tool, you can chat with your code, seek improvements, and receive explanations, all in real time, powered by a blazingly fast Memgraph database. RUNE is not just a productivity booster; it's a game-changer that can make you rethink your typical daily code-writing workflow. So, dive in, explore your repositories with fresh eyes, and let RUNE be your trusted companion in the world of coding!