Writing Mutations and Complex Cypher Queries in Memgraph
Memgraph is an open-source, in-memory graph database that supports Cypher, a query language for exploring and manipulating graph data. In this blog post, we'll explore how to write complex Cypher queries and mutations in Memgraph to effectively work with your graph data.
Understanding the Cypher Language
Before we dive into writing complex Cypher queries and mutations in Memgraph, let's start with the basics of the Cypher query language. We'll use the MovieLens dataset from Memgraph Lab. It consists of three types of nodes: Movie, User, and Genre. User nodes have an id property, Movie nodes have a title property, and Genre nodes have a name property. A user can rate a movie, and the rating is modeled with a :RATED relationship whose rating property holds the score. A movie can be connected to one or more genres with an :OF_GENRE relationship.
To try these queries yourself, download and start Memgraph Platform. Once it's running, open Memgraph Lab in the browser at localhost:3000. Go to Datasets in the sidebar and choose the dataset MovieLens: Movies, genres and users.
MATCH (:User)-[r:RATED]->(:Movie {title:"'Round Midnight (1986)"})
RETURN avg(r.rating);
This query calculates the average rating of the movie titled "'Round Midnight (1986)". It matches every :User node that has a :RATED relationship to the :Movie node with that title and binds each matched relationship to the variable r. The avg function then averages the rating property across all matched relationships.
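As a small variation on the query above, the same match can return several aggregates in one row. This is only a sketch; the aliases ratings, average, lowest, and highest are illustrative names chosen here, not part of the dataset.

```cypher
// Summarize all ratings of one movie in a single row:
// number of ratings, mean rating, lowest rating, and highest rating.
MATCH (:User)-[r:RATED]->(:Movie {title: "'Round Midnight (1986)"})
RETURN count(r) AS ratings,
       avg(r.rating) AS average,
       min(r.rating) AS lowest,
       max(r.rating) AS highest;
```

Seeing the count next to the average helps judge how much weight the average deserves: a mean built from two ratings says less than one built from two hundred.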
Now that we have a basic understanding of how Cypher works, let's explore how to write more complex queries and mutations.
Mutations
Cypher query mutations allow you to add, update, or delete nodes and relationships in your graph.
Creating Nodes and Relationships
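To create a new node and relationship in the graph, you can use the CREATE clause. The MovieLens dataset consists of 9742 movies across 20 genres. Suppose you want to add Oppenheimer as the 9743rd movie and place it in the thriller genre. CREATE always creates every node in its pattern, so creating the whole pattern at once would add a second Thriller genre node next to the existing one. To connect the movie to the existing genre, match the genre first and then create the movie and the relationship:
MATCH (genre:Genre {name: "Thriller"})
CREATE (movie:Movie {title: "Oppenheimer (2023)", id: "9743"})-[:OF_GENRE]->(genre);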
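After a mutation like this, it can be worth checking the result with two read-only queries, run one at a time. The sketch below assumes the dataset started with exactly one Thriller genre node; the alias thriller_nodes is an illustrative name.

```cypher
// Check 1: confirm the new movie exists and see which genre it is linked to.
MATCH (movie:Movie {title: "Oppenheimer (2023)"})-[:OF_GENRE]->(genre:Genre)
RETURN movie.title, movie.id, genre.name;

// Check 2: count Thriller genre nodes.
// A result greater than 1 means the mutation created a duplicate genre node.
MATCH (genre:Genre {name: "Thriller"})
RETURN count(genre) AS thriller_nodes;
```

If the second check returns more than 1, the new movie is attached to a separate genre node and will not show up in queries that start from the original Thriller node.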
Updating Data
Use the SET clause to update the properties of existing nodes and relationships. For example, to change the title of the movie "Father of the Bride Part II (1995)", first match the node and then assign the new value:
MATCH (movie:Movie {title: "Father of the Bride Part II (1995)"})
SET movie.title = "Father of the Bride Part 2 (1995)";
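SET works the same way on relationships. The following sketch updates a rating stored on a :RATED relationship. It assumes that the user with ID 2 has rated this movie, which may not be true in the dataset, and the new value 4.5 is an arbitrary example.

```cypher
// Update the rating property on a :RATED relationship.
// Assumption: user 2 has rated this movie. If not, MATCH finds nothing
// and the query changes nothing and returns no rows.
MATCH (:User {id: 2})-[r:RATED]->(:Movie {title: "'Round Midnight (1986)"})
SET r.rating = 4.5
RETURN r.rating;
```

Returning the property after SET is a quick way to confirm that the pattern matched and the update was applied.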
Deleting Nodes and Relationships
If you want to delete nodes or relationships, you can use the DELETE clause. To remove the relationship between a genre and a movie, you can do this:
MATCH (movie:Movie {title: "Oppenheimer (2023)"})-[r:OF_GENRE]->(genre:Genre {name: "Thriller"})
DELETE r;
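Once the movie has no remaining relationships, you can delete the node itself. A plain DELETE fails if the node still has relationships attached: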
MATCH (movie:Movie {title: "Oppenheimer (2023)"})
DELETE movie;
Or you can delete a node together with all of its relationships in one step by adding the DETACH keyword:
MATCH (movie:Movie {title: "Oppenheimer (2023)"})
DETACH DELETE movie;
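To decide between DELETE and DETACH DELETE, it can help to check how many relationships are still attached to a node before removing it. The following read-only sketch is meant to be run before the delete; the alias attached_relationships is an illustrative name.

```cypher
// Count the relationships still attached to the movie, in either direction.
// OPTIONAL MATCH keeps the movie row even when it has no relationships,
// in which case count(r) is 0.
MATCH (movie:Movie {title: "Oppenheimer (2023)"})
OPTIONAL MATCH (movie)-[r]-()
RETURN movie.title, count(r) AS attached_relationships;
```

A count of 0 means a plain DELETE is enough. Any higher count means you either delete those relationships first or use DETACH DELETE.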
Complex Queries
Complex querying means combining techniques such as pattern matching, conditional expressions, and filtering to extract exactly the data you need from a graph database. It helps analysts find precise and subtle patterns in the data, and it makes analysis more efficient because the database returns only the relevant results instead of transferring unnecessary data. Complex queries are also flexible, so analysts can adapt them to changing needs and uncover indirect relationships that yield valuable insights.
Filtering
In Memgraph, filtering is primarily done with the WHERE clause, which adds constraints to the described patterns or filters the results. You can filter nodes based on their properties, labels, or relationships. For instance, you can count the number of movies that the User with the ID 2 rated above 3:
MATCH (user:User {id:2})-[rating:RATED]->(movie:Movie)
WHERE rating.rating > 3
RETURN COUNT(movie) AS movieCount;
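Conditions in WHERE can be combined with AND and OR, and string properties can be filtered with operators such as CONTAINS. The sketch below assumes that every movie title ends with its release year in parentheses, as the titles shown in this post do.

```cypher
// Count the movies from 1995 (judged by the title suffix)
// that user 2 rated 4 or higher.
MATCH (user:User {id: 2})-[rating:RATED]->(movie:Movie)
WHERE rating.rating >= 4 AND movie.title CONTAINS "(1995)"
RETURN count(movie) AS movieCount;
```

Both conditions must hold for a row to pass the filter, so the result can only be smaller than or equal to the count returned by either condition alone.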
Pattern Matching
You can build complex patterns to describe the relationships between nodes. For example, if you want to find Comedy movies rated above 3 by a User with the ID 2, you can write a query like this:
MATCH (user:User {id:2})-[rating:RATED]->(movie:Movie)-[:OF_GENRE]->(genre:Genre {name: "Comedy"})
WHERE rating.rating > 3
RETURN movie.title;
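The same kind of pattern can feed an aggregation. As a sketch, the following query leaves the genre unconstrained and counts how many well-rated movies the user has in each genre. It follows the :OF_GENRE relationship from Movie to Genre, the direction used when the relationship was created earlier in this post. The aliases genre and liked_movies are illustrative names.

```cypher
// For user 2, count the movies rated above 3 in each genre
// and list the user's favorite genres first.
MATCH (user:User {id: 2})-[rating:RATED]->(movie:Movie)-[:OF_GENRE]->(genre:Genre)
WHERE rating.rating > 3
RETURN genre.name AS genre, count(movie) AS liked_movies
ORDER BY liked_movies DESC;
```

A movie that belongs to several genres is counted once in each of them, so the counts can add up to more than the number of distinct movies.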
Using Multiple Clauses
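In Cypher, you can combine various operations and conditions in a single query to fine-tune search results and gather more precise information from a graph database. Below is an example that uses multiple clauses to find similar users based on movie ratings. The query returns the IDs of the 10 users whose ratings are closest to those of the user with ID 50, considering only users who have rated more than 2 of the same movies. The value named similarity is the average absolute difference between the two users' ratings of the movies they share, so a smaller value means more similar taste, and the ascending ORDER BY puts the most similar users first.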
MATCH (u:User {id:50})-[r:RATED]-(m:Movie)-[other_r:RATED]-(other:User)
WITH other.id AS other_id, avg(abs(r.rating-other_r.rating)) AS similarity, count(*) AS same_movies_rated
WHERE same_movies_rated > 2
RETURN other_id
ORDER BY similarity
LIMIT 10;
A natural next step is to build a recommendation system that finds similar users and recommends movies to people with similar preferences. A full lesson on this can be found in Memgraph Playground.
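As a rough illustration of that idea, the similar-users query can be extended into a simple recommendation query. This is a minimal sketch under stated assumptions, not the method taught in the Playground lesson: it ranks candidates by the plain average rating given by the 10 most similar users, and the names seen_movies, avg_difference, avg_rating, and votes are illustrative.

```cypher
// Step 1: remember which movies user 50 has already rated.
MATCH (u:User {id: 50})-[:RATED]->(seen:Movie)
WITH u, collect(seen) AS seen_movies
// Step 2: find the 10 users whose ratings differ least from user 50's,
// keeping only users who share more than 2 rated movies.
MATCH (u)-[r:RATED]->(m:Movie)<-[other_r:RATED]-(other:User)
WITH seen_movies, other,
     avg(abs(r.rating - other_r.rating)) AS avg_difference,
     count(*) AS same_movies_rated
WHERE same_movies_rated > 2
WITH seen_movies, other, avg_difference
ORDER BY avg_difference
LIMIT 10
// Step 3: rank the movies those users rated that user 50 has not rated yet.
MATCH (other)-[rec_r:RATED]->(rec:Movie)
WHERE NOT (rec IN seen_movies)
RETURN rec.title AS title, avg(rec_r.rating) AS avg_rating, count(*) AS votes
ORDER BY avg_rating DESC, votes DESC
LIMIT 10;
```

Returning votes alongside avg_rating makes it easy to spot recommendations that rest on a single rating. A stricter version could require a minimum number of votes before a movie is recommended.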
Takeaway
Mastering complex Cypher queries in Memgraph lets you precisely find, change, and optimize data for various tasks like extracting insights or managing data. Explore Cypher's capabilities in Memgraph to unlock your graph data's potential for applications like knowledge graphs, recommendation systems, fraud detection, and network analysis.