Text search
Text search is an experimental feature introduced in Memgraph 2.15.1. To use it, start Memgraph with the --experimental-enabled=text-search flag.
Text search allows you to look up nodes with properties that contain specific content. For a node to be searchable, you first need to create a text index that applies to it.
Text indices and search are powered by the Tantivy full-text search engine.
Create text indices
Text indices are created with the CREATE TEXT INDEX command. You need to give a name to the new index and specify which labels it should apply to.
This statement creates a text index named complianceDocuments for nodes with the Report label:
CREATE TEXT INDEX complianceDocuments ON :Report;
If you attempt to create an index with an existing name, the statement will fail.
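To have something to search, the graph needs nodes that carry the indexed label. The following is a minimal sketch: the title and version properties and their values are hypothetical sample data chosen for illustration, not something Memgraph requires.

```cypher
// Hypothetical sample data: two nodes with the Report label,
// which is the label the complianceDocuments index applies to.
CREATE (:Report {title: "Rules2024", version: 1});
CREATE (:Report {title: "Rules2023", version: 2});
```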
What is indexed
For any given node, if a text index applies to it, all its properties with text-indexable types (String, Integer, Float, or Boolean) are stored.
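As an illustration of the rule above, consider a node with properties of several types. The property names and values below are hypothetical. Going by the list of text-indexable types, the first four properties would be stored in the index, while the list-valued reviewers property is not among the listed types and so presumably would not be.

```cypher
// Hypothetical node. By the rule above:
//   title (String), pages (Integer), score (Float), approved (Boolean)
//   have text-indexable types.
//   reviewers (List) is not one of the listed types.
CREATE (:Report {
  title: "Rules2024",
  pages: 120,
  score: 4.5,
  approved: true,
  reviewers: ["Ana", "Ivo"]
});
```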
Show text indices
To list all text indices in Memgraph, use the SHOW INDEX INFO statement.
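The statement takes no arguments and can be run as is, for example after creating the complianceDocuments index. The exact shape of the output is not reproduced here.

```cypher
// Lists the indices known to Memgraph, text indices included.
SHOW INDEX INFO;
```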
Query text indices
Querying text indices is done through query procedures.
Unlike other index types, text indices are not used by the query planner.
Search in specific properties
The text_search.search procedure finds text-indexed nodes matching the given query.
Input:
- index_name: String - The text index to be searched.
- search_query: String - The query applied to the text-indexed nodes.
Output:
- node: Node - A node in index_name matching the given search_query.
Usage:
The search_query parameter follows the query syntax of the Tantivy query parser.
If the query contains property names, prefix each of them with data. (for example, data.title).
The following query searches the complianceDocuments index for nodes whose title property contains Rules2024:
CALL text_search.search("complianceDocuments", "data.title:Rules2024")
YIELD node
RETURN node;
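Property conditions can also be combined. The sketch below assumes the query string is passed to Tantivy's query parser, which supports the boolean operators AND and OR. The second title value, Rules2023, is hypothetical sample data.

```cypher
// Sketch: match nodes whose title contains Rules2023 or Rules2024.
// Each property name carries the data. prefix.
CALL text_search.search("complianceDocuments", "data.title:Rules2023 OR data.title:Rules2024")
YIELD node
RETURN node.title AS title;
```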
Search over all indexed properties
The text_search.search_all procedure looks for text-indexed nodes where at least one property value matches the given query.
Unlike text_search.search, this procedure searches over all properties, and there is no need to specify property names in the query.
Input:
- index_name: String - The text index to be searched.
- search_query: String - The query applied to the text-indexed nodes.
Output:
- node: Node - A node in index_name matching the given search_query.
Usage:
The following query searches the complianceDocuments index for nodes where at least one property value contains Rules2024:
CALL text_search.search_all("complianceDocuments", "Rules2024")
YIELD node
RETURN node;
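Since the procedure yields ordinary nodes, the results can be narrowed down further with regular Cypher clauses. A minimal sketch, assuming the indexed nodes have a numeric version property (a hypothetical property used only for illustration):

```cypher
// Sketch: run the text search first, then filter and sort in Cypher.
// node.version is a hypothetical property.
CALL text_search.search_all("complianceDocuments", "Rules2024")
YIELD node
WITH node
WHERE node.version = 1
RETURN node.title AS title, node.version AS version
ORDER BY title;
```

The filter runs on the nodes returned by the procedure. The text index itself is only consulted by the procedure call.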
Regex search
The text_search.regex_search procedure looks for text-indexed nodes where at least one property value matches the given regular expression (regex).
Input:
- index_name: String - The text index to be searched.
- search_query: String - The regex applied to the text-indexed nodes.
Output:
- node: Node - A node in index_name matching the given search_query.
Usage:
Regex searches apply to all properties; do not include property names in the search query.
The following query searches the complianceDocuments index for nodes where at least one property value satisfies the wor.*s regex, e.g. “works” and “words”:
CALL text_search.regex_search("complianceDocuments", "wor.*s")
YIELD node
RETURN node;
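If only the number of matches is of interest, the yielded nodes can be counted with the standard Cypher count function. A small sketch using the same regex:

```cypher
// Sketch: count the nodes with at least one property value matching the regex.
CALL text_search.regex_search("complianceDocuments", "wor.*s")
YIELD node
RETURN count(node) AS matches;
```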
Aggregations
Aggregations allow you to perform calculations on text search results. By using them, you can efficiently summarize the results, calculate averages or totals, identify min/max values, and count indexed nodes that meet specific criteria.
The text_search.aggregate procedure lets you define an aggregation and apply it to the results of a search query.
Input:
- index_name: String - The text index to be searched.
- search_query: String - The query applied to the text-indexed nodes.
- aggregation_query: String - The aggregation (JSON-formatted) to be applied to the output of search_query.
Output:
- aggregation: String - JSON-formatted string with the output of aggregation.
Usage:
Aggregation queries and results are strings in Elasticsearch-compatible JSON format, where "field" corresponds to node properties. If the search or aggregation queries contain property names, prefix each of them with data. (for example, data.title).
The following query counts the nodes in the complianceDocuments index whose title property contains Rules2024:
CALL text_search.aggregate(
  "complianceDocuments",
  "data.title:Rules2024",
  '{"count": {"value_count": {"field": "metadata.gid"}}}'
)
YIELD aggregation
RETURN aggregation;
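Other Elasticsearch-style metric aggregations follow the same pattern. The sketch below computes an average. It assumes that the avg aggregation is available and that the indexed nodes have a numeric version property. Both avg_version (the label given to the result) and data.version are hypothetical names used for illustration.

```cypher
// Sketch: average of a hypothetical numeric property over the matching nodes.
// The property name carries the data. prefix inside the aggregation JSON.
CALL text_search.aggregate(
  "complianceDocuments",
  "data.title:Rules2024",
  '{"avg_version": {"avg": {"field": "data.version"}}}'
)
YIELD aggregation
RETURN aggregation;
```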
Drop text indices
Text indices are dropped with the DROP TEXT INDEX command. You need to give the name of the index to be deleted.
This statement drops the text index named complianceDocuments:
DROP TEXT INDEX complianceDocuments;
If you attempt to drop an index that does not exist (for example, by dropping the same index twice), the statement will fail.
Compatibility
Because text search is an experimental feature, it supports only some of the features and usage modes available in Memgraph. The table below gives an overview:
Feature | Support |
---|---|
Multitenancy | yes |
Durability | yes |
Storage modes | yes (all) |
Replication | no |
Concurrent transactions | no |
Disclaimer: For now, text search is not guaranteed to work correctly in use cases that involve concurrent transactions or replication.