Text search

⚠️ Text search is an experimental feature introduced in Memgraph 2.15.1. To use it, start Memgraph with the --experimental-enabled=text-search flag.

Text search allows you to look up nodes with properties that contain specific content. For a node to be searchable, you first need to create a text index that applies to it.

Text indices and search are powered by the Tantivy full-text search engine.

Create text indices

Text indices are created with the CREATE TEXT INDEX command. You need to give a name to the new index and specify which labels it should apply to.

This statement creates a text index named complianceDocuments for nodes with the Report label:

CREATE TEXT INDEX complianceDocuments ON :Report;

If you attempt to create an index with an existing name, the statement will fail.

What is indexed

If a text index applies to a node, all of the node's properties with text-indexable types (String, Integer, Float, or Boolean) are stored in the index.
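As an illustration of the rule above, consider the following sketch. The node and all of its property names and values are hypothetical and are not part of the original example data. The first four properties have text-indexable types; the list-valued property is not one of the listed types, so it would presumably not be stored.

```cypher
// Hypothetical node; property names and values are illustrative only.
CREATE (:Report {
    title: "Rules2024",      // String: text-indexable
    version: 1,              // Integer: text-indexable
    score: 0.87,             // Float: text-indexable
    approved: true,          // Boolean: text-indexable
    tags: ["audit", "2024"]  // List: not among the text-indexable types listed above
});
```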

Show text indices

To list all text indices in Memgraph, use the SHOW INDEX INFO statement.
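The statement takes no arguments. A minimal sketch; the exact output columns may differ between Memgraph versions, so none are shown here:

```cypher
// Lists the indices known to Memgraph, text indices included.
SHOW INDEX INFO;
```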

Query text indices

Querying text indices is done through query procedures.

Unlike other index types, text indices are not used by the query planner.

Search in specific properties

The text_search.search procedure finds text-indexed nodes matching the given query.

Input:

  • index_name: String - The text index to be searched.
  • search_query: String - The query applied to the text-indexed nodes.

Output:

  • node: Node - A node in index_name matching the given search_query.

Usage:

The search_query parameter follows the query syntax of the underlying Tantivy engine, described in the Tantivy query parser documentation. If the query contains property names, attach the data. prefix to them.

The following query searches the complianceDocuments index for nodes whose title property contains Rules2024:

CALL text_search.search("complianceDocuments", "data.title:Rules2024")
YIELD node
RETURN node;
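Conditions on several properties can be combined in one query string. The sketch below assumes that the indexed nodes carry a hypothetical integer version property and that the AND operator of the Tantivy query syntax is passed through unchanged; neither is confirmed by the text above.

```cypher
// Assumption: `version` is a hypothetical property on :Report nodes.
// Each property name carries the data. prefix, as required for search queries.
CALL text_search.search("complianceDocuments", "data.title:Rules2024 AND data.version:1")
YIELD node
RETURN node;
```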

Search over all indexed properties

The text_search.search_all procedure looks for text-indexed nodes where at least one property value matches the given query.

Unlike text_search.search, this procedure searches over all properties, and there is no need to specify property names in the query.

Input:

  • index_name: String - The text index to be searched.
  • search_query: String - The query applied to the text-indexed nodes.

Output:

  • node: Node - A node in index_name matching the given search_query.

Usage:

The following query searches the complianceDocuments index for nodes where at least one property value contains Rules2024:

CALL text_search.search_all("complianceDocuments", "Rules2024")
YIELD node
RETURN node;
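Because the procedure yields ordinary nodes, its results can be narrowed further with regular Cypher clauses. A minimal sketch, assuming hypothetical title and version properties on the matched nodes:

```cypher
// Assumption: `title` and `version` are hypothetical properties on the matched nodes.
CALL text_search.search_all("complianceDocuments", "Rules2024")
YIELD node
WITH node
WHERE node.version > 1          // post-filter applied by Cypher, not by the text index
RETURN node.title AS title, node.version AS version
ORDER BY version DESC
LIMIT 10;
```

The text index only selects the candidate nodes; the WHERE, ORDER BY, and LIMIT clauses are evaluated on the yielded nodes afterwards.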

Regex search

The text_search.regex_search procedure looks for text-indexed nodes where at least one property value matches the given regular expression (regex).

Input:

  • index_name: String - The text index to be searched.
  • search_query: String - The regex applied to the text-indexed nodes.

Output:

  • node: Node - A node in index_name matching the given search_query.

Usage:

Regex searches apply to all properties; do not include property names in the search query.

The following query searches the complianceDocuments index for nodes where at least one property value matches the regex wor.*s, for example "works" or "words":

CALL text_search.regex_search("complianceDocuments", "wor.*s")
YIELD node
RETURN node;
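The yielded nodes can also be summarized instead of returned. A minimal sketch that reuses the same index and regex and counts the matches with the standard count() function:

```cypher
// Counts the nodes matched by the regex instead of returning them.
CALL text_search.regex_search("complianceDocuments", "wor.*s")
YIELD node
RETURN count(node) AS matches;
```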

Aggregations

Aggregations allow you to perform calculations on text search results. By using them, you can efficiently summarize the results, calculate averages or totals, identify min/max values, and count indexed nodes that meet specific criteria.

The text_search.aggregate procedure lets you define an aggregation and apply it to the results of a search query.

Input:

  • index_name: String - The text index to be searched.
  • search_query: String - The query applied to the text-indexed nodes.
  • aggregation_query: String - The aggregation (JSON-formatted) to be applied to the output of search_query.

Output:

  • aggregation: String - JSON-formatted string with the output of aggregation.

Usage:

Aggregation queries and results are JSON strings in an Elasticsearch-compatible format, where "field" refers to a node property. If the search or aggregation queries contain property names, attach the data. prefix to them.

The following query counts the nodes in the complianceDocuments index whose title property contains Rules2024:

CALL text_search.aggregate(
    "complianceDocuments",
    "data.title:Rules2024",
    '{"count": {"value_count": {"field": "metadata.gid"}}}'
)
YIELD aggregation
RETURN aggregation;
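Other metrics follow the same pattern. The sketch below assumes a hypothetical numeric version property and assumes that the avg metric of the Elasticsearch-compatible format is accepted by the procedure; the name avg_version is an arbitrary label for the result.

```cypher
// Assumptions: `version` is a hypothetical numeric property, and the "avg" metric is supported.
// The property name carries the data. prefix, as required for aggregation queries.
CALL text_search.aggregate(
    "complianceDocuments",
    "data.title:Rules2024",
    '{"avg_version": {"avg": {"field": "data.version"}}}'
)
YIELD aggregation
RETURN aggregation;
```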

Drop text indices

Text indices are dropped with the DROP TEXT INDEX command. You need to give the name of the index to be deleted.

This statement drops the text index named complianceDocuments:

DROP TEXT INDEX complianceDocuments;

If you attempt to drop an index that does not exist, for example by dropping the same index twice, the statement will fail.

Compatibility

Because it is an experimental feature, text search supports only some of the usage modes available in Memgraph. Refer to the table below for an overview:

Feature                  Support
Multitenancy             yes
Durability               yes
Storage modes            yes (all)
Replication              no
Concurrent transactions  no

⚠️ Disclaimer: For now, text search is not guaranteed to work correctly in use cases that involve concurrent transactions and replication.