text

This algorithm is available again, starting with MAGE version 1.22. The algorithm was unavailable in MAGE from version 1.14 to version 1.21.

The text module offers a toolkit for manipulating strings.

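+----------------+------------+
| Trait          | Value      |
+----------------+------------+
| Module type    | util       |
| Implementation | C++        |
| Parallelism    | sequential |
+----------------+------------+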

Procedures

join()

Joins all the strings into a single one with the given delimiter between them.

Input:

  • subgraph: Graph (OPTIONAL) ➡ A specific subgraph, which is an object of type Graph returned by the project() function, on which the algorithm is run. If subgraph is not specified, the algorithm is computed on the entire graph by default.
  • strings: List[string] ➡ A list of strings to be joined.
  • delimiter: string ➡ A string to be inserted between the given strings.

Output:

  • string: string ➡ The joined string.

Usage:

To join strings, use the following query:

CALL text.join(["idora", " ", "ivan", "", "matija"], ",") 
YIELD string 
RETURN string;

Result:

+----------------------------+
| string                     |
+----------------------------+
| "idora, ,ivan,,matija"     |
+----------------------------+
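
As a rough illustration of the semantics described above, the following Python sketch reproduces the same result outside Memgraph. The helper name join_strings is hypothetical and is not part of MAGE; it only models how empty and whitespace-only strings are kept as separate items.

```python
from typing import List


def join_strings(strings: List[str], delimiter: str) -> str:
    """Place the delimiter between every pair of consecutive strings."""
    # Empty strings are kept, so two delimiters can end up adjacent.
    return delimiter.join(strings)


print(join_strings(["idora", " ", "ivan", "", "matija"], ","))
```

Every item in the list contributes a position to the output, including the empty string, which is why the result contains two consecutive commas.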

regexGroups()

The procedure returns all matched subexpressions of the regex on the provided text using the C++ regex library.

Input:

  • subgraph: Graph (OPTIONAL) ➡ A specific subgraph, which is an object of type Graph returned by the project() function, on which the algorithm is run. If subgraph is not specified, the algorithm is computed on the entire graph by default.
  • input: string ➡ Text that will be searched for regex subexpressions.
  • regex: string ➡ Regular expression whose subexpressions (capture groups) are searched for in the text.

Output:

  • results: List[List[string]] ➡ All matches. Each inner list contains the whole match followed by the subexpressions matched inside it.

Usage:

Use the following query to search for expressions:

CALL text.regexGroups("Memgraph: 1\nSQL: 2", "(\\w+): (\\d+)")
YIELD results
RETURN results;

Result:

+------------------------------------------------------------+
| results                                                    |
+------------------------------------------------------------+
| [["Memgraph: 1", "Memgraph", "1"], ["SQL: 2", "SQL", "2"]] |
+------------------------------------------------------------+
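
The shape of the output can be modeled with Python's re module. This is only a sketch under the assumption that the pattern behaves the same in both regex dialects (the procedure itself uses the C++ regex library); the function name regex_groups is hypothetical.

```python
import re
from typing import List


def regex_groups(text: str, pattern: str) -> List[List[str]]:
    """Return one list per match: the whole match, then each capture group."""
    results = []
    for match in re.finditer(pattern, text):
        # group(0) is the whole match; groups() holds the captured subexpressions.
        # Groups that did not participate in the match are reported as "".
        results.append([match.group(0), *match.groups(default="")])
    return results


print(regex_groups("Memgraph: 1\nSQL: 2", r"(\w+): (\d+)"))
```

The raw string r"(\w+): (\d+)" corresponds to the doubly escaped "(\\w+): (\\d+)" in the Cypher query.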

format()

The procedure formats strings using the C++ fmt library.

Input:

  • subgraph: Graph (OPTIONAL) ➡ A specific subgraph, which is an object of type Graph returned by the project() function, on which the algorithm is run. If subgraph is not specified, the algorithm is computed on the entire graph by default.
  • text: string ➡ Text that needs to be formatted.
  • parameters: List[Any] ➡ Parameters that are inserted into the placeholders in the text, in order.

Output:

  • result: string ➡ Formatted string.

Usage:

Use the following query to insert the parameters into the placeholders in the sentence:

CALL text.format("Memgraph is the number {} {} in the world.", [1, "graph database"])
YIELD result
RETURN result;

Result:

+---------------------------------------------------------+
| result                                                  |
+---------------------------------------------------------+
| "Memgraph is the number 1 graph database in the world." |
+---------------------------------------------------------+
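
The {} placeholder syntax of the fmt library is modeled on Python's str.format, so the positional substitution can be sketched in Python. The helper name format_text is hypothetical, and the sketch covers only plain {} placeholders, not the full set of format specifiers.

```python
from typing import Any, List


def format_text(text: str, parameters: List[Any]) -> str:
    """Fill each {} placeholder with the next parameter, in order."""
    # Unpack the list so that each element becomes one positional argument.
    return text.format(*parameters)


print(format_text("Memgraph is the number {} {} in the world.", [1, "graph database"]))
```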

replace()

Replace each substring of the given string that matches the given regular expression with the given replacement.

Input:

  • subgraph: Graph (OPTIONAL) ➡ A specific subgraph, which is an object of type Graph returned by the project() function, on which the algorithm is run. If subgraph is not specified, the algorithm is computed on the entire graph by default.
  • text: string ➡ Text in which the matches are replaced.
  • regex: string ➡ Regular expression that selects the substrings to replace.
  • replacement: string ➡ String that replaces each match.

Output:

  • string ➡ The text with every match replaced.

Usage:

Use the following queries to do text replacement:

RETURN text.replace('Hello World!', '[^a-zA-Z]', '') AS result;

Result:

+--------------+
| result       |
+--------------+
| "HelloWorld" |
+--------------+

RETURN text.replace('MAGE is a Memgraph Product', 'MAGE', 'GQLAlchemy') AS result;

Result:

+------------------------------------+
| result                             |
+------------------------------------+
| "GQLAlchemy is a Memgraph Product" |
+------------------------------------+
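
Both replacements above can be reproduced with a small Python sketch based on re.sub. The helper name replace_matches is hypothetical, and the sketch assumes the replacement is inserted literally, which may differ from how the C++ implementation treats special sequences in the replacement string.

```python
import re


def replace_matches(text: str, regex: str, replacement: str) -> str:
    """Replace every non-overlapping match of regex in text with replacement."""
    # The lambda returns the replacement unchanged, so backslashes and
    # group references in it are not interpreted (an assumption of this sketch).
    return re.sub(regex, lambda _match: replacement, text)


print(replace_matches("Hello World!", "[^a-zA-Z]", ""))
print(replace_matches("MAGE is a Memgraph Product", "MAGE", "GQLAlchemy"))
```

The first call removes every character that is not a letter, including the space and the exclamation mark; the second replaces the single literal match.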

regReplace()

Replace each substring of the given string that matches the given regular expression with the given replacement.

Input:

  • subgraph: Graph (OPTIONAL) ➡ A specific subgraph, which is an object of type Graph returned by the project() function, on which the algorithm is run. If subgraph is not specified, the algorithm is computed on the entire graph by default.
  • text: string ➡ Text in which the matches are replaced.
  • regex: string ➡ Regular expression that selects the substrings to replace.
  • replacement: string ➡ String that replaces each match.

Output:

  • string ➡ The text with every match replaced.

Usage:

Use the following query to do text replacement:

RETURN text.regreplace("Memgraph MAGE Memgraph MAGE", "MAGE", "GQLAlchemy") AS result;

Result:

+-------------------------------------------+
| result                                    |
+-------------------------------------------+
| "Memgraph GQLAlchemy Memgraph GQLAlchemy" |
+-------------------------------------------+

distance()

Compare the given strings with the Levenshtein distance algorithm.

Input:

  • subgraph: Graph (OPTIONAL) ➡ A specific subgraph, which is an object of type Graph returned by the project() function, on which the algorithm is run. If subgraph is not specified, the algorithm is computed on the entire graph by default.
  • text1: string ➡ Source string.
  • text2: string ➡ Destination string for comparison.

Output:

  • integer ➡ The Levenshtein distance between the two strings.

Usage:

Use the following query to calculate the distance between two strings:

RETURN text.distance("Levenshtein", "Levenstein") AS result;

Result:

+--------+
| result |
+--------+
| 1      |
+--------+
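
The Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The Python sketch below is a textbook dynamic-programming version, offered only as an illustration and not as MAGE's C++ implementation; the function name levenshtein is hypothetical.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-character edits turning source into target."""
    # previous[j] is the distance between the processed prefix of source
    # and the first j characters of target; it starts with the empty prefix.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]  # distance from source[:i] to the empty string
        for j, target_char in enumerate(target, start=1):
            cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,         # delete source_char
                current[j - 1] + 1,      # insert target_char
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("Levenshtein", "Levenstein"))
```

For the query above the distance is 1 because deleting the single "h" from "Levenshtein" yields "Levenstein".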