LOAD CSV clause
The LOAD CSV
clause enables you to load and use data from a CSV file of your
choosing in a row-based manner within a query. We support the Excel CSV dialect,
as it’s the most commonly used one.
The syntax of the clause is:
LOAD CSV FROM <csv-location> ( WITH | NO ) HEADER [IGNORE BAD] [DELIMITER <delimiter-string>] [QUOTE <quote-string>] [NULLIF <nullif-string>] AS <variable-name>
-
<csv-location>
is a string of the location to the CSV file. Without a URL protocol it refers to a file path. There are no restrictions on where in your filesystem the file can be located, as long as the path is valid (i.e., the file exists). If usinghttp://
,https://
, orftp://
the CSV file will be fetched over the network. -
( WITH | NO ) HEADER
flag specifies whether the CSV file is to be parsed as though it has or hasn’t got a header. -
IGNORE BAD
flag specifies whether rows containing errors should be ignored or not. If it’s set, the parser attempts to return the first valid row from the CSV file. If it isn’t set, an exception will be thrown on the first invalid row encountered. -
DELIMITER <delimiter-string>
option enables you to specify the CSV delimiter character. If it isn’t set, the default delimiter character,
is assumed. -
QUOTE <quote-string>
option enables you to specify the CSV quote character. If it isn’t set, the default quote character"
is assumed. -
NULLIF <nullif-string>
option enables you to specify a sequence of characters that will be parsed as null. By default, all empty columns in Memgraph are treated as empty strings, so if this option is not used, no values will be treated as null. -
<variable-name>
is a symbolic name representing the variable to which the contents of the parsed row will be bound to, enabling access to the row contents later in the query.
The clause reads row by row from a CSV file and binds the contents of the parsed row to the variable you specified.
Adding a MATCH
or MERGE
clause before the LOAD CSV allows you to match
certain entities in the graph before running LOAD CSV, which is an optimization
as matched entities do not need to be searched for every row in the CSV file.
But, the MATCH
or MERGE
clause can be used prior the LOAD CSV
clause only
if the clause returns only one row. Returning multiple rows before calling the
LOAD CSV
clause will cause a Memgraph runtime error.
It’s important to note that the parser parses the values as strings.
It’s up to the user to convert the parsed row values to the appropriate type.
This can be done using the built-in conversion functions such as ToInteger
,
ToFloat
, ToBoolean
, ToEnum
, etc. Consult the documentation on the
available conversion functions.
Compressed CSV content with, gzip
or bzip2
will be automatically
decompressed on read.