loaders

ForeignKeyMapping Objects

@dataclass(frozen=True)
class ForeignKeyMapping()

Class that contains the full description of a single foreign key in a table.

Attributes:

  • column_name - Column name that holds the foreign key.
  • reference_table - Name of a table from which the foreign key is taken.
  • reference_key - Column name in the referenced table from which the foreign key is taken.

OneToManyMapping Objects

@dataclass(frozen=True)
class OneToManyMapping()

Class that holds the full description of a single one-to-many mapping in a table.

Attributes:

  • foreign_key - Foreign key used for mapping.
  • label - Label which will be applied to the relationship created from this object.
  • from_entity - Direction of the relationship created from the mapping object.
  • parameters - Parameters that will be added to the relationship created from this object (Optional).
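To make the relationship between ForeignKeyMapping and OneToManyMapping concrete, here is a minimal sketch that uses local stand-in dataclasses mirroring the documented attributes. These are not the library classes: the field types, the `from_entity` default, and the table and column names (`individuals`, `address`, `add_id`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Any, Dict, Optional


# Local stand-ins that mirror the documented attributes; not the library classes.
@dataclass(frozen=True)
class ForeignKeyMapping:
    column_name: str
    reference_table: str
    reference_key: str


@dataclass(frozen=True)
class OneToManyMapping:
    foreign_key: ForeignKeyMapping
    label: str
    from_entity: bool = False  # assumed type and default
    parameters: Optional[Dict[str, Any]] = None


# Hypothetical tables: each row of "individuals" points at one row of "address".
lives_in = OneToManyMapping(
    foreign_key=ForeignKeyMapping(
        column_name="add_id",
        reference_table="address",
        reference_key="add_id",
    ),
    label="LIVES_IN",
)

try:
    lives_in.label = "RESIDES_IN"  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because both classes are declared with `frozen=True`, a mapping cannot be changed after construction; build a new object instead of editing an existing one.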

ManyToManyMapping Objects

@dataclass(frozen=True)
class ManyToManyMapping()

Class that holds the full description of a single many-to-many mapping in a table. A many-to-many mapping is intended for associative tables.

Attributes:

  • foreign_key_from - Describes the source of the relationship.
  • foreign_key_to - Describes the destination of the relationship.
  • label - Label to be applied to the newly created relationship.
  • parameters - Parameters that will be added to the relationship created from this object (Optional).
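As an illustration of the associative-table case, the sketch below describes a hypothetical table `i2i` whose rows link two rows of a hypothetical `individuals` table. The dataclasses are local stand-ins that mirror the documented attributes, with assumed field types; they are not the library classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


# Local stand-ins that mirror the documented attributes; not the library classes.
@dataclass(frozen=True)
class ForeignKeyMapping:
    column_name: str
    reference_table: str
    reference_key: str


@dataclass(frozen=True)
class ManyToManyMapping:
    foreign_key_from: ForeignKeyMapping
    foreign_key_to: ForeignKeyMapping
    label: str
    parameters: Optional[Dict[str, Any]] = None


# Each row of the hypothetical "i2i" table holds two foreign keys into "individuals".
knows = ManyToManyMapping(
    foreign_key_from=ForeignKeyMapping("id_from", "individuals", "ind_id"),
    foreign_key_to=ForeignKeyMapping("id_to", "individuals", "ind_id"),
    label="KNOWS",
)
```

Each row of the associative table then corresponds to one relationship: `foreign_key_from` identifies its source and `foreign_key_to` its destination.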

TableMapping Objects

@dataclass
class TableMapping()

Class that holds the full description of all of the mappings for a single table.

Attributes:

  • table_name - Name of the table.
  • mapping - All of the mappings in the table (Optional).
  • indices - List of the indices to be created for this table (Optional).

NameMappings Objects

@dataclass(frozen=True)
class NameMappings()

Class that contains the new label name and all of the column name mappings for a single table.

Attributes:

  • label - New label (Optional).
  • column_names_mapping - Dictionary containing key-value pairs in form ("column name", "property name") (Optional).

NameMapper Objects

class NameMapper()

Class that holds all name mappings for all of the collections.

get_label

def get_label(collection_name: str) -> str

Returns the label for the given collection.

Arguments:

  • collection_name - Original collection name.

get_property_name

def get_property_name(collection_name: str, column_name: str) -> str

Returns the property name for a column from the collection.

Arguments:

  • collection_name - Original collection name.
  • column_name - Original column name.
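The lookups performed by `get_label` and `get_property_name` can be sketched as follows. This is a stand-in written for illustration, not the library's implementation: the constructor argument and the fallback to the original name when no mapping exists are assumptions, since this reference does not describe either.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


# Stand-in mirroring the documented NameMappings attributes.
@dataclass(frozen=True)
class NameMappings:
    label: Optional[str] = None
    column_names_mapping: Dict[str, str] = field(default_factory=dict)


class NameMapper:
    """Illustrative stand-in; the constructor signature is an assumption."""

    def __init__(self, mappings: Dict[str, NameMappings]) -> None:
        self._mappings = mappings

    def get_label(self, collection_name: str) -> str:
        mapping = self._mappings.get(collection_name)
        # Assumed behaviour: keep the original name when no new label is given.
        if mapping is None or mapping.label is None:
            return collection_name
        return mapping.label

    def get_property_name(self, collection_name: str, column_name: str) -> str:
        mapping = self._mappings.get(collection_name)
        if mapping is None:
            return column_name
        # Assumed behaviour: unmapped columns keep their original name.
        return mapping.column_names_mapping.get(column_name, column_name)


mapper = NameMapper(
    {"individuals": NameMappings(label="INDIVIDUAL", column_names_mapping={"ind_id": "id"})}
)
```

With this sketch, `mapper.get_label("individuals")` yields the new label, while a collection or column without a mapping passes through unchanged.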

FileSystemHandler Objects

class FileSystemHandler(ABC)

Abstract class for defining FileSystemHandler.

Inherit this class, define a custom data source and initialize the connection.

get_path

@abstractmethod
def get_path(collection_name: str) -> str

Returns the complete path to a collection in the specific file system. The path is used to locate and read the corresponding file.
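A custom handler only needs to turn a collection name into a full path. The sketch below shows the pattern with a local stand-in for the abstract class; `PrefixFileSystemHandler` and the prefix `my-bucket/exports` are hypothetical, and the real base class may require additional constructor arguments (such as a file system object) that are not shown in this reference.

```python
import posixpath
from abc import ABC, abstractmethod


class FileSystemHandler(ABC):
    """Stand-in for the documented abstract class; not the library class."""

    @abstractmethod
    def get_path(self, collection_name: str) -> str:
        ...


class PrefixFileSystemHandler(FileSystemHandler):
    """Hypothetical handler that resolves names under a fixed prefix."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix

    def get_path(self, collection_name: str) -> str:
        # Join with forward slashes, as object stores and fsspec paths expect.
        return posixpath.join(self._prefix, collection_name)


handler = PrefixFileSystemHandler("my-bucket/exports")
path = handler.get_path("individuals.csv")
```

Because `get_path` is abstract, the base class itself cannot be instantiated; only subclasses that implement it can.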

S3FileSystemHandler Objects

class S3FileSystemHandler(FileSystemHandler)

Handles connection to Amazon S3 service via PyArrow.

__init__

def __init__(bucket_name: str, **kwargs)

Initializes connection and data bucket.

Arguments:

  • bucket_name - Name of the bucket on S3 from which to read the data.

Kwargs:

  • access_key - S3 access key.
  • secret_key - S3 secret key.
  • region - S3 region.
  • session_token - S3 session token (Optional).

Raises:

  • KeyError - kwargs doesn't contain necessary fields.

get_path

def get_path(collection_name: str) -> str

Get file path in file system.

Arguments:

  • collection_name - Name of the file to read.

AzureBlobFileSystemHandler Objects

class AzureBlobFileSystemHandler(FileSystemHandler)

Handles connection to Azure Blob service via adlfs package.

__init__

def __init__(container_name: str, **kwargs) -> None

Initializes connection and data container.

Arguments:

  • container_name - Name of the Blob container storing data.

Kwargs:

  • account_name - Account name from Azure Blob.
  • account_key - Account key for Azure Blob (Optional - if using sas_token).
  • sas_token - Shared access signature token for authentication (Optional).

Raises:

  • KeyError - kwargs doesn't contain necessary fields.

get_path

def get_path(collection_name: str) -> str

Get file path in file system.

Arguments:

  • collection_name - Name of the file to read.

LocalFileSystemHandler Objects

class LocalFileSystemHandler(FileSystemHandler)

Handles a local filesystem.

__init__

def __init__(path: str) -> None

Initializes an fsspec local file system and sets path to data.

Arguments:

  • path - Path to the local storage location.

get_path

def get_path(collection_name: str) -> str

Get file path in the local file system.

Arguments:

  • collection_name - Name of the file to read.

DataLoader Objects

class DataLoader(ABC)

Implements loading of a data type from file system service to TableToGraphImporter.

__init__

def __init__(file_extension: str,
             file_system_handler: FileSystemHandler) -> None

Arguments:

  • file_extension - File format to be read.
  • file_system_handler - Object for handling of the file system service.

load_data

@abstractmethod
def load_data(collection_name: str, is_cross_table: bool = False) -> None

Override this method in the derived class. It is intended to read the data of a collection stored in the given file format.

Arguments:

  • collection_name - Name of the file to read.
  • is_cross_table - Indicates whether the collection contains an associative table (default=False).

Raises:

  • NotImplementedError - The method is not implemented in the extended class.
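To show what overriding `load_data` might look like, here is a minimal sketch of a loader that yields CSV rows as dictionaries using only the standard library. `DataLoader` and `LocalHandler` are local stand-ins, `CsvDictLoader` is hypothetical, and both the way the file name is built from the collection name and extension and the choice to yield rows are assumptions, not documented behaviour.

```python
import csv
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Dict, Iterator


class LocalHandler:
    """Hypothetical minimal handler: joins a base directory and a file name."""

    def __init__(self, path: str) -> None:
        self._path = path

    def get_path(self, collection_name: str) -> str:
        return os.path.join(self._path, collection_name)


class DataLoader(ABC):
    """Stand-in mirroring the documented constructor and abstract method."""

    def __init__(self, file_extension: str, file_system_handler: LocalHandler) -> None:
        self._file_extension = file_extension
        self._file_system_handler = file_system_handler

    @abstractmethod
    def load_data(self, collection_name: str, is_cross_table: bool = False):
        ...


class CsvDictLoader(DataLoader):
    def load_data(
        self, collection_name: str, is_cross_table: bool = False
    ) -> Iterator[Dict[str, str]]:
        # Assumption: the file is named "<collection_name>.<file_extension>".
        source = self._file_system_handler.get_path(
            f"{collection_name}.{self._file_extension}"
        )
        with open(source, newline="") as handle:
            yield from csv.DictReader(handle)


# Write a small hypothetical table to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "address.csv"), "w", newline="") as handle:
        handle.write("add_id,city\n1,Zagreb\n2,Split\n")
    loader = CsvDictLoader("csv", LocalHandler(directory))
    rows = list(loader.load_data("address"))
```

Yielding rows one at a time keeps memory use flat for large tables, at the cost of requiring the caller to consume the generator while the file is still available.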

PyArrowFileTypeEnum Objects

class PyArrowFileTypeEnum(Enum)

Enumerates file types supported by PyArrow.

PyArrowDataLoader Objects

class PyArrowDataLoader(DataLoader)

Loads data using PyArrow.

PyArrow currently supports "parquet", "ipc"/"arrow"/"feather", "csv", and "orc"; see pyarrow.dataset.dataset for up-to-date information. ds.dataset in load_data accepts any fsspec subclass, which makes this DataLoader compatible with any fsspec-compatible file system.

__init__

def __init__(file_extension_enum: PyArrowFileTypeEnum,
             file_system_handler: FileSystemHandler) -> None

Arguments:

  • file_extension_enum - The file format to be read.
  • file_system_handler - Object for handling of the file system service.

load_data

def load_data(collection_name: str,
              is_cross_table: bool = False,
              columns: Optional[List[str]] = None) -> None

Generator for loading data.

Arguments:

  • collection_name - Name of the file to read.
  • is_cross_table - Flag signifying whether it is a cross table.
  • columns - Table columns to read.

TableToGraphImporter Objects

class TableToGraphImporter(Importer)

Implements translation of table data to graph data, and imports it to Memgraph.

__init__

def __init__(data_loader: DataLoader,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • data_loader - Object for loading data.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).

translate

def translate(drop_database_on_start: bool = True) -> None

Performs the translations.

Arguments:

  • drop_database_on_start - Indicates whether the database should be dropped before the translations start.
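This reference does not spell out the layout of `data_configuration`, so the following is only a sketch of what such a dictionary might look like. The top-level keys (`indices`, `name_mappings`, `one_to_many_relations`, `many_to_many_relations`) are assumptions to be checked against the library, and the tables (`individuals`, `address`, `i2i`) are hypothetical; the nested entries reuse the attribute names documented for the mapping classes above.

```python
# Hypothetical tables: "individuals" and "address", plus associative table "i2i".
# The top-level keys are assumptions; confirm them against the library before use.
data_configuration = {
    "indices": {"individuals": ["ind_id"], "address": ["add_id"]},
    "name_mappings": {
        "individuals": {"label": "INDIVIDUAL"},
        "address": {"label": "ADDRESS"},
    },
    "one_to_many_relations": {
        "address": [],
        "individuals": [
            {
                "foreign_key": {
                    "column_name": "add_id",
                    "reference_table": "address",
                    "reference_key": "add_id",
                },
                "label": "LIVES_IN",
            }
        ],
    },
    "many_to_many_relations": {
        "i2i": {
            "foreign_key_from": {
                "column_name": "id_from",
                "reference_table": "individuals",
                "reference_key": "ind_id",
            },
            "foreign_key_to": {
                "column_name": "id_to",
                "reference_table": "individuals",
                "reference_key": "ind_id",
            },
            "label": "KNOWS",
        }
    },
}

# Collect every table that a one-to-many foreign key points at.
referenced_tables = {
    mapping["foreign_key"]["reference_table"]
    for mappings in data_configuration["one_to_many_relations"].values()
    for mapping in mappings
}
```

A quick consistency check like `referenced_tables` helps catch a foreign key that points at a table missing from the configuration before any import starts.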

PyArrowImporter Objects

class PyArrowImporter(TableToGraphImporter)

TableToGraphImporter wrapper for use with PyArrow for reading data.

__init__

def __init__(file_system_handler: str,
             file_extension_enum: PyArrowFileTypeEnum,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • file_system_handler - File system to read from.
  • file_extension_enum - File format to be read.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).

Raises:

  • ValueError - PyArrow doesn't support ORC on Windows.
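The kind of guard that produces this ValueError can be sketched as below. This is an illustration under stated assumptions, not the library's code: `FileType` is a hypothetical stand-in for PyArrowFileTypeEnum (whose members are not listed in this reference), and `ensure_supported` and its error message are invented for the example.

```python
import platform
from enum import Enum
from typing import Optional


class FileType(Enum):
    """Hypothetical stand-in for PyArrowFileTypeEnum."""

    PARQUET = "parquet"
    CSV = "csv"
    ORC = "orc"
    FEATHER = "feather"


def ensure_supported(file_type: FileType, system: Optional[str] = None) -> None:
    """Raise ValueError for ORC on Windows; `system` is injectable for testing."""
    system = platform.system() if system is None else system
    if file_type is FileType.ORC and system == "Windows":
        raise ValueError("ORC is not supported by PyArrow on Windows.")


try:
    ensure_supported(FileType.ORC, system="Windows")
    orc_rejected = False
except ValueError:
    orc_rejected = True
```

Callers that must run on Windows can catch the ValueError and fall back to another format such as Parquet.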

PyArrowS3Importer Objects

class PyArrowS3Importer(PyArrowImporter)

PyArrowImporter wrapper for use with the Amazon S3 File System.

__init__

def __init__(bucket_name: str,
             file_extension_enum: PyArrowFileTypeEnum,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • bucket_name - Name of the bucket in S3 to read from.
  • file_extension_enum - File format to be read.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for S3FileSystem.

PyArrowAzureBlobImporter Objects

class PyArrowAzureBlobImporter(PyArrowImporter)

PyArrowImporter wrapper for use with the Azure Blob File System.

__init__

def __init__(container_name: str,
             file_extension_enum: PyArrowFileTypeEnum,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • container_name - Name of the container in Azure Blob to read from.
  • file_extension_enum - File format to be read.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for AzureBlobFileSystem.

PyArrowLocalFileSystemImporter Objects

class PyArrowLocalFileSystemImporter(PyArrowImporter)

PyArrowImporter wrapper for use with the Local File System.

__init__

def __init__(path: str,
             file_extension_enum: PyArrowFileTypeEnum,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • path - Full path to the directory to read from.
  • file_extension_enum - File format to be read.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).

ParquetS3FileSystemImporter Objects

class ParquetS3FileSystemImporter(PyArrowS3Importer)

PyArrowS3Importer wrapper for use with the S3 file system and the parquet file type.

__init__

def __init__(bucket_name: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • bucket_name - Name of the bucket in S3 to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for S3FileSystem.

CSVS3FileSystemImporter Objects

class CSVS3FileSystemImporter(PyArrowS3Importer)

PyArrowS3Importer wrapper for use with the S3 file system and the CSV file type.

__init__

def __init__(bucket_name: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • bucket_name - Name of the bucket in S3 to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for S3FileSystem.

ORCS3FileSystemImporter Objects

class ORCS3FileSystemImporter(PyArrowS3Importer)

PyArrowS3Importer wrapper for use with the S3 file system and the ORC file type.

__init__

def __init__(bucket_name: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • bucket_name - Name of the bucket in S3 to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for S3FileSystem.

FeatherS3FileSystemImporter Objects

class FeatherS3FileSystemImporter(PyArrowS3Importer)

PyArrowS3Importer wrapper for use with the S3 file system and the feather file type.

__init__

def __init__(bucket_name: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • bucket_name - Name of the bucket in S3 to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for S3FileSystem.

ParquetAzureBlobFileSystemImporter Objects

class ParquetAzureBlobFileSystemImporter(PyArrowAzureBlobImporter)

PyArrowAzureBlobImporter wrapper for use with the Azure Blob file system and the parquet file type.

__init__

def __init__(container_name: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • container_name - Name of the container in Azure Blob storage to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for AzureBlobFileSystem.

CSVAzureBlobFileSystemImporter Objects

class CSVAzureBlobFileSystemImporter(PyArrowAzureBlobImporter)

PyArrowAzureBlobImporter wrapper for use with the Azure Blob file system and the CSV file type.

__init__

def __init__(container_name: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • container_name - Name of the container in Azure Blob storage to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for AzureBlobFileSystem.

ORCAzureBlobFileSystemImporter Objects

class ORCAzureBlobFileSystemImporter(PyArrowAzureBlobImporter)

PyArrowAzureBlobImporter wrapper for use with the Azure Blob file system and the ORC file type.

__init__

def __init__(container_name,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • container_name - Name of the container in Azure Blob storage to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for AzureBlobFileSystem.

FeatherAzureBlobFileSystemImporter Objects

class FeatherAzureBlobFileSystemImporter(PyArrowAzureBlobImporter)

PyArrowAzureBlobImporter wrapper for use with the Azure Blob file system and the Feather file type.

__init__

def __init__(container_name,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None,
             **kwargs) -> None

Arguments:

  • container_name - Name of the container in Azure Blob storage to read from.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).
  • **kwargs - Specified for AzureBlobFileSystem.

ParquetLocalFileSystemImporter Objects

class ParquetLocalFileSystemImporter(PyArrowLocalFileSystemImporter)

PyArrowLocalFileSystemImporter wrapper for use with the local file system and the parquet file type.

__init__

def __init__(path: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • path - Full path to directory.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).

CSVLocalFileSystemImporter Objects

class CSVLocalFileSystemImporter(PyArrowLocalFileSystemImporter)

PyArrowLocalFileSystemImporter wrapper for use with the local file system and the CSV file type.

__init__

def __init__(path: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • path - Full path to directory.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).

ORCLocalFileSystemImporter Objects

class ORCLocalFileSystemImporter(PyArrowLocalFileSystemImporter)

PyArrowLocalFileSystemImporter wrapper for use with the local file system and the ORC file type.

__init__

def __init__(path: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • path - Full path to directory.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).

FeatherLocalFileSystemImporter Objects

class FeatherLocalFileSystemImporter(PyArrowLocalFileSystemImporter)

PyArrowLocalFileSystemImporter wrapper for use with the local file system and the Feather/IPC/Arrow file type.

__init__

def __init__(path: str,
             data_configuration: Dict[str, Any],
             memgraph: Optional[Memgraph] = None) -> None

Arguments:

  • path - Full path to directory.
  • data_configuration - Configuration for the translations.
  • memgraph - Connection to Memgraph (Optional).