IMPORT Statement
Overview
The IMPORT statement allows you to import data from external sources into your graph database. It supports importing both vertices (nodes) and edges (relationships) from CSV files stored in S3 or OSS storage.
The IMPORT statement has two main variants:
- IMPORT VERTEX: For importing vertex/node data
- IMPORT EDGE: For importing edge/relationship data
Data Sources
Currently supported data sources:
- Amazon S3
- Alibaba Cloud OSS (Object Storage Service)
File Formats
Currently supported file format:
- CSV (Comma-Separated Values)
Import Process
- Create an import job using the IMPORT VERTEX or IMPORT EDGE statement
- Check the job status using the CHECK JOB command
- Data is imported according to the specified handling options
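The submit-then-poll lifecycle above can be sketched in Python. Note that `execute` is a hypothetical placeholder for whatever client call sends a statement to the database; it is not part of this product's documented API.

```python
# Terminal states a CHECK JOB status object can report (per the status
# values documented below: Running and Pending are non-terminal).
TERMINAL_KINDS = {"Finished", "Failed", "Cancelled"}

def is_terminal(status: dict) -> bool:
    """Return True once a CHECK JOB status object reports a final state."""
    return status.get("kind") in TERMINAL_KINDS

# Typical loop, with `execute` standing in for a hypothetical client call:
#   job_id = execute('IMPORT VERTEX Person ...')
#   while not is_terminal(execute(f'CHECK JOB {job_id}')):
#       time.sleep(1)
```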
IMPORT VERTEX
Syntax
IMPORT VERTEX <vertex_table_name>
COLUMNS (
<vertex_property_name> = $<file_column_index>,
<vertex_property_name> = $<file_column_index>,
...
)
FROM <import_file_source_uri>[,<import_file_source_uri>]+ <sourceOptions>
FORMAT AS CSV <fileFormatOptions>
[ <importOptions> ]
sourceOptions
For S3:
WITH (
REGION = <region>,
ACCESS_KEY_ID = <access_key_id>,
SECRET_ACCESS_KEY = <secret_access_key>
)
For OSS:
WITH (
REGION = <region>,
ACCESS_KEY_ID = <access_key_id>,
SECRET_ACCESS_KEY = <secret_access_key>,
ENDPOINT = <endpoint>
)
fileFormatOptions
(
<key> = <value>,
<key> = <value>,
...
)
For the CSV file format, users can specify the following options:
Key | Description | Value Type | Default Value |
---|---|---|---|
has_header | Whether the file has a header line | Boolean | false |
delimiter | Delimiter to separate the columns | String | "," |
null_value | Recognized spellings for null values. | String | "" |
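The effect of these three options can be illustrated with a small Python sketch built on the standard-library csv module. This models the documented semantics; it is not the server's actual parser.

```python
import csv
import io

def parse_csv(text, has_header=False, delimiter=",", null_value=""):
    """Model of the CSV options above: optionally skip a header row,
    split columns on `delimiter`, and map cells that exactly match
    `null_value` to None (a null)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if has_header and rows:
        rows = rows[1:]  # has_header = true: drop the first line
    return [[None if cell == null_value else cell for cell in row]
            for row in rows]

sample = "id|name\n1|Alice\n2|\n"
print(parse_csv(sample, has_header=True, delimiter="|"))
# [['1', 'Alice'], ['2', None]]
```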
importOptions
PROPERTIES (
<key> = <value>,
<key> = <value>,
...
)
Key | Description | Value Type | Default Value |
---|---|---|---|
duplicate_vertex_handling | How to handle duplicate vertices encountered during import. | "fail" : Fail the job. "overwrite" : Overwrite the duplicated vertex value. "ignore" : Ignore the vertex. | "ignore" |
log_problematic_lines | Whether to log problematic lines. | Boolean | false |
format_error_handling | How to handle malformed lines. | "fail" : Fail the job. "ignore" : Skip the malformed line. | "ignore" |
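The three duplicate_vertex_handling modes can be sketched in Python over a list of rows keyed by their primary-key column. This is an illustration of the documented behavior, not the engine's implementation.

```python
def apply_duplicate_handling(rows, key_index=0, mode="ignore"):
    """Model of duplicate_vertex_handling: "fail" raises on a duplicate
    key, "overwrite" keeps the last copy seen, "ignore" keeps the first."""
    seen = {}
    for row in rows:
        key = row[key_index]
        if key in seen:
            if mode == "fail":
                raise ValueError(f"duplicate vertex key: {key}")
            if mode == "overwrite":
                seen[key] = row   # last copy wins
            # mode == "ignore": keep the first copy, drop this row
        else:
            seen[key] = row
    return list(seen.values())

rows = [["1", "Alice"], ["1", "Alicia"], ["2", "Bob"]]
print(apply_duplicate_handling(rows, mode="ignore"))     # [['1', 'Alice'], ['2', 'Bob']]
print(apply_duplicate_handling(rows, mode="overwrite"))  # [['1', 'Alicia'], ['2', 'Bob']]
```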
Example
IMPORT VERTEX Person COLUMNS("col1" = $0, "col2" = $1, "col3" = $2)
FROM "s3://kasma-fileio-ci/tinysoc/vPerson.csv"
WITH (
region = "xxx",
access_key_id = "xxx",
secret_access_key = "xxx"
)
FORMAT AS CSV (has_header = true, delimiter = ",")
PROPERTIES (
duplicate_vertex_handling = "ignore",
log_problematic_lines = true,
format_error_handling = "ignore"
)
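For reference, a file compatible with the example above (has_header = true, three columns mapped as $0, $1, $2) might look like the following; the header names and row values here are invented for illustration, not taken from the actual vPerson.csv.

```
col1,col2,col3
1,Alice,30
2,Bob,25
```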
Example of importing from OSS
IMPORT VERTEX Person COLUMNS("col1" = $0, "col2" = $1, "col3" = $2)
FROM "oss://kasma-fileio-ci/tinysoc/vPerson.csv"
WITH (
region = "xxx",
access_key_id = "xxx",
secret_access_key = "xxx",
endpoint = "https://oss-cn-hongkong.aliyuncs.com"
)
FORMAT AS CSV (has_header = true, delimiter = "," )
IMPORT EDGE
Syntax
IMPORT EDGE <edge_table_name>
FROM (
<source_vertex_primary_key_name> = $<file_column_index>,
<source_vertex_primary_key_name> = $<file_column_index>,
...
)
TO (
<target_vertex_primary_key_name> = $<file_column_index>,
<target_vertex_primary_key_name> = $<file_column_index>,
...
)
COLUMNS (
<edge_property_name> = $<file_column_index>,
<edge_property_name> = $<file_column_index>,
...
)
FROM <import_file_source_uri>[,<import_file_source_uri>]+ <sourceOptions>
FORMAT AS CSV <fileFormatOptions>
[ <importOptions> ]
sourceOptions
Same as IMPORT VERTEX.
importOptions
PROPERTIES (
<key> = <value>,
<key> = <value>,
...
)
Key | Description | Value Type | Default Value |
---|---|---|---|
incident_vertex_not_exists_handling | How to handle cases where the edge endpoint vertex does not exist. | "fail" : Fail the job. "ignore" : Ignore the edge. | "ignore" |
log_problematic_lines | Whether to log problematic lines. | Boolean | false |
format_error_handling | How to handle malformed lines. | "fail" : Fail the job. "ignore" : Skip the malformed line. | "ignore" |
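The incident_vertex_not_exists_handling modes can likewise be sketched in Python: given the set of vertex ids already in the graph, an edge whose source or target is missing is either dropped or causes a failure. Again, this models the documented semantics rather than the engine itself.

```python
def apply_endpoint_handling(edges, existing_ids, mode="ignore"):
    """Model of incident_vertex_not_exists_handling: keep an edge only if
    both endpoints exist; otherwise "fail" raises and "ignore" skips it."""
    kept = []
    for src, dst in edges:
        if src in existing_ids and dst in existing_ids:
            kept.append((src, dst))
        elif mode == "fail":
            raise ValueError(f"missing endpoint for edge {src}->{dst}")
        # mode == "ignore": silently skip the edge
    return kept

edges = [("1", "2"), ("1", "9")]            # vertex "9" does not exist
print(apply_endpoint_handling(edges, existing_ids={"1", "2"}))  # [('1', '2')]
```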
Example
IMPORT EDGE Knows FROM ("col1" = $0) TO ("col1" = $1)
COLUMNS("col1" = $2)
FROM "s3://kasma-fileio-ci/tinysoc/eKnows.csv"
WITH (region = "xxx", access_key_id = "xxx", secret_access_key = "xxx" )
FORMAT AS CSV (has_header = true, delimiter = "," )
PROPERTIES (
incident_vertex_not_exists_handling = "fail",
log_problematic_lines = true,
format_error_handling = "ignore"
)
Check Import Job Status
The IMPORT statement returns a job_id, which can be used to check the import status.
CHECK JOB <job_id>
The result is a JSON object with a required kind field, which indicates the job status. The possible values are:
- Running : The job is running, and the status object contains no other fields.
- Finished : The job finished successfully. The status object contains the following fields:
  - load_data_rows : The number of rows loaded.
  - load_data_bytes : The number of bytes loaded.
  - job_start_ts : The start timestamp of the job.
  - job_end_ts : The end timestamp of the job.
  - warning : The warning message, if any.
- Failed : The job failed. The status object contains the following field:
  - reason : The error reason.
- Cancelled : The job was cancelled.
- Pending : The job is pending, waiting to be scheduled.
Example
Job finished successfully.
{
"kind" : "Finished",
"warning": null,
"load_data_rows": 8,
"load_data_bytes": 2368,
"job_start_ts":1756973976062646,
"job_end_ts":1756973976841641
}
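The timestamps in the sample above appear to be microseconds since the Unix epoch; that unit is an assumption inferred from the magnitude of the values, not stated by the documentation. Under that assumption, the job's duration can be computed from the status object like this:

```python
def job_duration_seconds(status: dict) -> float:
    """Duration of a Finished job from its CHECK JOB status object.
    Assumes job_start_ts / job_end_ts are microseconds since the epoch
    (inferred from the sample values; not documented)."""
    if status.get("kind") != "Finished":
        raise ValueError("duration is only defined for Finished jobs")
    return (status["job_end_ts"] - status["job_start_ts"]) / 1_000_000

status = {
    "kind": "Finished",
    "warning": None,
    "load_data_rows": 8,
    "load_data_bytes": 2368,
    "job_start_ts": 1756973976062646,
    "job_end_ts": 1756973976841641,
}
print(round(job_duration_seconds(status), 3))  # 0.779
```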
BLOCKING IMPORT
Sometimes we only want to submit the job, wait for it to finish, and get the result in a single call. To do this, add the BLOCKING keyword to the import statement.
BLOCKING IMPORT VERTEX Person COLUMNS("col1" = $0, "col2" = $1, "col3" = $2)
FROM "s3://kasma-fileio-ci/tinysoc/vPerson.csv"
WITH (
region = "xxx",
access_key_id = "xxx",
secret_access_key = "xxx"
)
FORMAT AS CSV (has_header = true, delimiter = ",")
PROPERTIES (
duplicate_vertex_handling = "ignore",
log_problematic_lines = true,
format_error_handling = "ignore"
)
In this way, the import statement will not return until the job is finished.