IMPORT Statement
Overview
The IMPORT statement allows you to import data from external sources into your graph database. It supports importing both vertices (nodes) and edges (relationships) from CSV files stored in S3 or OSS storage.
The IMPORT statement has two main variants:
- IMPORT VERTEX: For importing vertex/node data
- IMPORT EDGE: For importing edge/relationship data
Data Sources
Currently supported data sources:
- Amazon S3
- Alibaba Cloud OSS (Object Storage Service)
File Formats
Currently supported file format:
- CSV (Comma-Separated Values)
Import Process
- Create an import job using IMPORT VERTEX or IMPORT EDGE statement
- Check the job status using CHECK JOB command
- Data will be imported according to specified handling options
IMPORT VERTEX
Syntax
IMPORT VERTEX <vertex_table_name>
COLUMNS (
<vertex_property_name> = $<file_column_index>,
<vertex_property_name> = $<file_column_index>,
...
)
FROM <import_file_source_uri>[,<import_file_source_uri>]+ <sourceOptions>
FORMAT AS CSV <fileFormatOptions>
[ <importOptions> ]
sourceOptions
For S3:
WITH (
REGION = <region>,
ACCESS_KEY_ID = <access_key_id>,
SECRET_ACCESS_KEY = <secret_access_key>
)
For Oss :
WITH (
REGION = <region>,
ACCESS_KEY_ID = <access_key_id>,
SECRET_ACCESS_KEY = <secret_access_key>,
ENDPOINT = <endpoint>
)
fileFormatOptions
(
<key> = <value>,
<key> = <value>,
...
)
For csv file format, user can specify the following options:
Key | Description | Value Type | Default Value |
---|---|---|---|
has_header | Whether the file as header line | Boolean | false |
delimiter | Delimiter to separate the columns | String | "," |
null_value | Recognized spellings for null values. | String | "" |
importOptions
PROPERTIES (
<key> = <value>,
<key> = <value>,
...
)
Key | Description | Value Type | Default Value |
---|---|---|---|
duplicate_vertex_handling | When importing encountered duplicated vertex, how to handle it. | "fail" : Fail the job. "overwrite" : Overwrite the duplicated vertex value. "ignore" : Ignore the vertex. | "ignore" |
log_problematic_lines | Whether to log problematic lines. | Boolean | false |
format_error_handling | How to handle bad format lines. | "fail" : Fail the job. "ignore" : Skip the error line. | "ignore" |
Example
IMPORT VERTEX Person COLUMNS("col1" = $0, "col2" = $1, "col3" = $2)
FROM "s3://kasma-fileio-ci/tinysoc/vPerson.csv","s3://kasma-fileio-ci/tinysoc/vPerson2.csv"
WITH (
region = "xxx",
access_key_id = "xxx",
secret_access_key = "xxx"
)
FORMAT AS CSV (has_header = true, delimiter = ",")
PROPERTIES (
duplicate_vertex_handling = "ignore",
log_problematic_lines = true,
format_error_handling = "ignore"
)
Example of importing from oss
IMPORT VERTEX Person COLUMNS("col1" = $0, "col2" = $1, "col3" = $2)
FROM "oss://kasma-fileio-ci/tinysoc/vPerson.csv"
WITH (
region = "xxx",
access_key_id = "xxx",
secret_access_ke = "xxx",
endpoint = "https://oss-cn-hongkong.aliyuncs.com"
)
FORMAT AS CSV (has_header = true, delimiter = "," )
IMPORT EDGE
Syntax
IMPORT EDGE <edge_table_name>
FROM (
<source_vertex_primary_key_name> = $<file_column_index>,
<source_vertex_primary_key_name> = $<file_column_index>,
...
)
TO (
<target_vertex_primary_key_name> = $<file_column_index>,
<target_vertex_primary_key_name> = $<file_column_index>,
...
)
COLUMNS (
<edge_property_name> = $<file_column_index>,
<edge_property_name> = $<file_column_index>,
...
)
FROM <import_file_source_uri>[,<import_file_source_uri>]+ <sourceOptions>
FORMAT AS CSV <fileFormatOptions>
[ <importOptions> ]
sourceOptions
Same as IMPORT VERTEX.
importOptions
PROPERTIES (
<key> = <value>,
<key> = <value>,
...
)
Key | Description | Value Type | Default Value |
---|---|---|---|
incident_vertex_not_exists_handling | How to handle cases where the edge endpoint vertex does not exist. | "fail" : Fail the job. "ignore" : Ignore the edge. | "ignore" |
log_problematic_lines | Whether to log problematic lines. | Boolean | false |
format_error_handling | How to handle bad format lines. | "fail" : Fail the job. "ignore" : Skip the error line. | "ignore" |
Example
IMPORT EDGE Knows FROM ("col1" = $0) TO ("col1" = $1)
COLUMNS("col1" = $2)
FROM "s3://kasma-fileio-ci/tinysoc/eKnows.csv"
WITH (region = "xxx", access_key_id = "xxx", secret_access_key = "xxx" )
FORMAT AS CSV (has_header = true, delimiter = "," )
PROPERTIES (
incident_vertex_not_exists_handling = "fail",
log_problematic_lines = true,
format_error_handling = "ignore"
)
Check Import Job Staus
IMPORT statement will return an job_id, which can be used to check the import status.
CHECK JOB <job_id>
Last updated on