Using Custom Graph
This guide walks you through the process of using custom graph data in GraphScope Interactive. The process comprises three main steps:
Creating a new graph,
Importing graph data, and
Starting the service with the new graph.
We use a simple graph, containing a single vertex type labeled person and a single edge type labeled knows, to demonstrate how to create a new graph in Interactive.
Step 1: Create a New Graph
Before starting, please make sure you are in the GLOBAL context.
gsctl use GLOBAL
First, define the vertex types and edge types of your graph. Below is a sample definition for a graph named test_graph. Save it to disk as test_graph.yaml.
name: test_graph
description: "This is a test graph"
schema:
  vertex_types:
    - type_name: person
      properties:
        - property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_name: name
          property_type:
            string:
              long_text: ""
        - property_name: age
          property_type:
            primitive_type: DT_SIGNED_INT32
      primary_keys:
        - id
  edge_types:
    - type_name: knows
      vertex_type_pair_relations:
        - source_vertex: person
          destination_vertex: person
          relation: MANY_TO_MANY
      properties:
        - property_name: weight
          property_type:
            primitive_type: DT_DOUBLE
In this file:
For each vertex type, specify its name, allowed properties, primary keys (if any), and other relevant details.
For each edge type, define the source/destination vertex types and their associated properties.
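The rules above lend themselves to a quick sanity check before the graph is created. The following Python sketch is illustrative only and is not part of gsctl: it mirrors test_graph.yaml as a plain dictionary (loading the file itself would need a YAML parser such as PyYAML, which is assumed here rather than shown), and the helper name check_schema is hypothetical.

```python
# Schema mirrored by hand from test_graph.yaml (property types omitted for brevity).
schema = {
    "vertex_types": [
        {
            "type_name": "person",
            "properties": ["id", "name", "age"],
            "primary_keys": ["id"],
        }
    ],
    "edge_types": [
        {
            "type_name": "knows",
            "vertex_type_pair_relations": [
                {"source_vertex": "person", "destination_vertex": "person"}
            ],
        }
    ],
}


def check_schema(schema):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    vertex_names = {v["type_name"] for v in schema["vertex_types"]}
    # Every primary key must be one of the vertex type's declared properties.
    for vertex in schema["vertex_types"]:
        for key in vertex.get("primary_keys", []):
            if key not in vertex["properties"]:
                problems.append(f"{vertex['type_name']}: primary key {key!r} is not a property")
    # Every edge endpoint must refer to a declared vertex type.
    for edge in schema["edge_types"]:
        for rel in edge["vertex_type_pair_relations"]:
            for end in ("source_vertex", "destination_vertex"):
                if rel[end] not in vertex_names:
                    problems.append(f"{edge['type_name']}: unknown {end} {rel[end]!r}")
    return problems


print(check_schema(schema))  # prints [] for the schema above
```

The check only covers the two relationships described above (primary keys and edge endpoints); it does not validate property types.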
To create the new graph test_graph, execute the following command:
gsctl create graph -f ./test_graph.yaml
For a comprehensive list of supported types, please refer to the data model page.
Step 2: Import Graph Data
To import your data, you need to first bind the data source and then submit a bulk loading job.
Bind Data Source
To populate the new graph, you need the graph's raw data. We currently support files in CSV format. We have prepared a sample dataset for you, which you can find here: modern-graph.
The import.yaml file maps raw data fields to the schema of the test_graph graph created in Step 1. Below is an illustrative import.yaml. Note that each vertex/edge type needs at least one input for bulk loading.
In the following example, we import data into the new graph from the local files person.csv and person_knows_person.csv.
You can download the files from GitHub with the following commands:
wget https://raw.githubusercontent.com/alibaba/GraphScope/main/flex/interactive/examples/modern_graph/person.csv
wget https://raw.githubusercontent.com/alibaba/GraphScope/main/flex/interactive/examples/modern_graph/person_knows_person.csv
After downloading them, remember to replace @/path/to/person.csv and @/path/to/person_knows_person.csv in import.yaml with the actual paths to the files.
Note: The @ prefix means that the file is a local file and needs to be uploaded.
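Filling in the @/path/to/ placeholders can be scripted instead of edited by hand. The Python sketch below is one possible approach, under the assumption that the placeholder directory appears literally as @/path/to/ in the file; the function name resolve_inputs and the directory /tmp/data are hypothetical.

```python
from pathlib import Path

PLACEHOLDER_DIR = "@/path/to/"


def resolve_inputs(template: str, data_dir: Path) -> str:
    """Swap the placeholder directory for an absolute one, keeping the leading '@'."""
    absolute = data_dir.resolve().as_posix()
    return template.replace(PLACEHOLDER_DIR, f"@{absolute}/")


# One line of import.yaml, used here in place of the whole file.
line = '- "@/path/to/person.csv"'
print(resolve_inputs(line, Path("/tmp/data")))
```

To process a real file, read it with Path("import.yaml").read_text(), pass the text through resolve_inputs, and write the result back with write_text.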
vertex_mappings:
  - type_name: person
    inputs:
      - "@/path/to/person.csv"
    column_mappings:
      - column:
          index: 0
          name: id
        property: id
      - column:
          index: 1
          name: name
        property: name
      - column:
          index: 2
          name: age
        property: age
edge_mappings:
  - type_triplet:
      edge: knows
      source_vertex: person
      destination_vertex: person
    inputs:
      - "@/path/to/person_knows_person.csv"
    source_vertex_mappings:
      - column:
          index: 0
          name: person.id
        property: id
    destination_vertex_mappings:
      - column:
          index: 1
          name: person.id
        property: id
    column_mappings:
      - column:
          index: 2
          name: weight
        property: weight
Note: The provided yaml file above offers a basic configuration for data importing. For a comprehensive understanding of data import configurations, please consult the data import page.
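Since each column mapping pairs a column index with a column name, it can be worth confirming that a CSV header really has those names at those positions before binding the data source. The following Python sketch shows one way to do that; the helper name header_problems is hypothetical, and the sample text is an illustrative stand-in for person.csv rather than a copy read from disk.

```python
import csv
import io

# Column index -> expected header name, transcribed from the person mapping in import.yaml.
person_columns = {0: "id", 1: "name", 2: "age"}

# Illustrative stand-in for the first lines of person.csv (not read from disk).
sample_csv = "id|name|age\n1|marko|29\n2|vadas|27\n"


def header_problems(csv_text, expected, delimiter="|"):
    """Compare the first CSV row with the expected header names by position."""
    header = next(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    problems = []
    for index, name in sorted(expected.items()):
        if index >= len(header):
            problems.append(f"column {index} is missing (expected {name!r})")
        elif header[index] != name:
            problems.append(f"column {index} is {header[index]!r}, expected {name!r}")
    return problems


print(header_problems(sample_csv, person_columns))  # prints [] for the sample above
```

A wrong delimiter shows up the same way: the whole header is read as a single column, so every mapped index past the first is reported as missing.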
Now bind the data source to test_graph with the following command:
gsctl create datasource -f ./import.yaml -g test_graph
Create Data Loading Job
So far, we have only created the data source. A job configuration file, job_config.yaml, is also needed to specify the configuration for bulk loading.
loading_config:
  import_option: overwrite
  format:
    type: csv
    metadata:
      delimiter: "|"
      header_row: "true"
vertices:
  - type_name: person
edges:
  - type_name: knows
    source_vertex: person
    destination_vertex: person
Now create a bulk loading job with the following command:
gsctl create loaderjob -f ./job_config.yaml -g test_graph
A message like Create job xxx successfully will be printed.
Wait for the job to finish by checking its status with the following command:
gsctl desc job <job_id>
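Checking the status repeatedly by hand can be replaced with a small polling loop. The sketch below is a generic Python helper, not a gsctl feature. It assumes the status is available as a string from a caller-supplied function get_status (hypothetical; it could wrap gsctl desc job, whose output format is not shown here), and that SUCCESS, FAILED, and CANCELLED are the terminal status names, which is also an assumption.

```python
import time


def wait_for_job(get_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    terminal = {"SUCCESS", "FAILED", "CANCELLED"}  # assumed status names
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal:
            return status
        if waited >= timeout:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Demo with a fake status source instead of a real job.
fake_statuses = iter(["RUNNING", "RUNNING", "SUCCESS"])
final = wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final)  # prints SUCCESS
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.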
Step 3: Start the Service with the New Graph
Once gsctl desc job <job_id> reports a successful status, you can switch to the context of the test_graph graph:
gsctl use GRAPH test_graph
Step 4: A More Complicated Movies Graph (Optional)
The graph above is very simple: it contains only one vertex type and one edge type.
For a more complicated example, we use the movies graph. You can download the files from our GitHub repo.
wget https://interactive-release.oss-cn-hangzhou.aliyuncs.com/dataset/movies/movies.zip
Try using the following configuration files to create movie_graph!
movie_graph.yaml
name: movies
schema:
  vertex_types:
    - type_name: Movie
      properties:
        - property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_name: released
          property_type:
            primitive_type: DT_SIGNED_INT32
        - property_name: tagline
          property_type:
            string:
              long_text: ""
        - property_name: title
          property_type:
            string:
              long_text: ""
      primary_keys:
        - id
    - type_name: Person
      properties:
        - property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_name: born
          property_type:
            primitive_type: DT_SIGNED_INT32
        - property_name: name
          property_type:
            string:
              long_text: ""
      primary_keys:
        - id
    - type_name: User
      properties:
        - property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
        - property_name: born
          property_type:
            primitive_type: DT_SIGNED_INT32
        - property_name: name
          property_type:
            string:
              long_text: ""
      primary_keys:
        - id
  edge_types:
    - type_name: ACTED_IN
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
    - type_name: DIRECTED
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
    - type_name: REVIEW
      vertex_type_pair_relations:
        - source_vertex: User
          destination_vertex: Movie
          relation: MANY_TO_MANY
      properties:
        - property_name: rating
          property_type:
            primitive_type: DT_SIGNED_INT32
    - type_name: FOLLOWS
      vertex_type_pair_relations:
        - source_vertex: User
          destination_vertex: Person
          relation: MANY_TO_MANY
    - type_name: WROTE
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
    - type_name: PRODUCED
      vertex_type_pair_relations:
        - source_vertex: Person
          destination_vertex: Movie
          relation: MANY_TO_MANY
import.yaml
vertex_mappings:
  - type_name: Person # must align with the schema
    inputs:
      - "@/path/to/Person.csv"
  - type_name: Movie
    inputs:
      - "@/path/to/Movie.csv"
edge_mappings:
  - type_triplet:
      edge: ACTED_IN
      source_vertex: Person
      destination_vertex: Movie
    inputs:
      - "@/path/to/ACTED_IN.csv"
  - type_triplet:
      edge: DIRECTED
      source_vertex: Person
      destination_vertex: Movie
    inputs:
      - "@/path/to/DIRECTED.csv"
  - type_triplet:
      edge: FOLLOWS
      source_vertex: Person
      destination_vertex: Person
    inputs:
      - "@/path/to/FOLLOWS.csv"
  - type_triplet:
      edge: PRODUCED
      source_vertex: Person
      destination_vertex: Movie
    inputs:
      - "@/path/to/PRODUCED.csv"
  - type_triplet:
      edge: REVIEW
      source_vertex: Person
      destination_vertex: Movie
    column_mappings:
      - column:
          index: 3
          name: rating
        property: rating
    inputs:
      - "@/path/to/REVIEWED.csv"
  - type_triplet:
      edge: WROTE
      source_vertex: Person
      destination_vertex: Movie
    inputs:
      - "@/path/to/WROTE.csv"
job_config.yaml
loading_config:
  import_option: overwrite
  format:
    type: csv
    metadata:
      delimiter: "|"
      header_row: "true"
vertices:
  - type_name: Person
  - type_name: Movie
edges:
  - type_name: ACTED_IN
    source_vertex: Person
    destination_vertex: Movie
  - type_name: DIRECTED
    source_vertex: Person
    destination_vertex: Movie
  - type_name: FOLLOWS
    source_vertex: Person
    destination_vertex: Person
  - type_name: PRODUCED
    source_vertex: Person
    destination_vertex: Movie
  - type_name: REVIEW
    source_vertex: Person
    destination_vertex: Movie
  - type_name: WROTE
    source_vertex: Person
    destination_vertex: Movie
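Because import.yaml must align with the schema, it can help to cross-check the edge triplets in the two files before loading. The sketch below is illustrative Python and is not part of gsctl; the endpoint pairs are transcribed by hand from the movie_graph.yaml and import.yaml listings above, and the helper name undeclared_triplets is hypothetical.

```python
# Edge label -> set of (source, destination) pairs, transcribed from movie_graph.yaml.
schema_relations = {
    "ACTED_IN": {("Person", "Movie")},
    "DIRECTED": {("Person", "Movie")},
    "REVIEW": {("User", "Movie")},
    "FOLLOWS": {("User", "Person")},
    "WROTE": {("Person", "Movie")},
    "PRODUCED": {("Person", "Movie")},
}

# (edge, source, destination) triplets, transcribed from import.yaml.
import_triplets = [
    ("ACTED_IN", "Person", "Movie"),
    ("DIRECTED", "Person", "Movie"),
    ("FOLLOWS", "Person", "Person"),
    ("PRODUCED", "Person", "Movie"),
    ("REVIEW", "Person", "Movie"),
    ("WROTE", "Person", "Movie"),
]


def undeclared_triplets(relations, triplets):
    """Return the triplets whose endpoints are not declared for that edge label."""
    return [
        (edge, src, dst)
        for edge, src, dst in triplets
        if (src, dst) not in relations.get(edge, set())
    ]


mismatches = undeclared_triplets(schema_relations, import_triplets)
print(mismatches)
# prints [('FOLLOWS', 'Person', 'Person'), ('REVIEW', 'Person', 'Movie')]
```

For the listings as written, the check flags the FOLLOWS and REVIEW triplets, whose source vertex types differ between the two files, so they are worth reconciling before you submit the loading job.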
Try other graphs
In addition to the movies graph, we have also prepared the graph_algo graph. You can find the raw CSV files, graph.yaml, and import.yaml in the ./examples/graph_algo/ directory, and you can import the graph_algo graph in the same way as the movies graph. Some sample Cypher queries are also available at GraphScope/flex/interactive/examples/graph_algo.