Add migration module with Arrow Flight #1204

Merged
merged 8 commits into from
Apr 18, 2025
Changes from 5 commits
60 changes: 60 additions & 0 deletions pages/advanced-algorithms/available-algorithms/migrate.mdx
@@ -35,6 +35,66 @@ filter, and convert relational data into a graph format.

## Procedures

### `arrow_flight()`

With the `arrow_flight()` procedure, users can access data sources that support the **Arrow Flight RPC protocol**, which is
designed for high-performance transfer of large data records. The underlying implementation uses the `pyarrow` Python library
to stream rows to Memgraph. **Known sources, based on our previous experience, include Dremio, among others**.
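
To make the streaming step more concrete, here is a minimal Python sketch of how rows might be pulled from a Flight endpoint with `pyarrow`. It is not the module's actual source: the helper names `rows_from_columns` and `stream_rows` are hypothetical, basic username/password authentication is assumed, and the `grpc+tcp` scheme is an assumption about the server setup.

```python
from typing import Any, Dict, Iterator, List


def rows_from_columns(columns: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Turn a columnar batch ({column: [values]}) into one dict per row."""
    names = list(columns)
    for values in zip(*(columns[name] for name in names)):
        yield dict(zip(names, values))


def stream_rows(query: str, host: str, port: str,
                username: str, password: str) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: run `query` on a Flight server and yield rows lazily."""
    # Imported here so the pure helper above works without pyarrow installed.
    from pyarrow import flight

    # Assumption: plain gRPC over TCP; a TLS-enabled server would use grpc+tls.
    client = flight.FlightClient(f"grpc+tcp://{host}:{port}")
    # Basic auth returns a header pair that is attached to subsequent calls.
    token = client.authenticate_basic_token(username, password)
    options = flight.FlightCallOptions(headers=[token])

    descriptor = flight.FlightDescriptor.for_command(query)
    info = client.get_flight_info(descriptor, options)
    for endpoint in info.endpoints:
        reader = client.do_get(endpoint.ticket, options)
        for chunk in reader:  # each chunk carries at most one Arrow RecordBatch
            if chunk.data is not None:
                yield from rows_from_columns(chunk.data.to_pydict())
```

Because both helpers are generators, a consumer only holds one record batch in memory at a time, which matches the idea of returning the result table as a stream of rows.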

{<h4 className="custom-header"> Input: </h4>}

- `query: str` ➡ Query to run against the data source.
- `config: mgp.Map` ➡ Connection parameters (as in `pyarrow.flight.connect`).
  - Useful connection parameters are `host`, `port`, `username`, and `password`.
- `config_path` ➡ Path to a JSON file containing configuration parameters.
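
As an illustration of the `config_path` option, the following Python sketch writes a JSON configuration file. It assumes the file holds the same keys as the `config` map (`host`, `port`, `username`, `password`); the helper name `write_flight_config` and the file name are hypothetical, and the file would need to be readable by the Memgraph process.

```python
import json
import tempfile
from pathlib import Path


def write_flight_config(path: Path, host: str, port: str,
                        username: str, password: str) -> Path:
    """Write connection parameters to a JSON file (hypothetical helper)."""
    # Assumption: the JSON keys mirror the keys accepted in the `config` map.
    config = {"host": host, "port": port, "username": username, "password": password}
    path.write_text(json.dumps(config, indent=2))
    return path


config_file = write_flight_config(
    Path(tempfile.gettempdir()) / "arrow_flight_config.json",
    host="localhost", port="12345", username="memgraph", password="password",
)
```

Keeping credentials in a file rather than in the query text means they do not need to appear in each Cypher call, but the file should then be protected with appropriate permissions.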

{<h4 className="custom-header"> Output: </h4>}

- `row: mgp.Map` ➡ The result table as a stream of rows.

#### Retrieve and inspect data
```cypher
CALL migrate.arrow_flight(
  'SELECT * FROM users',
  {username: 'memgraph', password: 'password', host: 'localhost', port: '12345'}
)
YIELD row
RETURN row
LIMIT 5000;
```

#### Filter specific data
```cypher
CALL migrate.arrow_flight(
  'SELECT * FROM users',
  {username: 'memgraph', password: 'password', host: 'localhost', port: '12345'}
)
YIELD row
WHERE row.age >= 30
RETURN row;
```

#### Create nodes from migrated data
```cypher
CALL migrate.arrow_flight(
  'SELECT id, name, age FROM users',
  {username: 'memgraph', password: 'password', host: 'localhost', port: '12345'}
)
YIELD row
CREATE (u:User {id: row.id, name: row.name, age: row.age});
```

#### Create relationships between users
```cypher
CALL migrate.arrow_flight(
  'SELECT user1_id, user2_id FROM friendships',
  {username: 'memgraph', password: 'password', host: 'localhost', port: '12345'}
)
YIELD row
MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id})
CREATE (u1)-[:FRIENDS_WITH]->(u2);
```

### `mysql()`

With the `migrate.mysql()` procedure, you can access MySQL and migrate your data to Memgraph.