Commit e9f26be

Josipmrden and matea16 authored
Add migration module with DuckDB (#1205)
* Add migration with DuckDB
* Clarify duckdb startup
* Merge
* Update pages/advanced-algorithms/available-algorithms/migrate.mdx
* Add migration from another Memgraph instance (#1206)
* Add migration from another Memgraph instance
* Update pages/advanced-algorithms/available-algorithms/migrate.mdx
* Add migration from ServiceNow (#1207)
* Add migration from servicenow
* Apply suggestions from code review
* add callout

---------

Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com>
Co-authored-by: matea16 <mateapesic@hotmail.com>
1 parent 614b004 commit e9f26be

pages/advanced-algorithms/available-algorithms/migrate.mdx

Lines changed: 160 additions & 0 deletions
@@ -6,6 +6,7 @@ description: Discover the migration capabilities of Memgraph for efficient trans
import { Cards } from 'nextra/components'
import GitHub from '/components/icons/GitHub'
import { Steps } from 'nextra/components'
import { Callout } from 'nextra/components';

# migrate

@@ -95,6 +96,122 @@ MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id})
CREATE (u1)-[:FRIENDS_WITH]->(u2);
```

### `duckdb()`
With the `migrate.duckdb()` procedure, you can connect to **DuckDB** and query the data sources it supports.
The list of supported data sources is available on the [official DuckDB documentation page](https://duckdb.org/docs/stable/data/data_sources.html).
The underlying implementation streams results from DuckDB to Memgraph using the `duckdb` Python library. DuckDB is started in in-memory mode, without any
persistence, and serves only as a proxy to the underlying data sources.

{<h4 className="custom-header"> Input: </h4>}

- `query: str` ➡ Table name or an SQL query.
- `setup_queries: mgp.Nullable[List[str]]` ➡ List of queries executed before the query given as the first argument. Used for setting up connections to additional data sources.

{<h4 className="custom-header"> Output: </h4>}

- `row: mgp.Map` ➡ The result table as a stream of rows.

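The streaming behavior described above can be sketched in Python. This is only an illustration, not the module's actual code: the helper name `stream_rows` and its `batch_size` parameter are hypothetical, and the standard-library `sqlite3` in-memory database stands in for DuckDB so the example runs without extra packages. It assumes a DuckDB connection could be used in the same way through a comparable cursor interface.

```python
import sqlite3
from typing import Any, Dict, Iterator, List, Optional


def stream_rows(
    connection: Any,
    query: str,
    setup_queries: Optional[List[str]] = None,
    batch_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Run the setup queries, then yield result rows one by one as dictionaries.

    Hypothetical helper: it mirrors the documented inputs (query, setup_queries)
    but is not the implementation used by migrate.duckdb().
    """
    cursor = connection.cursor()
    # Setup queries run first, for example to register credentials.
    for setup_query in setup_queries or []:
        cursor.execute(setup_query)
    cursor.execute(query)
    column_names = [column[0] for column in cursor.description]
    # Fetch in batches so the whole result set is never materialized at once.
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for values in batch:
            yield dict(zip(column_names, values))


# Stand-in for DuckDB: an in-memory SQLite database without persistence.
connection = sqlite3.connect(":memory:")
rows = list(
    stream_rows(
        connection,
        "SELECT id, name FROM users ORDER BY id",
        setup_queries=[
            "CREATE TABLE users (id INTEGER, name TEXT)",
            "INSERT INTO users VALUES (1, 'Ana'), (2, 'Ivan')",
        ],
        batch_size=1,
    )
)
print(rows)
```

Each yielded dictionary plays the role of one `row` map: column names become keys, which is why the Cypher examples can read fields such as `row.id` or `row.name`.
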
{<h4 className="custom-header"> Usage: </h4>}

#### Retrieve and inspect data
```cypher
CALL migrate.duckdb("SELECT * FROM 'test.parquet';")
YIELD row
RETURN row
LIMIT 5000;
```

#### Filter specific data
```cypher
CALL migrate.duckdb("SELECT * FROM 'test.parquet';")
YIELD row
WHERE row.age >= 30
RETURN row;
```

#### Create nodes from migrated data
```cypher
CALL migrate.duckdb("SELECT * FROM 'test.parquet';")
YIELD row
CREATE (u:User {id: row.id, name: row.name, age: row.age});
```

#### Create relationships between users
```cypher
CALL migrate.duckdb("SELECT * FROM 'test.parquet';")
YIELD row
MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id})
CREATE (u1)-[:FRIENDS_WITH]->(u2);
```

#### Set up a connection to query additional data sources
```cypher
CALL migrate.duckdb("SELECT * FROM 's3://your_bucket/your_file.parquet';", ["CREATE SECRET secret1 (TYPE s3, KEY_ID 'key', SECRET 'secret', REGION 'region');"])
YIELD row
MATCH (u1:User {id: row.user1_id}), (u2:User {id: row.user2_id})
CREATE (u1)-[:FRIENDS_WITH]->(u2);
```

---

### `memgraph()`

With the `migrate.memgraph()` procedure, you can access another Memgraph instance and migrate its data to a new Memgraph instance.
The resulting nodes and edges are converted into a stream of rows, which can include labels, properties, and primitives.

<Callout type="info">
Streaming of raw node and relationship objects is not supported. Migrate all the identifiers needed to recreate the same graph in Memgraph.
</Callout>

{<h4 className="custom-header"> Input: </h4>}

- `label_or_rel_or_query: str` ➡ Label name (written in the format `(:Label)`), relationship type (written in the format `[:REL_TYPE]`), or a plain Cypher query.
- `config: mgp.Map` ➡ Connection parameters (as in `gqlalchemy.Memgraph`). Notable parameters are `host` (string) and `port` (integer).
- `config_path` ➡ Path to a JSON file containing configuration parameters.
- `params: mgp.Nullable[mgp.Any] (default=None)` ➡ Query parameters (if applicable).

{<h4 className="custom-header"> Output: </h4>}

- `row: mgp.Map` ➡ The result table as a stream of rows.
  - When retrieving nodes using the `(:Label)` syntax, each row has the keys `labels` and `properties`.
  - When retrieving relationships using the `[:REL_TYPE]` syntax, each row has the keys `from_labels`, `to_labels`, `from_properties`, `to_properties`, and `edge_properties`.
  - When retrieving results using a plain Cypher query, each row has keys identical to the column names returned by the query.

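The three accepted forms of the first argument, and the row keys each one produces, can be summarized in a short Python sketch. The function names `classify_source` and `expected_row_keys` and the regular expressions are hypothetical, written only to illustrate the rules listed above. The procedure's real parsing logic may differ.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical patterns for the two shorthand forms: (:Label) and [:REL_TYPE].
NODE_PATTERN = re.compile(r"^\(\s*:\s*(\w+)\s*\)$")
REL_PATTERN = re.compile(r"^\[\s*:\s*(\w+)\s*\]$")


def classify_source(label_or_rel_or_query: str) -> Tuple[str, str]:
    """Return (kind, value), where kind is 'label', 'rel_type', or 'query'."""
    text = label_or_rel_or_query.strip()
    node_match = NODE_PATTERN.match(text)
    if node_match:
        return "label", node_match.group(1)
    rel_match = REL_PATTERN.match(text)
    if rel_match:
        return "rel_type", rel_match.group(1)
    # Anything else is treated as a plain Cypher query.
    return "query", text


def expected_row_keys(kind: str) -> Optional[List[str]]:
    """Row keys documented for each kind; None when they depend on the query."""
    if kind == "label":
        return ["labels", "properties"]
    if kind == "rel_type":
        return [
            "from_labels",
            "to_labels",
            "from_properties",
            "to_properties",
            "edge_properties",
        ]
    return None


for argument in ["(:Person)", "[:KNOWS]", "MATCH (n) RETURN count(n) AS cnt"]:
    kind, value = classify_source(argument)
    print(kind, value, expected_row_keys(kind))
```

For a plain Cypher query the sketch returns `None` for the keys, because they follow the column names of the query's `RETURN` clause.
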
{<h4 className="custom-header"> Usage: </h4>}

#### Retrieve nodes with a certain label and create them in a new Memgraph instance
```cypher
CALL migrate.memgraph('(:Person)', {host: 'localhost', port: 7687})
YIELD row
WITH row.labels AS labels, row.properties AS props
CREATE (n:labels) SET n += props;
```

#### Retrieve relationships of a certain type and create them in a new Memgraph instance
```cypher
CALL migrate.memgraph('[:KNOWS]', {host: 'localhost', port: 7687})
YIELD row
WITH row.from_labels AS from_labels,
    row.to_labels AS to_labels,
    row.from_properties AS from_properties,
    row.to_properties AS to_properties,
    row.edge_properties AS edge_properties
MATCH (p1:Person {id: from_properties.id})
MATCH (p2:Person {id: to_properties.id})
CREATE (p1)-[r:KNOWS]->(p2)
SET r += edge_properties;
```

#### Retrieve information from Memgraph using an arbitrary Cypher query
```cypher
CALL migrate.memgraph('MATCH (n) RETURN count(n) AS cnt', {host: 'localhost', port: 7687})
YIELD row
RETURN row.cnt AS cnt;
```

---

### `mysql()`

With the `migrate.mysql()` procedure, you can access MySQL and migrate your data to Memgraph.
@@ -394,3 +511,46 @@ CALL migrate.s3('s3://my-bucket/employees.csv', {aws_access_key_id: 'your-key',
YIELD row
CREATE (e:Employee {id: row.id, name: row.name, position: row.position});
```

---

### `servicenow()`

With the `migrate.servicenow()` procedure, you can access the [ServiceNow REST API](https://developer.servicenow.com/dev.do#!/reference/api/xanadu/rest/) and transfer your data to Memgraph.
The underlying implementation uses the `requests` Python library to migrate results to Memgraph. The ServiceNow REST API
must return results in the format `{results: []}` for Memgraph to stream them as result rows.

{<h4 className="custom-header"> Input: </h4>}

- `endpoint: str` ➡ ServiceNow endpoint. You can optionally include your own query parameters to filter the results.
- `config: mgp.Map` ➡ Connection parameters. Notable connection parameters are `username` and `password`, per the `requests.get()` method.
- `config_path: str` ➡ Path to a JSON file containing configuration parameters.

{<h4 className="custom-header"> Output: </h4>}

- `row: mgp.Map` ➡ Each record from the ServiceNow response as a structured dictionary.

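To make the expected response shape concrete, the following Python sketch turns a JSON body of the form `{results: []}` into a stream of rows. The helper `rows_from_response`, its `key` parameter, and the sample payload are hypothetical. The real procedure fetches the body over HTTP with `requests`, which is left out here so the example runs with the standard library only.

```python
import json
from typing import Any, Dict, Iterator


def rows_from_response(body: str, key: str = "results") -> Iterator[Dict[str, Any]]:
    """Yield one dictionary per record found under `key` in a JSON response body.

    Hypothetical helper: the key name follows the format stated in the text above.
    """
    payload = json.loads(body)
    if not isinstance(payload, dict) or not isinstance(payload.get(key), list):
        raise ValueError(f"expected a JSON object with a '{key}' list")
    for record in payload[key]:
        yield record


# Made-up sample body; a real one would come from the ServiceNow endpoint.
sample_body = json.dumps(
    {
        "results": [
            {"id": 1, "name": "Ana", "position": "Engineer"},
            {"id": 2, "name": "Ivan", "position": "Analyst"},
        ]
    }
)
rows = list(rows_from_response(sample_body))
print(rows)
```

A response whose top-level object lacks the list raises `ValueError` in this sketch, which reflects the requirement that the endpoint return its records in that wrapper.
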
{<h4 className="custom-header"> Usage: </h4>}

#### Retrieve and inspect data from ServiceNow
```cypher
CALL migrate.servicenow('http://my_endpoint/api/data', {})
YIELD row
RETURN row
LIMIT 100;
```

#### Filter specific rows
```cypher
CALL migrate.servicenow('http://my_endpoint/api/data', {})
YIELD row
WHERE row.age >= 30
RETURN row;
```

#### Create nodes dynamically from ServiceNow data
```cypher
CALL migrate.servicenow('http://my_endpoint/api/data', {})
YIELD row
CREATE (e:Employee {id: row.id, name: row.name, position: row.position});
```
