This repository was archived by the owner on Sep 18, 2023. It is now read-only.

[master < E-meta] Add MAGE meta module docs #1017

Open
wants to merge 5 commits into
base: master
Changes from 3 commits
199 changes: 199 additions & 0 deletions mage/query-modules/cpp/meta.md
@@ -0,0 +1,199 @@
---
id: meta
title: meta
sidebar_label: meta
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import RunOnSubgraph from '../../templates/_run_on_subgraph.mdx';

export const Highlight = ({children, color}) => (
<span
style={{
backgroundColor: color,
borderRadius: '2px',
color: '#fff',
padding: '0.2rem',
}}>
{children}
</span>
);


The **meta** module provides a set of procedures for generating metadata about the database.

[![docs-source](https://img.shields.io/badge/source-meta_module-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/mage/tree/main/cpp/meta_module)

| Trait | Value |
| ------------------- | ----------------------------------------------------- |
| **Module type** | <Highlight color="#FB6E00">**algorithm**</Highlight> |
| **Implementation** | <Highlight color="#FB6E00">**C++**</Highlight> |
| **Parallelism** | <Highlight color="#FB6E00">**sequential**</Highlight> |

### Procedures

## stats
Collaborator

Suggested change
## stats
## `stats()`

If it's a procedure?

Contributor Author

@imilinovic imilinovic Aug 31, 2023

Stats is not a procedure but it made sense to me to write about it since stats_online and stats_offline do the same thing (described in stats) so I wrote it as some sort of summary


The stats procedure returns the following metadata about the graph:
- `labelCount` ➡ number of unique labels in nodes
- `relationshipTypeCount` ➡ number of unique relationship types (labels)
- `nodeCount` ➡ number of nodes in the graph
- `relationshipCount` ➡ number of relationships in the graph
- `labels` ➡ map with the following (key, value) pairs:
  - `label` : number_of_occurrences
- `relationshipTypes` ➡ map with the following (key, value) pairs:
  - `(:label)-[:relationship_type]->()` : number_of_occurrences
  - `()-[:relationship_type]->(:label)` : number_of_occurrences
  - `()-[:relationship_type]->()` : number_of_occurrences
- `relationshipTypesCount` ➡ map with the following (key, value) pairs:
  - `relationship_type` : number_of_occurrences
- `stats` ➡ map which contains all of the above

It is split into two versions, which return the same metadata:
- stats_online - works in **O(1)** and requires setting up a trigger
- stats_offline - traverses the whole graph
Collaborator

Suggested change
- stats_online - works in **O(1)** and requires setting up a trigger
- stats_offline - traverses the whole graph
- `stats_online` - works in **O(1)** and requires setting up a trigger
- `stats_offline` - traverses the whole graph

Collaborator

Are you listing the same outputs three times, then?
I don't think I would do that... I would just have stats online and offline, and then this knowledge

  • stats_online - works in O(1) and requires setting up a trigger
  • stats_offline - traverses the whole graph

can be in the intro, or each in its own section.

It's confusing like this + different from all the other mage pages


### `stats_online(update_stats)`

Retrieves the graph metadata in **O(1)** complexity. Requires setting up the following trigger:

```cypher
CREATE TRIGGER meta_trigger BEFORE COMMIT EXECUTE CALL meta.update(createdObjects, deletedObjects, removedVertexProperties, removedEdgeProperties, setVertexLabels, removedVertexLabels);
```
This procedure only tracks data created, deleted, or modified after the trigger was added. To get metadata about the whole graph, run the `stats_online` procedure **once** with the `update_stats` flag set to `true`. That flag makes the procedure traverse the whole graph and update the stored metadata. After that, run it with `update_stats` set to `false` and the procedure returns the metadata in **O(1)** complexity.
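
The bookkeeping described above can be pictured with a small, self-contained Python model. It is only a sketch under assumptions: the class `OnlineStats` and its method names are hypothetical and are not part of the meta module, whose C++ implementation may work differently. The model keeps counters that are adjusted on every change (the role of the trigger), offers a one-time full rebuild (the role of `update_stats` set to true), and answers reads from the stored counters without touching the graph.

```python
from collections import Counter


class OnlineStats:
    """Hypothetical toy model of trigger-maintained graph counters."""

    def __init__(self):
        self._reset()

    def _reset(self):
        self.node_count = 0
        self.relationship_count = 0
        self.labels = Counter()
        self.relationship_types_count = Counter()

    def on_node_created(self, node_labels):
        # Called once per created node, like a trigger firing on commit.
        self.node_count += 1
        self.labels.update(node_labels)

    def on_relationship_created(self, rel_type):
        self.relationship_count += 1
        self.relationship_types_count[rel_type] += 1

    def rebuild(self, nodes, relationships):
        # Full traversal: the analogue of running with update_stats=true once.
        self._reset()
        for node_labels in nodes:
            self.on_node_created(node_labels)
        for rel_type in relationships:
            self.on_relationship_created(rel_type)

    def read(self):
        # No traversal here: only the stored counters are returned.
        return {
            "nodeCount": self.node_count,
            "relationshipCount": self.relationship_count,
            "labels": dict(self.labels),
            "relationshipTypesCount": dict(self.relationship_types_count),
        }


# Data that existed before the tracker (the "trigger") was set up.
existing_nodes = [["Node"], ["Node"]]
existing_relationships = ["Relation1"]

tracker = OnlineStats()
tracker.on_node_created(["Node"])      # a change made after setup
existing_nodes.append(["Node"])
before_rebuild = tracker.read()        # sees only the tracked change

tracker.rebuild(existing_nodes, existing_relationships)
after_rebuild = tracker.read()         # now reflects the whole graph

tracker.on_relationship_created("Relation2")  # later changes stay cheap
```

In this model, `before_rebuild` undercounts because the two older nodes were never observed, which mirrors why the one-time full update is needed before the stored metadata can be trusted.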


#### Input:

- `update_stats: bool (default=false)` ➡ if `true`, traverses the whole graph to update the metadata; otherwise, returns the stored metadata

#### Output:

- `labelCount: int` ➡ number of unique labels in nodes
- `relationshipTypeCount: int` ➡ number of unique relationship types (labels)
- `nodeCount: int` ➡ number of nodes in the graph
- `relationshipCount: int` ➡ number of relationships in the graph
- `labels: Map[string: int]` ➡ map with the following (key, value) pairs:
  - `label` : number_of_occurrences
- `relationshipTypes: Map[string: int]` ➡ map with the following (key, value) pairs:
  - `(:label)-[:relationship_type]->()` : number_of_occurrences
  - `()-[:relationship_type]->(:label)` : number_of_occurrences
  - `()-[:relationship_type]->()` : number_of_occurrences
- `relationshipTypesCount: Map[string: int]` ➡ map with the following (key, value) pairs:
  - `relationship_type` : number_of_occurrences
- `stats` ➡ map which contains all of the above

#### Usage:

Running stats on the following graph:
```cypher
MERGE (a:Node {id: 0}) MERGE (b:Node {id: 1}) CREATE (a)-[:Relation1]->(b);
MERGE (a:Node {id: 1}) MERGE (b:Node {id: 2}) CREATE (a)-[:Relation1]->(b);
MERGE (a:Node {id: 2}) MERGE (b:Node {id: 0}) CREATE (a)-[:Relation1]->(b);
MERGE (a:Node {id: 3}) MERGE (b:Node {id: 3}) CREATE (a)-[:Relation2]->(b);
MERGE (a:Node {id: 3}) MERGE (b:Node {id: 4}) CREATE (a)-[:Relation2]->(b);
MERGE (a:Node {id: 3}) MERGE (b:Node {id: 5}) CREATE (a)-[:Relation2]->(b);
```

```cypher
CALL meta.stats_online() YIELD stats;
```

```plaintext
+-------------------------------------------------------+
| stats |
+-------------------------------------------------------+
| |
|{ |
| "labelCount": 1, |
| "labels": { |
| "Node": 6 |
| }, |
| "nodeCount": 6, |
| "propertyKeyCount": 1, |
| "relationshipCount": 6, |
| "relationshipTypeCount": 2, |
| "relationshipTypes": { |
| "()-[:Relation1]->()": 3, |
| "()-[:Relation1]->(:Node)": 3, |
| "()-[:Relation2]->()": 3, |
| "()-[:Relation2]->(:Node)": 3, |
| "(:Node)-[:Relation1]->()": 3, |
| "(:Node)-[:Relation2]->()": 3 |
| }, |
| "relationshipTypesCount": { |
| "Relation1": 3, |
| "Relation2": 3 |
| } |
|} |
| |
+-------------------------------------------------------+
```

### `stats_offline()`

Retrieves the graph metadata by traversing the whole graph. *stats_online* should be preferred because of the better complexity unless you don't want to use triggers.
Collaborator

Suggested change
Retrieves the graph metadata by traversing the whole graph. *stats_online* should be preferred because of the better complexity unless you don't want to use triggers.
Retrieves the graph metadata by traversing the whole graph. `stats_online` should be preferred because of the better complexity unless you don't want to use triggers.


#### Output:

- `labelCount: int` ➡ number of unique labels in nodes
- `relationshipTypeCount: int` ➡ number of unique relationship types (labels)
- `nodeCount: int` ➡ number of nodes in the graph
- `relationshipCount: int` ➡ number of relationships in the graph
- `labels: Map[string: int]` ➡ map with the following (key, value) pairs:
  - `label` : number_of_occurrences
- `relationshipTypes: Map[string: int]` ➡ map with the following (key, value) pairs:
  - `(:label)-[:relationship_type]->()` : number_of_occurrences
  - `()-[:relationship_type]->(:label)` : number_of_occurrences
  - `()-[:relationship_type]->()` : number_of_occurrences
- `relationshipTypesCount: Map[string: int]` ➡ map with the following (key, value) pairs:
  - `relationship_type` : number_of_occurrences
- `stats` ➡ map which contains all of the above
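
To make the meaning of these output fields concrete, here is a minimal Python sketch of a full-graph traversal that builds the same maps. It is an illustration, not the module's C++ implementation: the function `offline_stats` and the in-memory graph representation are hypothetical, and the assumption that each label of a start or end node contributes its own pattern key is mine. The input is the example graph from the Usage sections on this page.

```python
from collections import Counter


def offline_stats(nodes, relationships):
    """Hypothetical helper.

    nodes: dict mapping node id -> list of labels
    relationships: list of (start_id, relationship_type, end_id) tuples
    """
    labels = Counter()
    for node_labels in nodes.values():
        labels.update(node_labels)

    relationship_types = Counter()
    relationship_types_count = Counter()
    for start, rel_type, end in relationships:
        relationship_types_count[rel_type] += 1
        relationship_types[f"()-[:{rel_type}]->()"] += 1
        # Assumption: one pattern key per label on the start and end node.
        for label in nodes[start]:
            relationship_types[f"(:{label})-[:{rel_type}]->()"] += 1
        for label in nodes[end]:
            relationship_types[f"()-[:{rel_type}]->(:{label})"] += 1

    return {
        "labelCount": len(labels),
        "relationshipTypeCount": len(relationship_types_count),
        "nodeCount": len(nodes),
        "relationshipCount": len(relationships),
        "labels": dict(labels),
        "relationshipTypes": dict(relationship_types),
        "relationshipTypesCount": dict(relationship_types_count),
    }


# Six :Node nodes (ids 0 to 5) and six relationships, as in the Usage example.
nodes = {i: ["Node"] for i in range(6)}
relationships = [
    (0, "Relation1", 1), (1, "Relation1", 2), (2, "Relation1", 0),
    (3, "Relation2", 3), (3, "Relation2", 4), (3, "Relation2", 5),
]
stats = offline_stats(nodes, relationships)
```

Every node and relationship is visited once, so the cost grows with the size of the graph, which is the trade-off against the trigger-based variant.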

#### Usage:

Running stats on the following graph:
```cypher
MERGE (a:Node {id: 0}) MERGE (b:Node {id: 1}) CREATE (a)-[:Relation1]->(b);
MERGE (a:Node {id: 1}) MERGE (b:Node {id: 2}) CREATE (a)-[:Relation1]->(b);
MERGE (a:Node {id: 2}) MERGE (b:Node {id: 0}) CREATE (a)-[:Relation1]->(b);
MERGE (a:Node {id: 3}) MERGE (b:Node {id: 3}) CREATE (a)-[:Relation2]->(b);
MERGE (a:Node {id: 3}) MERGE (b:Node {id: 4}) CREATE (a)-[:Relation2]->(b);
MERGE (a:Node {id: 3}) MERGE (b:Node {id: 5}) CREATE (a)-[:Relation2]->(b);
```

```cypher
CALL meta.stats_offline() YIELD stats;
```

```plaintext
+-------------------------------------------------------+
| stats |
+-------------------------------------------------------+
| |
|{ |
| "labelCount": 1, |
| "labels": { |
| "Node": 6 |
| }, |
| "nodeCount": 6, |
| "propertyKeyCount": 1, |
| "relationshipCount": 6, |
| "relationshipTypeCount": 2, |
| "relationshipTypes": { |
| "()-[:Relation1]->()": 3, |
| "()-[:Relation1]->(:Node)": 3, |
| "()-[:Relation2]->()": 3, |
| "()-[:Relation2]->(:Node)": 3, |
| "(:Node)-[:Relation1]->()": 3, |
| "(:Node)-[:Relation2]->()": 3 |
| }, |
| "relationshipTypesCount": { |
| "Relation1": 3, |
| "Relation2": 3 |
| } |
|} |
| |
+-------------------------------------------------------+
```
1 change: 1 addition & 0 deletions mage/templates/_mage_spells.mdx
@@ -60,6 +60,7 @@
| [import_util](/mage/query-modules/python/import-util) | Python | A module for importing data from different formats (JSON). |
| [json_util](/mage/query-modules/python/json-util) | Python | A module for loading JSON from a local file or remote address. |
| [llm_util](/mage/query-modules/python/llm-util) | Python | A module that contains procedures describing graphs in a format best suited for large language models (LLMs). |
| [meta](/mage/query-modules/cpp/meta) | C++ | A module that contains procedures describing graphs on a meta-level. |
| [meta_util](/mage/query-modules/python/meta-util) | Python | A module that contains procedures describing graphs on a meta-level. |
| [migrate](/mage/query-modules/python/migrate) | Python | A module that can access data from a MySQL, SQL Server or Oracle database. |
| [periodic](/mage/query-modules/cpp/periodic) | C++ | A module containing procedures for periodically running difficult and/or memory/time consuming queries. |
1 change: 1 addition & 0 deletions sidebars/sidebarsMAGE.js
@@ -52,6 +52,7 @@ module.exports = {
"query-modules/python/llm-util",
"query-modules/cpp/map",
"query-modules/python/max-flow",
"query-modules/cpp/meta",
"query-modules/python/meta-util",
"query-modules/python/migrate",
"query-modules/python/node-classification-with-gnn",