* Update schema definition
* Add graph pruning component and tests (WIP)
* Cleaning
* Add pruner to SimpleKGPipeline
* Add test for relationship enforcement
* Change return model to have some stats about pruned objects
* We need to filter out relationships if start/end node is not valid in all cases (additional_relationship_types or not)
* Do not filter based on patterns if relationship type not in schema and additional_relationship_types is allowed
* Raise proper error type
* Ruff/mypy
* Add e2e test for graph pruning component
* Mypy
* Update changelog and doc
* Mypy
* ChatGPT was wrong
* Change edge case behaviour
* Fix doc
* Update doc
* Fix condition
* Remove incomplete comments
* More pruning stats
* Typo
* Remove default value for consistency
* Add a section to the doc
* Mypy checks
@@ -124,7 +124,8 @@ This schema information can be provided to the `SimpleKGBuilder` as demonstrated
     schema={
         "node_types": NODE_TYPES,
         "relationship_types": RELATIONSHIP_TYPES,
-        "patterns": PATTERNS
+        "patterns": PATTERNS,
+        "additional_node_types": False,
     },
     # ...
 )
@@ -145,7 +146,6 @@ They are also accessible via the `SimpleKGPipeline` interface.
     # ...
     prompt_template="",
     lexical_graph_config=my_config,
-    enforce_schema="STRICT"
     on_error="RAISE",
     # ...
 )
@@ -878,38 +878,6 @@ It can be used in this way:

 The LLM to use can be customized, the only constraint is that it obeys the :ref:`LLMInterface <llminterface>`.

-Schema Enforcement Behaviour
-----------------------------
-.. _schema-enforcement-behaviour:
-
-By default, even if a schema is provided to guide the LLM in the entity and relation extraction, the LLM response is not validated against that schema.
-This behaviour can be changed by using the `enforce_schema` flag in the `LLMEntityRelationExtractor` constructor:
-
-.. code:: python
-
-    from neo4j_graphrag.experimental.components.entity_relation_extractor import LLMEntityRelationExtractor
-    from neo4j_graphrag.experimental.components.types import SchemaEnforcementMode
-
-    extractor = LLMEntityRelationExtractor(
-        # ...
-        enforce_schema=SchemaEnforcementMode.STRICT,
-    )
-
-In this scenario, any extracted node/relation/property that is not part of the provided schema will be pruned.
-Any relation whose start node or end node does not conform to the provided tuple in `potential_schema` will be pruned.
-If a relation start/end nodes are valid but the direction is incorrect, the latter will be inverted.
-If a node is left with no properties, it will be also pruned.
-
-.. note::
-
-    If the input schema lacks a certain type of information, pruning is skipped.
-    For example, if an entity is defined only by a label and has no properties,
-    property pruning is not performed and all properties returned by the LLM are kept.
-
-
-.. warning::
-
-    Note that if the schema enforcement mode is on but the schema is not provided, no schema enforcement will be applied.

 Error Behaviour
 ---------------
@@ -1017,6 +985,64 @@ If more customization is needed, it is possible to subclass the `EntityRelationE
 See :ref:`entityrelationextractor`.


+Schema Guidance and Graph Filtering
+===================================
+
+The provided schema serves as a guiding structure for the language model during graph construction. However, it does not impose strict constraints on the model's output. As a result, the model may generate additional node labels, relationship types, or properties that are not explicitly defined in the schema.
+
+By default, all extracted elements, including nodes, relationships, and properties, are retained in the constructed graph. This behavior can be configured using the following schema options:
+(see :ref:`graphschema`)
+
+
+Configuration Options
+---------------------
+
+- **Required Properties**
+  Required properties may be specified at the node or relationship type level. Any extracted node or relationship missing one or more of its required properties will be pruned from the graph.
+
+- **Additional Properties** *(default: True)*
+  This node- or relationship-level option determines whether extra properties not listed in the schema should be retained.
+
+  - If set to ``True`` (default), all extracted properties are retained.
+  - If set to ``False``, only the properties defined in the schema are preserved; all others are removed.
+
+
+.. note:: Node pruning
+
+    If, after property pruning using the above rule, a node is left without any property, it is removed from the graph.
+
+
+- **Additional Node Types** *(default: True)*
+  This schema-level option specifies whether node types not defined in the schema are included in the graph.
+
+  - If set to ``True`` (default), such node types are retained.
+  - If set to ``False``, nodes with undefined types are removed.
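
The node-level rules in the added section can be illustrated with a small standalone sketch. It does not use or reproduce the library's pruning component: the plain-dictionary schema layout, the ``SCHEMA`` constant, and the ``prune_node`` helper are all hypothetical names chosen for this example. It also assumes that a node of an undefined type, when retained, is passed through unchanged, which the text above does not specify.

```python
from typing import Any, Dict, Optional

# Hypothetical schema layout for illustration only; this is NOT the
# library's schema model, just plain dictionaries mirroring the options.
SCHEMA: Dict[str, Any] = {
    "node_types": [
        {
            "label": "Person",
            "properties": [{"name": "name", "required": True}],
            "additional_properties": False,
        }
    ],
    "additional_node_types": False,
}


def prune_node(node: Dict[str, Any], schema: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the (possibly trimmed) node, or None if it should be pruned."""
    specs = {nt["label"]: nt for nt in schema.get("node_types", [])}
    spec = specs.get(node["label"])

    if spec is None:
        # Node type not defined in the schema: keep it only when
        # additional node types are allowed (default: True).
        # Assumption: such nodes are kept as-is, without property checks.
        return node if schema.get("additional_node_types", True) else None

    declared = {p["name"] for p in spec.get("properties", [])}
    required = {p["name"] for p in spec.get("properties", []) if p.get("required", False)}
    props = dict(node.get("properties", {}))

    # Required properties: a node missing any of them is pruned.
    if not required.issubset(props):
        return None

    # Additional properties (default: True): when disabled, keep only
    # the properties declared in the schema.
    if not spec.get("additional_properties", True):
        props = {k: v for k, v in props.items() if k in declared}

    # Node pruning: a node left without any property is removed.
    if not props:
        return None

    return {**node, "properties": props}


# The undeclared "age" property is dropped, the node itself is kept.
print(prune_node({"label": "Person", "properties": {"name": "Ann", "age": 30}}, SCHEMA))
# -> {'label': 'Person', 'properties': {'name': 'Ann'}}

# "City" is not in the schema and additional_node_types is False.
print(prune_node({"label": "City", "properties": {"name": "Paris"}}, SCHEMA))
# -> None
```

Setting ``"additional_node_types": True`` in the hypothetical ``SCHEMA`` would make the second call return the ``City`` node unchanged.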