Add optional schema enforcement for KG builder as a validation layer after entity and relation extraction (neo4j#296)
* Add schema enforcement modes and strict mode behaviour
* Add unit tests for schema enforcement modes
* Update change log and docs
* Fix documentation
* Add warning when schema enforcement is on but schema not provided
* Code cleanups
* Improve code for more clarity
* Apply changes requested by the PR review
* Invert rel direction
* Adapt SimpleKGPipelineConfig
These parameters are part of the `EntityAndRelationExtractor` component.
For detailed information, refer to the section on :ref:`Entity and Relation Extractor`.
@@ -138,6 +138,7 @@ They are also accessible via the `SimpleKGPipeline` interface.
     # ...
     prompt_template="",
     lexical_graph_config=my_config,
+    enforce_schema="STRICT",
     on_error="RAISE",
     # ...
 )
@@ -829,6 +830,30 @@ It can be used in this way:

 The LLM to use can be customized, the only constraint is that it obeys the :ref:`LLMInterface <llminterface>`.

+Schema Enforcement Behaviour
+----------------------------
+By default, even if a schema is provided to guide the LLM in the entity and relation extraction, the LLM response is not validated against that schema.
+This behaviour can be changed by using the `enforce_schema` flag in the `LLMEntityRelationExtractor` constructor:
+
+.. code:: python
+
+    from neo4j_graphrag.experimental.components.entity_relation_extractor import LLMEntityRelationExtractor
+    from neo4j_graphrag.experimental.components.types import SchemaEnforcementMode
+
+    extractor = LLMEntityRelationExtractor(
+        # ...
+        enforce_schema=SchemaEnforcementMode.STRICT,
+    )
+
+In this scenario, any extracted node/relation/property that is not part of the provided schema will be pruned.
+Any relation whose start node or end node does not conform to the provided tuple in `potential_schema` will be pruned.
+If a relation's start and end nodes are valid but its direction is incorrect, the direction will be inverted.
+If a node is left with no properties, it will also be pruned.
+
+.. warning::
+
+    If the schema enforcement mode is on but no schema is provided, no schema enforcement is applied.
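
The pruning rules described above can be illustrated with a small standalone sketch. This is not the `neo4j_graphrag` implementation: the `prune_to_schema` function, the dict-based node and relationship shapes, and the simplified `SCHEMA` layout are all hypothetical names chosen for illustration. It also assumes that a relation is dropped when one of its endpoint nodes has been pruned, which the text above does not state explicitly.

```python
from typing import Any

# Hypothetical, simplified schema layout (NOT the library's SchemaConfig):
# label/type -> allowed property names, plus the allowed (start, type, end) triples.
SCHEMA: dict[str, Any] = {
    "entities": {"Person": {"name"}, "Company": {"name"}},
    "relations": {"WORKS_AT": {"since"}},
    "potential_schema": {("Person", "WORKS_AT", "Company")},
}


def prune_to_schema(nodes, relationships, schema):
    """Apply the strict-mode rules from the text to plain dicts (illustrative only)."""
    kept_nodes = {}
    for node in nodes:
        allowed = schema["entities"].get(node["label"])
        if allowed is None:
            continue  # label is not part of the schema
        props = {k: v for k, v in node["properties"].items() if k in allowed}
        if not props:
            continue  # a node left with no properties is pruned too
        kept_nodes[node["id"]] = {**node, "properties": props}

    kept_rels = []
    for rel in relationships:
        allowed = schema["relations"].get(rel["type"])
        start = kept_nodes.get(rel["start"])
        end = kept_nodes.get(rel["end"])
        # Assumption: a relation pointing at a pruned node is dropped as well.
        if allowed is None or start is None or end is None:
            continue
        start_id, end_id = rel["start"], rel["end"]
        if (start["label"], rel["type"], end["label"]) not in schema["potential_schema"]:
            if (end["label"], rel["type"], start["label"]) in schema["potential_schema"]:
                start_id, end_id = end_id, start_id  # valid nodes, wrong direction: invert
            else:
                continue  # endpoints do not conform to any allowed triple
        props = {k: v for k, v in rel["properties"].items() if k in allowed}
        kept_rels.append(
            {"start": start_id, "end": end_id, "type": rel["type"], "properties": props}
        )
    return list(kept_nodes.values()), kept_rels


nodes = [
    {"id": "1", "label": "Person", "properties": {"name": "Alice", "age": 30}},
    {"id": "2", "label": "Company", "properties": {"name": "Acme"}},
    {"id": "3", "label": "City", "properties": {"name": "Paris"}},   # unknown label
    {"id": "4", "label": "Person", "properties": {"nickname": "B"}},  # no valid property
]
relationships = [
    {"start": "2", "end": "1", "type": "WORKS_AT", "properties": {}},  # reversed direction
    {"start": "1", "end": "3", "type": "LIVES_IN", "properties": {}},  # unknown type
]

pruned_nodes, pruned_rels = prune_to_schema(nodes, relationships, SCHEMA)
```

In this example, nodes `3` and `4` are pruned, the `age` property is removed from node `1`, the `LIVES_IN` relation is dropped, and the `WORKS_AT` relation is inverted so that it runs from the `Person` node to the `Company` node.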
     chunks (TextChunks): List of text chunks to extract entities and relations from.
     document_info (Optional[DocumentInfo], optional): Document the chunks are coming from. Used in the lexical graph creation step.
     lexical_graph_config (Optional[LexicalGraphConfig], optional): Lexical graph configuration to customize node labels and relationship types in the lexical graph.
-    schema (SchemaConfig | None): Definition of the schema to guide the LLM in its extraction. Caution: at the moment, there is no guarantee that the extracted entities and relations will strictly obey the schema.
+    schema (SchemaConfig | None): Definition of the schema to guide the LLM in its extraction.
     examples (str): Examples for few-shot learning in the prompt.