.. _user-guide-kg-builder:
User Guide: Knowledge Graph Builder
- ########################################
+ ###################################
This page provides information about how to create a Knowledge Graph from
@@ -14,9 +14,9 @@ unstructured data.
It is not recommended to use it in production yet.
- ******************************
+ ******************
Pipeline structure
- ******************************
+ ******************
A Knowledge Graph (KG) construction pipeline requires a few components:
@@ -36,9 +36,9 @@ This package contains the interface and implementations for each of these compon
To see an end-to-end example of a Knowledge Graph construction pipeline,
refer to `this example <https://github.com/neo4j/neo4j-genai-python/blob/main/examples/pipeline/kg_builder.py>`_.
- ***************************************
+ **********************************
Knowledge Graph Builder Components
- ***************************************
+ **********************************
Below is a list of the different components available in this package, and how to use them.
@@ -66,7 +66,7 @@ They can also be used within a pipeline:
Document Parser
- ========================
+ ===============
Document parsers start from a file path and return the text extracted from this file.
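As a rough illustration of the "file path in, text out" contract described above, here is a minimal sketch. The class name `PlainTextParser`, its `run` method, and the `demo_parse` helper are hypothetical and are not part of this package; the real parsers in the package may expose a different interface.

```python
import tempfile
from pathlib import Path


class PlainTextParser:
    """Hypothetical parser (not the package's API): file path in, text out."""

    def run(self, filepath: str) -> str:
        # Read the whole file and return its text content.
        return Path(filepath).read_text(encoding="utf-8")


def demo_parse() -> str:
    # Write a small sample file, then parse it back.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        sample.write_text("Neo4j is a graph database.", encoding="utf-8")
        return PlainTextParser().run(str(sample))
```

A parser for another file type would keep the same shape and only change how the text is extracted.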
@@ -95,7 +95,7 @@ To implement your own loader, use the `DataLoader` interface:
Document Splitter
- ========================
+ =================
Document splitters, as the name indicates, split documents into smaller chunks
that can be processed within the LLM token limits. Wrappers for LangChain and LlamaIndex
@@ -137,7 +137,7 @@ To implement a custom text splitter, the `TextSplitter` interface can be used:
Chunk Embedder
- ===============================
+ ==============
To embed the chunks' texts (for use in vector search RAG), one can use the
`TextChunkEmbedder` component, which relies on the :ref:`Embedder` interface.
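To make the idea concrete, the sketch below attaches an embedding to each chunk's metadata. Everything in it is a stand-in: `Chunk`, `ToyEmbedder`, and `embed_chunks` are hypothetical names, and the `embed_query` method is only assumed to resemble what an embedder exposes. It does not reproduce the package's `TextChunkEmbedder`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical chunk container: text plus free-form metadata."""
    text: str
    metadata: Dict[str, object] = field(default_factory=dict)


class ToyEmbedder:
    """Stand-in embedder (assumption): returns a tiny deterministic vector."""

    def embed_query(self, text: str) -> List[float]:
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(vowels)]


def embed_chunks(chunks: List[Chunk], embedder: ToyEmbedder) -> List[Chunk]:
    # Return new chunks whose metadata carries the embedding of their text.
    return [
        Chunk(c.text, {**c.metadata, "embedding": embedder.embed_query(c.text)})
        for c in chunks
    ]
```

In practice the toy embedder would be replaced by a model-backed one; the point is only that the vector travels with the chunk.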
@@ -168,7 +168,7 @@ The embeddings are added to each chunk metadata, and will be saved as a Chunk no
Schema Builder
- ========================
+ ==============
The schema is used to try to ground the LLM to a list of possible entities and relations of interest.
So far, the schema must be created manually by specifying:
@@ -227,7 +227,7 @@ to the LLM.
Entity and Relation Extractor
- ===============================
+ =============================
This component is responsible for extracting the relevant entities and relationships from each text chunk,
using the schema as a guideline.
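One way to picture "using the schema as a guideline" is a post-processing filter that keeps only entities and relations the schema allows. This is purely an illustrative sketch under that assumption: the function `filter_by_schema` and the dictionary shapes are hypothetical, and the package itself may rely on the prompt alone rather than on any such filtering.

```python
from typing import Dict, List, Set, Tuple

Entity = Dict[str, str]    # assumed shape: {"id": ..., "label": ...}
Relation = Dict[str, str]  # assumed shape: {"start": ..., "end": ..., "type": ...}


def filter_by_schema(
    entities: List[Entity],
    relations: List[Relation],
    allowed_labels: Set[str],
    allowed_types: Set[str],
) -> Tuple[List[Entity], List[Relation]]:
    # Keep entities whose label appears in the schema.
    kept = [e for e in entities if e["label"] in allowed_labels]
    kept_ids = {e["id"] for e in kept}
    # Keep relations of an allowed type whose endpoints both survived.
    kept_rels = [
        r for r in relations
        if r["type"] in allowed_types
        and r["start"] in kept_ids
        and r["end"] in kept_ids
    ]
    return kept, kept_rels
```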
@@ -259,7 +259,7 @@ It can be used in this way:
The LLM to use can be customized; the only constraint is that it obeys the :ref:`LLMInterface <llminterface>`.
Error Behaviour
- -------------------------------
+ ---------------
By default, if the extraction fails for one chunk, it will be ignored and the non-failing chunks will be saved.
This behaviour can be changed by using the `on_error` flag in the `LLMEntityRelationExtractor` constructor:
@@ -287,7 +287,7 @@ will be saved to Neo4j.
Lexical Graph
- -------------------------------
+ -------------
By default, the `LLMEntityRelationExtractor` adds some extra nodes and relationships to the extracted graph:
@@ -306,7 +306,7 @@ If this 'lexical graph' is not desired, set the `created_lexical_graph` to `Fals
Customizing the Prompt
- ----------------------------------------
+ ----------------------
The default prompt uses the :ref:`erextractiontemplate`. It is possible to provide a custom prompt as a string:
@@ -325,7 +325,7 @@ The following variables can be used in the prompt:
Subclassing the EntityRelationExtractor
- ----------------------------------------
+ ---------------------------------------
If more customization is needed, it is possible to subclass the `EntityRelationExtractor` interface:
@@ -360,7 +360,7 @@ See :ref:`entityrelationextractor`.
Knowledge Graph Writer
- ===============================
+ ======================
KG writers are used to save the results of the `EntityRelationExtractor`.
The main implementation is the `Neo4jWriter` that will write nodes and relationships