@@ -15,17 +15,17 @@ unstructured data.
******************
- Pipeline structure
+ Pipeline Structure
******************

A Knowledge Graph (KG) construction pipeline requires a few components:

- - Document **parser**: extract text from files (PDFs, ...)
- - Document **chunker**: split the text into smaller pieces of text, manageable by the LLM context window (token limit).
- - Chunk **embedder** (optional): compute and store the chunk embeddings
+ - **Document parser**: extract text from files (PDFs, ...).
+ - **Document chunker**: split the text into smaller pieces of text, manageable by the LLM context window (token limit).
+ - **Chunk embedder** (optional): compute the chunk embeddings.
- **Schema builder**: provide a schema to ground the LLM extracted entities and relations and obtain an easily navigable KG.
- **Entity and relation extractor**: extract relevant entities and relations from the text.
- - **Knowledge Graph writer**: write the identified entities and relations to a Neo4j database.
+ - **Knowledge Graph writer**: save the identified entities and relations.

.. image:: images/kg_builder_pipeline.png
  :alt: KG Builder pipeline
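
To make the order of the components listed in this hunk concrete, here is a minimal, library-independent sketch of how the stages could be chained. It is not the package's API: `chunk_text`, `build_graph`, and `toy_extract` are hypothetical names, the chunker splits on a character count rather than a token count, and the extractor is a stand-in for an LLM call. The "writer" step is reduced to collecting triples in a dictionary.

```python
from typing import Callable, Optional

# (source entity, relation type, target entity); hypothetical shape for this sketch
Triple = tuple[str, str, str]


def chunk_text(text: str, max_chars: int) -> list[str]:
    """Chunker stand-in: split text into pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def build_graph(
    text: str,
    extract: Callable[[str], list[Triple]],
    max_chars: int = 200,
    embed: Optional[Callable[[str], list[float]]] = None,
) -> dict:
    """Chain the stages: chunk -> (optional) embed -> extract -> collect."""
    chunks = chunk_text(text, max_chars)
    # The embedder is optional, mirroring the component list.
    embeddings = [embed(c) for c in chunks] if embed is not None else None
    triples: list[Triple] = []
    for chunk in chunks:
        triples.extend(extract(chunk))
    return {"chunks": chunks, "embeddings": embeddings, "triples": triples}


def toy_extract(chunk: str) -> list[Triple]:
    """Extractor stand-in: a real pipeline would call an LLM here."""
    if "Alice" in chunk and "Bob" in chunk:
        return [("Alice", "KNOWS", "Bob")]
    return []


result = build_graph("Alice met Bob in Paris.", toy_extract, max_chars=50)
```

Each stage is passed in as a plain callable, so any of them can be swapped without touching the others.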
@@ -171,9 +171,9 @@ Schema Builder
The schema is used to try and ground the LLM to a list of possible entities and relations of interest.
So far, schema must be manually created by specifying:

- - The entities the LLM should look for in the text, including their properties (name and type)
- - The relations of interest between these entities, including the relation properties (name and type)
- - A list of possible triplets to define the start (source) and end (target) types for each relation
+ - **Entities** the LLM should look for in the text, including their properties (name and type).
+ - **Relations** of interest between these entities, including the relation properties (name and type).
+ - **Triplets** to define the start (source) and end (target) entity types for each relation.

Here is a code block illustrating these concepts:
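
The hunk ends before the documentation's own example, which is not reproduced here. As a library-independent illustration of the three schema parts (entities, relations, triplets), the sketch below uses plain dataclasses. All class and field names (`Property`, `EntityType`, `RelationType`, `Schema`, `allows`) are hypothetical and are not the package's classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str
    type: str  # e.g. "STRING" or "INTEGER"; free-form in this sketch


@dataclass(frozen=True)
class EntityType:
    label: str
    properties: tuple[Property, ...] = ()


@dataclass(frozen=True)
class RelationType:
    label: str
    properties: tuple[Property, ...] = ()


@dataclass
class Schema:
    entities: list[EntityType]
    relations: list[RelationType]
    # Each triplet is (source entity label, relation label, target entity label).
    triplets: list[tuple[str, str, str]]

    def __post_init__(self) -> None:
        # Every triplet must refer to declared entity and relation labels.
        entity_labels = {e.label for e in self.entities}
        relation_labels = {r.label for r in self.relations}
        for source, relation, target in self.triplets:
            if source not in entity_labels or target not in entity_labels:
                raise ValueError(f"unknown entity in triplet {(source, relation, target)}")
            if relation not in relation_labels:
                raise ValueError(f"unknown relation in triplet {(source, relation, target)}")

    def allows(self, source: str, relation: str, target: str) -> bool:
        """Return True if the schema declares this (source, relation, target) pattern."""
        return (source, relation, target) in self.triplets


schema = Schema(
    entities=[
        EntityType("PERSON", (Property("name", "STRING"),)),
        EntityType("ORGANIZATION", (Property("name", "STRING"),)),
    ],
    relations=[RelationType("WORKS_AT", (Property("since", "INTEGER"),))],
    triplets=[("PERSON", "WORKS_AT", "ORGANIZATION")],
)
```

With this shape, `schema.allows("ORGANIZATION", "WORKS_AT", "PERSON")` is false, which is how triplets constrain the direction of each relation.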
@@ -317,9 +317,9 @@ The default prompt uses the :ref:`erextractiontemplate`. It is possible to provi
The following variables can be used in the prompt:

- - `text` (str): the text to be analyzed
- - `schema` (str): the graph schema to be used
- - `examples` (str): examples for few-shot learning
+ - `text` (str): the text to be analyzed.
+ - `schema` (str): the graph schema to be used.
+ - `examples` (str): examples for few-shot learning.
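
As a rough illustration of how the three variables could be substituted into a custom prompt, the sketch below uses Python's built-in `str.format`. The template text and the `render_prompt` helper are hypothetical; how the library itself performs the substitution is not shown in this diff.

```python
# Hypothetical template; only the placeholder names come from the list above.
PROMPT_TEMPLATE = """Extract entities and relations from the text below.

Schema:
{schema}

Examples:
{examples}

Text:
{text}
"""


def render_prompt(text: str, schema: str = "", examples: str = "") -> str:
    """Fill the three placeholders; schema and examples default to empty strings."""
    return PROMPT_TEMPLATE.format(text=text, schema=schema, examples=examples)


prompt = render_prompt(
    text="Alice works at Acme.",
    schema="(PERSON)-[WORKS_AT]->(ORGANIZATION)",
)
```

Note that with `str.format`, any literal braces in a template (for example a JSON sample) would need to be doubled as `{{` and `}}`.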
Subclassing the EntityRelationExtractor