Open
Labels: documentation (Improvements or additions to documentation)
Description
The documentation for implementing a custom data loader is out of date:
https://neo4j.com/docs/neo4j-graphrag-python/current/user_guide_kg_builder.html#data-loader
The documentation specifies the following:
from pathlib import Path
from neo4j_graphrag.experimental.components.pdf_loader import DataLoader, PdfDocument

class MyDataLoader(DataLoader):
    async def run(self, path: Path) -> PdfDocument:
        # process file in `path`
        return PdfDocument(text="text")
When the loader is used as part of the SimpleKGPipeline, the interface expects the run parameter to be named filepath and the returned PdfDocument to include a DocumentInfo, e.g.:
from pathlib import Path
from neo4j_graphrag.experimental.components.pdf_loader import DataLoader, PdfDocument, DocumentInfo

class MyDataLoader(DataLoader):
    async def run(self, filepath: Path) -> PdfDocument:
        # process file in `filepath`
        return PdfDocument(
            text="text",
            document_info=DocumentInfo(
                path=str(filepath),
                metadata={}
            )
        )
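To illustrate why the parameter name matters, here is a minimal sketch that runs without neo4j_graphrag installed. The DocumentInfo and PdfDocument dataclasses, both loader classes, and call_like_pipeline are hypothetical stand-ins, not the library's own code. The sketch assumes the pipeline passes the file to run by keyword as filepath; under that assumption a loader whose parameter is named path fails with a TypeError.

```python
import asyncio
from dataclasses import dataclass, field
from pathlib import Path


# Hypothetical stand-ins so the sketch runs without neo4j_graphrag installed.
@dataclass
class DocumentInfo:
    path: str
    metadata: dict = field(default_factory=dict)


@dataclass
class PdfDocument:
    text: str
    document_info: DocumentInfo


class OldStyleLoader:
    # Parameter named `path`, as in the outdated documentation.
    async def run(self, path: Path) -> PdfDocument:
        return PdfDocument(text="text", document_info=DocumentInfo(path=str(path)))


class NewStyleLoader:
    # Parameter named `filepath`, as proposed in this issue.
    async def run(self, filepath: Path) -> PdfDocument:
        return PdfDocument(
            text="text",
            document_info=DocumentInfo(path=str(filepath), metadata={}),
        )


async def call_like_pipeline(loader, file: Path) -> PdfDocument:
    # Assumption: the pipeline passes the file by keyword as `filepath`.
    return await loader.run(filepath=file)


def old_style_fails() -> bool:
    # Returns True if the old signature rejects the `filepath` keyword.
    try:
        asyncio.run(call_like_pipeline(OldStyleLoader(), Path("doc.pdf")))
    except TypeError:
        return True
    return False


result = asyncio.run(call_like_pipeline(NewStyleLoader(), Path("doc.pdf")))
```

If the real pipeline passes the argument positionally instead, the name mismatch would not raise, so this only demonstrates the keyword-call case.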