Neo4j GraphRAG Package for Python 1.8.0

@stellasia stellasia released this 03 Jul 15:05

New in 1.8.0

Full Changelog: 1.7.0...1.8.0

Schema

  • Introduced a new GraphSchema object

  • The GraphSchema object can be serialized as JSON or YAML

  • Extra parameters have been added to the GraphSchema object to control how the LLM-extracted graph is cleaned in a post-processing step. This replaces the "STRICT mode" introduced in 1.7.0.

  • SchemaExtractionFromText: a schema can now be extracted automatically from the input text before entity and relation extraction starts. This is controlled using the schema parameter:

    • schema="EXTRACTED" or schema=None (the default value): the schema is extracted from the input text once, using an LLM. This guiding schema is then used to structure entity and relation extraction for all chunks, so every chunk shares the same guiding schema. (See Automatic Schema Extraction)
    • schema="FREE" or an empty schema ({"node_types": ()}): no schema extraction is performed. Entity and relation extraction proceed without a predefined or derived schema, so the extraction is unguided. Use this to bypass automatic schema extraction.
    • Any other schema value is parsed into a GraphSchema object that is used to guide the LLM in the extractor and to clean the graph in the pruner.
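
The three cases above can be summarized as a small dispatch function. This is only an illustrative sketch of the behavior described in these notes, not the package's implementation: resolve_schema_mode and its return labels ("extracted", "free", "provided") are hypothetical names, and only the standard library is used.

```python
from typing import Any, Optional


def resolve_schema_mode(schema: Optional[Any] = None) -> str:
    """Return which documented behavior a `schema` value selects.

    Hypothetical helper for illustration; not part of neo4j-graphrag.
    """
    # Default (None) or the "EXTRACTED" keyword: the schema is extracted
    # once from the input text and reused for every chunk.
    if schema is None or schema == "EXTRACTED":
        return "extracted"

    # "FREE" keyword: no guiding schema, unguided extraction.
    if schema == "FREE":
        return "free"

    # Empty schema such as {"node_types": ()}: also unguided extraction.
    if (
        isinstance(schema, dict)
        and "node_types" in schema
        and len(schema["node_types"]) == 0
    ):
        return "free"

    # Any other value is treated as a user-provided schema definition.
    return "provided"
```

For example, resolve_schema_mode({"node_types": ()}) selects the unguided case, while a dictionary with at least one node type selects the user-provided case.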

GraphRAG: Ability to return a user-defined message if the context is empty

  • In a GraphRAG pipeline, if the context returned by the retriever is empty, it is now possible to stop the pipeline and return a fallback response.
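
The control flow can be sketched as follows. This is a minimal, library-independent illustration: answer_with_fallback, retrieve and generate are hypothetical names, and these notes do not state the actual parameter used by the package, so none is assumed here.

```python
from typing import Callable, List


def answer_with_fallback(
    query: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    fallback: str,
) -> str:
    """Return `fallback` without calling the generator when no context is found.

    Hypothetical helper for illustration; not part of neo4j-graphrag.
    """
    context = retrieve(query)
    if not context:
        # Empty context: stop here and skip the (costly) generation step.
        return fallback
    return generate(query, context)
```

The point of the early return is that the LLM is never called with an empty context, so it cannot produce an ungrounded answer.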

Misc

  • Added support for Python 3.13
  • Added the option to change the DB with retrieve_vector_index_info and retrieve_fulltext_index_info
  • Added the ability to store messages in a different database in Neo4jMessageHistory
  • Added createdAt property to Neo4jMessageHistory nodes

Fixed in 1.8.0

  • Fixed a bug where spacy and rapidfuzz needed to be installed even if not using the relevant entity resolvers.
  • Fixed RunResult model so that the results of each component in a pipeline are logged properly.
  • Fixed a bug where VertexAILLM.invoke_with_tools (and ainvoke_with_tools) would fail with multiple tools.

Changed in 1.8.0

Strict mode in KG Builder pipeline (BREAKING CHANGE - EXPERIMENTAL NAMESPACE)

enforce_schema="STRICT" | "None" has been removed from SimpleKGPipeline and from the config file. Schema enforcement is now based on the schema definition. See "New in 1.8.0" above.

Removed SchemaConfig in favor of a new GraphSchema type (BREAKING CHANGE - EXPERIMENTAL NAMESPACE)

This is largely an internal change. However, you may be affected if you have implemented your own SchemaBuilder or if you are using its return value in another custom component (e.g. an entity/relation extractor).

BEFORE:

all_entities = list(schema_config.entities.values())  # list[dict]
person_entity = schema_config.entities["Person"]  # dict
friendship_relationship = schema_config.relations["FRIENDSHIP"]

NOW:

all_entities = graph_schema.node_types  # list[NodeType]
person_entity = graph_schema.node_type_from_label("Person")  # NodeType object
friendship_relationship = graph_schema.relationship_type_from_label("FRIENDSHIP")

Introduce a new schema parameter in SimpleKGPipeline

Note: the previous syntax with 'entities', 'relations' and 'potential_schema' still works, but it is deprecated and will be removed soon.

BEFORE:

kg_builder = SimpleKGPipeline(
    # ...
    entities=node_types,
    relations=relationship_types,
    potential_schema=patterns,
    # ...
)

NOW:

kg_builder = SimpleKGPipeline(
    # ...
    schema={
        "node_types": node_types,
        "relationship_types": relationship_types,
        "patterns": patterns,
    },
    # ...
)

The definitions of node_types, relationship_types and patterns are unchanged compared to the previous entities, relations and potential_schema, respectively.
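
Because the definitions are unchanged, migrating call sites amounts to renaming keys. The sketch below shows one way to build the new schema dictionary from the three legacy arguments; to_schema_dict is a hypothetical helper, not something provided by the package.

```python
from typing import Any, Dict, Optional, Sequence


def to_schema_dict(
    entities: Optional[Sequence[Any]] = None,
    relations: Optional[Sequence[Any]] = None,
    potential_schema: Optional[Sequence[Any]] = None,
) -> Dict[str, list]:
    """Map the deprecated keyword arguments onto the new `schema` keys.

    Hypothetical helper for illustration; not part of neo4j-graphrag.
    """
    return {
        "node_types": list(entities or []),
        "relationship_types": list(relations or []),
        "patterns": list(potential_schema or []),
    }
```

The result can then be passed as schema=to_schema_dict(...) to SimpleKGPipeline. Note that if no entities were defined, node_types ends up empty, which according to the notes above disables schema guidance entirely.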

Node properties in KG Builder

  • Nodes created during the KG construction pipeline no longer have an id property.
  • Similarly, the chunk_index property has been removed from all entity nodes (users can use the FROM_CHUNK relationship instead).
