Base class for similarity-based matching of properties for entity resolution.
  Resolve entities with same label and similar set of textual properties (default is
- ["name"]) based on spaCy's static embeddings and cosine similarities.
+ ["name"]):
+ - Group entities by label
+ - Concatenate the specified textual properties
+ - Compute similarity between each pair
+ - Consolidate overlapping sets
+ - Merge similar nodes via APOC (See apoc.refactor.mergeNodes documentation for more
+   details).
+
+ Subclasses implement `compute_similarity` based on different strategies, and return
+ a similarity score between 0 and 1.

  Args:
      driver (neo4j.Driver): The Neo4j driver to connect to the database.
      filter_query (Optional[str]): Optional Cypher WHERE clause to reduce the resolution scope.
      resolve_properties (Optional[List[str]]): The list of properties to consider for embeddings. Defaults to ["name"].
      similarity_threshold (float): The similarity threshold above which nodes are merged. Defaults to 0.8.
-     spacy_model (str): The name of the spaCy model to load. Defaults to "en_core_web_lg".
      neo4j_database (Optional[str]): The name of the Neo4j database. If not provided, this defaults to the server's default database ("neo4j" by default) (`see reference to documentation <https://neo4j.com/docs/operations-manual/current/database-administration/#manage-databases-default>`_).

- Example:
-
- .. code-block:: python
-
-     from neo4j import GraphDatabase
-     from neo4j_graphrag.experimental.components.resolver import SinglePropertyExactMatchResolver
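The base-class docstring lists its resolution steps without showing how they fit together. The following is a minimal, database-free sketch of the first four steps (group, concatenate, compare, consolidate), not the library's implementation. The entity dictionaries, `find_merge_sets`, and `consolidate` are hypothetical names introduced for illustration, and the `compute_similarity` shown here uses `difflib` as a stand-in strategy. The final APOC merge step is omitted because it needs a running Neo4j instance.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def compute_similarity(text_a: str, text_b: str) -> float:
    """Stand-in strategy (assumption): character-level ratio in [0, 1]."""
    return SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()


def consolidate(sets):
    """Merge sets that share at least one element into disjoint groups."""
    merged = []
    for candidate in sets:
        candidate = set(candidate)
        untouched = []
        for group in merged:
            if group & candidate:
                candidate |= group  # absorb every overlapping group
            else:
                untouched.append(group)
        untouched.append(candidate)
        merged = untouched
    return merged


def find_merge_sets(entities, resolve_properties=("name",), similarity_threshold=0.8):
    """Return sets of entity ids that the merge step would combine."""
    # Step 1: group entities by label.
    by_label = defaultdict(list)
    for entity in entities:
        by_label[entity["label"]].append(entity)

    merge_sets = []
    for group in by_label.values():
        # Step 2: concatenate the specified textual properties.
        texts = {
            e["id"]: " ".join(str(e[p]) for p in resolve_properties if e.get(p))
            for e in group
        }
        # Step 3: compute similarity for each pair, keeping pairs above the threshold.
        similar_pairs = [
            {a, b}
            for a, b in combinations(texts, 2)
            if compute_similarity(texts[a], texts[b]) > similarity_threshold
        ]
        # Step 4: consolidate overlapping sets.
        merge_sets.extend(consolidate(similar_pairs))
    return merge_sets


# Hypothetical sample data: two near-identical Person nodes, one distinct
# Person, and an Organization that shares a name but not a label.
entities = [
    {"id": 1, "label": "Person", "name": "Alice Smith"},
    {"id": 2, "label": "Person", "name": "alice smith"},
    {"id": 3, "label": "Person", "name": "Bob Jones"},
    {"id": 4, "label": "Organization", "name": "Alice Smith"},
]
merge_sets = find_merge_sets(entities)
```

Because grouping happens before comparison, the Organization node is never compared with the Person nodes even though its name matches. Consolidation matters when A is similar to B and B to C but A and C fall below the threshold: all three still end up in one merge set.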
Resolve entities with same label and similar set of textual properties (default is
+ ["name"]) based on spaCy's static embeddings and cosine similarities.
+
+ Args:
+     driver (neo4j.Driver): The Neo4j driver to connect to the database.
+     filter_query (Optional[str]): Optional Cypher WHERE clause to reduce the resolution scope.
+     resolve_properties (Optional[List[str]]): The list of properties to consider for embeddings. Defaults to ["name"].
+     similarity_threshold (float): The similarity threshold above which nodes are merged. Defaults to 0.8.
+     spacy_model (str): The name of the spaCy model to load. Defaults to "en_core_web_lg".
+     neo4j_database (Optional[str]): The name of the Neo4j database. If not provided, this defaults to the server's default database ("neo4j" by default) (`see reference to documentation <https://neo4j.com/docs/operations-manual/current/database-administration/#manage-databases-default>`_).
+
+ Example:
+
+ .. code-block:: python
+
+     from neo4j import GraphDatabase
+     from neo4j_graphrag.experimental.components.resolver import SpaCySemanticMatchResolver
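The spaCy-based docstring names cosine similarity as its scoring measure. As a rough illustration of that measure only, the sketch below computes it over plain Python lists. The vectors are made-up stand-ins for embeddings (an assumption, since no spaCy model is loaded here), and `cosine_similarity` and `should_merge` are hypothetical helper names, not part of the library.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector (e.g. text with no known tokens) has no direction;
        # treating it as dissimilar is a choice made for this sketch.
        return 0.0
    return dot / (norm_a * norm_b)


def should_merge(vec_a, vec_b, similarity_threshold=0.8):
    """Apply the documented rule: merge when similarity is above the threshold."""
    return cosine_similarity(vec_a, vec_b) > similarity_threshold


# Made-up 3-dimensional vectors standing in for real embeddings.
close_pair = should_merge([0.9, 0.1, 0.0], [1.0, 0.2, 0.0])
far_pair = should_merge([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Raw cosine similarity ranges from -1 to 1, whereas the base class asks for a score between 0 and 1. How the resolver reconciles the two is not visible in this part of the diff.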