Allow users to override NER with known locations #7

@ahalterman

Sometimes the spaCy NER step fails on very short documents, including cases where the entire document is a single sentence, because it lacks the usual context for identifying locations. This happens especially with locations outside Europe or English-speaking countries, because of limitations in the NER training data.

It would be nice to allow users to override the NER step and force a geolocation on a known entity.

The easiest way to do this would probably be to:

  • pull most of the geolocation logic out of geoparse_doc into a separate function
  • create a new function (geoparse_ent?) that takes a document and an entity string as arguments, manually adds the string as a spaCy entity, and then calls the function extracted in the first step.
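
A minimal sketch of the "manually add the string as a spaCy entity" step, assuming spaCy v3. The helper name `add_known_location` and the default `GPE` label are illustrative assumptions, not part of the existing code, and the labels the geolocation step actually expects may differ. It uses a blank English pipeline so that no trained model is needed to try it.

```python
import spacy
from spacy.tokens import Doc


def add_known_location(doc: Doc, ent_text: str, label: str = "GPE") -> Doc:
    """Mark every occurrence of ent_text in doc as an entity (hypothetical helper)."""
    new_spans = []
    start = doc.text.find(ent_text)
    while start != -1:
        end = start + len(ent_text)
        # char_span returns None when the character offsets do not line up
        # with token boundaries, so those occurrences are skipped.
        span = doc.char_span(start, end, label=label)
        if span is not None:
            new_spans.append(span)
        start = doc.text.find(ent_text, end)
    if not new_spans:
        raise ValueError(f"{ent_text!r} not found on token boundaries in the document")
    # default="unmodified" leaves the annotation on all other tokens as it was.
    doc.set_ents(new_spans, default="unmodified")
    return doc


# Blank pipeline: tokenizer only, no NER component, so doc.ents starts empty.
nlp = spacy.blank("en")
doc = add_known_location(nlp("Clashes were reported near Kobani."), "Kobani")
found = [(ent.text, ent.label_) for ent in doc.ents]
```

The returned `Doc` could then be handed to whichever function ends up holding the geolocation logic pulled out of geoparse_doc, so the forced entity goes through the same lookup as one found by NER.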
