Improve indexing for multi-lingual items #275

@ruthtillman

Description

Our language faceting is based on the primary language field, 008[35-37]. The 041 subfields contain additional language data. We have not been indexing that data because it can be overkill, e.g. https://searchworks.stanford.edu/view/13749584 (The Crown, showing English, Arabic, Danish, Dutch, Estonian, Finnish, French, German, Italian, Norwegian, Polish, Spanish, Swedish, and Turkish).

But if the 008[35-37] is "mul" (multilingual), the 041a will often be useful. The 041a may repeat. It uses the same language codes as the 008[35-37], so the mapping won't need to be updated. However, this will turn the facet from a single value into an array/list.

Some records will not have an 041a, so we'll need to do an "if exists" check.

So new logic would be:

  • If 008[35-37] == mul:
  • THEN index the 041a values too
  • AND make the combined list unique (because the 041a may contain "mul" as well)

We'll want to retain the "mul" as well.
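
A minimal sketch of the logic above, in plain Python rather than whatever indexing code we actually use. The function name `language_codes` and its inputs (the raw 008 string and a list of 041a values, empty when the record has no 041a) are hypothetical, chosen only for illustration:

```python
from typing import List


def language_codes(field_008: str, field_041a: List[str]) -> List[str]:
    """Return the language codes to index for one record.

    field_008  -- the full 008 string (hypothetical input shape)
    field_041a -- every 041 $a value on the record; empty if none exist
    """
    # 008[35-37] is inclusive, so the Python slice is [35:38].
    primary = field_008[35:38].strip()
    codes = [primary] if primary else []

    # Only pull in the 041a values when the primary language is "mul".
    if primary == "mul":
        for code in field_041a:
            code = code.strip().lower()
            # De-duplicate while keeping order; this also drops a repeated "mul".
            if code and code not in codes:
                codes.append(code)

    return codes
```

An empty `field_041a` list covers the "if exists" check: the loop simply does nothing and "mul" is retained on its own. Mapping the codes to display names is left to the existing mapping.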

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
