
import-wikidata should prefer name statements over labels #437

@1ec5


import-wikidata fetches the label of each linked Wikidata item in each available language:

query = f"""\
SELECT ?id ?label WHERE {{
VALUES ?id {{ {' '.join(batch)} }}
?id rdfs:label ?label.
}}"""

This is suboptimal because Wikidata labels exist primarily to identify items on the Wikidata site itself. A label usually corresponds to a concept’s common name, but it is sometimes modified to be more recognizable on the site. (The closest analogy in OSM would be the name of a route relation that a mapper has optimized for display in the osm.org sidebar or JOSM’s relation list.)

A better alternative is the name (P2561) property. When an item has statements for this property, the query should prefer those statements. If there’s no statement for a given language, it should fall back to the label in that language.
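To illustrate the fallback, here is a rough sketch rather than a proposed patch. It assumes the query is extended with a `UNION` branch on `wdt:P2561` (the `wdt:` prefix returns only best-rank statements) and that results are merged per language afterwards. The function names `build_query` and `prefer_names`, and the sample strings, are made up for this example.

```python
def build_query(batch):
    """Build a SPARQL query returning labels and name (P2561) statements.

    `batch` is a list of prefixed item IDs such as "wd:Q61".
    """
    values = " ".join(batch)
    return f"""\
SELECT ?id ?label ?name WHERE {{
  VALUES ?id {{ {values} }}
  {{ ?id rdfs:label ?label. }}
  UNION
  {{ ?id wdt:P2561 ?name. }}
}}"""


def prefer_names(labels, names):
    """Merge two {language: text} dicts for one item.

    A name statement wins when present; otherwise the label in that
    language is kept as the fallback.
    """
    merged = dict(labels)  # start from labels as the fallback
    merged.update(names)   # name statements override labels per language
    return merged


# Hypothetical sample data, not fetched from Wikidata.
labels = {"en": "Example City, X.Y.", "fr": "Example-Ville"}
names = {"en": "Example City"}
merged = prefer_names(labels, names)
```

With this sample data, English takes the name statement and French falls back to the label.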

If there are multiple name statements in a given language, the query should prefer the one with preferred rank, or the one without an end time (P582). Better yet, it should prefer the statement whose “object has role” (P3831) qualifier is set to map label (Q104642575). For example, this will avoid adding an extra “D.C.” disambiguator to Washington, D.C. (which is correct in most written mediums, just not maps).
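The selection among several name statements in one language could look roughly like the sketch below. It assumes each statement has already been parsed into a dict with `text`, `rank`, `end_time`, and `roles` keys; that shape and the name `choose_name` are hypothetical. Ranking the map label role above preferred rank, and preferred rank above a missing end time, is one reading of the priorities described above, not something the issue pins down.

```python
MAP_LABEL = "Q104642575"  # "map label" item, used with the P3831 qualifier


def choose_name(statements):
    """Pick the best name among statements that share a language.

    Each statement is a dict (hypothetical shape):
      text     -- the name string
      rank     -- "preferred", "normal", or "deprecated"
      end_time -- P582 qualifier value, or None when absent
      roles    -- collection of P3831 qualifier item IDs
    """
    def score(stmt):
        # Tuples compare element by element, so earlier criteria dominate.
        return (
            MAP_LABEL in stmt.get("roles", ()),
            stmt.get("rank") == "preferred",
            stmt.get("end_time") is None,
        )

    # max() returns the first statement with the highest score on ties.
    return max(statements, key=score)["text"]


# Hypothetical statements modeled on the Washington, D.C. example.
candidates = [
    {"text": "Washington, D.C.", "rank": "preferred", "end_time": None, "roles": []},
    {"text": "Washington", "rank": "normal", "end_time": None, "roles": [MAP_LABEL]},
]
chosen = choose_name(candidates)
```

Here the statement carrying the map label role wins even though the other one has preferred rank.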

/ref https://github.com/ZeLonewolf/openstreetmap-americana/pull/592#discussion_r1035274034
