Support node names with strings containing Unicode code points #17

@zoj613

The Zarr V3 specification requires every node to have a name, which is a string of Unicode code points. It recommends using only characters in the sets a-z, A-Z, 0-9, -, _, and ., but does not enforce this. For non-ASCII Unicode characters, it also recommends using case-folded, NFKC-normalized strings. OCaml seems to have a number of libraries to support this:

  • Decoding UTF-8 encoded strings with uutf
  • Segmentation of Unicode text with uuseg
  • Normalization with uunf
  • Inspection with uucp

There is also an introductory text with usage tips that can help make things easy to implement.
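
As a starting point, here is a minimal sketch of how a name could be triaged before any normalization is attempted. It uses only the standard library (OCaml 4.14 or later, for `String.get_utf_8_uchar`) rather than the libraries listed above. The names `is_valid_utf8`, `is_recommended_char`, `name_class` and `classify_name` are hypothetical and not part of the existing code, and rejecting the empty string is an assumption of this sketch. Names classified as `Other_unicode` are the ones that would still need case folding and NFKC normalization, for example with uunf and uucp.

```ocaml
(* Hypothetical helpers; requires OCaml >= 4.14 (String.get_utf_8_uchar). *)

(* True when [s] is a well-formed UTF-8 byte sequence. *)
let is_valid_utf8 (s : string) : bool =
  let len = String.length s in
  let rec loop i =
    if i >= len then true
    else
      let d = String.get_utf_8_uchar s i in
      (* Stop at the first malformed sequence, otherwise skip past it. *)
      Uchar.utf_decode_is_valid d && loop (i + Uchar.utf_decode_length d)
  in
  loop 0

(* The character sets the specification recommends: a-z, A-Z, 0-9, -, _, . *)
let is_recommended_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '-' | '_' | '.' -> true
  | _ -> false

type name_class =
  | Recommended    (* only characters from the recommended sets *)
  | Other_unicode  (* valid UTF-8, but would need case folding + NFKC *)
  | Invalid        (* empty (assumption of this sketch) or malformed UTF-8 *)

let classify_name (s : string) : name_class =
  if s = "" || not (is_valid_utf8 s) then Invalid
  else if String.for_all is_recommended_char s then Recommended
  else Other_unicode
```

With this split, the common ASCII case never touches the Unicode libraries, and only `Other_unicode` names pay the cost of normalization.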

Metadata

    Labels

    enhancement (New feature or request), help wanted (Extra attention is needed), node (related to the node module)
