
[meta] HXLm.lisp and/or related strategies for portable 'Turing complete' HDP custom functions #18


Description

@fititnt

Related topics:

  • [meta] HDP Declarative Programming (working draft) #16
    • The HDP YAML (+JSON) syntax has the potential to be, at minimum, a way for humans to document how to access datasets (without leaking access keys and passwords, yet structured enough to be parsed by programs)
      • The urnresolver: Uniform Resource Names - URN Resolver #13 could help with the part about pointing to the resources themselves
      • The way to express Acceptable Use Policies (ones that could be enforced by tools) is still not defined, because it needs to be good enough to allow localization into the natural languages supported by the drafted HDP
  • hxl-yml-spec-to-hxl-json-spec: HXL Data processing specs exporter #14
    • The HXL Data processing specs have been known to work in production for years. The official wiki says the underlying usage is aimed at coders, but the HXL-Proxy already provides a front-end for end users.
      • At a bare minimum, any syntax should also be transpilable to JSON processing specs for HXL data, so it could run on the existing HXL-Proxy (and command line tools) with the same stability as tools already working today (see the first sketch after this list).
        • Note: a Lisp-like syntax is even less user-friendly than YAML syntactic sugar over the JSON processing specs for HXL data, so this approach is not expected to replace the option of users working without Lisp-like syntax.
    • Note: as of this comment on hxl-yml-spec-to-hxl-json-spec: HXL Data processing specs exporter #14 (comment), an early version, not yet integrated into HXLm.HDP but accessible via hdpcli, already transpiles the YAML structure to HXL Data processing specs supported by the libhxl CLI tools and the HXL-Proxy. The notable exceptions are inline input (so that, instead of pointing to an external URL, the processing spec would already contain the input data) and inline output (which could be used to test whether the filter was validated).
      • But if we manage to let users save the input before passing data to the libhxl-python CLI tool or the HXL-Proxy, for example to a local file, or implement HXLm.core.io to even allow writing Google Spreadsheets, then, as hard as this extra step may be, it would allow validation of entire HXL-like systems with data closer to the real thing, meeting the requirements of 'design-by-contract' programming by testing the full chain of components, not just the internals of the HXL data processing implementation.
  • [meta] Internationalization and localization (i18n, l10n) and internal working vocabulary #15
    • As a proof of concept, HDP already supports the 6 UN working languages plus Latin and Portuguese.
      • The early prototypes were actually faster to build than the core refactoring to a more OOP approach with a strong focus on automated testing of internals. But the point here is that the HDP YAML files already support several natural languages, and adding new ones does not require knowing the internals, just changing some files under hxlm/ontologia.
    • While the supported languages do not need to be the same as those for the high-level HDP YAML keywords (ideally, since there are fewer terms, those should be done first!), the relevant point here is: whatever language syntax is chosen for creating HDP macros/extensions/plugins, it must be designed from the outset to support a source-to-source compiler.
      • In other words: even if we decide to implement keywords using English/Latin, as soon as there is help to add new natural languages to the knowledge graph, this should allow source-to-source compiling. This means the syntax should be "L10N friendly" from the start (see the second sketch after this list).
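
To make the transpiling idea above concrete, here is a minimal sketch in Python. The YAML keys, the filter names and the shape of the generated spec are simplified assumptions for illustration; they are not the exact format produced by hxl-yml-spec-to-hxl-json-spec nor the exact JSON recipe format expected by the HXL-Proxy.

```python
"""Minimal sketch: transpile a YAML-like structure into a JSON
processing spec for HXL data.

Assumptions: the input keys ('input', 'filters') and the output
shape are simplified for illustration; the real spec uses
per-filter parameter names."""

import json

# A hypothetical HDP-style recipe, already loaded as a Python dict
# (e.g. via yaml.safe_load()); the URL is a placeholder.
yaml_recipe = {
    "input": "https://example.org/dataset.csv",
    "filters": [
        {"with_rows": "#sector=WASH"},
        {"without_columns": "#contact+email"},
    ],
}


def transpile(recipe: dict) -> str:
    """Convert the simplified YAML structure into a JSON document
    with one {"filter": ..., "args": ...} object per step."""
    spec = {
        "input": recipe.get("input"),
        "recipe": [
            {"filter": name, "args": arg}
            for step in recipe.get("filters", [])
            for name, arg in step.items()
        ],
    }
    return json.dumps(spec, indent=2)


if __name__ == "__main__":
    print(transpile(yaml_recipe))
```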

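The second sketch shows why being "L10N friendly" from the start matters in practice: if every keyword is resolved through a vocabulary table (as the files under hxlm/ontologia already do for the HDP keywords), a document written in one supported language can be rewritten into the canonical vocabulary, or into any other supported language, before being executed. The Portuguese terms below are invented for illustration and are not the real vocabulary.

```python
"""Minimal sketch: source-to-source rewriting of localized keywords.

Assumption: the vocabulary below is made up for illustration; the
real terms live in the hxlm/ontologia files."""

# Hypothetical vocabulary: canonical term -> translations per language.
VOCAB = {
    "filters": {"por": "filtros", "spa": "filtros"},
    "with_rows": {"por": "com_linhas", "spa": "con_filas"},
}

# Reverse index: (language, localized term) -> canonical term.
REVERSE = {
    (lang, term): canonical
    for canonical, translations in VOCAB.items()
    for lang, term in translations.items()
}


def to_canonical(doc: dict, lang: str) -> dict:
    """Rewrite all known keys of a nested document from `lang`
    into the canonical vocabulary, leaving values untouched."""
    out = {}
    for key, value in doc.items():
        canonical = REVERSE.get((lang, key), key)
        if isinstance(value, dict):
            value = to_canonical(value, lang)
        elif isinstance(value, list):
            value = [to_canonical(v, lang) if isinstance(v, dict) else v
                     for v in value]
        out[canonical] = value
    return out


# A document written with (hypothetical) Portuguese keywords becomes
# the canonical form that the transpiler in the first sketch accepts.
doc_por = {"filtros": [{"com_linhas": "#sector=WASH"}]}
assert to_canonical(doc_por, "por") == {"filters": [{"with_rows": "#sector=WASH"}]}
```
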
Related concepts:


About this topic

This topic is a draft. It will be referenced in specific commits and other discussions.

But to say upfront: as much as possible, the idea here is to keep documents that could be used by decision makers to authorize usage and/or by people who document that datasets exist (even if the document does not say how to find them), and, for whatever is not already feasible via the underlying Python implementation, to allow customization.

Note that these customizations, while not explicitly sandboxed (though they could be), do not need to be allowed direct disk or network access. This approach is not just safer; it also opens room for them to be more reusable and (this is very important!) simplifies documentation on how to use them, even for individuals who do not speak the same language.
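
As a rough illustration of what such a customization layer could look like without any disk or network access, below is a tiny S-expression evaluator whose only built-ins are a whitelist of pure functions over the data it is handed. Everything in it (the function names, the syntax, the 'row' input) is a hypothetical sketch, not the HXLm.lisp design itself.

```python
"""Minimal sketch of a sandboxed, Lisp-like evaluator for custom
functions: no file or network access, only a small whitelist of
pure built-ins.  This is NOT the HXLm.lisp design, only an
illustration of the 'no direct disk or network access' idea."""

# Whitelisted pure built-ins; nothing here can reach disk or network.
BUILTINS = {
    "+": lambda *xs: sum(xs),
    "concat": lambda *xs: "".join(str(x) for x in xs),
    "upper": lambda s: str(s).upper(),
}


def tokenize(source: str) -> list:
    """Split an S-expression string into a flat list of tokens."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: list):
    """Build a nested list (the AST) from the token stream."""
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing ')'
        return expr
    try:
        return int(token)
    except ValueError:
        return token


def evaluate(expr, env: dict):
    """Evaluate an AST node against `env` (the only data the
    expression can see) plus the whitelisted built-ins."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):        # symbol lookup, e.g. a column value
        return env[expr]
    head, *args = expr
    if head == "if":                 # special form: (if cond then else)
        cond, then_branch, else_branch = args
        chosen = then_branch if evaluate(cond, env) else else_branch
        return evaluate(chosen, env)
    func = BUILTINS[head]            # KeyError means "not allowed"
    return func(*(evaluate(arg, env) for arg in args))


# Example: a custom function over a single row of data, no I/O involved.
row = {"org": "example org", "count": 3}
program = "(concat (upper org) (+ count 1))"
print(evaluate(parse(tokenize(program)), row))   # -> EXAMPLE ORG4
```

Because such an evaluator only sees the environment it is given, documenting and reusing these functions across deployments (and across natural languages, if the built-in names also go through a vocabulary table) stays simple.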
