Description
Is your feature request related to a problem? Please describe.
Currently, we allow descriptive stream names such as `NPP, ATMS`, following a known convention in the satellite world. See this example. However, logging systems only want simple names for metrics, such as `a.b.c`, where each segment like `a` is limited to the characters `a-zA-Z0-9-_`. The allowed set could be wider, but I would not push for that because it would break pandas column selection.
As a result, we cannot really have per-stream metrics, which would be useful, for example, to track how many samples are being ingested per stream. In this case the metric would be named `streams.NPP, ATMS.count_samples`, which would cause issues with WandB or MLflow.
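To make the problem concrete, here is a minimal sketch that checks a metric name against the rule described above. The regular expression and the helper names `is_simple_metric_name` and `metric_key` are assumptions for illustration; they are not part of the project, WandB, or MLflow.

```python
import re

# Assumed rule from the description: dot-separated segments,
# each limited to the characters a-zA-Z0-9-_
SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def is_simple_metric_name(name: str) -> bool:
    """Return True if every dot-separated segment matches the assumed rule."""
    return all(SEGMENT.fullmatch(part) for part in name.split("."))


def metric_key(stream_name: str, metric: str) -> str:
    """Build a per-stream metric key (hypothetical naming scheme)."""
    return f"streams.{stream_name}.{metric}"


for stream in ("NPP, ATMS", "NPP_ATMS"):
    key = metric_key(stream, "count_samples")
    print(key, is_simple_metric_name(key))
```

The descriptive name fails the check because of the comma and the space, while the machine-ready variant passes.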
I see 3 ways forward:
- magic: convert `NPP, ATMS` to camelCase, for example `NppAtms`
- convention: only allow names that follow the convention above and return an error otherwise
- extra info: when defining a stream name, require the user to set a `stream_id` as a unique machine-ready identifier (for example `NPP_ATMS`)
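The three options could look roughly like this. This is only a sketch to compare them side by side: `to_camel_case`, `require_valid_name`, and `StreamConfig` are hypothetical names, and the allowed character set is an assumption.

```python
import re
from dataclasses import dataclass

# Assumed character set for machine-ready identifiers
VALID_ID = re.compile(r"[A-Za-z0-9_]+")


def to_camel_case(name: str) -> str:
    """Option 1 (magic): join the alphanumeric words, capitalizing each."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "".join(word.capitalize() for word in words)


def require_valid_name(name: str) -> str:
    """Option 2 (convention): reject names outside the allowed set."""
    if not VALID_ID.fullmatch(name):
        raise ValueError(f"invalid stream name: {name!r}")
    return name


@dataclass(frozen=True)
class StreamConfig:
    """Option 3 (extra info): descriptive name plus a machine-ready id."""

    name: str
    stream_id: str

    def __post_init__(self) -> None:
        # Only the id has to be machine-ready; the name stays free-form.
        require_valid_name(self.stream_id)


print(to_camel_case("NPP, ATMS"))
print(StreamConfig(name="NPP, ATMS", stream_id="NPP_ATMS"))
```

One trade-off worth noting: the magic conversion is lossy (different descriptive names can collapse to the same identifier), whereas an explicit `stream_id` keeps the mapping under the user's control.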
In any case, I believe this name should be restricted to `a-zA-Z0-9_` (no `-`, to prevent weird issues with databases or pandas):
https://stackoverflow.com/questions/47964380/pandas-dataframe-column-naming-conventions
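One way to see why a hyphen is risky: pandas only exposes a column as an attribute (`df.some_column`) when the label is a valid Python identifier, and an expression like `df.NPP-ATMS` parses as a subtraction. The check below uses only the standard library; `is_attribute_safe` is a hypothetical helper name.

```python
def is_attribute_safe(label: str) -> bool:
    """Return True if the label is a valid Python identifier."""
    return label.isidentifier()


for candidate in ("NPP_ATMS", "NPP-ATMS", "NPP, ATMS"):
    print(candidate, is_attribute_safe(candidate))
```

Note that a name starting with a digit is not a valid identifier either, so a stricter rule might also require a leading letter or underscore.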
Do you have any thoughts on that?