Add prototype of new client #7
Conversation
```python
from _api.scout_catalog import CatalogService

catalog = create_service(CatalogService, get_base_url())
```
we can probably wrap this all in a client or session class so we have fewer globals floating around
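A minimal sketch of that idea, assuming `create_service` and `get_base_url` keep their current signatures (the `NominalClient` name and the `catalog` property are illustrative, not the actual design):

```python
from functools import cached_property
from typing import Optional

from _api.scout_catalog import CatalogService


class NominalClient:
    """Hypothetical session class that owns service instances instead of module-level globals."""

    def __init__(self, base_url: Optional[str] = None):
        # create_service / get_base_url are the existing module helpers
        self._base_url = base_url or get_base_url()

    @cached_property
    def catalog(self) -> CatalogService:
        # construct the service once, on first access
        return create_service(CatalogService, self._base_url)
```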
```python
# wrappers for conjure service apis
from _api.scout_catalog import CatalogService
```
all these paths need to change to `._api.combined.<whatever>`
(though also can just create a separately installable package for this eventually)
down to start publishing the conjure APIs immediately - the conjure Python generator produces Python packages, it just needs some wiring to push to PyPI
```python
# todo - merge this with the existing _upload_s3 function, it's mostly copy-pasted wholesale
def _upload_file_s3(self):
    if hasattr(self, "s3_path"):
```
These attrs can be optional or something, rather than needing to check for presence.
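A sketch of the `Optional` version, assuming the `hasattr` check means "skip if already set" (only `s3_path` and `_upload_file_s3` come from the excerpt; the rest of the class shape is illustrative):

```python
from typing import Optional


class Dataset:
    def __init__(self, csv_path: str):
        self.csv_path = csv_path
        # declared up front, so callers compare against None
        # instead of probing with hasattr()
        self.s3_path: Optional[str] = None

    def _upload_file_s3(self) -> None:
        if self.s3_path is not None:
            return  # already uploaded, nothing to do
        ...  # upload, then set self.s3_path
```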
```python
class Dataset:
    def __init__(self, csv_path: str):
        self.csv_path = csv_path
        print("Reading dataset schema...", flush=True)
```
we can migrate over to logging for these messages
agree - do you know if it auto-flushes? it was very annoying not seeing logs during long-running blocking operations
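For what it's worth, the stdlib `logging.StreamHandler` does flush after every record (its `emit()` ends with a `flush()` call), so the blocked-output problem shouldn't recur. A minimal sketch of the migration (the logger name is illustrative):

```python
import logging

logger = logging.getLogger("nominal.dataset")  # hypothetical logger name


class Dataset:
    def __init__(self, csv_path: str):
        self.csv_path = csv_path
        # StreamHandler flushes on every emit, so no flush=True equivalent is needed
        logger.info("Reading dataset schema...")
```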
```python
if hasattr(self, "run_rid"):
    return
```
should definitely log when we skip steps
the reason I didn't do this is that this will be called recursively - so if you had 10 series in a workbook, it'd call `sync` on all of them, then each of those would call `sync` on their respective run (and eventually dataset). so the dataset/run would be uploaded the first time, then skipped for the other 9 series & produce a lot of confusing noise. if/when we switch to a proper logger, would definitely add this at debug level or something
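A sketch of the debug-level variant, so the first upload stays visible while the redundant skips are quiet unless debug logging is on (the `Run` class and `run_rid` attribute follow the excerpts; the method body is assumed):

```python
import logging

logger = logging.getLogger(__name__)


class Run:
    def sync(self) -> None:
        if hasattr(self, "run_rid"):
            # repeated skips are expected when many series share one run,
            # so keep this below the default INFO level
            logger.debug("run %s already synced, skipping", self.run_rid)
            return
        ...  # upload, then set self.run_rid
```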
```python
@staticmethod
def from_ch(run: Run, sess, name: str, series_id: str, series_type: str) -> "Series":
    base_query = f"SELECT timestamp, value FROM nominal.dataset_{series_type} WHERE series='{series_id}'"
    return Series(run, sess, name, base_query, RawNumericSeriesNode(name=name))
```
Suggested change:

```diff
-@staticmethod
-def from_ch(run: Run, sess, name: str, series_id: str, series_type: str) -> "Series":
-    base_query = f"SELECT timestamp, value FROM nominal.dataset_{series_type} WHERE series='{series_id}'"
-    return Series(run, sess, name, base_query, RawNumericSeriesNode(name=name))
+@classmethod
+def from_ch(cls, run: Run, sess, name: str, series_id: str, series_type: str) -> "Series":
+    base_query = f"SELECT timestamp, value FROM nominal.dataset_{series_type} WHERE series='{series_id}'"
+    return cls(run, sess, name, base_query, RawNumericSeriesNode(name=name))
```
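The design point, presumably: a `classmethod` receives the class as `cls`, so `return cls(...)` constructs whatever subclass `from_ch` was invoked on, whereas the `@staticmethod` version hard-codes `Series`.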
```python
data_sources = [
    (ds.filename, CreateRunDataSource(data_source=DataSource(dataset=ds.dataset_rid), series_tags={}))
```
will want the ability to add a custom ref name here, with the filename probably as a fallback
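A sketch of that fallback, assuming a hypothetical optional `ref_name` attribute on the dataset object:

```python
data_sources = [
    # prefer the caller-supplied ref name, falling back to the filename
    (
        ds.ref_name or ds.filename,
        CreateRunDataSource(data_source=DataSource(dataset=ds.dataset_rid), series_tags={}),
    )
    for ds in datasets  # hypothetical iterable of dataset objects
]
```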
```python
class Dataset:
    def __init__(self, csv_path: str):
```
let's create a bunch of classmethods for alternative ctors, e.g. `from_csv()`, so that we can later have `from_pandas()`, `from_parquet()`, etc.
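A sketch of that pattern, with only `from_csv()` fully wired (the other constructor is a stub; any conversion details are assumptions):

```python
from __future__ import annotations


class Dataset:
    def __init__(self, csv_path: str):
        self.csv_path = csv_path

    @classmethod
    def from_csv(cls, csv_path: str) -> Dataset:
        # thin wrapper today, but keeps call sites stable if __init__ grows
        return cls(csv_path)

    @classmethod
    def from_parquet(cls, parquet_path: str) -> Dataset:
        # hypothetical: would read/convert parquet before constructing
        raise NotImplementedError
```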
Closing this PR (but not deleting the branch)
Todo: