Documentation updates #15

Closed
wants to merge 15 commits into from

Conversation

normandy7 (Contributor) commented Aug 15, 2024

Base automatically changed from dev/minimal-flow to main August 21, 2024 11:45
@normandy7 marked this pull request as ready for review August 28, 2024 11:56
README.md Outdated
| `creation_time` | `datetime`, optional | `None` | Custom creation time of the run. |
| `from_run_id` | `str`, optional | `None` | If forking off an existing run, ID of the run to fork from. |
| `from_step` | `int`, optional | `None` | If forking off an existing run, step number to fork from. |
| `max_queue_size` | `int`, optional | `None` | Maximum number of operations allowed in the queue. |
Contributor

Suggested change
- | `max_queue_size` | `int`, optional | `None` | Maximum number of operations allowed in the queue. |
+ | `max_queue_size` | `int`, optional | `None` | Maximum number of operations queued for processing. 1M by default. You should raise this value if you see the `on_queue_full_callback` being called. |
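
To make the suggested wording concrete, here is a minimal sketch of a bounded queue that invokes a callback when it is full. It uses only Python's standard `queue` module. The `BoundedOperationQueue` class and the callback signature are hypothetical and are not meant to reflect how `neptune_scale` implements `max_queue_size` or `on_queue_full_callback`.

```python
import queue
from typing import Any, Callable, Optional


class BoundedOperationQueue:
    """Toy stand-in for a client-side operation queue (hypothetical, not neptune_scale code)."""

    def __init__(
        self,
        max_queue_size: int,
        on_queue_full_callback: Optional[Callable[[Any], None]] = None,
    ) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=max_queue_size)
        self._on_queue_full_callback = on_queue_full_callback

    def enqueue(self, operation: Any) -> bool:
        """Try to enqueue without blocking; report a full queue through the callback."""
        try:
            self._queue.put_nowait(operation)
            return True
        except queue.Full:
            if self._on_queue_full_callback is not None:
                self._on_queue_full_callback(operation)
            return False


# With room for two operations, the third one triggers the callback.
rejected = []
ops = BoundedOperationQueue(max_queue_size=2, on_queue_full_callback=rejected.append)
results = [ops.enqueue({"step": i}) for i in range(3)]
```

In this toy model, seeing entries in `rejected` is the signal that the queue size is too small for the logging rate.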

normandy7 and others added 2 commits August 29, 2024 12:46
Co-authored-by: Edyta <142720610+szaganek@users.noreply.github.com>
)

run.log(
metrics={"Metric1": metric1_value, "Metric2": metric2_value},
Contributor

How about using actual values instead? The placeholders don't look very useful right now.

from neptune_scale import Run

run = Run(
family="RunFamilyName",
Contributor

Shouldn't we mention the project and API token (set via environment variables, for instance)?
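
As an illustration of the environment-variable approach, the sketch below checks which settings are missing before a run is created. `NEPTUNE_API_TOKEN` appears in the parameter table in this PR; `NEPTUNE_PROJECT` is an assumed variable name that this thread does not confirm, and `missing_neptune_settings` is a hypothetical helper.

```python
import os
from typing import List, Mapping

# NEPTUNE_API_TOKEN is documented in this PR; NEPTUNE_PROJECT is an assumption.
REQUIRED_VARS = ("NEPTUNE_PROJECT", "NEPTUNE_API_TOKEN")


def missing_neptune_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping, so the result does not depend on the machine.
example_env = {"NEPTUNE_API_TOKEN": "placeholder-token"}
missing = missing_neptune_settings(example_env)

# In a real script, you would check the process environment instead.
missing_in_process = missing_neptune_settings(os.environ)
```

Keeping the token in the environment rather than in source code matches the advice in the `api_token` description.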

| `mode` | `Literal`, `"async"` or `"disabled"` | `"async"` | Mode of operation. If set to `"disabled"`, the run doesn't log any metadata. |
| `as_experiment` | `str`, optional | `None` | Name of the experiment to associate the run with. Learn more about [experiments](https://docs-beta.neptune.ai/experiments) in the Neptune documentation. |
| `creation_time` | `datetime`, optional | `None` | Custom creation time of the run. |
| `from_run_id` | `str`, optional | `None` | If forking off an existing run, ID of the run to fork from. |
Contributor

Maybe we should mention that both `from_step` and `from_run_id` are mandatory if the user wants to fork a run.
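
The "both or neither" rule could be sketched as the check below. `validate_fork_args` is a hypothetical helper written for illustration; it is not part of `neptune_scale`, and the library may validate these arguments differently.

```python
from typing import Optional


def validate_fork_args(from_run_id: Optional[str], from_step: Optional[int]) -> bool:
    """Return True if forking is requested, False if it is not.

    Raises ValueError when only one of the two arguments is given.
    """
    if (from_run_id is None) != (from_step is None):
        raise ValueError("To fork a run, pass both from_run_id and from_step.")
    return from_run_id is not None


not_forking = validate_fork_args(None, None)
forking = validate_fork_args("parent-run", 100)
```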


| Name | Type | Default | Description |
|---------------|----------------------------------------------------|---------|---------------------------------------------------------------------------|
| `step` | `Union[float, int]`, optional | `None` | Index of the log entry. Must be increasing. If not specified, the `log()` call increments the step starting from the highest already logged value. **Tip:** Using float rather than int values can be useful, for example, when logging substeps in a batch. |
Contributor

⚠️ It will not be incremented. We should also mention that the step cannot be lower than `from_step` if the run was forked from another run.
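
One possible reading of these constraints is sketched below. It assumes that "increasing" permits repeating the current step and that a step equal to `from_step` is allowed; both are assumptions, and `check_step` is a hypothetical helper rather than the library's actual validation.

```python
from typing import Optional, Union

Step = Union[int, float]


def check_step(
    step: Step,
    last_step: Optional[Step] = None,
    from_step: Optional[Step] = None,
) -> Step:
    """Validate a step against the previous step and the fork point, if any."""
    if from_step is not None and step < from_step:
        raise ValueError(f"step {step} is lower than from_step {from_step}")
    if last_step is not None and step < last_step:
        raise ValueError(f"step {step} is lower than the last logged step {last_step}")
    return step


same_step = check_step(5, last_step=5)    # repeating a step is accepted here
substep = check_step(0.5, last_step=0)    # float substeps between integer steps
```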

Contributor Author

Then how does this work when omitting the step? Or are we missing that this is connected to metrics (FloatSeries) only?

README.md Outdated
| `timestamp` | `datetime`, optional | `None` | Time of logging the metadata. |
| `fields` | `Dict[str, Union[float, bool, int, str, datetime, list, set]]`, optional | `None` | Dictionary of configs or other values to log. Independent of the step value. Available types: float, integer, Boolean, string, and datetime. To log multiple values at once, pass multiple dictionaries. |
| `metrics` | `Dict[str, float]`, optional | `None` | Dictionary of metrics to log. Each metric value is associated with a step. To log multiple metrics at once, pass multiple dictionaries. |
| `add_tags` | `Dict[str, Union[List[str], Set[str]]]`, optional | `None` | Dictionary of tags to add to the run, as a list of strings. Independent of the step value. |
Contributor

> Independent of the step value.

I'm not sure about this in terms of forking. Let's not mention this phrase for now.

...
```

**Note:** Calling `log()` without specifying the step still increments the index. To correlate logged values, make sure to send all metadata related to a step in a single `log()` call, or specify the step explicitly.
Contributor

Not true

Contributor

@normandy7 It's fine to call log() with a specific step value more than once:

This:

run.log(step=1, metrics={"loss": 0.08})
run.log(step=1, metrics={"acc": 0.86})

is equivalent to:

run.log(step=1, metrics={"loss": 0.08, "acc": 0.86})
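
The equivalence described above can be illustrated with a toy in-memory model that merges metrics by step. `StepStore` is a hypothetical class for illustration only and says nothing about how Neptune stores or processes data.

```python
from collections import defaultdict
from typing import Dict, Union

Step = Union[int, float]


class StepStore:
    """Toy model: metrics logged at the same step are merged into one record."""

    def __init__(self) -> None:
        self.metrics_by_step: Dict[Step, Dict[str, float]] = defaultdict(dict)

    def log(self, step: Step, metrics: Dict[str, float]) -> None:
        self.metrics_by_step[step].update(metrics)


# Two calls with the same step...
split = StepStore()
split.log(step=1, metrics={"loss": 0.08})
split.log(step=1, metrics={"acc": 0.86})

# ...end up with the same content as a single combined call.
combined = StepStore()
combined.log(step=1, metrics={"loss": 0.08, "acc": 0.86})
```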

Contributor Author

Yes, but I'm not sure it helps clarify which part of the note is false. Maybe the problem is that I didn't include a mention of metrics?

Isn't it true that if the step is omitted when calling log_metrics(), the highest index found among logged FloatSeries fields is used as the reference step?

run.wait_for_processing()
run.log(fields={"scores/some_score": some_score_value}) # called once submitted data has been processed
```

Contributor

Error handling

In case an unrecoverable error is encountered, you can terminate the failed run in your error callback.
Note that this will effectively disable processing in-flight operations, as well as logging new data. However,
the training process won't be interrupted.

def my_error_callback(exc):
    run.terminate()
    
run = Run(..., on_error_callback=my_error_callback)
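
The behavior described in this comment can be simulated without the library. In the sketch below, `StubRun` and its `simulate_error()` method are hypothetical stand-ins, not `neptune_scale` code; they only mimic the stated effect that terminating the run stops logging while the training loop keeps going.

```python
from typing import Callable, Dict, List, Optional


class StubRun:
    """Hypothetical stand-in for a run object; not the neptune_scale implementation."""

    def __init__(
        self, on_error_callback: Optional[Callable[[BaseException], None]] = None
    ) -> None:
        self._on_error_callback = on_error_callback
        self.terminated = False
        self.logged: List[Dict[str, float]] = []

    def log(self, metrics: Dict[str, float]) -> None:
        if self.terminated:
            return  # logging is disabled once the run is terminated
        self.logged.append(metrics)

    def terminate(self) -> None:
        self.terminated = True

    def simulate_error(self, exc: BaseException) -> None:
        """Pretend that background processing hit an unrecoverable error."""
        if self._on_error_callback is not None:
            self._on_error_callback(exc)


def my_error_callback(exc: BaseException) -> None:
    run.terminate()


run = StubRun(on_error_callback=my_error_callback)

completed_steps = 0
for step in range(3):
    if step == 1:
        run.simulate_error(RuntimeError("unrecoverable"))
    run.log({"loss": 1.0 / (step + 1)})
    completed_steps += 1  # the training loop itself is not interrupted
```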

Co-authored-by: Edyta <142720610+szaganek@users.noreply.github.com>
> [!NOTE]
> This package only works with the `3.0` version of neptune.ai called Neptune Scale, which is in beta.
>
> It's supported on Linux and MacOS.
Member

Update this note.
The package can be used on Windows if the run is initialized inside the `if __name__ == "__main__":` guard.

Read more in the Python documentation, under "Safe importing of main module": https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming

cc: @Raalsky

Contributor

It's not even specific to Windows. It applies to all operating systems and is a common pitfall of Python itself.

Member

I was able to use it without the guard in Linux 🤔

Contributor

Weird, on Mac it doesn't work without the guard.

Contributor Author

I conclude that we don't need any OS-specific notes for now.

Member

To cover our bases, I'd prefer that we add a debugging/help section of sorts and tell users to initialize the run inside the `if __name__ == "__main__":` guard if they get the error below:

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

)

run.log(
metrics={"Metric1": metric1_value, "Metric2": metric2_value},
Member

Suggested change
- metrics={"Metric1": metric1_value, "Metric2": metric2_value},
+ metrics={"acc": 0.98, "loss": 0.2},


run.log(
metrics={"Metric1": metric1_value, "Metric2": metric2_value},
fields={"Field1": field1_value}
Member

Suggested change
- fields={"Field1": field1_value}
+ fields={"params/lr": 0.01, "params/optimizer": "adam"}

| `api_token` | `str`, optional | `None` | Your Neptune API token or a service account's API token. If `None`, the value of the `NEPTUNE_API_TOKEN` environment variable is used. To keep your token secure, don't place it in source code. Instead, save it as an environment variable. |
| `resume` | `bool`, optional | `False` | If `False` (default), creates a new run. To continue an existing run, set to `True` and pass the ID of an existing run to the `run_id` argument. To fork a run, use `from_run_id` and `from_step` instead. |
| `mode` | `Literal`, `"async"` or `"disabled"` | `"async"` | Mode of operation. If set to `"disabled"`, the run doesn't log any metadata. |
| `as_experiment` | `str`, optional | `None` | Name of the experiment to associate the run with. Learn more about [experiments](https://docs-beta.neptune.ai/experiments) in the Neptune documentation. |
Member

Suggested change
- | `as_experiment` | `str`, optional | `None` | Name of the experiment to associate the run with. Learn more about [experiments](https://docs-beta.neptune.ai/experiments) in the Neptune documentation. |
+ | `as_experiment` | `str`, optional | `None` | Name of the experiment to associate the run with. Learn more about [experiments](https://docs-beta.neptune.ai/experiments) in the Neptune documentation. Max length: 730 bytes |


| Name | Type | Default | Description |
|------------------|------------------|---------|---------------------------------------------------------------------------|
| `family` | `str` | - | Identifies related runs. All runs of the same lineage must have the same `family` value. That is, forking is only possible within the same family. Max length: 128 characters. |
Member

Suggested change
- | `family` | `str` | - | Identifies related runs. All runs of the same lineage must have the same `family` value. That is, forking is only possible within the same family. Max length: 128 characters. |
+ | `family` | `str` | - | Identifies related runs. All runs of the same lineage must have the same `family` value. That is, forking is only possible within the same family. Max length: 128 bytes. |

| Name | Type | Default | Description |
|------------------|------------------|---------|---------------------------------------------------------------------------|
| `family` | `str` | - | Identifies related runs. All runs of the same lineage must have the same `family` value. That is, forking is only possible within the same family. Max length: 128 characters. |
| `run_id` | `str` | - | Identifier of the run. Must be unique within the project. Max length: 128 characters. |
Member

Suggested change
- | `run_id` | `str` | - | Identifier of the run. Must be unique within the project. Max length: 128 characters. |
+ | `run_id` | `str` | - | Identifier of the run. Must be unique within the project. Max length: 128 bytes. |
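
The difference between a character limit and a byte limit matters for non-ASCII identifiers. The sketch below assumes the limit applies to the UTF-8 encoding of the string, which this thread does not state; `utf8_length` and `fits_byte_limit` are hypothetical helpers.

```python
def utf8_length(value: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8 (assumed encoding)."""
    return len(value.encode("utf-8"))


def fits_byte_limit(value: str, max_bytes: int = 128) -> bool:
    """Check a string against a byte limit rather than a character limit."""
    return utf8_length(value) <= max_bytes


ascii_id = "a" * 128          # 128 characters, 128 bytes
accented_id = "\u00e9" * 128  # 128 characters, but each one takes 2 bytes in UTF-8

ascii_ok = fits_byte_limit(ascii_id)
accented_ok = fits_byte_limit(accented_id)
```

Both strings are 128 characters long, but only the ASCII one stays within a 128-byte limit under this assumption.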

@Raalsky mentioned this pull request Sep 3, 2024
@normandy7 closed this Sep 4, 2024
@normandy7 deleted the sabine/docs branch May 26, 2025 13:40
5 participants