Add a new version of the sampling page for Python SDK 3.0 #14275

Draft: wants to merge 3 commits into base: master
29 changes: 27 additions & 2 deletions docs/platforms/python/configuration/sampling.mdx
@@ -83,7 +83,27 @@ The information contained in the <PlatformIdentifier name="sampling-context" />

When using custom instrumentation to create a transaction, you can add data to the <PlatformIdentifier name="sampling-context" /> by passing it as an optional second argument to <PlatformIdentifier name="start-transaction" />. This is useful if there's data to which you want the sampler to have access but which you don't want to attach to the transaction as `tags` or `data`, such as information that's sensitive or that’s too large to send with the transaction. For example:

-<PlatformContent includePath="performance/custom-sampling-context" />
+```python
+sentry_sdk.start_transaction(
+    # kwargs passed to Transaction constructor - will be recorded on transaction
+    name="GET /search",
+    op="search",
+    data={
+        "query_params": {
+            "animal": "dog",
+            "type": "very good"
+        }
+    },
+    # `custom_sampling_context` - won't be recorded
+    custom_sampling_context={
+        # PII
+        "user_id": "12312012",
+        # too big to send
+        "search_results": { ... }
+    }
+)
+```

Comment on lines +86 to +106
Contributor Author


This is the current content of the snippet in performance/custom-sampling-context. Inlined it here so that the snippet can be updated to be 3.x compatible and used from the 3.x page.


## Inheritance

@@ -105,7 +125,12 @@ If you're using a <PlatformIdentifier name="traces-sample-rate" /> rather than a

If you know at transaction creation time whether or not you want the transaction sent to Sentry, you also have the option of passing a sampling decision directly to the transaction constructor (note, not in the <PlatformIdentifier name="custom-sampling-context" /> object). If you do that, the transaction won't be subject to the <PlatformIdentifier name="traces-sample-rate" />, nor will <PlatformIdentifier name="traces-sampler" /> be run, so you can count on the decision that's passed not to be overwritten.

-<PlatformContent includePath="performance/force-sampling-decision" />
+```python
+sentry_sdk.start_transaction(
+    name="GET /search",
+    sampled=True
+)
+```

Comment on lines +129 to +132
Contributor Author

Same here

## Precedence

125 changes: 125 additions & 0 deletions docs/platforms/python/configuration/sampling__v3.x.mdx
@@ -0,0 +1,125 @@
---
title: Sampling
description: "Learn how to configure the volume of error and transaction events sent to Sentry."
sidebar_order: 60
---

Adding Sentry to your app gives you a great deal of very valuable information about errors and performance you wouldn't otherwise get. And lots of information is good -- as long as it's the right information, at a reasonable volume.

## Sampling Error Events

To send a representative sample of your errors to Sentry, set the <PlatformIdentifier name="sample-rate" /> option in your SDK configuration to a number between `0` (0% of errors sent) and `1` (100% of errors sent). This is a static rate, which will apply equally to all errors. For example, to sample 25% of your errors:

<PlatformContent includePath="configuration/sample-rate" />

The error sample rate defaults to `1.0`, meaning all errors are sent to Sentry.

Changing the error sample rate requires re-deployment. In addition, setting an SDK sample rate limits visibility into the source of events. Setting a [rate limit](/pricing/quotas/manage-event-stream-guide/#rate-limiting) for your project (which only drops events when volume is high) may better suit your needs.

### Dynamically Sampling Error Events

To sample error events dynamically, set the <PlatformIdentifier name="error-sampler" /> to a function that returns the desired sample rate for the event. The <PlatformIdentifier name="error-sampler" /> takes two arguments, <PlatformIdentifier name="event" /> and <PlatformIdentifier name="hint" />. `event` is the [Event](https://github.com/getsentry/sentry-python/blob/master/sentry_sdk/_types.py) that will be sent to Sentry, and `hint` includes Python's [sys.exc_info()](https://docs.python.org/3/library/sys.html#sys.exc_info) information in `hint["exc_info"]`.

<Alert>

Your <PlatformIdentifier name="error-sampler" /> function **must return a valid value**. A valid value is either:

- A **floating-point number** between `0.0` and `1.0` (inclusive) indicating the probability an error gets sampled, **or**
- A **boolean** indicating whether or not to sample the error.

</Alert>

One potential use case for the <PlatformIdentifier name="error-sampler" /> is to apply different sample rates for different exception types. For instance, if you would like to sample some exception called `MyException` at 50%, discard all events of another exception called `MyIgnoredException`, and sample all other exception types at 100%, you could use the following code when initializing the SDK:

<PlatformContent includePath="configuration/error-sampler" />
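
For illustration, the use case above can be sketched as a plain function (this is not the content of the `configuration/error-sampler` include). `MyException` and `MyIgnoredException` are the hypothetical exception types named in the text, and the sketch assumes `hint["exc_info"]` holds the `(type, value, traceback)` tuple from `sys.exc_info()`:

```python
class MyException(Exception):
    """Hypothetical exception type to sample at 50%."""


class MyIgnoredException(Exception):
    """Hypothetical exception type whose events are always discarded."""


# Rates from the use case above; any type not listed is sampled at 100%.
EXCEPTION_SAMPLE_RATES = {
    MyException: 0.5,
    MyIgnoredException: 0.0,
}


def error_sampler(event, hint):
    """Return the sample rate for an error event based on its exception type."""
    exc_info = hint.get("exc_info")
    if not exc_info:
        # No exception attached to this event: keep it.
        return 1.0
    exc_type = exc_info[0]
    for known_type, rate in EXCEPTION_SAMPLE_RATES.items():
        # issubclass() lets subclasses of a listed exception share its rate.
        if isinstance(exc_type, type) and issubclass(exc_type, known_type):
            return rate
    return 1.0


# The function would then be passed at initialization, for example:
# sentry_sdk.init(..., error_sampler=error_sampler)
```

Matching with `issubclass()` rather than an exact type lookup is a design choice of this sketch; use an exact comparison if subclasses should fall back to the default rate.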

<Alert>

You can define at most one of the <PlatformIdentifier name="error-sampler" /> and the <PlatformIdentifier name="sample-rate" />. If both are set, the <PlatformIdentifier name="error-sampler" /> will control sampling, and the <PlatformIdentifier name="sample-rate" /> will be ignored.

</Alert>

## Sampling Transaction Events

We recommend sampling your transactions (root spans) for two reasons:

1. Capturing a single trace involves minimal overhead, but capturing traces for _every_ page load or _every_ API request may add an undesirable load to your system.
2. Enabling sampling allows you to better manage the number of events sent to Sentry, so you can tailor your volume to your organization's needs.

Choose a sampling rate that balances performance and volume concerns against data accuracy. You don't want to collect _too_ much data, but you want to collect enough to draw meaningful conclusions. If you’re not sure what rate to choose, start with a low value and gradually increase it as you learn more about your traffic patterns and volume.

## Configuring the Transaction Sample Rate

The Sentry SDKs have two configuration options to control the volume of transactions sent to Sentry, allowing you to take a representative sample:

1. Uniform sample rate (<PlatformIdentifier name="traces-sample-rate" />):
   - Provides an even cross-section of transactions, no matter where in your app or under what circumstances they occur.
   - Uses default [inheritance](#inheritance) and [precedence](#precedence) behavior
2. Sampling function (<PlatformIdentifier name="traces-sampler" />) which:
   - Samples different transactions at different rates
   - <PlatformLink to="/configuration/filtering/">Filters</PlatformLink> out some
     transactions entirely
   - Modifies default [precedence](#precedence) and [inheritance](#inheritance) behavior

By default, neither of these options is set, meaning no transactions will be sent to Sentry. You must set one of them to start sending transactions.

### Setting a Uniform Sample Rate

<PlatformContent includePath="performance/uniform-sample-rate" />

### Setting a Sampling Function

<PlatformContent includePath="performance/sampling-function-intro" />

## Sampling Context Data

### Default Sampling Context Data

The information contained in the <PlatformIdentifier name="sampling-context" /> object passed to the <PlatformIdentifier name="traces-sampler" /> when a root span is created varies by integration.

<PlatformContent includePath="performance/default-sampling-context" />

### Custom Sampling Context Data

When using custom instrumentation to create a root span, you can add data to the <PlatformIdentifier name="sampling-context" /> by providing additional `attributes` to <PlatformIdentifier name="start-span" />. All span attributes provided at span start are accessible via the <PlatformIdentifier name="sampling-context" /> and will also ultimately be sent to Sentry. If you want to exclude an attribute, you can filter it out in a <PlatformLink to="/configuration/filtering">`before_send`</PlatformLink>.

<PlatformContent includePath="performance/custom-sampling-context" />

## Inheritance

Whatever a root span's sampling decision, that decision will be passed to its child spans and from there to any root spans they subsequently cause in other services.

(See <PlatformLink to="/tracing/trace-propagation/">Distributed Tracing</PlatformLink> for more about how that propagation is done.)

If the root span currently being created is one of those subsequent root spans (in other words, if it has a parent root span), the upstream (parent) sampling decision will be included in the sampling context data. Your <PlatformIdentifier name="traces-sampler" /> can use this information to choose whether to inherit that decision. In most cases, inheritance is the right choice, to avoid breaking distributed traces. A broken trace will not include all your services.

<PlatformContent includePath="performance/always-inherit-sampling-decision">

In some SDKs, for convenience, the <PlatformIdentifier name="traces-sampler" /> function can return a boolean, so that a parent's decision can be returned directly if that's the desired behavior.

</PlatformContent>

If you're using a <PlatformIdentifier name="traces-sample-rate" /> rather than a <PlatformIdentifier name="traces-sampler" />, the decision will always be inherited. The provided <PlatformIdentifier name="traces-sample-rate" /> will only be used to generate a sample rate if there is no sampling decision coming in from upstream.
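
A minimal sketch of a sampling function that always inherits the upstream decision might look like the following. It assumes the parent's decision is exposed in the sampling context under a `parent_sampled` key (the key name is an assumption here, not confirmed by this page), and `DEFAULT_RATE` is a hypothetical fallback value:

```python
DEFAULT_RATE = 0.25  # hypothetical rate for traces that start in this service


def traces_sampler(sampling_context):
    """Inherit the upstream decision when there is one; otherwise use a fixed rate."""
    # Assumption: the upstream decision is available as "parent_sampled"
    # (True or False), and is None or absent when there is no parent.
    parent_sampled = sampling_context.get("parent_sampled")
    if parent_sampled is not None:
        # Returning a boolean keeps or drops the root span outright.
        return bool(parent_sampled)
    return DEFAULT_RATE
```

Returning the parent's boolean directly keeps every service in a distributed trace on the same decision, which is the behavior the paragraph above recommends.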

## Forcing a Sampling Decision

If you know at span creation time whether or not you want the root span (transaction) sent to Sentry, you also have the option of passing a sampling decision directly in the `start_span` API. If you do that, the root span won't be subject to the <PlatformIdentifier name="traces-sample-rate" />, nor will <PlatformIdentifier name="traces-sampler" /> be run, so you can count on the decision that's passed not to be overwritten.

<PlatformContent includePath="performance/force-sampling-decision" />

## Precedence

There are multiple ways for a root span (transaction) to end up with a sampling decision.

- Random sampling according to a static sample rate set in <PlatformIdentifier name="traces-sample-rate" />
- Random sampling according to a sample rate returned by the <PlatformIdentifier name="traces-sampler" /> function
- Absolute decision (100% chance or 0% chance) returned by <PlatformIdentifier name="traces-sampler" />
- If the transaction has a parent, inheriting its parent's sampling decision
- Absolute decision passed to <PlatformIdentifier name="start-span" />

When there's the potential for more than one of these to come into play, the following precedence rules apply:

1. If a sampling decision is passed to <PlatformIdentifier name="start-span" />, that decision will be used, overriding everything else.
2. If <PlatformIdentifier name="traces-sampler" /> is defined, its decision will be used. It can choose to keep or ignore any parent sampling decision, use the sampling context data to make its own decision, or choose a sample rate for the transaction. We advise against overriding the parent sampling decision because it will break distributed traces.
3. If <PlatformIdentifier name="traces-sampler" /> is not defined, but there's a parent sampling decision, the parent sampling decision will be used.
4. If <PlatformIdentifier name="traces-sampler" /> is not defined and there's no parent sampling decision, <PlatformIdentifier name="traces-sample-rate" /> will be used.
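
The four precedence rules can be sketched as a single function. This is an illustration of the ordering only, not the SDK's implementation; `resolve_sample_rate` and its parameters are hypothetical names, and the `parent_sampled` key is an assumption about how the parent decision appears in the sampling context:

```python
from typing import Callable, Optional, Union

SamplerResult = Union[bool, float]


def resolve_sample_rate(
    explicit_sampled: Optional[bool] = None,
    traces_sampler: Optional[Callable[[dict], SamplerResult]] = None,
    sampling_context: Optional[dict] = None,
    traces_sample_rate: Optional[float] = None,
) -> float:
    """Return the probability (0.0 to 1.0) that a root span is sampled."""
    context = sampling_context or {}
    # 1. An explicit decision passed at span start overrides everything else.
    if explicit_sampled is not None:
        return float(explicit_sampled)
    # 2. A sampling function, if defined, makes the decision.
    if traces_sampler is not None:
        return float(traces_sampler(context))
    # 3. Without a sampling function, a parent decision is inherited.
    parent_sampled = context.get("parent_sampled")
    if parent_sampled is not None:
        return float(parent_sampled)
    # 4. Otherwise the uniform rate applies; if it is unset, nothing is sent.
    return float(traces_sample_rate) if traces_sample_rate is not None else 0.0
```

For example, `resolve_sample_rate(sampling_context={"parent_sampled": False}, traces_sample_rate=0.9)` yields `0.0`, because the inherited decision outranks the uniform rate.
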
24 changes: 12 additions & 12 deletions platform-includes/performance/custom-sampling-context/python.mdx
@@ -1,20 +1,20 @@
 ```python
-sentry_sdk.start_transaction(
-    # kwargs passed to Transaction constructor - will be recorded on transaction
+sentry_sdk.start_span(
     name="GET /search",
     op="search",
     data={
-      "query_params": {
-        "animal": "dog",
-        "type": "very good"
-      }
+        "query_params": {
+            "animal": "dog",
+            "type": "very good"
+        }
     },
-    # `custom_sampling_context` - won't be recorded
-    custom_sampling_context={
-        # PII
+    # Attributes defined at root span start will be accessible in traces_sampler
+    # and they will be sent to Sentry unless filtered out manually.
+    # Note that these can only be primitive types or a list of a single primitive
+    # type.
+    attributes={
         "user_id": "12312012",
-        # too big to send
-        "search_results": { ... }
+        "foo": "bar",
     }
-);
+)
 ```
10 changes: 6 additions & 4 deletions platform-includes/performance/force-sampling-decision/python.mdx
@@ -1,6 +1,8 @@
 ```python
-sentry_sdk.start_transaction(
-  name="GET /search",
-  sampled=True
-);
+sentry_sdk.start_span(
+    name="GET /search",
+    # Sample this span. Note that the `sampled` parameter is only taken into
+    # account for root-level spans (transactions).
+    sampled=True,
+)
 ```