| 1 | +The deletion subsystem manages asynchronous scheduled bulk deletes as well as cascading deletes |
| 2 | +into relations. When adding new models to the application, you should consider how those records will |
| 3 | +be deleted when a project or organization is deleted. |
| 4 | + |
| 5 | +The deletion subsystem uses records in PostgreSQL to track deletions and their status. This also |
| 6 | +allows deletions to be retried when a deploy interrupts a deletion task, or a deletion job fails |
| 7 | +because of a new relation or database failure. |
| 8 | + |
| 9 | +## Taskbroker Tasks |
| 10 | + |
| 11 | +Every 15 minutes `sentry.tasks.deletion.run_scheduled_deletions()` runs. This task queries for jobs |
| 12 | +whose scheduled time has passed and that are not already in progress. A task is spawned for |
| 13 | +each deletion that needs to be processed. |
| 14 | + |
| 15 | +If a task fails, the daily run of `sentry.tasks.deletion.reattempt_deletions()` will |
| 16 | +clear the `in_progress` flag of old jobs so that they are picked up by the next scheduled run. |
| 17 | + |
| 18 | +## Scheduling Deletions |
| 19 | + |
| 20 | +For most application code, the entry point into deletions is the `ScheduledDeletion` |
| 21 | +model. This model lets you create deletion jobs that are run in the future. |
| 22 | + |
| 23 | +```python |
| 24 | +from sentry.deletions.models.scheduleddeletion import ScheduledDeletion |
| 25 | + |
| 26 | +ScheduledDeletion.schedule(organization, days=1, hours=2) |
| 27 | +``` |
| 28 | + |
| 29 | +The above would schedule an organization to be deleted in 1 day and 2 hours. |
| 30 | + |
| 31 | +## Deletion Tasks |
| 32 | + |
| 33 | +The deletion system provides two base classes to cover common scenarios: |
| 34 | + |
| 35 | +- `ModelDeletionTask` fetches records and deletes each instance individually. |
| 36 | + - This strategy is good for models that rely on Django signals or have child relations. |
| 37 | + - This strategy is also the default used when a deletion task isn't specified for a model. |
| 38 | +- `BulkModelDeletionTask` deletes records in bulk using a single query. |
| 39 | + - This strategy is well suited to removing records that don't have any relations. |
| 40 | + |
| 41 | +If your model has child relations that need to be cleaned up, you should implement a custom |
| 42 | +deletion task. Doing so requires two steps: |
| 43 | + |
| 44 | +1. Add your deletion task subclass to `sentry.deletions.defaults`. |
| 45 | +2. Add your deletion task to the default manager mapping in `sentry.deletions.__init__`. |
| 46 | + |
| 47 | +## Undoing Deletions |
| 48 | + |
| 49 | +If you have scheduled a record for deletion and want to be able to cancel that deletion, your |
| 50 | +deletion task needs to implement the `should_proceed` hook. |
| 51 | + |
| 52 | +```python |
| 53 | +def should_proceed(self, instance: ModelT) -> bool: |
| 54 | +    return instance.status in { |
| 55 | +        ObjectStatus.PENDING_DELETION, |
| 56 | +        ObjectStatus.DELETION_IN_PROGRESS, |
| 57 | +    } |
| 58 | +``` |
| 59 | + |
| 60 | +The above only proceeds with the deletion if the record is still marked as pending deletion or |
| 61 | +deletion in progress. When a deletion is cancelled by this hook, the `ScheduledDeletion` row is removed. |
| 62 | + |
| 63 | +## Using Deletions Manager Directly |
| 64 | + |
| 65 | +You can also run a deletion task directly through the deletions manager. For example, to delete an organization: |
| 66 | + |
| 67 | +```python |
| 68 | +from sentry import deletions |
| 69 | +task = deletions.get(model=Organization, query={}) |
| 70 | +work = True |
| 71 | +while work: |
| 71 | +    work = task.chunk() |
| 73 | +``` |
| 74 | + |
| 75 | +The system has a default task implementation for Organization, which will efficiently cascade |
| 76 | +deletes. This behavior varies based on the input object, as the task can override the behavior for |
| 77 | +its children. |
| 78 | + |
| 79 | +For example, when you delete a Group, it cascades in a more traditional manner: it batches |
| 80 | +each child (such as Event). However, when you delete a Project, it won't actually cascade to the |
| 81 | +registered Group task. Instead, it batch deletes its indirect descendants, such as Event, directly, |
| 82 | +which lets it bulk delete rows more efficiently. |
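
The lifecycle described in this README (select due jobs, retry stalled ones, delete in chunks) can be illustrated with a small standalone sketch. This is not Sentry's implementation: `ToyJob`, `due_jobs`, `reattempt`, `ToyChunkedTask`, and the one-day `max_age` threshold are all hypothetical names and values chosen for illustration, and the sketch uses in-memory lists instead of PostgreSQL rows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ToyJob:
    """Hypothetical stand-in for a scheduled deletion row."""

    name: str
    date_scheduled: datetime
    in_progress: bool = False


def due_jobs(jobs, now):
    """Select jobs whose scheduled time has passed and that are not running."""
    return [j for j in jobs if j.date_scheduled <= now and not j.in_progress]


def reattempt(jobs, now, max_age=timedelta(days=1)):
    """Clear in_progress on old jobs so the next scheduled run retries them.

    The max_age cutoff is an assumption; the README only says "old jobs".
    """
    for job in jobs:
        if job.in_progress and now - job.date_scheduled > max_age:
            job.in_progress = False


class ToyChunkedTask:
    """Deletes rows in fixed-size batches; chunk() reports remaining work."""

    def __init__(self, rows, chunk_size=2):
        self.rows = rows
        self.chunk_size = chunk_size

    def chunk(self):
        # Remove one batch in place, then report whether anything is left.
        del self.rows[: self.chunk_size]
        return bool(self.rows)


now = datetime(2024, 1, 2, 12, 0)
jobs = [
    ToyJob("org-a", now - timedelta(hours=1)),  # due
    ToyJob("org-b", now + timedelta(hours=1)),  # scheduled in the future
    ToyJob("org-c", now - timedelta(days=2), in_progress=True),  # stalled
]

first_run = [j.name for j in due_jobs(jobs, now)]
reattempt(jobs, now)  # the stalled job becomes eligible again
second_run = [j.name for j in due_jobs(jobs, now)]

# Same loop shape as the manager example: keep chunking until no work remains.
rows = list(range(5))
task = ToyChunkedTask(rows)
work = True
passes = 0
while work:
    work = task.chunk()
    passes += 1
```

In this toy model the stalled job is skipped on the first pass and only becomes eligible after `reattempt` clears its flag, which mirrors why an interrupted deletion is retried rather than lost.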