MSC1763: Proposal for specifying configurable message retention periods #1763
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,388 @@ | ||
# Proposal for specifying configurable per-room message retention periods. | ||
I’m sensing an innate conflict within this MSC’s interests, one where it both wants to reduce server history in rooms, but where it also simultaneously expects to be able to fetch that history from thin air at any convenient time. I have a feeling it’s written with the underlying idea that large servers will carry all the events in the federation, with some servers being able to fetch from those at any time. …however, this is mentioned nowhere in the MSC, where it skirts around these problems by putting these assumptions between the lines, while not thinking critically about what this means for the larger federation; more dependency on large servers. With this, it does not bring a lucid solution to the problem of dealing with history retention, one where any server eventually has to face that it cannot fetch events it knows exist(ed), but is now expected to respond with them to a client’s query. The semantic equivalent of HTTP Error 410 (“gone”) has to exist somewhere here, to be able to tell clients it’s unable to fetch a historical event due to history retention, and all sad and happy paths that spring from that. The current stance against this is “you’re SOL, have a 404 with no context”. I don’t see this MSC deal with the reality that it is deleting events, I don’t see a coherent solution to allow some servers to “archive” history, and make that explicit (also in the rooms, for privacy concerns, for people who wanna know which servers are ignoring retention rules and archiving anyways). Servers ignoring retention rules do have a basis, namely one of actually archiving historic conversations, in a similar philosophy as The Internet Archive. 
If this MSC were to go through as-is, then we’d have a similar situation as the general internet, namely one where all history is lost to time due to individual retention strategies. While reliance on large servers isn’t what a federation would want, an explicit form of mentioning where at least people are aware which servers are backing up, and which ones aren’t, would help this MSC greatly in the long run. |
||
|
||
A major shortcoming of Matrix has been the inability to specify how long events | ||
should be stored by the servers and clients which participate in a given room. | ||
|
||
This proposal aims to specify a simple yet flexible set of rules which allow | ||
users, room admins and server admins to determine how long data should be stored | ||
for a room, from the perspective of respecting the privacy requirements of that | ||
room (which may range from a "burn after reading" ephemeral conversation, | ||
through to FOIA-style public record keeping requirements). | ||
|
||
As well as enforcing privacy requirements, these rules provide a way for server | ||
administrators to better manage disk space (e.g. to enforce rules such as "don't | ||
store remote events for public rooms for more than a month"). | ||
|
||
This proposal originally tried to also define semantics for per-message | ||
retention as well as per-room; this has been split out into | ||
[MSC2228](https://github.com/matrix-org/matrix-doc/pull/2228) in order to get | ||
the easier per-room semantics landed. | ||
|
||
|
||
## Problem | ||
|
||
Matrix is inherently a protocol for storing and synchronising conversation | ||
history, and various parties may wish to control how long that history is stored | ||
for. | ||
|
||
Room administrators, for instance, may wish to control how long a message can be | ||
stored (e.g. to comply with corporate/legal requirements to store message | ||
history for at least a specific amount of time), or how early a message can be | ||
deleted (e.g. to address privacy concerns of the room's members, to avoid | ||
messages staying in the public record forever, or to comply with corporate/legal | ||
requirements to only store specific kinds of information for a limited amount of | ||
time). | ||
|
||
Additionally, server administrators may also wish to control how long message | ||
history is kept in order to better manage their server's disk space, or to | ||
enforce corporate/legal requirements for the organisation managing the server. | ||
|
||
We would like to provide this behaviour whilst also ensuring that users | ||
generally see a consistent view of message history, without lots of gaps and | ||
one-sided conversations where messages have been automatically removed. | ||
|
||
We would also like to set the expectation that rooms typically have a long | ||
message retention - allowing those who wish to use Matrix to act as an archive | ||
of their conversations to do so. If everyone starts defaulting their rooms to | ||
finite retention periods, then the value of Matrix as a knowledge repository is | ||
broken. | ||
|
||
This proposal does not try to solve the problems of: | ||
* GDPR erasure (as this involves retrospectively changing the lifetime of | ||
messages) | ||
|
||
* Bulk redaction (e.g. to remove all messages from an abusive user in a room, | ||
as again this is retrospectively changing message lifetime) | ||
* Specifying history retention based on the number of messages (as opposed to | ||
their age) in a room. This is descoped because it is effectively a disk space | ||
management problem for a given server or client, rather than a policy | ||
problem of the room. It can be solved in an implementation-specific manner, or | ||
a new MSC can be proposed to standardise letting clients specify disk quotas | ||
per room. | ||
* Per-message retention (as having a mix of message lifetime within a room | ||
complicates implementation considerably - for instance, you cannot just | ||
purge arbitrary events from the DB without fracturing the DAG of the room, | ||
and so a different approach is required) | ||
|
||
|
||
## Proposal | ||
|
||
### Per-room retention | ||
|
||
We introduce a `m.room.retention` state event, which room admins or moderators | ||
can set to mandate the history retention behaviour for a given room. It follows | ||
the default PL semantics for a state event (requiring PL of 50 by default to be | ||
set). Its state key is an empty string (`""`). | ||
|
||
The following fields are defined in the `m.room.retention` contents: | ||
|
||
* `max_lifetime`: the maximum duration in milliseconds for which a server must | ||
store events in this room. Must be null or an integer in range [0, | ||
2<sup>53</sup>-1]. If absent or null, should be interpreted as not setting an | ||
upper bound to the room's retention policy. | ||
|
||
* `min_lifetime`: the minimum duration in milliseconds for which a server should | ||
store events in this room. Must be null or an integer in range [0, | ||
2<sup>53</sup>-1]. If absent or null, should be interpreted as not setting a | ||
lower bound to the room's retention policy. | ||
|
||
In the instance of both `max_lifetime` and `min_lifetime` being provided, | ||
`max_lifetime` must always be greater than or equal to `min_lifetime`. | ||
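
The constraints on the two fields can be sketched as a small validation routine. This is only an illustration: the function name `validate_retention_content` and the wording of the error strings are hypothetical and not part of this proposal.

```python
MAX_SAFE_INTEGER = 2**53 - 1  # upper bound of the range given in the proposal


def validate_retention_content(content):
    """Return a list of problems found in an m.room.retention content dict.

    Hypothetical helper: an empty list means the content is acceptable.
    """
    problems = []
    for field in ("min_lifetime", "max_lifetime"):
        value = content.get(field)
        if value is None:
            continue  # absent or null: no bound on that side
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{field} must be null or an integer")
        elif not 0 <= value <= MAX_SAFE_INTEGER:
            problems.append(f"{field} out of range")

    min_lifetime = content.get("min_lifetime")
    max_lifetime = content.get("max_lifetime")
    # Only compare the two bounds when both are present and individually valid
    if not problems and min_lifetime is not None and max_lifetime is not None:
        if max_lifetime < min_lifetime:
            problems.append("max_lifetime must be >= min_lifetime")
    return problems
```

What a server does with content that fails these checks is not sketched here.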
|
||
|
||
For instance: | ||
|
||
```json | ||
{ | ||
"max_lifetime": 86400000 | ||
} | ||
``` | ||
|
||
The above example means that servers receiving messages in this room should | ||
store the event for only 86400000 milliseconds (1 day), as measured from that | ||
event's `origin_server_ts`, after which they MUST purge all references to that | ||
event (e.g. from their db and any in-memory queues). | ||
|
||
We consciously do not redact the event, as we are trying to eliminate metadata | ||
and save disk space at the cost of deliberately discarding older messages from | ||
the DAG. | ||
|
||
```json | ||
{ | ||
"min_lifetime": 2419200000 | ||
} | ||
``` | ||
|
||
The above example means that servers receiving this message SHOULD store the | ||
event forever, but can choose to purge their copy after 28 days (or longer) in | ||
order to reclaim disk space. | ||
|
||
```json | ||
{ | ||
"min_lifetime": 2419200000, | ||
"max_lifetime": 15778800000 | ||
} | ||
``` | ||
|
||
The above example means that servers SHOULD store their copy of the event for at least 28 | ||
days after it has been sent, and MUST delete it at the latest after 6 months. | ||
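
For readers checking the millisecond values used in the examples above, the arithmetic can be reproduced as follows. The constant names are illustrative only. The observation that the "6 months" value equals half of a 365.25-day year comes from the arithmetic, not from a definition stated in this proposal.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000       # 86400000
# A 365.25-day year, written as 1461/4 days to stay in integer arithmetic
MS_PER_YEAR = 1461 * MS_PER_DAY // 4   # 31557600000

ONE_DAY = MS_PER_DAY                   # the `max_lifetime` of the first example
TWENTY_EIGHT_DAYS = 28 * MS_PER_DAY    # the `min_lifetime` of the later examples
SIX_MONTHS = MS_PER_YEAR // 2          # the `max_lifetime` of the last example
```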
|
||
|
||
## Server-defined retention | ||
|
||
Server administrators can benefit from a few capabilities to control how long | ||
history is stored: | ||
|
||
* the ability to set a default retention policy for rooms that don't have a | ||
retention policy defined in their state | ||
* the ability to override the retention policy for a room | ||
* the ability to cap the effective `max_lifetime` and `min_lifetime` of the rooms the | ||
server is in | ||
|
||
The implementation of these capabilities in the server is left as an | ||
implementation detail. | ||
|
||
We introduce the following authenticated endpoint to allow clients to enquire | ||
about how the server implements this policy: | ||
|
||
|
||
``` | ||
GET /_matrix/client/v3/retention/configuration | ||
``` | ||
|
||
200 response properties: | ||
|
||
* `policies` (required): An object mapping room IDs to a retention policy. If | ||
the room ID is `*`, the associated policy is the default policy. Each policy | ||
follows the format for the content of an `m.room.retention` state event. | ||
* `limits` (required): An object defining the limits to apply to policies | ||
defined by `m.room.retention` state events. This object has two optional | ||
properties, `min_lifetime` and `max_lifetime`, which each define a limit to | ||
the equivalent property of the state events' content. Each limit defines an | ||
optional `min` (the minimum value, in milliseconds) and an optional `max` (the | ||
maximum value, in milliseconds). | ||
|
||
If both `policies` and `limits` are included in the response, the policies | ||
specified in `policies` __must__ comply with the limits defined in `limits`. | ||
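
As a rough sketch of what "comply" could mean here, the following checks one policy against a `limits` object of the shape described above. The function name `complies_with_limits` is hypothetical. Treating an unset value as compliant is an assumption, since the proposal does not say how a missing value is compared against a limit in this context.

```python
def complies_with_limits(policy, limits):
    """Return True if every value set in `policy` lies within `limits`."""
    for field in ("min_lifetime", "max_lifetime"):
        value = policy.get(field)
        if value is None:
            continue  # assumption: an unset value is not checked against the limit
        limit = limits.get(field) or {}
        low, high = limit.get("min"), limit.get("max")
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


# Usage: a default policy checked against an upper limit on max_lifetime
limits = {"max_lifetime": {"max": 15778800000}}
default_policy = {"min_lifetime": 86400000, "max_lifetime": 15778800000}
default_ok = complies_with_limits(default_policy, limits)
```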
|
||
Example response: | ||
|
||
```json | ||
{ | ||
"policies": { | ||
"*": { | ||
"max_lifetime": 15778800000 | ||
}, | ||
"!someroom:test": { | ||
"min_lifetime": 2419200000, | ||
"max_lifetime": 15778800000 | ||
} | ||
}, | ||
"limits": { | ||
"min_lifetime": { | ||
"min": 86400000, | ||
"max": 172800000 | ||
}, | ||
"max_lifetime": { | ||
"min": 7889400000, | ||
"max": 15778800000 | ||
} | ||
} | ||
} | ||
``` | ||
|
||
In this example, the server is configured with: | ||
|
||
* a default policy with a `max_lifetime` of 6 months and no `min_lifetime` (i.e. messages | ||
can only be kept up to 6 months after they have been sent) | ||
* an override for the retention policy in room `!someroom:test` | ||
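* limits on `min_lifetime` that constrain it to between 1 and 2 days | ||
* limits on `max_lifetime` that constrain it to between 3 and 6 months | ||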
|
||
Example response with no policy or limit set: | ||
|
||
```json | ||
{ | ||
"policies": {}, | ||
"limits": {} | ||
} | ||
``` | ||
|
||
Example response with only a default policy and an upper limit on `max_lifetime`: | ||
|
||
```json | ||
{ | ||
"policies": { | ||
"*": { | ||
"min_lifetime": 86400000, | ||
"max_lifetime": 15778800000 | ||
} | ||
}, | ||
"limits": { | ||
"max_lifetime": { | ||
"max": 15778800000 | ||
} | ||
} | ||
} | ||
``` | ||
|
||
### Defining the effective retention policy of a room | ||
|
||
In this section, as well as in the rest of this document, we define the | ||
"effective retention policy" of a room as the retention policy that is used to | ||
determine whether an event should be deleted or not. This may be the policy | ||
determined by the `m.room.retention` event in the state of the room, but it | ||
might not be depending on limits set by the homeserver. | ||
|
||
The algorithm that implementations must follow to determine the effective retention | ||
policy of a room is: | ||
|
||
|
||
* if the homeserver defines a specific retention policy for this room, then use | ||
  this policy as the effective retention policy of the room. | ||
* otherwise, if the state of the room does not include a `m.room.retention` | ||
  event with an empty state key: | ||
  * if the homeserver defines a default retention policy, then use this policy | ||
    as the effective retention policy of the room. | ||
  * if the homeserver does not define a default retention policy, then don't | ||
    apply a retention policy in this room. | ||
* otherwise, if the state of the room includes a `m.room.retention` event with | ||
  an empty state key: | ||
  * if no limit is set by the homeserver, use the policy in the state of the | ||
    room as the effective retention policy of the room. | ||
  * for `min_lifetime` and `max_lifetime`: | ||
    * if there is no limit for the property, use the value specified in the | ||
      room's state for the effective retention policy of the room (if any). | ||
    * if there is a limit for the property: | ||
      * if the value specified in the room's state complies with the | ||
        limit, use this value for the effective retention policy of the | ||
        room. | ||
      * if the value specified in the room's state is lower than the | ||
        limit's `min` value, use the `min` value for the effective | ||
        retention policy of the room. | ||
      * if the value specified in the room's state is greater than the | ||
        limit's `max` value, use the `max` value for the effective | ||
        retention policy of the room. | ||
      * if there is no value specified in the room's state, use the | ||
        limit's `min` value for the effective retention policy of the | ||
        room (which can be null or absent). | ||
* otherwise, don't apply a retention policy in this room. | ||
|
||
So, for example, if a homeserver defines a lower limit on `max_lifetime` of | ||
`86400000` (a day) and no limit on `min_lifetime`, and a room's retention policy | ||
is the following: | ||
|
||
```json | ||
{ | ||
"max_lifetime": 43200000, | ||
"min_lifetime": 21600000 | ||
} | ||
``` | ||
|
||
Then the effective retention policy of the room is: | ||
|
||
```json | ||
{ | ||
"max_lifetime": 86400000, | ||
"min_lifetime": 21600000 | ||
} | ||
``` | ||
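
The algorithm described in this section can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `effective_policy` and its four parameters are hypothetical, and how a homeserver stores its per-room override, default policy and limits is an implementation detail. `None` stands for "not defined" throughout.

```python
def effective_policy(server_room_policy, server_default_policy, limits, room_state_policy):
    """Compute the effective retention policy of a room.

    server_room_policy:    policy the homeserver defines for this room, or None
    server_default_policy: the homeserver's default policy, or None
    limits:                the homeserver's limits object, or None
    room_state_policy:     content of the room's m.room.retention event, or None
    Returns a policy dict, or None when no retention policy applies.
    """
    if server_room_policy is not None:
        return server_room_policy
    if room_state_policy is None:
        return server_default_policy  # None means no policy is applied

    effective = {}
    for field in ("min_lifetime", "max_lifetime"):
        value = room_state_policy.get(field)
        limit = (limits or {}).get(field)
        if limit:
            low, high = limit.get("min"), limit.get("max")
            if value is None:
                value = low  # fall back to the limit's min, which may be unset
            elif low is not None and value < low:
                value = low
            elif high is not None and value > high:
                value = high
        if value is not None:
            effective[field] = value
    return effective


# Usage: the example given above (lower limit of one day on max_lifetime)
result = effective_policy(
    None,
    None,
    {"max_lifetime": {"min": 86400000}},
    {"max_lifetime": 43200000, "min_lifetime": 21600000},
)
```

With these inputs, `result` matches the effective retention policy shown in the example above.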
|
||
|
||
## Enforcing a retention policy | ||
|
||
Retention is only considered for non-state events. Retention is also not | ||
considered for the most recent event in a room, in order to allow a new event | ||
sent to that room to reference it in its `prev_events`. | ||
|
||
When purging events in a room, only the latest retention policy state event in | ||
that room is considered. This means that in a room where the history looks like | ||
the following (oldest event first): | ||
|
||
1. Retention policy A | ||
2. Event 1 | ||
3. Event 2 | ||
4. Retention policy B | ||
|
||
Then the retention policy B is used to determine the effective retention that | ||
defines whether events 1 and 2 should be purged, even though they were sent when | ||
the retention policy A was in effect. This is to avoid creating holes in the | ||
room's DAG caused by events in the middle of the timeline being subject to a | ||
lower `max_lifetime` than other events being sent before and after them. Such | ||
holes would make it more difficult for homeservers to calculate room timelines | ||
when showing them to clients. They would also force clients to display | ||
potentially incomplete or one-sided conversations without being able to easily | ||
tell which parts of the conversation are missing. | ||
|
||
Servers decide whether an event should or should not be purged by calculating | ||
how much time has passed since the event's `origin_server_ts` property, and | ||
comparing this duration with the room's effective retention policy. | ||
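
A minimal sketch of this comparison, assuming the effective policy is available as a dict and that an event expires once its age strictly exceeds `max_lifetime`. The proposal does not state whether the boundary is inclusive. The function name `is_expired` is hypothetical.

```python
def is_expired(origin_server_ts, policy, now_ms):
    """Return True if an event's age exceeds the effective max_lifetime.

    origin_server_ts and now_ms are timestamps in milliseconds; policy is the
    room's effective retention policy, or None if no policy applies.
    """
    if not policy:
        return False
    max_lifetime = policy.get("max_lifetime")
    if max_lifetime is None:
        return False  # no upper bound: the event never expires
    # Assumption: strict comparison, so an event exactly max_lifetime old is kept
    return now_ms - origin_server_ts > max_lifetime
```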
|
||
Note that, for performance reasons, a server might decide to not purge an event | ||
the second it hits the end of its lifetime (e.g. so it can batch several events | ||
together). In this case, the server must make sure to omit the expired events | ||
from responses to client requests. Similarly, if the server is sent an expired | ||
event over federation, it must omit it from responses to client requests (and | ||
ensure it is eventually purged). | ||
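
One way to omit expired events that have not yet been purged is to filter them out when building a response. The sketch below assumes events are plain dicts listed oldest first, that state events are recognised by the presence of a `state_key`, and that the last entry is the room's most recent event. The function name `visible_events` and these representations are illustrative only.

```python
def visible_events(events, max_lifetime, now_ms):
    """Return the events that may still be served to clients.

    State events and the most recent event are exempt from retention, as
    described at the start of this section.
    """
    if max_lifetime is None or not events:
        return list(events)
    latest = events[-1]  # assumption: the list is ordered oldest first
    kept = []
    for event in events:
        exempt = "state_key" in event or event is latest
        expired = now_ms - event["origin_server_ts"] > max_lifetime
        if exempt or not expired:
            kept.append(event)
    return kept


# Usage: only the first event is both expired and non-exempt
events = [
    {"event_id": "$a", "origin_server_ts": 0},
    {"event_id": "$b", "origin_server_ts": 10, "state_key": ""},
    {"event_id": "$c", "origin_server_ts": 950},
    {"event_id": "$d", "origin_server_ts": 20},
]
kept_ids = [e["event_id"] for e in visible_events(events, 100, 1000)]
```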
|
||
## Tradeoffs | ||
|
||
This proposal specifies that the lifetime of an event is defined by the latest | ||
retention policy in the room, rather than the one in effect when the event was | ||
sent. This might be controversial as, in Matrix, the state that an event is | ||
subject to is usually the state of the room at the time it was sent. However, | ||
there are a few issues with using the retention that was in effect at the time | ||
the event was sent: | ||
|
||
* it would create holes in the DAG of a room which would complicate the | ||
server-side handling of the room's history | ||
* malicious servers could potentially make an event evade retention policies by | ||
selecting their event's `prev_events` and `auth_events` so that the event is | ||
on a portion of the DAG where the policy does not exist | ||
* it would be difficult to translate the configuration of retention policies | ||
into a clear and easy to use UX (especially considering server-side | ||
configuration applies to the whole history of the room) | ||
* it would not allow room administrators to retroactively update the lifetime of | ||
events that have already been sent (e.g. in the context of a room administered | ||
by an organisation whose requirements for data retention change over time) | ||
|
||
This proposal does not cover per-message retention (i.e. the ability to set | ||
different lifetimes to different messages). This has been split out into | ||
[MSC2228](https://github.com/matrix-org/matrix-spec-proposals/pull/2228) to | ||
simplify this proposal. | ||
|
||
This proposal also does not cover the case where a room's administrator wishes | ||
to only restrict the lifetime of a specific section of the room's history. This | ||
is left to be covered by a separate MSC, possibly built on top of MSC2228. | ||
|
||
## Security considerations | ||
|
||
In a context of open federation, it is worth keeping in mind the possibility | ||
that not all servers in a room will enforce its retention policy. Similarly, | ||
different servers will likely enforce different server-side configuration, and | ||
as a result calculate different lifetimes for a given event. This proposal aims | ||
at trying to compromise between finding an absolute consensus on an event's | ||
lifetime and working within the constraints of a server's operator in terms of | ||
data retention. | ||
|
||
Somewhat in tension with the previous paragraph, a server may keep an | ||
expired event in its database for some time after its expiration, while not | ||
sharing it with clients and federating servers. This is in order to prevent | ||
abusers from using low lifetime values in a room's retention policy in order to | ||
erase any proof of such abuse and avoid being investigated. | ||
|
||
Basing the expiration time of an event on its `origin_server_ts` is not ideal as | ||
this field can be falsified by the sending server. However, there currently | ||
isn't a more reliable way to certify the send time of an event. | ||
|
||
As mentioned previously in this proposal, servers might store expired events for | ||
longer than their lifetime allows, either for performance reasons or to mitigate | ||
abuse. This is considered acceptable as long as: | ||
|
||
* an expired event is not kept permanently | ||
* an expired event is not shared with clients and federated servers | ||
|
||
## Unstable prefixes | ||
|
||
While this proposal is under review, the `m.room.retention` event type should be | ||
replaced by the `org.matrix.msc1763.retention` type. | ||
|
||
Similarly, the `/_matrix/client/v3/retention/configuration` path should be replaced with `/_matrix/client/unstable/org.matrix.msc1763/retention/configuration`. |
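
An implementation that supports both the stable and unstable identifiers might select between them along these lines. The constant and function names are hypothetical; only the string values come from this proposal.

```python
STABLE_EVENT_TYPE = "m.room.retention"
UNSTABLE_EVENT_TYPE = "org.matrix.msc1763.retention"
STABLE_PATH = "/_matrix/client/v3/retention/configuration"
UNSTABLE_PATH = "/_matrix/client/unstable/org.matrix.msc1763/retention/configuration"


def retention_identifiers(stable):
    """Return the (event type, endpoint path) pair to use."""
    if stable:
        return STABLE_EVENT_TYPE, STABLE_PATH
    return UNSTABLE_EVENT_TYPE, UNSTABLE_PATH
```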