Commit 6e3e162

erikjohnston, hughns, anoadragon453, fkwp, and richvdh authored
MSC4222: Adding state_after to /sync (#4222)
* First draft of MSC4222
* Fix indentation
* Fix json
* Include msc number in unstable prefixes
* Update proposals/4222-sync-v2-state-after.md
  Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Update proposals/4222-sync-v2-state-after.md
  Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Apply suggestions from code review
  As discussed during the MSC clinic hour
  Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
* Re-word the paragraphs about rebuilding the history of state
* Add more details about why /v3/sync's current behaviour is insufficient.
* Clarify state_after limitation regarding state removal
* Update proposals/4222-sync-v2-state-after.md
  Co-authored-by: Alexey Rusakov <Kitsune-Ral@users.sf.net>

---------

Co-authored-by: Hugh Nimmo-Smith <hughns@users.noreply.github.com>
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: fkwp <fkwp@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Andy Balaam <andy.balaam@matrix.org>
Co-authored-by: fkwp <github-fkwp@w4ve.de>
Co-authored-by: Travis Ralston <travisr@matrix.org>
Co-authored-by: Alexey Rusakov <Kitsune-Ral@users.sf.net>
1 parent 07ee4ff commit 6e3e162

1 file changed: +253 -0 lines changed

proposals/4222-sync-v2-state-after.md

Lines changed: 253 additions & 0 deletions
@@ -0,0 +1,253 @@
1+
# MSC4222: Adding `state_after` to `/sync`
2+
3+
The current [`/sync`](https://spec.matrix.org/v1.14/client-server-api/#get_matrixclientv3sync) API does not
4+
differentiate between state events in the timeline and updates to state, and so can cause the client's view
5+
of the current state of the room to diverge from the actual state of the room as seen by the server.
6+
7+
The fundamental issue is that clients need to know the current authoritative room state, but the current model
8+
lacks an explicit representation of that. Clients derive state by assuming a linear application of events, for
9+
example:
10+
11+
```
12+
state_before + timeline => state_after
13+
```
14+
15+
However, room state evolves as a DAG (Directed Acyclic Graph), not a linear chain. A simple example illustrates:
16+
```diagram
17+
A
18+
|
19+
B
20+
/ \
21+
C D
22+
23+
```
24+
A, B, C, and D are all non-conflicting state events.
25+
- State after C = `{A, B, C}`
26+
- State after D = `{A, B, D}`
27+
- Current state = `{A, B, C, D}`
28+
29+
In this case, C and D are concurrent, so the correct current state includes both. Clients that try to reconstruct
30+
state from a timeline such as `[A, B, C, D]` or `[A, B, D, C]` might trivially compute a union — and for non-conflicting
31+
cases, this works.
32+
33+
However, once conflicting state enters, resolution is needed. Consider this more complex example:
34+
```diagram
35+
A
36+
|
37+
B
38+
/ \
39+
C C' <-- C' wins via state resolution
40+
\ / \
41+
D E
42+
```
43+
Here, C and C' are conflicting state events — for example, both might define a different `m.room.topic`. Let's say C' wins
44+
according to the server's state resolution rules. Then D and E are independent non-conflicting additions.
45+
- State after C = `{A, B, C}`
46+
- State after D = `{A, B, C', D}`
47+
- State after E = `{A, B, C', E}`
48+
- Current state = `{A, B, C', D, E}`
49+
50+
Now suppose the client first receives timeline events `[A, B, C', E]`. The state it constructs is:
51+
```
52+
{A, B, C', E} ← Correct so far
53+
```
54+
Then it receives a subsequent sync with timeline `[C, D]`, and the state block includes only `{B}`. Under the current
55+
`/sync` behavior:
56+
- The timeline includes state event C, which incorrectly replaces C'.
57+
- The client ends up with `{A, B, C, D, E}`, which is **invalid** — it prefers the wrong version of C.
58+
This happens because the client re-applies C from the timeline, unaware that C' had already been resolved and accepted
59+
earlier. There's no way for the client to know that C' is supposed to win, based solely on the timeline.
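
The failure described above can be reproduced with a small simulation. This is a minimal sketch rather than code from any real client: events are modelled as hypothetical `(label, type, state_key)` tuples, `apply_timeline_naively` is an illustrative helper, and the event types are assumptions chosen only so that C and C' occupy the same `(type, state_key)` slot while the other events do not conflict.

```python
# Hypothetical events: (label, event type, state_key).
# Only C and C' share a (type, state_key) slot, so only they conflict.
A = ("A", "m.room.create", "")
B = ("B", "m.room.name", "")
C = ("C", "m.room.topic", "")
C_PRIME = ("C'", "m.room.topic", "")
D = ("D", "m.room.avatar", "")
E = ("E", "m.room.join_rules", "")


def apply_timeline_naively(state, timeline):
    """Treat every state event in the timeline as the newest value for its slot."""
    new_state = dict(state)
    for label, event_type, state_key in timeline:
        new_state[(event_type, state_key)] = label
    return new_state


# First sync: timeline [A, B, C', E].
state = apply_timeline_naively({}, [A, B, C_PRIME, E])
first = sorted(state.values())

# Second sync: timeline [C, D]. C overwrites C' even though C' won resolution.
state = apply_timeline_naively(state, [C, D])
second = sorted(state.values())

print(first)   # ['A', 'B', "C'", 'E']
print(second)  # ['A', 'B', 'C', 'D', 'E']
```

The second result holds C in the topic slot, whereas the server's current state holds C'. Nothing in the two timelines tells the client which of the two should win.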
60+
61+
In [MSC4186 - Simplified Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186) this problem is
62+
solved by the equivalent `required_state` section including all state changes between the previous sync and the end of
63+
the current sync, and clients do not update their view of state based on entries in the timeline.
64+
65+
66+
## Proposal
67+
68+
This change is gated behind the client adding a `?use_state_after=true` (the unstable name is
69+
`org.matrix.msc4222.use_state_after`) query param.
70+
71+
When enabled, the homeserver will **omit** the `state` section from each room in the response. It is replaced by
72+
`state_after` (the unstable field name is `org.matrix.msc4222.state_after`), which will include all state changes between the
73+
previous sync and the *end* of the timeline section of the current sync. This is in contrast to the old `state` section
74+
that only included state changes between the previous sync and the *start* of the timeline section. Note that this does
75+
mean that a new state event will (likely) appear in both the `timeline` and `state_after` sections of the response.
76+
77+
This is essentially the same as how state is returned in [MSC4186 - Simplified Sliding
78+
Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186).
79+
80+
Clients **MUST** only update their local state using `state_after` and **MUST NOT** update it from events that appear in the timeline section of `/sync`.
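
A client-side reading of this rule might look like the following. It is a minimal sketch under stated assumptions: `apply_room_update` is a hypothetical helper, local state is assumed to be a dictionary keyed by `(type, state_key)`, and the `content` values are invented for illustration. The stable field name `state_after` is used; an implementation targeting the unstable prefix would read `org.matrix.msc4222.state_after` instead.

```python
def apply_room_update(state, room_section):
    """Return the new state map for one room section of a /sync response.

    `state` maps (type, state_key) -> event. Only `state_after` feeds into
    it; timeline events are deliberately ignored for state purposes.
    """
    new_state = dict(state)
    for event in room_section.get("state_after", {}).get("events", []):
        new_state[(event["type"], event["state_key"])] = event
    return new_state


# Illustrative events; the content values are made up.
old = {"type": "org.matrix.example", "state_key": "", "content": {"v": 1}}
new = {"type": "org.matrix.example", "state_key": "", "content": {"v": 2}}
state = {("org.matrix.example", ""): old}

# The new event took effect: it appears in both sections.
common = {
    "timeline": {"events": [new], "limited": False},
    "state_after": {"events": [new]},
}
# The new event lost state resolution: it appears in the timeline only.
outdated = {
    "timeline": {"events": [new], "limited": False},
    "state_after": {"events": []},
}

key = ("org.matrix.example", "")
print(apply_room_update(state, common)[key]["content"])    # {'v': 2}
print(apply_room_update(state, outdated)[key]["content"])  # {'v': 1}
```

The timeline is still available for rendering, but it never writes to the state map, so an event that lost state resolution cannot displace the winner.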
81+
82+
Clients can tell if the server supports this change by whether it returns a `state` or `state_after` section in the
83+
response. Servers that support this change **MUST** return the `state_after` property, even if empty.
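
Opting in and detecting support could be sketched as below. The helper names `sync_query` and `extract_state_after` are hypothetical, and returning `None` to signal a legacy response is a design choice of this sketch, not something the proposal mandates. The parameter and field names are the ones given in this proposal, and `since` is the existing `/sync` query parameter.

```python
from urllib.parse import urlencode

STABLE_PARAM = "use_state_after"
UNSTABLE_PARAM = "org.matrix.msc4222.use_state_after"
STABLE_FIELD = "state_after"
UNSTABLE_FIELD = "org.matrix.msc4222.state_after"


def sync_query(since=None, unstable=True):
    """Build the /sync query string that opts in to `state_after`."""
    params = {UNSTABLE_PARAM if unstable else STABLE_PARAM: "true"}
    if since is not None:
        params["since"] = since
    return urlencode(params)


def extract_state_after(room_section):
    """Return the `state_after` events, or None if the server answered
    with the legacy `state` section (i.e. it does not support the change)."""
    for field in (STABLE_FIELD, UNSTABLE_FIELD):
        if field in room_section:
            return room_section[field].get("events", [])
    return None


print(sync_query("s1"))
# org.matrix.msc4222.use_state_after=true&since=s1
print(extract_state_after({"state": {"events": []}}))  # None
print(extract_state_after({"org.matrix.msc4222.state_after": {"events": []}}))  # []
```

Because supporting servers must return the property even when it is empty, an empty list and `None` carry different meanings here: the former means "no state changes", the latter means "fall back to the legacy `state` handling".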
84+
85+
### Examples
86+
87+
#### Example 1 - Common case
88+
89+
Let’s take a look at the common case of a state event getting sent down an incremental sync, which is non-gappy.
90+
91+
<table>
92+
<tr><th>Previously</th><th>Proposed</th></tr>
93+
<tr>
94+
<td>
95+
96+
```json
97+
{
98+
"timeline": {
99+
"events": [ {
100+
"type": "org.matrix.example",
101+
"state_key": ""
102+
} ],
103+
"limited": false
104+
},
105+
"state": {
106+
"events": []
107+
}
108+
}
109+
```
110+
111+
</td>
112+
<td>
113+
114+
```json
115+
{
116+
"timeline": {
117+
"events": [ {
118+
"type": "org.matrix.example",
119+
"state_key": ""
120+
} ],
121+
"limited": false
122+
},
123+
"state_after": {
124+
"events": [ {
125+
"type": "org.matrix.example",
126+
"state_key": ""
127+
} ]
128+
}
129+
}
130+
```
131+
132+
</td>
133+
</tr>
134+
</table>
135+
136+
Since the current state of the room will include the new state event, it's included in the `state_after` section.
137+
138+
> [!NOTE]
139+
> In the proposed API the state event comes down both in the `timeline` section *and* the `state_after` section.
140+
141+
142+
#### Example 2 - Receiving “outdated” state
143+
144+
Next, let’s look at what would happen if we receive a state event that does not take effect, i.e. that shouldn’t cause the client to update its state.
145+
146+
<table>
147+
<tr><th>Previously</th><th>Proposed</th></tr>
148+
<tr>
149+
<td>
150+
151+
```json
152+
{
153+
"timeline": {
154+
"events": [ {
155+
"type": "org.matrix.example",
156+
"state_key": ""
157+
} ],
158+
"limited": false
159+
},
160+
"state": {
161+
"events": []
162+
}
163+
}
164+
```
165+
166+
</td>
167+
<td>
168+
169+
```json
170+
{
171+
"timeline": {
172+
"events": [ {
173+
"type": "org.matrix.example",
174+
"state_key": ""
175+
} ],
176+
"limited": false
177+
},
178+
"state_after": {
179+
"events": []
180+
}
181+
}
182+
```
183+
184+
</td>
185+
</tr>
186+
</table>
187+
188+
Since the current state of the room does not include the new state event, it's excluded from the `state_after` section.
189+
190+
> [!IMPORTANT]
191+
> Even though both responses look very similar, the client **MUST NOT** update its state with the event from the timeline section when using `state_after`.
192+
193+
194+
## Potential issues
195+
196+
With the proposed API the common case for receiving a state update will cause the event to come down in both the
197+
`timeline` and `state_after` sections, potentially increasing bandwidth usage. However, it is common for the HTTP responses to
198+
be compressed, heavily reducing the impact of having duplicated data.
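
The effect of compression on the duplicated event can be checked with a rough experiment. This is only an illustrative sketch: the event fields and values are made up, and `zlib` stands in for whatever HTTP compression a deployment actually uses (gzip is also DEFLATE-based), so real-world numbers will differ.

```python
import json
import zlib

# A made-up state event; the identifiers and content are illustrative only.
event = {
    "type": "org.matrix.example",
    "state_key": "",
    "event_id": "$hypothetical_event_id",
    "sender": "@alice:example.org",
    "content": {"body": "an illustrative payload for the example event"},
}


def sizes(room_section):
    """Return (uncompressed, compressed) byte sizes of the JSON encoding."""
    raw = json.dumps(room_section, sort_keys=True).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


legacy = {
    "timeline": {"events": [event], "limited": False},
    "state": {"events": []},
}
proposed = {
    "timeline": {"events": [event], "limited": False},
    "state_after": {"events": [event]},
}

raw_old, zip_old = sizes(legacy)
raw_new, zip_new = sizes(proposed)
print("uncompressed overhead:", raw_new - raw_old, "bytes")
print("compressed overhead:  ", zip_new - zip_old, "bytes")
```

The second copy of the event is an exact repeat of bytes already seen, which DEFLATE can encode as a short back-reference, so the compressed overhead should come out well below the uncompressed one.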
199+
200+
Both before and after this proposal, clients cannot reliably calculate exactly when in the
201+
timeline the state changed (e.g. to figure out which message should show a user's previous/updated
202+
display name - note that some clients e.g. Element have moved away from this UX). This is because
203+
the accurate picture of the current state at an event is calculated by the server based on the room
204+
DAG, including the state resolution process, and not based on a linear list of state updates.
205+
206+
This proposal ensures that the client has a more accurate view of the room state *after the sync has
207+
finished*, but it does not provide any more information about the *history of state* as it relates
208+
to events in the timeline. Clients attempting to build a best-effort view of this history by walking
209+
the timeline may still do so, with the same caveats as before about correctness, but they should be
210+
sure to make their view of the final state consistent with the changes provided in `state_after`.
211+
212+
The format of returned state in `state_after` in this proposal is a list of events. This
213+
does not allow the server to indicate if an entry has been removed from the state. As with
214+
[MSC4186 - Simplified Sliding Sync](https://github.com/matrix-org/matrix-spec-proposals/pull/4186),
215+
this limitation is acknowledged but not addressed here. This is not a new issue and is left for
216+
resolution in a future MSC.
217+
218+
219+
## Alternatives
220+
221+
There are a number of options for encoding the same information in different ways, for example the response could
222+
include both the `state` and a `state_delta` section, where `state_delta` would be any changes that needed to be applied
223+
to the client calculated state to correct it. However, since
224+
[MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186) is likely to replace the current `/sync` API, we may as
225+
well use the same mechanism. This also has the benefit of showing that the proposed API shape can be successfully
226+
implemented by clients, as MSC4186 is already implemented and in use by clients.
227+
228+
Another option would be for server implementations to try and fudge the state and timeline responses to ensure that
229+
clients came to the correct view of state. For example, if the server detects that a sync response will cause the client
230+
to come to an incorrect view of state it could either a) "fixup" the state in the `state` section of the *next* sync
231+
response, or b) remove or add old state events to the timeline section. While both these approaches are viable, they're
232+
both inferior to simply telling the client the correct information in the first place. Since clients will need to be
233+
updated to handle the new behavior for future sync APIs anyway, there is little benefit from not updating clients now.
234+
235+
We could also do nothing, and instead wait for [MSC4186](https://github.com/matrix-org/matrix-spec-proposals/pull/4186)
236+
(or equivalent) to land and for clients to update to it.
237+
238+
239+
## Security considerations
240+
241+
There are no security concerns with this proposal, as it simply encodes the same information sent to clients in a
242+
different way.
243+
244+
## Unstable prefix
245+
246+
| Name | Stable prefix | Unstable prefix |
247+
| - | - | - |
248+
| Query param | `use_state_after` | `org.matrix.msc4222.use_state_after` |
249+
| Room response field | `state_after` | `org.matrix.msc4222.state_after` |
250+
251+
## Dependencies
252+
253+
None
