Data Protection usually generates a new key 2 days (hardwired) before the current preferred default key expires, and this key is propagated to the other nodes that use the same keyring through the 1-day (hardwired) cache refresh. However, when there is no valid key, the newly generated key is a prompt key that takes effect immediately. It may not reach the other nodes until their next cache refresh, which could be up to 1 day later.
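To make the timing concrete, here is a toy calculation of how long a peer node could be unable to see a key that is already in use. It is only a sketch of the behavior as described above, not of the actual Data Protection implementation. The function name `blind_window`, the dates, and the simplification that a peer re-reads the keyring exactly one refresh period after its last read are all assumptions made for illustration. It also assumes a regular successor key becomes active when the current key expires.

```python
from datetime import datetime, timedelta

# Values taken from the description above (both hardwired in Data Protection).
PROPAGATION_WINDOW = timedelta(days=2)  # successor key is generated this long before expiry
CACHE_REFRESH = timedelta(days=1)       # a node re-reads the keyring after this long


def blind_window(created_at, activated_at, peer_last_refresh):
    """Time during which a key is active but a peer has not yet re-read the keyring.

    Hypothetical helper: assumes the peer refreshes on a fixed CACHE_REFRESH
    schedule and ignores which node created the key.
    """
    # Find the peer's first refresh at or after the key was written.
    next_refresh = peer_last_refresh
    while next_refresh < created_at:
        next_refresh += CACHE_REFRESH
    # The key is only a problem once it is active and still unknown to the peer.
    return max(timedelta(0), next_refresh - activated_at)


expiry = datetime(2024, 4, 1)  # illustrative expiry of the current default key

# Regular rollover: key created 2 days early, assumed active only at expiry.
regular = blind_window(
    created_at=expiry - PROPAGATION_WINDOW,
    activated_at=expiry,
    peer_last_refresh=expiry - PROPAGATION_WINDOW - timedelta(hours=1),
)

# Prompt key: no valid key is left, so the new key is active the moment it is created.
first_request = expiry + timedelta(days=1)
prompt = blind_window(
    created_at=first_request,
    activated_at=first_request,
    peer_last_refresh=first_request - timedelta(hours=1),
)

print(regular)  # 0:00:00
print(prompt)   # 23:00:00
```

In the regular case the peer picks up the key well before it becomes active, so the window is zero. In the prompt case the window is whatever remains of the peer's refresh period, which approaches a full day in the worst case.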
#3975 addresses this issue by forcing a cache refresh when an unknown key is received, but only for a couple of minutes during startup. This solves the problem e.g. when a new app is deployed for the first time with an empty keyring.
It looks like both the cache refresh and the generation of the new key are not proactive but are triggered by a data protection operation (e.g. an authenticated request). Long periods of idle time may therefore have the same result: on the next request a new prompt key is generated which cannot be seen by other nodes. I have no repro, because the hardwired cache expiration/key propagation window makes this really hard to test in practice. This is only my understanding of how things work under the hood, based on the source code. Am I missing something?
I have a load-balanced setup where nodes are configured as always running due to business requirements and only ever stop for maintenance. It is improbable, but cannot be ruled out, that the web interface experiences more than 2 days of idle time (e.g. a long weekend). If that coincides with a key expiration, then the next time activity returns (which usually occurs in bursts and is therefore spread across all nodes) the key desync could bring down the whole system.
If my understanding is correct, the forced refresh window should be tied not to startup but to the same condition that triggers the prompt key creation: the detection of an empty keyring (no valid keys).
There are a couple of alternatives:
Configure the propagation interval/cache refresh: these are currently hardwired. I could find a suitable value where there is guaranteed activity within the propagation interval, so prompt keys would never be created.
Create a scheduled ping to the web interface that triggers a DP operation and a timely key creation: this relies on the operators. One slight misconfiguration and then a big surprise 90 days later.
Create a BackgroundService that triggers GetCurrentKeyRing when the last key is about to expire: this is my preferred solution for the time being, although it still seems rather hackish.