[source-mongodb-v2] Specific UUID Handling #60818
lepagea01 started this conversation in Connector Ideas and Features
Summary
Currently, the MongoDB source connector (`source-mongodb-v2`) emits all BSON binary fields as base64 strings, regardless of BSON subtype. While this is safe in a general sense, it causes semantic loss and inconsistency for fields that actually represent UUIDs, specifically BSON binary subtypes:
- `03` = legacy UUID (non-RFC 4122)
- `04` = standard UUID (RFC 4122)
This behavior is problematic because it makes it impossible to recover the original UUID with full confidence once the data is synced to a destination. This can silently break downstream use cases.
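To illustrate why the base64 form is lossy, here is a minimal sketch using only the Java standard library. It assumes the Java driver's legacy layout (each 8-byte half stored little-endian) as one example of a subtype `03` encoding. The class and method names are hypothetical and are not taken from the connector. The same 16 bytes decode to two different UUIDs depending on which layout is assumed, and the base64 string alone does not say which one applies.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;
import java.util.UUID;

// Hypothetical demo class; not part of source-mongodb-v2.
final class UuidAmbiguityDemo {

    // Standard (subtype 04) layout: 16 bytes read in big-endian order.
    static UUID decodeStandard(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        return new UUID(buf.getLong(), buf.getLong());
    }

    // ASSUMPTION: Java-legacy layout, one of several subtype 03 encodings,
    // where each 8-byte half is stored little-endian.
    static UUID decodeJavaLegacy(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        UUID original = UUID.fromString("00112233-4455-6677-8899-aabbccddeeff");
        byte[] raw = ByteBuffer.allocate(16)
                .putLong(original.getMostSignificantBits())
                .putLong(original.getLeastSignificantBits())
                .array();

        // What a destination sees today: base64 with no subtype attached.
        String emitted = Base64.getEncoder().encodeToString(raw);
        System.out.println(emitted); // ABEiM0RVZneImaq7zN3u/w==

        // Two plausible readings of the same payload.
        byte[] received = Base64.getDecoder().decode(emitted);
        System.out.println(decodeStandard(received));   // 00112233-4455-6677-8899-aabbccddeeff
        System.out.println(decodeJavaLegacy(received)); // 77665544-3322-1100-ffee-ddccbbaa9988
    }
}
```

Without the subtype (and, for `03`, knowledge of which driver wrote the value), a consumer has no way to pick between these readings.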
Proposed Feature
Introduce a source-level configuration setting, for example:
When set to `"string"`, the connector would emit BSON binary fields of subtype `03` or `04` as UUID strings instead of base64.
This somewhat mirrors the decoding already applied to `_id` in `IdType`, but generalizes it to all fields in the document.
Reference in Code
In the `MongoDbCdcEventUtils` class, the current logic for binary fields is:
Suggested improved version, gated by the optional setting:
Where `toUuidOrBase64(...)` would look like (not tested):
Benefits
- `base64` remains the default behavior
Notes
This is proposed as a non-breaking, opt-in feature to avoid disrupting existing pipelines. Default behavior remains unchanged unless the `"uuid_rendering"` flag is explicitly enabled.
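As a rough illustration of the opt-in gating described above, the sketch below uses only the Java standard library rather than the connector's real classes. The `UuidRendering` enum, the `renderBinary` method and the signature given to `toUuidOrBase64` are all assumptions made for this example. Decoding subtype `03` with the Java-legacy byte order is also an assumption; data written by other drivers may need a different layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;
import java.util.UUID;

// Hypothetical sketch; names and signatures are not taken from source-mongodb-v2.
final class UuidRenderingSketch {

    // Modes for the proposed "uuid_rendering" setting.
    enum UuidRendering {
        BASE64, STRING;

        // A missing or unrecognized value keeps the existing base64 behavior.
        static UuidRendering fromConfig(String value) {
            return "string".equalsIgnoreCase(value) ? STRING : BASE64;
        }
    }

    static final byte LEGACY_UUID = 0x03;   // BSON binary subtype 03
    static final byte STANDARD_UUID = 0x04; // BSON binary subtype 04

    // Entry point: base64 unless the opt-in mode is enabled.
    static String renderBinary(byte subtype, byte[] data, UuidRendering mode) {
        if (mode == UuidRendering.STRING) {
            return toUuidOrBase64(subtype, data);
        }
        return Base64.getEncoder().encodeToString(data);
    }

    // Renders 16-byte payloads of subtype 03/04 as UUID strings, anything else as base64.
    static String toUuidOrBase64(byte subtype, byte[] data) {
        boolean uuidSubtype = subtype == LEGACY_UUID || subtype == STANDARD_UUID;
        if (!uuidSubtype || data.length != 16) {
            return Base64.getEncoder().encodeToString(data);
        }
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian, matches subtype 04
        if (subtype == LEGACY_UUID) {
            // ASSUMPTION: Java-legacy layout, each 8-byte half little-endian.
            buf.order(ByteOrder.LITTLE_ENDIAN);
        }
        return new UUID(buf.getLong(), buf.getLong()).toString();
    }

    public static void main(String[] args) {
        byte[] raw = Base64.getDecoder().decode("ABEiM0RVZneImaq7zN3u/w==");
        // Default: unchanged output.
        System.out.println(renderBinary(STANDARD_UUID, raw, UuidRendering.fromConfig(null)));
        // Opt-in: UUID string for subtype 04.
        System.out.println(renderBinary(STANDARD_UUID, raw, UuidRendering.fromConfig("string")));
    }
}
```

Keeping the fallback to base64 inside `toUuidOrBase64` means that non-UUID subtypes and malformed payloads are emitted exactly as they are today, even when the setting is enabled.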