Handle Ion clob serde behavior properly #73

@guyilin-amazon

Current IonClobSerializer implementation:

public override string Deserialize(IIonReader reader)
{
    byte[] clob = new byte[reader.GetLobByteSize()];
    reader.GetBytes(clob);
    return Encoding.UTF8.GetString(clob);
}

public override void Serialize(IIonWriter writer, string item)
{
    writer.WriteClob(Encoding.UTF8.GetBytes(item));
}

Problems in current implementation:

  1. The Ion clob specification leaves the interpretation of the raw bytes to the application, but the current implementation always decodes them as UTF-8. That choice looks arbitrary, especially since a C# string is UTF-16 internally.
  2. A C# string maps to an Ion string by default, so there is no use case for Serialize(IIonWriter writer, string item) to write a C# string as an Ion clob. Only the Deserialize method is used, when reading Ion clob data.
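To make the first problem concrete, here is a minimal sketch that uses only System.Text and no Ion types. The class ClobUtf8Demo and its members are hypothetical names for illustration. It mirrors the decoding step of the current Deserialize and checks whether clob bytes written in another encoding survive a UTF-8 round trip.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical demo class, not part of the library.
public static class ClobUtf8Demo
{
    // "café" encoded as ISO-8859-1: the final byte 0xE9 is not valid UTF-8 on its own.
    public static readonly byte[] Latin1Cafe = { 0x63, 0x61, 0x66, 0xE9 };

    // Mirrors the decoding step of the current Deserialize: always UTF-8.
    public static string DecodeAsUtf8(byte[] clob) => Encoding.UTF8.GetString(clob);

    // True when decoding as UTF-8 and re-encoding yields the original bytes.
    public static bool SurvivesUtf8RoundTrip(byte[] clob) =>
        Encoding.UTF8.GetBytes(DecodeAsUtf8(clob)).SequenceEqual(clob);
}
```

For Latin1Cafe the round trip fails, because the invalid byte is replaced with U+FFFD during decoding, so the original clob content cannot be recovered from the returned string.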

We should consider:

  1. Allowing the encoding to be passed in for Ion clob serde.
  2. The proper mapping between the Ion clob type and a C# type. The clob should probably map to a C# byte array (byte[]) instead of the current string type.
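One possible shape for the two points above, as a sketch only: keep byte[] as the value the serializer reads and writes, and put the text interpretation in a small codec that takes the encoding as a parameter. ClobTextCodec is a hypothetical helper, not an existing library type. A clob serializer could fill the byte[] with reader.GetLobByteSize() and reader.GetBytes(...) as the current code does, return it as-is, and let callers who want text go through the codec.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the library: converts between clob bytes
// and text using a caller-supplied encoding.
public sealed class ClobTextCodec
{
    private readonly Encoding encoding;

    // Defaults to UTF-8, which matches the behavior of the current implementation.
    public ClobTextCodec() : this(Encoding.UTF8) { }

    public ClobTextCodec(Encoding encoding)
    {
        this.encoding = encoding ?? throw new ArgumentNullException(nameof(encoding));
    }

    // Interprets raw clob bytes as text in the configured encoding.
    public string Decode(byte[] clob) => this.encoding.GetString(clob);

    // Produces the raw bytes that would be passed to writer.WriteClob(...).
    public byte[] Encode(string text) => this.encoding.GetBytes(text);
}
```

With this split, new ClobTextCodec(Encoding.Unicode) handles UTF-16 clobs and new ClobTextCodec() keeps today's UTF-8 behavior, while callers that need the raw data never go through a string at all.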
