Archive: New binary format
The Google Protocol Buffers format has a lot of problems when you want to serialize a .NET object. It is not well suited to cases where you need support for nulls (including null in a field that should contain a list), inheritance, and references. It is also focused on version compatibility, which not everyone needs.
I decided to introduce my own format that will support these .NET features.
Pseudo specification for the format:
//reference-tracked-objects (rto) - class and dynamic types
//non-rto - structs
// field number - byte/ushort - per type
(rto_count:)(varint)20
// all rtos, 1 - has values, 0 - null
(has_value:)(bits)10101010101010101010
// all rtos that have a value, 1 - ref, 0 - first encounter
(refs:)(bits)1010101010
[:if numbered fields on in model]
// all rtos with first encounter, 0 - default type, 1 - specified type
(specified_type:)(bits)101010
[:endif]
[:if dynamic_types allowed]
// all type ids greater than max are dynamic types with index = id-max-1
(dynamic_types_count:)(varint)1
(dynamic_type_fullnames:)(string[])System.Guid
// dynamic type mapping settings should be possible to specify with attribute on field
[:endif]
// data entries:
--
// rto object (first-encounter), ref_id++
[:for types from base to current]
[:if numbered fields on in model]
(fields_count:)(field_number)field_number.MaxValue
[:for each field]
(field_number:)0
(field_type:)0
(value:)...
[:endfor]
[:else]
[:for each field]
[:if rto && type specified] (type:)(varint)0 [:endif]
(value:)...
[:endfor]
[:endif]
[:endfor]
--
// rto object ref
// ref ids - only first-encountered rtos
(ref:)(varint)1
--
// rto object null
// nothing
--
// value object (non-rto)
// the same data as rto
...
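To make the header part of the pseudo specification easier to follow, here is a minimal Python sketch of how the `rto_count`, `has_value` and `refs` entries could be derived, and how a dynamic type id could be mapped to its name. It is an illustration only, not the actual implementation. It assumes that `varint` means the base-128 encoding used by Protocol Buffers, that bits are emitted in encounter order, and that object identity decides what counts as a repeated reference. All function names (`encode_varint`, `build_header`, `resolve_dynamic_type`) are hypothetical.

```python
from typing import List, Optional, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (assumed encoding)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def build_header(slots: List[Optional[object]]) -> Tuple[int, List[int], List[int]]:
    """Return (rto_count, has_value bits, refs bits) for rto slots in encounter order."""
    seen = set()  # identities of objects that were already written once
    has_value: List[int] = []
    refs: List[int] = []
    for obj in slots:
        if obj is None:
            has_value.append(0)  # null: no bit in refs, no data entry
            continue
        has_value.append(1)
        if id(obj) in seen:
            refs.append(1)  # repeated object: written as a reference
        else:
            seen.add(id(obj))
            refs.append(0)  # first encounter: written in full
    return len(slots), has_value, refs


def resolve_dynamic_type(type_id: int, max_static_id: int,
                         dynamic_type_fullnames: List[str]) -> Optional[str]:
    """Map a type id above max to a dynamic type name (index = id - max - 1)."""
    if type_id <= max_static_id:
        return None  # a regular type id known to the model
    return dynamic_type_fullnames[type_id - max_static_id - 1]


a, b = object(), object()
count, has_value, refs = build_header([a, None, b, a, None])
print(encode_varint(count), has_value, refs)
```

Note that `refs` only has one bit per non-null slot, which is why it is shorter than `has_value` in the specification above.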
All special handling (lists, dictionaries) should be done inside the corresponding serializer class and written as field data. It is very important that the item count goes before the items themselves: that way an array can be created, if needed, before it is populated, so it can be referenced from inside itself.
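The point about writing the count first can be illustrated with a small Python sketch of a reader. It is a toy model under stated assumptions, not the real serializer: the input is a stream of hypothetical `(kind, payload)` tokens rather than bytes, reference ids are assumed to be 0-based, and `read_list` is an invented name.

```python
from typing import Any, Iterator, List, Tuple

Token = Tuple[str, Any]  # hypothetical token: ("count", n), ("value", x) or ("ref", id)


def read_list(tokens: Iterator[Token], ref_table: List[Any]) -> List[Any]:
    """Read one list whose item count precedes its items."""
    kind, count = next(tokens)
    if kind != "count":
        raise ValueError("expected the item count before the items")
    result: List[Any] = [None] * count  # allocate the container up front
    ref_table.append(result)            # register it before reading any item
    for i in range(count):
        kind, payload = next(tokens)
        if kind == "ref":
            result[i] = ref_table[payload]  # may point back at `result` itself
        else:
            result[i] = payload
    return result


refs: List[Any] = []
stream = iter([("count", 3), ("value", 7), ("ref", 0), ("value", 9)])
items = read_list(stream, refs)
print(items[1] is items)
```

Because the container is registered in the reference table before its items are read, the second item can resolve to the list that is still being populated. With a fixed-size array this only works if the length is known before the first item is read.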
The field numbering option is a global option per TypeModel: either you need versioning or you don't. There is no point in serializing some types with versioning and some without.
A "patch" format, in which only modified properties are serialized, is not designed yet, but I think it will just use an additional bit mask in the header to specify which fields should be read.
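Since the patch format is not designed, the following Python sketch shows only one possible reading of the bit mask idea. It assumes one bit per field in a fixed field order, with values stored only for the fields whose bit is set. The names `make_patch` and `apply_patch`, and the use of dictionaries for objects, are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def make_patch(field_names: List[str],
               modified: Dict[str, Any]) -> Tuple[List[int], List[Any]]:
    """Build (mask, values): one bit per field, values only for set bits."""
    mask = [1 if name in modified else 0 for name in field_names]
    values = [modified[name] for name in field_names if name in modified]
    return mask, values


def apply_patch(target: Dict[str, Any], field_names: List[str],
                mask: List[int], values: List[Any]) -> Dict[str, Any]:
    """Overwrite only the fields whose bit is set, leaving the rest untouched."""
    remaining = iter(values)
    for name, bit in zip(field_names, mask):
        if bit:
            target[name] = next(remaining)
    return target


fields = ["id", "name", "tags"]
mask, values = make_patch(fields, {"name": "renamed"})
patched = apply_patch({"id": 1, "name": "old", "tags": []}, fields, mask, values)
print(mask, patched)
```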