Frame structure #262

iboB · 2025-03-07T15:00:06Z

iboB
Mar 7, 2025
Maintainer

These are some observations and analysis of #227

This impacts #241 as the schema language (the struct/frame definitions part) must be compatible with whatever we decide here

It is related to #56 as it determines whether we need a datatype such as Dict at all. In any case, carrying binary data in frames is a concern.

The goal of the chosen data structure is to let us minimize allocations and copying of data, especially of large binary buffers. Ideally, a solution will also account for same-process comms and allow a further reduction of data copies. All of this must take into account that schema-based codegen is desirable from both a client and a server perspective.

Self-descriptive frames and vanilla clients

The current implementation uses frames which are self-describing (as they are Dict/JSON-based). This allows vanilla clients: ones that don't run any AC code. For example, the current acord examples only need to do WebSocket comms and write JSON or CBOR in order to communicate with the server.

If frames are not self-describing, some AC code will be obligatory for clients. Even if it's not a full codegen suite, there must be at least something that can allow writing frame data based on the schema.

Self-description comes at a cost: besides the values, every frame must also carry its keys and structure.

Keeping frames self-describing for vanilla clients automatically rules out popular serializers such as FlatBuffers, Protobuf, Cap'n Proto and many more.
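A minimal sketch of the cost described above, assuming a hypothetical frame with three fixed-size fields (the field names and the binary layout are illustrative, not part of the current implementation). It encodes the same values once as self-describing JSON and once with a layout both sides would have to know in advance:

```python
import json
import struct

# Hypothetical frame with three fixed-size fields (names are illustrative only).
frame = {"id": 7, "max_tokens": 128, "temperature": 0.5}

# Self-describing: keys and structure travel with every frame.
json_bytes = json.dumps(frame, separators=(",", ":")).encode("utf-8")

# Schema-based: both sides agree on the layout ahead of time
# (little-endian u32, u16, f32), so only the values are sent.
LAYOUT = struct.Struct("<IHf")
packed = LAYOUT.pack(frame["id"], frame["max_tokens"], frame["temperature"])


def decode_packed(buf: bytes) -> dict:
    """Decoding needs out-of-band knowledge of the field order and types."""
    frame_id, max_tokens, temperature = LAYOUT.unpack(buf)
    return {"id": frame_id, "max_tokens": max_tokens, "temperature": temperature}


print(len(json_bytes), "bytes self-describing vs", len(packed), "bytes packed")
```

A vanilla client can produce the JSON form with any JSON library, whereas the packed form is unreadable without `LAYOUT`, which is the point about some AC code becoming obligatory.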

A big question in and of itself is whether we need this. Are we willing to sacrifice some I/O performance in order to allow vanilla clients? So far we're leaning towards yes, but I'm not sure anymore.

The desire for vanilla clients negates the ability to have client-side metadata in frames. We can't force clients to write a frame with a type such as complete-text-v143-8D8AC610-566D-4EF0-9C22-186B2A5ED793. This means that the server-side frame dispatcher becomes more fragile, both because of racy async frames and because of versioning and backward and forward compatibility.

Note that we can have self-descriptive frames which don't allow vanilla clients. For example: A FlatBuffer buf + its IDL description. This alone seems like a pointless idea, but it is necessary if we want to convert from an existing vanilla-accessible format (like JSON or CBOR) to such a frame.

Polymorphic frames

Our current frames are non-polymorphic. Potential benefits of polymorphic frames include:

Carrying binary data in `Dict` #56 - CoW (or otherwise ref-counted) blobs
Multi-format. A frame can be one of many things which all have the same interface. This will save copies when converting to a struct. One of the formats can be the main/default one, but the others, if compatible, may save the conversion step/layer. Otherwise we would need to chain conversion layers on our polymorphic streams instead. This is not necessarily bad, just something to keep in mind.

The tradeoff here is that this will incur additional allocations for each frame (which can be circumvented with custom allocators), and that endpoints would need to be able to declare their frame needs if they have any.

So far we're leaning against polymorphic frames, but they're worth keeping in mind because they allow zero-copy blobs even between heterogeneous (say C++/Swift/Java) sides of a local channel.
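A rough sketch of the polymorphic idea, under the assumption of two hypothetical frame formats (`DictFrame` and `FlatFrame`, neither of which exists in the codebase) behind one interface. Python's `memoryview` stands in here for a ref-counted blob. A real implementation would look quite different, but the shape of the tradeoff is the same:

```python
from typing import Protocol


class Frame(Protocol):
    """Common interface that every frame format has to provide."""

    def op(self) -> str: ...
    def blob(self) -> memoryview: ...


class DictFrame:
    """Default format: Dict-like fields, with the blob referenced, not embedded."""

    def __init__(self, op: str, blob: bytes):
        self._data = {"op": op}
        self._blob = memoryview(blob)  # shares the caller's buffer

    def op(self) -> str:
        return self._data["op"]

    def blob(self) -> memoryview:
        return self._blob


class FlatFrame:
    """Alternative format: one buffer laid out as [op_len: u8][op][blob]."""

    def __init__(self, buf: bytes):
        self._buf = memoryview(buf)

    def op(self) -> str:
        n = self._buf[0]
        return self._buf[1:1 + n].tobytes().decode("ascii")

    def blob(self) -> memoryview:
        n = self._buf[0]
        return self._buf[1 + n:]  # slicing a memoryview does not copy


def checksum(frame: Frame) -> int:
    """A consumer that works with any format without converting it first."""
    return sum(frame.blob()) % 256
```

Each frame is now a separate object behind an interface, which is where the extra per-frame allocation comes from.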

Asymmetrical frames

Our frames are currently I/O-symmetrical. This means that the same data structure is used when reading a frame and when writing it. This does not have to be the case, as the needs of a frame producer and a frame consumer are often different. A frame consumer could do just fine if the frame is a flat buffer and data fields are just pointers inside, whereas a frame producer needs to be able to create/allocate fields.

Having a frame-builder object is the minimal representation of this idea. I think that we should have a frame builder.
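To make the asymmetry concrete, here is a minimal sketch assuming a hypothetical flat layout of two length-prefixed fields. `FrameBuilder`, `FrameView` and the header format are illustrative names, not existing AC types:

```python
import struct

# Hypothetical layout: [text_len: u32][blob_len: u32][text bytes][blob bytes]
HEADER = struct.Struct("<II")


class FrameBuilder:
    """Producer side: owns storage and allocates as fields are added."""

    def __init__(self):
        self._text = b""
        self._blob = b""

    def set_text(self, text: str) -> "FrameBuilder":
        self._text = text.encode("utf-8")
        return self

    def set_blob(self, blob: bytes) -> "FrameBuilder":
        self._blob = blob
        return self

    def build(self) -> bytes:
        header = HEADER.pack(len(self._text), len(self._blob))
        return header + self._text + self._blob


class FrameView:
    """Consumer side: read-only, fields are slices into the received buffer."""

    def __init__(self, buf: bytes):
        self._buf = memoryview(buf)
        self._text_len, self._blob_len = HEADER.unpack_from(self._buf, 0)

    @property
    def text(self) -> str:
        start = HEADER.size
        return self._buf[start:start + self._text_len].tobytes().decode("utf-8")

    @property
    def blob(self) -> memoryview:
        start = HEADER.size + self._text_len
        return self._buf[start:start + self._blob_len]  # no copy
```

Usage would be along the lines of `FrameView(FrameBuilder().set_text("hi").set_blob(data).build())`: the builder is the only place that allocates, and the view never does for the blob.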

pminev · 2025-03-20T16:25:56Z

pminev
Mar 20, 2025
Maintainer

What do you think about a hybrid approach:

We define an IDL and use, for example, FlatBuffers, which we can use for same-process communication
We use the same IDL to generate a JSON schema so that it can be used from vanilla clients

I think technically it's a possible solution. Some questions come to my mind:

One of the problems is supporting 2 flows (there might be more if we make frame builders/translators components). How much of a problem is that?
The generator to JSON: it exists, but I'm not sure whether everything we need is supported
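A toy sketch of the hybrid idea, assuming a made-up mini-IDL (a list of field names and type tags) in place of a real FlatBuffers schema and its tooling. Everything named here is hypothetical. It derives both a binary layout and a JSON schema from the same definition, and includes a translator between the two flows:

```python
import json
import struct

# Hypothetical mini-IDL: (field name, type tag). Not a real FlatBuffers schema.
FIELDS = [("id", "u32"), ("max_tokens", "u16"), ("temperature", "f32")]

STRUCT_CODES = {"u32": "I", "u16": "H", "f32": "f"}
JSON_TYPES = {"u32": "integer", "u16": "integer", "f32": "number"}


def to_struct_format(fields) -> str:
    """Binary layout for the same-process / codegen flow."""
    return "<" + "".join(STRUCT_CODES[tag] for _, tag in fields)


def to_json_schema(fields) -> dict:
    """JSON schema for the vanilla-client flow, derived from the same IDL."""
    return {
        "type": "object",
        "properties": {name: {"type": JSON_TYPES[tag]} for name, tag in fields},
        "required": [name for name, _ in fields],
    }


def json_to_binary(text: str, fields) -> bytes:
    """Translator layer: a vanilla JSON frame converted to the binary layout."""
    obj = json.loads(text)
    values = [obj[name] for name, _ in fields]
    return struct.pack(to_struct_format(fields), *values)
```

The translator is the second flow that has to be maintained, so its cost grows with every type the IDL supports.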
