Currently, all target languages generate code similar to the following when it's time to create a substream and parse objects from it:
seq:
  - id: foo
    size: some_size
    type: foo_class
this._raw_foo = io.readBytes(someSize());
KaitaiStream _io__raw_foo = new KaitaiStream(_raw_foo);
this.foo = new FooClass(_io__raw_foo, this, _root);
This is inefficient, especially for larger data streams: it needs to load everything into memory and then parse it from there. A more efficient approach would use some sort of substreams in the manner of data views, i.e. something like:
KaitaiStream _io__foo = _io.substream(_io.pos(), someSize());
this.foo = new FooClass(_io__foo, this, _root);
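To make the "data view" idea concrete, here is a minimal sketch built on java.nio.ByteBuffer, assuming the parent stream is backed by a buffer. The class SubstreamSketch and its substream helper are hypothetical names for illustration, not part of the existing Kaitai Struct runtime. The returned view shares the parent's bytes instead of copying them:

```java
import java.nio.ByteBuffer;

class SubstreamSketch {
    /**
     * Hypothetical helper: returns a zero-copy view of the bytes
     * [pos, pos + size) of the parent buffer. The parent's own
     * position and limit are left untouched.
     */
    public static ByteBuffer substream(ByteBuffer parent, int pos, int size) {
        ByteBuffer dup = parent.duplicate(); // shares content, independent position/limit
        dup.limit(pos + size);               // end of the window
        dup.position(pos);                   // start of the window
        return dup.slice();                  // view starts at 0 and has `size` bytes remaining
    }

    public static void main(String[] args) {
        ByteBuffer parent = ByteBuffer.wrap(new byte[] {10, 20, 30, 40, 50, 60});
        ByteBuffer view = substream(parent, 2, 3);
        System.out.println(view.remaining());   // 3
        System.out.println(view.get());         // 30
        System.out.println(parent.position());  // 0
    }
}
```

Note that in the seq case the parent stream would still have to advance its own position by the substream size after creating the view, since the view does not move the parent's position.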
or, for instances that have a known pos field, something like this:
instances:
  foo:
    pos: some_pos
    size: some_size
    type: foo_class
KaitaiStream _io__foo = _io.substream(somePos(), someSize());
this.foo = new FooClass(_io__foo, this, _root);
The devil, of course, is in the details:
- We need to make sure substream functionality exists for every target language (or implement it there).
- Probably, at least for some time, we'll have to support two different ways: parsing via reading _raw_* byte arrays and parsing using substreams.
- We need to decide what to do in lots of other cases:
  - What do we return when reading a type-less value? Would the user expect a byte array, or do we switch to using a substream there too? Or a substream that can be used as a byte array? Or should it be user-choosable?
  - What to do when repeat constructs exist on this field?
  - How would that mix with process? These actually require reading and re-processing the whole byte array in memory. Or should we re-implement them as stream-in => stream-out converters as well?
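As an illustration of the "stream-in => stream-out converter" option for process, here is a minimal sketch of a single-byte XOR applied lazily on read, using the standard java.io.FilterInputStream. The class name XorInputStream is hypothetical and this is not how the runtime currently implements process. It also covers sequential reads only; seeking inside a processed stream, and processing routines that are not byte-for-byte (such as decompression), would need more thought:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical converter: XORs every byte with a one-byte key as it is read. */
class XorInputStream extends FilterInputStream {
    private final int key;

    public XorInputStream(InputStream in, int key) {
        super(in);
        this.key = key & 0xff;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        return b < 0 ? b : (b ^ key); // pass end-of-stream (-1) through unchanged
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) { // loop is skipped when n is -1 (end of stream)
            buf[off + i] ^= key;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = {(byte) 0x5b, (byte) 0x58, (byte) 0x59};
        InputStream decoded = new XorInputStream(new ByteArrayInputStream(encoded), 0x5a);
        System.out.println(decoded.read()); // 1
        System.out.println(decoded.read()); // 2
        System.out.println(decoded.read()); // 3
    }
}
```

Only the bytes actually requested by the parser are processed, so the whole byte array never has to be decoded up front.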