-
Hi, developers. Thanks for all of your works. I'm using dwarfs for archival use, and it turns out to be super space efficient and easy to access with FUSE. I'm interested in the detail design about the on-disk dwarfs image format and want to write my own reader/decompressor. I think having more alternative reader implementations is also good for the community of a file-format. How hard would it be? IIUC, the file format is open sourced in MIT license and should be publicly usable. The only documentation I found is doc/dwarfs-format.md which contains basic block definitions and high-level structures. But it does not define all fine details and I have a few questions:
|
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 9 replies
-
Hi and thanks for your feedback! I've already tried to update
I agree that this would be desirable.
That depends on how much of the code you want to re-implement. The most complex part is very likely going to be reading the metadata. This uses Frozen2, which is a very space-efficient, memory-mappable representation of Thrift data structures. The exact layout of the representation is defined by the "schema", which you've already come across. You can actually inspect the schema using
You can also get a full dump of the metadata:
Having a Frozen2-compatible reader/writer for these data structures that is independent from fbthrift would certainly be desirable, but this is definitely not for the faint of heart. Once you can read the metadata, parts of it may require further unpacking. DwarFS gives you the choice to optimize the stored metadata for size or for "memory-mappability". In the former case, parts of the metadata need to be unpacked, in the latter case, it is directly usable. As also mentioned in
Definitely let me know if there's something missing in
That hasn't been touched in a while and it's not up-to-date. I've added some details to
I've also described this in
Yes,
See above.
This is also linked in
This is currently true. However, the only requirement is that the Let me know if you need more information. |
Beta Was this translation helpful? Give feedback.
Hi and thanks for your feedback!
I've already tried to update
dwarfs-format.md
based on some of your questions.I agree that this would be desirable.
That depends on how much of the code you want to re-implement.
The most complex part is very likely going to be reading the metadata. This uses Frozen2, which is a very space-efficient, memory-mappable representation of Thrift data structures. The exact layout of the representation is defined by the "schema", which you've already come across.
You can actually inspect the schema using
dwarfsck
: