Multi Layer Archive (MLA)

MLA is an archive file format with the following features:

Support for hybrid traditional and post-quantum encryption with asymmetric keys (HPKE with AES256-GCM and a KEM that combines X25519 with post-quantum ML-KEM 1024)
Support for hybrid traditional and post-quantum signing
Support for compression (based on rust-brotli)
Streamable archive creation:
- An archive can be built even over a data-diode
- An entry can be added through chunks of data, without initially knowing the final size
- Entry chunks can be interleaved (one can add the beginning of an entry, start a second one, and then continue adding the first entry's parts)
Architecture agnostic and portable to some extent (written entirely in Rust)
Archive reading is seekable, even if compressed or encrypted. An entry can be accessed in the middle of the archive without reading from the beginning
If truncated, archives can be repaired to some extent. Two modes are available:
- Authenticated repair (default): only encrypted chunks that pass authentication are retrieved (authentication as in AEAD; no signature verification is performed)
- Unauthenticated repair: authenticated and unauthenticated encrypted chunks of data are retrieved. Use at your own risk.
Arguably less prone to bugs, especially while parsing an untrusted archive (Rust safety)
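
The interleaving property listed above can be illustrated with a small conceptual sketch. The `Block` enum, the numeric entry IDs, and the `reassemble` function are hypothetical names invented for this illustration; they are not the MLA on-disk format or the `mla` API. They only model how chunks tagged with an entry identifier can be written in any order, without knowing the final size, and regrouped on read.

```rust
use std::collections::HashMap;

/// Hypothetical block types, for illustration only (not the MLA format).
#[derive(Debug)]
pub enum Block {
    /// Announces a new entry and binds a name to an ID.
    Start { id: u64, name: String },
    /// Carries one chunk of data for an already started entry.
    Chunk { id: u64, data: Vec<u8> },
    /// Marks the entry as complete; its final size is only known here.
    End { id: u64 },
}

/// Regroups an interleaved stream of blocks into (name -> content) pairs.
/// Only entries that were properly ended are returned.
pub fn reassemble(stream: &[Block]) -> HashMap<String, Vec<u8>> {
    let mut open: HashMap<u64, (String, Vec<u8>)> = HashMap::new();
    let mut done: HashMap<String, Vec<u8>> = HashMap::new();
    for block in stream {
        match block {
            Block::Start { id, name } => {
                open.insert(*id, (name.clone(), Vec::new()));
            }
            Block::Chunk { id, data } => {
                // Chunks for unknown IDs are ignored in this sketch.
                if let Some((_, content)) = open.get_mut(id) {
                    content.extend_from_slice(data);
                }
            }
            Block::End { id } => {
                if let Some((name, content)) = open.remove(id) {
                    done.insert(name, content);
                }
            }
        }
    }
    done
}
```

In this model, a writer can emit the start of one entry, begin a second one, and then continue the first; the reader only needs the ID carried by each block to put the pieces back together.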

Repository

This repository contains:

mla: the Rust library implementing the MLA reader and writer
mlar: a Rust CLI utility wrapping mla for common actions (create, list, extract...)
doc: documentation related to MLA (e.g. format specification, cryptography)
- MLA book
bindings: bindings for other languages
samples: test assets
mla-fuzz-afl: a Rust utility to fuzz mla
.github: Continuous Integration configuration

Quick command-line usage

The following commands show how to use mlar to work with archives in the MLA format.

# Generate MLA key pairs.
mlar keygen sender
mlar keygen receiver

# Create an archive with some files.
mlar create -k sender.mlapriv -p receiver.mlapub -o my_archive.mla /etc/./os-release /etc/security/../issue ../file.txt

# List the content of the archive.
# Note that order may vary, root directories are stripped,
# paths are normalized and listing is encoded as described in
# `doc/src/ENTRY_NAME.md`.
# This outputs:
# ``
# etc/issue
# etc/os-release
# file.txt
# ``
mlar list -k receiver.mlapriv -p sender.mlapub -i my_archive.mla

# Extract the content of the archive into a new directory.
# In this example, this creates three files:
# extracted_content/etc/issue, extracted_content/etc/os-release
# and extracted_content/file.txt
mlar extract -k receiver.mlapriv -p sender.mlapub -i my_archive.mla -o extracted_content

# Display the content of a file in the archive
mlar cat -k receiver.mlapriv -p sender.mlapub -i my_archive.mla etc/os-release

# Convert the archive into a long-term format, primarily for archival purposes.
# The operation below also removes encryption and applies
# the highest (but slowest) compression level.
mlar convert -k receiver.mlapriv -p sender.mlapub -i my_archive.mla -o longterm.mla -l compress -q 11

# Create an archive with multiple recipients, without signature or compression
mlar create -l encrypt -p archive.mlapub -p client1.mlapub -o my_archive.mla ...

# List an archive containing an entry with a name that cannot be interpreted as a path.
# This outputs:
# `c%3a%2f%00%3b%e2%80%ae%0ac%0dd%1b%5b1%3b31ma%3cscript%3eevil%5c..%2f%d8%01%c2%85%e2%88%95`
# corresponding to an entry name containing: ASCII chars, c:, /, .., \,
# NUL, RTLO, newline, terminal escape sequence, carriage return,
# HTML, surrogate code unit, U+0085 weird newline, fake unicode slash.
# Please note that some of these characters may appear in a valid path.
mlar list -k samples/test_mlakey_archive_v2_receiver.mlapriv -p samples/test_mlakey_archive_v2_sender.mlapub -i samples/archive_weird.mla --raw-escaped-names

# Get its content.
# This displays:
# `' OR 1=1`
mlar cat -k samples/test_mlakey_archive_v2_receiver.mlapriv -p samples/test_mlakey_archive_v2_sender.mlapub -i samples/archive_weird.mla --raw-escaped-names c%3a%2f%00%3b%e2%80%ae%0ac%0dd%1b%5b1%3b31ma%3cscript%3eevil%5c..%2f%d8%01%c2%85%e2%88%95

# Create an archive from a web file and a UTF-8 string, without compression, encryption, or signature
(curl https://raw.githubusercontent.com/ANSSI-FR/MLA/refs/heads/main/LICENSE.md; echo "SEP"; echo "All Hail MLA!") | mlar create -l -o my_archive.mla --separator "SEP" --filenames great_license.md -
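
The path normalization visible in the `mlar list` output above (`/etc/./os-release` listed as `etc/os-release`, `/etc/security/../issue` as `etc/issue`, `../file.txt` as `file.txt`) can be approximated with the standard library. This is a minimal sketch inferred from that example output only: the function name `normalize` is hypothetical, and the actual rules applied by mlar are those described in `doc/src/ENTRY_NAME.md`.

```rust
use std::path::{Component, Path};

/// Normalizes a filesystem path into an archive-style entry name:
/// root and prefix components are dropped, `.` is skipped, and `..`
/// removes the previous component (or is discarded if there is none).
/// Assumption: this mirrors the example listing, not mlar's real code.
pub fn normalize(path: &str) -> String {
    let mut parts: Vec<String> = Vec::new();
    for component in Path::new(path).components() {
        match component {
            Component::Prefix(_) | Component::RootDir | Component::CurDir => {}
            Component::ParentDir => {
                parts.pop();
            }
            Component::Normal(name) => {
                parts.push(name.to_string_lossy().into_owned());
            }
        }
    }
    parts.join("/")
}
```

With this sketch, the three paths passed to `mlar create` in the example map to the three names shown in the listing.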

mlar can be obtained:

through Cargo: cargo install mlar
using the latest release for supported operating systems
- The released binaries are built with opt-level = 3, enabling great performance

For even higher performance, you can build a native-optimized binary (not portable), for example on a Linux machine:

RUSTFLAGS="-Ctarget-cpu=native" cargo build --release --target x86_64-unknown-linux-musl

Note: Native builds are optimized for your machine's CPU and are not portable. Use them only when running on the same machine you build on.

API usage

See https://docs.rs/mla

Using MLA with other languages

Bindings are available for:

C/CPP
Python

Security

Keep in mind that it is generally not safe to extract into a location where at least one ancestor directory is writable by others (symbolic link attacks).
Even if it is encrypted with an authenticated cipher, an unsigned archive may have been crafted by anyone who has your public key, and can thus contain arbitrary data.
Read the API documentation and the mlar help before using their functionalities. They sometimes provide important security warnings. doc/src/ENTRY_NAME.md is also of particular interest.
mlar escapes entry names on output to avoid security issues.
Except for symbolic link attacks, mlar will not extract outside the given output directory.
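
The escaped form printed by `--raw-escaped-names` in the command-line examples can be reproduced with a simple percent-encoding. The sketch below is an assumption inferred from that single example: it preserves only ASCII letters, digits, and `.`, and encodes every other byte as `%` followed by two lowercase hexadecimal digits. The function name `escape_entry_name` is hypothetical, and `doc/src/ENTRY_NAME.md` remains the authoritative description.

```rust
/// Percent-encodes every byte that is not an ASCII letter, digit, or `.`,
/// using lowercase hexadecimal. The set of preserved bytes is an assumption
/// inferred from the README example, not taken from the mla source code.
pub fn escape_entry_name(raw: &[u8]) -> String {
    let mut out = String::with_capacity(raw.len());
    for &byte in raw {
        if byte.is_ascii_alphanumeric() || byte == b'.' {
            out.push(byte as char);
        } else {
            out.push_str(&format!("%{:02x}", byte));
        }
    }
    out
}
```

Because the output contains only ASCII letters, digits, `.`, and `%`, it cannot carry terminal escape sequences, newlines, or path separators to whatever displays or parses it.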

FAQ

Is MLAArchiveWriter Send?

By default, MLAArchiveWriter is not Send. If the inner writable type is also Send, one can enable the feature send for mla in Cargo.toml, such as:

[dependencies]
mla = { version = "...", default-features = false, features = ["send"]}
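
What a `Send` bound allows in practice can be sketched with a stand-in writer. The example below does not use `mla` at all: `write_in_thread` is a hypothetical helper, and any `std::io::Write` type (here typically a `Vec<u8>`) stands in for an archive writer. It only shows why a writer must be `Send` before it can be moved into another thread.

```rust
use std::io::Write;
use std::thread;

/// Moves a writer into a worker thread, writes `data`, and hands it back.
/// The `Send + 'static` bound is what `thread::spawn` requires; a writer
/// type that is not `Send` would be rejected at compile time.
pub fn write_in_thread<W>(mut writer: W, data: Vec<u8>) -> std::io::Result<W>
where
    W: Write + Send + 'static,
{
    let handle = thread::spawn(move || -> std::io::Result<W> {
        writer.write_all(&data)?;
        Ok(writer)
    });
    handle.join().expect("writer thread panicked")
}
```

Calling `write_in_thread(Vec::<u8>::new(), b"hello".to_vec())` returns the vector filled by the worker thread.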

Was a new format really required?

As existing archive formats are numerous, probably not.

But to the best of the authors' knowledge, none of them supports all of the aforementioned features (although they are, of course, better suited to other purposes).

For instance (to the authors' understanding):

tar format needs to know the size of files before adding them, and is not seekable
zip format could lose information about files if the footer is removed
7zip format requires rebuilding the entire archive when adding files to it (not streamable). It is also quite complex, and so harder to audit or trust when unpacking an unknown archive
journald format is not streamable. Also, the one writer / multiple readers model is not needed here, which lifts some of the constraints the journald format has
any archive + age: age does not, as of MLA 2.0 release, support post-quantum encryption or signatures
Backup formats are generally written to avoid things such as duplication, hence their need to keep bigger structures in memory, or their lack of streamability

Tweaking these formats would likely have resulted in similar properties. The choice was made to keep better control over what the format is capable of, and to (try to) KISS.

Performance

Performance can be evaluated with the embedded benchmarks, which are based on Criterion.

Several scenarios are already embedded, such as:

File addition, with different size and layer configurations
File addition, varying the compression quality
File reading, with different size and layer configurations
Random file read, with different size and layer configurations
Linear archive extraction, with different size and layer configurations

On an "Intel(R) Core(TM) i7-1255U CPU @ 2.60GHz":

$ cargo bench
...
multiple_layers_multiple_block_size/compression: true, encryption: true, signature: true/1048576
                        time:   [7.0850 ms 7.1179 ms 7.1586 ms]
                        thrpt:  [139.69 MiB/s 140.49 MiB/s 141.14 MiB/s]
...
chunk_size_decompress_mutilfiles_random/compression: true, encryption: true, signature: true/1048576
                        time:   [11.285 ms 11.494 ms 11.663 ms]
                        thrpt:  [85.745 MiB/s 87.005 MiB/s 88.616 MiB/s]
...
reader_multiple_layers_multiple_block_size_multifiles_linear/compression: true, encryption: true, signature: true/1048576
                        time:   [4.6197 ms 4.6383 ms 4.6604 ms]
                        thrpt:  [214.58 MiB/s 215.60 MiB/s 216.47 MiB/s]
...

The Criterion.rs documentation explains how to generate HTML reports, compare results, etc.
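
The throughput figures in the benchmark output follow directly from the timings. As a worked example, and assuming the trailing `/1048576` in each benchmark name is the payload size in bytes (1 MiB), the median time of 7.1179 ms corresponds to about 140.49 MiB/s. The helper name `throughput_mib_per_s` below is hypothetical.

```rust
/// Converts a payload size in bytes and an elapsed time in milliseconds
/// into a throughput expressed in MiB/s.
pub fn throughput_mib_per_s(bytes: u64, millis: f64) -> f64 {
    let mib = bytes as f64 / (1024.0 * 1024.0);
    mib / (millis / 1000.0)
}
```

The same computation applied to the linear read benchmark (4.6383 ms for 1 MiB) gives roughly 215.60 MiB/s, which matches the reported value.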

The AES-NI extension is enabled in the compilation toolchain for the supported architectures, which yields a significant performance gain for the encryption layer, especially for read operations. Because the aesni crate enables it statically, running on a CPU that does not support it may cause errors. It can be disabled at compile time, or by commenting out the associated section in .cargo/config.

Contributing

We appreciate your help! To contribute, please read our contributing instructions.
