Will this support WASMFS? #3
Hi @Ercilan, I think what you're after is WORKERFS, which is included in the recommended build. Combine this with the single-threaded version and the browser should allow you to randomly access chosen, huge files directly from the user's storage without loading them all into memory. If I remember, this requires manually wrapping the single-threaded version of JS7z in a worker, and this didn't work with the multi-threaded version the last time I tried.

Streaming via stdin and stdout is also possible, so you don't have to load data all at once, but that restricts you to filetypes that support sequential reads/writes, and it won't support types that need random access. From memory (and I might be wrong) -- I think the

Regarding the 'new instance' restriction -- for nested decompression steps, you're better off using PROXYFS, which again, is included. You still have to create a new instance, because WASM runtime restarts are necessary to ensure memory is correctly reset, just like when you use the terminal, but it does mean you don't have to copy data between the Emscripten FS and JavaScript. Command-line ports that don't operate this way are unsafe for reasons I've indicated here: https://github.com/GMH-Code/JS7z?tab=readme-ov-file#technical-info-multi-start-safety-in-emscripten-projects . A part I forgot to add to that documentation is that the 7-Zip port doesn't actually exit 'tens of times a second' like my other WASM projects, because the default multi-threaded version runs almost entirely in workers.

OPFS could in theory be used for temporarily holding large files outside of RAM, and I've looked at using WASMFS to improve performance and reliability of filesystem synchronisation for other projects (mainly to handle performance issues when using threads), but it doesn't look ready enough.

See WasmFS (view) -- there's a lot to do on it. I've been keeping an eye on it, but won't be using it on any projects until it is officially released.
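For anyone trying the WORKERFS route described above, here is a minimal sketch of the mount step. It assumes the code runs inside a Web Worker (WORKERFS is read-only and only works there) and that the instance exposes Emscripten's `FS` object with `FS.filesystems.WORKERFS` available. `mountUserFiles` and the `/in` mount point are hypothetical names chosen for illustration, not part of JS7z.

```javascript
// Hypothetical helper: mounts user-selected File objects via WORKERFS so the
// archive is read on demand instead of being copied into MEMFS.
// `module` is assumed to be an Emscripten module instance exposing `FS`.
function mountUserFiles(module, files, mountPoint = '/in') {
  const { FS } = module;
  FS.mkdir(mountPoint);
  // WORKERFS accepts an array of File objects under the `files` option.
  FS.mount(FS.filesystems.WORKERFS, { files }, mountPoint);
  // Return the in-FS paths that can then be passed as command-line arguments.
  return Array.from(files, (file) => `${mountPoint}/${file.name}`);
}
```

Inside the worker, the returned paths could then be handed to the archive command, for example as the archive argument of an extract call. Output still has to be written somewhere else, because a WORKERFS mount cannot be written to.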
I'm able to use your library to decompress a large, nested archive file in the browser. (It seems that for each nested decompression step I need to create a new JS7z instance; otherwise onExit is not called, and onExit is the only callback I can use to find out when decompression has finished.) However, the file sizes can be quite large, for example 4 GB or more (I have only tested 1 GB), and loading everything into memory could become a serious issue.
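The one-instance-per-step pattern described above can be sketched roughly as follows. This is an illustration only: it assumes the module factory accepts `onExit` and `onAbort` options and that the resolved instance exposes `callMain`. `runOnce` and its `prepare` parameter are hypothetical names, not part of JS7z.

```javascript
// Hypothetical wrapper: runs one command in a fresh instance and resolves
// with the exit code reported through onExit.
// `createInstance` is the module factory (assumed here to be JS7z).
function runOnce(createInstance, args, prepare) {
  return new Promise((resolve, reject) => {
    createInstance({
      onExit: (exitCode) => resolve(exitCode),
      onAbort: (reason) => reject(new Error(String(reason))),
    }).then((instance) => {
      // Optional hook for writing or mounting input files before the run.
      if (prepare) prepare(instance);
      instance.callMain(args);
    }, reject);
  });
}
```

Each nested step would then call `runOnce` again with a new set of arguments, so every run starts from a freshly initialised runtime.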
I did some research and found WasmFS. It seems to offer better performance, and it also appears to support direct read/write access to OPFS, or to "enable streaming files directly to a backend without loading the entire file into memory similar to the function of WorkerFS".
https://emscripten.org/docs/api_reference/Filesystem-API.html#new-file-system-wasmfs
File System Layer Design Doc
Would it be possible for your library to support WasmFS? Or is there a better way to enable streaming or avoid loading the entire archive into memory?
I'm sorry, I'm not familiar with C/C++ and I'm using Windows, so compiling it myself would be quite costly in terms of time and effort.
Currently, with the MEMFS-based implementation, the available memory must be at least twice the size of the archive: once for the compressed source file, and at least once more for the extracted contents.