From cc8eb66fa2b2e05840ecb2e2a369bc615b2df8ee Mon Sep 17 00:00:00 2001
From: Erfan <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 12:21:25 +0330
Subject: [PATCH 01/21] Update README.md

WIP Features
---
 README.md | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 89 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 10956182c1..8826c9c959 100644
--- a/README.md
+++ b/README.md
@@ -95,7 +95,95 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida,
 
 # Features
 
-< features > 
+### [The Nabla Core Profile](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json)
+
+Nabla exposes a well-defined, curated set of Vulkan extensions and features compatible across the GPUs we aim to support on [TODO:PLATFORMS].
+
+### Physical Device Selection and Filteration
+
+Nabla allows you to select the best GPU for your workload.
+
+[TODO]: Maybe merge this somehow with the Nabla Core Profile?
+[TODO]: Replace with some cooler code selecting a GPU for a cool (and common) workload
+```c++
+nbl::video::SPhysicalDeviceFilter deviceFilter = {};
+
+deviceFilter.minApiVersion = { 1,3,0 };
+deviceFilter.minConformanceVersion = {1,3,0,0};
+
+deviceFilter.minimumLimits = getRequiredDeviceLimits();
+deviceFilter.requiredFeatures = getRequiredDeviceFeatures();
+
+deviceFilter.requiredImageFormatUsagesOptimalTiling = getRequiredOptimalTilingImageUsages();
+
+const auto memoryReqs = getMemoryRequirements();
+deviceFilter.memoryRequirements = memoryReqs;
+
+const auto queueReqs = getQueueRequirements();
+deviceFilter.queueRequirements = queueReqs;
+
+deviceFilter(physicalDevices);
+```
+
+### SPIR-V and Vulkan as First-Class Citizens
+
+Nabla treats SPIR-V and Vulkan as core components, this ensures full control over [TODO] ..
+
+### Integration of Renderdoc
+
+Built-in support for capturing frames and debugging with Renderdoc.
+
+```c++
+const IQueue::SSubmitInfo submitInfo = {
+    .waitSemaphores = {},
+    .commandBuffers = {&cmdbufInfo,1},
+    .signalSemaphores = {&signalInfo,1}
+};
+
+m_api->startCapture(); // Start Renderdoc Capture
+
+queue->submit({&submitInfo,1});
+
+m_api->endCapture(); // End Renderdoc Capture
+```
+
+### Nabla Event Handler: Seamless GPU-CPU Synchronization
+
+Nabla Event Handler's extensive usage of [Timeline Semaphores]() enables CPU Callbacks on GPU conditions.
+
+You can enqueue callbacks that trigger upon specific GPU conditions, making it possible to handle tasks such as resource deallocation only after the GPU has completed relevant work.
+```c++
+// This doesn't actually free the memory from the pool, the memory is queued up to be freed only after the `scratchSemaphore` reaches a value a future submit will signal
+memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
+```
+
+### GPU Object Lifecycle Tracking
+
+Nabla uses [smart reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
+
+[TODO] Code?
+
+---
+[TODO]: Remaining:
+- Reusability: HLSL2021 Standard Template Library
+- Testability: HLSL subset compiling as both C++ Host and SPIR-V Device code
+- Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
+- Insane: Boost PreProcessor and Template Metaprogramming in HLSL!
+- Embraces Buffer Device Address and Descriptor Indexing to the full
+- Minimally Invasive (vulkan handle acquisition, multiple windows, content playing second fiddle)
+- Designed for Interoperation (memory export, import and Coming Soon: CUDA Interop)
+- Cancellable Future based Async I/O
+- Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
+- Asset Managment: Directed Acyclic Graphs
+- Asset Converter: Merkle Trees de-duplicating GPU Object Instances
+- Unit tested BxDFs in a Statically Polymorhic framework
+- In Progress: GPU ECS (Property Pools)
+- SPIR-V Introspection and Layout creation
+- Extensions (ImGUI, FFT, Workgroup Prefix Sum, Blur, Counting Sort In Progress: Autoexposure, Tonemap, - GPU MPMC Queue, OptiX Interop, Global Scan)
+- Coming Soon: Scene Loaders, GPU Driven Scene Graph, Material Compiler v2 for efficient scheduling of - BxDF graph evaluation
+
+
+### [TODO?] IUtiltities (Using Fixed-sized staging memory for easier cpu-gpu transfers?)
 
 # FAQ
 

From 25e7cdc4875391775145e7365cfee1ebaca9a6a8 Mon Sep 17 00:00:00 2001
From: Erfan <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 15:36:34 +0330
Subject: [PATCH 02/21] Update README.md

---
 README.md | 71 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 39 insertions(+), 32 deletions(-)

diff --git a/README.md b/README.md
index 8826c9c959..a8f3db909c 100644
--- a/README.md
+++ b/README.md
@@ -93,41 +93,29 @@ TODO aspect ratio + images alignment + more more images
 
 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida, tristique quam quis, dignissim purus. Sed sed neque facilisis, venenatis odio in, dignissim risus. Nulla facilisi. Aliquam dictum volutpat ligula. Quisque vehicula condimentum bibendum. Morbi posuere, libero ac porttitor molestie, sem enim molestie sapien, at consectetur metus lacus nec justo. Sed sollicitudin nisl ut tellus posuere pharetra. Phasellus in rutrum elit. Nunc dui dui, ultricies eu nunc in, dictum gravida eros. Integer fermentum in turpis non ultricies. Cras sit amet sagittis sapien. Integer dignissim mauris ac magna dapibus, non ultrices risus rhoncus. Sed gravida hendrerit mattis. Pellentesque a congue massa. Nullam in cursus libero. Ut ac tristique mauris.
 
+
 # Features
 
-### [The Nabla Core Profile](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json)
+### The Nabla Core Profile
 
-Nabla exposes a well-defined, curated set of Vulkan extensions and features compatible across the GPUs we aim to support on [TODO:PLATFORMS].
+Nabla exposes [a well-defined, curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support. (TODO: on which platforms?)
 
 ### Physical Device Selection and Filteration
 
 Nabla allows you to select the best GPU for your workload.
 
-[TODO]: Maybe merge this somehow with the Nabla Core Profile?
-[TODO]: Replace with some cooler code selecting a GPU for a cool (and common) workload
 ```c++
 nbl::video::SPhysicalDeviceFilter deviceFilter = {};
-
 deviceFilter.minApiVersion = { 1,3,0 };
 deviceFilter.minConformanceVersion = {1,3,0,0};
-
-deviceFilter.minimumLimits = getRequiredDeviceLimits();
-deviceFilter.requiredFeatures = getRequiredDeviceFeatures();
-
-deviceFilter.requiredImageFormatUsagesOptimalTiling = getRequiredOptimalTilingImageUsages();
-
-const auto memoryReqs = getMemoryRequirements();
-deviceFilter.memoryRequirements = memoryReqs;
-
-const auto queueReqs = getQueueRequirements();
-deviceFilter.queueRequirements = queueReqs;
-
+deviceFilter.requiredFeatures.rayQuery = true;
+// TODO: add something else here
 deviceFilter(physicalDevices);
 ```
 
 ### SPIR-V and Vulkan as First-Class Citizens
 
-Nabla treats SPIR-V and Vulkan as core components, this ensures full control over [TODO] ..
+Nabla treats SPIR-V and Vulkan as core components, this ensures full control over [TODO]
 
 ### Integration of Renderdoc
 
@@ -139,11 +127,8 @@ const IQueue::SSubmitInfo submitInfo = {
     .commandBuffers = {&cmdbufInfo,1},
     .signalSemaphores = {&signalInfo,1}
 };
-
 m_api->startCapture(); // Start Renderdoc Capture
-
 queue->submit({&submitInfo,1});
-
 m_api->endCapture(); // End Renderdoc Capture
 ```
 
@@ -151,7 +136,7 @@ m_api->endCapture(); // End Renderdoc Capture
 
 Nabla Event Handler's extensive usage of [Timeline Semaphores]() enables CPU Callbacks on GPU conditions.
 
-You can enqueue callbacks that trigger upon specific GPU conditions, making it possible to handle tasks such as resource deallocation only after the GPU has completed relevant work.
+You can enqueue callbacks that trigger upon specific GPU conditions, enabling tasks like resource deallocation to be handled only after the GPU has completed the relevant work.
 ```c++
 // This doesn't actually free the memory from the pool, the memory is queued up to be freed only after the `scratchSemaphore` reaches a value a future submit will signal
 memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
@@ -161,17 +146,41 @@ memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
 
 Nabla uses [smart reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
 
-[TODO] Code?
+```cpp
+[TODO][CODE]
+```
+
+### HLSL2021 Standard Template Library
+
+- 🔄 Reusable: Unified single-source C++/HLSL libraries eliminate code duplication with reimplementation of STL's `type_traits`, `limits`, `functional`, `tgmath`, etc.
+
+- 🐞 Shader Logic, CPU-Tested: A subset of HLSL compiles as both C++ and SPIR-V, enabling CPU-side debugging of GPU logic, ensuring correctness in complex tasks like FFT, Prefix Sum, etc.
+Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
+
+- 🔮 Future-Proof: C++20 concepts in HLSL enable safe and documented polymorphism.
+
+- 🧠 Insane: Boost Preprocessor and Template Metaprogramming in HLSL!
+
+- 🛠️ Real-World Problem Solvers: The library offers GPU-optimized solutions for tasks like Prefix Sum, Binary Search, FFT, Global Sort, and even emulated `shaderFloat64` when native GPU support is unavailable!
+
+```cpp
+[TODO][CODE] Code for each or just one showcasing most of the above points?
+```
+
+### Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
+
+By utilizing Buffer Device Addresses (BDAs), Nabla allows more efficient direct access to GPU memory; synergized with Descriptor Indexing, it improves flexibility by enabling more dynamic, scalable resource binding without relying on traditional descriptor sets.
+
+### Minimally Invasive Design
+[TODO]: vulkan handle acquisition, multiple windows, content playing second fiddle
+
+### Designed for Interoperation
+Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.
+
+🚀 Coming soon: Full CUDA Interop support for enhanced cross-platform compatibility.
 
 ---
 [TODO]: Remaining:
-- Reusability: HLSL2021 Standard Template Library
-- Testability: HLSL subset compiling as both C++ Host and SPIR-V Device code
-- Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
-- Insane: Boost PreProcessor and Template Metaprogramming in HLSL!
-- Embraces Buffer Device Address and Descriptor Indexing to the full
-- Minimally Invasive (vulkan handle acquisition, multiple windows, content playing second fiddle)
-- Designed for Interoperation (memory export, import and Coming Soon: CUDA Interop)
 - Cancellable Future based Async I/O
 - Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
 - Asset Managment: Directed Acyclic Graphs
@@ -181,8 +190,6 @@ Nabla uses [smart reference counting]() to track the lifecycle of GPU objects. D
 - SPIR-V Introspection and Layout creation
 - Extensions (ImGUI, FFT, Workgroup Prefix Sum, Blur, Counting Sort In Progress: Autoexposure, Tonemap, - GPU MPMC Queue, OptiX Interop, Global Scan)
 - Coming Soon: Scene Loaders, GPU Driven Scene Graph, Material Compiler v2 for efficient scheduling of - BxDF graph evaluation
-
-
 ### [TODO?] IUtiltities (Using Fixed-sized staging memory for easier cpu-gpu transfers?)
 
 # FAQ

From cacef051e03667c15be8f6e86463556cc5558ad5 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 16:15:20 +0400
Subject: [PATCH 03/21] emojis are not bad :D

---
 README.md | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index a8f3db909c..e5ab37b658 100644
--- a/README.md
+++ b/README.md
@@ -96,11 +96,11 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida,
 
 # Features
 
-### The Nabla Core Profile
+### 🧩 The Nabla Core Profile
 
 Nabla exposes [a well-defined, curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support. (TODO: on which platforms?)
 
-### Physical Device Selection and Filteration
+### 🧩 Physical Device Selection and Filteration
 
 Nabla allows you to select the best GPU for your workload.
 
@@ -113,7 +113,7 @@ deviceFilter.requiredFeatures.rayQuery = true;
 deviceFilter(physicalDevices);
 ```
 
-### SPIR-V and Vulkan as First-Class Citizens
+### 🧩 SPIR-V and Vulkan as First-Class Citizens
 
 Nabla treats SPIR-V and Vulkan as core components, this ensures full control over [TODO]
 
@@ -132,7 +132,7 @@ queue->submit({&submitInfo,1});
 m_api->endCapture(); // End Renderdoc Capture
 ```
 
-### Nabla Event Handler: Seamless GPU-CPU Synchronization
+### 🧩 Nabla Event Handler: Seamless GPU-CPU Synchronization
 
 Nabla Event Handler's extensive usage of [Timeline Semaphores]() enables CPU Callbacks on GPU conditions.
 
@@ -142,15 +142,11 @@ You can enqueue callbacks that trigger upon specific GPU conditions, enabling ta
 memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
 ```
 
-### GPU Object Lifecycle Tracking
+### 🧩 GPU Object Lifecycle Tracking
 
 Nabla uses [smart reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
 
-```cpp
-[TODO][CODE]
-```
-
-### HLSL2021 Standard Template Library
+### 🧩 HLSL2021 Standard Template Library
 
 - 🔄 Reusable: Unified single-source C++/HLSL libraries eliminate code duplication with reimplementation of STL's `type_traits`, `limits`, `functional`, `tgmath`, etc.
 
@@ -167,14 +163,14 @@ Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
 [TODO][CODE] Code for each or just one showcasing most of the above points?
 ```
 
-### Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
+### 🧩 Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
 
 By utilizing Buffer Device Addresses (BDAs), Nabla allows more efficient direct access to GPU memory; synergized with Descriptor Indexing, it improves flexibility by enabling more dynamic, scalable resource binding without relying on traditional descriptor sets.
 
-### Minimally Invasive Design
+### 🧩 Minimally Invasive Design
 [TODO]: vulkan handle acquisition, multiple windows, content playing second fiddle
 
-### Designed for Interoperation
+### 🧩 Designed for Interoperation
 Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.
 
 🚀 Coming soon: Full CUDA Interop support for enhanced cross-platform compatibility.
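
Review note on the "GPU Object Lifecycle Tracking" paragraph carried by this patch: the README still has no code for it, so here is a minimal sketch of the idea, assuming `std::shared_ptr` as a substitute for Nabla's own reference counting. `Buffer`, `CommandBuffer`, `Queue`, `bind`, `submit` and `retireAll` are hypothetical names for illustration, not the Nabla API.

```c++
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; std::shared_ptr substitutes for Nabla's reference counting.
struct Buffer
{
    explicit Buffer(bool& aliveFlag) : alive(aliveFlag) { alive = true; }
    ~Buffer() { alive = false; }
    bool& alive; // lets the caller observe when the object is destroyed
};

struct CommandBuffer
{
    // Recording a command that uses a resource also takes a reference to it.
    void bind(std::shared_ptr<Buffer> buffer) { m_used.push_back(std::move(buffer)); }
private:
    std::vector<std::shared_ptr<Buffer>> m_used;
};

struct Queue
{
    // A submitted command buffer stays referenced while the GPU may still run it.
    void submit(std::shared_ptr<CommandBuffer> cmdbuf) { m_pending.push_back(std::move(cmdbuf)); }
    // Called once the (simulated) GPU work has finished.
    void retireAll() { m_pending.clear(); }
private:
    std::vector<std::shared_ptr<CommandBuffer>> m_pending;
};
```

If the user drops every handle right after submitting, the buffer survives until `retireAll()` runs, which is the ordering guarantee the paragraph describes.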

From 8d33ee24530560df2f1423acdd3aee1289792f2d Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 16:17:22 +0400
Subject: [PATCH 04/21] small edits

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index e5ab37b658..f353d1117d 100644
--- a/README.md
+++ b/README.md
@@ -175,8 +175,8 @@ Nabla is built with interoperation in mind, supporting memory export and import
 
 🚀 Coming soon: Full CUDA Interop support for enhanced cross-platform compatibility.
 
----
-[TODO]: Remaining:
+
+[TODO]:
 - Cancellable Future based Async I/O
 - Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
 - Asset Managment: Directed Acyclic Graphs
@@ -186,7 +186,7 @@ Nabla is built with interoperation in mind, supporting memory export and import
 - SPIR-V Introspection and Layout creation
 - Extensions (ImGUI, FFT, Workgroup Prefix Sum, Blur, Counting Sort In Progress: Autoexposure, Tonemap, - GPU MPMC Queue, OptiX Interop, Global Scan)
 - Coming Soon: Scene Loaders, GPU Driven Scene Graph, Material Compiler v2 for efficient scheduling of - BxDF graph evaluation
-### [TODO?] IUtiltities (Using Fixed-sized staging memory for easier cpu-gpu transfers?)
+### [TODO?] IUtiltities?/ Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion?
 
 # FAQ
 

From 4cb1f2c9933b344b4ad4409e37234338361edd0b Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 17:45:54 +0400
Subject: [PATCH 05/21] Asset Stuff and BxDFs

---
 README.md | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index f353d1117d..9ef1c905d8 100644
--- a/README.md
+++ b/README.md
@@ -175,13 +175,25 @@ Nabla is built with interoperation in mind, supporting memory export and import
 
 🚀 Coming soon: Full CUDA Interop support for enhanced cross-platform compatibility.
 
+----
 
 [TODO]:
 - Cancellable Future based Async I/O
 - Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
-- Asset Managment: Directed Acyclic Graphs
-- Asset Converter: Merkle Trees de-duplicating GPU Object Instances
-- Unit tested BxDFs in a Statically Polymorhic framework
+
+----
+
+### 🧩 Asset Manager
+Nabla’s Asset Manager efficiently loads assets while tracking dependencies using a Directed Acyclic Graph (DAG). It loads assets in the correct order, avoids redundant allocations, and simplifies resource management.
+
+### 🧩 Asset Converter (CPU to GPU)
+The Asset Converter transforms CPU objects (asset::IAsset) into GPU objects (video::IBackendObject) while eliminating duplicates. Instead of relying on pointer comparisons, it hashes asset contents to detect and reuse identical GPU objects.
+
+### 🧩 Unit-Tested BxDFs for Physically Based Rendering
+A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
+
+
+[TODO]:
 - In Progress: GPU ECS (Property Pools)
 - SPIR-V Introspection and Layout creation
 - Extensions (ImGUI, FFT, Workgroup Prefix Sum, Blur, Counting Sort In Progress: Autoexposure, Tonemap, - GPU MPMC Queue, OptiX Interop, Global Scan)
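
Review note on the "Asset Converter" paragraph added in this patch: a short sketch of what "hashing asset contents to reuse identical GPU objects" could look like in plain C++. `CpuBuffer`, `GpuBuffer` and `DedupConverter` are hypothetical stand-ins rather than Nabla types, and a flat `std::hash` over the bytes replaces whatever hashing scheme the real converter uses.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins; NOT asset::IAsset or video::IBackendObject.
struct CpuBuffer { std::string bytes; };
struct GpuBuffer { std::string uploaded; };

class DedupConverter
{
public:
    // Returns a GPU object for `cpu`, reusing an earlier one when the contents match.
    std::shared_ptr<GpuBuffer> convert(const CpuBuffer& cpu)
    {
        const std::size_t key = std::hash<std::string>{}(cpu.bytes);
        auto range = m_cache.equal_range(key);
        for (auto it = range.first; it != range.second; ++it)
            if (it->second->uploaded == cpu.bytes) // guard against hash collisions
                return it->second;
        auto gpu = std::make_shared<GpuBuffer>(GpuBuffer{cpu.bytes});
        m_cache.emplace(key, gpu);
        return gpu;
    }

    // Number of distinct GPU objects created so far.
    std::size_t uniqueCount() const { return m_cache.size(); }

private:
    std::unordered_multimap<std::size_t, std::shared_ptr<GpuBuffer>> m_cache;
};
```

Converting two separate `CpuBuffer` objects that hold the same bytes yields the same `shared_ptr`, which a pointer comparison on the CPU objects would not detect.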

From 50b821e6ecc3d1480436c5f39a6d344037e979ee Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 18:24:18 +0400
Subject: [PATCH 06/21] property pools

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9ef1c905d8..0f17326e49 100644
--- a/README.md
+++ b/README.md
@@ -192,9 +192,10 @@ The Asset Converter transforms CPU objects (asset::IAsset) into GPU objects (vid
 ### 🧩 Unit-Tested BxDFs for Physically Based Rendering
 A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
 
+### 🔧 In Progress: GPU Entity component system  
+Property Pools group related properties together in a Structure Of Arrays (SoA) manner, allowing efficient, cache-friendly access to data on the GPU. The system enables transferring properties between the CPU and GPU, with the `PropertyPoolHandler` managing scattered updates with a special compute shader. Handles are assigned for each object and remain constant as data is added or removed.
 
 [TODO]:
-- In Progress: GPU ECS (Property Pools)
 - SPIR-V Introspection and Layout creation
 - Extensions (ImGUI, FFT, Workgroup Prefix Sum, Blur, Counting Sort In Progress: Autoexposure, Tonemap, - GPU MPMC Queue, OptiX Interop, Global Scan)
 - Coming Soon: Scene Loaders, GPU Driven Scene Graph, Material Compiler v2 for efficient scheduling of - BxDF graph evaluation

From 9bda55abd774198556a4735127f76b6d8dafa030 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 18:33:11 +0400
Subject: [PATCH 07/21] small improvements

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 0f17326e49..d1f27b3cd1 100644
--- a/README.md
+++ b/README.md
@@ -165,7 +165,7 @@ Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
 
 ### 🧩 Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
 
-By utilizing Buffer Device Addresses (BDAs), Nabla allows more efficient direct access to GPU memory; synergized with Descriptor Indexing, it improves flexibility by enabling more dynamic, scalable resource binding without relying on traditional descriptor sets.
+By utilizing Buffer Device Addresses (BDAs), Nabla enables more direct access to memory through 64-bit GPU virtual addresses. Synergized with Descriptor Indexing, this approach enhances flexibility by enabling more dynamic, scalable resource binding without relying on traditional descriptor sets.
 
 ### 🧩 Minimally Invasive Design
 [TODO]: vulkan handle acquisition, multiple windows, content playing second fiddle
@@ -192,8 +192,8 @@ The Asset Converter transforms CPU objects (asset::IAsset) into GPU objects (vid
 ### 🧩 Unit-Tested BxDFs for Physically Based Rendering
 A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
 
-### 🔧 In Progress: GPU Entity component system  
-Property Pools group related properties together in a Structure Of Arrays (SoA) manner, allowing efficient, cache-friendly access to data on the GPU. The system enables transferring properties between the CPU and GPU, with the `PropertyPoolHandler` managing scattered updates with a special compute shader. Handles are assigned for each object and remain constant as data is added or removed.
+### 🔧 In Progress: Property Pools (GPU Entity Component System)
+*Property Pools* group related properties together in a Structure Of Arrays (SoA) manner, allowing efficient, cache-friendly access to data on the GPU. The system enables transferring properties (Components) between the CPU and GPU, with the `PropertyPoolHandler` managing scattered updates with a special compute shader. Handles are assigned for each object and remain constant as data is added or removed.
 
 [TODO]:
 - SPIR-V Introspection and Layout creation
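
Review note on the "Property Pools" paragraph in this patch: a CPU-only sketch of the two properties the text claims, namely Structure-of-Arrays storage and handles that stay constant while objects are added or removed. `TransformPool` and its members are hypothetical and are not the `PropertyPoolHandler` API; the GPU transfer and compute-shader scatter are left out.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side illustration; not Nabla's property pool API.
class TransformPool
{
public:
    using Handle = std::uint32_t;

    // Reuses a freed slot when one exists, otherwise grows every property array.
    Handle allocate(float position, float scale)
    {
        Handle h;
        if (!m_free.empty())
        {
            h = m_free.back();
            m_free.pop_back();
        }
        else
        {
            h = static_cast<Handle>(m_positions.size());
            m_positions.push_back(0.f);
            m_scales.push_back(0.f);
        }
        m_positions[h] = position;
        m_scales[h] = scale;
        return h;
    }

    // Frees a slot without moving any other element, so other handles stay valid.
    void release(Handle h) { m_free.push_back(h); }

    float position(Handle h) const { return m_positions[h]; }
    float scale(Handle h) const { return m_scales[h]; }
    std::size_t capacity() const { return m_positions.size(); }

private:
    std::vector<float> m_positions; // one contiguous array per property (SoA)
    std::vector<float> m_scales;
    std::vector<Handle> m_free;     // released slots available for reuse
};
```

Because `release` never compacts the arrays, a handle obtained earlier keeps addressing the same slot, at the cost of holes that are only filled by later allocations.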

From eaa99e2c583cac78af864eeaf5069e0b8f1d8dff Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 18:43:00 +0400
Subject: [PATCH 08/21] small edits

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index d1f27b3cd1..8330be5035 100644
--- a/README.md
+++ b/README.md
@@ -98,11 +98,11 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida,
 
 ### 🧩 The Nabla Core Profile
 
-Nabla exposes [a well-defined, curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support. (TODO: on which platforms?)
+Nabla exposes [a curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support. (TODO: on which platforms?)
 
 ### 🧩 Physical Device Selection and Filteration
 
-Nabla allows you to select the best GPU for your workload.
+Nabla allows you to select the best GPU for your compute or graphics workload.
 
 ```c++
 nbl::video::SPhysicalDeviceFilter deviceFilter = {};
@@ -159,7 +159,7 @@ Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
 
 - 🛠️ Real-World Problem Solvers: The library offers GPU-optimized solutions for tasks like Prefix Sum, Binary Search, FFT, Global Sort, and even emulated `shaderFloat64` when native GPU support is unavailable!
 
-```cpp
+```
 [TODO][CODE] Code for each or just one showcasing most of the above points?
 ```
 

From 1f4b7c0c4065aecbf3a92f1368ea72038a8f19c0 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sat, 8 Feb 2025 19:03:57 +0400
Subject: [PATCH 09/21] first version final

---
 README.md | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 8330be5035..457936f63b 100644
--- a/README.md
+++ b/README.md
@@ -173,8 +173,6 @@ By utilizing Buffer Device Addresses (BDAs), Nabla enables more direct access to
 ### 🧩 Designed for Interoperation
 Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.
 
-🚀 Coming soon: Full CUDA Interop support for enhanced cross-platform compatibility.
-
 ----
 
 [TODO]:
@@ -195,11 +193,29 @@ A statically polymorphic library for defining Bidirectional Scattering Distribut
 ### 🔧 In Progress: Property Pools (GPU Entity Component System)
 *Property Pools* group related properties together in a Structure Of Arrays (SoA) manner, allowing efficient, cache-friendly access to data on the GPU. The system enables transferring properties (Components) between the CPU and GPU, with the `PropertyPoolHandler` managing scattered updates with a special compute shader. Handles are assigned for each object and remain constant as data is added or removed.
 
-[TODO]:
-- SPIR-V Introspection and Layout creation
-- Extensions (ImGUI, FFT, Workgroup Prefix Sum, Blur, Counting Sort In Progress: Autoexposure, Tonemap, - GPU MPMC Queue, OptiX Interop, Global Scan)
-- Coming Soon: Scene Loaders, GPU Driven Scene Graph, Material Compiler v2 for efficient scheduling of - BxDF graph evaluation
-### [TODO?] IUtiltities?/ Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion?
+### 🧩 SPIR-V Introspection and Layout Creation
+
+SPIR-V introspection in Nabla eliminates most of the boilerplate code required to set up descriptor and pipeline layouts, simplifying resource binding to shaders.
+
+### 🧩 Nabla Extensions
+- ImGui integration.
+- Fast Fourier Transform for image processing and all kinds of frequency-domain fun.
+- Workgroup Prefix Sum – Efficient parallel prefix sum computation.
+- Blur – Optimized GPU-based image blurring.
+- Counting Sort – High-performance, GPU-accelerated sorting algorithm.
+- Autoexposure [Work in Progress] – Adaptive brightness adjustment for HDR rendering.
+- Tonemapping
+- GPU MPMC Queue – Multi-producer, multi-consumer GPU queue.
+- OptiX interoperability for ray tracing.
+- Global Scan – High-speed parallel scanning across large datasets.
+
+### 🚀 Coming Soon [TODO: Explain some better]
+- Full CUDA interoperability support.
+- Scene Loaders
+- GPU-Driven Scene Graph
+- Material Compiler 2.0 for efficient scheduling of BxDF graph evaluation
+
+### [TODO?] IUtilities? Using fixed-size staging memory for easier CPU-GPU transfers? Format promotion?
 
 # FAQ
 

From f10f7ec0a45a0eb870165e737482fe089a762479 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sun, 9 Feb 2025 08:08:55 +0400
Subject: [PATCH 10/21] more edits

---
 README.md | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 457936f63b..97dffdadfc 100644
--- a/README.md
+++ b/README.md
@@ -98,28 +98,34 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida,
 
 ### 🧩 The Nabla Core Profile
 
-Nabla exposes [a curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support. (TODO: on which platforms?)
+Nabla exposes [a curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support on Windows and Linux (macOS, iOS, and Android are coming soon).
+
+Vulkan evolves fast—just when you think you've figured out [sync](), you realize there's [sync2](). Keeping up with new extensions, best practices, and hardware quirks is exhausting.
+Instead of digging through [gpuinfo.org](https://gpuinfo.org) or [Vulkan specs](), Nabla gives you a well-thought-out set of extensions, so you can focus on what you want to achieve instead of getting lost in the details.
 
 ### 🧩 Physical Device Selection and Filteration
 
 Nabla allows you to select the best GPU for your compute or graphics workload.
 
 ```c++
-nbl::video::SPhysicalDeviceFilter deviceFilter = {};
-deviceFilter.minApiVersion = { 1,3,0 };
-deviceFilter.minConformanceVersion = {1,3,0,0};
-deviceFilter.requiredFeatures.rayQuery = true;
-// TODO: add something else here
-deviceFilter(physicalDevices);
+void filterDevices(core::set<video::IPhysicalDevice*>& physicalDevices)
+{
+  nbl::video::SPhysicalDeviceFilter deviceFilter = {};
+  deviceFilter.minApiVersion = { 1,3,0 };
+  deviceFilter.minConformanceVersion = {1,3,0,0};
+  deviceFilter.requiredFeatures.rayQuery = true;
+  deviceFilter(physicalDevices);
+}
 ```
 
 ### 🧩 SPIR-V and Vulkan as First-Class Citizens
 
-Nabla treats SPIR-V and Vulkan as core components, this ensures full control over [TODO]
+Nabla treats **SPIR-V** and **Vulkan** as the preferred, reference standard—everything else is built around them, with all other backends adapting to them.
 
-### Integration of Renderdoc
+### 🧩 Integration of Renderdoc
 
-Built-in support for capturing frames and debugging with Renderdoc.
+Built-in support for capturing frames and debugging with [Renderdoc](https://renderdoc.org/).
+This lets you debug headless or async GPU workloads that are not directly involved in producing a swapchain frame for Renderdoc to capture.
 
 ```c++
 const IQueue::SSubmitInfo submitInfo = {
@@ -136,7 +142,8 @@ m_api->endCapture(); // End Renderdoc Capture
 
 Nabla Event Handler's extensive usage of [Timeline Semaphores]() enables CPU Callbacks on GPU conditions.
 
-You can enqueue callbacks that trigger upon specific GPU conditions, enabling tasks like resource deallocation to be handled only after the GPU has completed the relevant work.
+You can enqueue callbacks that trigger when a submission completes (the workload finishes), enabling, among other things, async readback of a submission's side effects or deallocating an allocation once the workload has finished.
+
 ```c++
 // This doesn't actually free the memory from the pool, the memory is queued up to be freed only after the `scratchSemaphore` reaches a value a future submit will signal
 memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());

From 7d0544b8eaed4d99347309253b51d2ed0771dc66 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sun, 9 Feb 2025 08:53:14 +0400
Subject: [PATCH 11/21] Nabla Extensions

---
 README.md | 51 ++++++++++++++++++++++++++-------------------------
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/README.md b/README.md
index 97dffdadfc..e8af2a62da 100644
--- a/README.md
+++ b/README.md
@@ -151,14 +151,13 @@ memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
 
 ### 🧩 GPU Object Lifecycle Tracking
 
-Nabla uses [smart reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
+Nabla uses [reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
 
 ### 🧩 HLSL2021 Standard Template Library
 
 - 🔄 Reusable: Unified single-source C++/HLSL libraries eliminate code duplication with reimplementation of STL's `type_traits`, `limits`, `functional`, `tgmath`, etc.
 
-- 🐞 Shader Logic, CPU-Tested: A subset of HLSL compiles as both C++ and SPIR-V, enabling CPU-side debugging of GPU logic, ensuring correctness in complex tasks like FFT, Prefix Sum, etc.
-Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
+- 🐞 Shader Logic, CPU-Tested: A subset of HLSL compiles as both C++ and SPIR-V, enabling CPU-side debugging of GPU logic, ensuring correctness in complex tasks like FFT, Prefix Sum, etc. (See our examples: [1. BxDF Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/d7f7a87fa08a56a16cd1bcc7d4d9fd48fc8c278c/66_HLSLBxDFTests/app_resources/tests.hlsl#L436), [2. Math Funcs Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/fd92730f0f5c8a120782c928309cb10e776c25db/22_CppCompat/main.cpp#L407))
 
 - 🔮 Future-Proof: C++20 concepts in HLSL enable safe and documented polymorphism.
 
@@ -166,16 +165,19 @@ Future Proof: C++20 Concepts in HLSL for safe and documented Static Polymorphism
 
 - 🛠️ Real-World Problem Solvers: The library offers GPU-optimized solutions for tasks like Prefix Sum, Binary Search, FFT, Global Sort, and even emulated `shaderFloat64` when native GPU support is unavailable!
 
-```
-[TODO][CODE] Code for each or just one showcasing most of the above points?
-```
 
 ### 🧩 Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
 
 By utilizing Buffer Device Addresses (BDAs), Nabla enables more direct access to memory through 64-bit GPU virtual addresses. Synergized with Descriptor Indexing, this approach enhances flexibility by enabling more dynamic, scalable resource binding without relying on traditional descriptor sets.
 
 ### 🧩 Minimally Invasive Design
-[TODO]: vulkan handle acquisition, multiple windows, content playing second fiddle
+
+Nabla's minimally invasive and flexible design with api handle acquisitions and multi-window support make it ideal for custom rendering setups and low-level GPU programming without unnecessary constraints such as assuming a main thread or a single window.
+
+This allows simpler porting of legacy OpenGL and DirectX applications.
+
+[TDOO:Insert Image]
+
 
 ### 🧩 Designed for Interoperation
 Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.
@@ -185,14 +187,15 @@ Nabla is built with interoperation in mind, supporting memory export and import
 [TODO]:
 - Cancellable Future based Async I/O
 - Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
-
+- IUtiltities  Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion?
 ----
 
-### 🧩 Asset Manager
-Nabla’s Asset Manager efficiently loads assets while tracking dependencies using a Directed Acyclic Graph (DAG). Assets are loaded in the correct order, avoids redundant allocations, and simplifies resource management.
+### 🧩 Asset System
+The asset system in Nabla maintains a 1:1 mapping between CPU and GPU representations, where every CPU asset has a direct GPU counterpart.
+The system also allows for coordination between loaders—for instance, the OBJ loader can trigger the MTL loader, and the MTL loader in turn invokes image loaders, ensuring smooth asset dependency management.
 
 ### 🧩 Asset Converter (CPU to GPU)
-The Asset Converter transforms CPU objects (asset::IAsset) into GPU objects (video::IBackendObject) while eliminating duplicates. Instead of relying on pointer comparisons, it hashes asset contents to detect and reuse identical GPU objects.
+The Asset Converter transforms CPU objects (`asset::IAsset`) into GPU objects (`video::IBackendObject`) while eliminating duplicates with Merkle Trees. Instead of relying on pointer comparisons, it hashes asset contents to detect and reuse identical GPU objects.
 
 ### 🧩 Unit-Tested BxDFs for Physically Based Rendering
 A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
@@ -205,25 +208,23 @@ A statically polymorphic library for defining Bidirectional Scattering Distribut
 SPIR-V introspection in Nabla eliminates most of the boilerplate code required to set up descriptor and pipeline layouts, simplifying resource binding to shaders.
 
 ### 🧩 Nabla Extensions
-- ImGui integration.
-- Fast Fourier Transform for image processing and all kind of frequncy-domain fun.
-- Workgroup Prefix Sum – Efficient parallel prefix sum computation.
-- Blur – Optimized GPU-based image blurring.
-- Counting Sort – High-performance, GPU-accelerated sorting algorithm.
-- Autoexposure [Work in Progress] – Adaptive brightness adjustment for HDR rendering.
-- Tonemapping
-- GPU MPMC Queue – Multi-producer, multi-consumer GPU queue.
-- OptiX interoperability for ray tracing.
-- Global Scan – High-speed parallel scanning across large datasets.
-
-### 🚀 Coming Soon [TODO: Explain some better]
+- [ImGui integration](https://github.com/Devsh-Graphics-Programming/Nabla/tree/master/include/nbl/ext/ImGui) – `MultiDrawIndirect` based and draws in as little as a single drawcall.
+- [Fast Fourier Transform Extension](https://github.com/Devsh-Graphics-Programming/Nabla/tree/master/include/nbl/ext/FFT) – for image processing and all kinds of frequency-domain fun.
+- [Workgroup Prefix Sum](https://github.com/Devsh-Graphics-Programming/Nabla/tree/master/include/nbl/builtin/hlsl/workgroup) – Efficient parallel prefix sum computation.
+- [Blur](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/builtin/hlsl/prefix_sum_blur/blur.hlsl#L3) – Optimized GPU-based image blurring.
+- [Counting Sort](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/builtin/hlsl/sort/counting.hlsl) – High-performance, GPU-accelerated sorting algorithm.
+- [WIP] Autoexposure – Adaptive brightness adjustment for HDR rendering.
+- [WIP] Tonemapping
+- [WIP] GPU MPMC Queue – Multi-producer, multi-consumer GPU queue.
+- [WIP] OptiX interoperability for ray tracing.
+- [WIP] Global Scan – High-speed parallel scanning across large datasets.
+
+### 🚀 Coming Soon
 - Full CUDA interoperability support.
 - Scene Loaders
 - GPU-Driven Scene Graph
 - Material Compiler 2.0 for efficient scheduling of BxDF graph evaluation
 
-### [TODO?] IUtiltities?  Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion?
-
 # FAQ
 
 < FAQ >

From 6ee4b858caa64df1fd8b269a331d6f7437c629f8 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sun, 9 Feb 2025 09:23:41 +0400
Subject: [PATCH 12/21] more updates

---
 README.md | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e8af2a62da..e0eb3aabd2 100644
--- a/README.md
+++ b/README.md
@@ -165,6 +165,9 @@ Nabla uses [reference counting]() to track the lifecycle of GPU objects. Descrip
 
 - 🛠️ Real-World Problem Solvers: The library offers GPU-optimized solutions for tasks like Prefix Sum, Binary Search, FFT, Global Sort, and even emulated `shaderFloat64` when native GPU support is unavailable!
 
+🎤 Talks from us:
+ - [Vulkanised 2024: Beyond SPIR-V: Single Source C++ and Shader Programming](https://www.youtube.com/watch?v=JCJ35dlZJb4)
+ - [Vulkanised 2023: HLSL202x like its C++, building an `std::` like Library]()
 
 ### 🧩 Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
 
@@ -187,7 +190,7 @@ Nabla is built with interoperation in mind, supporting memory export and import
 [TODO]:
 - Cancellable Future based Async I/O
 - Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
-- IUtiltities  Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion?
+- IUtiltities  Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion? SIntendedSubmitInfo
 ----
 
 ### 🧩 Asset System
@@ -200,6 +203,25 @@ The Asset Converter transforms CPU objects (`asset::IAsset`) into GPU objects (`
 ### 🧩 Unit-Tested BxDFs for Physically Based Rendering
 A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
 
+Part of our [BxDF Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/d7f7a87fa08a56a16cd1bcc7d4d9fd48fc8c278c/66_HLSLBxDFTests/main.cpp#L93):
+
+```cpp
+TestJacobian<bxdf::reflection::SLambertianBxDF<sample_t, iso_interaction, aniso_interaction, spectral_t>>::run(initparams, cb);
+TestJacobian<bxdf::reflection::SOrenNayarBxDF<sample_t, iso_interaction, aniso_interaction, spectral_t>>::run(initparams, cb);
+TestJacobian<bxdf::reflection::SBeckmannBxDF<sample_t, iso_cache, aniso_cache, spectral_t>, false>::run(initparams, cb);
+TestJacobian<bxdf::reflection::SBeckmannBxDF<sample_t, iso_cache, aniso_cache, spectral_t>, true>::run(initparams, cb);
+TestJacobian<bxdf::reflection::SGGXBxDF<sample_t, iso_cache, aniso_cache, spectral_t>, false>::run(initparams, cb);
+TestJacobian<bxdf::reflection::SGGXBxDF<sample_t, iso_cache, aniso_cache, spectral_t>,true>::run(initparams, cb);
+
+TestJacobian<bxdf::transmission::SLambertianBxDF<sample_t, iso_interaction, aniso_interaction, spectral_t>>::run(initparams, cb);
+TestJacobian<bxdf::transmission::SSmoothDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t>>::run(initparams, cb);
+TestJacobian<bxdf::transmission::SSmoothDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t, true>>::run(initparams, cb);
+TestJacobian<bxdf::transmission::SBeckmannDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t>, false>::run(initparams, cb);
+TestJacobian<bxdf::transmission::SBeckmannDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t>, true>::run(initparams, cb);
+TestJacobian<bxdf::transmission::SGGXDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t>, false>::run(initparams, cb);
+TestJacobian<bxdf::transmission::SGGXDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t>,true>::run(initparams, cb);
+```
+
 ### 🔧 In Progress: Property Pools (GPU Entity Component System)
 *Property Pools* group related properties together in a Structure Of Arrays (SoA) manner, allowing efficient, cache-friendly access to data on the GPU. The system enables transferring properties (Components) between the CPU and GPU, with the `PropertyPoolHandler` managing scattered updates with a special compute shader. Handles are assigned for each object and remain constant as data is added or removed.
 

From 78b0d335982e8713a9b65d9c026e318f46401566 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Sun, 9 Feb 2025 09:43:04 +0400
Subject: [PATCH 13/21] Data Transfer Utilities section

---
 README.md | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index e0eb3aabd2..7589ffaf37 100644
--- a/README.md
+++ b/README.md
@@ -140,7 +140,7 @@ m_api->endCapture(); // End Renderdoc Capture
 
 ### 🧩 Nabla Event Handler: Seamless GPU-CPU Synchronization
 
-Nabla Event Handler's extensive usage of [Timeline Semaphores]() enables CPU Callbacks on GPU conditions.
+Nabla Event Handler's extensive usage of [Timeline Semaphores](https://www.khronos.org/blog/vulkan-timeline-semaphores) enables CPU Callbacks on GPU conditions.
 
 You can enqueue callbacks that trigger upon submission completion (workload finish), enabling amongst others, async readback of submission side effects, or deallocating an allocation after a workload is finished.
 
@@ -179,19 +179,35 @@ Nabla's minimally invasive and flexible design with api handle acquisitions and
 
 This allows simpler porting of legacy OpenGL and DirectX applications.
 
-[TDOO:Insert Image]
+[TDOO:Insert Diff Image]
 
 
 ### 🧩 Designed for Interoperation
 Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.
 
-----
+### 🧩 TODO: Cancellable Future based Async I/O
+```
+somewhere here you can add "No Singletons, No Main Thread"
+
+Basically you can have as many instances of every object as you please (VK device), there's no assumption of a main thread or threadwise contexts.
+
+Not thread safe, but thread agnostic, we avoid global state, we pass contexts around explicitly to allow for easy multithreading (e.g. no mutable state in factory classes).
+
+Can also mentioned that we managed to wrap Win32 windowing in a way that lets you use it from multiple threads.
+```
+
+
+### 🧩 Data Transfer Utilities
+Nabla's [Utilities](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/include/nbl/video/utilities/IUtilities.h) streamlines the process of pushing/pulling arbitrary-sized buffers and images with fixed staging memory to/from the GPU, ensuring seamless data transfers.
+ The system automatically handles submission when buffer memory overflows, while [promoting unsupported formats](https://github.com/Devsh-Graphics-Programming/Nabla/tree/dac9855ab4a98d764130e41a69abdc605a91092c/include/nbl/asset/format) during upload to handle color format conversions.
+By leveraging device-specific properties, the system respects alignment limits and ensures deterministic behavior. The user only provides initial submission info through [SIntendedSubmitInfo](), and the utility manages subsequent submissions automatically.
+
+ - Learn more:
+   - 🎤 Our Talk at Vulkanised: [Vulkanised 2023: Keeping your staging buffer fixed size! ](https://www.youtube.com/watch?v=x8v656d3pc4)
+   - 📚 Our Blog post: [Uploading Textures to GPU - The Good Way](https://erfan-ahmadi.github.io/blog/Nabla/imageupload)
+
 
-[TODO]:
-- Cancellable Future based Async I/O
-- Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
-- IUtiltities  Using Fixed-sized staging memory for easier cpu-gpu transfers? format promotion? SIntendedSubmitInfo
-----
+### 🧩 TODO: Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
 
 ### 🧩 Asset System
 The asset system in Nabla maintains a 1:1 mapping between CPU and GPU representations, where every CPU asset has a direct GPU counterpart.

From 6b450fbfd59c3ee57aca4b63dd853ea8edbc81e4 Mon Sep 17 00:00:00 2001
From: Arkadiusz Lachowicz <34793522+AnastaZIuk@users.noreply.github.com>
Date: Sun, 9 Feb 2025 10:58:30 +0100
Subject: [PATCH 14/21] Update README.md, add GDI diff render with details
 section

---
 README.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 7589ffaf37..e8f9703414 100644
--- a/README.md
+++ b/README.md
@@ -179,8 +179,12 @@ Nabla's minimally invasive and flexible design with api handle acquisitions and
 
 This allows simpler porting of legacy OpenGL and DirectX applications.
 
-[TDOO:Insert Diff Image]
-
+<p align="center">
+<details open>
+<summary>Comparison of GDI renders</summary>
+  <img src="https://github.com/user-attachments/assets/d6331212-a9fd-4ab5-9745-783ccd014c1d" alt="GDI diff" style="width:100%; height:auto; vertical-align:top; align:top;">
+</details>
+</p>
 
 ### 🧩 Designed for Interoperation
 Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.

From 4bff4f54e7262ed41456db4f995cf210e1496a29 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Mon, 10 Feb 2025 15:27:24 +0400
Subject: [PATCH 15/21] remaining TODOs

---
 README.md | 74 +++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 47 insertions(+), 27 deletions(-)

diff --git a/README.md b/README.md
index e8f9703414..7405cefff4 100644
--- a/README.md
+++ b/README.md
@@ -96,14 +96,19 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida,
 
 # Features
 
-### 🧩 The Nabla Core Profile
+### 🧩 **The Nabla Core Profile**
 
 Nabla exposes [a curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support on Windows, Linux, (coming soon MacOS, iOS as well as Android)
 
 Vulkan evolves fast—just when you think you've figured out [sync](), you realize there's [sync2](). Keeping up with new extensions, best practices, and hardware quirks is exhausting.
-Instead of digging through [gpuinfo.org](gpuinfo.org) or [Vulkan specs](), Nabla gives you a well-thought-out set of extensions—so you can focus on what you want to achieve, not get lost in extreme details.
+Instead of digging through [gpuinfo.org](gpuinfo.org) or [Vulkan specs](), Nabla gives you a well-thought-out set of extensions—so you can focus on what you want to achieve, not get stuck in an eternal loop of:
+  - mastering a feature
+  - finding out about a new feature
+  - assessing whether it obsoletes or just adds to the one you've just mastered
+  - working out whether the feature is ubiquitous on the devices you target
+  - rewriting what you've just polished
 
-### 🧩 Physical Device Selection and Filteration
+### 🧩 **Physical Device Selection and Filtering**
 
 Nabla allows you to select the best GPU for your compute or graphics workload.
 
@@ -118,11 +123,11 @@ void filterDevices(core::set<video::IPhysicalDevice*>& physicalDevices)
 }
 ```
 
-### 🧩 SPIR-V and Vulkan as First-Class Citizens
+### 🧩 **SPIR-V and Vulkan as First-Class Citizens**
 
 Nabla treats **SPIR-V** and **Vulkan** as the preferred, reference standard—everything else is built around them, with all other backends adapting to them.
 
-### 🧩 Integration of Renderdoc
+### 🧩 **Integration of Renderdoc**
 
 Built-in support for capturing frames and debugging with [Renderdoc](https://renderdoc.org/).
  This is how one debugs headless or async GPU workloads that are not directly involved in producing a swapchain frame to be captured by Renderdoc.
@@ -138,7 +143,7 @@ queue->submit({&submitInfo,1});
 m_api->endCapture(); // End Renderdoc Capture
 ```
 
-### 🧩 Nabla Event Handler: Seamless GPU-CPU Synchronization
+### 🧩 **Nabla Event Handler: Seamless GPU-CPU Synchronization**
 
 Nabla Event Handler's extensive usage of [Timeline Semaphores](https://www.khronos.org/blog/vulkan-timeline-semaphores) enables CPU Callbacks on GPU conditions.
 
@@ -149,11 +154,11 @@ You can enqueue callbacks that trigger upon submission completion (workload fini
 memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
 ```
 
-### 🧩 GPU Object Lifecycle Tracking
+### 🧩 **GPU Object Lifecycle Tracking**
 
 Nabla uses [reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
 
-### 🧩 HLSL2021 Standard Template Library
+### 🧩 **HLSL2021 Standard Template Library**
 
 - 🔄 Reusable: Unified single-source C++/HLSL libraries eliminate code duplication with reimplementation of STL's `type_traits`, `limits`, `functional`, `tgmath`, etc.
 
@@ -169,14 +174,18 @@ Nabla uses [reference counting]() to track the lifecycle of GPU objects. Descrip
  - [Vulkanised 2024: Beyond SPIR-V: Single Source C++ and Shader Programming](https://www.youtube.com/watch?v=JCJ35dlZJb4)
  - [Vulkanised 2023: HLSL202x like its C++, building an `std::` like Library]()
 
-### 🧩 Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()
+### 🧩 **Full Embrace of [Buffer Device Address]() and [Descriptor Indexing]()**
 
 By utilizing Buffer Device Addresses (BDAs), Nabla enables more direct access to memory through 64-bit GPU virtual addresses. Synergized with Descriptor Indexing, this approach enhances flexibility by enabling more dynamic, scalable resource binding without relying on traditional descriptor sets.
 
-### 🧩 Minimally Invasive Design
+### 🧩 **Minimally Invasive Design**
+
+No Singletons, No Main Thread—Nabla allows multiple instances of every object (including Vulkan devices) without assuming a main thread or thread-local contexts. Thread-agnostic by design, it avoids global state and explicitly passes contexts for easy multithreading.
 
 Nabla's minimally invasive and flexible design with api handle acquisitions and multi-window support make it ideal for custom rendering setups and low-level GPU programming without unnecessary constraints such as assuming a main thread or a single window.
 
+Even Win32 windowing is wrapped for use across multiple threads, breaking traditional single-thread limitations.
+
 This allows simpler porting of legacy OpenGL and DirectX applications.
 
 <p align="center">
@@ -186,22 +195,22 @@ This allows simpler porting of legacy OpenGL and DirectX applications.
 </details>
 </p>
 
-### 🧩 Designed for Interoperation
+### 🧩 **Designed for Interoperation**
 Nabla is built with interoperation in mind, supporting memory export and import between different compute and graphics APIs.
 
-### 🧩 TODO: Cancellable Future based Async I/O
-```
-somewhere here you can add "No Singletons, No Main Thread"
+### 🧩 **Cancellable Future based Async I/O**
 
-Basically you can have as many instances of every object as you please (VK device), there's no assumption of a main thread or threadwise contexts.
+File I/O is fully asynchronous, using `nbl::system::future_t`, a cancellable MPSC circular buffer-based future implementation.
 
-Not thread safe, but thread agnostic, we avoid global state, we pass contexts around explicitly to allow for easy multithreading (e.g. no mutable state in factory classes).
+Requests start in a PENDING state and can be invalidated before execution if needed. This enables efficient async file reads and GPU memory writes, ensuring non-blocking execution:
 
-Can also mentioned that we managed to wrap Win32 windowing in a way that lets you use it from multiple threads.
+```cpp
+ISystem::future_t<size_t> bytesActuallyWritten;
+file->read(bytesActuallyWritten, gpuMemory->getMappedPointer(), offsetInFile, 2ull*1024*1024*1024); // `2ull` keeps the 2 GiB size out of 32-bit int overflow
+while (!bytesActuallyWritten.ready()) { /* Do other work */ }
 ```
 
-
-### 🧩 Data Transfer Utilities
+### 🧩 **Data Transfer Utilities**
 Nabla's [Utilities](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/include/nbl/video/utilities/IUtilities.h) streamlines the process of pushing/pulling arbitrary-sized buffers and images with fixed staging memory to/from the GPU, ensuring seamless data transfers.
  The system automatically handles submission when buffer memory overflows, while [promoting unsupported formats](https://github.com/Devsh-Graphics-Programming/Nabla/tree/dac9855ab4a98d764130e41a69abdc605a91092c/include/nbl/asset/format) during upload to handle color format conversions.
 By leveraging device-specific properties, the system respects alignment limits and ensures deterministic behavior. The user only provides initial submission info through [SIntendedSubmitInfo](), and the utility manages subsequent submissions automatically.
@@ -211,16 +220,27 @@ By leveraging device-specific properties, the system respects alignment limits a
    - 📚 Our Blog post: [Uploading Textures to GPU - The Good Way](https://erfan-ahmadi.github.io/blog/Nabla/imageupload)
 
 
-### 🧩 TODO: Virtual File System (archive mounting, our alternative to #embed, everything is referenced by absolute - path)
+### 🧩 **Virtual File System**  
+
+Nabla provides a [**unified Virtual File System**]() (`system::ISystem`) that supports **mounting archives and folders** under different virtual paths. This enables access to both external and embedded assets while preserving **original relative paths**.  
+
+For embedding, we provide an alternative to the `#embed` directive (standardized in C23 and adopted for C++26), which embeds files directly into compiled binaries. Instead of relying on compiler support, we use **Python + CMake** to generate what we call **built-in resource archives**, packing files (e.g., images, shaders, `.obj`, `.mtl`, `.dds`) into DLLs as **memory-mapped `system::IFile` objects**, so that dependent assets (e.g., models and their textures) **retain their correct relative paths** even when embedded.  
+
+The embedding process:  
+1. **At build time**, Python reads an input path table (generated by CMake).  
+2. It serializes files into **constexpr arrays** with metadata (key + timestamps).  
+3. The output **C++ source + header** define a **built-in resource library**, linked into Nabla or examples.  
+
+This approach keeps assets self-contained, making file access efficient while maintaining asset dependencies.
 
-### 🧩 Asset System
+### 🧩 **Asset System**
 The asset system in Nabla maintains a 1:1 mapping between CPU and GPU representations, where every CPU asset has a direct GPU counterpart.
 The system also allows for coordination between loaders—for instance, the OBJ loader can trigger the MTL loader, and the MTL loader in turn invokes image loaders, ensuring smooth asset dependency management.
 
-### 🧩 Asset Converter (CPU to GPU)
+### 🧩 **Asset Converter (CPU to GPU)**
 The Asset Converter transforms CPU objects (`asset::IAsset`) into GPU objects (`video::IBackendObject`) while eliminating duplicates with Merkle Trees. Instead of relying on pointer comparisons, it hashes asset contents to detect and reuse identical GPU objects.
 
-### 🧩 Unit-Tested BxDFs for Physically Based Rendering
+### 🧩 **Unit-Tested BxDFs for Physically Based Rendering**
 A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
 
 Part of our [BxDF Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/d7f7a87fa08a56a16cd1bcc7d4d9fd48fc8c278c/66_HLSLBxDFTests/main.cpp#L93):
@@ -242,14 +262,14 @@ TestJacobian<bxdf::transmission::SGGXDielectricBxDF<sample_t, iso_cache, aniso_c
 TestJacobian<bxdf::transmission::SGGXDielectricBxDF<sample_t, iso_cache, aniso_cache, spectral_t>,true>::run(initparams, cb);
 ```
 
-### 🔧 In Progress: Property Pools (GPU Entity Component System)
+### 🔧 **In Progress: Property Pools (GPU Entity Component System)**
 *Property Pools* group related properties together in a Structure Of Arrays (SoA) manner, allowing efficient, cache-friendly access to data on the GPU. The system enables transferring properties (Components) between the CPU and GPU, with the `PropertyPoolHandler` managing scattered updates with a special compute shader. Handles are assigned for each object and remain constant as data is added or removed.
 
-### 🧩 SPIR-V Introspection and Layout Creation
+### 🧩 **SPIR-V Introspection and Layout Creation**
 
 SPIR-V introspection in Nabla eliminates most of the boilerplate code required to set up descriptor and pipeline layouts, simplifying resource binding to shaders.
 
-### 🧩 Nabla Extensions
+### 🧩 **Nabla Extensions**
 - [ImGui integration](https://github.com/Devsh-Graphics-Programming/Nabla/tree/master/include/nbl/ext/ImGui) – `MultiDrawIndirect` based and draws in as little as a single drawcall.
 - [Fast Fourier Transform Extension](https://github.com/Devsh-Graphics-Programming/Nabla/tree/master/include/nbl/ext/FFT) – for image processing and all kinds of frequency-domain fun.
 - [Workgroup Prefix Sum](https://github.com/Devsh-Graphics-Programming/Nabla/tree/master/include/nbl/builtin/hlsl/workgroup) – Efficient parallel prefix sum computation.
@@ -261,7 +281,7 @@ SPIR-V introspection in Nabla eliminates most of the boilerplate code required t
 - [WIP] OptiX interoperability for ray tracing.
 - [WIP] Global Scan – High-speed parallel scanning across large datasets.
 
-### 🚀 Coming Soon
+### 🚀 **Coming Soon**
 - Full CUDA interoperability support.
 - Scene Loaders
 - GPU-Driven Scene Graph

From 68b52cf91057f780ba683d50e4486fb7c8588e4f Mon Sep 17 00:00:00 2001
From: Erfan <ahmadierfan99@gmail.com>
Date: Mon, 10 Feb 2025 15:06:25 +0330
Subject: [PATCH 16/21] Update README.md

---
 README.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 7405cefff4..4aea4a5f63 100644
--- a/README.md
+++ b/README.md
@@ -189,10 +189,11 @@ Even Win32 windowing is wrapped for use across multiple threads, breaking tradit
 This allows simpler porting of legacy OpenGL and DirectX applications.
 
 <p align="center">
-<details open>
-<summary>Comparison of GDI renders</summary>
-  <img src="https://github.com/user-attachments/assets/d6331212-a9fd-4ab5-9745-783ccd014c1d" alt="GDI diff" style="width:100%; height:auto; vertical-align:top; align:top;">
-</details>
+  <div style="display: flex; justify-content: center; gap: 10px;">
+    <img src="https://github.com/user-attachments/assets/1add9cbd-fabc-4e97-b4a1-373ccefa3d8a" alt="GDI 1" style="width: 30%; height: auto;">
+    <img src="https://github.com/user-attachments/assets/97efeb67-d78c-4010-a0a2-198958b3deeb" alt="GDI 2" style="width: 30%; height: auto;">
+    <img src="https://github.com/user-attachments/assets/82009094-81e5-4146-8f1a-5bac7e13f722" alt="GDI 3" style="width: 30%; height: auto;">
+  </div>
 </p>
 
 ### 🧩 **Designed for Interoperation**

From e353e2504dae6f98d9cd8a9638148365c26d0fff Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Mon, 10 Feb 2025 15:45:02 +0400
Subject: [PATCH 17/21] permalinks and small edits

---
 README.md | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 4aea4a5f63..87d05dc92a 100644
--- a/README.md
+++ b/README.md
@@ -100,8 +100,8 @@ Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean eu odio gravida,
 
 Nabla exposes [a curated set of Vulkan extensions and features](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/src/nbl/video/vulkan/profiles/NablaCore.json) compatible across the GPUs we aim to support on Windows, Linux, (coming soon MacOS, iOS as well as Android)
 
-Vulkan evolves fast—just when you think you've figured out [sync](), you realize there's [sync2](). Keeping up with new extensions, best practices, and hardware quirks is exhausting.
-Instead of digging through [gpuinfo.org](gpuinfo.org) or [Vulkan specs](), Nabla gives you a well-thought-out set of extensions—so you can focus on what you want to achieve, not get stuck in an eternal loop of:
+Vulkan evolves fast—just when you think you've figured out [sync](https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchronization-Examples-(Legacy-synchronization-APIs)), you realize there's [sync2](https://registry.khronos.org/vulkan/specs/latest/man/html/VK_KHR_synchronization2.html). Keeping up with new extensions, best practices, and hardware quirks is exhausting.
+Instead of digging through [gpuinfo.org](https://gpuinfo.org) or [Vulkan specs](https://registry.khronos.org/vulkan/specs/latest/html/vkspec.html), Nabla gives you a well-thought-out set of extensions, so you can focus on what you want to achieve, not get stuck in an eternal loop of:
   - mastering a feature
   - finding out about a new feature
   - assesing whether obsoletes or just adds the one you've just mastered
@@ -156,7 +156,7 @@ memory_pool->deallocate(&offset,&size,nextSubmit.getFutureScratchSemaphore());
 
 ### 🧩 **GPU Object Lifecycle Tracking**
 
-Nabla uses [reference counting]() to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
+Nabla uses [reference counting](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/core/decl/smart_refctd_ptr.h#L22) to track the lifecycle of GPU objects. Descriptor sets and command buffers are responsible for maintaining reference counts on the resources (e.g., buffers, textures) they use. The queue itself also tracks command buffers, ensuring that objects remain alive as long as they are pending execution. This system guarantees the correct order of deletion and makes it difficult for GPU objects to go out of scope and be destroyed before the GPU has finished using them.
 
 ### 🧩 **HLSL2021 Standard Template Library**
 
@@ -164,7 +164,7 @@ Nabla uses [reference counting]() to track the lifecycle of GPU objects. Descrip
 
 - 🐞 Shader Logic, CPU-Tested: A subset of HLSL compiles as both C++ and SPIR-V, enabling CPU-side debugging of GPU logic, ensuring correctness in complex tasks like FFT, Prefix Sum, etc. (See our examples: [1. BxDF Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/d7f7a87fa08a56a16cd1bcc7d4d9fd48fc8c278c/66_HLSLBxDFTests/app_resources/tests.hlsl#L436), [2. Math Funcs Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/fd92730f0f5c8a120782c928309cb10e776c25db/22_CppCompat/main.cpp#L407))
 
-- 🔮 Future-Proof: C++20 concepts in HLSL enable safe and documented polymorphism.
+- 🔮 Future-Proof: C++20 [concepts](https://en.cppreference.com/w/cpp/language/constraints) in HLSL enable safe and documented polymorphism.
 
 - 🧠 Insane: Boost Preprocessor and Template Metaprogramming in HLSL!
 
@@ -201,9 +201,9 @@ Nabla is built with interoperation in mind, supporting memory export and import
 
 ### 🧩 **Cancellable Future based Async I/O**
 
-File I/O is fully asynchronous, using nbl::system::future_t, a cancellable MPSC circular buffer-based future implementation.
+File I/O is fully asynchronous, using [nbl::system::future_t](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/system/ISystem.h#L26), a cancellable MPSC circular buffer-based future implementation.
 
-Requests start in a PENDING state and can be invalidated before execution if needed. This enables efficient async file reads and GPU memory writes, ensuring non-blocking execution:
+Requests start in a **PENDING** state and can be invalidated before execution if needed. This enables efficient async file reads and GPU memory writes, ensuring non-blocking execution:
 
 ```cpp
 ISystem::future_t<size_t> bytesActuallyWritten;
@@ -214,7 +214,7 @@ while (!bytesActuallyWritten.ready()) { /* Do other work */ }
 ### 🧩 **Data Transfer Utilities**
 Nabla's [Utilities](https://github.com/Devsh-Graphics-Programming/Nabla/blob/master/include/nbl/video/utilities/IUtilities.h) streamlines the process of pushing/pulling arbitrary-sized buffers and images with fixed staging memory to/from the GPU, ensuring seamless data transfers.
  The system automatically handles submission when buffer memory overflows, while [promoting unsupported formats](https://github.com/Devsh-Graphics-Programming/Nabla/tree/dac9855ab4a98d764130e41a69abdc605a91092c/include/nbl/asset/format) during upload to handle color format conversions.
-By leveraging device-specific properties, the system respects alignment limits and ensures deterministic behavior. The user only provides initial submission info through [SIntendedSubmitInfo](), and the utility manages subsequent submissions automatically.
+By leveraging device-specific properties, the system respects alignment limits and ensures deterministic behavior. The user only provides initial submission info through [SIntendedSubmitInfo](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/video/utilities/SIntendedSubmitInfo.h#L18), and the utility manages subsequent submissions automatically.
 
  - Learn more:
    - 🎤 Our Talk at Vulkanised: [Vulkanised 2023: Keeping your staging buffer fixed size! ](https://www.youtube.com/watch?v=x8v656d3pc4)
@@ -223,9 +223,9 @@ By leveraging device-specific properties, the system respects alignment limits a
 
 ### 🧩 **Virtual File System**  
 
-Nabla provides a [**unified Virtual File System**]() (`system::ISystem`) that supports **mounting archives and folders** under different virtual paths. This enables access to both external and embedded assets while preserving **original relative paths**.  
+Nabla provides a **unified Virtual File System** ([system::ISystem](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/system/ISystem.h#L19)) that supports **mounting archives and folders** under different virtual paths. This enables access to both external and embedded assets while preserving **original relative paths**.  
 
-For embedding, we provide an alternative to C++23's #embed, which allows embedding files directly into compiled binaries. Instead of relying on compiler support, we use **Python + CMake** to generate what we call **built-in resource archives**—packing files (e.g., images, shaders, `.obj`, `.mtl`, `.dds`) into DLLs as **memory-mapped `system::IFile` objects** ensuring that dependent assets (e.g., models and their textures) **retain their correct relative paths** even when embedded.  
+For embedding, we provide an alternative to C++23's #embed, which allows embedding files directly into compiled binaries. Instead of relying on compiler support, we use **Python + CMake** to generate what we call **built-in resource archives**—packing files (e.g., images, shaders, `.obj`, `.mtl`, `.dds`) into DLLs as **memory-mapped [system::IFile](https://github.com/Devsh-Graphics-Programming/Nabla/blob/ff07cd71c4e21bc51fa416ccd151b2e92efea028/include/nbl/system/IFile.h#L9) objects** ensuring that dependent assets (e.g., models and their textures) **retain their correct relative paths** even when embedded.  
 
 The embedding process:  
 1. **At build time**, Python reads an input path table (generated by CMake).  
@@ -244,7 +244,7 @@ The Asset Converter transforms CPU objects (`asset::IAsset`) into GPU objects (`
 ### 🧩 **Unit-Tested BxDFs for Physically Based Rendering**
 A statically polymorphic library for defining Bidirectional Scattering Distribution Functions (BxDFs) in HLSL and C++. Each BxDF is rigorously unit-tested in C++ as well as HLSL. This is part of Nabla’s HLSL-C++ compatible library.
 
-Part of our [BxDF Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/d7f7a87fa08a56a16cd1bcc7d4d9fd48fc8c278c/66_HLSLBxDFTests/main.cpp#L93):
+Snippet of our [BxDF Unit Test](https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/blob/d7f7a87fa08a56a16cd1bcc7d4d9fd48fc8c278c/66_HLSLBxDFTests/main.cpp#L93):
 
 ```cpp
 TestJacobian<bxdf::reflection::SLambertianBxDF<sample_t, iso_interaction, aniso_interaction, spectral_t>>::run(initparams, cb);

From 30b78e4642cc9f56df0001a13dcf02c8cc6a8b57 Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Mon, 10 Feb 2025 16:09:28 +0400
Subject: [PATCH 18/21] Need Our Expertise?

---
 README.md | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 87d05dc92a..074e14f17b 100644
--- a/README.md
+++ b/README.md
@@ -288,14 +288,31 @@ SPIR-V introspection in Nabla eliminates most of the boilerplate code required t
 - GPU-Driven Scene Graph
 - Material Compiler 2.0 for efficient scheduling of BxDF graph evaluation
 
-# FAQ
+# Need Our Expertise?
 
-< FAQ >
+We specialize in:
+- High-performance computing and performance optimization
+- Path Tracing and Physically Based Rendering
+- CAD Rendering
+- Audio Programming and Digital Signal Processing
+- Porting and Optimizing legacy Renderers
+- Graphics and Compute APIs:
+  - Vulkan, D3D12, CUDA, OpenCL, WebGPU, D3D11, OpenGL
 
-# Get expert
+Whether you're optimizing your **renderer** or **compute workloads**, looking to **port your legacy renderer**, or integrating complex **visual effects** into your product, our team can help you. As a specialized team, we're constantly learning, evolving, and discussing matters with each other. [Each member](#join-our-team) brings unique insights to the table, ensuring we approach every project from multiple angles to achieve the best possible solution.
 
-< TODO >
+Our primary language is **C++20**, but we also work with **C#**, **Java**, **Python**, and other related technologies.
+
+If you're already here reading this, we want to hear from you and learn more about what you're building.
+
+**Contact us** at **newclients@devsh.eu**.
+
+The members of **Devsh Graphics Programming Sp. z O.O.** (Company Registration (KRS) #: 0000764661) are available (individually or collectively) for contracts on projects of various scopes and timescales.
+
+---
+
+Let me know if you need any further modifications!
 
 # Join our team
 
-< TODO >
+[TODO]: also link to achievements, personal blogs, websites, linkedin and presentations of each member

From 61ef2af2118e9c9bf2dfee10bf31f3697f0faf3f Mon Sep 17 00:00:00 2001
From: Erfan Ahmadi <ahmadierfan99@gmail.com>
Date: Mon, 10 Feb 2025 16:16:55 +0400
Subject: [PATCH 19/21] testing emoji. may remove later

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 074e14f17b..68a829eefe 100644
--- a/README.md
+++ b/README.md
@@ -288,7 +288,7 @@ SPIR-V introspection in Nabla eliminates most of the boilerplate code required t
 - GPU-Driven Scene Graph
 - Material Compiler 2.0 for efficient scheduling of BxDF graph evaluation
 
-# Need Our Expertise?
+# 🤝 Need Our Expertise?
 
 We specialize in:
 - High-performance computing and performance optimization

From 062d3532a03506f97b9e34e78ae8677025c8c401 Mon Sep 17 00:00:00 2001
From: Erfan <ahmadierfan99@gmail.com>
Date: Tue, 11 Feb 2025 09:32:08 +0330
Subject: [PATCH 20/21] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 68a829eefe..a0de0d639e 100644
--- a/README.md
+++ b/README.md
@@ -184,7 +184,7 @@ No Singletons, No Main Thread—Nabla allows multiple instances of every object
 
 Nabla's minimally invasive and flexible design with api handle acquisitions and multi-window support make it ideal for custom rendering setups and low-level GPU programming without unnecessary constraints such as assuming a main thread or a single window.
 
-Even Win32 windowing is wrapped for use across multiple threads, breaking traditional single-thread limitations.
+Even Win32 windowing is wrapped for use across multiple threads, breaking free of traditional single-thread limitations.
 
 This allows simpler porting of legacy OpenGL and DirectX applications.
 

From b8d75d8649db78e643c190773298387943a41eca Mon Sep 17 00:00:00 2001
From: Erfan <ahmadierfan99@gmail.com>
Date: Tue, 11 Feb 2025 09:32:28 +0330
Subject: [PATCH 21/21] Update README.md

---
 README.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/README.md b/README.md
index a0de0d639e..842fb35da1 100644
--- a/README.md
+++ b/README.md
@@ -311,8 +311,6 @@ The members of **Devsh Graphics Programming Sp. z O.O.** (Company Registration (
 
 ---
 
-Let me know if you need any further modifications!
-
 # Join our team
 
 [TODO]: also link to achievements, personal blogs, websites, linkedin and presentations of each member