Commit 57fa558

Documentation: Added "Recommended usage patterns" chapter.
1 parent 20622c6 commit 57fa558

8 files changed

+375
-114
lines changed

docs/html/index.html

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -108,6 +108,11 @@ <h1><a class="anchor" id="main_table_of_contents"></a>
108108
</li>
109109
</ul>
110110
</li>
111+
<li><a class="el" href="usage_patterns.html">Recommended usage patterns</a><ul>
112+
<li><a class="el" href="usage_patterns.html#usage_patterns_simple">Simple patterns</a></li>
113+
<li><a class="el" href="usage_patterns.html#usage_patterns_advanced">Advanced patterns</a></li>
114+
</ul>
115+
</li>
111116
<li><a class="el" href="configuration.html">Configuration</a><ul>
112117
<li><a class="el" href="configuration.html#config_Vulkan_functions">Pointers to Vulkan functions</a></li>
113118
<li><a class="el" href="configuration.html#custom_memory_allocator">Custom host memory allocator</a></li>

docs/html/search/all_b.js

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
var searchData=
22
[
3-
['requiredflags',['requiredFlags',['../struct_vma_allocation_create_info.html#a9166390303ff42d783305bc31c2b6b90',1,'VmaAllocationCreateInfo']]]
3+
['requiredflags',['requiredFlags',['../struct_vma_allocation_create_info.html#a9166390303ff42d783305bc31c2b6b90',1,'VmaAllocationCreateInfo']]],
4+
['recommended_20usage_20patterns',['Recommended usage patterns',['../usage_patterns.html',1,'index']]]
45
];

docs/html/search/pages_7.js

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
11
var searchData=
22
[
3-
['statistics',['Statistics',['../statistics.html',1,'index']]]
3+
['recommended_20usage_20patterns',['Recommended usage patterns',['../usage_patterns.html',1,'index']]]
44
];

docs/html/search/pages_8.js

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,4 @@
11
var searchData=
22
[
3-
['vulkan_20memory_20allocator',['Vulkan Memory Allocator',['../index.html',1,'']]],
4-
['vk_5fkhr_5fdedicated_5fallocation',['VK_KHR_dedicated_allocation',['../vk_khr_dedicated_allocation.html',1,'index']]]
3+
['statistics',['Statistics',['../statistics.html',1,'index']]]
54
];

docs/html/search/searchdata.js

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ var indexSectionsWithContent =
99
6: "v",
1010
7: "v",
1111
8: "v",
12-
9: "acdglmqsv"
12+
9: "acdglmqrsv"
1313
};
1414

1515
var indexSectionNames =

docs/html/usage_patterns.html

Lines changed: 126 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,126 @@
1+
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
2+
<html xmlns="http://www.w3.org/1999/xhtml">
3+
<head>
4+
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
5+
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
6+
<meta name="generator" content="Doxygen 1.8.13"/>
7+
<meta name="viewport" content="width=device-width, initial-scale=1"/>
8+
<title>Vulkan Memory Allocator: Recommended usage patterns</title>
9+
<link href="tabs.css" rel="stylesheet" type="text/css"/>
10+
<script type="text/javascript" src="jquery.js"></script>
11+
<script type="text/javascript" src="dynsections.js"></script>
12+
<link href="search/search.css" rel="stylesheet" type="text/css"/>
13+
<script type="text/javascript" src="search/searchdata.js"></script>
14+
<script type="text/javascript" src="search/search.js"></script>
15+
<link href="doxygen.css" rel="stylesheet" type="text/css" />
16+
</head>
17+
<body>
18+
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
19+
<div id="titlearea">
20+
<table cellspacing="0" cellpadding="0">
21+
<tbody>
22+
<tr style="height: 56px;">
23+
<td id="projectalign" style="padding-left: 0.5em;">
24+
<div id="projectname">Vulkan Memory Allocator
25+
</div>
26+
</td>
27+
</tr>
28+
</tbody>
29+
</table>
30+
</div>
31+
<!-- end header part -->
32+
<!-- Generated by Doxygen 1.8.13 -->
33+
<script type="text/javascript">
34+
var searchBox = new SearchBox("searchBox", "search",false,'Search');
35+
</script>
36+
<script type="text/javascript" src="menudata.js"></script>
37+
<script type="text/javascript" src="menu.js"></script>
38+
<script type="text/javascript">
39+
$(function() {
40+
initMenu('',true,false,'search.php','Search');
41+
$(document).ready(function() { init_search(); });
42+
});
43+
</script>
44+
<div id="main-nav"></div>
45+
<!-- window showing the filter options -->
46+
<div id="MSearchSelectWindow"
47+
onmouseover="return searchBox.OnSearchSelectShow()"
48+
onmouseout="return searchBox.OnSearchSelectHide()"
49+
onkeydown="return searchBox.OnSearchSelectKey(event)">
50+
</div>
51+
52+
<!-- iframe showing the search results (closed by default) -->
53+
<div id="MSearchResultsWindow">
54+
<iframe src="javascript:void(0)" frameborder="0"
55+
name="MSearchResults" id="MSearchResults">
56+
</iframe>
57+
</div>
58+
59+
<div id="nav-path" class="navpath">
60+
<ul>
61+
<li class="navelem"><a class="el" href="index.html">Vulkan Memory Allocator</a></li> </ul>
62+
</div>
63+
</div><!-- top -->
64+
<div class="header">
65+
<div class="headertitle">
66+
<div class="title">Recommended usage patterns </div> </div>
67+
</div><!--header-->
68+
<div class="contents">
69+
<div class="textblock"><h1><a class="anchor" id="usage_patterns_simple"></a>
70+
Simple patterns</h1>
71+
<h2><a class="anchor" id="usage_patterns_simple_render_targets"></a>
72+
Render targets</h2>
73+
<p><b>When:</b> Any resources that you frequently write and read on GPU, e.g. images used as color attachments (aka "render targets"), depth-stencil attachments, images/buffers used as storage image/buffer (aka "Unordered Access View (UAV)").</p>
74+
<p><b>What to do:</b> Create them in video memory that is fastest to access from GPU using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305ccac6b5dc1432d88647aa4cd456246eadf7">VMA_MEMORY_USAGE_GPU_ONLY</a>.</p>
75+
<p>Consider using <a class="el" href="vk_khr_dedicated_allocation.html">VK_KHR_dedicated_allocation</a> extension and/or manually creating them as dedicated allocations using <a class="el" href="vk__mem__alloc_8h.html#ad9889c10c798b040d59c92f257cae597a3fc311d855c2ff53f1090ef5c722b38f" title="Set this flag if the allocation should have its own memory block. ">VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT</a>, especially if they are large or if you plan to destroy and recreate them e.g. when display resolution changes. Prefer to create such resources first and all other GPU resources (like textures and vertex buffers) later.</p>
76+
<h2><a class="anchor" id="usage_patterns_simple_immutable_resources"></a>
77+
Immutable resources</h2>
78+
<p><b>When:</b> Any resources that you fill on CPU only once (aka "immutable") or infrequently and then read frequently on GPU, e.g. textures, vertex and index buffers, constant buffers that don't change often.</p>
79+
<p><b>What to do:</b> Create them in video memory that is fastest to access from GPU using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305ccac6b5dc1432d88647aa4cd456246eadf7">VMA_MEMORY_USAGE_GPU_ONLY</a>.</p>
80+
<p>To initialize content of such resource, create a CPU-side (aka "staging") copy of it in system memory - <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca40bdf4cddeffeb12f43d45ca1286e0a5">VMA_MEMORY_USAGE_CPU_ONLY</a>, map it, fill it, and submit a transfer from it to the GPU resource. You can keep the staging copy if you need it for another upload transfer in the future. If you don't, you can destroy it or reuse this buffer for uploading different resource after the transfer finishes.</p>
81+
<p>Prefer to create just buffers in system memory rather than images, even for uploading textures. Use <code>vkCmdCopyBufferToImage()</code>. Don't use images with <code>VK_IMAGE_TILING_LINEAR</code>.</p>
82+
<h2><a class="anchor" id="usage_patterns_dynamic_resources"></a>
83+
Dynamic resources</h2>
84+
<p><b>When:</b> Any resources that change frequently (aka "dynamic"), e.g. every frame or every draw call, written on CPU, read on GPU.</p>
85+
<p><b>What to do:</b> Create them using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca9066b52c5a7079bb74a69aaf8b92ff67">VMA_MEMORY_USAGE_CPU_TO_GPU</a>. You can map it and write to it directly on CPU, as well as read from it on GPU.</p>
86+
<p>This is a more complex situation. Different solutions are possible, and the best one depends on the specific GPU type, but you can start with this simple approach. Prefer to write to such a resource sequentially (e.g. using <code>memcpy</code>). Don't perform random access or any reads from it, as it may be very slow.</p>
87+
<h2><a class="anchor" id="usage_patterns_readback"></a>
88+
Readback</h2>
89+
<p><b>When:</b> Resources that contain data written by GPU that you want to read back on CPU, e.g. results of some computations.</p>
90+
<p><b>What to do:</b> Create them using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca7b586d2fdaf82a463b58f581ed72be27">VMA_MEMORY_USAGE_GPU_TO_CPU</a>. You can write to them directly on GPU, as well as map and read them on CPU.</p>
91+
<h1><a class="anchor" id="usage_patterns_advanced"></a>
92+
Advanced patterns</h1>
93+
<h2><a class="anchor" id="usage_patterns_integrated_graphics"></a>
94+
Detecting integrated graphics</h2>
95+
<p>You can support integrated graphics (like Intel HD Graphics, AMD APU) better by detecting it in Vulkan. To do it, call <code>vkGetPhysicalDeviceProperties()</code>, inspect <code>VkPhysicalDeviceProperties::deviceType</code> and look for <code>VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU</code>. When you find it, you can assume that memory is unified and all memory types are equally fast to access from GPU, regardless of <code>VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT</code>.</p>
96+
<p>You can then sum up sizes of all available memory heaps and treat them as useful for your GPU resources, instead of only <code>DEVICE_LOCAL</code> ones. You can also prefer to create your resources in memory types that are <code>HOST_VISIBLE</code> to map them directly instead of submitting explicit transfer (see below).</p>
97+
<h2><a class="anchor" id="usage_patterns_direct_vs_transfer"></a>
98+
Direct access versus transfer</h2>
99+
<p>For resources that you frequently write on CPU and read on GPU, many solutions are possible:</p>
100+
<ol type="1">
101+
<li>Create one copy in video memory using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305ccac6b5dc1432d88647aa4cd456246eadf7">VMA_MEMORY_USAGE_GPU_ONLY</a>, second copy in system memory using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca40bdf4cddeffeb12f43d45ca1286e0a5">VMA_MEMORY_USAGE_CPU_ONLY</a> and submit explicit transfer each time.</li>
102+
<li>Create just single copy using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca9066b52c5a7079bb74a69aaf8b92ff67">VMA_MEMORY_USAGE_CPU_TO_GPU</a>, map it and fill it on CPU, read it directly on GPU.</li>
103+
<li>Create just single copy using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca40bdf4cddeffeb12f43d45ca1286e0a5">VMA_MEMORY_USAGE_CPU_ONLY</a>, map it and fill it on CPU, read it directly on GPU.</li>
104+
</ol>
105+
<p>Which solution is the most efficient depends on your resource and especially on the GPU. It is best to measure it and then make the decision. Some general recommendations:</p>
106+
<ul>
107+
<li>On integrated graphics use (2) or (3) to avoid unnecessary time and memory overhead related to using a second copy.</li>
108+
<li>For small resources (e.g. constant buffers) use (2). Discrete AMD cards have special 256 MiB pool of video memory that is directly mappable. Even if the resource ends up in system memory, its data may be cached on GPU after first fetch over PCIe bus.</li>
109+
<li>For larger resources (e.g. textures), decide between (1) and (2). You may want to differentiate NVIDIA and AMD, e.g. by looking for memory type that is both <code>DEVICE_LOCAL</code> and <code>HOST_VISIBLE</code>. When you find it, use (2), otherwise use (1).</li>
110+
</ul>
111+
<p>Similarly, for resources that you frequently write on GPU and read on CPU, multiple solutions are possible:</p>
112+
<ol type="1">
113+
<li>Create one copy in video memory using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305ccac6b5dc1432d88647aa4cd456246eadf7">VMA_MEMORY_USAGE_GPU_ONLY</a>, second copy in system memory using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca7b586d2fdaf82a463b58f581ed72be27">VMA_MEMORY_USAGE_GPU_TO_CPU</a> and submit explicit transfer each time.</li>
114+
<li>Create just single copy using <a class="el" href="vk__mem__alloc_8h.html#aa5846affa1e9da3800e3e78fae2305cca7b586d2fdaf82a463b58f581ed72be27">VMA_MEMORY_USAGE_GPU_TO_CPU</a>, write to it directly on GPU, map it and read it on CPU.</li>
115+
</ol>
116+
<p>You should take some measurements to decide which option is faster in case of your specific resource.</p>
117+
<p>If you don't want to specialize your code for specific types of GPUs, you can still make a simple optimization: when your resource ends up in mappable memory, use it directly instead of creating a CPU-side staging copy. For details see <a class="el" href="memory_mapping.html#memory_mapping_finding_if_memory_mappable">Finding out if memory is mappable</a>. </p>
118+
</div></div><!-- contents -->
119+
<!-- start footer part -->
120+
<hr class="footer"/><address class="footer"><small>
121+
Generated by &#160;<a href="http://www.doxygen.org/index.html">
122+
<img class="footer" src="doxygen.png" alt="doxygen"/>
123+
</a> 1.8.13
124+
</small></address>
125+
</body>
126+
</html>

docs/html/vk__mem__alloc_8h_source.html

Lines changed: 109 additions & 109 deletions
Large diffs are not rendered by default.

src/vk_mem_alloc.h

Lines changed: 130 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,9 @@ Documentation of all members: vk_mem_alloc.h
6363
- \subpage allocation_annotation
6464
- [Allocation user data](@ref allocation_user_data)
6565
- [Allocation names](@ref allocation_names)
66+
- \subpage usage_patterns
67+
- [Simple patterns](@ref usage_patterns_simple)
68+
- [Advanced patterns](@ref usage_patterns_advanced)
6669
- \subpage configuration
6770
- [Pointers to Vulkan functions](@ref config_Vulkan_functions)
6871
- [Custom host memory allocator](@ref custom_memory_allocator)
@@ -803,6 +806,133 @@ printf("Image name: %s\n", imageName);
803806
804807
That string is also printed in JSON report created by vmaBuildStatsString().
805808
809+
810+
\page usage_patterns Recommended usage patterns
811+
812+
\section usage_patterns_simple Simple patterns
813+
814+
\subsection usage_patterns_simple_render_targets Render targets
815+
816+
<b>When:</b>
817+
Any resources that you frequently write and read on GPU,
818+
e.g. images used as color attachments (aka "render targets"), depth-stencil attachments,
819+
images/buffers used as storage image/buffer (aka "Unordered Access View (UAV)").
820+
821+
<b>What to do:</b>
822+
Create them in video memory that is fastest to access from GPU using
823+
#VMA_MEMORY_USAGE_GPU_ONLY.
824+
825+
Consider using [VK_KHR_dedicated_allocation](@ref vk_khr_dedicated_allocation) extension
826+
and/or manually creating them as dedicated allocations using #VMA_ALLOCATION_CREATE_DEDICATED_MEMORY_BIT,
827+
especially if they are large or if you plan to destroy and recreate them e.g. when
828+
display resolution changes.
829+
Prefer to create such resources first and all other GPU resources (like textures and vertex buffers) later.
830+
831+
\subsection usage_patterns_simple_immutable_resources Immutable resources
832+
833+
<b>When:</b>
834+
Any resources that you fill on CPU only once (aka "immutable") or infrequently
835+
and then read frequently on GPU,
836+
e.g. textures, vertex and index buffers, constant buffers that don't change often.
837+
838+
<b>What to do:</b>
839+
Create them in video memory that is fastest to access from GPU using
840+
#VMA_MEMORY_USAGE_GPU_ONLY.
841+
842+
To initialize content of such resource, create a CPU-side (aka "staging") copy of it
843+
in system memory - #VMA_MEMORY_USAGE_CPU_ONLY, map it, fill it,
844+
and submit a transfer from it to the GPU resource.
845+
You can keep the staging copy if you need it for another upload transfer in the future.
846+
If you don't, you can destroy it or reuse this buffer for uploading different resource
847+
after the transfer finishes.
848+
849+
Prefer to create just buffers in system memory rather than images, even for uploading textures.
850+
Use `vkCmdCopyBufferToImage()`.
851+
Don't use images with `VK_IMAGE_TILING_LINEAR`.
852+
853+
\subsection usage_patterns_dynamic_resources Dynamic resources
854+
855+
<b>When:</b>
856+
Any resources that change frequently (aka "dynamic"), e.g. every frame or every draw call,
857+
written on CPU, read on GPU.
858+
859+
<b>What to do:</b>
860+
Create them using #VMA_MEMORY_USAGE_CPU_TO_GPU.
861+
You can map it and write to it directly on CPU, as well as read from it on GPU.
862+
863+
This is a more complex situation. Different solutions are possible,
864+
and the best one depends on the specific GPU type, but you can start with this simple approach.
865+
Prefer to write to such resource sequentially (e.g. using `memcpy`).
866+
Don't perform random access or any reads from it, as it may be very slow.
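
The sequential-write advice above can be sketched as follows. This is a minimal illustration, not part of the library: `FrameConstants` and `writeFrameConstants` are hypothetical names, and `mappedData` stands for the pointer you would obtain by mapping the allocation (for example with `vmaMapMemory()`). The data is assembled in ordinary CPU memory first and then copied in a single forward pass, so the mapped memory is never read and never written out of order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical per-frame data; any trivially copyable struct works the same way.
struct FrameConstants {
    float viewProj[16];
    float time;
    float padding[3];
};

// Copies `src` into mapped memory with one sequential memcpy.
// `mappedData` is assumed to point at the start of a mapped
// VMA_MEMORY_USAGE_CPU_TO_GPU allocation of at least `mappedSize` bytes.
void writeFrameConstants(void* mappedData, std::size_t mappedSize,
                         const FrameConstants& src)
{
    assert(mappedData != nullptr);
    assert(mappedSize >= sizeof(FrameConstants));
    // Write only, front to back. Do not read back through `mappedData`.
    std::memcpy(mappedData, &src, sizeof(FrameConstants));
}
```

Building the struct in a local variable and copying it once avoids read-modify-write access to the mapped pointer, which is the slow case the text warns about.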
867+
868+
\subsection usage_patterns_readback Readback
869+
870+
<b>When:</b>
871+
Resources that contain data written by GPU that you want to read back on CPU,
872+
e.g. results of some computations.
873+
874+
<b>What to do:</b>
875+
Create them using #VMA_MEMORY_USAGE_GPU_TO_CPU.
876+
You can write to them directly on GPU, as well as map and read them on CPU.
877+
878+
\section usage_patterns_advanced Advanced patterns
879+
880+
\subsection usage_patterns_integrated_graphics Detecting integrated graphics
881+
882+
You can support integrated graphics (like Intel HD Graphics, AMD APU) better
883+
by detecting it in Vulkan.
884+
To do it, call `vkGetPhysicalDeviceProperties()`, inspect
885+
`VkPhysicalDeviceProperties::deviceType` and look for `VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU`.
886+
When you find it, you can assume that memory is unified and all memory types are equally fast
887+
to access from GPU, regardless of `VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT`.
888+
889+
You can then sum up sizes of all available memory heaps and treat them as useful for
890+
your GPU resources, instead of only `DEVICE_LOCAL` ones.
891+
You can also prefer to create your resources in memory types that are `HOST_VISIBLE` to map them
892+
directly instead of submitting explicit transfer (see below).
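
As a rough sketch of the heap-summing idea above: the code below uses small stand-in types so that it compiles without the Vulkan SDK. `DeviceType`, `MemoryHeap` and `usableGpuMemory` are hypothetical names introduced for illustration; in real code you would read `VkPhysicalDeviceProperties::deviceType` and the `VkMemoryHeap` array returned by `vkGetPhysicalDeviceMemoryProperties()`. The numeric values mirror the corresponding Vulkan constants.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins for Vulkan enums and structs (assumption: values match vulkan_core.h).
enum DeviceType : uint32_t {
    DEVICE_TYPE_INTEGRATED_GPU = 1, // VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    DEVICE_TYPE_DISCRETE_GPU   = 2, // VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
};
constexpr uint32_t HEAP_DEVICE_LOCAL_BIT = 0x1; // VK_MEMORY_HEAP_DEVICE_LOCAL_BIT

struct MemoryHeap {   // mirrors VkMemoryHeap
    uint64_t size;
    uint32_t flags;
};

// Returns the number of bytes to treat as useful for GPU resources.
// On integrated graphics every heap counts; otherwise only DEVICE_LOCAL heaps.
uint64_t usableGpuMemory(DeviceType type, const std::vector<MemoryHeap>& heaps)
{
    assert(heaps.size() <= 16); // VK_MAX_MEMORY_HEAPS
    const bool unified = (type == DEVICE_TYPE_INTEGRATED_GPU);
    uint64_t total = 0;
    for (const MemoryHeap& heap : heaps) {
        if (unified || (heap.flags & HEAP_DEVICE_LOCAL_BIT) != 0)
            total += heap.size;
    }
    return total;
}
```

Treating every integrated GPU as unified memory is the heuristic described in this section, not a guarantee of the Vulkan specification, so measure before relying on it.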
893+
894+
\subsection usage_patterns_direct_vs_transfer Direct access versus transfer
895+
896+
For resources that you frequently write on CPU and read on GPU, many solutions are possible:
897+
898+
-# Create one copy in video memory using #VMA_MEMORY_USAGE_GPU_ONLY,
899+
second copy in system memory using #VMA_MEMORY_USAGE_CPU_ONLY and submit explicit transfer each time.
900+
-# Create just single copy using #VMA_MEMORY_USAGE_CPU_TO_GPU, map it and fill it on CPU,
901+
read it directly on GPU.
902+
-# Create just single copy using #VMA_MEMORY_USAGE_CPU_ONLY, map it and fill it on CPU,
903+
read it directly on GPU.
904+
905+
Which solution is the most efficient depends on your resource and especially on the GPU.
906+
It is best to measure it and then make the decision.
907+
Some general recommendations:
908+
909+
- On integrated graphics use (2) or (3) to avoid unnecessary time and memory overhead
910+
related to using a second copy.
911+
- For small resources (e.g. constant buffers) use (2).
912+
Discrete AMD cards have special 256 MiB pool of video memory that is directly mappable.
913+
Even if the resource ends up in system memory, its data may be cached on GPU after first
914+
fetch over PCIe bus.
915+
- For larger resources (e.g. textures), decide between (1) and (2).
916+
You may want to differentiate NVIDIA and AMD, e.g. by looking for memory type that is
917+
both `DEVICE_LOCAL` and `HOST_VISIBLE`. When you find it, use (2), otherwise use (1).
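
The last recommendation for larger resources could look roughly like this. It is a sketch under stated assumptions: `UploadStrategy` and `chooseLargeResourceStrategy` are hypothetical names, and the input vector stands for the `propertyFlags` of each entry in `VkPhysicalDeviceMemoryProperties::memoryTypes`. The two flag constants carry the same numeric values as their Vulkan counterparts.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-ins for Vulkan flag bits (assumption: values match vulkan_core.h).
constexpr uint32_t MEMORY_PROPERTY_DEVICE_LOCAL_BIT = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr uint32_t MEMORY_PROPERTY_HOST_VISIBLE_BIT = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT

enum class UploadStrategy {
    StagingCopy,  // option (1): GPU_ONLY resource plus CPU_ONLY staging copy
    DirectMapped, // option (2): single CPU_TO_GPU copy, mapped and written on CPU
};

// Picks (2) if any memory type is both DEVICE_LOCAL and HOST_VISIBLE, else (1).
UploadStrategy chooseLargeResourceStrategy(const std::vector<uint32_t>& memoryTypeFlags)
{
    assert(memoryTypeFlags.size() <= 32); // VK_MAX_MEMORY_TYPES
    constexpr uint32_t wanted =
        MEMORY_PROPERTY_DEVICE_LOCAL_BIT | MEMORY_PROPERTY_HOST_VISIBLE_BIT;
    for (uint32_t flags : memoryTypeFlags) {
        if ((flags & wanted) == wanted)
            return UploadStrategy::DirectMapped;
    }
    return UploadStrategy::StagingCopy;
}
```

This only encodes the rule of thumb from the list above. It does not check how large the matching heap is, so a real application would probably also compare the resource size against that heap before choosing (2).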
918+
919+
Similarly, for resources that you frequently write on GPU and read on CPU, multiple
920+
solutions are possible:
921+
922+
-# Create one copy in video memory using #VMA_MEMORY_USAGE_GPU_ONLY,
923+
second copy in system memory using #VMA_MEMORY_USAGE_GPU_TO_CPU and submit explicit transfer each time.
924+
-# Create just single copy using #VMA_MEMORY_USAGE_GPU_TO_CPU, write to it directly on GPU,
925+
map it and read it on CPU.
926+
927+
You should take some measurements to decide which option is faster in case of your specific
928+
resource.
929+
930+
If you don't want to specialize your code for specific types of GPUs, you can still make
931+
a simple optimization: when your resource ends up in mappable memory, use it
932+
directly instead of creating a CPU-side staging copy.
933+
For details see [Finding out if memory is mappable](@ref memory_mapping_finding_if_memory_mappable).
934+
935+
806936
\page configuration Configuration
807937
808938
Please check "CONFIGURATION SECTION" in the code to find macros that you can define
