* Integrated clang-tidy and clang-format into cmake build options.
+* New entrypoint added: GpaGetDeviceGeneration. Binary backwards compatibility is maintained.
+* OpenGL on Linux: Fixed hardware detection on MESA drivers.
+* OpenGL: Fixed hardware detection accuracy.
+* DX11:
+  * Fixed Adrenalin driver version detection.
+  * Fixed setting the number of shader arrays based on client hardware.
+* Improvements made to the sample applications:
+  * Extensive counter validation in DX12.
+  * Sample apps can now confirm successful validation tests.
+  * Sample apps now support passing in a counter file to specify which counters to enable.
+  * Consolidated parameter parsing logic in sample apps.
+  * In Vulkan and DX12 samples, the return code now indicates the number of errors that were reported.
## System Requirements
* An AMD Radeon GPU or APU based on Graphics IP version 8 and newer.
@@ -80,7 +135,24 @@ The documentation is hosted publicly at: http://gpuperfapi.readthedocs.io/en/lat
This release exposes both "Derived" counters and "Raw Hardware" counters. Derived counters are counters that are computed using a set of raw hardware counters.
This version allows you to access the raw hardware counters by simply specifying a flag when calling GpaOpenContext.
+## New Pipeline-Based Counters
+It was discovered that the improvements introduced in the Vega, RDNA, and RDNA2 architectures were not being properly accounted for in GPUPerfAPI v3.9, which caused many known issues to be called out in that release. In certain cases, the driver and hardware are able to make optimizations by combining two shader stages together, which prevents GPUPerfAPI from identifying which instructions were executed for which shader type. As a result of these changes, GPUPerfAPI is no longer able to expose instruction counters for each API-level shader, specifically Vertex Shaders, Hull Shaders, Domain Shaders, and Geometry Shaders. Pixel Shaders and Compute Shaders remain unchanged. We are now exposing these instruction counters based on the type of shader pipeline being used. In pipelines that do not use tessellation, the instruction counts for both the Vertex and Geometry Shaders (if used) will be combined in the VertexGeometry group (i.e., counters with the "VsGs" prefix). In pipelines that use tessellation, the instruction counts for both the Vertex and Hull Shaders will be combined in the PreTessellation group (i.e., counters with the "PreTessellation" or "PreTess" prefix), and instruction counts for the Domain and Geometry Shaders (if used) will be combined in the PostTessellation group (i.e., counters with the "PostTessellation" or "PostTess" prefix). The table below may help to better understand the new mapping between the API-level shaders (across the top) and which prefixes to look for in the GPUPerfAPI counters.
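The grouping rules described in this paragraph can be sketched as a small lookup. This is only an illustration of the rules as stated in the text: the function name, the lowercase stage names, and the choice of the short "PreTess"/"PostTess" spellings are assumptions made for the example and are not part of the GPUPerfAPI interface.

```python
# Hypothetical helper, not a GPUPerfAPI function: it restates the grouping
# rules from the text. Stage names ("vertex", "hull", ...) are illustrative.
TESSELLATION_STAGES = {"hull", "domain"}


def instruction_counter_groups(stages):
    """Return {stage: counter prefix} for the vertex, hull, domain, and
    geometry stages present in a pipeline.

    Pixel and compute stages are ignored here because the text says their
    counters are unchanged.
    """
    present = set(stages)
    groups = {}
    if present & TESSELLATION_STAGES:
        # Tessellation pipeline: VS + HS share one group, DS + GS the other.
        for stage in ("vertex", "hull"):
            if stage in present:
                groups[stage] = "PreTess"
        for stage in ("domain", "geometry"):
            if stage in present:
                groups[stage] = "PostTess"
    else:
        # No tessellation: VS and GS (if used) are combined.
        for stage in ("vertex", "geometry"):
            if stage in present:
                groups[stage] = "VsGs"
    return groups
```

For example, a vertex/geometry/pixel pipeline yields the "VsGs" prefix for both the vertex and geometry stages, while adding hull and domain stages moves the geometry stage into the "PostTess" group.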
+### Counter Validation Errors in D3D12ColorCube Sample App
+Due to the extensive counter validation now being done in the D3D12ColorCube sample application, and some expected variation in nondeterministic counters across a wide range of systems, the sample app may report errors on some systems. Likewise, some counters are marked as known issues and we are investigating the underlying causes of the inconsistent results.
+
+Additionally, the following deterministic performance counter values may not be accurate for the D3D12ColorCube sample application:
+* CulledPrims, PSPixelsOut on Radeon RX 480 hardware.
+
### Ubuntu 20.04 LTS Vulkan ICD Issue
On Ubuntu 20.04 LTS, the Vulkan ICD may not be set to the AMD Vulkan ICD. In this case, it must be explicitly set to the AMD Vulkan ICD before using GPA. This can be done by setting the ```VK_ICD_FILENAMES``` environment variable to ```/etc/vulkan/icd.d/amd_icd64.json```.
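As a minimal sketch of the workaround above, the variable can be set only for the profiled process rather than system-wide. The helper names and the idea of launching the application from a Python script are assumptions for illustration; only the variable name and the JSON path come from the text.

```python
import os
import subprocess

# Path taken from the text above; verify that it exists on your system.
AMD_ICD_PATH = "/etc/vulkan/icd.d/amd_icd64.json"


def env_with_amd_icd(base_env=None):
    """Return a copy of the environment with VK_ICD_FILENAMES pointing at
    the AMD Vulkan ICD. The caller's environment is left untouched."""
    env = dict(os.environ if base_env is None else base_env)
    env["VK_ICD_FILENAMES"] = AMD_ICD_PATH
    return env


def run_with_amd_icd(command):
    """Launch `command` (a list of program arguments, hypothetical here)
    with the AMD ICD selected, and return its exit code."""
    return subprocess.run(command, env=env_with_amd_icd()).returncode
```

Setting the variable per process keeps other Vulkan applications on the system unaffected.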
@@ -96,23 +168,10 @@ By default this file is only modifiable by root, so the application being profil
* You may have to reboot the system for the change to take effect.
* Setting the GPU clock mode is not working correctly for <b>Radeon 5700 Series GPUs</b>, potentially leading to some inconsistencies in counter values from one run to the next.
-### DirectX11 Performance Counter Accuracy For Select Counters and GPUs
-The following performance counter values may not be accurate for DirectX 11 applications running on Radeon 5700 and 6000 Series GPUs:
-* VALUInstCount, SALUInstCount, VALUBusy, SALUBusy for all shader stages: These values should be representative of performance, but may not be 100% accurate.
-* Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.
-
### OpenCL Performance Counter Accuracy For Radeon 6000 Series GPUs
The following performance counter values may not be accurate for OpenCL applications running on Radeon 6000 Series GPUs:
* Wavefronts, VALUInsts, SALUInsts, SALUBusy, VALUUtilization: These values should be representative of performance, but may not be 100% accurate.
-### OpenGL Performance Counter Accuracy For Radeon 5700 Series GPUs
-The following performance counter values may not be accurate for OpenGL applications running on Radeon 5700 Series GPUs:
-* Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.
-
-### Variability in Deterministic Counters For Select GPUs
-Performance counters which should be deterministic are showing variability on Radeon 5700 and 6000 Series GPUs. The values should be useful for performance analysis, but may not be 100% correct.
-* e.g. VSVerticesIn, PrimitivesIn, PSPixelsOut, PreZSamplesPassing
-
### Profiling Bundles
Profiling of bundles in DirectX12 and Vulkan is not working properly. It is recommended to remove the GPA Samples that profile bundles from your application, or to move the profiled calls out of the bundle.