Skip to content

Conversation

asuessenbach
Copy link
Contributor

@asuessenbach asuessenbach commented Apr 3, 2024

Description

As described in #938, requesting and enabling extension features is currently incorrect. With this PR, this is done similar to how it's done for the simple PhysicalDeviceFeatures: one function to get the supported bits (get_extension_features) and one to actually set them into the structure chain (add_extension_features).
The perfect handling would look like this:

if (gpu.get_extension_features<VkExtensionFeaturesStruct>(VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT).featureBit)
{
	gpu.add_extension_features<VkExtensionFeaturesStruct>(VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT).featureBit = VK_TRUE;
}
else
{
	// some fallback or error handling!
}

To ease usage, [HPP]PhysicalDevice got two new template functions: request_optional_feature and request_required_feature, which log a message or throw an exception, respectively, if the requested feature is not supported.
Requesting a required feature would now look like

gpu.request_required_feature<VkExtensionFeaturesStruct>(VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT,
                                                        &VkExtensionFeaturesStruct::featureBit,
                                                        "VkExtensionFeaturesStruct",
                                                        "featureBit");

Where the last two arguments to that functions are just for error reporting and maybe could omitted.

To ease usage even more, some macros have been added, so that requesting a required feature would now look like

REQUEST_REQUIRED_FEATURE(gpu, VkExtensionFeaturesStruct, VK_STRUCTURE_TYPE_EXTENSION_FEATURES_STRUCT, featureBit);

Build tested on Win10 with VS2022. Run tested on Win10 with NVidia GPU.

Fixes #938

General Checklist:

Please ensure the following points are checked:

  • My code follows the coding style
  • I have reviewed file licenses
  • I have commented any added functions (in line with Doxygen)
  • I have commented any code that could be hard to understand
  • My changes do not add any new compiler warnings
  • My changes do not add any new validation layer errors or warnings
  • I have used existing framework/helper functions where possible
  • My changes do not add any regressions
  • I have tested every sample to ensure everything runs correctly
  • This PR describes the scope and expected impact of the changes I am making

Note: The Samples CI runs a number of checks including:

  • I have updated the header Copyright to reflect the current year (CI build will fail if Copyright is out of date)
  • My changes build on Windows, Linux, macOS and Android. Otherwise I have documented any exceptions

Sample Checklist

If your PR contains a new or modified sample, these further checks must be carried out in addition to the General Checklist:

  • I have tested the sample on at least one compliant Vulkan implementation
  • If the sample is vendor-specific, I have tagged it appropriately
  • I have stated on what implementation the sample has been tested so that others can test on different implementations and platforms
  • Any dependent assets have been merged and published in downstream modules
  • For new samples, I have added a paragraph with a summary to the appropriate chapter in the readme of the folder that the sample belongs to e.g. api samples readme
  • For new samples, I have added a tutorial README.md file to guide users through what they need to know to implement code using this feature. For example, see conditional_rendering
  • For new samples, I have added a link to the Antora navigation so that the sample will be listed at the Vulkan documentation site

@asuessenbach asuessenbach force-pushed the extension_features branch 5 times, most recently from 37787b5 to a6ce335 Compare April 3, 2024 16:05
@asuessenbach asuessenbach marked this pull request as ready for review April 4, 2024 06:20
@asuessenbach asuessenbach requested a review from a team April 4, 2024 06:20
Copy link
Contributor

@gary-sweet gary-sweet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can just assert in all these cases can we?
It seems to prevent batch mode from running as soon as it finds an unsupported case.

@asuessenbach
Copy link
Contributor Author

I don't think we can just assert in all these cases can we?

You're kind of right. As I stated in the Description above, that should be replaced by some meaningful error or fall-back handling.
What would you suggest to do instead until then?

@gary-sweet
Copy link
Contributor

What would you suggest to do instead until then?

Well throwing a std::runtime_error would be better, as it doesn't terminate batch mode, just the current sample I guess.

I'm also not really sure what was wrong with just having a single call to request_extension_features? I can't see why unrolling that code everywhere is better. Maybe I'm just missing the issue that's being solved here.

@asuessenbach
Copy link
Contributor Author

I'm also not really sure what was wrong with just having a single call to request_extension_features?

The problem, as described in #938, was, that currently you get the supported extension features bits, possibly set some of them to true, no matter if they are supported at all, and store that structure in the structure chain for device creation.
That is, there are two issues:

  • All supported flags are requested, no matter if you actually need/want the support of all those flags.
  • Potentially support of a not supported feature is requested.

With this PR, you just set the flags you're interested in and when they're supported at all.

@asuessenbach
Copy link
Contributor Author

Well throwing a std::runtime_error would be better, as it doesn't terminate batch mode, just the current sample I guess.

Sounds good. I can change that accordingly.

@asuessenbach asuessenbach force-pushed the extension_features branch 5 times, most recently from 309c259 to 5ed9f04 Compare April 10, 2024 14:48
@gary-sweet
Copy link
Contributor

I think you'll need to rebase this over #1031. A lot of the samples don't run without that, so testing this isn't really possible without it.

@SaschaWillems
Copy link
Collaborator

SaschaWillems commented Jun 17, 2024

Tbh. I'm kinda torn over the syntax that now needs to be used. Will that actually work with e.g. code completion/intellisense? Or will it fail at compile time when I e.g. request a member of a feature that isn't present or that I did misspell?

I'm referring to this:

REQUEST_REQUIRED_FEATURE(gpu,
VkPhysicalDeviceGraphicsPipelineLibraryFeaturesEXT,
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GRAPHICS_PIPELINE_LIBRARY_FEATURES_EXT,
graphicsPipelineLibrary);

The old syntax has you access that member (graphicsPipelineLibary) directly, which works fine with intellisense/code compleation and is easier to understand/follow:

auto &requested_vertex_input_features      = gpu.request_extension_features<VkPhysicalDeviceMeshShaderFeaturesEXT>(VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MESH_SHADER_FEATURES_EXT);
requested_vertex_input_features.meshShader = VK_TRUE;

(though I don't really like that syntax either)

I fear that this makes working on samples harder, hurts readability and foremost makes it harder to follow our samples.

Is there a different way to handle this that looks more similar to the old syntax?

@asuessenbach
Copy link
Contributor Author

asuessenbach commented Jun 18, 2024

Will that actually work with e.g. code completion/intellisense?

As it is with all macros, you probably don't have any support by code completion or intellisense when you use [HPP_]REQUEST_[OPTIONAL|REQUIRED]_FEATURE. Any error will fire up at compile time.
When you use the helper functions gpu.request_[optional|required]_feature or the actual worker functions gpu.[add|get]_extension_features, you of course have all the help available by code completion and intellisense.

The old syntax has you access that member (graphicsPipelineLibary) directly, which works fine with intellisense/code compleation and is easier to understand/follow:

But as described in #938 it's just wrong!

Is there a different way to handle this that looks more similar to the old syntax?

I don't know of any.

Actually, I provided three different levels of complexity to handle that correctly. If you have an idea for an additional approach, I'd be happy to add that as well.

@gpx1000 gpx1000 self-requested a review July 1, 2024 15:20
gary-sweet
gary-sweet previously approved these changes Jul 15, 2024
@SaschaWillems
Copy link
Collaborator

SaschaWillems commented Jul 15, 2024

As discussed on the call, here is the way I do it in my samples:

  • I declare the extension struct(s) in the sample
  • I setup the whole extension struct(s) in the sample constructor
  • I add the struct(s) to a pNext pointer of my device / instance

An example:

class VulkanExample : public VulkanExampleBase
{
    VkPhysicalDeviceTimelineSemaphoreFeaturesKHR enabledTimelineSemaphoreFeaturesKHR{};

    VulkanExample() : VulkanExampleBase()
    {
        ...

        enabledTimelineSemaphoreFeaturesKHR.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TIMELINE_SEMAPHORE_FEATURES_KHR;
        enabledTimelineSemaphoreFeaturesKHR.timelineSemaphore = VK_TRUE;

        pNextDevice = &enabledTimelineSemaphoreFeaturesKHR;
    }    

}

It might not be the best approach from a C++ language point of view (esp. how pNextDevice is implemented), but it makes it very easy for people to follow my samples and you can copy that 1:1 into your own application without having to follow any templates or macros.

It would probably be a more substantial change, so before we take this further we should agree on what way we want to follow.

@asuessenbach
Copy link
Contributor Author

@SaschaWillems I assume, that pNextDevice is the pNext of some VkPhysicalDeviceFeatures2, that you hold somewhere. And you're using a pointer to that VkPhysicalDeviceFeatures2 to create a VkDevice using vkCreateDevice().
That is, it seems that setting the requested features and actually using that is also distributed to (at least) two separate code sequences.
But, what's more important, I don't see that you first ask if the requested feature actually is supported. You would need a second structure chain to first ask for support using vkGetPhysicalDeviceFeatures2().

From a sample point of view, I think, it's important that the reader of the sample can easily see what features are requested. But I would not consider the details on how that is handled to be important for a sample. Besides for a sample that actually demonstrates how feature requesting is done, of course.

@SaschaWillems
Copy link
Collaborator

But, what's more important, I don't see that you first ask if the requested feature actually is supported. You would need a second structure chain to first ask for support using vkGetPhysicalDeviceFeatures2().

I actually don't do that. If one requests a feature that isn't supported, instance or device creation will fail with an error code hinting at a feature not supported. I want to deliberately keep samples as simple as possible.

@asuessenbach
Copy link
Contributor Author

If one requests a feature that isn't supported, instance or device creation will fail with an error code hinting at a feature not supported.

Interesting approach! That is, you would get some VK_ERROR_FEATURE_NOT_PRESENT on vkCreateDevice and bail out at that point. But you can't tell then, what feature is missing. And how would you handle some optionally used feature?
I would consider to first get the supported features and potentially react accordingly on their absence the "cleaner" and also more descriptive way to handle that.
And I don't think, that code is that much harder to read and understand.

@SRSaunders
Copy link
Contributor

SRSaunders commented Aug 31, 2024

This PR also addresses issue #1144. It fixes the crashes I am seeing due to missing feature support on my AMD 6600XT GPU during vkCreateDevice() initialization on Windows 10 with SDK 1.3.290. Thanks @asuessenbach for this work.

However, I would like to request a couple of small things:

  1. Please update the mutableComparisonSamplers portability feature activation in samples async_compute and multithreading_render_passes.
  2. If possible, please add line type optionality to the dynamic_line_rasterization sample. Some GPUs support a subset of the line types and it would be good to have partial support vs. all-or-nothing. The implementation could be something like what I tried in the now-retracted PR Check dynamic state / line rasterization feature support during sample init #1145.

@asuessenbach
Copy link
Contributor Author

@SRSaunders I adjusted the two samples async_compute and multithreading_render_passes. But the changes in dynamic_line_rasterization would actually change that sample, which I'ld like to avoid with this already pretty big and long standing PR. Would be nice if you could change that after this PR has been merged.

@SRSaunders
Copy link
Contributor

I adjusted the two samples async_compute and multithreading_render_passes

Thanks.

But the changes in dynamic_line_rasterization would actually change that sample, which I'ld like to avoid with this already pretty big and long standing PR. Would be nice if you could change that after this PR has been merged.

No problem. I'd be happy to come back to that once this PR is merged.

@SaschaWillems SaschaWillems self-requested a review September 6, 2024 12:34
SaschaWillems
SaschaWillems previously approved these changes Sep 6, 2024
Copy link
Collaborator

@SaschaWillems SaschaWillems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and works for me.

Though I can't do a full batch run. With the latest SDK I get a GPU crash in the ray tracing extended sample. Doesn't look like it's related to this PR though.

gary-sweet
gary-sweet previously approved these changes Sep 9, 2024
@SRSaunders
Copy link
Contributor

I don't want to interfere with the approvals already granted, but two more samples have now been added using the old gpu.request_extension_features<>() method: dynamic_rendering_local_read and host_image_copy. Applying this PR without fixing these will result in project build failures. I ran into this today when trying to stack up a few pending changes for testing.

Is there any way to speed up approvals for these kinds of framework updates, or at least put in safeguards to restrict certain kinds of changes (e.g. new samples) until they are merged? It appears as an outside user that there is a process weakness here. This one was obvious, but I could see more subtle issues arising in the future.

@SaschaWillems
Copy link
Collaborator

We usually meet twice a month to discuss PRs. That's the process. We'll discus this on our next call and make sure we don't break anything.

@SaschaWillems
Copy link
Collaborator

@asuessenbach : Can you merge main into this so we can see if there are any CI failures as mentioned?

Copy link
Collaborator

@SaschaWillems SaschaWillems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much :)

@marty-johnson59 marty-johnson59 merged commit 5daac4d into KhronosGroup:main Sep 23, 2024
23 checks passed
@asuessenbach asuessenbach deleted the extension_features branch September 24, 2024 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Requesting extension features also enables all available features

5 participants