So many descriptors in Vulkan

Brainwash is always somewhere

It was really a mess when I started to port my engine to Vulkan: there are so many new data types mapped to concepts that seemed to come from nowhere, none of which I had to care about before. Luckily those concepts are well designed, and once you understand what they are, the pain you went through just disappears.

The new generation of graphics APIs all hand the responsibility for CPU-GPU communication over to the user, more or less explicitly. If you want to ask the GPU to do something for you now, there is no longer a ready-made API where you just feed in some data from the CPU side and everything else is handled by “somebody”.

Let’s say the GPU, as always, knows absolutely nothing about what you want to do. That “(ex-)somebody” used to be your graphics card vendor’s implementation of the graphics API, which did the “trivial” underlying pipeline work for you. It is gone now, and you have to take care of everything it did for you before. Sounds like a fairly bad break-up!

But what do you gain from a break-up? (Almost) freedom. The GPU is now more like a general computing server that exposes itself through the new generation of lower-level APIs. As the client, we need to submit the computing work with a detailed enough work description, and then keep feeding in data and commands that follow the description we agreed on with the GPU. In practice, though, most of the work I do, like a lot of people who found this article, is rendering, and for this purpose the new APIs are still designed with lots of rendering-specific concepts (because the GPU is still a “Graphics” Processing Unit today :)).

But leaving the details aside, the whole thing is easy to understand. All we need to do is fit ourselves into the new CPU-GPU communication model: create a work description, submit work, repeat. That’s all. Now it’s time to write the code, and you may want to say “whaaaaat” when you type “vk” inside your IDE, if it has some kind of autocomplete. Yep, too many data types!

What is it?

“…A descriptor is an opaque data structure representing a shader resource such as a buffer, buffer view, image view, sampler, or combined image sampler. Descriptors are organised into descriptor sets, which are bound during command recording for use in subsequent draw commands…” -13. Resource Descriptors, Vulkan® 1.1.106 - A Specification (with all published extensions)

If you asked me what the most beneficial thing I got from the journey of writing a game engine is, I would answer, “Don’t Panic”. One headache when I play with Vulkan is descriptors. I couldn’t grasp what they meant at the very beginning, because with an OpenGL mindset there is no corresponding concept.

But from a fresh point of view they really do make sense, because we have to give the GPU the work description: where the resources are, how the shader will access them, and through which kind of view, since data is just some bytes inside GPU memory. So it’s really better to build a new mindset closer to the GPU pipeline.

“…Descriptors are grouped together into descriptor set objects. A descriptor set object is an opaque object that contains storage for a set of descriptors, where the types and number of descriptors is defined by a descriptor set layout. The layout object may be used to define the association of each descriptor binding with memory or other hardware resources. The layout is used both for determining the resources that need to be associated with the descriptor set, and determining the interface between shader stages and shader resources…” -13.2. Descriptor Sets, Vulkan® 1.1.106 - A Specification (with all published extensions)

Actually, there isn’t a “descriptor” data type that we can interact with directly on the CPU side; as the specification says, it’s opaque. Creating descriptors comes down to combining a set handle (VkDescriptorSet), buffer or image binding info (VkDescriptorBufferInfo/VkDescriptorImageInfo) and a write or copy operation (VkWriteDescriptorSet/VkCopyDescriptorSet). You acquire a set from a pool (vkAllocateDescriptorSets), and both the set and the pool have their own characteristics or usage hints, which you specify before creating them. These are configured with the layout (VkDescriptorSetLayout) and the create/allocate info structures (VkDescriptorPoolCreateInfo and VkDescriptorSetAllocateInfo). After you allocate the set you have to provide the info about which buffer or image it binds to, and finally apply this information with a write or copy operation (vkUpdateDescriptorSets). The downside of a plain code example is that it doesn’t show the relations between the data structures and functions very intuitively, so I made a little flow graph to demonstrate.

Every thick line indicates that a real object instance is created by a function invocation, while a dotted line indicates a dependency between the info structures. I omitted the other dependencies of creating a VkPipeline since they are not related to the topic I’m talking about.

By now you may start to feel that Vulkan has a really clean architecture model, and indeed it does. We get far more flexibility: we can have many descriptors in many different descriptor sets in many descriptor pools, with many different combinations of configurations. One thing that breaks the naming convention a little is VkWriteDescriptorSet and VkCopyDescriptorSet: they can only be filled in directly by the user and submitted later, rather than acquired from the VkDevice. (I wondered why there isn’t something like a VkWriteDescriptorSetCreateInfo, but that would be too redundant for creating a VkWriteDescriptorSet, because after all it’s just an operation on the descriptor. Maybe it would be better called VkDescriptorSetWriteOp?)

Some use cases

Let’s have a look at some code examples, all of which come from real scenarios I have run into.

Single UBO data accessed per shader stage

I have a UBO for main-camera data such as the projection matrix, which is updated only once per frame. The GLSL code looks like this:

layout(std140, row_major, set = 0, binding = 0) uniform cameraUBO
{
	mat4 uni_p_camera_original;
};

Then it’s better to answer some questions before creating the descriptor-related data:

How many descriptors will we have inside the pool? Only one.
Which resource type will they be used for? Uniform buffer.
How many different types and how many descriptors will be allocated from this pool? Only one type, and it will hold only one descriptor.
How many sets can it hold in total? Only one.

VkDescriptorPoolSize l_poolSize = {};
l_poolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
l_poolSize.descriptorCount = 1;

VkDescriptorPoolCreateInfo l_poolInfo = {};
l_poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
l_poolInfo.poolSizeCount = 1;
l_poolInfo.pPoolSizes = &l_poolSize; // pPoolSizes expects a pointer
l_poolInfo.maxSets = 1;

Create VkDescriptorPool.

VkDescriptorPool l_pool;
vkCreateDescriptorPool(m_device, &l_poolInfo, nullptr, &l_pool);

Where will it be bound in the shader? Binding point 0.
How many descriptors will be bound? Only one.
Which shader stage can access it? The vertex shader.

VkDescriptorSetLayoutBinding l_setLayoutBinding = {};
l_setLayoutBinding.binding = 0;
l_setLayoutBinding.descriptorCount = 1;
l_setLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
l_setLayoutBinding.pImmutableSamplers = nullptr;
l_setLayoutBinding.stageFlags = VK_SHADER_STAGE_VERTEX_BIT;

VkDescriptorSetLayoutCreateInfo l_layoutCreateInfo = {};
l_layoutCreateInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
l_layoutCreateInfo.bindingCount = 1;
l_layoutCreateInfo.pBindings = &l_setLayoutBinding; // pBindings expects a pointer

Create VkDescriptorSetLayout.

VkDescriptorSetLayout l_setLayout;
vkCreateDescriptorSetLayout(m_device, &l_layoutCreateInfo, nullptr, &l_setLayout);

Create VkDescriptorSet.

VkDescriptorSetAllocateInfo l_allocInfo = {};
l_allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
l_allocInfo.descriptorPool = l_pool;
l_allocInfo.descriptorSetCount = 1;
l_allocInfo.pSetLayouts = &l_setLayout;

VkDescriptorSet l_set;
vkAllocateDescriptorSets(m_device, &l_allocInfo, &l_set);

Which resource will it be bound to? A UBO.
Where to bind? Binding point 0.
Which VkDescriptorSet does the write operation target? The one we just created.

VkDescriptorBufferInfo l_bufferInfo = {};
l_bufferInfo.buffer = m_cameraUBO; // created from somewhere else
l_bufferInfo.offset = 0;
l_bufferInfo.range = sizeof(CameraGPUData);

VkWriteDescriptorSet l_writeDescriptorSet = {};
l_writeDescriptorSet.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
l_writeDescriptorSet.dstBinding = 0;
l_writeDescriptorSet.dstSet = l_set;
l_writeDescriptorSet.dstArrayElement = 0;
l_writeDescriptorSet.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
l_writeDescriptorSet.descriptorCount = 1;
l_writeDescriptorSet.pBufferInfo = &l_bufferInfo;

vkUpdateDescriptorSets(
		m_device, // VkDevice handle
		1, // only 1 VkWriteDescriptorSet
		&l_writeDescriptorSet,
		0,
		nullptr);

Then, when I record commands, the only thing left is a call to vkCmdBindDescriptorSets:

vkCmdBindDescriptorSets(
	m_commandBuffer, // VkCommandBuffer handle, passed by value
	VK_PIPELINE_BIND_POINT_GRAPHICS,
	m_pipelineLayout,
	0, // the first set is set 0
	1, // and one set only
	&l_set, // the set allocated above
	0, nullptr);

Array UBO data accessed per shader stage

My punctual light data lives in an array that contains all the lights and is uploaded to the GPU once per frame, and I iterate through the array in a deferred-style light pass inside the fragment shader. The GLSL code looks like this:

#define MAX_POINT_LIGHT 64
// w component of luminance is attenuationRadius
struct pointLight {
	vec4 position;
	vec4 luminance;
	//float attenuationRadius;
};

layout(set = 0, binding = 2) uniform pointLightUBO
{
	pointLight uni_pointLights[MAX_POINT_LIGHT];
};

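The only things that change in the C++ code are the binding point and the buffer range, since it’s an array. Because this block is read in the fragment shader, the stage flags of the layout binding also need to include VK_SHADER_STAGE_FRAGMENT_BIT.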

#define MAX_POINT_LIGHT 64
VkDescriptorSetLayoutBinding l_setLayoutBinding = {};
l_setLayoutBinding.binding = 2;

VkDescriptorBufferInfo l_bufferInfo = {};
l_bufferInfo.buffer = m_pointLightUBO; // created from somewhere else
l_bufferInfo.offset = 0;
l_bufferInfo.range = sizeof(PointLightGPUData) * MAX_POINT_LIGHT; // the total size of the UBO array

VkWriteDescriptorSet l_writeDescriptorSet = {};
l_writeDescriptorSet.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
l_writeDescriptorSet.dstBinding = 2;
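The buffer range above only matches the shader if the CPU-side struct has the same size as the array stride of the GLSL block. The post never shows PointLightGPUData, so the definition below is a hypothetical mirror of the GLSL pointLight struct, and kMaxPointLight and pointLightUBOSize are names made up for this sketch. It checks the layout at compile time and compares the total size against 16384 bytes, the minimum maxUniformBufferRange every Vulkan implementation has to support:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors MAX_POINT_LIGHT in the shader (hypothetical C++ name).
constexpr std::size_t kMaxPointLight = 64;

// Hypothetical CPU-side mirror of the GLSL pointLight struct.
// Under std140 each vec4 is four 32-bit floats aligned to 16 bytes.
struct alignas(16) PointLightGPUData
{
	float position[4];
	float luminance[4]; // w component stores the attenuation radius
};

// The struct must be exactly as large as the std140 array stride.
static_assert(sizeof(PointLightGPUData) == 32, "two vec4s take 32 bytes");
static_assert(offsetof(PointLightGPUData, luminance) == 16, "second vec4 starts at byte 16");

// Total byte size to use for VkDescriptorBufferInfo::range.
constexpr std::size_t pointLightUBOSize()
{
	return sizeof(PointLightGPUData) * kMaxPointLight;
}

// Smallest maxUniformBufferRange an implementation is allowed to report.
constexpr std::size_t kGuaranteedMaxUniformBufferRange = 16384;
static_assert(pointLightUBOSize() <= kGuaranteedMaxUniformBufferRange,
	"the light array must fit into a single uniform buffer range");
```

With 64 lights of 32 bytes each the block takes 2048 bytes, so there is still room to raise the light count before the guaranteed limit becomes a problem.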

The command submission is exactly the same as in the previous example:

	vkCmdBindDescriptorSets(
	m_commandBuffer,
	VK_PIPELINE_BIND_POINT_GRAPHICS,
	m_pipelineLayout,
	0, // the first set is set 0
	1, // and one set only
	&l_descriptorSet,
	0,
	nullptr);

Array UBO data accessed per draw call object

The local-to-world transformation matrix needs to be updated per object, and for this we can use a dynamic uniform buffer (forget your glUniform* things!). Each frame I update a UBO array that contains the information of all drawable meshes, and later use an offset to access the corresponding part per draw call.

layout(std140, row_major, set = 0, binding = 1) uniform meshUBO
{
	mat4 uni_m;
};

Now in the C++ code I need to specify a different descriptor type, and also a different binding point because another resource is bound at binding point 0, but that’s trivial.

VkDescriptorPoolSize l_poolSize = {};
l_poolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC;

VkDescriptorSetLayoutBinding l_layoutBinding = {};
l_layoutBinding.binding = 1;
l_layoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC;

VkDescriptorBufferInfo l_bufferInfo = {};
l_bufferInfo.buffer = m_meshUBO;
l_bufferInfo.offset = 0;
l_bufferInfo.range = sizeof(MeshGPUData);

Since it is accessed dynamically, the buffer range is one block rather than the whole UBO data array.

And when binding the descriptor set we need to specify the dynamic offset. I use an implementation like this:

uint32_t l_blockSize = sizeof(MeshGPUData); // must be a multiple of minUniformBufferOffsetAlignment

for (int i = 0; i < total_meshes_this_frame; i++)
{
	uint32_t l_offset = l_blockSize * i;

	vkCmdBindDescriptorSets(
	m_commandBuffer,
	VK_PIPELINE_BIND_POINT_GRAPHICS,
	m_pipelineLayout,
	0, // the first set is set 0
	1, // and one set only
	&l_descriptorSet,
	1, // Now we have one dynamic offset
	&l_offset // the offset value
	);

	// draw call
	//...
	//
}
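One detail the loop above glosses over: every dynamic offset has to be a multiple of VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment, which a device may report as up to 256 bytes, while a block holding a single mat4 is only 64 bytes. A minimal sketch of how the per-object stride could be padded, assuming the limit has already been queried from the physical device (alignUp and dynamicOffset are hypothetical helper names, not Vulkan functions):

```cpp
#include <cassert>
#include <cstdint>

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two, which the specification requires
// for minUniformBufferOffsetAlignment.
inline std::uint32_t alignUp(std::uint32_t size, std::uint32_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (size + alignment - 1) & ~(alignment - 1);
}

// Byte offset of the block at `index` when every block is padded
// to `alignedStride` bytes.
inline std::uint32_t dynamicOffset(std::uint32_t index, std::uint32_t alignedStride)
{
	return index * alignedStride;
}
```

With a 64-byte block and an alignment of 256, alignUp returns 256, so the third object starts at byte 512 instead of byte 128. The buffer itself then has to be allocated with the padded stride times the object count, while VkDescriptorBufferInfo::range can stay at the unpadded block size.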

Multiple array UBO data accessed per draw call object

This is a combination of the previous situations: the mesh UBO and the material UBO change per draw call, while the camera UBO is updated once per frame. We can still handle all of them in one set.

The vertex shader looks like this:

layout(std140, row_major, set = 0, binding = 0) uniform cameraUBO
{
	mat4 uni_p_camera_original;
};

layout(std140, row_major, set = 0, binding = 1) uniform meshUBO
{
	mat4 uni_m;
};

While the fragment shader looks like this:

layout(std140, set = 0, binding = 2) uniform materialUBO
{
	vec4 uni_albedo;
	vec4 uni_MRAT;
};

Now the C++ code:

How many descriptors will we have inside the pool? Now 3.
Which resource types will they be used for? Normal uniform buffer and dynamic uniform buffer.
How many different types and how many descriptors will be allocated from this pool? 2 types, 3 descriptors.
How many sets can it hold in total? Still 1.

VkDescriptorPoolSize l_cameraUBOPoolSize = {};
VkDescriptorPoolSize l_meshUBOPoolSize = {};
VkDescriptorPoolSize l_materialUBOPoolSize = {};

l_cameraUBOPoolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
l_cameraUBOPoolSize.descriptorCount = 1;

l_meshUBOPoolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC;
l_meshUBOPoolSize.descriptorCount = 1;

l_materialUBOPoolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC;
l_materialUBOPoolSize.descriptorCount = 1;

VkDescriptorPoolSize l_UBOPoolSizes[] = { l_cameraUBOPoolSize , l_meshUBOPoolSize, l_materialUBOPoolSize };

VkDescriptorPoolCreateInfo l_poolInfo = {};
l_poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
l_poolInfo.poolSizeCount = 3;
l_poolInfo.pPoolSizes = l_UBOPoolSizes;
l_poolInfo.maxSets = 1;
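Listing the same descriptor type twice in pPoolSizes works, because the pool is created with room for the total number of descriptors of each type, but the two dynamic entries could just as well be merged into one entry with a count of 2. As a rough sketch of how that bookkeeping might be automated, the helper below sums the counts per type. It uses hypothetical stand-in types (DescriptorType, BindingDesc, PoolSize) instead of the real Vulkan structs so that it compiles without the Vulkan headers, and it assumes every set allocated from the pool uses the same layout:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for VkDescriptorType.
enum class DescriptorType { UniformBuffer, UniformBufferDynamic };

// Hypothetical stand-in for the VkDescriptorSetLayoutBinding fields used here.
struct BindingDesc
{
	std::uint32_t binding;
	DescriptorType type;
	std::uint32_t descriptorCount;
};

// Hypothetical stand-in for VkDescriptorPoolSize.
struct PoolSize
{
	DescriptorType type;
	std::uint32_t descriptorCount;
};

// Sum the descriptor counts per type and scale them by the number of
// sets the pool has to serve (assumes all sets share this layout).
std::vector<PoolSize> buildPoolSizes(const std::vector<BindingDesc>& bindings, std::uint32_t maxSets)
{
	std::map<DescriptorType, std::uint32_t> counts;
	for (const auto& b : bindings)
	{
		counts[b.type] += b.descriptorCount * maxSets;
	}

	std::vector<PoolSize> result;
	for (const auto& entry : counts)
	{
		result.push_back({ entry.first, entry.second });
	}
	return result;
}
```

For the three bindings of this example and one set, this yields two entries: one uniform buffer descriptor and two dynamic uniform buffer descriptors.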

The other parts are all the same, just change the binding points and the buffer info. And when binding the descriptor set, the dynamic offset is now an array:

uint32_t l_meshDataBlockSize = sizeof(MeshGPUData);
uint32_t l_materialDataBlockSize = sizeof(MaterialGPUData); // assuming a MaterialGPUData struct that mirrors materialUBO

for (int i = 0; i < total_meshes_this_frame; i++)
{
	uint32_t l_meshOffset = l_meshDataBlockSize * i;
	uint32_t l_materialOffset = l_materialDataBlockSize * i;
	// one offset per dynamic binding, in increasing binding order (1, then 2)
	uint32_t l_offsets[] = { l_meshOffset, l_materialOffset };

	vkCmdBindDescriptorSets(
	m_commandBuffer,
	VK_PIPELINE_BIND_POINT_GRAPHICS,
	m_pipelineLayout,
	0, // the first set is set 0
	1, // and one set only
	&l_descriptorSet,
	2, // Now we have two dynamic offsets
	l_offsets // the offset value array
	);

	// draw call
	//...
	//
}
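The order of the entries in the offset array matters: within one set, Vulkan consumes the dynamic offsets in increasing binding order, so the mesh offset (binding 1) has to come before the material offset (binding 2). A small sketch of a helper that enforces this ordering, using a hypothetical DynamicBinding struct and a hypothetical buildDynamicOffsets function rather than anything from the Vulkan API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one dynamic UBO binding in the set layout.
struct DynamicBinding
{
	std::uint32_t binding;       // binding number used in the shader
	std::uint32_t alignedStride; // bytes between consecutive per-object blocks
};

// Build the offsets to pass as pDynamicOffsets for the object at `index`.
// The result is ordered by increasing binding number, whatever the input order.
std::vector<std::uint32_t> buildDynamicOffsets(std::vector<DynamicBinding> bindings, std::uint32_t index)
{
	std::sort(bindings.begin(), bindings.end(),
		[](const DynamicBinding& a, const DynamicBinding& b) { return a.binding < b.binding; });

	std::vector<std::uint32_t> offsets;
	offsets.reserve(bindings.size());
	for (const auto& b : bindings)
	{
		offsets.push_back(b.alignedStride * index);
	}
	return offsets;
}
```

The returned vector can be handed to vkCmdBindDescriptorSets through its size() and data(). The strides are assumed to be padded already to the device's minUniformBufferOffsetAlignment.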

That’s it! Sampler descriptors are similar to uniform buffers, and all the rules I applied above work for them as well. Freedom means more responsibility and caution, but we now have more room to optimize the whole rendering pipeline to a new level of efficiency. I will keep investigating how to use the power of Vulkan in a more concise way. After all, it’s quite a bit more complex than OpenGL!

Published Apr 21, 2019

Random randomness in randomized randomization. Hang Zhang on Twitter