Order Independent Transparency


A breakdown of how I implemented order independent transparency into my custom engine.


Introduction

This blog post follows on from my previous one on structuring the rendering of a frame within my custom game engine. Information from that post is not necessary for this one, but I'll give a quick rundown of who I am and what project these blogs are about.

My name is Brandon, and I have been using C++ to develop games for the past 8+ years. I have always found games which push their graphics to a new level really impressive, with the Dark Souls and TrackMania series specifically coming to mind. It's games like these that captured and kept my focus, and which eventually led me into the world of graphics and engine development - the subject of these blogs. I am making my own bespoke game engine, and it has now matured to the point where I feel that sharing the development process and its lessons would be beneficial to me, and entertaining to read about.

The aim of this blog post is to break down my journey into transparency within real-time graphics rendering, concluding with my implementation of approximate order independent transparency in my Vulkan rendering code.

What is Transparency?

First and foremost, what is transparency?

There are many ways to visualise what transparency is, some more focused on real-world observations, and others based in theory and maths. In this section I am going to explain it from a real-world observation perspective, and then re-explain it from a mathematical perspective in the order independent transparency part.

Simply put, transparency is the phenomenon where objects can be seen through other objects. It occurs when light reflecting off further away objects passes through closer objects instead of being blocked, so the further objects can still be seen. In the opposite sense, if all of the light reflected from an object is absorbed by a closer object, then you wouldn't be able to see the further away object, making the closer object fully opaque. In reality this appears through materials such as glass and water, which in turn allow for the creation of, for example, windows, glasses, and clouds.

In games, transparency occurs in more places than you may first imagine. There are the obvious cases also seen in reality, such as windows or puddles. But many other areas can use transparency without it being obvious that they do. A clever example of this is foliage, which commonly uses it to combine multiple leaf textures into a single render. Another is billboarding (where far away objects are replaced with a 2D textured quad of the object, to reduce the number of vertices needing to be rendered), which can use transparency to help the quads better fit into the environment. Many particle effects also need light - meaning colour - to be seen through the effect, such as a smoke cloud only slightly obscuring vision. If each smoke particle were fully opaque this wouldn't give the intended result, making transparency required for the effect to work properly. And this is before taking into account the obviously transparent objects in games, such as force-fields, magic effects, and UI elements.

Below is a screenshot from TrackMania Turbo showing a closeup of how they have rendered bushes, which on close inspection are just 2D textured quads that always face the camera:

And after investigating more, far away trees are rendered with exactly the same approach.

Blending

Given that we now know what transparency is as a concept, and some use-cases, the question becomes: how is it implemented in games?

The simplest answer to this would be blending. (A more complex answer would be ray-marching or ray-tracing, but this article is not going to cover either of them.) Blending is where two colours are mixed together based on some pre-determined factors, and then written back out to a store. This keeps both colours 'visible' within the image. The most basic approach would be to give both colours even weighting in the mix, resulting in an equation like the one below:

$$C_{m} = C_{1} \cdot 0.5 + C_{0} \cdot 0.5$$

Which written in a more programmer friendly way looks like the code below:

vec3 mixed = (colourOne + colourZero) / 2.0;

If this were used to render objects to the screen then the general idea of transparency would be achieved. However, the result wouldn't be able to be authored very effectively. To solve this, colour values have a fourth component alongside their usual red, green, and blue channels: alpha. This value represents how much coverage the colour contributes - but for now we can think of it as how opaque the colour is. So a colour with an alpha value of 1.0 is not transparent whatsoever and lets no light through, but one with a value of 0.01 would let a lot of light through, making the transparent object itself hard to see.

Great, so colours have another channel beyond the normal RGB ones, but how is it used in blending? The answer to that depends on how you want what is being drawn to blend into the existing scene. There are many combinations of equations that you can choose from. In OpenGL there is a function provided for setting this equation: glBlendFunc(). The Khronos spec page for the function details what input each option passes into the mix. There are also glBlendEquation() and glBlendFuncSeparate() for further control over the equation.
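As a small illustration (assuming an active OpenGL context and the usual GL headers), enabling one common combination - the over blending covered below - looks like:

// Blend: finalColour = newColour * newAlpha + storedColour * (1 - newAlpha)
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

// The blend equation defaults to addition, but can be set explicitly
glBlendEquation(GL_FUNC_ADD);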

Vulkan provides similar functionality to OpenGL, just exposed in a different way. In Vulkan you set what is known as a 'blend state' as part of the render pipeline, which holds essentially the same data as the OpenGL approach, just in a less global way. Either API's flow is easy to get wrong, so it is worth being very familiar with the whole blend process.

Below is a snippet taken from the Vulkan specification showing the different blend factors that can be set for the source and destination colours (colour one and colour zero). The snippet is from: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkBlendFactor.html

// Provided by VK_VERSION_1_0
typedef enum VkBlendFactor {
    VK_BLEND_FACTOR_ZERO = 0,
    VK_BLEND_FACTOR_ONE = 1,
    VK_BLEND_FACTOR_SRC_COLOR = 2,
    VK_BLEND_FACTOR_ONE_MINUS_SRC_COLOR = 3,
    VK_BLEND_FACTOR_DST_COLOR = 4,
    VK_BLEND_FACTOR_ONE_MINUS_DST_COLOR = 5,
    VK_BLEND_FACTOR_SRC_ALPHA = 6,
    VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA = 7,
    VK_BLEND_FACTOR_DST_ALPHA = 8,
    VK_BLEND_FACTOR_ONE_MINUS_DST_ALPHA = 9,
    VK_BLEND_FACTOR_CONSTANT_COLOR = 10,
    VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_COLOR = 11,
    VK_BLEND_FACTOR_CONSTANT_ALPHA = 12,
    VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_ALPHA = 13,
    VK_BLEND_FACTOR_SRC_ALPHA_SATURATE = 14,
    VK_BLEND_FACTOR_SRC1_COLOR = 15,
    VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR = 16,
    VK_BLEND_FACTOR_SRC1_ALPHA = 17,
    VK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA = 18,
} VkBlendFactor;

The most common combination is known as ‘Over Blending’, or ‘back-to-front blending‘:

$$C_{m}=a_{1}C_{1}+(1-a_{1})C_{0}$$

Which assumes that C0 is further away than C1, and where a1 is the alpha component of C1. Written as code it is:

// C1 is the 'source' colour (the new colour)
// C0 is the 'destination' colour (the colour that is currently stored)
vec4 blendedColour = (C1.a * C1) + ((1.0 - C1.a) * C0);
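And to connect this back to the Vulkan blend state mentioned earlier, here is a sketch (not pulled from my engine) of how over blending with non-premultiplied colours maps onto the pipeline state:

// Standard over blending: src * srcAlpha + dst * (1 - srcAlpha)
VkPipelineColorBlendAttachmentState overState = {};
overState.colorWriteMask      = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
overState.blendEnable         = VK_TRUE;
overState.srcColorBlendFactor = VK_BLEND_FACTOR_SRC_ALPHA;
overState.dstColorBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
overState.colorBlendOp        = VK_BLEND_OP_ADD;
overState.srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
overState.dstAlphaBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
overState.alphaBlendOp        = VK_BLEND_OP_ADD;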

Rendering a Scene Correctly

Now that the theory has been covered, how is it actually used within a rendering context?

Well, first, opaque models must be rendered to the screen, so that the depth buffer doesn't cause issues with the ordering of drawing later on - and so that there is something to blend with. Then the transparent models are rendered. If the approach being used is over blending as detailed before, then all transparent meshes must be sorted by distance and rendered back to front, with further away objects rendered first, so that the blending equation works properly. There is also the option of blending front to back using under blending, though the meshes must be sorted either way, which is the main drawback of this approach. Sorting all transparent objects in a scene may be simple for certain games and brutally difficult for others. For example, if you were rendering a glass sphere in the middle of a smoke screen, you would need to sort every particle and ensure that the sphere is rendered at the right point between particles. Depending on the approach taken for drawing the smoke screen, this may not be too hard, or it may be extremely expensive. A sketch of the sorting step is below.
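As a rough sketch of what that sorting can look like (TransparentMesh and its distance field are stand-ins for whatever your engine actually stores):

#include <algorithm>
#include <vector>

// Hypothetical record for a transparent draw; distance is measured from the
// camera to some reference point on the mesh (its origin, bounds centre, etc.)
struct TransparentMesh
{
    float distanceToCamera;
    // ... handles to vertex data, material, and so on
};

// Sort back-to-front so that further away meshes are drawn first,
// letting over blending build up the image correctly
void sortForOverBlending(std::vector<TransparentMesh>& meshes)
{
    std::sort(meshes.begin(), meshes.end(),
        [](const TransparentMesh& a, const TransparentMesh& b)
        {
            return a.distanceToCamera > b.distanceToCamera;
        });
}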

In some scenes it is impossible to sort the meshes properly, because transparent objects contain one another. For example, imagine a glass sphere with a transparent dragon inside. What should be rendered first? The sphere is both in front of and behind the dragon. The mesh could be split into two, but that adds more overhead to the rendering process. Another solution would be to treat it as a design problem: preventing the issue from arising rather than solving the underlying problem. If your implementation is fine with sorting all transparent meshes, and with avoiding the containment issue, then transparency has been added to your program! This is not where I stopped, however. I was not happy with sorting all meshes, and would rather be able to render them in any order. This is what led me into the area of order independent transparency.

Order Independent Transparency

As a preface to this part of the blog: I am not going to cover anywhere near a complete deep dive into order independent transparency. This is a massive area, with many different approaches that can be taken to solve the problem. What I am going to discuss is the specific approach that I took, and how I added it to my engine.

So, to begin with, what is order independent transparency? As the name implies, it is a transparency approach that allows meshes to be rendered in any order while still giving correct results. What counts as 'correct' is decided case by case, as some implementations are approximate and others exact (meaning that they give the same result as sorting the meshes).

Before I get to the implementation portion of this post, I need to quickly outline which research paper I followed to achieve the results I have. The paper is a 2013 NVIDIA research paper by Morgan McGuire and Louis Bavoil titled: Weighted Blended Order-Independent Transparency. It gives a lot more detail on the concepts I'm going to talk about here, and I highly recommend giving it a read, as I didn't come up with these ideas and haven't spent as long looking into the area as they have.

Partial Coverage

Earlier, I explained transparency as the phenomenon of light passing through objects without losing all of its energy in the process. That idea of light passing, or not passing, through another object can be described by the term 'partial coverage'. What this implies is that one object partially covers another - all logical so far. Now to extend this idea to the alpha value in colours: if we split the four component colour back into two parts, we get a three component colour (RGB) and a coverage factor (A), where the coverage factor describes what proportion of the area the object covers.

It is this splitting of the colour into two sections that is most important here. If you think through the 'normal' process of using over blending to repeatedly blend colours into a texture, you get something that looks like this (equation from McGuire 2013, using pre-multiplied colours):

$$C_{m} = C_{n} + (1 - a_{n})\left[\cdots\left[C_{2} + (1 - a_{2})\left[C_{1} + (1 - a_{1})C_{0}\right]\right]\cdots\right]$$

Which looks like a scary equation, but is just the over blending operation applied repeatedly, working outwards from the furthest layer.

Using this equation, the ordering problem can be thought of in a different way, as being the weighted sum of each component separately, meaning:

$$C_{w}= (\sum_{i=1}^nC_{i})+C_{0}(1-\sum_{i=1}^na_{i})$$

Which may look even scarier than the last one. All this is doing is looping through every blending operation that occurs (from i = 1 to i = n) and adding together their colour contributions. Then the alpha values of all the operations are summed and subtracted from one, to find what proportion of the original colour (C0) is let through, and that proportion of it is added on.
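To make the order independence concrete, here is a tiny standalone C++ sketch (one colour channel for brevity, pre-multiplied values) showing that this weighted sum lands on the same answer whichever order the layers are processed in - something the nested over blending equation cannot claim:

#include <cstdio>

struct Layer { float colour; float alpha; };

// Cw = (sum of Ci) + C0 * (1 - sum of ai)
float weightedSum(const Layer* layers, int count, float background)
{
    float colourSum = 0.0f;
    float alphaSum  = 0.0f;
    for (int i = 0; i < count; ++i)
    {
        colourSum += layers[i].colour;
        alphaSum  += layers[i].alpha;
    }
    return colourSum + background * (1.0f - alphaSum);
}

int main()
{
    Layer frontToBack[] = { { 0.3f, 0.5f }, { 0.1f, 0.25f } };
    Layer backToFront[] = { { 0.1f, 0.25f }, { 0.3f, 0.5f } };

    // Both orderings print the same value, as addition is commutative
    std::printf("%f\n", weightedSum(frontToBack, 2, 1.0f));
    std::printf("%f\n", weightedSum(backToFront, 2, 1.0f));
    return 0;
}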

The only modifications between the above equation and the one I actually used are the introduction of a weighted average, so that each layer's contribution is in line with how much colour and coverage it would actually provide, instead of every layer being weighted equally; and, in addition, a distance weighting function, so that further away objects contribute less than closer ones - helping to stop far away occluded objects from showing through too strongly. Both these steps are detailed in the paper, but as the equations get more unruly I'm not going to outline them here.

Now onto the implementation!

Implementation in Engine

In a single frame there are a number of opaque and transparent meshes, ranging from zero to however many the GPU can hold in memory. This doesn't much matter in this context beyond knowing that the opaque meshes must be, and are, fully rendered first.

The frame I have chosen to breakdown is this one:

Which shows a whole transparent model at the back, and two transparent red cubes in the foreground.

First the opaque models are rendered, giving the result of:

Then two new textures are introduced to hold transparency information: the accumulation and revealage textures. These represent the left and right halves of the equations from before. Here is the accumulation texture for this render:

Not very interesting :( But if the range is restricted to a very small number, in this case around 0.015, then what is actually stored there becomes visible:

Here is the revealage texture (it is greyscale as it is stored as a one-channel texture):

Which after being combined with the opaque render texture gives the result that was shown at the start:

Code Implementation

The image examples above are pretty and all, but how were they made?

Surprisingly, given the equations' complexity, it was fairly simple to implement. First off, two new textures needed to be added: an accumulation texture with a format of VK_FORMAT_R16G16B16A16_SFLOAT (16 bits per channel), and a revealage texture with a format of VK_FORMAT_R32_SFLOAT (32 bits per channel).

Accumulation Texture Format
As a note here, the accumulation texture doesn’t have to have 16 bits per channel. It just needs to have a floating point data type, and as the 8 bit version is not widely supported, I chose to use the 16 bit version.
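For anyone following along in raw Vulkan, here is a rough sketch of what creating the accumulation image might look like (screenWidth and screenHeight are placeholders, and the usage flags assume the texture is rendered to in the transparent pass and then sampled in the combine step):

// Sketch: the revealage image is identical apart from format = VK_FORMAT_R32_SFLOAT
VkImageCreateInfo imageInfo = {};
imageInfo.sType         = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.imageType     = VK_IMAGE_TYPE_2D;
imageInfo.format        = VK_FORMAT_R16G16B16A16_SFLOAT;
imageInfo.extent        = { screenWidth, screenHeight, 1 };
imageInfo.mipLevels     = 1;
imageInfo.arrayLayers   = 1;
imageInfo.samples       = VK_SAMPLE_COUNT_1_BIT;
imageInfo.tiling        = VK_IMAGE_TILING_OPTIMAL;
imageInfo.usage         = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
imageInfo.sharingMode   = VK_SHARING_MODE_EXCLUSIVE;
imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;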

Then, after the opaque models have been drawn, each transparent mesh is rendered. But instead of its colour (including the alpha component) being blended into the colour buffer, it is written out to the two new textures:

layout(location = 0) out vec4 outAccumulationBuffer;
layout(location = 1) out vec4 outRevealageBuffer;

// Distance weighting equation taken from McGuire 2013
float distanceFactor(float alpha)
{
    return alpha * max(0.01, 3000.0 * pow(1.0 - gl_FragCoord.z, 3.0));
}

void main() 
{
    vec4 outColour = ...

    outAccumulationBuffer = outColour * distanceFactor(outColour.a);
    outRevealageBuffer    = vec4(outColour.a);
}

(The code above is GLSL.)

Some key points from the code above: distanceFactor() is the distance weighting function, and the revealage buffer output is a vec4 even though the underlying texture has a single channel. This is to provide the alpha component to the blending process - without it, the blend stage would have no alpha value to use as its factor.

The blending state for each texture is very important, and is:

// Accumulation blend state
VkPipelineColorBlendAttachmentState accumulationState = {};
accumulationState.colorWriteMask      = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT | VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
accumulationState.blendEnable         = VK_TRUE;
accumulationState.srcColorBlendFactor = VK_BLEND_FACTOR_ONE;
accumulationState.dstColorBlendFactor = VK_BLEND_FACTOR_ONE;
accumulationState.colorBlendOp        = VK_BLEND_OP_ADD;
accumulationState.srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
accumulationState.dstAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
accumulationState.alphaBlendOp        = VK_BLEND_OP_ADD;

// Revealage blend state
VkPipelineColorBlendAttachmentState revealageState = {};
revealageState.colorWriteMask      = VK_COLOR_COMPONENT_R_BIT;
revealageState.blendEnable         = VK_TRUE;
revealageState.srcColorBlendFactor = VK_BLEND_FACTOR_ZERO;
revealageState.dstColorBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
revealageState.colorBlendOp        = VK_BLEND_OP_ADD;
revealageState.srcAlphaBlendFactor = VK_BLEND_FACTOR_ZERO;
revealageState.dstAlphaBlendFactor = VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
revealageState.alphaBlendOp        = VK_BLEND_OP_ADD;
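It is worth spelling out what the revealage state actually computes. The source factor of zero means the value written by the shader is discarded; each transparent fragment just multiplies what is already stored by its own (1 - alpha). Assuming the texture is cleared to one before the transparent pass, after n fragments it holds:

$$R = \prod_{i=1}^{n}(1 - a_{i})$$

That is exactly the proportion of the opaque background still visible, and since multiplication is commutative, it is the same no matter what order the fragments arrive in.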

And then the final step is to combine both textures with the opaque colour buffer. How this is done is up to you: it could be a fullscreen quad with a fragment shader that blends using a blend state, or it could be a compute shader, which is how I did it. I don't know if one way is more efficient than the other; it was just a very good excuse to add compute shader support to the engine.

Here is the compute shader I wrote:

#version 440 core
#extension GL_KHR_vulkan_glsl : enable

layout (local_size_x = 16, local_size_y = 16, local_size_z = 1) in;

layout(set = 0, binding = 0) uniform sampler2D accumulationTexture;
layout(set = 0, binding = 1) uniform sampler2D revealageTexture;
layout(set = 0, binding = 2) uniform sampler2D opaqueBufferInput;

// Output image to have the final colour blended with
layout (rgba16f, set = 0, binding = 3) uniform image2D imgOutput;

void main()
{
    // Screen width and height in pixels, used to determine if we are going out of bounds
    vec2 screenRes = textureSize(opaqueBufferInput, 0);

    // Get the current pixel on the screen - (0 -> width, 0 -> height)
    vec2 pixelCoord = vec2(gl_GlobalInvocationID.xy);

    // Break out if we are out of bounds
    if(pixelCoord.x >= screenRes.x || pixelCoord.y >= screenRes.y)
        return;

    // Sample at the pixel centre, hence the half-pixel offset
    vec2 textureCoords = (pixelCoord + vec2(0.5)) / screenRes;

    // Read in the accumulation, revealage, and opaque values
    vec4  accumulation     = texture(accumulationTexture, textureCoords.xy);
    float revealage        = texture(revealageTexture, textureCoords.xy).r;
    vec4  opaqueColourData = texture(opaqueBufferInput, textureCoords.xy);

    // Calculate the colour to write based on the input textures
    vec3 newColourComponent = accumulation.rgb / clamp(accumulation.a, 1e-4, 5e4);
    vec4 newColourValue = vec4(newColourComponent, revealage);

    // Now do the blending by hand, as we are in a compute shader rather than a fragment shader with built-in blending
    vec3 finalData = ((1.0 - newColourValue.a) * newColourValue.rgb) + (newColourValue.a * opaqueColourData.rgb);

    // Write the colour to the image
    imageStore(imgOutput, ivec2(pixelCoord.xy), vec4(finalData, 1.0));
}
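For completeness, here is roughly how the shader gets dispatched (a sketch; commandBuffer, screenWidth, and screenHeight stand in for whatever your engine uses). Each workgroup covers a 16x16 pixel tile, so the group counts are rounded up to cover the whole screen, with the shader's bounds check handling the overhang:

// Round up so that partial tiles at the screen edges are still covered
uint32_t groupCountX = (screenWidth  + 15) / 16;
uint32_t groupCountY = (screenHeight + 15) / 16;
vkCmdDispatch(commandBuffer, groupCountX, groupCountY, 1);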

I have not yet tested whether different local sizes are faster, nor have I looked into optimising the shader, so I am sure there are places where it could be made faster.

Drawbacks

The main drawback of this version of order independent transparency is the extra VRAM (video RAM) it requires, due to the extra internal textures used for storing transparent pixel data. But in my opinion, the lack of sorting is well worth the extra memory usage.

Conclusion

This blog post has covered what transparency is and how it is used within games to create a range of visuals. It then moved on to order independent transparency and the approach I took to implement it in my custom game engine, with an example frame's rendering broken down.

For the next blog I am going to cover deferred rendering, and how I have added a somewhat unique version of it into my engine. To make sure that you don't miss the next article, you can subscribe to the newsletter, which will send you an email whenever I upload a new post in this series :D

Thanks for reading.