OpenGL SuperBible: Comprehensive Tutorial and Reference, Sixth Edition (2013)

Part III: In Practice

Chapter 12. Rendering Techniques

What You’ll Learn in This Chapter

• How to light the pixels in your scene

• How to delay shading until the last possible moment

• How to render an entire scene without a single triangle

By this point in the book, you should have a good grasp of the fundamentals of OpenGL. You have been introduced to most of its features and should feel comfortable using it to implement graphics rendering algorithms. In this chapter, we take a look at a few of these algorithms — in particular those that might be interesting in a real-time rendering context. First, we will cover a few basic lighting techniques that will allow you to apply interesting shading to the objects in your scene. Then, we will take a look at some approaches to rendering without the goal of photo-realism. Finally, we will discuss some algorithms that are really only applicable outside the traditional forward-rendering geometry pipeline, ultimately culminating with rendering an entire scene without a single vertex or triangle.

Lighting Models

Arguably, the job of any graphics rendering application is the simulation of light. Whether it be the simplest spinning cube or the most complex movie special effect ever invented, we are trying to convince the user that they are seeing the real world, or an analog of it. To do this, we must model the way that light interacts with surfaces. Extremely advanced models exist that are physically accurate, at least as far as we understand the properties of light. However, most of these are impractical for real-time implementation, and so we must settle for approximations: models that produce plausible results even if they are not physically accurate. The following few sections show how a few of the lighting models that you might use in a real-time application can be implemented.

The Phong Lighting Model

One of the most common lighting models is the Phong lighting model. It works on a simple principle: objects have three material properties, the ambient, diffuse, and specular reflectivity. These properties are assigned color values, with brighter colors representing a higher amount of reflectivity. Light sources have the same three properties and are again assigned color values that represent the brightness of the light. The final calculated color value is then the sum of the lighting and material interactions of these three properties.

Ambient Light

Ambient light doesn’t come from any particular direction. It has an original source somewhere, but the rays of light have bounced around the room or scene and become directionless. Objects illuminated by ambient light are evenly lit on all surfaces in all directions. You can think of ambient light as a global “brightening” factor applied per light source. This lighting component really approximates scattered light in the environment that originates from the light source.

To calculate the contribution an ambient light source makes to the final color, the ambient material property is scaled by the ambient light values (the two color values are just multiplied), which yields the ambient color contribution. In GLSL shader speak, we would write this like so:

uniform vec3 ambient = vec3(0.1, 0.1, 0.1);
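
Here, the ambient material and light terms have been folded into a single uniform that represents their product. If the two terms are kept separate, the ambient contribution might look like the following sketch (the vAmbientMaterial and vAmbientLight names are illustrative and not part of the book’s sample code):

uniform vec3 vAmbientMaterial;
uniform vec3 vAmbientLight;

// The ambient contribution is simply the product of the two color values
vec3 vAmbientColor = vAmbientMaterial * vAmbientLight;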

Diffuse Light

Diffuse light is the directional component of a light source and was the subject of our previous example lighting shader. In the Phong lighting model, the diffuse material and lighting values are multiplied together, as is done with the ambient components. However, this value is then scaled by the dot product of the surface normal and light vector, which is the direction vector from the point being shaded to the light. Again, in shader speak, this might look something like this:

uniform vec3 vDiffuseMaterial;
uniform vec3 vDiffuseLight;
float fDotProduct = max(0.0, dot(vNormal, vLightDir));
vec3 vDiffuseColor = vDiffuseMaterial * vDiffuseLight * fDotProduct;

Note that we did not simply take the dot product of the two vectors, but also employed the GLSL function max. The dot product can also be a negative number, and we really can’t have negative lighting or color values. Anything less than zero needs to just be zero.

Specular Highlight

Like diffuse light, specular light is a highly directional property, but it interacts more sharply with the surface and in a particular direction. A highly specular light (really a material property in the real world) tends to cause a bright spot on the surface it shines on, which is called the specular highlight. Because of its highly directional nature, it is even possible that depending on a viewer’s position, the specular highlight may not even be visible. A spotlight and the sun are good examples of sources that produce strong specular highlights, but of course they must be shining on an object that is “shiny.”

The contribution of the specular material and light colors is scaled by a value that requires a bit more computation than we’ve done so far. First, we find the reflection of the inverted light vector about the surface normal. The dot product between this reflection vector and the view (eye) vector is then raised to a “shininess” power. The higher the shininess number, the smaller the resulting specular highlight turns out to be. Some shader skeleton code that does this is shown here.

uniform vec3 vSpecularMaterial;
uniform vec3 vSpecularLight;
float shininess = 128.0;

// vEyeDir is the normalized view vector (from the point being shaded to the eye)
vec3 vReflection = reflect(-vLightDir, vEyeNormal);
float EyeReflectionAngle = max(0.0, dot(vEyeDir, vReflection));
float fSpec = pow(EyeReflectionAngle, shininess);
vec3 vSpecularColor = vSpecularLight * vSpecularMaterial * fSpec;

The shininess parameter could easily be a uniform just like anything else. Traditionally (from the fixed-function pipeline days), the highest specular power is set to 128. Numbers greater than this tend to have a diminishingly small effect.

Now, we have formed a complete equation for modeling the effect of lighting on a surface. Given a material with ambient term ka, diffuse term kd, specular term ks, and shininess factor α, and a light with ambient term ia, diffuse term id, and specular term is, the complete lighting formula is

\[ I_p = k_a i_a + k_d i_d \,(\hat{N} \cdot \hat{L}) + k_s i_s \,(\hat{R} \cdot \hat{V})^{\alpha} \]

This equation is a function of several vectors: N, L, R, and V, which represent the surface normal, the unit vector from the point being shaded to the light, the reflection of the negated light vector about the plane defined by N, and the vector from the point being shaded to the viewer, respectively. To understand why this works, consider the vectors shown in Figure 12.1.

Image

Figure 12.1: Vectors used in Phong lighting

In Figure 12.1, −L is shown pointing away from the light. If we then reflect that vector about the plane defined by the surface normal N, it is obvious from the diagram that we end up with R. This represents the reflection of the light source in the surface. When R points away from the viewer, the reflection will not be visible. However, when R points directly at the viewer, the reflection will appear brightest. At this point, the dot product of R and V (which, remember, is the cosine of the angle between two normalized vectors) will be greatest. This is the specular highlight, which is view dependent.

The effect of diffuse shading also becomes clearer from Figure 12.1. When the light source shines directly on the surface, the vector L will be perpendicular to the surface and therefore colinear with N, and this is where the dot product between N and L is greatest. When the light strikes the surface at a grazing angle, L and N will be almost perpendicular to one another, and their dot product will be close to zero.

As you can see, the intensity of the light at point p (Ip) is calculated as the sum of a number of terms. The reflection vector R (the variable R in the shader) is calculated by reflecting the light vector around the eye-space normal of the point being shaded.
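
Expressed directly in GLSL, the equation above might look like the following sketch. The names k_a, k_d, k_s, i_a, i_d, i_s, and alpha are illustrative stand-ins for the material and light terms; the actual sample shader in Listing 12.1 folds some of these terms together.

// Sketch: a direct transcription of the Phong equation.
// N, L, R, and V are assumed to already be normalized, and the dot products
// are clamped to zero to avoid negative lighting contributions.
vec3 i_p = k_a * i_a +
           k_d * i_d * max(dot(N, L), 0.0) +
           k_s * i_s * pow(max(dot(R, V), 0.0), alpha);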

The sample program phonglighting implements just such a shader. The sample implements the technique known as Gouraud shading, where we compute the lighting values per vertex and then simply interpolate the resulting colors between vertices for the shading. This allows us to implement the entire lighting equation in the vertex shader. The complete listing of the vertex shader is given in Listing 12.1.


#version 420 core

// Per-vertex inputs
layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;

// Matrices we'll need
layout (std140) uniform constants
{
mat4 mv_matrix;
mat4 view_matrix;
mat4 proj_matrix;
};

// Light and material properties
uniform vec3 light_pos = vec3(100.0, 100.0, 100.0);
uniform vec3 diffuse_albedo = vec3(0.5, 0.2, 0.7);
uniform vec3 specular_albedo = vec3(0.7);
uniform float specular_power = 128.0;
uniform vec3 ambient = vec3(0.1, 0.1, 0.1);

// Outputs to the fragment shader
out VS_OUT
{
vec3 color;
} vs_out;

void main(void)
{
// Calculate view-space coordinate
vec4 P = mv_matrix * position;

// Calculate normal in view space
vec3 N = mat3(mv_matrix) * normal;
// Calculate view-space light vector
vec3 L = light_pos - P.xyz;
// Calculate view vector (simply the negative of the
// view-space position)
vec3 V = -P.xyz;

// Normalize all three vectors
N = normalize(N);
L = normalize(L);
V = normalize(V);

// Calculate R by reflecting -L around the plane defined by N
vec3 R = reflect(-L, N);

// Calculate the diffuse and specular contributions
vec3 diffuse = max(dot(N, L), 0.0) * diffuse_albedo;
vec3 specular = pow(max(dot(R, V), 0.0), specular_power) *
specular_albedo;

// Send the color output to the fragment shader
vs_out.color = ambient + diffuse + specular;

// Calculate the clip-space position of each vertex
gl_Position = proj_matrix * P;
}


Listing 12.1: The Gouraud shading vertex shader

The fragment shader for Gouraud shading is very simple. As the final color of each fragment is essentially calculated in the vertex shader and then interpolated before being passed to the fragment shader, all we need to do in our fragment shader is write the incoming color to the framebuffer. The complete source code is shown in Listing 12.2.


#version 420 core

// Output
layout (location = 0) out vec4 color;

// Input from vertex shader
in VS_OUT
{
vec3 color;
} fs_in;

void main(void)
{
// Write incoming color to the framebuffer
color = vec4(fs_in.color, 1.0);
}


Listing 12.2: The Gouraud shading fragment shader

Unless you use a very high level of tessellation, a given triangle has only three vertices and usually many more fragments that fill it out. This makes per-vertex lighting and Gouraud shading very efficient, as all the computations are done only once per vertex. Figure 12.2 shows the output of the phonglighting example program.

Image

Figure 12.2: Per-vertex lighting (Gouraud shading)

Phong Shading

One of the drawbacks to Gouraud shading is clearly apparent in Figure 12.2. Notice the starburst pattern of the specular highlight. On a still image, this might almost pass as an intentional artistic effect. The running sample program, however, rotates the sphere and shows a characteristic flashing that is a bit distracting and generally undesirable. This is caused by the discontinuity between triangles because the color values are being interpolated linearly through color space. The bright lines are actually the seams between individual triangles. One way to reduce this effect is to use more and more vertices in your geometry.

Another, and higher quality, method is called Phong shading. Note that Phong shading and the Phong lighting model are separate things — although they were both invented by the same person at the same time. With Phong shading, instead of interpolating the color values between vertices, we interpolate the surface normals between vertices and then use the resulting normal to perform the entire lighting calculation for each pixel instead of per vertex. The phonglighting example program can be switched between evaluating the lighting equations per vertex (and therefore implementing Gouraud shading) and evaluating them per fragment (implementing Phong shading). Figure 12.3 shows the output from the phonglighting sample program performing shading per fragment.

Image

Figure 12.3: Per-fragment lighting (Phong shading)

The trade-off is, of course, that we are now doing significantly more work in the fragment shader, which will be executed many more times than the vertex shader. The basic code is the same as for the Gouraud shading example, but this time there is some significant rearranging of the shader code. Listing 12.3 shows the new vertex shader.


#version 420 core

// Per-vertex inputs
layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;

// Matrices we'll need
layout (std140) uniform constants
{
mat4 mv_matrix;
mat4 view_matrix;
mat4 proj_matrix;
};

// Outputs to the fragment shader
out VS_OUT
{
vec3 N;
vec3 L;
vec3 V;
} vs_out;

// Position of light
uniform vec3 light_pos = vec3(100.0, 100.0, 100.0);
void main(void)
{
// Calculate view-space coordinate
vec4 P = mv_matrix * position;

// Calculate normal in view-space
vs_out.N = mat3(mv_matrix) * normal;

// Calculate light vector
vs_out.L = light_pos - P.xyz;

// Calculate view vector
vs_out.V = -P.xyz;

// Calculate the clip-space position of each vertex
gl_Position = proj_matrix * P;
}


Listing 12.3: The Phong shading vertex shader

All the lighting computations depend on the surface normal, light direction, and view vector. Instead of passing a computed color value from each vertex, we pass these three vectors as the outputs vs_out.N, vs_out.L, and vs_out.V. Now the fragment shader has significantly more work to do than before, and it is shown in Listing 12.4.


#version 420 core

// Output
layout (location = 0) out vec4 color;

// Input from vertex shader
in VS_OUT
{
vec3 N;
vec3 L;
vec3 V;
} fs_in;

// Material properties
uniform vec3 diffuse_albedo = vec3(0.5, 0.2, 0.7);
uniform vec3 specular_albedo = vec3(0.7);
uniform float specular_power = 128.0;

void main(void)
{
// Normalize the incoming N, L, and V vectors
vec3 N = normalize(fs_in.N);
vec3 L = normalize(fs_in.L);
vec3 V = normalize(fs_in.V);

// Calculate R locally
vec3 R = reflect(-L, N);

// Compute the diffuse and specular components for each
// fragment
vec3 diffuse = max(dot(N, L), 0.0) * diffuse_albedo;
vec3 specular = pow(max(dot(R, V), 0.0), specular_power) *
specular_albedo;

// Write final color to the framebuffer
color = vec4(diffuse + specular, 1.0);
}


Listing 12.4: The Phong shading fragment shader

On today’s hardware, higher quality rendering choices such as Phong shading are often practical. The improvement in visual quality is dramatic, and performance is often only marginally compromised. Still, on lower powered hardware (such as an embedded device) or in a scene where many other already expensive choices have been made, Gouraud shading may be the best choice. A general shader performance optimization rule is to move as much processing out of the fragment shaders and into the vertex shader as possible. With this example, you can see why.

The main parameters that are passed to the Phong lighting equations (whether they be evaluated per vertex or per fragment) are the diffuse and specular albedo and the specular power. The first two are the colors of the diffuse and specular lighting effect produced by the material being modeled. Normally, they are either the same color or the diffuse albedo is the color of the material and the specular albedo is white. However, it’s also possible to make the specular albedo a completely different color to the diffuse albedo. The specular power controls the sharpness of the specular highlight. Figure 12.4 shows the effect of varying the specular parameters of a material (this image is also shown in Color Plate 5). A single white point light is in the scene. From left to right, the specular albedo varies from almost black to pure white (essentially increasing the specular contribution), and from top to bottom, the specular power increases exponentially from 4.0 to 256.0, doubling in each row. As you can see, the sphere on the top left looks dull and evenly lit, whereas the sphere on the bottom right appears highly glossy.

Image

Figure 12.4: Varying specular parameters of a material

Although the image in Figure 12.4 shows only the effect of a white light on the scene, colored lights are simulated by simply multiplying the color of the light by the diffuse and specular components of each fragment’s color.
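
For example, the final line of the Phong fragment shader could be extended as in the following sketch, where light_color is a hypothetical uniform that is not part of the phonglighting sample:

// Hypothetical uniform holding the light’s color
uniform vec3 light_color = vec3(1.0, 0.9, 0.8);

// In main(), tint the diffuse and specular terms by the light’s color
color = vec4(light_color * (diffuse + specular), 1.0);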

Blinn-Phong Lighting

The Blinn-Phong lighting model could be considered an extension to, or possibly an optimization of, the Phong lighting model. Notice that in the Phong lighting model, we calculate R · V at each shaded point (either per vertex or per fragment). However, as an approximation, we can replace R · V with N · H, where H is the halfway vector between the light vector L and the eye vector V. This vector can be calculated as

\[ \hat{H} = \frac{\hat{L} + \hat{V}}{\left| \hat{L} + \hat{V} \right|} \]

Technically, this calculation should also be applied wherever the Phong equations would have been applied, requiring a normalization at each step (the division by the vector’s magnitude in the above equation). However, this comes in exchange for no longer needing to calculate the vector R, avoiding the call to the reflect function. Modern graphics processors are generally powerful enough that the difference in cost between the vector normalization required to calculate H and the call to reflect is negligible. However, if the curvature of the underlying surface represented by a triangle is relatively low, and if the triangle is small relative to its distance from the light and the viewer, the value of H won’t change much across the triangle, so it’s even possible to calculate H in the vertex (or geometry or tessellation) shader and pass it to the fragment shader as a flat input (a sketch of this per-vertex variant follows Listing 12.5). Even when the result is slightly inaccurate, this can often be remedied by increasing the shininess (or specular) factor α. Listing 12.5 provides a fragment shader that implements Blinn-Phong lighting per fragment. This shader is included in the blinnphong example program.


#version 420 core

// Output
layout (location = 0) out vec4 color;

// Input from vertex shader
in VS_OUT
{
vec3 N;
vec3 L;
vec3 V;
} fs_in;

// Material properties
uniform vec3 diffuse_albedo = vec3(0.5, 0.2, 0.7);
uniform vec3 specular_albedo = vec3(0.7);
uniform float specular_power = 128.0;

void main(void)
{
// Normalize the incoming N, L, and V vectors
vec3 N = normalize(fs_in.N);
vec3 L = normalize(fs_in.L);
vec3 V = normalize(fs_in.V);

// Calculate the half vector, H
vec3 H = normalize(L + V);

// Compute the diffuse and specular components for each fragment
vec3 diffuse = max(dot(N, L), 0.0) * diffuse_albedo;

// Replace the R.V calculation (as in Phong) with N.H
vec3 specular = pow(max(dot(N, H), 0.0), specular_power) * specular_albedo;

// Write final color to the framebuffer
color = vec4(diffuse + specular, 1.0);
}


Listing 12.5: Blinn-Phong fragment shader
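
As mentioned above, when the surface curvature is low and the triangles are small relative to their distance from the light and viewer, H can be computed once per vertex and passed to the fragment shader without interpolation. The following vertex shader sketch illustrates the idea; it is not part of the blinnphong sample, and the normal and any other outputs the fragment shader needs are omitted for brevity (the matching fragment shader input would be declared as flat in vec3 H).

#version 420 core

layout (location = 0) in vec4 position;

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;
uniform vec3 light_pos = vec3(100.0, 100.0, 100.0);

// Half vector computed once per vertex; "flat" disables interpolation
flat out vec3 H;

void main(void)
{
    vec4 P = mv_matrix * position;

    vec3 L = normalize(light_pos - P.xyz);
    vec3 V = normalize(-P.xyz);
    H = normalize(L + V);

    gl_Position = proj_matrix * P;
}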

Figure 12.5 shows the result of using plain Phong shading (left) next to the result of using Blinn-Phong shading. In Figure 12.5, the specular exponent used for the Phong rendering is 128, whereas the specular exponent used for the Blinn-Phong rendering is 200. As you can see, after adjustment of the specular powers, the results are very similar.

Image

Figure 12.5: Phong lighting (left) vs. Blinn-Phong lighting (right)

Rim Lighting

Rim lighting, which is also known as back-lighting, is an effect that simulates the bleeding of light “around” an object from sources that are behind it or otherwise have no effect on the shaded surfaces of the model. Rim lighting is so called because it produces a bright rim of light around the outline of the object being lit. In photography, this is attained by physically placing a light source behind the subject such that the object of interest sits between the camera and the light source. In computer graphics, we can simulate the effect by determining how closely the view direction comes to glancing the surface.

To implement this, all we need is the surface normal and the view direction — two quantities we have at hand from any of the lighting models we have already described. When the view direction is face on to the surface, the view vector will be colinear to the surface normal and so the effect of rim lighting will be least. When the view direction glances the surface, the surface normal and view vector will be almost perpendicular to one another and the rim light effect will be greatest.

You can see this in Figure 12.6. Near the edge of the object, the vectors N1 and V1 are almost perpendicular, and this is where the most light from the lamp behind the object will leak around it. However, in the center of the object, N2 and V2 point in pretty much the same direction. The lamp will be completely obscured by the object, and the amount of light leaking through will be minimal.

Image

Figure 12.6: Rim lighting vectors

A quantity that is easy to calculate and is closely related to the angle between two vectors is the dot product. When two unit-length vectors are colinear, the dot product between them will be one. As the two vectors become closer to orthogonal, the dot product becomes closer to zero. Therefore, we can produce a rim light effect by taking the dot product between the view direction and the surface normal and making the intensity of the rim light inversely proportional to it. To provide further control over the rim light, we include a scalar brightness and an exponential sharpness factor. Thus, our rim lighting equation is

\[ L_{rim} = C_{rim} \left( 1 - \hat{N} \cdot \hat{V} \right)^{P_{rim}} \]

Here, N and V are our usual normal and view vectors, Crim and Prim are the color and power of the rim light, respectively, and Lrim is the resulting contribution of the rim light. The fragment shader to implement this is quite simple, and is shown in Listing 12.6.


// Uniforms controlling the rim light effect
uniform vec3 rim_color;
uniform float rim_power;

vec3 calculate_rim(vec3 N, vec3 V)
{
// Calculate the rim factor
float f = 1.0 - dot(N, V);

// Constrain it to the range 0 to 1 using a smooth step function
f = smoothstep(0.0, 1.0, f);

// Raise it to the rim exponent
f = pow(f, rim_power);

// Finally, multiply it by the rim color
return f * rim_color;
}


Listing 12.6: Rim lighting shader function

Figure 12.7 shows a model illuminated with a Phong lighting model as described earlier in this chapter, but with a rim light effect applied. The code to produce this image is included in the rimlight example program. The top-left image has the rim light disabled for reference. The top-right image applies a medium strength rim light with a moderate fall-off exponent. The bottom-left image increases both the exponent and the strength of the light. As a result, the rim is sharp and focused. The image on the bottom right of Figure 12.7 has the light intensity turned down but also has the rim exponent turned down. This causes the light to bleed further around the model producing more of an ambient effect.

Image

Figure 12.7: Result of rim lighting example

Two of the images included in Figure 12.7 are also shown in Color Plate 6. For a given scene, the color of the rim light would normally be fixed or perhaps vary as a function of world space (otherwise it would seem as though the different objects were lit by different lights, which might look odd). However, the power of the rim light is essentially an approximation of bleeding, which may vary by material. For example, soft materials such as hair or fur, or translucent materials such as marble, might bleed quite a bit, whereas harder materials such as wood or rock might not bleed as much light.

Normal Mapping

In the examples shown so far, we have calculated the lighting contributions either at each vertex (Gouraud shading) or at each pixel, but using vectors derived from per-vertex attributes that are smoothly interpolated across each triangle (Phong shading). To really see surface features, that level of detail must be present in the original model. In most cases, this leads to an unreasonable amount of geometry that must be passed to OpenGL and to triangles so small that each one covers only a handful of pixels.

One method for increasing the perceived level of detail without actually adding more vertices to a model is normal mapping, which is sometimes also called bump mapping. To implement normal mapping, we need a texture that stores a surface normal in each texel. This is then applied to our model and used in the fragment shader to calculate a local surface normal for each fragment. Our lighting model of choice is then applied in each invocation to calculate per-fragment lighting. An example of such a texture is shown in Figure 12.8.

Image

Figure 12.8: Example normal map
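
Although Figure 12.8 shows the normal map as a color image, each texel really encodes a direction. Because the components of a unit normal lie in the range [-1, 1] but common texture formats store unsigned values in [0, 1], the normal is usually scaled and biased when the map is created, and the mapping is reversed in the shader. A minimal sketch of this convention, where tc stands for the interpolated texture coordinate (Listing 12.8 uses the same decode):

// Encoding, typically performed by the tool that bakes the normal map:
//     texel = N * 0.5 + 0.5
// Decoding in the fragment shader before the normal is used:
vec3 N = normalize(texture(tex_normal, tc).rgb * 2.0 - vec3(1.0));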

The most common coordinate space used for normal maps is tangent space, which is a local coordinate system where the positive z axis is aligned with the surface normal. The other two vectors in this coordinate space are known as the tangent and bitangent vectors, and for best results, these vectors should line up with the direction of the u and v coordinates used in the texture. The tangent vector is usually encoded as part of the geometry data and passed as an input to the vertex shader. Because these three vectors form an orthonormal basis, given any two of them, the third can be calculated using a simple cross product. This means that given the normal and tangent vectors, we can calculate the bitangent vector using the cross product.

The normal, tangent, and bitangent vectors can be used to construct a rotation matrix that will transform a vector in the standard Cartesian frame into the frame represented by these three vectors. We simply insert the three vectors as the rows of this matrix. This gives us the following:

\[ TBN = \begin{bmatrix} T_x & T_y & T_z \\ B_x & B_y & B_z \\ N_x & N_y & N_z \end{bmatrix} \]

The matrix produced here is often referred to as the TBN matrix, which stands for Tangent, Bitangent, Normal. Given the TBN matrix for a vertex, we can transform any vector expressed in Cartesian coordinates into the local frame at the vertex. This is important because the dot product operations we use in our lighting calculations are relative to pairs of vectors. As long as these two vectors are in the same frame, then the results will be correct. By transforming our view and light vectors into the local frame at each vertex and then interpolating them across each polygon as we would with normal Phong shading, we are presented with view and light vectors at each fragment that are in the same frame as the normals in our normal map. We can then simply read the local normal at each fragment and perform our lighting calculations in the usual manner.
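
In GLSL, the same transformation can be written with an explicit mat3. Note that the mat3 constructor takes its arguments as columns, so applying the row-based TBN matrix described above corresponds to multiplying with the vector on the left, which is equivalent to taking three dot products. A minimal sketch, assuming T, B, N, and the view-space vector V have already been computed in the vertex shader:

// The columns of this matrix are T, B, and N
mat3 TBN = mat3(T, B, N);

// Multiplying with the vector on the left applies the row-based TBN matrix,
// taking the view-space vector V into tangent space
vec3 V_tangent = V * TBN;

// Equivalent form using explicit dot products (the form used by the vertex
// shader shown next):
// vec3 V_tangent = vec3(dot(V, T), dot(V, B), dot(V, N));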

A vertex shader that calculates the TBN matrix for a vertex, determines the light and view vectors, and then multiplies them by the TBN matrix before passing them to the fragment shader is shown in Listing 12.7. This shader, along with the rest of the code for this example, is included in the bumpmapping sample application.


#version 420 core

layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;
layout (location = 2) in vec3 tangent;
layout (location = 4) in vec2 texcoord;

out VS_OUT
{
vec2 texcoord;
vec3 eyeDir;
vec3 lightDir;
} vs_out;

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;
uniform vec3 light_pos = vec3(0.0, 0.0, 100.0);

void main(void)
{
// Calculate vertex position in view space.
vec4 P = mv_matrix * position;

// Calculate normal (N) and tangent (T) vectors in view space from
// incoming object space vectors.
vec3 N = normalize(mat3(mv_matrix) * normal);
vec3 T = normalize(mat3(mv_matrix) * tangent);
// Calculate the bitangent vector (B) from the normal and tangent
// vectors.
vec3 B = cross(N, T);

// The light vector (L) is the vector from the point of interest to
// the light. Calculate that and multiply it by the TBN matrix.
vec3 L = light_pos - P.xyz;
vs_out.lightDir = normalize(vec3(dot(L, T), dot(L, B), dot(L, N)));

// The view vector is the vector from the point of interest to the
// viewer, which in view space is simply the negative of the position.
// Calculate that and multiply it by the TBN matrix.
vec3 V = -P.xyz;
vs_out.eyeDir = normalize(vec3(dot(V, T), dot(V, B), dot(V, N)));

// Pass the texture coordinate through unmodified so that the fragment
// shader can fetch from the normal and color maps.
vs_out.texcoord = texcoord;

// Calculate clip coordinates by multiplying our view position by
// the projection matrix.
gl_Position = proj_matrix * P;
}


Listing 12.7: Vertex shader for normal mapping

The shader in Listing 12.7 calculates the view and light vectors expressed in the local frame of each vertex and passes them along with the vertex’s texture coordinates to the fragment shader. In our fragment shader, which is shown in Listing 12.8, we simply fetch a per-fragment normal from the normal map and use it in our shading calculations.


#version 420 core

out vec4 color;

// Color and normal maps
layout (binding = 0) uniform sampler2D tex_color;
layout (binding = 1) uniform sampler2D tex_normal;

in VS_OUT
{
vec2 texcoord;
vec3 eyeDir;
vec3 lightDir;
} fs_in;

void main(void)
{
// Normalize our incoming view and light direction vectors.
vec3 V = normalize(fs_in.eyeDir);
vec3 L = normalize(fs_in.lightDir);
// Read the normal from the normal map and normalize it.
vec3 N = normalize(texture(tex_normal, fs_in.texcoord).rgb * 2.0
- vec3(1.0));
// Calculate R ready for use in Phong lighting.
vec3 R = reflect(-L, N);

// Fetch the diffuse albedo from the texture.
vec3 diffuse_albedo = texture(tex_color, fs_in.texcoord).rgb;
// Calculate diffuse color with simple N dot L.
vec3 diffuse = max(dot(N, L), 0.0) * diffuse_albedo;
// Uncomment this to turn off diffuse shading
// diffuse = vec3(0.0);

// Assume that specular albedo is white - it could also come from a texture
vec3 specular_albedo = vec3(1.0);
// Calculate Phong specular highlight
vec3 specular = pow(max(dot(R, V), 0.0), 5.0) * specular_albedo;
// Uncomment this to turn off specular highlights
// specular = vec3(0.0);

// Final color is diffuse + specular
color = vec4(diffuse + specular, 1.0);
}


Listing 12.8: Fragment shader for normal mapping

Rendering a model with this shader clearly shows specular highlights on details that are present only in the normal map and do not have geometric representation in the model data. In Figure 12.9, the top-left image shows the diffuse shading result, the top-right image shows the specular shading results, and the bottom left shows the image produced by adding these two results together. For reference, the bottom-right image of Figure 12.9 shows the result of applying per-pixel Phong shading using only the normals that are interpolated by OpenGL and does not use the normal map. It should be clear from contrasting the bottom-left and bottom-right images that normal mapping can add substantial detail to an image. The bottom-left image from Figure 12.9 is also shown in Color Plate 7.

Image

Figure 12.9: Result of normal mapping example

Environment Mapping

In the previous few subsections, you have learned how to compute the effect of lighting on the surface of objects. Lighting shaders can become extremely complex, but eventually they become so intensive that they start to affect performance. Also, it’s virtually impossible to create an equation that can represent an arbitrary environment. This is where environment maps come in. There are a few types of environment maps that are commonly used in real-time graphics applications — the spherical environment map, the equirectangular map, and the cube map. The spherical environment map is represented as the image of a sphere illuminated by the simulated surroundings. As a sphere map can only represent a single hemisphere of the environment, an equirectangular map is a mapping of spherical coordinates onto a rectangle that allows a full 360° view of the environment to be represented. A cube map, on the other hand, is a special texture made up of six faces that essentially represent a box made of glass through which, if you were standing in its center, you would see your surroundings. We’ll dig into these three methods of simulating an environment in the next few subsections.

Spherical Environment Maps

As noted, a spherical environment map is a texture map that represents the lighting produced by the simulated surroundings on a sphere made from the material being simulated. This works by taking the view direction and surface normal at the point being shaded and using these two vectors to compute a set of texture coordinates that can be used to look up into the texture to retrieve the lighting coefficients. In the simplest case, this is simply the color of the surface under these lighting conditions, although any number of parameters could be stored in such a texture map. A few examples1 of environment maps are shown in Figure 12.10. These environment maps are also shown in Color Plate 9.

1. The images shown in Figure 12.10 were produced by simply ray tracing a sphere using the popular POVRay ray tracer using different materials and lighting conditions.

Image

Figure 12.10: A selection of spherical environment maps

The first step in implementing spherical environment mapping is to transform the incoming normal into view-space and to calculate the eye-space view direction. These will be used in our fragment shader to compute the texture coordinates to look up into the environment map. Such a vertex shader is shown in Listing 12.9.


#version 420 core

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;

layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;

out VS_OUT
{
vec3 normal;
vec3 view;
} vs_out;

void main(void)
{
vec4 pos_vs = mv_matrix * position;

vs_out.normal = mat3(mv_matrix) * normal;
vs_out.view = pos_vs.xyz;

gl_Position = proj_matrix * pos_vs;
}


Listing 12.9: Spherical environment mapping vertex shader

Now, given the per-fragment normal and view direction, we can calculate the texture coordinates to look up into our environment map. First, we reflect the incoming view direction about the plane defined by the incoming normal. Then, by simply scaling and biasing the x and y components of this reflected vector, we can use them to fetch from the environment and shade our fragment. The corresponding fragment shader is given in Listing 12.10.


#version 420 core

layout (binding = 0) uniform sampler2D tex_envmap;

in VS_OUT
{
vec3 normal;
vec3 view;
} fs_in;

out vec4 color;

void main(void)
{
// u will be our normalized view vector
vec3 u = normalize(fs_in.view);

// Reflect u about the plane defined by the normal at the fragment
vec3 r = reflect(u, normalize(fs_in.normal));

// Compute scale factor
r.z += 1.0;
float m = 0.5 * inversesqrt(dot(r, r));

// Sample from scaled and biased texture coordinate
color = texture(tex_envmap, r.xy * m + vec2(0.5));
}


Listing 12.10: Spherical environment mapping fragment shader

The result of rendering a model with the shader given in Listing 12.10 is shown in Figure 12.11. This image was produced by the envmapsphere example program, using the environment map in the rightmost image of Figure 12.10.

Image

Figure 12.11: Result of rendering with spherical environment mapping

Equirectangular Environment Maps

The equirectangular environment map is similar to the spherical environment map except that it is less susceptible to the pinching effect sometimes seen when the poles of the sphere are sampled from. An example equirectangular environment texture is shown in Figure 12.12. Again, we use the view-space normal and view direction vectors, calculated in the vertex shader, interpolated and passed to the fragment shader, and again the fragment shader reflects the incoming view direction about the plane defined by the local normal. Now, instead of directly using the scaled and biased x and y components of this reflected vector, we extract the y component and then project the vector onto the xz plane by setting the y component to zero and normalizing it again. From this normalized vector, we extract the x component, producing our second texture coordinate. These extracted x and y components effectively form the altitude and azimuth angles for looking up into our equirectangular texture.

Image

Figure 12.12: Example equirectangular environment map

A fragment shader implementing equirectangular environment mapping is included in the equirectangular example application and is shown in Listing 12.11. The result of rendering an object with this shader is shown in Figure 12.13.

Image

Figure 12.13: Rendering result of equirectangular environment map


#version 420 core

layout (binding = 0) uniform sampler2D tex_envmap;

in VS_OUT
{
vec3 normal;
vec3 view;
} fs_in;

out vec4 color;

void main(void)
{
// u will be our normalized view vector
vec3 u = normalize(fs_in.view);

// Reflect u about the plane defined by the normal at the fragment
vec3 r = reflect(u, normalize(fs_in.normal));

// Compute texture coordinate from reflection vector
vec2 tc;

tc.y = r.y; r.y = 0.0;
tc.x = normalize(r).x * 0.5;

// Scale and bias texture coordinate based on direction
// of reflection vector
float s = sign(r.z) * 0.5;

tc.s = 0.75 - s * (0.5 - tc.s);
tc.t = 0.5 + 0.5 * tc.t;

// Sample from scaled and biased texture coordinate
color = texture(tex_envmap, tc);
}


Listing 12.11: Equirectangular environment mapping fragment shader

Cube Maps

A cube map is treated as a single texture object, but it is made up of six square (yes, they must be square!) 2D images that make up the six sides of a cube. Applications of cube maps range from 3D light maps to reflections and highly accurate environment maps. Figure 12.14 shows the layout of six square images composing a cube map that we use for the Cubemap sample program.2 The images are arranged in a cross shape with their matching edges abutting. If you wanted to, you could cut and fold the image into a cube and the edges would align.

2. The six images used for the Cubemap sample program were provided courtesy of The Game Creators, Ltd. (www.thegamecreators.com).

Image

Figure 12.14: The layout of six cube faces in the Cubemap sample program

To load a cube map texture, we create a texture object by binding a new name to the GL_TEXTURE_CUBE_MAP target, call glTexStorage2D() to specify the storage dimensions of the texture, and then load the cube map data into the texture object by calling glTexSubImage2D() once for each face of the cube map. The faces of the cube map each have a special target named GL_TEXTURE_CUBE_MAP_POSITIVE_X, GL_TEXTURE_CUBE_MAP_NEGATIVE_X, GL_TEXTURE_CUBE_MAP_POSITIVE_Y, GL_TEXTURE_CUBE_MAP_NEGATIVE_Y, GL_TEXTURE_CUBE_MAP_POSITIVE_Z, and GL_TEXTURE_CUBE_MAP_NEGATIVE_Z. They are assigned numerical values in this order, and so we can simply create a loop and update each face in turn. Example code to do this is shown in Listing 12.12.


GLuint texture;

glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_CUBE_MAP, texture);

glTexStorage2D(GL_TEXTURE_CUBE_MAP,
levels, internalFormat,
width, height);
for (face = 0; face < 6; face++)
{
glTexSubImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X + face,
0,
0, 0,
width, height,
format, type,
data + face * face_size_in_bytes);
}


Listing 12.12: Loading a cube map texture

Cube maps also support mipmaps, and so if your cube map has mipmap data, the code in Listing 12.12 would need to be modified to load the additional mipmap levels. The Khronos Texture File format has native support for cube map textures, and so the book’s .KTX file loader is able to do this for you.

Texture coordinates for cube maps have three dimensions, even though they are collections of 2D images. This may seem a little odd at first glance. Unlike in a true 3D texture, the S, T, and R texture coordinates represent a signed vector from the center of the texture map pointing outwards. This vector will intersect one of the six sides of the cube map. The texels around this intersection point are then sampled to create the filtered color value from the texture.

A very common use of cube maps is to create an object that reflects its surroundings. The cube map is applied to a sphere, creating the appearance of a mirrored surface. The same cube map is also applied to the sky box, which creates the background being reflected.

A sky box is nothing more than a big box with a picture of the sky on it. Another way of looking at it is as a picture of the sky on a big box! Simple enough. An effective sky box contains six images that contain views from the center of your scene along the six directional axes. If this sounds just like a cube map, congratulations, you’re paying attention!

To render a cube map, we could simply draw a large cube around the viewer and apply the cube map texture to it. However, there’s an even easier way to do it! Any part of the virtual cube that is outside the viewport will be clipped away, but what we need is for the entire viewport to be covered. We can do this by rendering a full-screen quad. All we need to do then is to compute the texture coordinates at each of the four corners of the viewport, and we’ll be able to use them to render our cube map.

Now, if the cube map texture were mapped directly to our virtual cube, the cube’s vertex positions would be our texture coordinates. We would take the cube’s vertex positions, multiply their x, y, and z components by the rotational part of our view matrix (which is the upper-left 3 × 3 submatrix) to orient them in the right direction, and render the cube in world space. In world space, the only face we’d see is the one we are looking directly at. Therefore, we can render a full-screen quad, and transform its corners by the view matrix in order to orient it correctly. All this occurs in the vertex shader, which is shown in Listing 12.13.


#version 420 core

out VS_OUT
{
vec3 tc;
} vs_out;

uniform mat4 view_matrix;

void main(void)
{
vec3[4] vertices = vec3[4](vec3(-1.0, -1.0, 1.0),
vec3( 1.0, -1.0, 1.0),
vec3(-1.0, 1.0, 1.0),
vec3( 1.0, 1.0, 1.0));

vs_out.tc = mat3(view_matrix) * vertices[gl_VertexID];

gl_Position = vec4(vertices[gl_VertexID], 1.0);
}


Listing 12.13: Vertex shader for sky box rendering

Notice that because the vertex coordinates and the resulting texture coordinates are hard-coded into the vertex shader, we don’t need any vertex attributes, and therefore don’t need any buffers to store them. If we wished, we could scale the field of view by scaling the z component of the vertex data — the larger the z component becomes, the smaller the x and y become after normalization, and so the smaller the field of view. The fragment shader for rendering the cube map is also equally simple and is shown in its entirety in Listing 12.14.


#version 420 core

layout (binding = 0) uniform samplerCube tex_cubemap;

in VS_OUT
{
vec3 tc;
} fs_in;

layout (location = 0) out vec4 color;

void main(void)
{
color = texture(tex_cubemap, fs_in.tc);
}


Listing 12.14: Fragment shader for sky box rendering
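
As noted after Listing 12.13, the field of view of the sky box can be adjusted by scaling the z component of the hard-coded vertex data before it is rotated. A sketch of this, using a hypothetical fov_scale uniform added to the vertex shader of Listing 12.13:

// Hypothetical uniform: values greater than 1.0 narrow the field of view,
// values less than 1.0 widen it
uniform float fov_scale = 1.0;

// In main(), scale the z component before applying the rotation
vec3 v = vertices[gl_VertexID];
v.z *= fov_scale;

vs_out.tc = mat3(view_matrix) * v;
gl_Position = vec4(vertices[gl_VertexID], 1.0);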

Once we’ve rendered our sky box, we need to render something into the scene that reflects the sky box. The texture coordinates used to fetch from a cube map texture are interpreted as a vector pointing from the origin outwards towards the cube. OpenGL will determine which face this vector eventually hits and the coordinate within that face where it hits, and then retrieve data from this location. What we need to do, for each fragment, is calculate this vector. Again, we need the incoming view direction and the normal at each fragment.

These are produced in the vertex shader as before and passed to the fragment shader and normalized. Again, we reflect the incoming view direction about the plane defined by the surface normal at the fragment to compute an outgoing reflection vector. Under the assumption that the scenery shown in the sky box is sufficiently far away, this reflection vector can be considered to emanate from the origin and so can be used as the texture coordinate for our sky box. The vertex and fragment shaders are shown in Listings 12.15 and 12.16.


#version 420 core

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;

layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;

out VS_OUT
{
vec3 normal;
vec3 view;
} vs_out;

void main(void)
{
vec4 pos_vs = mv_matrix * position;

vs_out.normal = mat3(mv_matrix) * normal;
vs_out.view = pos_vs.xyz;

gl_Position = proj_matrix * pos_vs;
}


Listing 12.15: Vertex shader for cube map environment rendering


#version 420 core

layout (binding = 0) uniform samplerCube tex_cubemap;

in VS_OUT
{
vec3 normal;
vec3 view;
} fs_in;

out vec4 color;

void main(void)
{
// Reflect view vector about the plane defined by the normal
// at the fragment
vec3 r = reflect(fs_in.view, normalize(fs_in.normal));

// Sample from the cube map using the reflection vector
color = texture(tex_cubemap, r);
}


Listing 12.16: Fragment shader for cube map environment rendering

The result of rendering an object surrounded by a sky box using the shaders shown in Listings 12.13 through 12.16 is shown in Figure 12.15. This image was produced by the cubemapenv example program.

Image

Figure 12.15: Cube map environment rendering with a sky box

Of course, there is no reason that the final color of the fragment must be taken directly from the environment map. For example, you could multiply it by the base color of the object you’re rendering to tint the environment it reflects. Color Plate 10 shows a golden version of the dragon being rendered.

Material Properties

In the examples presented so far in this chapter, we have used a single material for the entire model. This means that our dragons are uniformly shiny, and our ladybug looks somewhat plastic. However, there is no reason that every part of our models must be made from the same material. In fact, we can assign material properties per surface, per triangle, or even per pixel by storing information about the surface in a texture. For example, the specular exponent can be stored in a texture and applied to a model when rendering. This allows some parts of the model to be more reflective than others.
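
For example, the Phong fragment shader could fetch the exponent from a texture instead of a uniform, as in the following sketch. The tex_specular_power sampler, its binding point, the texture coordinate input fs_in.tc, and the scale factor applied to the fetched value are all illustrative and not part of the book’s samples.

// Hypothetical single-channel texture holding a per-texel specular exponent
layout (binding = 2) uniform sampler2D tex_specular_power;

// In main(), fetch the exponent for this fragment and use it in place of
// the specular_power uniform
float specular_power = texture(tex_specular_power, fs_in.tc).r * 128.0;
vec3 specular = pow(max(dot(R, V), 0.0), specular_power) * specular_albedo;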

Another technique that allows a sense of roughness to be applied to a model is to pre-blur an environment map and then use a gloss factor (also stored in a texture) to gradually fade between a sharp and blurred version of the map. In this example, we will again use a simple spherical environment map. Figure 12.16 shows two environment maps and a shininess map used to blend between them. The left image shows a fully sharp environment map, whereas the image in the center contains a pre-blurred version of the same environment. The rightmost image is our gloss map and will be used to filter between the sharp and blurry versions of the environment map. Where the gloss map is brightest, the sharper environment map will be used. Where it is darkest, we will use the blurrier environment map.

Image

Figure 12.16: Pre-filtered environment maps and gloss map

We can combine the two environment textures together into a single, 3D texture that is only two texels deep. Then, we can sample from our gloss texture and use the fetched texel value as the third component of the texture coordinate used to fetch from the environment map (with the first two being calculated as normal). With the sharp image as the first layer of the 3D environment texture and the blurry image as the second layer of the 3D environment, OpenGL will smoothly interpolate between the sharp and the blurry environment maps for you.

Listing 12.17 shows the fragment shader that reads the material property texture to determine per-pixel gloss and then reads the environment map texture using the result.


#version 420 core

layout (binding = 0) uniform sampler3D tex_envmap;
layout (binding = 1) uniform sampler2D tex_glossmap;

in VS_OUT
{
vec3 normal;
vec3 view;
vec2 tc;
} fs_in;

out vec4 color;

void main(void)
{
// u will be our normalized view vector
vec3 u = normalize(fs_in.view);

// Reflect u about the plane defined by the normal at the fragment
vec3 r = reflect(u, normalize(fs_in.normal));

// Compute scale factor
r.z += 1.0;
float m = 0.5 * inversesqrt(dot(r, r));

// Sample gloss factor from glossmap texture
float gloss = texture(tex_glossmap, fs_in.tc * vec2(3.0, 1.0) * 2.0).r;

// Sample from scaled and biased texture coordinate
vec3 env_coord = vec3(r.xy * m + vec2(0.5), gloss);

// Sample from two-level environment map
color = texture(tex_envmap, env_coord);
}


Listing 12.17: Fragment shader for per-fragment shininess

Figure 12.17 was produced by the perpixelgloss example and shows the result of rendering a torus with the map applied.

Image

Figure 12.17: Result of per-pixel gloss example

Casting Shadows

The shading algorithms presented so far have all assumed that each light will contribute to the final color of each fragment. However, in a complex scene with lots of objects, this is not the case. Objects will cast shadows on each other and upon themselves. If these shadows are omitted from the rendered scene, a great deal of realism can be lost. This section outlines some techniques for simulating the effects of shadowing on objects.

Shadow Mapping

The most basic operation of any shadow calculation must be to determine whether the point being considered has any light hitting it. In effect, we must determine whether there is line of sight from the point being shaded to a light and, therefore, from the light to the point being shaded. This turns out to be a visibility calculation, and as luck might have it, we have extremely fast hardware to determine whether a piece of geometry is visible from a given vantage point — the depth buffer.

Shadow mapping is a technique that produces visibility information for a scene by rendering it from the point of view of a light source. Only the depth information is needed, and so to do this, we can use a framebuffer object with only a depth attachment. After rendering the scene into a depth buffer from the light’s perspective, we will be left with, at each pixel, the distance from the light to the nearest point in the scene. When we render our geometry in a forward pass, we can calculate, for each point, what the distance to the light is and compare that to the distance stored in the depth buffer. To do this, we project our point from view space (where it is being rendered) into the coordinate system of the light.

Once we have this coordinate, we simply read from the depth texture we rendered earlier, compare our calculated depth value against the one stored in the texture, and if we are not the closest point to the light for that particular texel, we know we are in shadow. In fact, this is such a common operation in graphics that OpenGL even has a special sampler type that does the comparison for us, the shadow sampler. In GLSL, this is declared as a variable with a sampler2DShadow type for 2D textures, which we’ll be using in this example. You can also create shadow samplers for 1D textures (sampler1DShadow), cube maps (samplerCubeShadow), and rectangle textures (sampler2DRectShadow), and for arrays of these types (except, of course, rectangle textures).
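
For reference, declaring these sampler types in a fragment shader looks like the following sketch; only the 2D variant is actually used in this example, and the binding points shown here are arbitrary.

layout (binding = 0) uniform sampler2DShadow shadow_tex;            // 2D shadow map
layout (binding = 1) uniform samplerCubeShadow shadow_cube;         // cube map shadows
layout (binding = 2) uniform sampler2DArrayShadow shadow_tex_array; // array of 2D shadow maps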

Listing 12.18 shows how to set up a framebuffer object with only a depth attachment ready for rendering the shadow map into.


GLuint shadow_buffer;
GLuint shadow_tex;

glGenFramebuffers(1, &shadow_buffer);
glBindFramebuffer(GL_FRAMEBUFFER, shadow_buffer);

glGenTextures(1, &shadow_tex);
glBindTexture(GL_TEXTURE_2D, shadow_tex);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_DEPTH_COMPONENT32,
DEPTH_TEX_WIDTH, DEPTH_TEX_HEIGHT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE,
GL_COMPARE_REF_TO_TEXTURE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL);

glFramebufferTexture(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
shadow_tex, 0);

glBindFramebuffer(GL_FRAMEBUFFER, 0);


Listing 12.18: Getting ready for shadow mapping

You will notice in Listing 12.18 two calls to glTexParameteri() with the parameters GL_TEXTURE_COMPARE_MODE and GL_TEXTURE_COMPARE_FUNC. The first of these turns on texture comparison, and the second sets the function that should be used. Once we have created our FBO for rendering depth, we can render the scene from the point of view of the light. Given a light position, light_pos, which is pointing at the origin, we can construct a matrix that represents the model-view-projection matrix for the light. This is shown in Listing 12.19.


vmath::mat4 model_matrix = vmath::rotate(currentTime, 0.0f, 1.0f, 0.0f);
vmath::mat4 light_view_matrix =
vmath::lookat(light_pos,
vmath::vec3(0.0f),
vmath::vec3(0.0f, 1.0f, 0.0f));
vmath::mat4 light_proj_matrix =
vmath::frustum(-1.0f, 1.0f, -1.0f, 1.0f,
1.0f, 1000.0f);
vmath::mat4 light_mvp_matrix = light_proj_matrix *
light_view_matrix *
model_matrix;


Listing 12.19: Setting up matrices for shadow mapping

Rendering the scene from the light’s position results in a depth buffer that contains the distance from the light to each pixel in the framebuffer. This can be visualized as a grayscale image with black being the closest possible depth value (zero) and white being the furthest possible depth value (one). Figure 12.18 shows the depth buffer of a simple scene rendered in this manner.

Image

Figure 12.18: Depth as seen from a light

To make use of this stored depth information to generate shadows, we need to make a few modifications to our rendering shader. First, of course, we need to declare our shadow sampler and read from it. The interesting part is how we determine the coordinates at which to read from the depth texture. In fact, it turns out to be pretty simple. In our vertex shader, we normally calculate the output position in clip coordinates, which is a projection of the vertex’s world-space coordinate into the view space of our virtual camera and then into the camera’s frustum. At the same time, we need to perform the same operations using the light’s view and frustum matrices. As the resulting coordinate is interpolated and passed to the fragment shader, that shader then has the coordinate of each fragment in the light’s clip space.

In addition to the coordinate space transforms, we must scale and bias the resulting clip coordinates. Remember, after the perspective divide, OpenGL’s coordinates range from −1.0 to 1.0 in the x, y, and z axes, whereas texture coordinates and the values stored in the depth buffer range from 0.0 to 1.0. The matrix that transforms vertices from object space into this scaled and biased version of the light’s clip space is known as the shadow matrix, and the code to calculate it is shown in Listing 12.20.


const vmath::mat4 scale_bias_matrix =
vmath::mat4(vmath::vec4(0.5f, 0.0f, 0.0f, 0.0f),
vmath::vec4(0.0f, 0.5f, 0.0f, 0.0f),
vmath::vec4(0.0f, 0.0f, 0.5f, 0.0f),
vmath::vec4(0.5f, 0.5f, 0.5f, 1.0f));

vmath::mat4 shadow_matrix = scale_bias_matrix *
light_proj_matrix *
light_view_matrix *
model_matrix;


Listing 12.20: Setting up a shadow matrix

The shadow matrix can be passed as a single uniform to the original vertex shader. A simplified version of the shader is shown in Listing 12.21.


#version 420 core

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;
uniform mat4 shadow_matrix;

layout (location = 0) in vec4 position;

out VS_OUT
{
vec4 shadow_coord;
} vs_out;

void main(void)
{
gl_Position = proj_matrix * mv_matrix * position;
vs_out.shadow_coord = shadow_matrix * position;
}


Listing 12.21: Simplified vertex shader for shadow mapping

The shadow_coord output is sent from the vertex shader, interpolated, and passed into the fragment shader. This coordinate must be projected into normalized device coordinates in order to use it to look up into the shadow map we made earlier. This would normally mean dividing the whole vector through by its own w component. However, as projecting a coordinate in this way is such a common operation, there is a version of the overloaded texture function that will do this for us called textureProj. When we use textureProj with a shadow sampler, it first divides the x, y, and z components of the texture coordinate by its own w component and then uses the resulting x and y components to fetch a value from the texture. It then compares the returned value against the computed z component using the chosen comparison function, producing a value of 1.0 or 0.0 depending on whether the test passed or failed, respectively.
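
In other words, for a sampler2DShadow, the call is roughly equivalent to the following sketch (ignoring filtering):

// Roughly what textureProj(shadow_tex, fs_in.shadow_coord) does internally
vec3 proj_coord = fs_in.shadow_coord.xyz / fs_in.shadow_coord.w;

// For a shadow sampler, texture() takes the texel coordinate in .xy and the
// reference depth in .z, and returns the result of the comparison as 0.0 or 1.0
float lit = texture(shadow_tex, proj_coord);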

If the selected texture filtering mode for the texture is GL_LINEAR or would otherwise require multiple samples, then OpenGL applies the test to each of the samples individually before averaging them together. The result of the textureProj function is therefore a value between 0.0 and 1.0 based on which and how many of the samples passed the comparison. All we need to do, then, is to call textureProj with our shadow sampler containing our depth buffer using the interpolated shadow texture coordinate, and the result will be a value that we can use to determine whether the point is in shadow or not. A highly simplified shadow mapping fragment shader is shown in Listing 12.22.


#version 420 core

layout (location = 0) out vec4 color;

layout (binding = 0) uniform sampler2DShadow shadow_tex;

in VS_OUT
{
vec4 shadow_coord;
} fs_in;

void main(void)
{
color = textureProj(shadow_tex, fs_in.shadow_coord) * vec4(1.0);
}


Listing 12.22: Simplified fragment shader for shadow mapping

Of course, the result of rendering a scene with the shader shown in Listing 12.22 is that no real lighting is applied and everything is drawn in black and white. However, as you can see in the shader code, we have simply multiplied the value vec4(1.0) by the result of the shadow map sample. In a more complex shader, we would apply our normal shading and texturing and multiply the result of those calculations by the result of the shadow map sample. Figure 12.19 shows a simple scene rendered as just shadow information on the left and with full lighting calculations on the right. This image was produced by the shadowmapping example.

Image

Figure 12.19: Results of rendering with shadow maps

Shadow maps have their advantages and disadvantages. They can be very memory intensive, as each light requires its own shadow map. Each light also requires a pass over the scene, which costs performance. This can quickly add up and slow your application down. Shadow maps must also be of very high resolution because a single texel in the shadow map may cover several pixels in screen space, which is effectively where the lighting calculations are performed. Finally, the effects of self-occlusion may be visible in the output as stripes or a “sparkling” pattern in shadowed regions. It is possible to mitigate this to some degree using polygon offset. This is a small offset that can be applied automatically by OpenGL to all polygons (triangles) in order to push them towards or away from the viewer. To set the polygon offset, call

void glPolygonOffset(GLfloat factor,
GLfloat units);

The first parameter, factor, is a scale factor that is multiplied by the change in depth of the polygon relative to its screen area, and the second parameter, units, is scaled by an implementation-defined value representing the smallest change guaranteed to produce a resolvable difference in the depth buffer. If this sounds a bit handwavy — it can be. You need to play with these two values until the depth fighting effects go away. Once you’ve set up your polygon offset scaling factors, you can enable the effect by calling glEnable() with the GL_POLYGON_OFFSET_FILL parameter, and disable it again by passing the same parameter to glDisable().
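
As a rough sketch of how this might look in the application, the factor and units values below are only starting points and will likely need tuning for your scene:

// Apply a small depth offset while rendering the depth pass from the
// light's point of view to reduce self-occlusion artifacts.
glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(2.0f, 4.0f);     // values chosen by experimentation

// ... render the scene into the shadow map here ...

glDisable(GL_POLYGON_OFFSET_FILL);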

Atmospheric Effects

In general, rendering in computer graphics is the modeling of light as it interacts with the world around us. Most of the rendering we’ve done so far has not taken into consideration the medium in which the light travels. Usually, this is air. The air around us isn’t perfectly transparent, and it contains particles, vapor, and gases that absorb and scatter light as it travels. We use this scattering and absorption to gauge depth and infer distance as we look out into the world. Modeling it, even approximately, can add quite a bit of realism to our scenes.

Fog

We are all familiar with fog. On a foggy day, it might be impossible to see more than a few feet in front of us, and dense fog can present danger. However, even when fog is not heavy, it’s still there — you may just need to look further to see it. Fog is caused by water vapor hanging in the air or by other gases or particles such as smoke or pollution. As light travels through the air, two things happen — some of the light is absorbed by the particles, and some bounces off the particles (or is possibly re-emitted by those particles). The absorption of light by fog is known as extinction, because eventually all of the light will have been absorbed and none will be left. However, some light will generally find a way out of the fog as it bounces around and is absorbed and re-emitted by the fog particles. We call this inscattering. We can build a simple model of both extinction and inscattering to produce a simple yet effective simulation of fog.

For this example, we will return to the tessellated landscape example of Chapter 8, “Primitive Processing.” If you refer back to Figure 8.12, you will notice that we left the sky black and used only a simple texture with shading information baked into it to render the landscape. It is quite difficult to infer depth from the rendered result, and so we will adapt the sample to apply fog.

To add fog effects to the sample, we modify our tessellation evaluation shader to send both the world- and eye-space coordinates of each point to the fragment shader. The modified tessellation evaluation shader is shown in Listing 12.23.


#version 420 core

layout (quads, fractional_odd_spacing) in;

uniform sampler2D tex_displacement;

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;
uniform float dmap_depth;

in TCS_OUT
{
vec2 tc;
} tes_in[];

out TES_OUT
{
vec2 tc;
vec3 world_coord;
vec3 eye_coord;
} tes_out;

void main(void)
{
vec2 tc1 = mix(tes_in[0].tc, tes_in[1].tc, gl_TessCoord.x);
vec2 tc2 = mix(tes_in[2].tc, tes_in[3].tc, gl_TessCoord.x);
vec2 tc = mix(tc2, tc1, gl_TessCoord.y);

vec4 p1 = mix(gl_in[0].gl_Position,
gl_in[1].gl_Position, gl_TessCoord.x);
vec4 p2 = mix(gl_in[2].gl_Position,
gl_in[3].gl_Position, gl_TessCoord.x);
vec4 p = mix(p2, p1, gl_TessCoord.y);
p.y += texture(tex_displacement, tc).r * dmap_depth;

vec4 P_eye = mv_matrix * p;

tes_out.tc = tc;
tes_out.world_coord = p.xyz;
tes_out.eye_coord = P_eye.xyz;

gl_Position = proj_matrix * P_eye;
}


Listing 12.23: Displacement map tessellation evaluation shader

In the fragment shader, we fetch from our landscape texture as normal, but then we apply our simple fog model to the resulting color. We use the length of the eye-space coordinate to determine the distance from the viewer to the point being rendered. This tells us how far through the atmosphere light from the point of interest must travel to reach our eyes, which is the input term to the fog equations. We will apply exponential fog to our scene. The extinction and inscattering terms will be

$f_e = e^{-z \cdot d_e}$

$f_i = e^{-z \cdot d_i}$

Here, $f_e$ is the extinction factor, and $f_i$ is the inscattering factor. Likewise, $d_e$ and $d_i$ are the extinction and inscattering coefficients, which we can use to control our fog effect, and $z$ is the distance from the eye to the point being shaded. As $z$ approaches zero, the exponential terms tend towards one. As $z$ increases (i.e., the point being shaded gets further from the viewer), the exponential terms get smaller and smaller, tending towards zero. These curves are illustrated by the graph in Figure 12.20.

Image

Figure 12.20: Graphs of exponential decay
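
Putting the two terms together, the final fogged color is the surface color attenuated by the extinction factor plus the fog color weighted by how much light has been scattered in towards the viewer:

$c_{final} = c_{surface} \cdot f_e + c_{fog} \cdot (1 - f_i)$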

The modified fragment shader that applies fog is shown in Listing 12.24.


#version 420 core

out vec4 color;

layout (binding = 1) uniform sampler2D tex_color;

uniform bool enable_fog = true;
uniform vec4 fog_color = vec4(0.7, 0.8, 0.9, 0.0);

in TES_OUT
{
vec2 tc;
vec3 world_coord;
vec3 eye_coord;
} fs_in;

vec4 fog(vec4 c)
{
float z = length(fs_in.eye_coord);

float de = 0.025 * smoothstep(0.0, 6.0,
10.0 - fs_in.world_coord.y);
float di = 0.045 * smoothstep(0.0, 40.0,
20.0 - fs_in.world_coord.y);

float extinction = exp(-z * de);
float inscattering = exp(-z * di);

return c * extinction + fog_color * (1.0 - inscattering);
}

void main(void)
{
vec4 landscape = texture(tex_color, fs_in.tc);

if (enable_fog)
{
color = fog(landscape);
}
else
{
color = landscape;
}
}


Listing 12.24: Application of fog in a fragment shader

In our fragment shader, the fog function applies fog to the incoming fragment color. It first calculates the fog factors for the extinction and inscattering components of the fog. It then multiplies the original fragment color by the extinction term; as the extinction term approaches zero, this product approaches black. It then multiplies the fog color by one minus the inscattering term. As the distance from the viewer increases, the inscattering term approaches zero (just like the extinction term), so one minus this value approaches one, meaning that as the scene gets further from the viewer, its color approaches the color of the fog. The result of rendering the tessellated landscape scene with this shader is shown in Figure 12.21. The left image shows the original scene without fog, and the right image shows the scene with fog applied. You should be able to see that the sense of depth is greatly improved in the image on the right.

Image

Figure 12.21: Applying fog to tessellated landscape

Non-Photo-Realistic Rendering

Normally, the goal of rendering and computer graphics is to produce an image that appears as realistic as possible. However, for some applications or artistic reasons, it may be desirable to render an image that isn’t realistic at all. For example, perhaps we want to render using a pencil-sketch effect or in a completely abstract manner. This is known as non-photo-realistic rendering, or NPR.

Cell Shading — Texels as Light

Many of our examples of texture mapping in the last few chapters have used 2D textures. Two-dimensional textures are typically the simplest and easiest to understand. Most people can quickly get the intuitive feel for putting a 2D picture on the side of a piece of 2D or 3D geometry. Let’s take a look now at a one-dimensional texture mapping example that is commonly used in computer games to render geometry that appears on-screen like a cartoon. Toon shading, which is often referred to as cell shading, uses a one-dimensional texture map as a lookup table to fill geometry with a solid color (using GL_NEAREST) from the texture map.

The basic idea is to use the diffuse lighting intensity (the dot product between the eye-space surface normal and the light direction vector) as the texture coordinate into a one-dimensional texture that contains a gradually brightening color table. Figure 12.22 shows one such texture, with four increasingly bright red texels (defined as RGB unsigned byte color components).

Image

Figure 12.22: A one-dimensional color lookup table

Recall that the diffuse lighting dot product varies from 0.0 at no intensity to 1.0 at full intensity. Conveniently, this maps nicely to a one-dimensional texture coordinate range. Loading this one-dimensional texture is pretty straightforward as shown here:

static const GLubyte toon_tex_data[] =
{
0x44, 0x00, 0x00, 0x00,
0x88, 0x00, 0x00, 0x00,
0xCC, 0x00, 0x00, 0x00,
0xFF, 0x00, 0x00, 0x00
};

glGenTextures(1, &tex_toon);
glBindTexture(GL_TEXTURE_1D, tex_toon);
glTexStorage1D(GL_TEXTURE_1D, 1, GL_RGB8, sizeof(toon_tex_data) / 4);
glTexSubImage1D(GL_TEXTURE_1D, 0,
0, sizeof(toon_tex_data) / 4,
GL_RGBA, GL_UNSIGNED_BYTE,
toon_tex_data);
glTexParameteri(GL_TEXTURE_1D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_1D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_1D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);

This code is from the example program toonshading, which renders a spinning torus with the toon shading effect applied. Although the torus model file, which we use to create the torus, supplies a set of two-dimensional texture coordinates, we ignore them in our vertex shader, which is shown in Listing 12.25, and only use the incoming position and normal.


#version 420 core

uniform mat4 mv_matrix;
uniform mat4 proj_matrix;

layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;

out VS_OUT
{
vec3 normal;
vec3 view;
} vs_out;

void main(void)
{
vec4 pos_vs = mv_matrix * position;

// Calculate eye-space normal and position
vs_out.normal = mat3(mv_matrix) * normal;
vs_out.view = pos_vs.xyz;

// Send clip-space position to primitive assembly
gl_Position = proj_matrix * pos_vs;
}


Listing 12.25: The toon vertex shader

Other than the transformed geometry position, the outputs of this shader are an interpolated eye-space normal and position that are passed to the fragment shader, which is shown in Listing 12.26. The computation of the diffuse lighting component is virtually identical to the earlier diffuse lighting examples.


#version 420 core

layout (binding = 0) uniform sampler1D tex_toon;

uniform vec3 light_pos = vec3(30.0, 30.0, 100.0);

in VS_OUT
{
vec3 normal;
vec3 view;
} fs_in;

out vec4 color;

void main(void)
{
// Calculate per-pixel normal and light vector
vec3 N = normalize(fs_in.normal);
vec3 L = normalize(light_pos - fs_in.view);

// Simple N dot L diffuse lighting
float tc = pow(max(0.0, dot(N, L)), 5.0);

// Sample from cell shading texture
color = texture(tex_toon, tc) * (tc * 0.8 + 0.2);
}


Listing 12.26: The toon fragment shader

The fragment shader for our toon shader calculates the diffuse lighting coefficient as normal, but rather than using it directly, it uses it to look up into a texture containing our four cell colors. In a traditional toon shader, the diffuse coefficient would be used unmodified as a texture coordinate, and the resulting color would be sent directly to the output of the fragment shader. However, here we raise the diffuse coefficient to a small power and then scale that color returned from the ramp texture by the diffuse lighting coefficient before outputting the result. This makes the toon highlights slightly sharper and also leaves the image with some depth rather than the plain flat shading that would be achieved with the content of the toon ramp texture only.

The resulting output is shown in Figure 12.23, where the banding and highlighting due to the toon shader are clearly visible. Both the red color ramp texture and the toon-shaded torus are also shown together in Color Plate 12.

Image

Figure 12.23: A toon-shaded torus

Alternative Rendering Methods

Traditional forward rendering executes the complete graphics pipeline, starting with a vertex shader and following through with any number of subsequent stages, most likely terminating with a fragment shader. That fragment shader is responsible for calculating the final color of the fragment3 and after each drawing command, the content of the framebuffer becomes more and more complete. However, it doesn’t have to be this way. As you will see in this section, it’s quite possible to partially calculate some of the shading information and finish the scene after all of the objects have been rendered, or even to forego traditional vertex-based geometry representations and do all of your geometry processing in the fragment shader.

3. Post processing notwithstanding.

Deferred Shading

In almost all of the examples you’ve seen so far, the fragment shader is used to calculate the final color of the fragment that it’s rendering. Now, consider what happens when you render an object that ends up covering something that’s already been drawn to the screen. This is known as overdraw. In this case, the result of the previous calculation is replaced with the new rendering, essentially throwing away all of the work that the first fragment shader did. If the fragment shader is expensive, or if there is a lot of overdraw, this can add up to a large drain on performance. To get around this, we can use a technique called deferred shading, which is a method to delay the heavy processing that might be performed by a fragment shader until the last moment.

To do this, we first render the scene using a very simple fragment shader that outputs into the framebuffer any parameters of each fragment that we might need for shading it later. In most cases, multiple framebuffer attachments will be required. If you refer to the earlier sections on lighting, you will see that the types of information you might need for lighting the scene would be the diffuse color of the fragment, its surface normal, and its position in world space. The latter can usually be reconstructed from screen space and the depth buffer, but it can be convenient to simply store the world-space coordinate of each fragment in a framebuffer attachment. The framebuffer used for storing this intermediate information is often referred to as a G-buffer. Here, G stands for geometry as it stores information about the geometry at that point rather than image properties.

Once the G-buffer has been generated, it is possible to shade each and every point on the screen using a single full-screen quad. This final pass will use the full complexity of the final lighting algorithms, but rather than being applied to each pixel of each triangle, it is applied to each pixel in the framebuffer exactly once. This can substantially reduce the cost of shading fragments, especially if many lights or a complex shading algorithm are in use.
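
At a high level, the flow of a deferred renderer therefore looks something like the following sketch. The program objects (gbuffer_program, lighting_program) and the helper functions (render_scene_geometry() and draw_full_screen_quad()) are placeholders for your own code rather than part of the book’s sample; the gbuffer and gbuffer_tex objects are created as shown in Listing 12.27 below.

// Pass 1: render geometric attributes into the G-buffer
glBindFramebuffer(GL_FRAMEBUFFER, gbuffer);
static const GLenum draw_buffers[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
glDrawBuffers(2, draw_buffers);

static const GLuint uint_zeros[] = { 0, 0, 0, 0 };
static const GLfloat float_zeros[] = { 0.0f, 0.0f, 0.0f, 0.0f };
static const GLfloat one = 1.0f;
glClearBufferuiv(GL_COLOR, 0, uint_zeros);
glClearBufferfv(GL_COLOR, 1, float_zeros);
glClearBufferfv(GL_DEPTH, 0, &one);

glUseProgram(gbuffer_program);
render_scene_geometry();

// Pass 2: shade every pixel exactly once from the stored attributes
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, gbuffer_tex[0]);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, gbuffer_tex[1]);

glDisable(GL_DEPTH_TEST);
glUseProgram(lighting_program);
draw_full_screen_quad();        // e.g., glDrawArrays(GL_TRIANGLE_STRIP, 0, 4)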

Generating the G-Buffer

The first stage of a deferred renderer is to create the G-buffer, which is implemented using a framebuffer object with several attachments. OpenGL can support framebuffers with up to eight attachments, and each attachment can have up to four 32-bit channels (using the GL_RGBA32F internal format, for example). However, each channel of each attachment consumes some memory bandwidth, and if we don’t pay attention to the amount of data we write to the framebuffer, we can start to outweigh the savings of deferring shading with the added cost of the memory bandwidth required to save all of this information.

In general, 16-bit floating-point values are more than enough to store colors4 and normals. 32-bit floating-point values are normally preferred to store the world-space coordinates in order to preserve accuracy. Additional components that might be stored for the purposes of shading might be derived from the material. For example, we may store the specular exponent (or shininess factor) at each pixel. Given all of the data, the varying precision requirements, and the consideration of efficiency of memory bandwidth, it’s a good idea to attempt to pack the data together into otherwise unrelated components of wider framebuffer formats.

4. Even when rendering in HDR, the color content of a G-buffer can be stored as 8-bit values so long as the final passes operate at higher precision.

In our example, we’ll use three 16-bit components to store the normal at each fragment, three 16-bit components to store the fragment’s albedo (flat color), three 32-bit floating-point components to store5 the world-space coordinate of the fragment, a 32-bit integer component to store a per-pixel object or material index, and a 32-bit component to store the per-pixel specular power factor.

5. Several methods exist to reconstruct the world-space coordinates of a fragment from its screen-space coordinates, but for this example, we’ll store them directly in the framebuffer.

In total, that is six 16-bit components and five 32-bit components, or 256 bits per pixel. How on earth will we represent this with a single framebuffer? Actually, it’s fairly simple. The six 16-bit components can be packed into the first three 32-bit components of a GL_RGBA32UI format framebuffer attachment. This leaves a fourth component that we can use to store our 32-bit object identifier. Now, we have four more 32-bit components to store — the three components of our world-space coordinate and the specular power. These can simply be packed into a GL_RGBA32F format framebuffer attachment. The code to create our G-buffer framebuffer is shown in Listing 12.27.


GLuint gbuffer;
GLuint gbuffer_tex[3];

glGenFramebuffers(1, &gbuffer);
glBindFramebuffer(GL_FRAMEBUFFER, gbuffer);

glGenTextures(3, gbuffer_tex);
glBindTexture(GL_TEXTURE_2D, gbuffer_tex[0]);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA32UI,
MAX_DISPLAY_WIDTH, MAX_DISPLAY_HEIGHT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glBindTexture(GL_TEXTURE_2D, gbuffer_tex[1]);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA32F,
MAX_DISPLAY_WIDTH, MAX_DISPLAY_HEIGHT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glBindTexture(GL_TEXTURE_2D, gbuffer_tex[2]);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_DEPTH_COMPONENT32F,
MAX_DISPLAY_WIDTH, MAX_DISPLAY_HEIGHT);

glFramebufferTexture(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
gbuffer_tex[0], 0);
glFramebufferTexture(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1,
gbuffer_tex[1], 0);
glFramebufferTexture(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
gbuffer_tex[2], 0);

glBindFramebuffer(GL_FRAMEBUFFER, 0);


Listing 12.27: Initializing a G-buffer

Now that we have a framebuffer to represent our G-buffer, it’s time to start rendering into it. We mentioned packing multiple 16-bit components into half as many 32-bit components. This can be achieved using the GLSL function packHalf2x16. Assuming our fragment shader has all of the necessary input information, it can export all of the data it needs into two color outputs as seen in Listing 12.28.


#version 420 core

layout (location = 0) out uvec4 color0;
layout (location = 1) out vec4 color1;

in VS_OUT
{
vec3 ws_coords;
vec3 normal;
vec3 tangent;
vec2 texcoord0;
flat uint material_id;
} fs_in;

layout (binding = 0) uniform sampler2D tex_diffuse;

void main(void)
{
uvec4 outvec0 = uvec4(0);
vec4 outvec1 = vec4(0);

vec3 color = texture(tex_diffuse, fs_in.texcoord0).rgb;

outvec0.x = packHalf2x16(color.xy);
outvec0.y = packHalf2x16(vec2(color.z, fs_in.normal.x));
outvec0.z = packHalf2x16(fs_in.normal.yz);
outvec0.w = fs_in.material_id;

outvec1.xyz = fs_in.ws_coords;
outvec1.w = 60.0;

color0 = outvec0;
color1 = outvec1;
}


Listing 12.28: Writing to a G-buffer

As you can see from Listing 12.28, we have made extensive use of the packHalf2x16 function. Although this seems like quite a bit of code, it is generally “free” relative to the memory bandwidth cost of storing all of this data. Once you have rendered your scene to the G-buffer, it’s time to calculate the final color of all of the pixels in the framebuffer.

Consuming the G-Buffer

Given a G-buffer with diffuse colors, normals, specular powers, world-space coordinates, and other information, we need to read from it and reconstruct the original data that we packed in Listing 12.28. Essentially, we employ the inverse operations to our packing code and make use of the unpackHalf2x16 and uintBitsToFloat functions to convert the integer data stored in our textures into the floating-point data we need. The unpacking code is shown in Listing 12.29.


layout (binding = 0) uniform usampler2D gbuf_tex0;
layout (binding = 1) uniform sampler2D gbuf_tex1;

struct fragment_info_t
{
vec3 color;
vec3 normal;
float specular_power;
vec3 ws_coord;
uint material_id;
};

void unpackGBuffer(ivec2 coord,
out fragment_info_t fragment)
{
uvec4 data0 = texelFetch(gbuf_tex0, ivec2(coord), 0);
vec4 data1 = texelFetch(gbuf_tex1, ivec2(coord), 0);
vec2 temp;

temp = unpackHalf2x16(data0.y);
fragment.color = vec3(unpackHalf2x16(data0.x), temp.x);
fragment.normal = normalize(vec3(temp.y, unpackHalf2x16(data0.z)));
fragment.material_id = data0.w;

fragment.ws_coord = data1.xyz;
fragment.specular_power = data1.w;
}


Listing 12.29: Unpacking data from a G-buffer

We can visualize the contents of our G-buffer using a simple fragment shader that reads from the resulting textures that are attached to it, unpacks the data into its original form, and then outputs the desired parts to the normal color framebuffer. Rendering a simple scene into the G-buffer and visualizing it gives the result shown in Figure 12.24.

Image

Figure 12.24: Visualizing components of a G-buffer

The upper-left quadrant of Figure 12.24 shows the diffuse albedo, the upper right shows the surface normals, the lower left shows the world-space coordinates, and the lower right of Figure 12.24 shows the material ID at each pixel, represented as different levels of gray.

Once we have unpacked the content of the G-buffer into our shader, we have everything we need to calculate the final color of the fragment. We can use any of the techniques covered in the earlier part of this chapter. In this example, we use standard Phong shading. Taking the fragment_info_t structure unpacked in Listing 12.29, we can pass it directly to a lighting function that will calculate the final color of the fragment from the lighting information. Such a function is shown in Listing 12.30.


vec4 light_fragment(fragment_info_t fragment)
{
int i;
vec4 result = vec4(0.0, 0.0, 0.0, 1.0);

if (fragment.material_id != 0)
{
for (i = 0; i < num_lights; i++)
{
vec3 L = light[i].position - fragment.ws_coord;
float dist = length(L);
L = normalize(L);
vec3 N = normalize(fragment.normal);
vec3 R = reflect(-L, N);
float NdotR = max(0.0, dot(N, R));
float NdotL = max(0.0, dot(N, L));
float attenuation = 50.0 / (pow(dist, 2.0) + 1.0);

vec3 diffuse_color = light[i].color * fragment.color *
NdotL * attenuation;
vec3 specular_color = light[i].color *
pow(NdotR, fragment.specular_power)
* attenuation;

result += vec4(diffuse_color + specular_color, 0.0);
}
}

return result;
}


Listing 12.30: Lighting a fragment using data from a G-buffer

The final result of lighting a scene using deferred shading is shown in Figure 12.25. In the scene, over 200 copies of an object are rendered using instancing. Each pixel in the frame has some overdraw. The final pass over the scene calculates the contribution of 64 lights. Increasing and decreasing the number of lights in the scene has little effect on performance. In fact, the most expensive part of rendering the scene is generating the G-buffer in the first place and then reading and unpacking it in the lighting shader, which is performed once in this example, regardless of the number of lights in the scene. In this example, we have used a relatively inefficient G-buffer representation for the sake of clarity. This consumes quite a bit of memory bandwidth, and the performance of the program could probably be increased somewhat by reducing the storage requirements of the buffer.

Image

Figure 12.25: Final rendering using deferred shading

Normal Mapping and Deferred Shading

Earlier in this chapter, you read about normal mapping, which is a technique to store local surface normals in a texture and then use them to add detail to rendered models. To achieve this, most normal mapping algorithms (including the one described earlier in this chapter) use tangent-space normals and perform all lighting calculations in that coordinate space. This involves calculating the light and view vectors, L and V, in the vertex shader, transforming them into tangent space using the TBN matrix, and passing them to the fragment shader where lighting calculations are performed. However, in deferred renderers, the normals that you store in the G-buffer are generally in world or view space.

In order to generate view-space normals6 that can be stored into a G-buffer for deferred shading, we need to take the tangent-space normals read from the normal map and transform them into view-space during G-buffer generation. This requires minor modifications to the normal mapping algorithm.

6. View space is generally preferred for lighting calculations over world space as it has consistent accuracy independent of the viewer’s position. When the viewer is placed at a large distance from the origin, world space precision breaks down near the viewer, and that can affect the accuracy of lighting calculations.

First, we do not calculate L or V in the vertex shader, nor do we construct the TBN matrix there. Instead, we calculate the view-space normal and tangent vectors, N and T, and pass them to the fragment shader. In the fragment shader, we re-normalize N and T and take their cross product to produce the bitangent vector B. This is used in the fragment shader to construct the TBN matrix local to the fragment being shaded. We read the tangent-space normal from the normal map as usual, but transform it through the inverse of the matrix we would use to move vectors into tangent space (which is simply its transpose, assuming it encodes only rotation). This moves the normal vector from tangent space into view space. The normal is then stored in the G-buffer. The remainder of the shading algorithm that performs lighting calculations is unchanged from that described earlier.

The vertex shader used to generate the G-buffer with normal mapping applied is almost unmodified from the version that does not apply normal mapping. However, the updated fragment shader is shown in Listing 12.31.


#version 420 core

layout (location = 0) out uvec4 color0;
layout (location = 1) out vec4 color1;

in VS_OUT
{
vec3 ws_coords;
vec3 normal;
vec3 tangent;
vec2 texcoord0;
flat uint material_id;
} fs_in;

layout (binding = 0) uniform sampler2D tex_diffuse;
layout (binding = 1) uniform sampler2D tex_normal_map;

void main(void)
{
vec3 N = normalize(fs_in.normal);
vec3 T = normalize(fs_in.tangent);
vec3 B = cross(N, T);
mat3 TBN = mat3(T, B, N);

vec3 nm = texture(tex_normal_map, fs_in.texcoord0).xyz * 2.0 - vec3(1.0);
nm = TBN * normalize(nm);

uvec4 outvec0 = uvec4(0);
vec4 outvec1 = vec4(0);

vec3 color = texture(tex_diffuse, fs_in.texcoord0).rgb;

outvec0.x = packHalf2x16(color.xy);
outvec0.y = packHalf2x16(vec2(color.z, nm.x));
outvec0.z = packHalf2x16(nm.yz);
outvec0.w = fs_in.material_id;

outvec1.xyz = fs_in.ws_coords;
outvec1.w = 60.0;

color0 = outvec0;
color1 = outvec1;
}


Listing 12.31: Deferred shading with normal mapping (fragment shader)

Finally, Figure 12.26 shows the difference between applying normal maps to the scene (left) and using the interpolated per-vertex normal (right). As you can see, substantially more detail is visible in the left image, which has normal maps applied. All of this code is contained in the deferredshading example, which generated these images.

Image

Figure 12.26: Deferred shading with and without normal maps

Deferred Shading — Downsides

While deferred shading can reduce the impact of complex lighting or shading calculations on the performance of your application, it won’t solve all of your problems. Besides being very bandwidth heavy and requiring a lot of memory for all of the textures you attach to your G-buffer, there are a number of other downsides to deferred shading. With a bit of effort, you might be able to work around some of them, but before you launch into writing a shiny new deferred renderer, you should consider the following.

First, the bandwidth requirements of a deferred shading implementation should be considered carefully. In our example, we used 256 bits of information for each pixel in the G-buffer, and we didn’t make particularly efficient use of them either. We packed our world-space coordinates directly in the G-buffer, consuming 96 bits of space (remember, we used three 32-bit floating-point entries for this). However, we have the screen-space coordinates of each pixel when we render our final pass, which we can retrieve from the x and y components of gl_FragCoord and from the content of the depth buffer. To obtain world-space coordinates, we need to undo the viewport transform (which is simply a scale and bias) and then move the resulting coordinates from clip space back into world space by applying the inverse of the projection and view matrices (which normally transform coordinates from world space into clip space). As the view matrix usually encodes only translation and rotation, it is generally easy to invert. However, the projection matrix and the subsequent homogeneous division are more difficult to reverse.
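
As an illustration of the idea, a reconstruction function might look something like the sketch below. The uniform names and the pre-computed inverse matrix are our own assumptions and are not part of the book’s deferred shading sample.

// Recover a world-space position from gl_FragCoord.xy and a depth value
// read from the depth buffer, instead of storing it in the G-buffer.
uniform mat4 inv_view_proj_matrix;   // inverse(proj_matrix * view_matrix)
uniform vec2 viewport_size;

vec3 reconstruct_ws_position(float depth)
{
// Undo the viewport transform: window coordinates -> NDC
vec3 ndc;
ndc.xy = (gl_FragCoord.xy / viewport_size) * 2.0 - 1.0;
ndc.z = depth * 2.0 - 1.0;

// Undo the projection and view transforms, then the perspective divide
vec4 P = inv_view_proj_matrix * vec4(ndc, 1.0);
return P.xyz / P.w;
}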

We also used 48 bits to encode our surface normals in the G-buffer by using three 16-bit floating-point numbers per normal. We could instead store only the x and y components of the normal and reconstruct the z component using the knowledge that the normal should be a unit-length vector and, therefore, $n_z = \sqrt{1 - n_x^2 - n_y^2}$. We must also deduce the sign of z; however, if we assume that no surface with a negative z component in view space will ever be rendered, we’d usually be right. Finally, the specular power and material ID components were stored using full 32-bit quantities. It is unlikely that you will have more than 60,000 unique materials in your scene, so 16 bits is plenty for a material ID. Also, it is reasonable to store specular powers as logarithms and raise 2 to the power of the shininess factor in your lighting shader. This requires substantially fewer bits to store the specular power factor in the G-buffer.
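
Two of these reductions are easy to sketch in GLSL. The helper names and the fixed-point scale used for the specular power are illustrative assumptions, not code from the book’s sample:

// Rebuild a unit-length view-space normal from just its x and y components,
// assuming its z component points towards the viewer.
vec3 reconstruct_normal(vec2 n_xy)
{
float n_z = sqrt(max(0.0, 1.0 - dot(n_xy, n_xy)));
return vec3(n_xy, n_z);
}

// Store the specular power as a logarithm in a small fixed-point value
// (assumes power >= 1.0; the 16.0 scale is arbitrary)...
uint encode_specular_power(float power)
{
return uint(log2(power) * 16.0);
}

// ...and recover it in the lighting shader with exp2().
float decode_specular_power(uint encoded)
{
return exp2(float(encoded) / 16.0);
}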

Another downside of deferred shading algorithms is that they generally don’t play well with antialiasing. Normally, when OpenGL resolves a multi-sample buffer, it will take a (possibly weighted) average of the samples in the pixel. Averaging depth values, normals, and in particular meta-data such as material IDs just doesn’t work. So, if you want to implement antialiasing, you’ll need to use multi-sampled textures for all of the off-screen buffers attached to your G-buffer. What’s worse, because the final pass consists of a single large polygon (or possibly two) that covers the entire scene, none of the interior pixels will be considered edge pixels, breaking traditional multi-sample antialiasing. For the resolve pass, you will either need to write a special custom resolve shader or run the whole thing at sample rate, which will substantially increase the cost of lighting your scene.

Finally, most deferred shading algorithms can’t deal with transparency. This is because at each pixel in the G-buffer, we store only the information for a single fragment. In order to properly implement transparency, we would need to know all of the information for every fragment starting closest to the viewer until an opaque fragment is hit. There are algorithms that do this, and they are often used to implement order-independent transparency, for example. Another approach is simply to render all non-transparent surfaces using deferred shading and then to render transparent materials in a second pass through the scene. This requires your renderer to either keep a list of transparent surfaces that it skipped as it traversed the scene, or to traverse your scene twice. Either option can be pretty expensive.

In summary, deferred shading can bring substantial performance improvements to your application if you keep in mind the limitations of the techniques and restrict yourself to algorithms that it handles well.

Screen-Space Techniques

Most of the rendering techniques described in this book so far have been implemented per primitive. However, in the previous section, we discussed deferred shading, which suggested that at least some of the rendering procedures can be implemented in screen space. In this subsection, we discuss a few more algorithms that push shading into screen space. In some cases, this is the only way to implement certain techniques, and in other cases, we can achieve a pretty significant performance advantage by delaying processing until all geometry has already been rendered.

Ambient Occlusion

Ambient occlusion is a technique for simulating one component of global illumination. Global illumination is the observed effect of light bouncing from object to object in a scene such that surfaces are lit indirectly by the light reflected from nearby surfaces. Ambient light is an approximation to this scattered light and is a small, fixed amount added to lighting calculations. However, deep creases or gaps between objects receive less of this light because nearby surfaces occlude the light sources — hence the term ambient occlusion. Real-time global illumination is a topic of current research, and while some fairly impressive work has been presented, it remains an unsolved problem. However, we can produce some reasonably good results with ad-hoc methods and gross approximations. One such approximation is screen space ambient occlusion (or SSAO), which we will discuss here.

To explain the technique, we will start in two dimensions. Ambient light could be considered to be the amount of light that would hit a point on a surface if it were surrounded by an arbitrarily large number of small point lights. On a perfectly flat surface, any point is visible to all of the lights above that surface. However, on a bumpy surface, not all of the lights will be visible from all points on that surface — the bumpier the surface, the fewer the number of lights that will be visible from any given point. This is illustrated in Figure 12.27.

Image

Figure 12.27: Bumpy surface occluding points

In the diagram, you can see that we have eight point lights distributed roughly equally around a surface. For the point under consideration, we draw a line from that point to each of the eight lights. You can see that a point at the bottom of a valley in the surface can only see a small number of the lights, while a point at the top of a peak should be able to see most, if not all, of the lights. The bumps in the surface occlude the lights from points at the bottom of valleys, and therefore, those points will receive less ambient light. In a full global illumination simulation, we would literally trace lines (or rays) from each point being shaded in hundreds, perhaps thousands, of directions and determine what was hit. However, that’s far too expensive for a real-time solution, and so we use a method that allows us to calculate the occlusion of a point directly in screen space.

To implement this technique, we are going to march rays from each position in screen space along a random direction and determine the amount of occlusion at each point along that ray. First, we render our scene into depth and color buffers attached to an FBO. Along with this, we also render the normal at each fragment and its linear depth7 in view space into a second color attachment on the same FBO. In a second pass, we will use this information to compute the level of occlusion at each pixel. In this pass, we render a full-screen quad with our ambient occlusion shader. The shader reads the depth value that we render in our first pass, selects a random direction to walk in, and takes several steps along that direction. At each point along the walk, it tests whether the value in the depth buffer is less than the depth value computed along the ray. If it is, then we consider the point occluded.

7. We could reconstruct a linear view-space depth from the content of the depth buffer produced in the first pass by inverting the mapping of eye-space z into the 0.0 to 1.0 range stored in the depth buffer. However, for simplicity, we’re going to use the extra channel on our frame-buffer attachment.

To select a random direction, we pre-initialize a uniform buffer with a large number of random vectors in a unit radius sphere. Although our random vectors may point in any direction, we only really want to consider vectors that point away from the surface. That is, we only consider vectors lying in the hemisphere oriented around the surface normal at the point. To produce a random direction oriented in this hemisphere, we take the dot product of the surface normal (which we rendered into our color buffer earlier) and the selected random direction. If the result is negative, then the selected direction vector points into the surface, and so we negate it in order to point it back into the correctly oriented hemisphere. Figure 12.28 demonstrates the technique.

Image

Figure 12.28: Selection of random vector in an oriented hemisphere

In Figure 12.28, you can see that vectors V0, V1, and V4 already lie in the hemisphere that is aligned with the normal vector, N. This means that the dot product between any of these three vectors and N will be positive.

However, V2 and V3 lie outside the desired hemisphere, and it should be clear that the dot product between either of these two vectors and N will be negative. In this case, we simply negate V2 and V3, reorienting them into the correct hemisphere.
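
A sketch of how the uniform buffer of random vectors might be filled on the application side is shown below. The layout matches the SAMPLE_POINTS block declared by the shader in Listing 12.32, but the rejection-sampling loop and the random_float() helper are illustrative rather than the exact code of the ssao sample.

struct SAMPLE_POINTS
{
vmath::vec4 pos[256];               // random points inside the unit sphere
vmath::vec4 random_vectors[256];    // completely random vectors
};

static float random_float()
{
return float(rand()) / float(RAND_MAX) * 2.0f - 1.0f;   // [-1.0, 1.0]
}

void init_sample_points(GLuint &points_buffer)
{
SAMPLE_POINTS data;

for (int i = 0; i < 256; i++)
{
// Rejection sample: keep only points that fall inside the unit sphere
vmath::vec4 p;
do
{
p = vmath::vec4(random_float(), random_float(), random_float(), 0.0f);
} while (p[0] * p[0] + p[1] * p[1] + p[2] * p[2] > 1.0f);
data.pos[i] = p;

data.random_vectors[i] = vmath::vec4(random_float(), random_float(),
random_float(), random_float());
}

glGenBuffers(1, &points_buffer);
glBindBuffer(GL_UNIFORM_BUFFER, points_buffer);
glBufferData(GL_UNIFORM_BUFFER, sizeof(data), &data, GL_STATIC_DRAW);
glBindBufferBase(GL_UNIFORM_BUFFER, 0, points_buffer);
}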

Once we have our random set of vectors, it’s time to walk along them. To do this, we start at the point on the surface and step a small distance along our chosen direction vector. This produces a new point, complete with x, y, and z coordinates. We use the x and y components to read from the linear depth buffer that we rendered earlier and look up the value stored there. We compare this depth to the depth of the point along our ray; if the stored value is closer to the viewer (i.e., lower) than the point on the ray, then that point is obscured from view in the image, and we consider the original point to be occluded for the purposes of the algorithm. While this is clearly far from accurate, statistically it works out. The number of random directions to choose, the number of steps along each direction, and how far each step is are all parameters that we can choose to control the output image quality. The more directions we choose, the farther we step, and the more steps we take in each direction, the better the output image quality will be. Figure 12.29 shows the effect of adding more sample directions on the result of the ambient occlusion algorithm.

Image

Figure 12.29: Effect of increasing direction count on ambient occlusion

In Figure 12.29, directions are added from left to right, top to bottom, starting with a single direction on the top left, with 4 on the top right, 16 on the bottom left, and 64 on the bottom right. As you can see, it is not until we have 64 directions that the image becomes smooth. With fewer directions, severe banding is seen in the image. There are many approaches to reduce this, but one of the most effective is to randomize the distance along each of the occlusion rays we take for each sample. This introduces noise into the image, but also smoothes the result, improving overall quality. Figure 12.30 shows the result of introducing this randomness into the image.

Image

Figure 12.30: Effect of introducing noise in ambient occlusion

As you can see in Figure 12.30, the introduction of randomness in the step rate along the occlusion rays has improved image quality substantially. Again, from left to right, top to bottom, we have taken 1, 4, 16, and 64 directions, respectively. With random ray step rates, the image produced by considering only a single ray direction has gone from looking quite corrupted to looking noisy, but correct. Even the 4-direction result (shown on the top right of Figure 12.30) has acceptable quality, whereas the equivalent image in Figure 12.29 still exhibits considerable banding. The 16-sample image on the bottom left of Figure 12.30 is almost as good as the 64-sample image of Figure 12.29, and the 64-sample image of Figure 12.30 does not show much improvement over it. It is even possible to compensate for the noise introduced by this method, but that is beyond the scope of this example.

Once we have our ambient occlusion term, we need to apply it to our rendered image. Ambient occlusion is simply the amount by which ambient light is occluded. Therefore, all we need to do is multiply our ambient lighting term in our shading equation by our occlusion term, which causes the creases of our model to have less ambient lighting applied to them. Figure 12.31 shows the effect of applying the screen space ambient occlusion algorithm to a rendered scene.

Image

Figure 12.31: Ambient occlusion applied to a rendered scene

In Figure 12.31, the image on the left is the diffuse and specular terms of the lighting model only. The dragon is suspended just over a plane although depth is very hard to judge in the image. The image on the right has screen space ambient occlusion applied. As you can see, not only is the definition of some of the dragon’s details more apparent, but the dragon also casts a soft shadow on the ground below it, increasing the sense of depth.

In our first pass, we simply render the diffuse and specular terms into one color attachment as usual, and then we render the surface normal and linear eye-space depth into a second color attachment. The shader to do this is relatively straightforward and is similar to many of the shaders presented thus far in the book. The second pass of the algorithm is the interesting part — this is where we apply the ambient occlusion effect. It is shown in its entirety in Listing 12.32, which is part of the ssao sample application.


#version 430 core

// Samplers for pre-rendered color, normal, and depth
layout (binding = 0) uniform sampler2D sColor;
layout (binding = 1) uniform sampler2D sNormalDepth;

// Final output
layout (location = 0) out vec4 color;

// Various uniforms controlling SSAO effect
uniform float ssao_level = 1.0;
uniform float object_level = 1.0;
uniform float ssao_radius = 5.0;
uniform bool weight_by_angle = true;
uniform uint point_count = 8;
uniform bool randomize_points = true;

// Uniform block containing up to 256 random directions (x,y,z,0)
// and 256 more completely random vectors
layout (binding = 0, std140) uniform SAMPLE_POINTS
{
vec4 pos[256];
vec4 random_vectors[256];
} points;

void main(void)
{
// Get texture position from gl_FragCoord
vec2 P = gl_FragCoord.xy / textureSize(sNormalDepth, 0);
// ND = normal and depth
vec4 ND = textureLod(sNormalDepth, P, 0);
// Extract normal and depth
vec3 N = ND.xyz;
float my_depth = ND.w;

// Local temporary variables
int i;
int j;
int n;

float occ = 0.0;
float total = 0.0;

// n is a pseudo-random number generated from fragment coordinate
// and depth
n = (int(gl_FragCoord.x * 7123.2315 + 125.232) *
int(gl_FragCoord.y * 3137.1519 + 234.8)) ^
int(my_depth);
// Pull one of the random vectors
vec4 v = points.random_vectors[n & 255];

// r is our "radius randomizer"
float r = (v.r + 3.0) * 0.1;
if (!randomize_points)
r = 0.5;

// For each random point (or direction)...
for (i = 0; i < point_count; i++)
{
// Get direction
vec3 dir = points.pos[i].xyz;

// Put it into the correct hemisphere
if (dot(N, dir) < 0.0)
dir = -dir;

// f is the distance we've stepped in this direction
// z is the interpolated depth
float f = 0.0;
float z = my_depth;

// We're going to take 4 steps - we could make this
// configurable
total += 4.0;

for (j = 0; j < 4; j++)
{
// Step in the right direction
f += r;
// Step _towards_ viewer reduces z
z -= dir.z * f;

// Read depth from current fragment
float their_depth =
textureLod(sNormalDepth,
(P + dir.xy * f * ssao_radius), 0).w;

// Calculate a weighting (d) for this fragment's
// contribution to occlusion
float d = abs(their_depth - my_depth);
d *= d;

// If we're obscured, accumulate occlusion
if ((z - their_depth) > 0.0)
{
occ += 4.0 / (1.0 + d);
}
}
}

// Calculate occlusion amount
float ao_amount = 1.0 - occ / total;

// Get object color from color texture
vec4 object_color = textureLod(sColor, P, 0);

// Mix in ambient color scaled by SSAO level
color = object_level * object_color +
mix(vec4(0.2), vec4(ao_amount), ssao_level);
}


Listing 12.32: Ambient occlusion fragment shader

Rendering without Triangles

In the previous section, we covered techniques that can be applied in screen space, all of which are implemented by drawing a full-screen quad over geometry that’s already been rendered. In this section, we take it one step further and demonstrate how it’s possible to render an entire scene with nothing but a single full-screen quad.

Rendering Julia Fractals

In this next example, we render a Julia set, creating image data from nothing but the texture coordinates. Julia sets are related to the Mandelbrot set — the iconic bulblike fractal. The Mandelbrot image is generated by iterating the formula

$Z_n = Z_{n-1}^2 + C$

until the magnitude of Z exceeds a threshold, counting the number of iterations taken. If the magnitude of Z never exceeds the threshold within the allowed number of iterations, that point is determined to be inside the Mandelbrot set and is colored with some default color. If the magnitude of Z exceeds the threshold within the allowed number of iterations, then the point is outside the set. A common visualization of the Mandelbrot set colors the point using a function of the iteration count at the time the point was determined to be outside the set. The primary difference between the Mandelbrot set and the Julia set is the initial conditions for Z and C.

When rendering the Mandelbrot set, Z is set to (0 + 0i), and C is set to the coordinate of the point at which the iterations are to be performed. When rendering the Julia set, on the other hand, Z is set to the coordinate of the point at which iterations are performed, and C is set to an application-specified constant. Thus, while there is only one Mandelbrot set, there are infinitely many Julia sets — one for every possible value of C. Because of this, the Julia set can be controlled parametrically and even animated. Just as in some of the previous examples, we invoke this shader at every fragment by drawing a full-screen quad. However, rather than consuming and post-processing data that might already be in the framebuffer, we generate the final image directly.

Let’s set up the fragment shader with an input block containing just the texture coordinates. We also need a uniform to hold the value of C. To apply interesting colors to the resulting Julia image, we use a one-dimensional texture with a color gradient in it. When we’ve iterated a point that escapes from the set, we color the output fragment by indexing into this texture using the iteration count. Finally, we also define a uniform containing the maximum number of iterations we want to perform. This allows the application to balance performance against the level of detail in the resulting image. Listing 12.33 shows the setup for our Julia renderer’s fragment shader.


#version 430 core

in Fragment
{
vec2 tex_coord;
} fragment;

// Here's our value of c
uniform vec2 c;

// This is the color gradient texture
uniform sampler1D tex_gradient;

// This is the maximum iterations we'll perform before we consider
// the point to be outside the set
uniform int max_iterations;

// The output color for this fragment
out vec4 output_color;


Listing 12.33: Setting up the Julia set renderer

Now that we have the inputs to our shader, we are ready to start rendering the Julia set. The value of C is taken from the uniform supplied by the application. The initial value of Z is taken from the incoming texture coordinates supplied by the vertex shader. Our iteration loop is shown in Listing 12.34.


int iterations = 0;
vec2 z = fragment.tex_coord;
const float threshold_squared = 4.0;

// While there are iterations left and we haven't escaped from
// the set yet...
while (iterations < max_iterations &&
dot(z, z) < threshold_squared)
{
// Iterate the value of Z as Z^2 + C
vec2 z_squared;
z_squared.x = z.x * z.x - z.y * z.y;
z_squared.y = 2.0 * z.x * z.y;
z = z_squared + c;
iterations++;
}


Listing 12.34: Inner loop of the Julia renderer

The loop terminates under one of two conditions — either we reach the maximum number of iterations allowed (iterations == max_iterations) or the magnitude of Z passes our threshold. Note that in this shader, we compare the squared magnitude of Z (found using the dot function) to the square of the threshold (the threshold_squared uniform). The two operations are equivalent, but this way avoids a square root in the shader, improving performance. If, at the end of the loop, iterations is equal to max_iterations, we know that we ran out of iterations and the point is inside the set — we color it black. Otherwise, our point left the set before we ran out of iterations, and we can color the point accordingly. To do this, we can just figure out what fraction of the total allowed iterations we used up and use that to look up into the gradient texture. Listing 12.35 shows what the code looks like.


if (iterations == max_iterations)
{
output_color = vec4(0.0, 0.0, 0.0, 0.0);
}
else
{
output_color = texture(tex_gradient,
float(iterations) / float(max_iterations));
}


Listing 12.35: Using a gradient texture to color the Julia set

Now all that’s left is to supply the gradient texture and set an appropriate value of c. For our application, we update c on each frame as a function of the currentTime parameter passed to our render function. By doing this, we can animate the fractal. Figure 12.32 shows a few frames of the Julia animation produced by the julia example program. (See Color Plate 13 in the color insert for another example.)
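
One possible way to animate C is to sweep it around a circle in the complex plane each frame. This is a sketch only; the function name, the uniform location parameter, and the exact function of time are our own choices rather than the julia sample’s code.

// Called once per frame from the application's render function
void render_julia_frame(double currentTime, GLuint program, GLint c_location)
{
float t = (float)currentTime;

// Sweep C around a circle of radius 0.7885 in the complex plane
float cx = 0.7885f * cosf(t * 0.13f);
float cy = 0.7885f * sinf(t * 0.13f);

glUseProgram(program);
glUniform2f(c_location, cx, cy);

// ... bind the gradient texture and draw the full-screen quad ...
}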

Image

Figure 12.32: A few frames from the Julia set animation

Ray Tracing in a Fragment Shader

OpenGL usually works by using rasterization to generate fragments for primitives such as lines, triangles, and points. This should be obvious to you by now. We send geometry into the OpenGL pipeline, and for each triangle, OpenGL figures out which pixels it covers, and then runs your shader to figure out what color it should be. Ray tracing effectively inverts the problem. We throw a bunch of pixels into the pipeline (actually represented by rays), and then for each one, we figure out which pieces of geometry cover that pixel (which means our per-pixel ray hits the geometry). The biggest disadvantage of this when compared to traditional rasterization is that OpenGL doesn’t include direct support for it, which means we have to do all of the work in our own shaders. However, this provides us with a number of advantages — in particular, we aren’t limited8 to just points, lines, and triangles, and we can figure out what happens to a ray after it hits an object. Using the same techniques as we use for figuring out what’s visible from the camera, we can render reflections, shadows, and even refraction with little additional code.

8. In fact, points, lines, and triangles are amongst the more complex shapes to render in a ray tracer.

In this subsection, we discuss the construction of a simple recursive ray tracer using a fragment shader. The ray tracer we produce here will be capable of rendering images consisting of simple spheres and infinite planes — enough to produce the classic “glossy spheres in a box” image. Certainly, substantially more advanced implementations exist, but this should be sufficient to convey the basic techniques. Figure 12.33 shows a simplified, 2D illustration of the basics of a simple ray tracer.

Image

Figure 12.33: Simplified 2D illustration of ray tracing

In Figure 12.33, we see the eye position, which forms the origin of a ray O shot towards the image plane (which is our display) and intersecting it at point P. This ray is known as the primary ray and is denoted here by Rprimary. The ray intersects a first sphere at the intersection point I0. At this point, we create two additional rays. The first is directed towards the light source and is denoted by Rshadow. If this ray intersects anything along its way to the light source, then point I0 is in shadow; otherwise, it is lit by the light source. In addition to the shadow ray, we shoot a second ray, Rreflected, by reflecting the incoming ray Rprimary around N, the surface normal at I0.

Shading for ray tracing isn’t all that different from the types of shading and lighting algorithms we’ve looked at already in this book. We can still calculate diffuse and specular terms, apply normal maps and other textures, and so on. However, we also consider the contribution of the rays that we shoot in other directions. So, for I0, we’ll shade it using Rprimary as our view vector, N as our normal, Rshadow as our light vector, and so on. Next, we’ll shoot a ray off towards I1 (Rreflected), shade the surface there, and then add that contribution (scaled by the reflectivity of the surface at I0) back to the color accumulated at P. The result is crisp, clean reflections.
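
In shader code, spawning the two secondary rays at a hit point might look like the following sketch. The ray structure matches the one used in Listing 12.36 below; the small offset along the normal, which prevents a ray from immediately re-intersecting the surface it starts on, is our own addition.

struct ray
{
vec3 origin;
vec3 direction;
};

// Build the shadow and reflection rays for a hit point with surface normal N,
// given the primary ray that produced the hit and the light's position.
void make_secondary_rays(vec3 hit_pos, vec3 N, ray primary, vec3 light_pos,
out ray shadow_ray, out ray reflected_ray)
{
vec3 start = hit_pos + N * 0.001;               // nudge off the surface

shadow_ray.origin = start;
shadow_ray.direction = normalize(light_pos - hit_pos);

reflected_ray.origin = start;
reflected_ray.direction = reflect(primary.direction, N);
}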

Now, given the origin (O), which is usually at the origin in view space, and point P, we calculate the direction of ray Rprimary and begin the ray tracing process. This involves calculating the intersection of a line (our ray) and an object in the scene (each sphere). The intersection of a ray with a sphere works as follows.

Given a ray R with origin O and direction Image, then at time t, a point on that ray is O + t Image. Also, given a sphere at center C with radius r, any point on its surface is at distance r from C, and moreover, the squared distance between C and any point on the sphere’s surface is r2. This is convenient as the dot product of a vector with itself is its squared distance. Thus, we can say that for a point P at O + t Image

(P – C) · (P – C) = r²

Substituting for P, we have

(O + tD – C) · (O + tD – C) = r²

Expanding this gives us a quadratic equation in t:

t²(D · D) + 2t(D · (O – C)) + (O – C) · (O – C) – r² = 0

To write this in the more familiar form At² + Bt + C = 0, we set

A = D · D
B = 2(D · (O – C))
C = (O – C) · (O – C) – r²

As a simple quadratic equation, we can solve for t, knowing that there are either zero, one, or two solutions:

t = (–B ± √(B² – 4AC)) / 2A

Given that we know that our direction vector D is normalized, then its length is one, and therefore, A is one also. This simplifies things a little, and we can simply say that our solution for t is

t = (–B ± √(B² – 4C)) / 2

If 4C is greater than B², then the term under the square root is negative, and there is no solution for t, which means that there is no intersection between the ray and the sphere. If B² is equal to 4C, then there is only one solution, meaning that the ray just grazes the sphere. If that solution is positive, then this occurs in front of the viewer and we have found our intersection point. If the single solution for t is negative, then the intersection point is behind the viewer. Finally, if there are two solutions to the equation, we take the smallest non-negative solution for t as our intersection point. We simply plug this value back into P = O + tD and retrieve the coordinates of the intersection point in 3D space.

Shader code to perform this intersection test is shown in Listing 12.36.


struct ray
{
    vec3 origin;
    vec3 direction;
};

struct sphere
{
    vec3 center;
    float radius;
};

float intersect_ray_sphere(ray R,
                           sphere S,
                           out vec3 hitpos,
                           out vec3 normal)
{
    // Vector from the sphere center to the ray origin
    vec3 v = R.origin - S.center;

    // Coefficients of the quadratic (A is 1 because R.direction is normalized)
    float B = 2.0 * dot(R.direction, v);
    float C = dot(v, v) - S.radius * S.radius;
    float B2 = B * B;

    // Discriminant: negative means the ray misses the sphere entirely
    float f = B2 - 4.0 * C;

    if (f < 0.0)
        return 0.0;

    // The two candidate solutions; the final * 0.5 applies the division by 2
    float t0 = -B + sqrt(f);
    float t1 = -B - sqrt(f);
    float t = min(max(t0, 0.0), max(t1, 0.0)) * 0.5;

    // t is 0.0 if both hits are behind the viewer (this also rejects rays
    // that start inside the sphere), which we treat as a miss
    if (t == 0.0)
        return 0.0;

    // Reconstruct the hit position and the surface normal at that point
    hitpos = R.origin + t * R.direction;
    normal = normalize(hitpos - S.center);

    return t;
}


Listing 12.36: Ray-sphere intersection test

Given the structures ray and sphere, the function intersect_ray_sphere in Listing 12.36 returns 0.0 if the ray does not hit the sphere and the value of t if it does. If an intersection is found, the position of that intersection is returned in the output parameter hitpos, and the normal of the surface at the intersection point is returned in the output parameter normal. We use the returned value of t to determine the closest intersection point along each ray by initializing a temporary variable to the longest allowed ray length, and taking the minimum between it and the distance returned by intersect_ray_sphere for each sphere in the scene. The code to do this is shown in Listing 12.37.


// Declare a uniform block with our spheres in it.
layout (std140, binding = 1) uniform SPHERES
{
    sphere S[128];
};

// Textures with the ray origin and direction in them
layout (binding = 0) uniform sampler2D tex_origin;
layout (binding = 1) uniform sampler2D tex_direction;

// Construct a ray using the two textures
ray R;

R.origin = texelFetch(tex_origin, ivec2(gl_FragCoord.xy), 0).xyz;
R.direction = normalize(texelFetch(tex_direction,
                                   ivec2(gl_FragCoord.xy), 0).xyz);

// Start with the longest allowed ray length
float min_t = 1000000.0;
float t;

// Temporaries written by the intersection function, plus the best hit
// found so far (assumed to be declared elsewhere in the full shader)
vec3 hitpos, normal;
vec3 hit_position, hit_normal;
int sphere_index = -1;

// For each sphere...
for (int i = 0; i < num_spheres; i++)
{
    // Find the intersection point
    t = intersect_ray_sphere(R, S[i], hitpos, normal);

    // If there is an intersection
    if (t != 0.0)
    {
        // And that intersection is less than our current best
        if (t < min_t)
        {
            // Record it.
            min_t = t;
            hit_position = hitpos;
            hit_normal = normal;
            sphere_index = i;
        }
    }
}


Listing 12.37: Determining closest intersection point


Figure 12.34: Our first ray-traced sphere

If all we do at each point is write white wherever we hit something, and then trace rays into a scene containing a single sphere, we produce the image shown in Figure 12.34.


Figure 12.35: Our first lit ray-traced sphere

However, this isn't particularly interesting; we'll need to light the point. The surface normal is important for lighting calculations (as you have read already in this chapter), and it is returned by our intersection function. We perform lighting calculations as normal in the ray tracer: taking the surface normal, the view-space coordinate (calculated during the intersection test), and the material parameters, we shade the point. By applying the lighting equations you've already learned about, we can retrieve the image shown in Figure 12.35.
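
As a minimal sketch, shading the hit point in the spirit of the Phong model covered earlier in this chapter might look like the following. The function name shade_hit_point and the uniforms for the light and material are hypothetical names introduced only for this illustration.


// Hypothetical uniforms for a single point light and material
uniform vec3  light_position;
uniform vec3  diffuse_albedo;
uniform vec3  specular_albedo;
uniform float specular_power;

// Shade the closest hit using the data produced by the intersection test
vec3 shade_hit_point(vec3 hit_position, vec3 hit_normal, vec3 ray_direction)
{
    vec3 L = normalize(light_position - hit_position);  // light vector
    vec3 V = -ray_direction;                            // view vector (towards the eye)
    vec3 R = reflect(-L, hit_normal);                   // reflected light vector

    vec3 diffuse  = max(dot(hit_normal, L), 0.0) * diffuse_albedo;
    vec3 specular = pow(max(dot(R, V), 0.0), specular_power) * specular_albedo;

    return diffuse + specular;
}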

Although the normal is used in lighting calculations, it is also very important for the next few steps in the ray tracer. For each light in the scene, we calculate its contribution to the surface's shading and accumulate this to produce the final color. This is where the first real advantage of ray tracing comes in. Given a surface point P and a light coordinate L, we form a new ray, setting its origin O to P and its direction D to the normalized vector from P to L. This is known as a shadow ray (pictured as Rshadow in Figure 12.33). We can then test the objects in the scene to see if the light is visible from that point: if the ray doesn't hit anything, then there is a line of sight from the point being shaded to the light; otherwise, it is occluded and therefore in shadow. As you can imagine, shadows are something that ray tracers do very well.
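
A sketch of such a shadow test, built from the pieces we already have, might look like this. It reuses the ray and sphere structures and the intersect_ray_sphere function from Listing 12.36 along with the sphere array S from Listing 12.37; the point_visible_to_light name, the light_position parameter, the num_spheres uniform, and the small offset along the normal are assumptions made for this illustration.


// Hypothetical helper: returns true if nothing in the scene blocks the
// path from the shaded point P to the light at light_position.
bool point_visible_to_light(vec3 P, vec3 N, vec3 light_position)
{
    ray R_shadow;

    // Nudge the origin along the normal so the shadow ray doesn't
    // immediately re-intersect the surface it started on.
    R_shadow.origin = P + N * 0.001;
    R_shadow.direction = normalize(light_position - P);

    float dist_to_light = length(light_position - P);
    vec3 hitpos, normal;

    // Brute-force test against every sphere in the scene
    for (int i = 0; i < num_spheres; i++)
    {
        float t = intersect_ray_sphere(R_shadow, S[i], hitpos, normal);
        if (t != 0.0 && t < dist_to_light)
            return false;       // Something lies between the point and the light
    }

    return true;                // Clear line of sight; the point is lit
}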

However, it doesn't end there. Just as we constructed a new ray starting from our intersection and pointing in the direction of our light source, we can construct a ray pointing in any direction. For example, given that we know the surface normal at the ray's intersection with the sphere, we can use GLSL's reflect function to reflect the incoming ray direction around the plane defined by this normal and shoot a new ray away from the surface in this direction. This ray is simply sent as input to our ray tracing algorithm, the intersection point it generates is shaded, and the resulting color is added to the color already accumulated for the pixel.
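
As a minimal sketch, and assuming that hit_position and hit_normal hold the results of the closest-hit search from Listing 12.37, forming the ray for the next bounce might look like this (the small offset along the normal, as with the shadow ray above, simply avoids immediate self-intersection):


// Construct the ray for the next bounce by reflecting the incoming
// ray direction about the surface normal at the hit point.
ray R_reflected;

R_reflected.origin    = hit_position + hit_normal * 0.001;
R_reflected.direction = reflect(R.direction, hit_normal);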

You may have noticed in Listing 12.37 that at each pixel, we read an origin and a direction from a texture. Ray tracing is a recursive algorithm — you trace a ray, shade the point, create a new ray, trace it, and continue. GLSL doesn’t allow recursion, so instead, we implement it using a stack maintained in an array of textures.

To maintain all the data that we’ll need for our ray tracer, we create an array of framebuffer objects, and to each we attach four textures as color attachments. These hold, for each pixel in the framebuffer, the final composite color, the origin of a ray, the current direction of the ray, and the accumulated reflected color of the ray. In our application, we allow each ray to take up to five bounces, and we need five framebuffer objects, each with four textures attached to it. The first (the composite color) is common to all framebuffer objects, but the other three are unique to each framebuffer. During each pass, we read from one set of textures and write into the next set via the framebuffer object. This is illustrated in Figure 12.36.


Figure 12.36: Implementing a stack using framebuffer objects

To initialize our ray tracer, we run a shader that writes the starting origin and ray direction into the first origin and direction textures. We also initialize our accumulation texture to zeros, and our reflection color texture to all ones. Next, we run our actual ray tracing shader by drawing a full-screen quad once for each bounce of the rays we want to trace. On each pass, we bind the origin, direction, and reflected color textures from the previous pass. We also bind a framebuffer that has the outgoing origin, direction, and reflection textures attached to it as color attachments — these textures will be used in the next pass. Then, for each pixel, the shader forms a ray using the origin and direction stored in the first two textures, traces it into the scene, lights the intersection point, multiplies the result by the value stored in the reflected color texture, and sends it to its first output.
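
A minimal sketch of such an initialization shader is shown below, under a few assumptions made for this illustration: the ray_position input carries the view-space position of the fragment on the image plane (interpolated across the full-screen quad), the eye sits at the view-space origin, and the composite color texture is cleared separately rather than written here.


#version 430 core

// Outputs land in the origin, direction, and reflected color textures
// attached to the first framebuffer in the stack.
layout (location = 0) out vec3 origin_out;
layout (location = 1) out vec3 direction_out;
layout (location = 2) out vec3 reflected_out;

// Hypothetical input: the view-space position of this fragment on the
// image plane, interpolated across the full-screen quad.
in vec3 ray_position;

void main(void)
{
    // The eye is at the view-space origin, so the primary ray starts on
    // the image plane and points away from the eye through it.
    origin_out    = ray_position;
    direction_out = normalize(ray_position);
    reflected_out = vec3(1.0);      // No reflections accumulated yet
}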

To enable composition into the final output texture, we attach it to the first color attachment of each framebuffer object and enable blending for that attachment with the blending function set to GL_ONE for both the source and destination factors. This causes the output to be simply added to the existing content of that attachment. To the other outputs, we write the intersection position, the reflected ray direction, and the reflectivity coefficient of the material that we use for shading the ray’s intersection point.
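
Inside the ray tracing shader, this maps to one output per color attachment. A sketch of the output declarations and the final writes, continuing the fragment from Listing 12.37 and using hypothetical names for the shaded color and material reflectivity (the order of the non-composite attachments is also an assumption), might look like this:


// One output per color attachment of the current framebuffer object
layout (location = 0) out vec3 color_out;      // additively blended composite
layout (location = 1) out vec3 origin_out;     // ray origin for the next pass
layout (location = 2) out vec3 direction_out;  // ray direction for the next pass
layout (location = 3) out vec3 reflected_out;  // accumulated reflectivity

// ... trace the ray R and shade the closest hit ...

// current_reflection was read from the incoming reflected color texture;
// shaded_color, hit_position, hit_normal, and surface_reflectivity come
// from the intersection and shading code above.
color_out     = shaded_color * current_reflection;
origin_out    = hit_position + hit_normal * 0.001;
direction_out = reflect(R.direction, hit_normal);
reflected_out = current_reflection * surface_reflectivity;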

If we add a few more spheres to the scene, we can have them reflect each other by applying this technique. Figure 12.37 shows the scene with a few more spheres thrown in, rendered with an increasing number of bounces for each ray.


Figure 12.37: Ray-traced spheres with increasing ray bounces

As you can see in Figure 12.37, the top-left image (which includes no secondary rays) is pretty dull. As soon as we introduce the first bounce in the top-right image, we begin to see reflections of the spheres. Adding a second bounce in the bottom left, we can see reflections of spheres in the reflections of the spheres. With the third bounce on the lower right, the effect is more subtle, but if you look very closely, you can see reflections of spheres in spheres in spheres.

Now, a scene made entirely of spheres really isn't very exciting. What we need to do is add more object types. Although in theory any object could be ray traced, another form that is relatively easy to perform intersection tests with is the plane. One representation of a plane is its normal (which is constant across the plane) and the distance from the origin to the point on the plane that lies along that normal. The normal is a three-dimensional vector, and the distance is a scalar value. As such, we can describe a plane with a single four-component vector. We pack the normal into the x, y, and z components of the vector and the distance from the origin into the w component. In fact, given a plane normal N and distance from origin d, the implicit equation of a plane can be represented as

P · N + d = 0

where P is a point in the plane. Given that we have P, a point on our ray defined as

P = O + tD

we can simply substitute this value of P into the implicit equation to retrieve

(O + tD) · N + d = 0

Solving for t, we arrive at

t = –(O · N + d) / (D · N)

As you can see from the equation, if D · N is zero, then the denominator of the fraction is zero and there is no solution for t. This occurs when the ray direction is parallel to the plane (thus, it is perpendicular to the plane's normal and their dot product is zero), and so the ray never intersects it.

Otherwise, we can find a real value for t. Again, once we know the value of t, we can substitute it back into our ray equation, P = O + tD, to retrieve our intersection point. If t is less than zero, then we know that the ray intersects the plane behind the viewer, which we consider here to be a miss. Code to perform this intersection test is shown in Listing 12.38.


float intersect_ray_plane(ray R,
                          vec4 P,
                          out vec3 hitpos,
                          out vec3 normal)
{
    vec3 O = R.origin;
    vec3 D = R.direction;

    // Unpack the plane: normal in .xyz, distance from origin in .w
    vec3 N = P.xyz;
    float d = P.w;

    // If the ray is parallel to the plane, there is no intersection
    float denom = dot(N, D);

    if (denom == 0.0)
        return 0.0;

    float t = -(d + dot(O, N)) / denom;

    // Intersections behind the viewer count as a miss
    if (t < 0.0)
        return 0.0;

    hitpos = O + t * D;
    normal = N;

    return t;
}


Listing 12.38: Ray-plane intersection test

Adding a plane to our scene behind our spheres produces the image shown on the left of Figure 12.38. Although this adds some depth to our scene, it doesn’t show the full effect of the ray tracer. By adding a couple of bounces, we can clearly see the reflections of the spheres in the plane, and of the plane in the spheres.


Figure 12.38: Adding a ray-traced plane

Now, if we add a few more planes, we can enclose our scene in a box. The resulting image is shown on the top left of Figure 12.39. However, now, when we bounce the rays further, the effect of reflection becomes more and more apparent. You can see the result of adding more bounces as we progress from left to right, top to bottom in Figure 12.39 with no bounces, one, two, and three bounces, respectively. A higher resolution image using four bounces is shown in Color Plate 14.


Figure 12.39: Ray-traced spheres in a box

The ray tracing implementation presented here and in the raytracer example application is a brute force approach that simply intersects every ray against every object. As your objects get more complex and the number of them in the scene grows, you may wish to implement acceleration structures. An acceleration structure is a data structure constructed in memory that allows you to quickly determine which objects might be hit by a ray given an origin and a direction. As you have seen from this example, ray tracing is actually pretty easy so long as you know an intersection algorithm for your primitive of choice. Shadows, reflections, and even refraction just come for free with ray tracing. However, ray tracing is certainly not cheap, and without dedicated hardware support, it leaves a lot of work for you to do in your shaders. Using an acceleration structure is vital if you really want to ray trace scenes containing more than a handful of spheres and a bunch of planes in real time. Current research in ray tracing is almost entirely focused on efficient acceleration structures and how to generate, store, and traverse them.

Summary

In this chapter, we have applied the fundamentals that you have learned throughout the book to a number of rendering techniques. At first, we focused heavily on lighting models and how to shade the objects that you’re drawing. This included a discussion of the Phong lighting model, the Blinn-Phong model, and rim lighting. We also looked at how to produce higher frequency lighting effects than are representable by your geometry by using normal maps, environment maps, and other textures. We showed how you can cast shadows and simulate basic atmospheric effects. We also discussed some techniques that have no basis in reality.

In the final section, we stepped away from shading our geometry at the same time as rendering it and looked at some techniques that can be applied in screen space. Deferred shading allows expensive shading calculations to be decoupled from the initial pass that renders our geometry. By storing positions, normals, colors, and other surface attributes in framebuffer attachments, we are able to implement arbitrarily complex shading algorithms without worrying about wasting work. At first, we used this to apply standard lighting techniques only to pixels we know will be visible. However, with screen space ambient occlusion, we demonstrated a technique that relies on having data from neighboring pixels available in order to function at all. Ultimately, we introduced the topic of ray tracing, and in our implementation, we rendered an entire scene without a single triangle.