Mikkelsen and 3D Graphics

Tuesday, May 17, 2016

Fine Pruned Tiled Lighting

Tiled lighting techniques have gained significant interest in recent years. However, a problem with tiled lighting compared to traditional DirectX 9 style deferred lighting (additive alpha blending) is the large number of false positives, caused mainly by intersection testing against coarse bounding volumes. This is particularly relevant when supporting lights/volumes of different shapes such as long narrow spot lights, wedges, capsules etc. The downside to DirectX 9 style lighting is that it does not work with forward lighting, and having thousands of lights is extremely expensive due to the setup cost per light, overlapping reads from the G-buffer and overlapping writes to the frame buffer.

During the development of Rise of the Tomb Raider (ROTR) we came up with a new tiled lighting variant, which we named Fine Pruned Tiled Lighting (FPTL) and describe in GPU Pro 7. There are many details to the full implementation discussed in the article and I will not go over them here, but the main point is that the cost of fine pruning can easily be absorbed by using asynchronous compute. This means we obtain a light list with very few false positives almost for free.

As explained in the article, the technique works with essentially any methodology such as deferred shading, pre-pass deferred, tiled forward and even hybrids between these. A demo sample is available, though it is written in vanilla DirectX 11, which means the asynchronous compute part is left as an exercise for the reader! The demo shows a single terrain mesh lit by 1024 lights (heat map and fine pruning enabled by default). For simplicity the demo is set up as tiled forward, though on ROTR we used a hybrid where we supported pre-pass deferred, tiled forward and conventional forward.

When running the demo you will notice that it runs faster with fine pruning enabled than disabled, even though there is no asynchronous compute in the demo (since it is standard DX11). However, the speed improvement is of course much more significant when asynchronous compute is used correctly.

Another interesting aspect of the implementation is that we determine screen-space AABBs around each light (regardless of shape type) on the GPU. This allows us to reduce coverage significantly for partially visible lights (which accelerates fine pruning) and reduces pressure on registers during light list generation (explained in the article). Additionally, we keep light lists sorted by shape type to minimize the chance of thread divergence during tiled forward lighting.
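To make the two-stage idea concrete, here is a minimal CPU-side sketch in Python. It is not the ROTR or demo implementation: the `SphereLight` type, the function names, and the choice of a per-pixel point-in-sphere test for the fine pruning step are all assumptions made for illustration, and only sphere lights are handled.

```python
from dataclasses import dataclass

@dataclass
class SphereLight:          # hypothetical light record, for illustration only
    center: tuple           # view-space position (x, y, z)
    radius: float
    aabb_min: tuple         # screen-space AABB in pixels (x, y)
    aabb_max: tuple
    shape: int = 0          # shape/type id, used only for sorting

def coarse_cull(tile_min, tile_max, lights):
    """Indices of lights whose screen-space AABB overlaps the tile rectangle."""
    hits = []
    for i, l in enumerate(lights):
        if (l.aabb_min[0] <= tile_max[0] and l.aabb_max[0] >= tile_min[0] and
                l.aabb_min[1] <= tile_max[1] and l.aabb_max[1] >= tile_min[1]):
            hits.append(i)
    return hits

def fine_prune(coarse, pixel_positions, lights):
    """Keep a light only if at least one pixel of the tile lies inside its volume
    (assumed test: point-in-sphere on view-space pixel positions)."""
    kept = []
    for i in coarse:
        l = lights[i]
        r2 = l.radius * l.radius
        if any(sum((p[k] - l.center[k]) ** 2 for k in range(3)) <= r2
               for p in pixel_positions):
            kept.append(i)
    # sort by shape id so consecutive list entries tend to take the same branch
    return sorted(kept, key=lambda i: lights[i].shape)
```

In this sketch a light floating in front of a wall passes the coarse AABB test for every tile it covers on screen, but is dropped by the fine step because no visible pixel actually lies inside its volume.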

For more information on the details....Buy GPU Pro 7! :)

Sunday, October 6, 2013

Volume Height Maps and Triplanar Bump Mapping

Since I wrote this post I've written a new technical paper called "Surface Gradient Based Bump Mapping Framework" which does a better and more complete job describing the following.
All my papers can be found at https://mmikkelsen3d.blogspot.com/p/3d-graphics-papers.html

I know it has been a while, but I would like to emphasize an easily missed contribution in my paper Bump Mapping Unparametrized Surfaces on the GPU. The new surface gradient based formulation (eq. 4) of Jim Blinn's perturbed normal unifies conventional bump mapping and bump mapping from height maps defined on a volume. In other words, you do not have to bake such heights to a 2D texture first to generate Jim Blinn's perturbed normal. The formula also suggests that a well known method proposed by Ken Perlin is wrong. For a more direct visual confirmation, the difference is shown in figure 2. The black spots on the left are caused by subtracting the volume gradient, which causes the normal to implode. The result on the right uses the new surface gradient based formulation and, as you can see, the errors are gone.

To give an example of how the theoretical math might apply to a real-world practical case, let us take a look at triplanar projection. In this case we have a parallel projection along each axis and we use a derivative/normal map for each projection.

Let's imagine we have three 2D height maps.

H_x: (s,t) --> R
H_y: (s,t) --> R
H_z: (s,t) --> R

The corresponding two channel derivative maps are represented as (dH_x/ds, dH_x/dt) and similar for the other two height maps. Using the blend weights described on page 64 in the presentation by Nvidia on triplanar projections, let us call them w0-w2, the mixed height function defined on a volume can now be given as

H(x,y,z) = H_z(x,y)*w0 + H_x(z,y)*w1 + H_y(z,x)*w2

The goal is to use equations 2 and 4 on it to get the perturbed normal. However, notice the blend weights w0, w1 and w2. These are not really constant; they depend on the initial surface normal. In my paper I mention that the height map defined on a volume does not have to be global. It can be a local extension on an open subset of R^3. What that means (roughly) is that at every given point we just need to know some local height map function defined on an arbitrarily small volume which contains the point.

At every given point we can assume the initial normal is nearly constant within some small neighborhood. This is the same approximation Jim Blinn himself already applied. The normal is constant "enough", at least, that its contribution as a varying function to the gradient of H is negligible. In other words, in the local height map we treat w0, w1 and w2 as constants.

Now to evaluate equation 2 using H as input function we need to find the regular gradient of H

Grad(H) = w0 * Grad(H_z) + w1 * Grad(H_x) + w2 * Grad(H_y)

= w0*[ dH_z/ds, dH_z/dt, 0 ]^T +
  w1*[ 0, dH_x/dt, dH_x/ds ]^T +
  w2*[ dH_y/dt, 0, dH_y/ds ]^T

= [ w0*dH_z/ds + w2*dH_y/dt,
    w0*dH_z/dt + w1*dH_x/dt,
    w1*dH_x/ds + w2*dH_y/ds ]^T

Now that we have the gradient this can be used in equation 2 and subsequently equation 4. Another way to describe it is to use this gradient with Listing 3 in the paper.
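As a numeric sanity check, the mixing can be sketched in a few lines of Python. The function names are hypothetical, and the forms used for equations 2 and 4 (project the gradient onto the tangent plane, then subtract the result from the normal) are written here from memory of the paper, so treat them as an assumption and check against the paper before relying on them.

```python
import math

def triplanar_gradient(w, dHz, dHx, dHy):
    """Mix three 2-channel derivative samples (dH/ds, dH/dt) into Grad(H).

    w = (w0, w1, w2) are the blend weights, treated as local constants.
    H_z is sampled at (s,t) = (x,y), H_x at (z,y) and H_y at (z,x).
    """
    w0, w1, w2 = w
    return (w0 * dHz[0] + w2 * dHy[1],
            w0 * dHz[1] + w1 * dHx[1],
            w1 * dHx[0] + w2 * dHy[0])

def perturb_normal(n, grad):
    """Assumed eq. 2 and eq. 4: surface gradient, then n' = normalize(n - surf_grad)."""
    d = sum(a * b for a, b in zip(n, grad))
    surf_grad = tuple(g - d * a for g, a in zip(grad, n))   # remove the part along n
    v = tuple(a - s for a, s in zip(n, surf_grad))
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

# Upward facing point where only the Y projection contributes.
grad = triplanar_gradient((0.0, 0.0, 1.0), (0.0, 0.0), (0.0, 0.0), (0.5, 0.25))
print(grad, perturb_normal((0.0, 1.0, 0.0), grad))
```

Note that a gradient pointing purely along the normal leaves the normal unchanged in this sketch, because the projection step removes it before the subtraction.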

The derivation in this post follows OpenGL conventions. In other words, it assumes a right-handed coordinate system where Y is up and the lower left corner is the texture space origin. Let me know if you need help reordering things for D3D.

Sunday, March 10, 2013

Links to my papers and some Summaries

My homepage with links to my papers

Friday, February 24, 2012

Parallax/POM mapping and no tangent space.

Since I wrote this post I've written a new technical paper called "Surface Gradient Based Bump Mapping Framework" which does a better and more complete job describing the following.
All my papers can be found at https://mmikkelsen3d.blogspot.com/p/3d-graphics-papers.html

I thought I would do yet another follow-up post with some thoughts on Parallax/POM mapping without the use of conventional tangent space. Many of the terms involved are ones we already have when doing bump mapping using either derivative or height maps.

// terms shared between bump and pom
float2 TexDx = ddx(In.texST);      // screen-space derivatives of the texture coordinate
float2 TexDy = ddy(In.texST);
float3 vSigmaX = ddx(surf_pos);    // screen-space derivatives of the surface position
float3 vSigmaY = ddy(surf_pos);
float3 vN = surf_norm;             // normalized
float3 vR1 = cross(vSigmaY, vN);
float3 vR2 = cross(vN, vSigmaX);
float fDet = dot(vSigmaX, vR1);

// specific to Parallax/POM
float3 vV = vView;                 // normalized view vector in same space as surf_pos and vN
// projected view vector expressed in the (vSigmaX, vSigmaY) basis
float2 vProjVscr = (1/fDet) * float2( dot(vR1, vV), dot(vR2, vV) );
// the same vector expressed in normalized texture space
float2 vProjVtex = TexDx*vProjVscr.x + TexDy*vProjVscr.y;

The resulting 2D vector vProjVtex is the offset vector in normalized texture space which corresponds to moving along the surface of the object by the view vector projected onto the tangent plane, which is exactly what we want for POM. The remaining work is done the usual way.
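The claim can be checked numerically with a small Python sketch that mirrors the shader terms. The flat patch, the pixel-to-texture mapping and all the numbers are made up for illustration, and the view vector is left unnormalized to keep the values round (the mapping is linear in it).

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def proj_view_to_tex(tex_dx, tex_dy, sigma_x, sigma_y, n, v):
    """Python mirror of the shader terms; returns vProjVtex as a 2-tuple."""
    r1 = cross(sigma_y, n)
    r2 = cross(n, sigma_x)
    det = dot(sigma_x, r1)
    # coefficients of the projected view vector in the (sigma_x, sigma_y) basis
    sx, sy = dot(r1, v) / det, dot(r2, v) / det
    return (tex_dx[0] * sx + tex_dy[0] * sy,
            tex_dx[1] * sx + tex_dy[1] * sy)

# Hypothetical flat patch pos(s, t) = s*A + t*B with an arbitrary
# linear map from pixel (x, y) to texture coordinate (s, t).
A, B = (2.0, 0.0, 0.0), (0.0, 0.0, 3.0)
tex_dx, tex_dy = (0.01, -0.003), (0.002, 0.02)    # d(s,t)/dx and d(s,t)/dy
sigma_x = tuple(a * tex_dx[0] + b * tex_dx[1] for a, b in zip(A, B))
sigma_y = tuple(a * tex_dy[0] + b * tex_dy[1] for a, b in zip(A, B))
n = (0.0, 1.0, 0.0)
v = (1.0, 0.5, 1.5)          # in-plane part is 0.5*A + 0.5*B

print(proj_view_to_tex(tex_dx, tex_dy, sigma_x, sigma_y, n, v))
```

The in-plane part of v is 0.5*A + 0.5*B, so the result should come out as approximately (0.5, 0.5) no matter which invertible pixel-to-texture mapping is chosen.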

The magnitude of vProjVtex (in normalized texture space) will correspond to the magnitude of the projected view vector at the surface. To obtain the third component of the transformed view vector the applied bump_scale must be taken into account. This is done using the following line of code:

float vProjVtexZ = dot(vN, vV) / bump_scale;

If we consider T the texture coordinate and the surface gradient of T a 2x3 matrix, then an alternative way to think of how we obtain vProjVtex is through the use of surface gradients: one per component of the texture coordinate, since each of these represents a scalar field.

float2 vProjVtex = mul(SurfGrad(T), vView)

The first row of SurfGrad(T) is equal to (1/fDet)*(TexDx.x*vR1 + TexDy.x*vR2) and similar for the second row but using the .y components of TexDx and TexDy. In practice it doesn't really simplify the code much unless we need to transform multiple vectors but it's a fun fact :)
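Here is a small Python sketch of the matrix form, using a made-up flat patch pos(s, t) = s*A + t*B as test data. The helper names are hypothetical. For this patch the rows should come out as A/|A|^2 and B/|B|^2, both tangent to the surface.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def surf_grad_tex(tex_dx, tex_dy, sigma_x, sigma_y, n):
    """Rows of the 2x3 matrix SurfGrad(T), built as described in the text."""
    r1 = cross(sigma_y, n)
    r2 = cross(n, sigma_x)
    det = dot(sigma_x, r1)
    row0 = tuple((tex_dx[0] * a + tex_dy[0] * b) / det for a, b in zip(r1, r2))
    row1 = tuple((tex_dx[1] * a + tex_dy[1] * b) / det for a, b in zip(r1, r2))
    return row0, row1

def mul(rows, v):
    """2x3 matrix times 3-vector."""
    return (dot(rows[0], v), dot(rows[1], v))

# Made-up test data: flat patch with A = (2,0,0), B = (0,0,3).
tex_dx, tex_dy = (0.01, -0.003), (0.002, 0.02)
sigma_x, sigma_y = (0.02, 0.0, -0.009), (0.004, 0.0, 0.06)
n = (0.0, 1.0, 0.0)

G = surf_grad_tex(tex_dx, tex_dy, sigma_x, sigma_y, n)
print(G)                          # approximately ((0.5, 0, 0), (0, 0, 1/3))
print(mul(G, (1.0, 0.5, 1.5)))    # approximately (0.5, 0.5)
```

Once the two rows are built, every additional vector costs only two dot products, which is where the matrix form pays off.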

Note that one of the observations made in the paper is that the surface gradient of a given scalar field can be obtained using any parametrization of the surface and the field (eq. 3). For a scalar field defined on a volume we can (though not required) use eq. 2 instead to obtain the surface gradient.

Tuesday, January 3, 2012

How to do more generic mixing of derivative maps?

Since I wrote this post I've written a new technical paper called "Surface Gradient Based Bump Mapping Framework" which does a better and more complete job describing the following.
All my papers can be found at https://mmikkelsen3d.blogspot.com/p/3d-graphics-papers.html

I have added a shader to the demo which shows how to mix derivative maps in a more typical scenario, in which auto bump scale is used and all derivative maps use the same base unwrap but with different scales and offsets on the texture coordinate. The auto bump scale is what allows us to achieve scale invariance like we are used to with normal mapping.

The location of the demo is still the same: source and binary. Also notice that dHdst_derivmap() can be made shorter by passing VirtDim / sqrt(abs(VirtDim.x * VirtDim.y)) as a constant. Furthermore, the components of this vector constant are ±1.0 when VirtDim.x == VirtDim.y. Finally, the scale and offset which are applied to the texture coordinate can be moved to the vertex shader. However, this requires the use of multiple interpolators so it might not be the preferred way.

One relevant thing to notice here is that the numerator VirtDim is a float2 and not part of the bump scale. It serves to convert the height derivatives to the normalized texture domain from which we apply the chain rule. The factor 1 / sqrt(abs(VirtDim.x * VirtDim.y)) on the other hand is part of the bump scale. The distinction is subtle when doing bump mapping but becomes relevant when we do parallax mapping.

If, for instance, you are using derivative maps made with xNormal, or you are just using this way of automatically calculating a bump scale, then the expression for the bump scale is:

bump_scale = g_fMeshSpecificAutoBumpScale / sqrt(abs(VirtDim.x * VirtDim.y))

where g_fMeshSpecificAutoBumpScale is the square root of the average ratio of surface area to normalized texture area. There is code which shows how to calculate this in the demo. The final value bump_scale is the same as the value which is returned by SuggestInitialScaleBumpDerivative() in xNormal.
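For readers without the demo at hand, the calculation might look roughly like the Python sketch below. This is not the demo code: the function names are hypothetical, and the plain unweighted mean over triangles is an assumption, since the exact averaging is defined by the demo source.

```python
import math

def area3(p0, p1, p2):
    """Area of a triangle given three 3D points."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def area2(t0, t1, t2):
    """Area of a triangle in normalized texture space."""
    return 0.5 * abs((t1[0] - t0[0]) * (t2[1] - t0[1]) -
                     (t1[1] - t0[1]) * (t2[0] - t0[0]))

def mesh_auto_bump_scale(positions, uvs, triangles):
    """sqrt of the mean surface-to-texture area ratio (unweighted mean: an assumption)."""
    ratios = []
    for i0, i1, i2 in triangles:
        a2 = area2(uvs[i0], uvs[i1], uvs[i2])
        if a2 > 0.0:                      # skip triangles degenerate in UV space
            ratios.append(area3(positions[i0], positions[i1], positions[i2]) / a2)
    return math.sqrt(sum(ratios) / len(ratios))

def bump_scale(mesh_auto_scale, virt_dim):
    """bump_scale = g_fMeshSpecificAutoBumpScale / sqrt(abs(VirtDim.x * VirtDim.y))"""
    return mesh_auto_scale / math.sqrt(abs(virt_dim[0] * virt_dim[1]))

# A 4x4 quad mapped to the full unit square of texture space.
positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 0.0, 4.0), (0.0, 0.0, 4.0)]
uvs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
scale = mesh_auto_bump_scale(positions, uvs, triangles)
print(scale, bump_scale(scale, (512.0, 512.0)))
```

For the quad above each triangle has a surface area of 8 and a texture area of 0.5, so the mesh specific scale is sqrt(16) = 4 and the bump scale for a 512x512 map is 4/512.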

Friday, December 16, 2011

So finally a no tangents bump demo is up!

There is both a binary and source code available. However, it is a Direct3D11 sample so for those with older cards......get an upgrade!

When running the demo, you can toggle the Gothic window between height map and derivative map with the M key. For the model on the right there are no options since it is doing triplanar bump mapping. In other words, it generates texture coordinates from three planar projections and then mixes based on the normal. This would be significantly more cumbersome to achieve using conventional tangent space based normal mapping, since no one likes to store three sets of tangent spaces.

I would also like to say thank you to Rune Stubbe of Square Enix for pointing out that triplanar is a good application for my method!

Finally, the triplanar example on the right is using three different textures (one for each plane): a derivative map, a height map and a procedural function. For anyone who is still not getting this: there is no texture unwrap in the triplanar case! :)

I'd also like to point out that the shader on the Gothic window is using an auto-generated bump scale to match xNormal, so this sample is a good reference for that as well. The triplanar example is a good reference for seeing how you can mix different kinds of derivatives, including scenarios where these are obtained from different texture spaces.

That's it for this time.

Tuesday, December 13, 2011

Oh no! Quads only!

I have found in general it can be difficult to get hold of a control mesh that is quads only, well proportioned, and represents something "interesting".

For this reason I thought I'd make one available which was obtained by taking the third example from the bottom given here http://iat.ubalt.edu/summers/math/platsol.htm and then typing the given function:

2 - (cos(x + T*y) + cos(x - T*y) + cos(y + T*z) + cos(y - T*z) + cos(z - T*x) + cos(z + T*x)), T=golden ratio

into Maple and then having Maple apply marching cubes to triangulate it. A retopo of this mesh is available here:

http://jbit.net/~sparky/academic/icosym_ctrl_744quads.obj
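For anyone who wants to reproduce the input without Maple, the function is easy to evaluate directly. The Python sketch below assumes the surface is the zero level set of the function, and it only counts the grid cells that a marching cubes pass would have to triangulate; it does not perform the triangulation itself. The grid bounds and resolution are arbitrary.

```python
import math

T = (1.0 + math.sqrt(5.0)) / 2.0   # golden ratio

def f(x, y, z):
    return 2.0 - (math.cos(x + T * y) + math.cos(x - T * y) +
                  math.cos(y + T * z) + math.cos(y - T * z) +
                  math.cos(z - T * x) + math.cos(z + T * x))

def crossing_cells(lo, hi, n):
    """Count cells of an n*n*n grid over [lo, hi]^3 whose corners straddle f = 0."""
    step = (hi - lo) / n
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                vals = [f(lo + (i + a) * step, lo + (j + b) * step, lo + (k + c) * step)
                        for a in (0, 1) for b in (0, 1) for c in (0, 1)]
                if min(vals) < 0.0 < max(vals):
                    count += 1
    return count

print(f(0.0, 0.0, 0.0))                 # -4.0
print(crossing_cells(-4.0, 4.0, 8))     # number of cells cut by the surface
```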

Hope others will find this useful.