Tuesday, May 17, 2016

Fine Pruned Tiled Lighting

Tiled lighting techniques have gained significant interest in recent years. However, a problem with tiled lighting compared to traditional DirectX 9 style deferred lighting (additive alpha blending) is the significant number of false positives, largely due to intersection testing against coarse bounding volumes. This is particularly relevant when supporting lights/volumes of different shapes such as long narrow spot lights, wedges, capsules etc. The downside of DirectX 9 style lighting is that it does not work with forward lighting, and having thousands of lights is extremely expensive due to the setup cost per light, overlapping reads from the G-buffer and overlapping writes to the frame buffer.

During the development of Rise of the Tomb Raider (ROTR) we came up with a new tiled lighting variant which we named Fine Pruned Tiled Lighting (FPTL) and which we describe in GPU Pro 7. There are many details to the full implementation discussed in the article and I will not go over them here, but the main point is that the cost of fine pruning can easily be absorbed by using asynchronous compute. This implies we obtain a light list with a very small number of false positives almost for free.

As explained in the article, the technique will work with essentially any methodology such as deferred shading, pre-pass deferred, tiled forward and even hybrids between these. A demo sample is available, though it was written in vanilla DirectX 11, which implies the asynchronous compute part is left as an exercise for the reader! The demo shows a single terrain mesh lit by 1024 lights (heat map and fine pruning enabled by default). For simplicity the demo is set up as tiled forward, though on ROTR we used a hybrid where we supported pre-pass deferred, tiled forward and conventional forward.

When running the demo you will notice that it runs faster with fine pruning enabled than disabled, despite the fact that there is no asynchronous compute in the demo (since it is standard DX11). However, the improvement in speed is of course much more significant when asynchronous compute is used correctly.

Another interesting aspect of the implementation is that we determine screen-space AABBs around each light (regardless of type of shape) on the GPU. This allows us to reduce coverage significantly for partially visible lights (which accelerates fine pruning) and reduces pressure on registers during light list generation (explained in the article). Additionally, we keep light lists sorted by type of shape to minimize the chance of thread divergence during tiled forward lighting.
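To make the idea of a screen-space AABB concrete, here is a naive baseline sketch. It is not the method from the GPU Pro 7 article. It assumes a perspective projection, a hypothetical input of eight world-space corners enclosing the light, and a view-projection matrix used with column vectors (swap the mul() arguments for the other convention).

```hlsl
// Naive baseline sketch, NOT the method from the GPU Pro 7 article.
// vCorners:  8 world-space corners of a box enclosing the light (hypothetical input).
// mViewProj: view-projection matrix; the mul() order assumes column vectors.
// Returns a conservative AABB in NDC as (min.x, min.y, max.x, max.y).
// If min > max on either axis the light is entirely off screen.
float4 ScreenSpaceAABB(float3 vCorners[8], float4x4 mViewProj)
{
    float2 vMin = float2( 1.0e30,  1.0e30);
    float2 vMax = float2(-1.0e30, -1.0e30);
    bool bBehindEye = false;

    for (int i = 0; i < 8; i++)
    {
        float4 vClip = mul(mViewProj, float4(vCorners[i], 1.0));
        if (vClip.w <= 0.0)
        {
            bBehindEye = true;   // the projection is not meaningful for this corner
        }
        else
        {
            float2 vNdc = vClip.xy / vClip.w;
            vMin = min(vMin, vNdc);
            vMax = max(vMax, vNdc);
        }
    }

    // If the box straddles the eye plane, fall back to the whole screen.
    if (bBehindEye)
        return float4(-1.0, -1.0, 1.0, 1.0);

    // Clip to the visible part of the screen.
    return float4(max(vMin, float2(-1.0, -1.0)), min(vMax, float2(1.0, 1.0)));
}
```

A partially visible light gets a smaller rectangle after the final clamp, which is the coverage reduction mentioned above. A tighter bound per light shape is possible but is beyond this sketch.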

For more information on the details....Buy GPU Pro 7! :)






Sunday, October 6, 2013

Volume Height Maps and Triplanar Bump Mapping


I know it has been a while, but I would like to emphasize an easily missed contribution in my paper Bump Mapping Unparametrized Surfaces on the GPU. The new surface gradient based formulation (eq. 4) of Jim Blinn's perturbed normal unifies conventional bump mapping and bump mapping from height maps defined on a volume. In other words, you do not have to bake such heights to a 2D texture first to generate Jim Blinn's perturbed normal. The formula also suggests that a well known method proposed by Ken Perlin is wrong. For a more direct visual confirmation, the difference is shown in figure 2. The black spots on the left are caused by subtracting the volume gradient, which causes the normal to implode. The result on the right uses the new surface gradient based formulation and, as you can see, the errors are gone.

To give an example of how the theoretical math might apply to a real world practical case
let us take a look at triplanar projection. In this case we have a parallel projection
along each axis and we use a derivative/normal map for each projection.

Let's imagine we have three 2D height maps.

H_x: (s,t) --> R
H_y: (s,t) --> R
H_z: (s,t) --> R

The corresponding two channel derivative maps are represented as (dH_x/ds, dH_x/dt) and similarly for the other two height maps. Using the blend weights described on page 64 in the presentation by Nvidia on triplanar projections (let us call them w0, w1 and w2), the mixed height function defined on a volume can now be given as

H(x,y,z) = H_z(x,y)*w0 + H_x(z,y)*w1 + H_y(z,x)*w2

The goal is to use equations 2 and 4 on it to get the perturbed normal.
However, notice the blend weights w0, w1 and w2. These are not really constant.
They depend on the initial surface normal. In my paper I mention that the height map
defined on a volume does not have to be global. It can be a local extension on an open
subset of R^3. What that means (roughly) is at every given point we just need to know some local height map function defined on an arbitrarily small volume which contains the point.

At every given point we can assume the initial normal is nearly constant within some small neighborhood. This is the same approximation already applied by Jim Blinn himself: the normal is constant "enough" that its contributions, as a varying function, to the gradient of H are negligible. In other words, in the local height map we treat w0, w1 and w2 as constants.
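As a rough illustration of weights that depend only on the initial normal, here is a sketch. The weighting function is a common generic choice and is an assumption, not necessarily the one from the Nvidia presentation. The helper names and the fSharpness parameter are hypothetical. The ordering matches H above: w0 goes with H_z, w1 with H_x and w2 with H_y.

```hlsl
// Hypothetical helper: blend weights from the normalized initial normal vN.
// Returned as (w0, w1, w2) where w0 -> H_z, w1 -> H_x, w2 -> H_y.
float3 TriplanarWeights(float3 vN, float fSharpness)
{
    // fSharpness > 0; larger values give tighter transitions between projections.
    float3 a = pow(abs(vN), fSharpness);
    a /= (a.x + a.y + a.z);          // weights sum to one
    return float3(a.z, a.x, a.y);
}

// Texture coordinates for the three parallel projections, matching
// H_z(x,y), H_x(z,y) and H_y(z,x) in the definition of H.
void TriplanarCoords(float3 vPos, out float2 stZ, out float2 stX, out float2 stY)
{
    stZ = vPos.xy;
    stX = vPos.zy;
    stY = vPos.zx;
}
```

Since the weights are evaluated from the normal at the shaded point and then held fixed, they enter the gradient of H only as constant factors.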

Now to evaluate equation 2 using H as input function we need to find the regular gradient of H

Grad(H) = w0 * Grad(H_z) + w1 * Grad(H_x) + w2 * Grad(H_y)

        = w0*[ dH_z/ds   dH_z/dt   0       ]^T +
          w1*[ 0         dH_x/dt   dH_x/ds ]^T +
          w2*[ dH_y/dt   0         dH_y/ds ]^T

          [ w0*dH_z/ds + w2*dH_y/dt ]
        = [ w0*dH_z/dt + w1*dH_x/dt ]
          [ w1*dH_x/ds + w2*dH_y/ds ]


Now that we have the gradient this can be used in equation 2 and subsequently equation 4. Another way to describe it is to use this gradient with Listing 3 in the paper.
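The following sketch shows how the gradient above might be assembled and used in a shader. It assumes that equation 2 removes the component of the gradient along the normal and that equation 4 subtracts the resulting surface gradient from the normal; check both against the paper. All function and parameter names are hypothetical and are not taken from the paper's Listing 3.

```hlsl
// Sketch only: names are hypothetical, not taken from the paper or the demo.
// dHz, dHx, dHy hold (dH/ds, dH/dt) read from the three derivative maps and
// w holds the blend weights (w0, w1, w2), treated as locally constant.
float3 TriplanarGradient(float2 dHz, float2 dHx, float2 dHy, float3 w)
{
    return float3(w.x*dHz.x + w.z*dHy.y,    // dH/dx
                  w.x*dHz.y + w.y*dHx.y,    // dH/dy
                  w.y*dHx.x + w.z*dHy.x);   // dH/dz
}

// Assumed form of eq. 2: project the gradient onto the tangent plane.
float3 SurfaceGradient(float3 vN, float3 vGrad)
{
    return vGrad - vN * dot(vN, vGrad);
}

// Assumed form of eq. 4: perturb the normalized normal vN by the surface gradient.
float3 PerturbNormal(float3 vN, float3 vGrad)
{
    return normalize(vN - SurfaceGradient(vN, vGrad));
}
```

A call would look like PerturbNormal(vN, TriplanarGradient(dHz, dHx, dHy, w)), with the three derivative samples already decoded and scaled to the same units as the position.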

The derivation in this post follows OpenGL conventions. In other words, it assumes a right-handed coordinate system in which Y is up and uses the lower left corner as the texture space origin. Let me know if you need help reordering things for D3D.


Sunday, March 10, 2013

Links to my papers and some Summaries

The jbit.net server still appears to be down, so I have decided to write a post listing some of my papers with links to their alternative URLs.

One paper some have been asking for recently is my skin rendering paper from 2010:




Skin Rendering by Pseudo-Separable Cross Bilateral Filtering


In this paper I show how to convert the work of Eugene d'Eon accurately into screen-space using the cross bilateral filter. The 2D filter region is chosen on a per pixel basis such that it represents a rectangular bounding box of a projected disc. The disc has a constant radius in mm and is tangent to the surface at the pixel. Also, as mentioned in the paper, I never bothered to use more than one separable Gaussian pass, though Eugene suggests using six Gaussians. There is also a video available here.


Next there is the bump mapping paper and the accompanying source and binary.

Bump Mapping Unparametrized Surfaces on the GPU

In this paper I describe bump mapping on the GPU in a broader context which allows combining perturbations from multiple unwraps and procedural fields of all kinds. Since the subject has already been covered extensively in my previous posts I will not go over further details here.






A different paper I wrote, though somewhat more academic, is:


Microfacet Based Bidirectional Reflectance Distribution Function

The point for me when I did this paper was to develop a good solid understanding of what physically based specular reflection really is. The way I did this was by deriving the Torrance-Sparrow model from scratch. There is a frequent trend in the graphics community to consider physically based specular as essentially Fresnel plus the Beckmann surface distribution function. This leads me to believe a lot of people never read the Torrance-Sparrow paper. The paper does not attribute any major significance to the underlying choice of isotropic surface distribution function. An interesting and less known fact is that you can remap from a Beckmann to a normalized Phong and you won't be able to tell them apart. The observation is made in my section 2.4 but is also made on page 7 of a paper written by Walter et al. In practice I have found that for a normalized Phong specular power of 8.0 and above there is no visible difference between the two.

In regards to Fresnel, though relevant, the point to the Torrance-Sparrow paper was to dispute a previous model which attributes off-specular peaks to Fresnel exclusively. As is pointed out this model would not work for metals since in this case the Fresnel reflectance is closer to constant.

To add to the irony a term which is often marginalized in the graphics community is the shadow/masking term also sometimes referred to as the geometry factor. It's ironic because this term is essentially the fruit of the Torrance-Sparrow paper. This term combined with the division by the dot product between the view vector and the normal is how the model predicts off-specular peaks.


I also wrote a long paper in which I set out to understand the work of Henrik Wann Jensen and the work of Craig Donner:


Skin Rendering: Reflectance and Integration



It was a very long and hard road and I am having trouble coming up with something brief to say about the contents of the paper. It was very interesting to me academically, yet although I achieved my goal I still arrived at the conclusion that it isn't necessary to understand the model to its core to do good skin rendering. The important thing to know is that the reflectance profile should exhibit exponential decay, so a spike followed by a broad base is important. Further, we need most of the bleed in red, less in green and even less in blue (and of course no bleed in the specular).
That being said, if you are interested in all the details that led to the multi-pole based BSSRDF, this paper is a very good walk-through of the underlying details.

I wrote my master's thesis at DIKU (University of Copenhagen).

Simulation of Wrinkled Surfaces Revisited

It's a rather extensive analysis of bump mapping in its original form. Many things are studied, but a core element is normal mapping on low resolution geometry as done today and why we get unwanted lighting seams/discontinuities in our results. This is due to tangent space not being calculated the same way in game engines and bake tools, which causes bad errors in practice. Ideally, baking tools must allow developers to customize them such that the game developer can ensure the tool does the exact inverse of what the game engine and shader do. Such functionality has since been added to the very popular baking tool xNormal.


Finally some oldies:

DOT3 Normal Mapping on the PS2

Separating Plane Perspective Shadow Mapping.

Friday, February 24, 2012

Parallax/POM mapping and no tangent space.

I thought I would do yet another follow up post regarding thoughts on Parallax/POM mapping without the use of conventional tangent space. A lot of the terms involved are terms we already have when doing bump mapping using either derivative or height maps.

// terms shared between bump and pom
float2 TexDx = ddx(In.texST);
float2 TexDy = ddy(In.texST);
float3 vSigmaX = ddx(surf_pos);
float3 vSigmaY = ddy(surf_pos);
float3 vN = surf_norm;     // normalized
float3 vR1 = cross(vSigmaY, vN);
float3 vR2 = cross(vN, vSigmaX);
float fDet = dot(vSigmaX, vR1);

// specific to Parallax/POM
float3 vV = vView;   // normalized view vector in same space as surf_pos and vN
float2 vProjVscr = (1/fDet) * float2( dot(vR1, vV), dot(vR2, vV) );
float2 vProjVtex = TexDx*vProjVscr.x + TexDy*vProjVscr.y;

The resulting 2D vector vProjVtex is the offset vector in normalized texture space which corresponds to moving along the surface of the object by the plane projected view vector which is exactly what we want for POM. The remaining work is done the usual way.

The magnitude of vProjVtex (in normalized texture space) will correspond to the magnitude of the projected view vector at the surface. To obtain the third component of the transformed view vector the applied bump_scale must be taken into account. This is done using the following line of code:

float vProjVtexZ = dot(vN, vV) / bump_scale;

If we consider T the texture coordinate and the surface gradient of T a 2x3 matrix, then an alternative way to think of how we obtain vProjVtex is through the use of surface gradients, one per component of the texture coordinate, since each component represents a scalar field.

float2 vProjVtex = mul(SurfGrad(T), vView)


The first row of SurfGrad(T) is equal to (1/fDet)*(TexDx.x*vR1 + TexDy.x*vR2) and similar for the second row but using the .y components of TexDx and TexDy. In practice it doesn't really simplify the code much unless we need to transform multiple vectors but it's a fun fact :)
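For the case where several vectors need transforming, the two rows could be built once and reused. This is a minimal sketch using the terms TexDx, TexDy, vR1, vR2 and fDet from the listing earlier in this post. The function names are hypothetical.

```hlsl
// Sketch: build the two rows of SurfGrad(T) once.
void SurfGradTexCoord(float2 TexDx, float2 TexDy, float3 vR1, float3 vR2, float fDet,
                      out float3 vRow0, out float3 vRow1)
{
    float fInvDet = 1.0 / fDet;
    vRow0 = fInvDet * (TexDx.x*vR1 + TexDy.x*vR2);
    vRow1 = fInvDet * (TexDx.y*vR1 + TexDy.y*vR2);
}

// Equivalent to mul(SurfGrad(T), v): one dot product per row.
float2 ProjectToTex(float3 vRow0, float3 vRow1, float3 v)
{
    return float2(dot(vRow0, v), dot(vRow1, v));
}
```

Expanding ProjectToTex(vRow0, vRow1, vV) gives the same expression as TexDx*vProjVscr.x + TexDy*vProjVscr.y, so the result should match vProjVtex up to rounding.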

Note that one of the observations made in the paper is that the surface gradient of a given scalar field can be obtained using any parametrization of the surface and the field (eq. 3). For a scalar field defined on a volume we can (though not required) use eq. 2 instead to obtain the surface gradient.

Tuesday, January 3, 2012

How to do more generic mixing of derivative maps?

I have added a shader to the demo which shows how to do mixing of derivative maps in a more typical scenario in which auto bump scale is used and all derivative maps use the same base unwrap but with different scales and offsets on the texture coordinate. The auto bump scale is what allows us to achieve scale invariance like we are used to with normal mapping.


The location of the demo is still the same: source and binary. Also notice that dHdst_derivmap() can be made shorter by passing VirtDim / sqrt(abs(VirtDim.x * VirtDim.y)) as a constant. Furthermore, each component of this vector constant is ±1.0 when VirtDim.x == VirtDim.y. Finally, the scale and offset which are applied to the texture coordinate can be moved to the vertex shader. However, this requires the use of multiple interpolators so this might not be the preferred way.

One relevant thing to notice here is that the numerator VirtDim is a float2 and not part of the bump scale. It serves to convert the height derivatives to the normalized texture domain from which we apply the chain rule. The factor 1 / sqrt(abs(VirtDim.x * VirtDim.y)) on the other hand is part of the bump scale. The distinction is subtle when doing bump mapping but becomes relevant when we do parallax mapping.

If for instance you are using derivative maps made with xNormal or you are just using this way of auto calculating a bump scale then the expression for the bump scale is:

bump_scale = g_fMeshSpecificAutoBumpScale / sqrt(abs(VirtDim.x * VirtDim.y))

where g_fMeshSpecificAutoBumpScale is the square root of the average ratio of the surface to normalized texture area. There is code which shows how to calculate this in the demo. The final value bump_scale is the same as the value which is returned by SuggestInitialScaleBumpDerivative() in xNormal.
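The pieces of that expression can be sketched as follows. These helpers are hypothetical and are not copied from the demo or from xNormal. How the per-triangle ratios are averaged over the mesh is left as an assumption. HLSL syntax is used for consistency with the rest of these posts, though this kind of mesh preprocessing would more likely run on the CPU.

```hlsl
// Hypothetical helpers, not copied from the demo or from xNormal.
// Ratio of surface area to normalized texture area for one triangle.
float TriSurfToTexAreaRatio(float3 p0, float3 p1, float3 p2,
                            float2 t0, float2 t1, float2 t2)
{
    float fSurfArea = 0.5 * length(cross(p1 - p0, p2 - p0));
    float2 e1 = t1 - t0;
    float2 e2 = t2 - t0;
    float fTexArea = 0.5 * abs(e1.x*e2.y - e1.y*e2.x);
    return fSurfArea / max(fTexArea, 1.0e-12);   // guard against degenerate unwraps
}

// fAvgRatio: the per-triangle ratios averaged over the mesh (averaging scheme assumed).
float AutoBumpScale(float fAvgRatio, float2 VirtDim)
{
    float fMeshSpecificAutoBumpScale = sqrt(fAvgRatio);
    return fMeshSpecificAutoBumpScale / sqrt(abs(VirtDim.x * VirtDim.y));
}
```

The demo code remains the reference for the exact averaging; this sketch only mirrors the formula as stated here.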

Friday, December 16, 2011

So finally a no tangents bump demo is up!

So I wrote a small demo of the no tangents bump mapping technique from my paper and Andy Davies has worked with me by supplying the Gothic window model and textures.


There is both a binary and source code available. However, it is a Direct3D11 sample so for those with older cards......get an upgrade!

When running the Gothic window you can toggle between height map and derivative map on the M key.
For the model on the right there are no options since it's doing triplanar bump mapping.
In other words it's generating texture coordinates from three planar projections and then mixing based on the normal which would be significantly more cumbersome to achieve using conventional tangent space based normal mapping since no one likes to store three sets of tangent spaces.

I would also like to say thank you to Rune Stubbe of Square Enix for pointing out that triplanar is a good application for my method!

Finally, the triplanar example on the right is using three different textures (one for each plane). There is a derivative map, a height map and a procedural function. For anyone who's still not getting this: there is no texture unwrap in the triplanar case! :)

I'd also like to point out that the shader on the Gothic window is using an auto-generated bump scale to match xNormal so this sample is a good reference for that as well. The triplanar is a good reference for seeing how you can mix different kinds of derivatives. And this includes scenarios where these are obtained from different texture spaces.

That's it for this time.

Tuesday, December 13, 2011

Oh no! Quads only!

I have found in general it can be difficult to get hold of a control mesh that is quads only, well proportioned, and represents something "interesting".



For this reason I thought I'd make one available which was obtained by taking the third example from the bottom given here http://iat.ubalt.edu/summers/math/platsol.htm and then typing the given function:

2 - (cos(x + T*y) + cos(x - T*y) + cos(y + T*z) + cos(y - T*z) + cos(z - T*x) + cos(z + T*x)), T=golden ratio

into Maple and then having Maple apply marching cubes to triangulate it. A retopo of this mesh is available here:

http://jbit.net/~sparky/academic/icosym_ctrl_744quads.obj

Hope others will find this useful.
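For anyone who would rather evaluate the field directly (for example in a ray marcher or a custom polygonizer), here is the function above as a small sketch. The function name is hypothetical, and the iso value that was used for the marching cubes step is not stated in this post.

```hlsl
// The implicit function from above, evaluated at a point p = (x, y, z).
float IcosahedralField(float3 p)
{
    const float T = 1.6180339887;   // golden ratio, (1 + sqrt(5)) / 2
    return 2.0 - (cos(p.x + T*p.y) + cos(p.x - T*p.y) +
                  cos(p.y + T*p.z) + cos(p.y - T*p.z) +
                  cos(p.z - T*p.x) + cos(p.z + T*p.x));
}
```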