Lighting and Shadows

Key Points

  • Today's Lab: Lighting Lab
    • [ ] Why is lighting important in a 3D scene?
    • [ ] Why are shadows important in a 3D scene?
    • [ ] What are some differences between point, spot, and directional lights?
    • [ ] What is a normal and why is it important for lighting?
    • [ ] What is the difference between diffuse, specular, and ambient lighting components?
    • [ ] What is a material and how does it relate to shaders and textures?
    • [ ] How do games implement reflections?
    • [ ] How does forward rendering performance scale with the number of lights in a scene and the number of models in a scene?
      • [ ] What workarounds can you think of to make up for this?
    • [ ] How are blob shadows implemented? What are some drawbacks?
    • [ ] What is shadow-mapping and how does it fit into a rendering pipeline?
  • This series of articles can be a nice overview of different aspects of rendering.

Lighting

So far in our 3D scenes, every surface is evenly lit as if illuminated from within. This can make objects look flat, since our eyes rely on cues like differences in brightness to determine the shapes of objects (e.g., a solid color cube will look like a hexagon from almost all angles).

It's also difficult to perceive the relative positions of objects without cues like lighting and shadows—even though farther objects are drawn smaller, it is impossible to tell just by looking at two objects whether they are the same size (and they are equally close to the viewer) or whether one is larger (and further away). While texturing can help in this case, other similar situations are still problematic; it is difficult to tell where an object is relative to a ground plane if it does not cast a shadow, for example.

Lighting also gives us vital information about what objects are made of and what their physical properties might be. Smooth, soft objects tend not to take on harsh specular highlights (they scatter light fairly evenly), while reflective metal surfaces don't diffuse light (all the rays of light we see have come from a light or another object with no scattering). In our examples so far, the stone texture doesn't look very convincing because there is no self-shadowing or sense of depth in the material (as if it were just painted on plastic), and it can be hard to know whether the terrazzo-textured orbs are in the air or on the ground.

We need lighting and shadows to know where things are and perceive what they are made of. How do we fake that in 3D games?

Light Sources

Light comes from some type of light source. Photons are emitted and bounce off of objects, and some of these will reflect towards a viewer (in our case, the camera). Different lights emit photons in different ways, and we'll discuss three main types of lights.

First off is the directed light, mimicking something like the sun—infinitely far away, casting a uniform field of light oriented in a certain direction:

struct DirectedLight {
    dir: Vec3, // just a normalized direction vector is fine
    color: Vec3 // rgb intensities
}

Next is the point light, which is sort of the opposite: undirected, but located in space. It radiates light equally in all directions.

struct PointLight {
    pos: Pos3,
    color: Vec3
}

Finally we have the spot light, which has both position and orientation; it emits light in a cone shape defined by an angle θ and sometimes an attenuation exponent (so the center is brighter than the edges).

struct SpotLight {
    pos: Pos3,
    dir: Vec3,
    color: Vec3,
    theta: f32,
    att: f32
}

In practice, we sometimes will make one data structure that captures all three using some tricks; Panda3D does this for example. This can be a sensible move since GPUs mainly work with four-element vectors.

struct Light {
    pos: Pos4, // if w == 0.0, ignore; this is a directed light
    dir: Vec4, // if w == 0.0, ignore; this is a point light
    color: Vec4, // Pack theta, att into w coordinate
    // For point lights, theta is 2PI and att is 1
    // For directed lights, theta is PI and att is 1
    // The shader could ignore those for non-spot lights, or the spot math may just work out
}
impl Light {
    pub fn point(pos:Pos3, color:Vec3) -> Self {
        Self {
            pos: Pos4::new(pos.x, pos.y, pos.z, 1.0),
            dir: Vec4::new(0.0, 0.0, 0.0, 0.0),
            color: Vec4::new(
                color.x, color.y, color.z,
                Self::pack_theta_att(std::f32::consts::TAU, 1.0)
            )
        }
    }
    pub fn directed(dir:Vec3, color:Vec3) -> Self {
        Self {
            pos: Pos4::new(0.0, 0.0, 0.0, 0.0),
            dir: Vec4::new(dir.x, dir.y, dir.z, 1.0),
            color: Vec4::new(
                color.x, color.y, color.z,
                Self::pack_theta_att(std::f32::consts::PI, 1.0)
            )
        }
    }
    pub fn spot(pos:Pos3, dir:Vec3, color:Vec3, theta:f32, att:f32) -> Self {
        Self {
            pos: Pos4::new(pos.x, pos.y, pos.z, 1.0),
            dir: Vec4::new(dir.x, dir.y, dir.z, 1.0),
            color: Vec4::new(
                color.x, color.y, color.z,
                Self::pack_theta_att(theta, att)
            )
        }
    }
    fn pack_theta_att(theta:f32, att:f32) -> f32 {
        assert!(theta >= 0.0);
        assert!(theta <= std::f32::consts::TAU);
        assert!(att >= 0.0);
        assert!(att <= 1.0);
        // theta is in the 0..2PI range
        let theta = theta / std::f32::consts::TAU;
        // now theta is in 0..1
        // att is in 0..1 already
        f32::from_bits(
            (((theta * (u16::MAX as f32)) as u32) << 16) |
            ((att * (u16::MAX as f32)) as u32 & 0x0000_FFFF)
        )
    }
    // This would usually happen in a shader on the GPU,
    // but this is the logic we need:
    fn unpack_theta_att(theta_att:f32) -> (f32,f32) {
        let bits = theta_att.to_bits();
        let theta = (bits & 0xFFFF_0000) >> 16;
        let att = bits & 0x0000_FFFF;
        // normalize theta from 0..65535 to 0..1
        let theta = theta as f32 / (u16::MAX as f32);
        // back to the 0..2PI range
        let theta = theta * std::f32::consts::TAU;
        // now the same for att
        let att = att as f32 / (u16::MAX as f32);
        (theta, att)
    }
}
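
As a quick sanity check of the packing scheme, here's a hypothetical usage (assuming the Pos3/Vec3/Vec4 math types used above, and that this lives in the same module so the private helper is visible); the round trip loses a little precision since each value is squeezed into 16 bits:

fn main() {
    let lamp = Light::spot(
        Pos3::new(0.0, 3.0, 0.0),    // hanging above the origin
        Vec3::new(0.0, -1.0, 0.0),   // pointing straight down
        Vec3::new(1.0, 0.9, 0.8),    // warm white
        std::f32::consts::FRAC_PI_4, // 45-degree cone
        0.5,                         // gentle falloff exponent
    );
    // Pull theta and att back out of the color's w coordinate:
    let (theta, att) = Light::unpack_theta_att(lamp.color.w);
    // Each value was stored in 16 bits, so expect roughly 1/65535 of error:
    assert!((theta - std::f32::consts::FRAC_PI_4).abs() < 1e-3);
    assert!((att - 0.5).abs() < 1e-3);
}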

Lighting Models and Materials

We can't simulate every photon accurately (even ray tracing only samples from the emitted photons), so we need to fake it using statistical approximations. While we could explore the world of physically based rendering (where irradiance computations are combined with modeling of microfacets on materials to produce physically-sound, energy-conserving lighting computations), we'll start with a simpler model.

We want to come up with an equation to describe how much light hits a point on an object—i.e., how much of the object's color (from its texture, for example) is visible, based on the light's intensity and color. It's enough to capture a ratio between 0 and 1, with 0 meaning unlit by this light and 1 meaning lit at maximum intensity.

First, we'll fake the phenomenon of ambient lighting, which is a result of photons continuing to bounce around many, many times before expending their energy. Even objects that aren't directly illuminated—or the rear faces of such objects—are usually not completely dark, because stray light has bounced around the room or other objects and reached its dark side. This portion of the equation is just a constant \(\alpha\) which we add in. This constant could be set in a shader uniform or per-light.

The second phenomenon is the observation that all objects have microscopic bumps on them that absorb and disperse light of different frequencies differently, so when we look at a sheet or a curtain or a desk we mostly see the color of the object, modulated by the color of the light. We call this diffuse lighting. Statistically, we'll see the most rays from a light bouncing off of a surface (in all directions) when the light is directly pointed at the surface. We can imagine a linear model here where more "correct" angles use more of the light's intensity and less correct angles use less.

Mathematically, when lighting a fragment we need to find the incident angle of light onto the fragment. We can do this in one of two ways:

  • For point and spot lights, by subtracting the fragment's position (obtained by interpolating between its component vertices) from the light's position (given in a uniform) and normalizing;
  • For directed lights, the light is infinitely far away, so every fragment sees the same incident direction; we just use the light's direction (negated so it points toward the light, matching the point-light case).

If we normalize that incident light vector, then we can find the diffuse intensity by taking the dot product of the incident light vector and the surface's normal (clamping negative values, i.e. light arriving from behind the surface, to zero). If the light is coming from directly "above" the fragment from the fragment's perspective, it will be at maximum intensity. The further the light is from the normal, the less diffuse light the surface will scatter back.
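
As a rough sketch in Rust rather than shader code (following the CPU-side style above, and assuming a Vec3 type with the usual dot product; the names are hypothetical), the diffuse term might look like this:

// `normal` and `to_light` are assumed to be unit-length.
fn diffuse_intensity(normal: Vec3, to_light: Vec3) -> f32 {
    // Brightest when the surface faces the light head-on, fading to zero as
    // the light grazes the surface; the clamp handles light from behind.
    normal.dot(to_light).max(0.0)
}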

If it's a spot light, we also want to check the dot product \(d\) of that incident light vector against the direction of the light (with one of them flipped so both point the same way), and if \(d\) does not exceed the cosine of the light's θ parameter (i.e. the fragment is not inside the cone) then force the intensity to 0; otherwise, we want to attenuate the light intensity by taking something like \(((d - \cos(\theta))/(1 - \cos(\theta)))^a\), so that as the dot product shrinks (the fragment drifts from the cone's axis toward its edge) the numerator shrinks and the term falls from 1 at the center to 0 at the edge. Any similar falloff function will work here.
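
A minimal sketch of that cone test and falloff (names hypothetical; assumes unit-length Vec3s supporting negation and dot, with `to_light` pointing from the fragment toward the light and `spot_dir` pointing from the light into the scene):

fn spot_factor(to_light: Vec3, spot_dir: Vec3, theta: f32, att: f32) -> f32 {
    // Flip to_light so both vectors point away from the light, then measure
    // how closely the fragment lines up with the cone's axis.
    let d = (-to_light).dot(spot_dir);
    let cos_theta = theta.cos();
    if d <= cos_theta {
        0.0 // outside the cone: this light contributes nothing
    } else {
        // 1.0 along the axis, falling off toward 0.0 at the cone's edge.
        ((d - cos_theta) / (1.0 - cos_theta)).powf(att)
    }
}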

Finally, we have the specular light component: harsh reflections that we see on metallic objects or other shiny surfaces, where light has come from a light source and bounced off of a surface towards our viewpoint. This glare can be reduced by changing your viewpoint or tilting the object, so it's clear it's a situation where that incident light angle will come in handy.

For today's discussion we'll look at an approach which is not physically motivated but gives a decent effect at low computational cost: the Phong and Blinn-Phong models.

Unlike diffuse lighting, specular lighting depends on the angle between the viewer and the light reflected off the fragment. So in the Phong model, we reflect the incident light vector about the surface normal (using its dot product with the normal), then take the dot product of that reflected vector with the vector from the fragment to the camera. That value, raised to some given "shininess" exponent, scales the intensity of the specular highlight.

This approach is okay, but if the angle between the viewer and the reflected light is greater than 90 degrees, it breaks down. We need to provide some specularity in cases like this because no surface is perfectly smooth, and some of that reflected light may still have reached us. Blinn-Phong improves on Phong by using vectors which stay less than 90 degrees apart (as long as both the light and the viewer are above the surface): first, it computes the so-called half-angle vector, which is halfway between the incident light vector and the vector toward the viewpoint; second, it takes the dot product between the half-angle vector and the surface normal to determine the intensity of the specular reflection (if the two coincide, the viewer is looking straight down the reflected light ray and the highlight is brightest). Then that dot product is raised to a material-dependent exponent, as in Phong.
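
A sketch of the Blinn-Phong specular term under the same assumptions, with `to_view` pointing from the fragment toward the camera (again, the names are hypothetical):

fn specular_intensity(normal: Vec3, to_light: Vec3, to_view: Vec3, shininess: f32) -> f32 {
    // The half-angle vector sits halfway between the light and view directions.
    let half = (to_light + to_view).normalize();
    // The closer it is to the normal, the closer the viewer is to the
    // reflected ray; the shininess exponent tightens the highlight.
    normal.dot(half).max(0.0).powf(shininess)
}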

We haven't discussed it here, but lights also do tend to attenuate with distance, which can be nice visually and for efficiency reasons (e.g., culling lights that are too far to influence an object).

To obtain the final color of a fragment, we sum up the contributions of each light; each light contributes its ambient intensity multiplied by the diffuse color of the fragment, its diffuse intensity multiplied by the diffuse color, and its specular intensity multiplied by the fragment's specular color. These additions saturate at white (full intensity).
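
Putting the pieces together for one fragment might look something like the sketch below, using the hypothetical helpers above and the PointLight struct from earlier (and assuming glam-style Vec3 operations such as component-wise multiply and min, and that subtracting two Pos3s yields a Vec3). In a real engine this loop lives in the fragment shader:

fn shade(
    frag_pos: Pos3, normal: Vec3, cam_pos: Pos3,
    diffuse_color: Vec3, specular_color: Vec3,
    shininess: f32, ambient: f32,
    lights: &[PointLight],
) -> Vec3 {
    let to_view = (cam_pos - frag_pos).normalize();
    let mut out = Vec3::new(0.0, 0.0, 0.0);
    for light in lights {
        let to_light = (light.pos - frag_pos).normalize();
        let diff = diffuse_intensity(normal, to_light);
        let spec = specular_intensity(normal, to_light, to_view, shininess);
        // Ambient and diffuse intensities scale the fragment's diffuse color;
        // specular scales its specular color; everything is tinted by the light.
        out += (diffuse_color * (ambient + diff) + specular_color * spec) * light.color;
    }
    // Saturate at white rather than running past full intensity.
    out.min(Vec3::new(1.0, 1.0, 1.0))
}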

So far, this math has depended on four object-dependent properties: diffuse color, specular color (possibly the same as the diffuse color), a shininess constant, and surface normals. Any of these could be given by a UV-wrapped texture and sampled, which can give some interesting illusions (look up normal mapping or bump mapping if you'd like to learn more) and effects (e.g., a metallic hairband with high specularity on a character's head).

Note that because all color contributions must be summed up, lighting any object depends on iterating through all of the lights in the scene! So each draw call requires iterating through every light for every fragment, leading to some very expensive computations. Note also that we need to make a new draw call for each new combination of textures, uniform parameters, or (sometimes) models we're rendering. If there are some things in your scene that don't need to be realistically lit, consider drawing them with a different, less expensive shader.

Fancier Lighting

Most environments have more than a dozen or so lights, but lights are largely static. How do we render scenes like those?

For static geometry, we can fake it completely by baking the lights in advance: doing expensive lighting calculations when the level is exported, and exporting diffuse and specular information to be loaded by the game engine. We can also use a technique called environment mapping, where we put probes in the environment, render the scene from the viewpoint of each probe facing six different directions, and create a cube map which we can later sample like a texture to know how illuminated any point near that probe is. The more probes we have, the more accurate this lighting can be, but the more expensive it is to sample them. These static lights can be added into the lighting equations for dynamic objects as well.

A similar technique of rendering a scene into a cubemap can be used for reflections on metallic objects. If we have an environment map made from the perspective of the center of a reflective object, we can determine the color of a fragment on its surface by sampling the environment map and obtain reflections against static parts of the scene. If we want to have reflections of dynamic objects as well, or if the reflective object moves, we'll either need to re-render those environment maps often (expensive!) or do extra work to figure out what color from the dynamic object should be used for the reflection fragment.
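
The lookup direction itself comes from reflecting the view direction about the surface normal and sampling the cubemap with the result. A sketch of that reflection formula (assuming unit-length Vec3s; most shading languages also ship this as a built-in `reflect`):

// `incident` points from the camera toward the fragment; the returned vector
// is what we use to sample the environment cubemap.
fn reflect(incident: Vec3, normal: Vec3) -> Vec3 {
    incident - normal * (2.0 * incident.dot(normal))
}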

In the special case where the reflective object is a plane, the job of rendering reflections is easier: just render the scene again from the perspective of the reflective plane, either clipping it to the region of screen space occupied by the reflective plane or rendering it to a texture and mapping that onto the reflective plane.

Other important lighting effects include Fresnel and rim lighting. This extended tutorial has some great examples.

Interestingly, a lot of these fancy tricks are just things you get for free with raytracing, so it's kind of silly how hard it is in rasterization-based approaches. But rasterization is just so fast!

Shadows

The last topic for today is shadows. Shadows can help us figure out the relative positions of an object and the ground or other surfaces. The principle behind shadows is simple (and also trivial in raytracing): when light rays are blocked (occluded) by an object, they do not reach other objects which are behind the occluder! Unfortunately, our lighting equations would become horribly complex if we had to do collision checks along the incident light ray to figure out if a light should influence a fragment or not—so we have to do more hacks.

Blob Shadows

The simplest hack for shadows is to assume that light comes from only one direction, from somewhere mostly above the object toward the ground. We cast a ray from the character's feet along the light's direction (usually straight down) and, wherever it hits, place a small dark (sometimes translucent) circle or oval oriented along the normal of the surface it hit. For greater realism, in case the shadow should be cast across a couple of different surfaces, we might cast rays from multiple points and place several of these blobs.
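
As a sketch of the simplest possible case, a flat ground plane at y = 0 with the light pointing straight down (a real implementation would raycast against level geometry instead, and the types and numbers here are hypothetical):

struct BlobShadow {
    center: Pos3, // where to draw the dark, translucent decal
    radius: f32,
    alpha: f32,
}

fn blob_shadow_on_ground(feet: Pos3) -> BlobShadow {
    BlobShadow {
        // Drop straight down onto the plane, lifted a hair to avoid z-fighting.
        center: Pos3::new(feet.x, 0.001, feet.z),
        radius: 0.5,
        alpha: 0.4,
    }
}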

When combined with statically baked shadows, this can be fairly convincing. A more realistic way to achieve "I'm on the ground now" shadowing and more is screen space ambient occlusion, which I recommend looking up and reading about! We might discuss it in a couple of weeks.

Shadow Mapping

If we want to get more realistic shadows that take the object's shape into account, we need to do some more sophisticated tricks. Shadow mapping is an old technique that's still used today for forward rendering pipelines, and it works on the observation that if we couldn't "see" a fragment from the perspective of a light source pointed at it, then that fragment should be in shadow—i.e., it should receive no light from this source.

Of course, we don't want to determine shaded-ness for each light for each fragment one at a time. It would be great if we could somehow calculate which positions in space are illuminated by each light, and then use that information during our fragment shader to determine how shadowed the fragment is. We're trying to view the scene from some perspective… sounds like a job for another render pass!

In this case, we don't care about the color of objects from the perspective of the light. It suffices to figure out the depth buffer we would obtain if we rendered the scene from the light's point of view. We can then sample from that depth buffer to determine how shadowed a fragment is: first by converting the fragment's position in world coordinates to its position from the light's perspective (multiplying by the light's view and projection matrices, i.e. the inverse of the light's world transform followed by its projection), and then seeing if the depth of the depth buffer at that transformed \(x,y\) position is greater or lower than the transformed \(z\) value (which is how far in front of the light this fragment is). If the depth buffer sample is a lower value than the transformed fragment position's \(z\), then the rays from this light were blocked before reaching this fragment.
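
In Rust-flavored pseudocode (the real test happens in the fragment shader; this assumes glam-style Mat4/Vec4 types, a `light_view_proj` matrix built as described above, and a `sample_depth` closure standing in for the shadow-map texture lookup):

fn in_shadow(
    world_pos: Pos3,
    light_view_proj: Mat4,
    sample_depth: impl Fn(f32, f32) -> f32,
) -> bool {
    // World space to the light's clip space:
    let clip = light_view_proj * Vec4::new(world_pos.x, world_pos.y, world_pos.z, 1.0);
    // Perspective divide to reach normalized device coordinates:
    let (x, y, z) = (clip.x / clip.w, clip.y / clip.w, clip.z / clip.w);
    // Map x,y from [-1, 1] to [0, 1] so we can index the shadow map like a texture
    // (depending on the API, z may also need remapping to the depth buffer's range):
    let (u, v) = (x * 0.5 + 0.5, y * 0.5 + 0.5);
    // If something nearer to the light was recorded at this texel, the light was
    // blocked before reaching this fragment; the small bias fights "shadow acne".
    sample_depth(u, v) < z - 0.005
}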

We can figure out how shadow casting works for our three types of lights:

  1. For spot lights, render the scene from the light's position, pointed in its direction, with a perspective projection based on θ. This is the simplest case.
  2. For point lights, we render the scene six times from the light's position pointed in different directions to produce a cubemap; each time we use a perspective projection. Sampling the shadow cubemap is a little different.
  3. For directed lights, render the scene with an orthographic projection, since all light rays are parallel; but what position should we use, and what width and height should our orthographic "box" be? It takes some care to be sure the box is large enough to hold the objects within (or even a bit outside of!) the camera's frustum which might either cast or receive shadows. To a first approximation, you could gather nearby objects within the camera frustum, bound them with a box or other volume, and then add in objects on the light's side of that bounding box (in the direction the light is coming from), since they might cast shadows into it.

There are lots of improvements available for shadow maps to reduce artifacts like jagged shadow edges or shadows popping in and out; the main one you'll read about is cascaded shadow maps.

Rendering the scene an additional time (or 6 times!) every time the lights move or objects in the scene move is pretty rough. Deferred rendering is an approach to do lighting and shadows in screen space once per actually rendered fragment, rather than once per possibly-rendered fragment. It's a really cool technique.