
Psychopath Renderer

a slightly psychotic path tracer

2020-04-20

Light Trees and The Many Lights Problem

This post will walk you through a particular approach for efficiently importance sampling huge numbers of lights in a scene. I call the approach "Light Trees".

This post has been a long time coming. I actually wrote about this technique (tersely) back in 2014 on ompf2.com. But I never got around to writing about it properly on this blog. So I'm fixing that now!

Before I get started, I want to make it clear that I am not the first person to come up with this idea. In particular, the Manuka guys have apparently been doing something like this since the inception of their renderer. And there are several other people that apparently came up with pretty much the same technique around the same time as me.

Really, the core approach is super straightforward. And the fact that I (of all people) independently came up with it is a strong indication that it's pretty obvious. After all, I'm an animator professionally, not a software developer.

So my goal in this post is not to claim any credit for the idea. Quite the contrary—given how many people have independently come up with it, I don't think anyone can reasonably claim credit for it.

Rather, the goal of this post is to clarify and distill the approach, since the papers I've seen about it describe more detailed or complex variants. I discuss those papers and their contributions at the end of this post.

The Many Lights Problem

The problem we're trying to solve with light trees is sometimes called the "many lights" problem. If you're already familiar with this problem and with importance sampling, then please feel free to skip to the "Light Trees" section. These first two sections are just filling in that background.

In a typical forward path tracer, when you want to calculate lighting at a point on a surface, you generally do two things¹:

  1. Sample the light(s) in the scene to calculate direct lighting.
  2. Shoot one or more rays out in random directions to calculate indirect lighting.

The many lights problem shows up in that first step.

Calculating direct lighting is really easy if you only have one light: just shoot a shadow ray at the light, and if the ray reaches it, calculate the lighting. But what if there's more than one light? In that case, there are two broad approaches:

  1. Sample all lights.
  2. Pick one light at random.

The first approach—sampling all lights—works really well:

[Image: three lights, full sampling]

The second approach—sampling one light at random—also works well enough, but results in some noise due to the random selection:

[Image: three lights, random sampling]

Both approaches are fine. The former takes longer to render because it needs to do more calculations per shading point, but produces less noise. The latter renders faster but produces more noise, which takes more samples to get rid of. Which approach is better depends on the particular scene.

The many lights problem starts to show up when there are... eh... many lights. A large number of lights. For example, I have a test scene that contains roughly 8000 lights. Rendering this scene by sampling all lights looks like this²:

[Image: city scape, full sampling]

It doesn't look too bad. But it took 3 hours to render at only 16 samples per pixel, which is about 11 minutes per sample. That's because we have to spawn 8000 shadow rays on every ray hit, and potentially do as many lighting calculations. Imagine the render times on an actual production scene with even more lights, and at the sample rates needed to converge to a sufficiently noise-free image.

You might think that randomly selecting lights will save us. And, indeed, that brings our render time down to 6 seconds. But the result is so noisy that it's unusable:

[Image: city scape, random sampling]

That's because the probability of selecting a light that matters to the shading point is so low. Even if we crank our samples into the thousands, the render would still be somewhat noisy.

So it seems like we're stuck, then. Either our render times are intractable, or our renders are unusably noisy. And that, in a nutshell, is the many lights problem.

Importance Sampling

The answer to this problem is something called importance sampling³.

With approach #2 (random selection) we have a lot of wiggle room in how we select our lights, and the idea behind importance sampling is to use that wiggle room to sample important things with higher probability.

As an example, let's make a scene with two lights, with one of the lights much brighter than the other. If we render it with equal probability of selecting either light, it looks like this:

[Image: two lights, equal sampling]

But if we use importance sampling to select the brighter light with higher probability, it looks like this:

[Image: two lights, importance sampling]

So is that it, then? We just select the brighter lights more often, and we're good to go?

Unfortunately, no. You can get a hint of why that won't work by noticing that the area under the dimmer light actually gets more noisy in the importance sampled image.

The problem is that even though one light is brighter than the other, that doesn't mean it actually contributes more light to the point we're shading. For shading points that are sufficiently close to the dim light, the dim light is effectively the brighter one.

For a simple scene like the one above, that's not a big problem. But for something like the city scape scene it's a huge problem. First of all, all the lights in that scene are about the same brightness, so even with brightness-based importance sampling we would just be selecting them equally anyway. But also, even if some of the lights were brighter, what are the chances that any given light has much effect on any given shading point? Very low.

So what we actually want is to somehow sample the lights in proportion to the brightness they contribute to the current shading point. If we do that, things look way better:

[Image: two lights, light tree sampling]

But... how do we do that?

Light Trees

(To summarize for those that skipped to this section: our goal is to select a light from all lights in the scene according to their contributions to the current shading point.)

The naive way to do this would be to calculate the shading from all the lights, and then select a light based on those results. But that doesn't get us anywhere because then we're just sampling all the lights anyway.

So instead we need some efficient way to approximate that. And that's exactly what light trees let us do.

A light tree is a pretty straightforward data structure: it's just a BVH of the lights in the scene, where each node in the BVH acts as an approximation of the lights under it. For each node to be an effective approximation, though, we also need to store some extra data in it.

The simplest kind of light tree stores just the total energy of all the lights under each node. You can then treat each node as a light with the same size as the node's spatial bounds, emitting the same energy as the lights under it.

I'm going to use that simple light tree variant to illustrate the technique. But I'll discuss some possible extensions in a bit.
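
Just to make that data structure concrete before moving on, here's a minimal sketch of what a node might look like. The field names (and the choice to store a bounding sphere) are assumptions on my part; any BVH build over the lights works, as long as each node also stores the total energy of the lights beneath it:

class LightTreeNode:
    def __init__(self, bounds_center, bounds_radius, energy,
                 child_1=None, child_2=None, light=None):
        self.bounds_center = bounds_center  # Center of the node's spatial bounds.
        self.bounds_radius = bounds_radius  # Radius enclosing all lights under the node.
        self.energy = energy                # Total energy of all lights under the node.
        self.child_1 = child_1              # Child nodes (None on leaf nodes).
        self.child_2 = child_2
        self.light = light                  # The actual light (set only on leaf nodes).

def is_leaf(node):
    return node.light is not None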

Okay, so we have our light tree as described above, and we have our shading point on a surface. How do we select a light?

Starting from the root of the tree, we:

  1. Calculate the lighting (or an approximation thereof) from each of the child nodes.
  2. Take those lighting results and select which child to traverse into based on them. For example, if the left node contributes twice as much light, then we'll be twice as likely to traverse into it.
  3. If the selected child is a leaf, return the actual light under it. Otherwise, go back to step 1 with the selected child.

So that's the basic idea. But there are a couple of further details:

  • We also need to return the probability of selecting that light. This is because the calling code needs to weight that light's contribution based on that probability.
  • We need a good "approximate lighting" function to use in step 1. Incidentally, I like to call this the "weighting" function, because we only use it to determine the weighted probability of traversing into each node.

Calculating the light-selection probability is pretty easy: we just combine the probabilities of all the child nodes we traverse into, which is just multiplying them all together.

Coming up with a good weighting function is trickier. The ideal weighting function would exactly calculate the lighting from all the individual lights under the node. But if we could do that efficiently then we wouldn't need the light tree in the first place! So instead we need something that approximates that well. I'm going to punt on this for now, but we'll come back to it later. For the moment, just assume we have a reasonable weighting function.

Given all of those things, our light selection process is the following pseudocode, where tree_root is the root node of the light tree, shading_point is the data of the point we're shading, and n is a random number in the range [0, 1):

def light_tree_select(tree_root, shading_point, n):
    node = tree_root   # Updated as we traverse.
    selection_p = 1.0  # Selection probability, also updated as we go.
    
    while True:
        if is_leaf(node):
            return (node.light, selection_p)

        # Child node weights, or "approximate lighting".
        child_1_w = weight(node.child_1, shading_point)
        child_2_w = weight(node.child_2, shading_point)

        # Normalized probabilities of traversing into either child.
        child_1_p = child_1_w / (child_1_w + child_2_w)
        child_2_p = 1.0 - child_1_p

        # Select the child for the next loop.
        if n < child_1_p:
            node = node.child_1
            selection_p *= child_1_p # Accumulate probability.
            n /= child_1_p           # Scale random number.
        else:
            node = node.child_2
            selection_p *= child_2_p        # Accumulate probability.
            n = (n - child_1_p) / child_2_p # Scale random number.

And that's it! Well, modulo some floating point corner cases. But that's the entire thing. Really.

One thing I didn't mention yet: you need to scale the random number on each traversal step. That's what lets us get away with using just one number, instead of needing a new random number every time we choose a child. I'm feeling too lazy to explain how it works in detail, but you can think of it as the tree whittling down the space of the random number as it traverses deeper and deeper.
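
For completeness, here's a hypothetical sketch of how the returned light and probability might be used by the caller. shade_direct() is a placeholder for the usual shadow ray plus BSDF-times-light calculation, not a function from Psychopath:

def direct_lighting(tree_root, shading_point, n):
    light, p = light_tree_select(tree_root, shading_point, n)

    # Shadow ray, BSDF evaluation, light falloff, etc. all happen in here.
    contribution = shade_direct(shading_point, light)

    # Dividing by the selection probability keeps the estimate unbiased:
    # lights that are rarely selected count for more when they are selected.
    return contribution / p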

Anyway, if we implement that pseudo code, we get this beautiful result at 16 samples per pixel:

[Image: city scape, basic light tree, 16spp]

This particular render uses a pretty simple weighting function: it assumes the shading point is Lambertian, treats the node as a spherical light, and just uses the analytic formula for Lambert shading from a spherical light⁴.

That isn't an especially good weighting function for anything other than diffuse surfaces. But it works surprisingly well in many cases. It's definitely not sufficient for a robust implementation, however.
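
For illustration only, here's a rough sketch of a weighting function in that same spirit. It is not the exact formula from Snyder's paper; it just clamps the distance to the node's bounds and applies a Lambert cosine with inverse-square falloff. The node fields are the hypothetical ones from the sketch earlier in this post, and shading_point is assumed to carry a position and a surface normal:

import math

def weight(node, shading_point):
    cx, cy, cz = node.bounds_center
    px, py, pz = shading_point.position
    nx, ny, nz = shading_point.normal

    # Vector and distance from the shading point to the node's center.
    dx, dy, dz = (cx - px, cy - py, cz - pz)
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)

    # Clamp the distance to the node's radius so shading points inside (or
    # right next to) the node's bounds don't produce huge or infinite weights.
    dist = max(dist, node.bounds_radius, 1e-6)

    # Lambert cosine toward the node's center, clamped to a small minimum so
    # every node keeps a nonzero chance of being selected.
    cos_theta = (nx * dx + ny * dy + nz * dz) / dist
    cos_theta = max(cos_theta, 0.05)

    # Total node energy with inverse-square falloff.
    return node.energy * cos_theta / (dist * dist)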

Potential Extensions and Improvements

Starting from that base, there are a bunch of potential improvements we could explore. For example:

  • A better weighting function that handles, among other things, a wider variety of materials well. This is more-or-less necessary for a robust implementation.
  • Storing more information than just total energy in the nodes, and using that for better weighting. For example, storing the approximate directionality of lights in cases such as spot lights and some area lights.
  • Alternative tree constructions. For example, using something other than the surface area heuristic to cluster lights, or using higher-arity trees.
  • Accounting for shadowing (somehow...).
  • etc.

The core idea—using a hierarchy to approximate light importance—is always the same. But the potential for variety within that is huge.

As a quick example, this render:

[Image: city scape, better light tree, 16spp]

...is also 16 samples per pixel, but (if you flip between it and the last render) it's clearly far less noisy. This was achieved by playing with some of the things above.

Further Reading

There are two published papers I'm aware of that describe variations on this light tree approach, both of which are very much worth reading:

Importance Sampling of Many Lights With Adaptive Tree Splitting is, in my opinion, the more significant paper of the two. It covers a large breadth of things including:

  • A novel tree construction approach that goes beyond the SAH, accounting for factors unique to light trees.
  • A way to account for the directionality of lights in the tree and to use that information to improve weighting calculations.
  • A novel "tree splitting" approach for returning multiple lights with reduced variance.

The main downside is that their weighting functions don't work well on their own: without the tree splitting they can produce very high variance. (In fact, the Stochastic Lightcuts paper makes comparisons to it without the tree splitting, which highlights how poorly it performs in that case.)

Stochastic Lightcuts proposes the same technique as both this post and my 2014 post on ompf2, but frames it as an extension to lightcuts. I personally find that framing confusing, and it actually took me a couple readings to realize that it is, in fact, the same technique. But some people might find that framing easier to understand, especially if they're already familiar with lightcuts.

However, I believe the significant contribution of Stochastic Lightcuts is the weighting function they propose, which is based on the lightcuts error bounding function, and accounts for different surface materials and lights with directional components. Moreover, unlike the weighting function from the Adaptive Tree Splitting paper, it works really well with the standard traversal approach—no tree splitting required. I would definitely recommend this paper as the baseline for any real implementation of light trees, and then explore improvements from there.


Footnotes

  1. This is a simplification, of course. But it's sufficient for illustrating the problem.

  2. If you compare to later renders in the post, you'll notice that it doesn't quite match. That's because this one was rendered in Cycles, but the others are rendered in Psychopath. Psychopath doesn't support sampling all lights, and I didn't feel like implementing a whole feature just for illustration purposes.

  3. For a more thorough introduction to importance sampling, I strongly recommend the amazing PBRT book, which is available for free online. (But is also very much worth buying!)

  4. See the paper "Area Light Sources for Real-Time Graphics" by John M. Snyder for the appropriate formula.

2020-04-14

Building a Sobol Sampler

Up until recently I was using Leonhard Grünschloß's Faure-permuted Halton sampler as my main sampler in Psychopath. I even ported it to Rust when I made the switch from C++. And it works really well.

I've also had a Rust port of Grünschloß's Sobol sampler in Psychopath's git repo for a while, and from time to time I played with it. But it never seemed notably better than the Halton sampler in terms of variance, and it was significantly slower at generating samples. This confused me, because Sobol seems like the go-to low discrepancy sequence for 3d rendering, assuming you want to use low discrepancy sequences at all.

However, I recently stumbled across some new (to me) information, which ultimately sent me down a rabbit hole implementing my own Sobol sampler. I'd like to share that journey.

(Incidentally, if you want to look at or use my Sobol sampler, you can grab it here. It's MIT licensed, like the rest of Psychopath.)

Don't Offset Sobol Samples

The first two things I learned are about what not to do. And the title of this section actually refers to both of those things simultaneously:

  1. Cranley-Patterson Rotation.
  2. Randomly offsetting your starting index into the sequence for each pixel.

These are two different strategies for decorrelating samples between pixels. The first offsets where the samples are in space, and the second offsets where you are in the sequence. Both are bad with Sobol sequences, but for different reasons.

Cranley-Patterson Rotation is bad because it increases variance. Not just apparent variance, but actual variance. It's still not entirely clear to me why that's the case—it feels counter-intuitive. But I first found out about it in the slides of a talk by Heitz et al. about their screen-space blue noise sampler. It's also been borne out in some experiments of my own since then.

So the first lesson is: don't use Cranley-Patterson Rotation. It makes your variance worse.

Randomly offsetting what index you start at for each pixel is bad for a completely different reason: it's slow. Due to the way computing Sobol samples works, later samples in the sequence take longer to compute. This is what was making it slow for me. I already used this technique with the Halton sampler, to great success. So I assumed I could just apply it to Sobol as well. And you can. It works correctly. It just ends up being comparatively slow.
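
To see why, here's a rough sketch of one common way to evaluate a single Sobol dimension directly from its index (not Grünschloß's exact code). The loop walks the bits of the sample index, so larger indices do more work on average:

def sobol_u32(index, direction_numbers):
    # direction_numbers: the dimension's 32 precomputed direction numbers,
    # as 32-bit integers.  (A hypothetical argument; real implementations
    # usually bake these into a table.)
    result = 0
    bit = 0
    while index != 0:
        if index & 1:
            result ^= direction_numbers[bit]
        index >>= 1
        bit += 1
    return result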

So the second lesson is: don't offset into your Sobol sequence (at least not by much).

The Right Way to Decorrelate

So if you can't do Cranley Patterson Rotation, and you can't offset into your sequence, how can you decorrelate your pixels with a Sobol sampler?

The answer is: scrambling.

There are two ways (that I'm aware of) to scramble a Sobol sequence while maintaining its good properties:

  1. Random Digit Scrambling
  2. Owen Scrambling

Random digit scrambling amounts to just XORing the samples with a random integer, using a different random integer for each dimension. The PBRT book covers this in more detail, but it turns out that doing this actually preserves all of the good properties of the Sobol sequence. So it's a very efficient way to decorrelate your pixels.
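
As a sketch (assuming the Sobol samples come out as 32-bit integers and there is one random 32-bit seed per dimension; the names here are mine, not from any particular library):

def random_digit_scramble(sobol_u32_sample, dim_seed):
    # XOR with the per-dimension seed, then map to a float in [0, 1).
    return (sobol_u32_sample ^ dim_seed) * (1.0 / (1 << 32))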

I gave that a try, and it immediately made the Sobol sampler comparable to the Halton sampler in terms of performance. So it's definitely a reasonable way to do things.

But then, due to a semi-unrelated discussion with the author of Raygon, I ended up re-reading the paper Progressive Multi-Jittered Sample Sequences by Christensen et al. The paper itself is about a new family of samplers, but in it they also compare their new family to a bunch of other samplers, including the Sobol sequence. And not just the plain Sobol sequence, but also the rotated, random-digit scrambled, and Owen-scrambled Sobol sequences.

The crazy thing that I had no idea about, and which they discuss at length in the paper, is that Owen scrambled Sobol actually converges faster than plain Sobol. That just seems nuts to me. Basically, Owen scrambling is the anti-Cranley-Patterson: it's a way of decorrelating your pixels while actually reducing variance. So even if you're not trying to decorrelate your pixels, it's still a good thing to do.

(As an interesting aside: random digit scrambling is a strict subset of Owen scrambling. Or, flipping that around, Owen scrambling is a generalization of random digit scrambling. But what's nice about that is you can use the same proof to show that both approaches preserve the good properties of Sobol sequences.)

Implementing Owen Scrambling Efficiently

The trouble with Owen scrambling is that it's slow. If you do some searches online, you'll find a few papers out there about implementing it. But most of them are really obtuse, and none of them are really trying to improve performance. Except, if I recall correctly, there's one paper about using the raw number crunching power of GPUs to try to make it faster. But that's not really that helpful, since you probably want to use your GPU for other number crunching.

So it basically seems like Owen scrambling is too slow to be practical for 3d rendering.

Except! There's a single sentence in the "Progressive Multi-Jittered Sample Sequences" paper that almost seems like an afterthought. In fact, I almost missed it:

We do Owen scrambling efficiently with hashing – similar in spirit to Laine and Karras [LK11], but with a better hash function provided by Brent Burley.

"LK11" refers to the paper Stratified sampling for stochastic transparency by Laine et al. The really weird thing is that this paper isn't even about Sobol sampling. But they, too, have a one-liner that's really important:

This produces the same result as performing an Owen scramble [...]

What they're referring to here is a kind of hashing operation. They use it to decorrelate a standard base-2 radical inverse sequence, but the same approach applies just as well to Sobol sequences.

But you can't just use any hash function. In fact, you have to use a really specific, weird, special kind of hash function that would be a very poor choice for any application that you would typically use hashing for.

Here's the basic idea: the key insight Laine et al. provide is that Owen scrambling is equivalent to a hash function that only avalanches downward, from higher bits to lower bits. In other words, a hash function where each bit only affects bits lower than itself. This is a super cool insight! And it means that if we can design a high-quality, efficient hash function with this property, then we have efficient Owen scrambling as well.
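
To make that property a bit more concrete, here's a minimal sketch of a scramble with that structure. It works in the bit-reversed domain, where additions and multiplications only let a bit influence itself and the bits above it (carries only go up), which after reversing back is exactly the "only avalanches downward" property. The constants are placeholders for illustration, not the hash from the papers or from my sampler; they just need to be even so that each step stays invertible:

def reverse_bits(x, bits=32):
    r = 0
    for _ in range(bits):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r

def owen_scramble_sketch(x, seed, bits=32):
    mask = (1 << bits) - 1

    # Work in the bit-reversed domain, where each operation below only lets
    # a bit affect itself and the bits above it.
    x = reverse_bits(x, bits)

    x = (x + seed) & mask          # Per-seed offset; carries only move upward.
    x ^= (x * 0x6c50b47c) & mask   # Placeholder even constants; partial
    x ^= (x * 0xb82f1e52) & mask   # products also only move upward.

    # Reverse back, so the net effect avalanches downward in the original
    # bit order, like an Owen scramble.
    return reverse_bits(x, bits)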

It turns out, though, that developing a high-quality hash function that meets these criteria is pretty challenging. For example, the hash function they present in the paper, while fast, is pretty poor quality. Nevertheless, I decided to take a crack at it myself.

So far, I've stuck to the general approach from Laine et al., but worked on optimizing the constants and tweaking a few things. I don't think I've gained any real insights beyond their paper, but I have nevertheless made significant improvements through those tweaks. At this point, I have a hash function that I would call "okay" for the purposes of Owen scrambling. But there is very clearly still a lot of room for improvement.

I'm extremely curious what the implementation from "Progressive Multi-Jittered Sample Sequences" looks like. And I'm especially curious if there are any interesting insights behind it regarding how to construct this bizarre kind of hash function.

My Sobol Sampler

Again, you can grab my Sobol sampler here if you like.

It includes implementations of all the (non-slow) decorrelation approaches I mentioned in this post:

  • Cranley-Patterson rotation
  • Random Digit Scrambling
  • And hash-based Owen Scrambling

It also includes, of course, plain non-scrambled Sobol.

If you do make use of this somewhere, I strongly recommend using the Owen-scrambled version. But it can be fun to play with the others as well for comparison.

However, one thing I noticed in my testing is that Cranley-Patterson rotation seems to do a better job of decorrelating pixels. The two scrambling approaches seem to exhibit some correlation artifacts in the form of clusters of mildly higher-variance splotches in certain scenes. It's mild, and Owen scrambling still seems to win out variance-wise over Cranley-Patterson. But still. This is definitely a work in progress, so be warned.

Regardless, I've had a lot of fun with this, and I've learned a lot. Pretty much everything I talked about in this post was like black magic to me before diving head-first into all of it. I definitely recommend looking into these topics yourself if this post was at all interesting to you.

2020-04-12

Notes on Color: 01

During my time working on Psychopath, I've slowly gained a better understanding of color science, color management, and other color related topics. I'd like to jot down some notes about my current understanding of these topics, because the angle I learned them from is slightly non-standard. My hope is that this may help other people gain a better understanding of the topic as well.

I don't know how many entries in this series there will be. Maybe this will be the only one! Who knows, I'm lazy sometimes. But my hope is to make additional posts about this in the future. I have a lot that I'd like to write about.

Color Blindness

First, a fun fact about me: I'm partially color blind. More specifically, I'm an anomalous trichromat, meaning that, compared to normal color-vision people, one of the three cones in my eyes is shifted in its spectral sensitivity.

One of the more interesting experiences I've had was the process of being diagnosed with color blindness. Most people are familiar with the Ishihara color blindness test (even if not by name). You're shown a series of circles made up of colored dots, and each circle has an embedded image of some kind. If you can see the image then you have normal color vision, and if you can't then you're color blind to some extent.

What most people don't know, however, is that the full Ishihara test also includes the reverse: images that only color blind people can see but normal color vision people can't. This surprises a lot of people, and trying to understand how that works sent me down a fun rabbit hole about color vision a couple of decades ago.

That was my first foray into color science and color perception, and my understanding of color management and related topics is built on what I learned back then. I'm going to present color from that same perspective because I haven't seen it approached that way in the graphics community before, and I personally think it helps to clarify a lot of things that are otherwise quite... vague and wishy washy.

What is Color Vision?

The first step towards understanding almost anything about color is understanding color vision and how it relates to actual physical light.

Physical light is a fairly complex phenomenon (see e.g. polarization, various quantum effects, etc.). And if I'm being completely honest, I don't fully understand everything about it. However, the aspects of light relevant to color perception are (thankfully) pretty simple, and that's what I'll be talking about here.

To explain light and how it relates to color vision, I like to make an analogy between light and sound. Sound is made up of pressure waves (typically in the air) of various frequencies and amplitudes. For example, the wave of a single pure tone looks like this:

[Image: graph of a tone]

A louder version of that same tone looks like this:

[Image: graph of a louder tone]

And a higher-pitched version looks like this:

[Image: graph of a higher-pitched tone]

However, most sounds aren't a single pure tone. Most sounds have many tones of varying pitch and loudness layered on top of each other. This is what produces the rich harmonies and textures we experience in sounds. For example, here is a graph of a more complex sound: a chord of three notes being played on a piano:

[Image: wave graph of three piano notes]

Quick question: can you tell that there are three notes being played in that graph? Me neither. That's because these waveform-style graphs aren't a good visualization for this kind of discussion. A more useful graph for us is one that visualizes how loud the sound is at each frequency. Here is the same piano sound graphed¹ that way:

[Image: spectrograph of three piano notes]

Notice how you can make out the three separate notes, which show up as three spikes on the graph. Neat!

Light is also made up of waves, but they are electromagnetic waves rather than pressure waves. Nevertheless, we can graph light in the same way. For example, here is a graph² of highly saturated yellow light, showing its energy spiking at the yellow part of the spectrum:

[Image: spectrograph of a pure yellow]

With sound, we perceive different frequencies as being different pitches. But with light, we perceive different frequencies as being different colors. If you had a machine that could produce any frequency of light, and had it slowly move through all the frequencies of visible light, you would see it produce all the colors of the rainbow one after another, as shown along the bottom of the graph.

However, like sound, most light isn't made up of a single pure frequency. Most light has a range of frequencies layered on top of each other. Unfortunately, although our ears are capable of hearing multiple frequencies at once, our eyes aren't that good, and are unable to perceive the equivalent of the three piano notes in color. And that brings us to the next topic!

Human Color Perception & Metamers

Our ears have thousands of tiny hairs that are each sensitive to different frequencies, which is what allows our hearing to distinguish so many frequencies at once. Our eyes, on the other hand, only have three types of light sensors for seeing color, which are called "cones". Each type of cone is sensitive to different frequencies of light: one is sensitive to the long wavelengths ("L"), one to the medium wavelengths ("M"), and one to the short wavelengths ("S"). We can graph the sensitivities of each type of cone like this:

[Image: spectrograph of cone sensitivities]

The curve for each cone represents how sensitive it is at different frequencies. Importantly, the cones cannot distinguish where within their sensitivity range they are being stimulated. For example, the L cone can't tell if it's being stimulated at its peak or at the far end of its left tail. All it knows is that it's being stimulated somewhere within that range.

Nevertheless, because the sensitivities of the cones overlap, if they all work together, they can triangulate a single spike (or "tone") in the light spectrum. For example, a pure yellow spike would stimulate both the L and M cones roughly equally:

[Image: spectrograph of a pure yellow with cone sensitivities overlaid]

So we can make a pretty good guess that when the L and M cones are stimulated equally, the light spectrum is probably spiking at yellow. However, we could also have two spikes:

[Image: spectrograph of a red and green together]

And... well, the cones can't really tell the difference between that and the single yellow spike.³

This is, in fact, how our digital color displays (such as computer monitors) work. I imagine you already know that they use red, green, and blue lights at different intensities to create different colors. But what may not have occurred to you is that the reason that works is because all humans are color blind.

When a computer monitor turns on the red and green emitters of a pixel to display yellow, it's taking advantage of the fact that we can't tell the difference between a harmony of red + green and an actual pure yellow. It is fooling our eyes into seeing a yellow that isn't really there. If our eyes were as good as our ears, they wouldn't be fooled—we'd see both the red and green simultaneously, not yellow.

The big take-away here is that there are a lot of light spectrums that our eyes cannot distinguish. And there is a name for that phenomenon: two or more different light spectrums that appear the same to a given observer are called metamers.

The difference between color blind people and people with normal color vision isn't the existence of metamers, it's the quantity of metamers and which spectrums are metamers. To drive that point home: there are animals such as the mantis shrimp that have far, far better color vision than us humans do, because the mantis shrimp has not just three color sensors in its eyes but at least twelve. Compared to the mantis shrimp, normal color vision people are horribly, horribly color blind—only slightly less so than people with impaired color vision.

This, of course, makes me feel a little better as a color blind person. But the reason it's actually relevant to discussion of color management, 3d rendering, etc. is because it highlights the difference between physical light and human color perception.

Light spectrums are a well defined physical phenomenon: you can measure them, you can make a graph, and that is simply what the light spectrum is. Color, on the other hand, is a perceptual experience caused by light. Light doesn't have a color, it is perceived as a color by a given observer, and it depends on the particular observer. You can still measure and graph color, but only on a per-observer basis.

Wrapping Up

If I've done a good job explaining things, then hopefully you've grasped everything here reasonably well. The concepts I've just introduced are, I believe, critical for understanding color perception and color management, so if any of this was unclear please let me know!

Lastly, for the actual color scientists out there: I realize that I've glossed over some things here. Sorry about that. Nevertheless, I hope you think this is a good introduction to the topic.

Bonus

This isn't super important, but it's a fun little fact. I said earlier that our eyes can't see harmonies in light spectrums, but that's not quite true. There is one harmonic color that we can see (albeit a very simple one): purple.

We see purple when there is energy in the long and short ends of the visible spectrum, but not the medium length part, like this:

[Image: spectrograph of a red and blue together]

Our eyes can distinguish that because both the L and S cones will be strongly stimulated, but not the M cone. So for anyone whose favorite color is purple: congratulations, it's a very special color!


Footnotes

  1. For the pedantic, the data for this graph has actually been further processed to isolate the fundamental frequencies of the piano. But the principle stands.

  2. Most of the graphs in this post are just approximated by hand, so please don't take them as precise in any way. That's also why I (intentionally) left out the scales of the axes.

  3. The astute among you might notice that the S cone actually gets stimulated a little, so the red-green combination isn't quite identical to the pure yellow in how it stimulates the cones. However, if you just add a little bit of energy everywhere in the pure yellow graph (making it a little less saturated of a yellow) then it would be. This is the first hint of something fairly interesting, which is that an RGB display can't reproduce all the fully saturated colors, even if the three RGB colors are themselves fully saturated.

2018-11-21

Random Notes on Future Direction

This post is not going to be especially coherent. I wanted to jot down some thoughts about the future direction of Psychopath, given the decisions in my previous post. Normally I would do this in a random text file on my computer, but I thought it might be interesting to others if I just posted it to the blog instead. It should be fun to check back on this later and see which bits actually ended up in Psychopath.

Things to Keep

There are some things about Psychopath's existing implementation (or plans for implementation) that I definitely want to preserve:

  • Spectral rendering via hero wavelength sampling. I'm increasingly convinced that spectral rendering is simply the correct way to handle color in a (physically based) renderer. Not even for spectral effects, just for correct color handling.

  • Curved surface rendering and displacements should be "perfect" for most intents and purposes. Users should only ever have to deal with per-object tessellation settings in exceptional circumstances.

  • Everything should be motion-blurrable. This is a renderer built for animation.

  • Full hierarchical instancing.

  • Efficient sampling of massive numbers of lights.

  • Filter Importance Sampling for pixel filtering. This simplifies a lot of things, and I've also heard rumors that it makes denoising easier because it keeps pixels statistically independent of each other. It does limit the pixel filters that can be used, but I really don't think that's a problem in practice.

Shade-Before-Hit Architecture

Following in Manuka's footsteps will involve some challenges. Below are some thoughts about tackling some of them.

  • All processing that involves data inter-dependence should be done before rendering starts. For example, a subdivision surface should be processed in such a way that each individual patch can be handled completely independently.

  • Maybe even do all splitting before rendering starts, and then the only thing left at render time is (at most) dicing and shading. Not sure yet about this one.

  • Use either DiagSplit or FracSplit for splitting and tessellation (probably). DiagSplit is faster, but FracSplit would better avoid popping artifacts in the face of magnification. Both approaches avoid patch inter-dependence for dicing, which is the most important thing.

  • Since we'll be storing shading information at all micropolygon vertices, we'll want compact representations for that data:

    • For data that is constant across the surface, maybe store it at the patch or grid level. That way only textured parameters have to be stored per-vertex.

    • We'll need a way to compactly serialize which surface closures are on a surface and how they should be mixed/layered. Probably specifying the structure on the patch/grid level, and then the textured inputs on the vertex level would make sense...? Whatever it is, it needs to be both reasonably compact and reasonably efficient to encode/decode.

    • For tristimulus color values, using something like the XYZE format (a variant of RGBE) to store each color in 32 bits might work well. (There's a small sketch of this kind of packing after this list.)

    • It would be good to have multiple "types" of color. For example: XYZ for tristimulus inputs, a temperature and energy multiplier for blackbody radiation, and a pointer to a shared buffer for full spectral data. More color types could be added as needed. This would provide a lot of flexibility for specifying colors in various useful ways. Maybe even have options for different tristimulus spectral upsampling methods?

    • For shading normals, using the Oct32 encoding from "Survey of Efficient Representations for Independent Unit Vectors" seems like a good trade-off between size, performance, and accuracy.

    • For scalar inputs, probably just use straight 16-bit floats.

  • The camera-based dicing-rate heuristic is going to be really important:

    • It can't necessarily be based on a standard perspective camera projection: supporting distorted lens models like fisheye in the future would be great.

    • It needs to take into account instancing: ideally we want only one diced representation of an instance. Accounting for all instance transforms robustly might be challenging (e.g. one instance may have one side of the model closer to the camera, while another may have a different side of the same model closer).

    • It needs to account for camera motion blur (and motion blur in general) in some reasonable way. This probably isn't as important to get robustly "correct" because anything it would significantly affect would also be significantly blurred. But having it make at least reasonable decisions so that artifacts aren't visually noticeable is important.

  • Having a "dial" of sorts that allows the user to choose between all-up-front processing (like Manuka) and on-the-fly-while-rendering processing (closer to what the C++ Psychopath did) could be interesting. Such a dial could be used e.g. in combination with large ray batches to render scenes that would otherwise be too large for memory. What exactly this dial would look like, and what its semantics would be, however, I have no idea yet.

Interactive Rendering

Manuka's shade-before-hit architecture necessitates a long pre-processing step of the scene before you can see the first image. Supporting interactive rendering in Psychopath is something I would like to do if possible. Some thoughts:

  • Pretty much all ray tracing renderers require pre-processing before rendering starts (tessellation, displacement, BVH builds, etc.). For interactive rendering they just maintain a lot of that state, tweaking the minimum data needed to re-do the render. Even with a shade-before-hit architecture, this should be feasible.

  • Having a "camera" that defines the dicing rates separately from the interactive viewport will probably be necessary, so that the whole scene doesn't have to be re-diced and re-shaded whenever the user changes their view. That separate dicing "camera" doesn't have to be a real projective camera--there could be dicing "cameras" designed specifically for various interactive rendering use-cases.

Color Management

Psychopath is a spectral renderer, and I intend for it to stay that way. Handling color correctly is really important to my goals for Psychopath, and I believe spectral rendering is key to that goal. But there is more to correct color management than just that.

  • Handling tristimulus spaces with something like Open Color IO would be good to do at some point, but I don't think now is the time to put the energy into that. A better focus right now, I think, is simply to properly handle a curated set of tristimulus color spaces (including e.g. sRGB and ACES AP0 & AP1).

  • Handling custom tone mapping, color grading, etc. will be intentionally outside the scope of Psychopath. Psychopath will produce HDR data which can then be further processed by other applications later in the pipeline. In other words, keep Psychopath's scope in the color pipeline minimal. (However, being able to specify some kind of tone mapping for e.g. better adaptive sampling may be important. Not sure how best to approach that yet.)

  • Even with it being out of scope, having a single reasonable tone mapper for preview purposes only (e.g. when rendering directly to PNG files) is probably reasonable. If I recall correctly, ACES specifies a tone mapping operator. If that's the case, using their operator is likely a good choice. However, I haven't investigated this properly yet.

  • One of the benefits of spectral rendering is that you can properly simulate arbitrary sensor responses. Manuka, for example, supports providing spectral response data for arbitrary camera sensors so that VFX rendering can precisely match the footage it's going to be integrated with. I've thought about this for quite some time as well (even before reading the Manuka paper), and I would like to support this. This sort of overlaps with tone mapping, but both the purpose and tech are different.