Drawbacks of Geometry Caches
At the end of my post about ray reordering I alluded to some problems I ran into with geometry caches and ray reordering. In this post I'm going to talk about the one I ran into with geometry caching.
Geometry caching in and of itself isn't particularly problematic. But it can become a problem if you're trying to share the cache between multiple threads. The issue is that with an LRU cache even just reading from the cache is a write operation, because you have to move the item you read to the front of the cache. And that means locking. And locking means thread contention.
I think I was actually reasonably smart in the way I wrote the cache. The cache treats the cached data as read-only, so it only needs to be locked long enough to move the accessed item to the front and then fetch a shared pointer to the data. That is a very short period of time, certainly far less than the tracing done with the geometry afterwards.
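To make the locking pattern concrete, here is a minimal C++ sketch of that kind of cache. It is not the renderer's actual code: the class name LruCache, its get/put interface, and the use of std::mutex are all assumptions made for illustration. The point it shows is that get() holds the lock only while it moves the entry to the front and copies a shared pointer, and that the cached data itself is handed out as read-only.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical sketch of a thread-shared LRU cache (names are illustrative).
template <typename Key, typename Value>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the cached data, or nullptr on a miss.
    // The lock is held only for the lookup, the move-to-front, and the
    // shared_ptr copy. The caller uses the data after the lock is released.
    std::shared_ptr<const Value> get(const Key& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        // Even a read mutates the recency list, which is why a lock is needed.
        order_.splice(order_.begin(), order_, it->second);
        return it->second->second;
    }

    // Inserts or replaces an entry, evicting the least recently used one
    // when the cache is full.
    void put(const Key& key, std::shared_ptr<const Value> value) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = std::move(value);
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (order_.size() >= capacity_) {
            // Threads still holding a shared_ptr keep the evicted data alive.
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(key, std::move(value));
        index_[key] = order_.begin();
    }

private:
    using Entry = std::pair<Key, std::shared_ptr<const Value>>;
    std::size_t capacity_;
    std::mutex mutex_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<Key, typename std::list<Entry>::iterator> index_;
};
```

Because the data is reached through a shared pointer to const, a thread can keep tracing against geometry that another thread has since evicted. The critical section stays tiny, but every thread still has to pass through it on every lookup.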
The result of that cleverness was that rendering on eight cores gave about a 5-6x speedup. That's nothing to sneeze at considering that the cache was being accessed for every ray-object intersection test. But it was a clear indication that it wouldn't scale well with many more cores. And on eight cores, you really want close to an 8x speedup if you can get it.
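For a rough sense of what those numbers imply, here is a back-of-the-envelope estimate using Amdahl's law. This is an assumption on my part: lock contention doesn't behave exactly like a fixed serial fraction, so treat the figures as a ballpark rather than a measurement. Here N is the core count, S the observed speedup, and s the implied serialized fraction of the work.

```latex
S = \frac{1}{s + \frac{1 - s}{N}}
\quad\Longrightarrow\quad
s = \frac{N/S - 1}{N - 1}

% With N = 8:
S = 5:\quad s = \frac{8/5 - 1}{7} = \frac{3}{35} \approx 8.6\%
\qquad
S = 6:\quad s = \frac{8/6 - 1}{7} = \frac{1}{21} \approx 4.8\%

% Projected to N = 16 under the same model:
s = \tfrac{3}{35}:\quad S = \frac{1}{\frac{3}{35} + \frac{2}{35}} = 7
\qquad
s = \tfrac{1}{21}:\quad S = \frac{1}{\frac{4}{84} + \frac{5}{84}} \approx 9.3
```

Under that model, doubling the core count to sixteen would only get from 5-6x to roughly 7-9x, which is the sense in which the cache wouldn't scale.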