Cycles: Speedup transparent shadows on CUDA
author Sergey Sharybin <sergey.vfx@gmail.com>
Wed, 8 Feb 2017 12:05:05 +0000 (13:05 +0100)
committer Sergey Sharybin <sergey.vfx@gmail.com>
Wed, 8 Feb 2017 13:00:48 +0000 (14:00 +0100)
This commit enables record-all behavior for transparent shadow
rays.
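
As a rough illustration of what "record-all" means here: instead of
re-tracing the shadow ray after each transparent surface, a single
traversal collects every hit along the ray, and the hits are then
sorted by distance and attenuated in order. The sketch below is not the
Cycles kernel code. The Hit struct, shadow_record_all() and the
max_hits limit are hypothetical names, and treating an overflow of the
hit array as "blocked" is an assumption made to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one ray/surface hit (not the Cycles struct).
struct Hit {
    float t;              // distance along the shadow ray
    float transmittance;  // fraction of light let through: 0 = opaque, 1 = clear
};

// Record-all sketch: `hits` holds every intersection found by ONE traversal.
// Returns the fraction of light reaching the shading point.
// Assumption: more than max_hits recorded hits is treated as fully blocked.
inline float shadow_record_all(std::vector<Hit> hits, std::size_t max_hits)
{
    assert(max_hits > 0);
    if (hits.size() > max_hits) {
        return 0.0f;
    }

    // Traversal order is arbitrary, so sort hits front to back.
    std::sort(hits.begin(), hits.end(),
              [](const Hit &a, const Hit &b) { return a.t < b.t; });

    float throughput = 1.0f;
    for (const Hit &h : hits) {
        throughput *= h.transmittance;
        if (throughput <= 0.0f) {
            break;  // an opaque surface blocks the light, stop early
        }
    }
    return throughput;
}
```

Two half-transparent surfaces in any order give a throughput of 0.25,
and any opaque surface gives 0. The cost of this approach is the
storage for the recorded hits, which is where the extra memory
mentioned below comes from.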

Render time differences are as follows:

               GTX 1080 render time
BMW                  -0.5%
Fishy Cat            -0.0%
Pabellon Barcelona   -11.6%
Classroom            +1.2%
Koro                 -58.6%

The kernel will now use some extra VRAM to store the intersection
array (200MB on my configuration). We can optimize this away in
further commits.
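
To see why a record-all hit array costs memory at all, here is a
back-of-the-envelope sketch. It assumes each in-flight GPU thread
reserves its own fixed-size array of hits. That is an assumption about
the memory layout, and the function name and all three inputs are
hypothetical. It does not attempt to reproduce the 200MB figure above.

```cpp
#include <cassert>
#include <cstddef>

// Bytes reserved if every in-flight thread keeps its own fixed-size hit array.
// Assumption: storage scales as threads * max_hits * bytes_per_hit.
inline std::size_t intersection_array_bytes(std::size_t threads,
                                            std::size_t max_hits,
                                            std::size_t bytes_per_hit)
{
    assert(bytes_per_hit > 0);
    return threads * max_hits * bytes_per_hit;
}
```

With made-up inputs of 1000 threads, 64 hits per thread and 24 bytes
per hit, this gives 1536000 bytes. The point is only that the cost
grows with both the thread count and the per-ray hit limit, so lowering
either one reduces the footprint.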

intern/cycles/kernel/kernel_types.h

index 8c271c7..f518530 100644
@@ -84,6 +84,7 @@ CCL_NAMESPACE_BEGIN
 #  define __VOLUME_SCATTER__
 #  define __SUBSURFACE__
 #  define __CMJ__
+#  define __SHADOW_RECORD_ALL__
 #endif  /* __KERNEL_CUDA__ */
 
 #ifdef __KERNEL_OPENCL__