T61513: Refactored Cycles Attribute Retrieval
authorJeroen Bakker <j.bakker@atmind.nl>
Tue, 19 Feb 2019 14:41:22 +0000 (15:41 +0100)
committerJeroen Bakker <j.bakker@atmind.nl>
Tue, 19 Feb 2019 15:25:48 +0000 (16:25 +0100)
commitd6d306441fee0fbead3d48ba08166c31895de8ae
tree2faec4926ed1606e3dd7888e6f943de011d19d60
parentc598ad9ac6637cf2804d9fe16408888737d7dc9c
T61513: Refactored Cycles Attribute Retrieval

There is a generic function to retrieve float and float3 attributes
`primitive_attribute_float` and primitive_attribute_float3`. Inside
these functions an prioritised if-else construction checked where
the attribute is stored and then retrieved from that location.

Actually the calling function most of the time already knows where
the data is stored. So we could simplify this by splitting these
functions and remove the check logic.

This patch splits the `primitive_attribute_float?` functions into
`primitive_surface_attribute_float?` and `primitive_volume_attribute_float?`.
What leads to less branching and more optimum kernels.

The original function is still being used by OSL and `svm_node_attr`.

This will reduce the compilation time and render time for kernels.
Especially in production scenes there is a lot of benefit.

Impact in compilation times

    job  |   scene_name    | previous |  new  | percentage
  -------+-----------------+----------+-------+------------
  t61513 | empty           |    10.63 | 10.66 |          0%
  t61513 | bmw             |    17.91 | 17.65 |          1%
  t61513 | fishycat        |    19.57 | 17.68 |         10%
  t61513 | barbershop      |    54.10 | 24.41 |         55%
  t61513 | classroom       |    17.55 | 16.29 |          7%
  t61513 | koro            |    18.92 | 18.05 |          5%
  t61513 | pavillion       |    17.43 | 16.52 |          5%
  t61513 | splash279       |    16.48 | 14.91 |         10%
  t61513 | volume_emission |    36.22 | 21.60 |         40%

Impact in render times

    job  |   scene_name    | previous |  new   | percentage
  -------+-----------------+----------+--------+------------
  61513 | empty           |    21.06 |  20.35 |          3%
  61513 | bmw             |   198.44 | 190.05 |          4%
  61513 | fishycat        |   394.20 | 401.25 |         -2%
  61513 | barbershop      |  1188.16 | 912.39 |         23%
  61513 | classroom       |   341.08 | 340.38 |          0%
  61513 | koro            |   472.43 | 471.80 |          0%
  61513 | pavillion       |   905.77 | 899.80 |          1%
  61513 | splash279       |    55.26 |  54.86 |          1%
  61513 | volume_emission |    62.59 |  61.70 |          1%

There is also a possitive impact when using CPU and CUDA, but they are small.

I didn't split the hair logic from the surface logic due to:

* Hair and surface use same attribute types. It was not clear if it could be
  splitted when looking at the code only.
* Hair and surface are quick to compile and to read. So the benefit is quite
  small.

Differential Revision: https://developer.blender.org/D4375
intern/cycles/kernel/geom/geom_primitive.h
intern/cycles/kernel/geom/geom_volume.h
intern/cycles/kernel/osl/osl_services.cpp
intern/cycles/kernel/svm/svm_attribute.h
intern/cycles/kernel/svm/svm_bump.h
intern/cycles/kernel/svm/svm_closure.h
intern/cycles/kernel/svm/svm_displace.h
intern/cycles/kernel/svm/svm_tex_coord.h