Another optimization of tangent space calculation
author Sergey Sharybin <sergey.vfx@gmail.com>
Fri, 25 Aug 2017 12:54:44 +0000 (14:54 +0200)
committer Sergey Sharybin <sergey.vfx@gmail.com>
Fri, 25 Aug 2017 12:54:44 +0000 (14:54 +0200)
Don't use quicksort for small arrays: bubble sort is considerably faster there
because of better cache locality. qsort() from libc takes a similar approach.
We can also experiment with unrolling the smallest cases, for example 3- and
4-element arrays.
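
As a rough illustration of the unrolling idea, here is a minimal sketch for the
3-element case. It is not part of this commit: SketchEdge is a hypothetical
stand-in for SEdge (the real struct layout is assumed), and sketch_cswap() and
sketch_sort3() are made-up names. The three compare-swaps are simply the
bubble sort passes for three elements written out by hand.

```c
#include <assert.h>

/* Hypothetical stand-in for mikktspace's SEdge: three ints addressable
 * by channel. The real struct layout is an assumption here. */
typedef struct {
	int array[3];
} SketchEdge;

/* Swap *a and *b when they are out of order on the given channel. */
static void sketch_cswap(SketchEdge *a, SketchEdge *b, int channel)
{
	if (a->array[channel] > b->array[channel]) {
		SketchEdge tmp = *a;
		*a = *b;
		*b = tmp;
	}
}

/* Unrolled sort of exactly three elements: bubble sort's compare-swaps
 * (0,1), (1,2), (0,1) with the loops removed. */
static void sketch_sort3(SketchEdge *e, int channel)
{
	assert(channel >= 0 && channel < 3);
	sketch_cswap(&e[0], &e[1], channel);
	sketch_cswap(&e[1], &e[2], channel);
	sketch_cswap(&e[0], &e[1], channel);
}
```

Such a helper would presumably be called from an extra iElems == 3 branch
followed by an early return; whether it actually beats the generic loop would
need to be measured.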

This reduces tangent space calculation time for the dragon model from 3.1 sec to 2.9 sec.

intern/mikktspace/mikktspace.c

index 479443805bfbd0937d1fd150e34671c61563898e..2e8e58d37d4461dd53aac93969ea9728087606ee 100644
@@ -1677,6 +1677,19 @@ static void QuickSortEdges(SEdge * pSortBuffer, int iLeft, int iRight, const int
                }
                return;
        }
+       else if(iElems < 16) {
+               int i, j;
+               for (i = 0; i < iElems - 1; i++) {
+                       for (j = 0; j < iElems - i - 1; j++) {
+                               int index = iLeft + j;
+                               if (pSortBuffer[index].array[channel] > pSortBuffer[index + 1].array[channel]) {
+                                       sTmp = pSortBuffer[index];
+                                       pSortBuffer[index] = pSortBuffer[index + 1];
+                                       pSortBuffer[index + 1] = sTmp;
+                               }
+                       }
+               }
+       }
 
        // Random
        t=uSeed&31;