 * Copyright 2011, Blender Foundation.
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
#include "DNA_color_types.h"
#include "DNA_node_types.h"
 * @defgroup Model The data model of the compositor
 * @defgroup Memory The memory management stuff
 * @defgroup Execution The execution logic
 * @defgroup Conversion Conversion logic
 * @defgroup Node All nodes of the compositor
 * @defgroup Operation All operations of the compositor
 * @mainpage Introduction of the Blender Compositor
 * @section bcomp Blender compositor
 * This project redesigns the internals of Blender's compositor. The project was executed in 2011 by At Mind.
 * At Mind is a technology company located in Amsterdam, The Netherlands.
 * The project was crowd-funded. This code has been released under GPL2 to be used in Blender.
 * @section goals The goals of the project
 * The new compositor has two goals:
 * - Make a faster compositor (speed of calculation)
 * - Make the compositor work faster for you (workflow)
 * @section speed Faster compositor
 * The speedup comes from making better use of the hardware Blender is running on. The previous compositor
 * used a single-threaded model to calculate a node; the only exception was the Defocus node.
 * A second thread was used only when two full nodes could be calculated in parallel.
 * Current workstations have 8-16 threads available, and most of the time these are idle.
 * In the new compositor we want to use as many threads as possible. Even new OpenCL-capable GPU hardware can be
 * used for calculation.
 * @section workflow Work faster
 * The previous compositor only showed the final image. The user could wait a long time before seeing the result
 * of their work. The new compositor focuses on getting information back to the user:
 * it prioritizes its work to give earlier feedback.
 * @page memory Memory model
 * The main issue is the type of memory model to use. Blender is used by consumers and professionals,
 * on machines ranging from low-end to very high-end.
 * The system should work on both.
 * @page executing Executing
 * @section prepare Prepare execution
 * During the preparation of the execution every ReadBufferOperation receives an offset.
 * This offset is used during execution as an optimization trick.
 * Next, all operations are initialized for execution @see NodeOperation.initExecution
 * Next, all ExecutionGroups are initialized for execution @see ExecutionGroup.initExecution
 * All of this is controlled from @see ExecutionSystem.execute
 * @section priority Render priority
 * Render priority is a priority of an output node. During rendering a user needs different priorities for the
 * output nodes than during editing.
 * For example, the active ViewerNode has top priority during editing, but during rendering a CompositeNode has.
 * Every NodeOperation has a render-priority setting, but it only has effect for output NodeOperations.
 * In ExecutionSystem.execute all priorities are checked. For every priority the ExecutionGroups are checked to see
 * whether their priority matches.
 * When it matches, the ExecutionGroup is executed (this happens in serial).
 * @see ExecutionSystem.execute control of the Render priority
 * @see NodeOperation.getRenderPriority receive the render priority
 * @see ExecutionGroup.execute the main loop to execute a whole ExecutionGroup
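 * A minimal sketch of the priority loop described in this section. The names Priority, Group and
 * executeByPriority are hypothetical and are not part of the compositor; the sketch only illustrates
 * visiting each priority in turn and running the matching groups one after another.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical priority levels, highest first.
enum class Priority { High, Medium, Low };

// Hypothetical stand-in for an ExecutionGroup that ends in an output operation.
struct Group {
    std::string name;
    Priority priority;
};

// Visit the priorities from high to low and "execute" every group whose
// priority matches, serially. Returns the names in execution order.
std::vector<std::string> executeByPriority(const std::vector<Group> &groups)
{
    std::vector<std::string> order;
    for (Priority p : {Priority::High, Priority::Medium, Priority::Low}) {
        for (const Group &g : groups) {
            if (g.priority == p) {
                order.push_back(g.name);  // a real system would run the group here
            }
        }
    }
    // Every group has exactly one of the three priorities, so none is skipped.
    assert(order.size() == groups.size());
    return order;
}
```

 * In this sketch a group registered as High runs before a Medium one regardless of the order in which
 * the groups were added.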
 * @section order Chunk order
 * When an ExecutionGroup is executed, first the order of the chunks is determined.
 * The settings are stored in the ViewerNode inside the ExecutionGroup. ExecutionGroups that have no viewer node
 * will use a default one.
 * There are several possible chunk orders:
 * - [@ref OrderOfChunks.COM_TO_CENTER_OUT]: Start calculating from a configurable point and order by nearest chunk
 * - [@ref OrderOfChunks.COM_TO_RANDOM]: Randomize all chunks.
 * - [@ref OrderOfChunks.COM_TO_TOP_DOWN]: Start calculation from the bottom to the top of the image
 * - [@ref OrderOfChunks.COM_TO_RULE_OF_THIRDS]: Experimental order based on 9 hot-spots in the image
 * When the chunk order is determined, the first few chunks are checked to see whether they can be scheduled.
 * Chunks can have three states:
 * - [@ref ChunkExecutionState.COM_ES_NOT_SCHEDULED]: Chunk is not yet scheduled, or dependencies are not met
 * - [@ref ChunkExecutionState.COM_ES_SCHEDULED]: All dependencies are met, chunk is scheduled, but not finished
 * - [@ref ChunkExecutionState.COM_ES_EXECUTED]: Chunk is finished
 * @see ExecutionGroup.execute
 * @see ViewerBaseOperation.getChunkOrder
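 * The center-out order listed above can be sketched as sorting chunk indices by their distance to a point.
 * This is an illustration under assumptions, not the code behind COM_TO_CENTER_OUT: the function name
 * centerOutOrder, the row-by-row chunk numbering and the use of the chunk centre are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Return chunk indices ordered by the squared distance of the chunk centre
// to the point (px, py), nearest chunk first. Chunks are numbered row by row.
// Ties keep ascending index order because the sort is stable.
std::vector<int> centerOutOrder(int width, int height, int chunkSize, int px, int py)
{
    assert(width > 0 && height > 0 && chunkSize > 0);
    const int cols = (width + chunkSize - 1) / chunkSize;   // round up
    const int rows = (height + chunkSize - 1) / chunkSize;  // round up
    std::vector<int> order(cols * rows);
    for (int i = 0; i < cols * rows; ++i) {
        order[i] = i;
    }

    auto dist2 = [=](int index) {
        const long cx = (index % cols) * chunkSize + chunkSize / 2;
        const long cy = (index / cols) * chunkSize + chunkSize / 2;
        return (cx - px) * (cx - px) + (cy - py) * (cy - py);
    };
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return dist2(a) < dist2(b); });
    return order;
}
```

 * For a 300 by 300 image with 100 pixel chunks and the point (150, 150), the middle chunk comes first,
 * followed by its four direct neighbours and then the four corner chunks.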
 * @section interest Area of interest
 * An ExecutionGroup can have dependencies on other ExecutionGroups. Data passing from one ExecutionGroup to another
 * is stored in 'chunks'.
 * If not all input chunks are available, the chunk execution will not be scheduled.
 * +-------------------------------------+              +--------------------------------------+
 * | ExecutionGroup A                    |              | ExecutionGroup B                     |
 * | +----------------+  +-------------+ |              | +------------+   +-----------------+ |
 * | | NodeOperation a|  | WriteBuffer | |              | | ReadBuffer |   | ViewerOperation | |
 * | |                *==* Operation   | |              | | Operation  *===*                 | |
 * | |                |  |             | |              | |            |   |                 | |
 * | +----------------+  +-------------+ |              | +------------+   +-----------------+ |
 * +--------------------------------|----+              +---|----------------------------------+
 *                                +---------------------------+
 *                                | +----------+  +---------+ |
 *                                | | Chunk a  |  | Chunk b | |
 *                                | +----------+  +---------+ |
 *                                +---------------------------+
 * In the above example ExecutionGroup B has an output operation (ViewerOperation) and is being executed.
 * The first chunk is evaluated [@ref ExecutionGroup.scheduleChunkWhenPossible],
 * but not all input chunks are available. The relevant ExecutionGroup (the one that can calculate the missing
 * chunks; ExecutionGroup A) is asked to calculate the area ExecutionGroup B is missing
 * [@ref ExecutionGroup.scheduleAreaWhenPossible].
 * ExecutionGroup A checks which chunks the area spans, and tries to schedule these chunks.
 * If all input data is available these chunks are scheduled [@ref ExecutionGroup.scheduleChunk].
 * Call sequence between ExecutionSystem.execute, ExecutionGroup (B) and ExecutionGroup (A):
 * - ExecutionSystem.execute calls ExecutionGroup.execute on ExecutionGroup (B)
 * - ExecutionGroup (B) calls ExecutionGroup.scheduleChunkWhenPossible on itself
 * - ExecutionGroup (B) calls ExecutionGroup.scheduleAreaWhenPossible on ExecutionGroup (A)
 * - ExecutionGroup (A) calls ExecutionGroup.scheduleChunkWhenPossible on itself
 * - ExecutionGroup (A) calls ExecutionGroup.scheduleChunk on itself
 * - control returns from ExecutionGroup (A) to ExecutionGroup (B)
 * This happens until all chunks of ExecutionGroup B have finished executing or the user breaks the process.
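 * Checking which chunks an area spans comes down to integer division by the chunk size. The sketch below
 * is an assumption about how such a lookup could work, not the implementation of
 * ExecutionGroup.scheduleAreaWhenPossible; Area and chunksSpannedBy are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical half-open pixel rectangle: [xmin, xmax) x [ymin, ymax).
struct Area {
    int xmin, ymin, xmax, ymax;
};

// Return the row-by-row indices of all chunks that overlap `area` in an
// image of the given size. The area is clamped to the image bounds first.
std::vector<int> chunksSpannedBy(const Area &area, int width, int height, int chunkSize)
{
    assert(width > 0 && height > 0 && chunkSize > 0);
    std::vector<int> result;
    const int xmin = std::max(area.xmin, 0);
    const int ymin = std::max(area.ymin, 0);
    const int xmax = std::min(area.xmax, width);
    const int ymax = std::min(area.ymax, height);
    if (xmin >= xmax || ymin >= ymax) {
        return result;  // nothing left after clamping
    }
    const int cols = (width + chunkSize - 1) / chunkSize;
    const int firstCol = xmin / chunkSize;
    const int lastCol = (xmax - 1) / chunkSize;  // xmax is exclusive
    const int firstRow = ymin / chunkSize;
    const int lastRow = (ymax - 1) / chunkSize;  // ymax is exclusive
    for (int row = firstRow; row <= lastRow; ++row) {
        for (int col = firstCol; col <= lastCol; ++col) {
            result.push_back(row * cols + col);
        }
    }
    return result;
}
```

 * Each returned index would then be handed to the per-chunk scheduling step, which is where the
 * dependency check for that chunk takes place.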
 * A NodeOperation like the ScaleOperation can influence the area of interest by reimplementing the
 * [@ref NodeOperation.determineDependingAreaOfInterest] method.
 * +--------------------------+                    +---------------------------------+
 * | ExecutionGroup A         |                    | ExecutionGroup B                |
 * +--------------------------+                    +---------------------------------+
 * Needed chunks from ExecutionGroup A     |      Chunk of ExecutionGroup B (to be evaluated)
 *     +-------+ +-------+                 |                  +--------+
 *     |Chunk 1| |Chunk 2|        +----------------+          |Chunk 1 |
 *     |       | |       |        | ScaleOperation |          |        |
 *     +-------+ +-------+        +----------------+          +--------+
 *     +-------+ +-------+
 *     |Chunk 3| |Chunk 4|
 *     +-------+ +-------+
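 * To illustrate how a scaling operation can enlarge the area of interest, the sketch below maps an output
 * area back to the input area it depends on. It assumes a uniform scale factor and half-open rectangles;
 * ScaleRect and inputAreaForScale are hypothetical names and this is not the ScaleOperation code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical half-open pixel rectangle: [xmin, xmax) x [ymin, ymax).
struct ScaleRect {
    int xmin, ymin, xmax, ymax;
};

// For an operation that produces output = input scaled by `factor`,
// return the input area needed to compute the given output area.
// Lower bounds round down and upper bounds round up, so the result
// never misses a needed input pixel.
ScaleRect inputAreaForScale(const ScaleRect &output, float factor)
{
    assert(factor > 0.0f);
    ScaleRect input;
    input.xmin = static_cast<int>(std::floor(output.xmin / factor));
    input.ymin = static_cast<int>(std::floor(output.ymin / factor));
    input.xmax = static_cast<int>(std::ceil(output.xmax / factor));
    input.ymax = static_cast<int>(std::ceil(output.ymax / factor));
    return input;
}
```

 * With a factor of 0.5 and 100 pixel chunks, one output chunk maps to a 200 by 200 input area, which
 * covers four input chunks.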
 * @see ExecutionGroup.execute Execute a complete ExecutionGroup. Halts until finished or interrupted by the user
 * @see ExecutionGroup.scheduleChunkWhenPossible Tries to schedule a single chunk,
 * checks if all input data is available. Can trigger dependent chunks to be calculated
 * @see ExecutionGroup.scheduleAreaWhenPossible Tries to schedule an area. This can be multiple chunks
 * (is called from [@ref ExecutionGroup.scheduleChunkWhenPossible])
 * @see ExecutionGroup.scheduleChunk Schedule a chunk on the WorkScheduler
 * @see NodeOperation.determineDependingAreaOfInterest Influence the area of interest of a chunk.
 * @see WriteBufferOperation NodeOperation to write to a MemoryProxy/MemoryBuffer
 * @see ReadBufferOperation NodeOperation to read from a MemoryProxy/MemoryBuffer
 * @see MemoryProxy proxy for information about a memory image (an image consists of multiple chunks)
 * @see MemoryBuffer Allocated memory for a single chunk
 * @section workscheduler WorkScheduler
 * The WorkScheduler is implemented as a static class. The responsibility of the WorkScheduler is to balance
 * WorkPackages over the available and free devices.
 * The work-scheduler can work in two states. To switch between these states you need to recompile Blender.
 * @subsection multithread Multi threaded
 * By default the work-scheduler places all work as WorkPackages in a queue.
 * For every CPU core a worker thread is created. These worker threads ask the WorkScheduler if there is work
 * for a specific Device.
 * The work-scheduler finds work for the device and the device is asked to execute the WorkPackage.
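 * The queue-plus-workers arrangement described above can be sketched with standard C++ threading.
 * This is a simplified illustration, not the WorkScheduler itself: SimpleScheduler, its methods and the
 * WorkPackage alias are hypothetical, and a package is reduced to a plain callable.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical work package: here just a callable.
using WorkPackage = std::function<void()>;

class SimpleScheduler {
public:
    // Start one worker thread per requested slot.
    explicit SimpleScheduler(unsigned threadCount)
    {
        assert(threadCount > 0);
        for (unsigned i = 0; i < threadCount; ++i) {
            workers_.emplace_back([this] { workerLoop(); });
        }
    }

    // Add a package to the queue and wake one waiting worker.
    void schedule(WorkPackage package)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(package));
        }
        wakeup_.notify_one();
    }

    // Let the workers drain the queue, then join them.
    void finish()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        wakeup_.notify_all();
        for (std::thread &t : workers_) {
            t.join();
        }
        workers_.clear();
    }

    ~SimpleScheduler()
    {
        if (!workers_.empty()) {
            finish();
        }
    }

private:
    void workerLoop()
    {
        for (;;) {
            WorkPackage package;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wakeup_.wait(lock, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) {
                    return;  // done_ is set and no work is left
                }
                package = std::move(queue_.front());
                queue_.pop();
            }
            package();  // run outside the lock so workers execute in parallel
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    std::queue<WorkPackage> queue_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};
```

 * A caller would create the scheduler with one thread per core, call schedule() for each package and
 * call finish() once everything has been queued.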
 * @subsection singlethread Single threaded
 * For debugging purposes multi-threading can be disabled. This is done by changing COM_CURRENT_THREADING_MODEL
 * to COM_TM_NOTHREAD. When compiled this way, the work-scheduler
 * is changed to support no threading and runs everything on the CPU.
 * @section devices Devices
 * A Device within the compositor context is a hardware component that can be used to calculate chunks.
 * A chunk is encapsulated in a WorkPackage.
 * The WorkScheduler controls the devices and selects the device where a WorkPackage will be calculated.
 * @subsection WS_Devices Workscheduler
 * The WorkScheduler controls all Devices. When initializing the compositor the WorkScheduler selects
 * all devices that will be used during compositing.
 * There are two types of Devices, CPUDevice and OpenCLDevice.
 * When an ExecutionGroup schedules a chunk, the schedule method of the WorkScheduler is called.
 * The WorkScheduler determines if the chunk can be run on an OpenCLDevice
 * (and that there is an OpenCLDevice available). If this is the case the chunk will be added to the worklist for
 * OpenCLDevices,
 * otherwise the chunk will be added to the worklist of CPUDevices.
 * A thread will read the worklist and send a WorkPackage to its device.
 * @see WorkScheduler.schedule method that is called to schedule a chunk
 * @see Device.execute method called to execute a chunk
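 * The choice between the two worklists can be sketched as a single condition. The types and the function
 * below (ChunkWork, WorkLists, scheduleChunkWork) are hypothetical and only illustrate the rule stated
 * above; they are not the signature of WorkScheduler.schedule.

```cpp
#include <cassert>
#include <deque>

// Hypothetical chunk description: only what the choice depends on.
struct ChunkWork {
    int chunkNumber;
    bool openCLCapable;  // can this chunk be calculated on an OpenCL device?
};

// Hypothetical pair of worklists, one per device type.
struct WorkLists {
    std::deque<ChunkWork> cpu;
    std::deque<ChunkWork> opencl;
};

// Put the work on the OpenCL list only when the chunk supports it and at
// least one OpenCL device exists; otherwise fall back to the CPU list.
void scheduleChunkWork(WorkLists &lists, const ChunkWork &work, int openCLDeviceCount)
{
    assert(openCLDeviceCount >= 0);
    if (work.openCLCapable && openCLDeviceCount > 0) {
        lists.opencl.push_back(work);
    }
    else {
        lists.cpu.push_back(work);
    }
}
```

 * Keeping the decision in one place means the worker threads only need to know which list belongs to
 * their own device type.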
 * @subsection CPUDevice CPUDevice
 * When a CPUDevice gets a WorkPackage the Device will get the input buffer that is needed to calculate the chunk.
 * Allocation is already done by the ExecutionGroup.
 * The output buffer of the chunk is then created.
 * The OutputOperation of the ExecutionGroup is called to execute the area of the output buffer.
 * @see ExecutionGroup
 * @see NodeOperation.executeRegion executes a single chunk of a NodeOperation
 * @see CPUDevice.execute
 * @subsection GPUDevice OpenCLDevice
 * @see NodeOperation.executeOpenCLRegion
 * @see OpenCLDevice.execute
 * @section executePixel Executing a pixel
 * Finally the last step, the node functionality :)
 * @page newnode Creating new nodes
 * @brief The main method that is used to execute the compositor tree.
 * It can be executed during editing (blenkernel/node.c) or rendering
 * (renderer/pipeline.c)
 * @param rd [struct RenderData]
 *   Render data for this composite, this won't always belong to a scene.
 * @param editingtree [struct bNodeTree]
 *   reference to the compositor editing tree
 * @param rendering [true false]
 *   This parameter determines whether the function is called from rendering (true) or editing (false).
 *   Based on this setting the system will work differently:
 *   - during rendering only the Composite and File output nodes will be calculated
 * @see NodeOperation.isOutputProgram(int rendering) of the specific operations
 *   - during editing all output nodes will be calculated
 * @see NodeOperation.isOutputProgram(int rendering) of the specific operations
 *   - a different quality setting can be used. The quality is determined by the bNodeTree fields.
 *     The quality can be modified by the user from within the node panels.
 * @see bNodeTree.edit_quality
 * @see bNodeTree.render_quality
 *   - output nodes can have different priorities in the WorkScheduler.
 * This is implemented in the COM_execute function.
 * @param viewSettings
 *   reference to view settings used for color management
 * @param displaySettings
 *   reference to display settings used for color management
 * OCIO_TODO: these options are only used in rare cases, namely in the output file node,
 *            so these settings could probably be passed in a nicer way.
 *            This should be checked further; they will probably also be needed for preview
 *            generation in display space
void COM_execute(RenderData *rd, bNodeTree *editingtree, int rendering,
                 const ColorManagedViewSettings *viewSettings, const ColorManagedDisplaySettings *displaySettings);
 * @brief Deinitialize the compositor caches and allocated memory.
 * Use COM_clearCaches to only free the caches.
void COM_deinitialize(void);
 * @brief Clear all compositor caches. (Compositor system will still remain available).
 * To deinitialize the compositor use the COM_deinitialize method.
// void COM_clearCaches(void); // NOT YET WRITTEN
 * @brief Return a list of highlighted bNode pointers.
void COM_startReadHighlights(void);
 * @brief Check if a bNode is highlighted.
int COM_isHighlightedbNode(bNode *bnode);