The final Algorithm class is derived from Hybrid_, which in turn inherits from both DefaultCPU and DefaultGPU. Hybrid_ takes a template parameter called DevicePolicies, which determines which of its two parents (the classes that provide the implementations) is used for each routine. The DevicePolicies class itself holds a list of policies of the form PolicyName<Device>, which can be searched recursively to match a given routine to its device.
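The recursive search through the policy list can be illustrated with a small compile-time lookup. This is only a sketch of the general technique, not the code HyCuda generates: the device tags CPU and GPU, the policies Routine1 and Routine2, and the helper FindDevice are hypothetical names chosen for illustration.

```cpp
#include <type_traits>

// Hypothetical device tags (assumption: the generated code may name these differently).
struct CPU {};
struct GPU {};

// Hypothetical policies of the form PolicyName<Device>, one per routine.
template <typename Device> struct Routine1 {};
template <typename Device> struct Routine2 {};

// A policy list in the spirit of DevicePolicies.
template <typename... Policies> struct DevicePolicies {};

// FindDevice<Policy, List>::type is the device that List assigns to Policy.
// The primary template is left undefined, so a missing policy is a compile error.
template <template <typename> class Policy, typename List>
struct FindDevice;

// Match: the head of the list is Policy<Device>, so Device is the answer.
template <template <typename> class Policy, typename Device, typename... Tail>
struct FindDevice<Policy, DevicePolicies<Policy<Device>, Tail...>>
{
    using type = Device;
};

// No match: drop the head of the list and recurse on the tail.
template <template <typename> class Policy, typename Head, typename... Tail>
struct FindDevice<Policy, DevicePolicies<Head, Tail...>>
    : FindDevice<Policy, DevicePolicies<Tail...>>
{};

// Example list: Routine1 runs on the CPU, Routine2 on the GPU.
using ExamplePolicies = DevicePolicies<Routine1<CPU>, Routine2<GPU>>;
```

With such a lookup, a class like Hybrid_ could forward each routine to the parent that corresponds to the device found for it; everything is resolved at compile time, so the dispatch itself costs nothing at run time.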
At the root of the inheritance tree stand three classes: DefaultBase, Memory and Timer. Both DefaultCPU and DefaultGPU derive virtually from these classes, so that they share the same base objects once instantiated.
The DefaultBase class is a convenience class to which the user can add more data, or through which the CPU and GPU implementations can communicate. It also saves the user from having to modify the header files of DefaultCPU and DefaultGPU, which are regenerated and overwritten on a second call to HyCuda.
Memory holds pointers to both the CPU and GPU memory. Both implementations need to be aware of each memory pool, and the memory should be allocated only once when the object is instantiated, hence the virtual derivation.
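The effect of the virtual derivation can be seen in a stripped-down model of the hierarchy. The sketch below is an assumption-laden illustration, not the generated code: Memory is reduced to a single host-side buffer with a construction counter, the member functions fill and first are hypothetical, and Hybrid stands in for Hybrid_ without its template parameter.

```cpp
#include <cassert>
#include <vector>

// Stand-in for Memory: one host buffer plus a counter of constructed objects.
// (Assumption: the real class also manages GPU memory, omitted here.)
struct Memory
{
    static int instances;
    std::vector<float> host;

    Memory() : host(16) { ++instances; }
};
int Memory::instances = 0;

// Both implementations derive virtually, so they refer to one shared Memory.
struct DefaultCPU : virtual Memory
{
    void fill(float value)          // hypothetical CPU routine
    {
        for (float &x : host)
            x = value;
    }
};

struct DefaultGPU : virtual Memory
{
    float first() const             // hypothetical routine reading the same pool
    {
        return host.front();
    }
};

// Stand-in for Hybrid_: inherits both implementations.
struct Hybrid : DefaultCPU, DefaultGPU {};

// True when both parents see the very same Memory subobject.
inline bool sharesMemory(Hybrid &h)
{
    Memory *viaCPU = static_cast<DefaultCPU *>(&h);
    Memory *viaGPU = static_cast<DefaultGPU *>(&h);
    assert(viaCPU != nullptr && viaGPU != nullptr);
    return viaCPU == viaGPU;
}
```

Constructing one Hybrid object constructs Memory exactly once, and data written through DefaultCPU is visible through DefaultGPU. Without the `virtual` keyword, Hybrid would contain two separate Memory subobjects and any access to `host` from Hybrid would be ambiguous.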
For your convenience, Timer times every routine and every memory transfer, so that different paths through device-space can be compared without effort. Timing the algorithm does involve a little overhead; when the macro NO_TIMERS is defined, no timing is performed.
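A timer that can be compiled out with a macro might look roughly as follows. This is a minimal sketch under stated assumptions: only the macro name NO_TIMERS comes from the text above, while the member functions timed and seconds and the internal bookkeeping are hypothetical and need not match the interface of the real Timer class.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

class Timer
{
    // Accumulated wall-clock time per routine name, in seconds.
    std::map<std::string, double> d_seconds;

public:
    // Runs fun(); records its duration under `name` unless NO_TIMERS is defined.
    template <typename Function>
    void timed(std::string const &name, Function &&fun)
    {
#ifndef NO_TIMERS
        auto start = std::chrono::steady_clock::now();
        fun();
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        assert(elapsed.count() >= 0.0);   // steady_clock is monotonic
        d_seconds[name] += elapsed.count();
#else
        (void)name;                       // timing disabled: just run the routine
        fun();
#endif
    }

    // Total time recorded for `name`; 0.0 if nothing was recorded.
    double seconds(std::string const &name) const
    {
        auto it = d_seconds.find(name);
        return it == d_seconds.end() ? 0.0 : it->second;
    }
};
```

Compiling with `-DNO_TIMERS` removes the clock calls entirely, so the routine runs without the bookkeeping overhead and `seconds` reports 0.0 for every name.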
Joren Heit 2013-12-17