samie abdul
2015-02-19 14:49:38 UTC
Hi,
Is it possible to "precompile" the invoked kernels beforehand? My code uses several CUDA kernels, all of which are called from within a "fit" function. Profiling the code with cProfile yields:
42272 function calls (42228 primitive calls) in 1.662 seconds
...
11 0.000 0.000 0.344 0.031 compiler.py:185(compile)
11 0.002 0.000 0.346 0.031 compiler.py:245(__init__)
4 0.000 0.000 0.317 0.079 compiler.py:33(preprocess_source)
11 0.000 0.000 0.342 0.031 compiler.py:66(compile_plain)
...
Thus, about 0.344 of the 1.662 seconds (roughly 21%) are spent compiling the code. When "fit" is executed a second time, the code is not compiled again, so the second call saves these 0.344 seconds. I would like to precompile all involved kernels as soon as the object that the "fit" function belongs to is initialized.
Can one invoke the overall compilation process beforehand?
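To make the request concrete, here is a minimal sketch of the behaviour I am after. It is plain Python and does not call the real CUDA compiler: compile_kernel, Model, KERNEL_SOURCES and COMPILE_CALLS are hypothetical names used only for illustration, and the "kernels" are placeholder Python functions. The idea is simply that the constructor triggers every compilation once, so that "fit" only ever finds already-built kernels.

```python
# Illustrative sketch only: every name here is hypothetical, and
# compile_kernel is a stand-in for whatever really compiles the CUDA source.

COMPILE_CALLS = []  # records each (pretend) compilation so it can be inspected


def compile_kernel(name, source):
    """Pretend to compile `source`; return a placeholder callable."""
    COMPILE_CALLS.append(name)

    def kernel(values):
        # Placeholder "kernel": doubles every value on the host.
        return [v * 2 for v in values]

    return kernel


class Model:
    # Placeholder sources; real code would hold the CUDA kernel strings here.
    KERNEL_SOURCES = {
        "distance": "/* placeholder kernel source */",
        "update": "/* placeholder kernel source */",
    }

    def __init__(self, precompile=True):
        self._kernels = {}
        if precompile:
            # Pay the compilation cost once, at construction time.
            for name in self.KERNEL_SOURCES:
                self._get_kernel(name)

    def _get_kernel(self, name):
        # Compile on first request, then reuse the cached callable.
        if name not in self._kernels:
            self._kernels[name] = compile_kernel(name, self.KERNEL_SOURCES[name])
        return self._kernels[name]

    def fit(self, values):
        # Apply every kernel in turn; no compilation happens here
        # if the constructor already built them.
        for name in self.KERNEL_SOURCES:
            values = self._get_kernel(name)(values)
        return values
```

With precompile=True, all entries in COMPILE_CALLS are recorded during Model() and none during fit(); with precompile=False, the first fit() call pays for them instead, which is what I currently observe.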
Thanks!
Fabian