Saturday, 15 January 2011

concurrency - Two concurrent kernels on NVIDIA Kepler 3.0


In my program I have two kernels, and I launch each kernel with only two blocks of 256 threads.

  kernel1<<<2, 256>>>();
  kernel2<<<2, 256>>>();

The current execution of the program on the SMX units of the graphics card looks something like this (when profiling with the Visual Profiler, the two kernels execute one after the other):

  SMX1   SMX2
  -----------
  | K1   K1 |
  | K1   K1 |
  -----------
  |    |    |
  |    |    |
  -----------
  SMX3   SMX4

  SMX1   SMX2
  -----------
  | K2   K2 |
  | K2   K2 |
  -----------
  |    |    |
  |    |    |
  -----------
  SMX3   SMX4

I was thinking that it should be possible for both kernels to run at the same time within the same program, giving something like this and dividing the execution time by 2:

  SMX1   SMX2
  -----------
  | K1   K1 |
  | K1   K1 |
  -----------
  | K2   K2 |
  | K2   K2 |
  -----------
  SMX3   SMX4

In Kepler 3.5: the new "Hyper-Q" feature in the Kepler architecture allows multiple kernels to be launched simultaneously from multiple MPI processes (or other processes).

It is possible to get two kernels to execute simultaneously. For starters, you will need to launch the two kernels in different streams. Whether the execution time will be divided by 2, I cannot say. You may want to look at the sample that covers streams.
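As a rough illustration of launching in different streams, here is a minimal sketch. The kernel bodies, the buffer names `d_a` and `d_b`, and the stream names `s1` and `s2` are hypothetical placeholders, since the question does not show what the kernels do. The sketch assumes device 0 and omits most error checking for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-ins for the two kernels in the question.
__global__ void kernel1(float *data) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] = data[i] * 2.0f;
}

__global__ void kernel2(float *data) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] = data[i] + 1.0f;
}

int main() {
    const int blocks = 2, threads = 256, n = blocks * threads;

    // Report whether the device claims support for concurrent kernels.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("concurrentKernels = %d\n", prop.concurrentKernels);

    // Separate buffers so the two kernels have no data dependency.
    float *d_a, *d_b;
    cudaMalloc(&d_a, n * sizeof(float));
    cudaMalloc(&d_b, n * sizeof(float));
    cudaMemset(d_a, 0, n * sizeof(float));
    cudaMemset(d_b, 0, n * sizeof(float));

    // One non-default stream per kernel.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Launch configuration: <<<grid, block, dynamic shared memory bytes, stream>>>.
    kernel1<<<blocks, threads, 0, s1>>>(d_a);
    kernel2<<<blocks, threads, 0, s2>>>(d_b);

    // Wait for both streams to finish and report any error.
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(d_a);
    cudaFree(d_b);
    return 0;
}
```

Launching in separate streams only permits overlap; it does not guarantee it. Whether the two kernels actually run side by side depends on the device and on available resources, so the profiler timeline remains the way to confirm it.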
