What strategies are efficient for handling concurrent reads on heterogeneous multi-core architectures?
- by fabrizioM
I am tackling the challenge of using both the capabilities of an 8-core machine and a high-end GPU (Tesla 10).
I have one big input file, one thread for each core, and one thread for the GPU.
The GPU thread, to be efficient, needs a large batch of lines from the input, while
each CPU thread needs only one line to proceed (storing multiple lines in a temporary buffer was slower). The file does not need to be read sequentially. I am using boost.
My current strategy is to put a mutex on the input stream, which each thread locks and unlocks around its read (a simplified sketch is below).
This is not optimal, because the GPU thread should have higher priority when acquiring the mutex, since it is the fastest and most demanding consumer.
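To make the setup concrete, here is a stripped-down sketch of the current locking scheme; the file name, batch size, and the `process_on_*` calls are placeholders, not the real code:

```cpp
#include <fstream>
#include <string>
#include <vector>
#include <boost/thread.hpp>

// Shared input stream guarded by a single mutex (names are illustrative).
std::ifstream input("data.txt");
boost::mutex input_mutex;

// CPU worker: grabs one line per critical section.
void cpu_worker()
{
    std::string line;
    for (;;) {
        {
            boost::mutex::scoped_lock lock(input_mutex);
            if (!std::getline(input, line))
                return;                     // end of file
        }
        // process_on_cpu(line);            // placeholder for the real work
    }
}

// GPU worker: grabs a large batch of lines in one critical section.
void gpu_worker(std::size_t batch_size)
{
    std::vector<std::string> batch;
    for (;;) {
        batch.clear();
        {
            boost::mutex::scoped_lock lock(input_mutex);
            std::string line;
            while (batch.size() < batch_size && std::getline(input, line))
                batch.push_back(line);
        }
        if (batch.empty())
            return;                         // end of file
        // process_on_gpu(batch);           // placeholder for the real work
    }
}
```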
I can come up with different solutions, but before rushing into an implementation I would like some guidelines.
What approach do you use or recommend?