Efficiency concerning thread granularity
- by MaelmDev
Lately, I've been thinking of ways to use multithreading to improve the speed of different parts of a game engine. What confuses me is the appropriate granularity of threads, especially when dealing with single-instruction-multiple-data (SIMD) tasks.
Let's use line-of-sight detection as an example. Each AI actor must be able to detect objects of interest around them and mark them. There are three basic ways to go about this with multithreading:
Don't use threading at all.
Create a thread for each actor.
Create a thread for each actor-object combination.
Option 1 is obviously going to be the least efficient method. However, choosing between the next two options is more difficult. Only using one thread per actor is still running through every object in series instead of in parallel. However, are CPU's able to create and join threads in the granularity posed in Option 3 efficiently? It seems like that many calls to the OS could be really slow, and varying enormously between different hardware.