Foreword: I'm aware that optimizing this bottleneck is not a necessity - the engine is already very fast. I, however, for fun and educational purposes, would love to find a way to make the engine even faster.
I'm creating a general-purpose C++ 2D collision detection/response engine, with an emphasis on flexibility and speed.
Here's a very basic diagram of its architecture:
Basically, the main class is World, which owns (manages memory) of a ResolverBase*, a SpatialBase* and a vector<Body*>.
SpatialBase is a pure virtual class which deals with broad-phase collision detection.
ResolverBase is a pure virtual class which deals with collision resolution.
The bodies communicate to the World::SpatialBase* with SpatialInfo objects, owned by the bodies themselves.
There currenly is one spatial class: Grid : SpatialBase, which is a basic fixed 2D grid. It has it's own info class, GridInfo : SpatialInfo.
Here's how its architecture looks:
The Grid class owns a 2D array of Cell*. The Cell class contains two collection of (not owned) Body*: a vector<Body*> which contains all the bodies that are in the cell, and a map<int, vector<Body*>> which contains all the bodies that are in the cell, divided in groups.
Bodies, in fact, have a groupId int that is used for collision groups.
GridInfo objects also contain non-owning pointers to the cells the body is in.
As I previously said, the engine is based on groups.
Body::getGroups() returns a vector<int> of all the groups the body is part of.
Body::getGroupsToCheck() returns a vector<int> of all the groups the body has to check collision against.
Bodies can occupy more than a single cell. GridInfo always stores non-owning pointers to the occupied cells.
After the bodies move, collision detection happens. We assume that all bodies are axis-aligned bounding boxes.
How broad-phase collision detection works:
Part 1: spatial info update
For each Body body:
Top-leftmost occupied cell and bottom-rightmost occupied cells are calculated.
If they differ from the previous cells, body.gridInfo.cells is cleared, and filled with all the cells the body occupies (2D for loop from the top-leftmost cell to the bottom-rightmost cell).
body is now guaranteed to know what cells it occupies. For a performance boost, it stores a pointer to every map<int, vector<Body*>> of every cell it occupies where the int is a group of body->getGroupsToCheck(). These pointers get stored in gridInfo->queries, which is simply a vector<map<int, vector<Body*>>*>.
body is now guaranteed to have a pointer to every vector<Body*> of bodies of groups it needs to check collision against. These pointers are stored in gridInfo->queries.
Part 2: actual collision checks
For each Body body:
body clears and fills a vector<Body*> bodiesToCheck, which contains all the bodies it needs to check against. Duplicates are avoided (bodies can belong to more than one group) by checking if bodiesToCheck already contains the body we're trying to add.
const vector<Body*>& GridInfo::getBodiesToCheck()
{
bodiesToCheck.clear();
for(const auto& q : queries) for(const auto& b : *q) if(!contains(bodiesToCheck, b)) bodiesToCheck.push_back(b);
return bodiesToCheck;
}
The GridInfo::getBodiesToCheck() method IS THE BOTTLENECK.
The bodiesToCheck vector must be filled for every body update because bodies could have moved meanwhile. It also needs to prevent duplicate collision checks. The contains function simply checks if the vector already contains a body with std::find.
Collision is checked and resolved for every body in bodiesToCheck.
That's it.
So, I've been trying to optimize this broad-phase collision detection for quite a while now. Every time I try something else than the current architecture/setup, something doesn't go as planned or I make assumption about the simulation that later are proven to be false.
My question is: how can I optimize the broad-phase of my collision engine maintaining the grouped bodies approach?
Is there some kind of magic C++ optimization that can be applied here?
Can the architecture be redesigned in order to allow for more performance?
Actual implementation: SSVSCollsion
Body.h, Body.cpp
World.h, World.cpp
Grid.h, Grid.cpp
Cell.h, Cell.cpp
GridInfo.h, GridInfo.cpp