Algorithm to select groups of similar items in 2d array
- by mafutrct
There is a 2d array of items (in my case they are called Intersections).
A certain item is given as a start. The task is to find all items directly or indirectly connected to this item that satisfy a certain function.
So the basic algorithm is like this:
Add the start to the result list.
Repeat until no modification: Add each item in the array that satisfies the function and touches any item in the result list to the result list.
My current implementation looks like this:
private IList<Intersection> SelectGroup (
Intersection start,
Func<Intersection, Intersection, bool> select)
{
List<Intersection> result = new List<Intersection> ();
Queue<Intersection> source = new Queue<Intersection> ();
source.Enqueue (start);
while (source.Any ()) {
var s = source.Dequeue ();
result.Add (s);
foreach (var neighbour in Neighbours (s)) {
if (select (start, neighbour)
&& !result.Contains (neighbour)
&& !source.Contains (neighbour)) {
source.Enqueue (neighbour);
}
}
}
Debug.Assert (result.Distinct ().Count () == result.Count ());
Debug.Assert (result.All (x => select (x, result.First ())));
return result;
}
private List<Intersection> Neighbours (IIntersection intersection)
{
int x = intersection.X;
int y = intersection.Y;
List<Intersection> list = new List<Intersection> ();
if (x > 1) {
list.Add (GetIntersection (x - 1, y));
}
if (y > 1) {
list.Add (GetIntersection (x, y - 1));
}
if (x < Size) {
list.Add (GetIntersection (x + 1, y));
}
if (y < Size) {
list.Add (GetIntersection (x, y + 1));
}
return list;
}
(The select function takes a start item and returns true iff the second item satisfies.)
This does its job and turned out to be reasonable fast for the usual array sizes (about 20*20). However, I'm interested in further improvements. Any ideas?
Example (X satisfies in relation to other Xs, . does never satisfy):
....
XX..
.XX.
X...
In this case, there are 2 groups: a central group of 4 items and a group of a single item in the lower left. Selecting the group (for instance by starting item [2, 2]) returns the former, while the latter can be selected using the starting item and sole return value [0, 3].
Example 2:
.A..
..BB
A.AA
This time there are 4 groups. The 3 A groups are not connected, so they are returned as separate groups. The bigger A and B groups are connected, but A does not related to B so they are returned as separate groups.