Simple task framework - building software from reusable pieces
- by RuslanD
I'm writing a web service with several APIs, and they will be sharing some of the implementation code. In order not to copy-paste, I would like to ideally implement each API call as a series of tasks, which are executed in a sequence determined by the business logic.
One obvious question is whether that's the best strategy for code reuse, or whether I can look at it in a different way. But assuming I want to go with tasks, several issues arise:
What's a good task interface to use?
How do I pass data computed in one task to another task in the sequence that might need it?
In the past, I've worked with task interfaces like:
interface Task<T, U> {
U execute(T input);
}
Then I also had sort of a "task context" object which had getters and setters for any kind of data my tasks needed to produce or consume, and it gets passed to all tasks. I'm aware that this suffers from a host of problems. So I wanted to figure out a better way to implement it this time around.
My current idea is to have a TaskContext object which is a type-safe heterogeneous container (as described in Effective Java). Each task can ask for an item from this container (task input), or add an item to the container (task output). That way, tasks don't need to know about each other directly, and I don't have to write a class with dozens of methods for each data item. There are, however, several drawbacks:
Each item in this TaskContext container should be a complex type that wraps around the actual item data. If task A uses a String for some purpose, and task B uses a String for something entirely different, then just storing a mapping between String.class and some object doesn't work for both tasks. The other reason is that I can't use that kind of container for generic collections directly, so they need to be wrapped in another object.
This means that, based on how many tasks I define, I would need to also define a number of classes for the task items that may be consumed or produced, which may lead to code bloat and duplication. For instance, if a task takes some Long value as input and produces another Long value as output, I would have to have two classes that simply wrap around a Long, which IMO can spiral out of control pretty quickly as the codebase evolves.
I briefly looked at workflow engine libraries, but they kind of seem like a heavy hammer for this particular nail. How would you go about writing a simple task framework with the following requirements:
Tasks should be as self-contained as possible, so they can be composed in different ways to create different workflows.
That being said, some tasks may perform expensive computations that are prerequisites for other tasks. We want to have a way of storing the results of intermediate computations done by tasks so that other tasks can use those results for free.
The task framework should be light, i.e. growing the code doesn't involve introducing many new types just to plug into the framework.