Writing correct concurrent programs is harder than writing sequential ones. This is because the set of potential risks and failure modes is larger – anything that can go wrong in a sequential program can also go wrong in a concurrent one, and with concurrency comes additional hazards not present in sequential programs such as race conditions, data races, deadlocks, missed signals, and livelock.


Testing concurrent programs is also harder than testing sequential ones. This is trivially true: tests for concurrent programs are themselves concurrent programs. But it is also true for another reason: the failure modes of concurrent programs are less predictable and repeatable than for sequential programs. Failures in sequential programs are deterministic; if a sequential program fails with a given set of inputs and initial state, it will fail every time. Failures in concurrent programs, on the other hand, tend to be rare probabilistic events.


Because of this, reproducing failures in concurrent programs can be maddeningly difficult. Not only might the failure be rare, and therefore not manifest itself frequently, but it might not occur at all in certain platform configurations, so that bug that happens daily at your customer’s site might never happen at all in your test lab. Further, attempts to debug or monitor the program can introduce timing or synchronization artifacts that prevents the bug from appearing at all. As in Heisenberg’s uncertainty principle, observing the state of the system may in fact change it.


So, given all this depressing news, how are we supposed to ensure that concurrent programs work properly? The same way we manage complexity in any other engineering endeavor – attempt to isolate the complexity.


Structuring programs to limit concurrent interactions


It is possible to write functioning programs entirely with public, static variables. Mind you, it’s not a good idea, but it can be done – it’s just harder, and more fragile. The value of encapsulation is that it makes it possible to analyze the behavior of a portion of a program without having to review the code for the entire program.


Similarly, by encapsulating concurrent interactions in a few places, such as workflow managers, resource pools, work queues, and other concurrent objects, it becomes simpler to analyze and test concurrent programs. Once the concurrent interactions are encapsulated, you can focus the majority of your testing efforts primarily on the concurrency mechanisms themselves.


Concurrency mechanisms, such as shared work queues, often act as conduits for moving objects from one thread to another. These mechanisms contain sufficient synchronization to protect the integrity of their internal data structures, but the objects being passed in and out belong to the application, not the work queue, and the application is responsible for the thread-safety of these objects. You can make these domain objects thread-safe (making them immutable is often the easiest and most reliable way to do so), but there is often another option: make them effectively immutability.


Effectively immutable objects are those which are not necessarily immutable by design – they may have mutable state – but which the program treats as if they were immutable after they are published where they might be accessed by other threads. In other words, once you put a mutable object into a shared data structure, where other threads might then have access to it, make sure that it is not modified again by any thread. The judicious use of Immutability and effective immutability limit the range of potentially incorrect concurrent actions by limiting mutability to a few core classes that can be strenuously unit-tested.


Listing 1 shows an example of how effective immutability can greatly simplify testing. The client code submits a request to a work manager, in this case an Executor, to factor a large number. The calculation is represented as a Callable<BigInteger[]>, and the Executor returns a Future<BigInteger[]> representing the calculation. The client code then waits on the Future for the result.


The FactorTask class is immutable, and therefore thread-safe, so no additional testing is required to prevent unwanted concurrent interactions. But FactorTask returns an array, and arrays are mutable. Shared mutable state needs to be guarded with synchronization, but because the application code is structured so that once the array of BigIntegers is returned by the FactorTask its contents are never modified, the client and task code can “piggyback” on the synchronization implicit in the Executor framework and do not need to provide additional synchronization when accessing the array of factors. If it were possible that any thread might modify the contents of the array of factors after it was created, this technique would not work.


    ExecutorService exec = …
class FactorTask implements Callable<BigInteger[]> {
private final BigInteger number;

public FactorTask(BigInteger number) {
this.number = number;
}

public BigInteger[] call() throws Exception {
return factorNumber(number);
}
}

Future<BigInteger[]> future = exec.submit(new FactorTask(number));
// do some stuff
BigInteger[] factors = future.get();



This technique can be combined with nearly all the concurrency mechanisms in the class library, including Executor, BlockingQueue, and ConcurrentMap – by only passing effectively immutable objects to these facilities (and returning effectively immutable objects from callbacks), you can avoid much of the complexity of creating and testing thread-safe classes.


Testing concurrent building blocks


Once you’ve isolated concurrent interactions to a handful of components, you can focus your testing efforts on those components. Since testing concurrent code is difficult, you should expect to spend more time designing and executing concurrent tests than you do for sequential ones.


The factors below are some of the concepts to consider when designing and running tests for concurrent classes. They, as well as others, are covered in much greater detail in the Testing chapter of Java Concurrency in Practice.



All this sounds like a lot of work, and it is. But by limiting the scope of concurrent interactions to a few widely-used, well-tested components, you greatly reduce the amount of effort required to test an application. And, by reusing existing tested library components, such as the classes in the java.util.concurrent package, you further reduce your testing burden.


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.