Two Kinds of Concurrency, Part 1: The Fast and the Dangerous™

In software, concurrency is when multiple tasks have overlapping timespans.

Concurrent execution is the opposite of sequential execution. When code runs sequentially, one task finishes completely before the next task starts. You are already used to sequential execution: when you write code, each statement finishes its work before the next statement starts. If you put a loop in your code, the program waits for the loop to finish before moving on to the next thing. If you call a function, the program waits for the function to finish before moving on to the next thing. Code in Java (and Python, and most other common programming languages) is fundamentally sequential.

When code runs concurrently, one task or operation starts before others are finished. Why would you do that? And how does it work?! There are multiple answers — very, very different answers! — to both of those questions. In this two-part activity, you will look at two alternatives.

Take a look in the paralleldanger package. The SisyphusStream class has a main method with this ridiculous stream:

LongStream.range(0, 100L)
    .map(n -> 1)

What does this stream do? Why would we want that?! Because we’re going to use it for performance testing. (The L in 100L means “long.” It is there so we can make that number get much, much bigger later on.)
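
As a quick illustration of why that suffix matters: an integer literal without the L is an int, and an int maxes out at 2,147,483,647, so the bigger numbers we will use later would not even compile without it.

long tooBig    = 10000000000;   // compile error: integer number too large
long justRight = 10000000000L;  // fine: the L makes it a long literal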

Java streams have a forEach method that calls a closure with each element of the stream. Note that this is different from map. The map closure must return a value, and map gives you a new stream containing the transformed values. forEach, by contrast, is a terminal operation: its closure does not return anything, and forEach does not give you a new stream. It simply passes each element in turn to the closure you provide, then does nothing else with the values.
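
To see the difference concretely, here is a quick sketch (not part of the project) that contrasts the two:

LongStream doubled = LongStream.range(0, 5)
    .map(n -> n * 2);  // map is lazy: this builds a new stream but runs nothing yet

doubled.forEach(System.out::println);  // forEach is terminal: prints 0, 2, 4, 6, 8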

Let’s use forEach to print each element in the stream, and see the number 1 repeated 100 times. Exciting! (Not really, but testing our code is always exciting.) Update the main method:

LongStream.range(0, 100L)
    .map(n -> 1)
    .forEach(System.out::println);

Run that and make sure it works.

Our goal is to compute the sum of the numbers in the stream. It should sum to 100, of course! (This is a nice task for performance testing: the correct answer is obvious, but it will still take some amount of time to compute.)

Take a quick look at the SumCalculator class. Then update the main method to use it to compute a sum:

SumCalculator calc = new SumCalculator();
LongStream.range(0, 100L)
    .map(n -> 1)
    .forEach(calc::add);  // Add each stream element to the total
System.out.println(calc.getTotal());  // Print the total

Note the new forEach call. What is it doing? What method does it call? What instance variable does that method modify?
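
(If you don’t have the project open, SumCalculator is presumably close to this minimal sketch, reconstructed from the calls above and the total field we will meet below; the real class may differ in its details:)

public class SumCalculator {
    private long total;  // the running sum, shared by every call to add

    public void add(long x) {
        total += x;  // this innocent line will matter a great deal later
    }

    public long getTotal() {
        return total;
    }
}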

OK, let’s start timing! The project also contains a Timer class for doing performance testing. You do not need to read and understand it now, although you are welcome to study it after class. Here is how to use it:

Timer.measure(20, () -> {  // Repeat the following test 20 times:
    SumCalculator calc = new SumCalculator();
    LongStream.range(0, 100L)  // Process this many numbers
        .map(n -> 1)
        .forEach(calc::add);
});

Copy that code into your main method, and run it. Study the output. Everything make sense? Grab an instructor or preceptor if you have questions.
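
(Curious what Timer.measure might look like inside? The real class in the project is likely more polished, but the core idea is roughly this sketch:)

public class Timer {
    public static void measure(int reps, Runnable task) {
        long totalNanos = 0;
        for (int rep = 0; rep < reps; rep++) {
            long start = System.nanoTime();
            task.run();                       // run one repetition of the test
            long elapsed = System.nanoTime() - start;
            totalNanos += elapsed;
            System.out.println("rep " + rep + ": " + elapsed / 1e9 + " sec");
        }
        System.out.println("average: " + totalNanos / (double) reps / 1e9 + " sec");
    }
}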

Now the fun begins! Add one zero at a time to 100L until the average rep is longer than 0.5 seconds. How many zeros did you have to add? How fast is a computer, anyway?! Yikes!

Aside: Why are the subsequent repetitions so much faster than the first ones? Read on if you’re curious; skip ahead if time is running short.

Java runs your code in a sort of “quick start, slow speed” mode when your program first begins. Java then observes where your code is spending most of its time, and plans out faster ways of running the particular code that is causing performance problems.

This strategy is a part of Java’s JIT — “Just In Time” — compilation strategy. To “compile” code essentially means “prepare the code in advance for running.” Here, the “just in time” means that Java waits until your code is just about to start running before fully compiling it. And then — here’s the key part! — Java may compile it again in a different way, after it’s already started, based on what is actually happening.

For this reason, Java code is often slow at first, then much faster. Keep this in mind when you do performance testing! This is one reason why Timer.measure does many repetitions and takes the average.

Write down the average time from your test. Now that we have something slow, we’re ready to make it faster.

Did you know that most modern computers, probably including your computer, can do multiple things simultaneously? This is called parallel execution, or just parallelism. “Parallel” doesn’t just mean different tasks taking turns while they work; it means different tasks happening at the same time. (When you hear computer advertisements talking about a “4-core processor!” or an “8-core processor!”, this is the thing they’re referring to: how many things the computer can do truly simultaneously.)
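
(Want to know how many of those cores Java sees on your machine? This standard library one-liner reports it. Note that the number may count “logical” hyperthreaded cores, so it can be higher than the physical core count:)

System.out.println(Runtime.getRuntime().availableProcessors());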

Java streams have an amazing feature: you can tell them to run in parallel! And it takes only one line of code:

Timer.measure(20, () -> {
    SumCalculator calc = new SumCalculator();
    LongStream.range(0, 1000000000L)
        .parallel()  // Magical fairy dust here
        .map(n -> 1)
        .forEach(calc::add);
});

Run your code again. If you have a multi-core computer, it will be faster! (How much faster? Compare to your average time from before.)

Amazing! Just like magic!

But…hmmmmm, we’re not actually printing the total anymore. We really ought to check that we’re still getting the right answer. Add a line to print the total for each repetition:

Timer.measure(20, () -> {
    SumCalculator calc = new SumCalculator();
    LongStream.range(0, 1000000000L)
        .parallel()
        .map(n -> 1)
        .forEach(calc::add);
    System.out.println("total=" + calc.getTotal());  // Just to be sure
});

If your computer has a single core, you will see the correct answer. But if you have a multi-core computer — as most computers these days are — you will see some very puzzling results.

Wait, was it ever correct? Comment out the .parallel() line and watch your code succeed slowly. Uncomment the .parallel() line and watch your code fail quickly.

The problem lives in this innocent-looking line in SumCalculator:

total += x;

That line of code does three different things:

  1. Read the current value of total.
  2. Compute a new result.
  3. Write the new result back to total.
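
In other words, total += x behaves as if it were spelled out in three separate statements:

long current = total;       // 1. read the current value of total
long result = current + x;  // 2. compute a new result
total = result;             // 3. write the new result back to total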

When we made the stream run in parallel, we made it so that multiple calls to the add method are doing those three steps simultaneously. Suppose that total is currently 999. If we’re lucky, very lucky, then this happens:

Parallel worker 1          Parallel worker 2
Read total → 999
Add 1 to 999 → 1000
Write 1000 → total
                           Read total → 1000
                           Add 1 to 1000 → 1001
                           Write 1001 → total

But sometimes we’ll get unlucky, and something like this happens:

Parallel worker 1          Parallel worker 2
Read total → 999
                           Read total → 999
Add 1 to 999 → 1000
Write 1000 → total
                           Add 1 to 999 → 1000
                           Write 1000 → total

…and then we get the wrong answer.

This is called a race condition: the correctness of a computation depends on the exact timing of concurrent tasks. The two workers here are in a data race, each one racing to write its result back to total before the other can read it.

The fundamental source of trouble here is that even though there are multiple parallel workers, they are all sharing the same instance variable of the same SumCalculator object:

private long total;

The term for this is shared mutable state: “shared” because multiple things all use it, “mutable” because it changes. Shared mutable state is a common source of bugs in many situations. When parallelism is present, it can be deadly.
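
(An aside, since we take a different route below: if you genuinely needed a shared counter across parallel workers, the standard library has tools built for exactly that. java.util.concurrent.atomic.AtomicLong, for example, performs the whole read-compute-write cycle as one indivisible step:

// Requires: import java.util.concurrent.atomic.AtomicLong;
AtomicLong total = new AtomicLong();
LongStream.range(0, 1000000000L)
    .parallel()
    .map(n -> 1)
    .forEach(total::addAndGet);  // atomic: no updates get lost
System.out.println("total=" + total.get());

It gives correct answers, but with a billion updates all contending for one variable it can be painfully slow. That contention problem is part of why the approach at the end of this activity is better.)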

On effectively final variables, lambdas, closures, and race conditions

Were you confused by the “effectively final” concept when you learned about lambdas / closures in Graphic Calculator? One of the reasons the Java language designers created that concept is because of race conditions.

Imagine code like this:

int sum = 0;
SumCalculator calc = new SumCalculator();
LongStream.range(0, 1000000000L)
    .parallel()  // Magical fairy dust here
    .map(n -> 1)
    .forEach(i -> sum += 1); // looks innocent enough, but will not compile

The idea there is that the closure will capture sum, and every time the lambda is called, it will add 1 to sum. If the stream has 1000 elements, the lambda gets called 1000 times, adds 1 to sum 1000 times, and you get 1000. Right?

The exact same kind of race condition as above would happen here:

  1. Parallel worker 1 reads sum.
  2. Parallel worker 2 reads sum – getting the same value!
  3. Worker 1 writes back the incremented value of sum.
  4. Worker 2 writes its incremented value of sum, which is the same as what worker 1 just wrote! We processed two values, so sum should increase by 2, but it only increases by 1.

This race condition is one of the things that motivated the language designers to forbid the above code and come up with the concept of “effectively final”. In fact, the language designers wrote about this while working on adding lambdas to Java – see section 7, variable capture.
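
One more trap worth naming: you may discover that you can silence the compiler by smuggling the variable into a one-element array, since the array reference itself stays effectively final. Here is a sketch of that trap, not a recommendation:

long[] sum = { 0 };  // the array reference never changes, so this compiles...
LongStream.range(0, 1000000000L)
    .parallel()
    .map(n -> 1)
    .forEach(i -> sum[0] += 1);  // ...but the read-add-write race is still here

The compiler is satisfied, but the race condition is completely unchanged.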

Our simple, innocent-looking SumCalculator class is the source of our trouble. It isn’t built for parallelism. (The term people often use for this is that it “isn’t thread-safe.” In this context, “thread” is a term for the abstraction the computer uses to manage parallelism, and even simulate it on single-core computers.)

Fortunately, we don’t need SumCalculator at all: Java already provides a sum method for streams. And unlike the SumCalculator class, it knows how to handle parallelism:

Timer.measure(20, () -> {
    long total = LongStream.range(0, 1000000000L)
        .parallel()
        .map(n -> 1)
        .sum();  // Instead of using SumCalculator
    System.out.println("total=" + total);
});

Run it. Look at those timings! Look at those correct results!

How precisely does Java avoid the pitfalls of shared mutable state and make that sum method work? How do parallel algorithms work in general? It is a complicated and fascinating topic, so complicated and fascinating that we devote an entire course to it (COMP 445: Parallel and Distributed Processing). These are not questions we answer in COMP 127. However, there are still lessons we can learn right now.

  • Parallel computing can make things much faster. It is amazing!

  • Parallel computing can also make code break. It is dangerous.

  • The problems that parallelism causes can be surprising and hard to spot. Do not assume your code works just because it looks right and you’ve thought about it very hard.

  • Measure performance. Don’t introduce danger or complexity to make things faster until you know you actually have a problem.

    “Programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.”

    Donald Knuth, 1974

  • If you don’t need parallelism, don’t use it.

  • If you do need the performance benefits of parallelism, let a library handle it for you whenever possible. Let somebody else’s well-engineered code do the tricky stuff!

    Don’t get clever when something simple and obvious will do the job.

  • If your own code does need to exist in a context where there is parallelism, remember that you are walking on lava, and beware of shared mutable state.

    Beware the following words that warn you that you might be in the danger realm:

    • parallel
    • thread
    • multitasking
    • distributed

Get up! Stretch! Then proceed to part 2.