Parallelism

Counting with friends.

One of the fundamental problems in computing, one I commented on in September, and one that receives more attention with each passing year, is parallelism. How do you define a process that utilizes multiple workers? This is of course one of the underlying problems in many other fields, from city planning to sociology to business management. In computing it is analyzed in what may be its smallest, most manageable form.

# Small, Decentralized: Locks and Transactions

Let us take a simple example. There is a chalkboard on which is written exactly one number. There are a bunch of people, each observing some group of widgets; when one of the widgets they watch sproings, they increase the number by one. In other words, they are counting sproinging widgets in parallel.

Counting, though we don’t often think of it this way, involves several steps. First, you read the number on the board and remember it in your mind. Next, you erase the number from the board. Finally, you write the new, incremented number on the board.

But what happens if, as you reach up to erase the number, so does another person? Who wins? Do you both erase the 7 together, then both write up an 8, one atop the other? At some level that would work fine, except you would have counted only one sproing. This is what is called a race condition.
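A short sketch makes the lost-sproing failure concrete. The shared variable plays the chalkboard, and a deliberate pause widens the gap between reading and writing; the eight-watchers setup is just an assumption of the example:

```python
import threading
import time

count = 0  # the number on the chalkboard

def on_sproing():
    """Count one sproing the naive way: read, then write back."""
    global count
    seen = count       # read the number and remember it
    time.sleep(0.05)   # exaggerate the gap between reading and writing
    count = seen + 1   # erase and write the incremented number

# Eight watchers each see one sproing at roughly the same moment.
watchers = [threading.Thread(target=on_sproing) for _ in range(8)]
for w in watchers:
    w.start()
for w in watchers:
    w.join()

print(count)  # almost always less than 8: the threads overwrote each other
```

Because every thread tends to read the same initial number before anyone has written, most of the eight counts are simply lost.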

On the other hand, we could require that there be only one eraser present. Now we have a five-step process: grab the eraser, read, erase, write, drop the eraser. The eraser now serves as a mutex or lock, something that enforces that board updates happen in series, not in parallel. Work that is forced to be non-parallel like this is often called synchronized.
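Under the same assumptions as before, here is the five-step dance sketched with Python's `threading.Lock` standing in for the eraser:

```python
import threading

count = 0                   # the number on the chalkboard
eraser = threading.Lock()   # there is exactly one eraser

def on_sproing():
    """Count one sproing, but only while holding the eraser."""
    global count
    with eraser:            # grab the eraser (wait if someone else has it)
        count += 1          # read, erase, write: one person at a time
    # leaving the block drops the eraser for the next person

watchers = [threading.Thread(target=on_sproing) for _ in range(100)]
for w in watchers:
    w.start()
for w in watchers:
    w.join()

print(count)  # 100: no sproings lost
```

The `with` block guarantees the eraser is dropped even if the update raises, which is why it is the idiomatic way to hold a lock in Python.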

Now, the mutex may seem like it solves the problem of counting, and indeed you can’t do much better for this problem. But there is another approach which works better for other problems. This other approach uses “‍transactions‍”.

The idea here is that, instead of a chalkboard and eraser, we have sticky notes. To count we read the number from the note posted on the board, write a new note that is one larger, walk up to the board, and stick your new note up if and only if the note there right now is the same one you read earlier. When you do place the note it is called committing the transaction; when you don’t it is called rolling back the transaction.
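A minimal sketch of the sticky-note scheme: the board exposes only "read the posted note" and "commit if unchanged," and counters retry on rollback. The `StickyBoard` class and its method names are inventions of this example; real systems get the commit-if-unchanged primitive from hardware compare-and-swap instructions or from a database, so the small internal lock here merely simulates it.

```python
import threading

class StickyBoard:
    """A board updated transactionally: post a new note only if the
    old note is still the one you read (a simulated compare-and-swap)."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def read(self):
        return self._value

    def try_commit(self, expected, new):
        """Commit the new note if the posted one is still `expected`.
        Returns True on commit, False on rollback."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

board = StickyBoard()

def on_sproing():
    while True:                          # retry until our note sticks
        seen = board.read()              # read the posted note
        if board.try_commit(seen, seen + 1):
            return                       # committed
        # someone posted first: roll back and try again

counters = [threading.Thread(target=on_sproing) for _ in range(100)]
for c in counters:
    c.start()
for c in counters:
    c.join()

print(board.read())  # 100
```

Note that losing the race costs a retry rather than a lost count, which is why this optimistic style shines when conflicts are rare.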

# Human Approaches

When you are on a collision course with another person, you likely use what I call “guess and hope” resolution to determine who goes in which direction. That is, you dodge to one side and hope they either dodge later or dodge the other way. When that doesn’t work, you repeat, probably three or four times before you switch to speech; and even then it’s still possible to be too synchronized and have to try several times before only one of you is talking. In theory this could go on forever.
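The “guess and hope” loop can be sketched as randomized symmetry breaking, the same idea behind collision handling in protocols like Ethernet; the function and its names are made up for illustration:

```python
import random

def dodge_until_resolved(max_rounds=1000):
    """Two walkers each pick a side at random, repeating until they differ.
    Returns the number of rounds it took, or None if they never resolved."""
    for round_no in range(1, max_rounds + 1):
        you = random.choice(["left", "right"])
        them = random.choice(["left", "right"])
        if you != them:
            return round_no  # you slip past on opposite sides
        # same side again: awkward shuffle, try once more
    return None  # still perfectly mirrored (vanishingly unlikely)

print(dodge_until_resolved())  # usually 1 or 2
```

Each round resolves with probability one half, so “forever” has probability zero but is never strictly impossible.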

If you work for a large company, you probably participate in centralized parallelism. Some person is in charge of splitting tasks into mostly non-intersecting pieces and you are in charge of doing one of those tasks. If you and another employee find your tasks compete too directly there’s a good chance you’ll go back to that central organizer for resolution.

If you’ve played almost any tabletop game—card games, board games, whatever—you’ve engaged in perhaps the least parallel version of parallelism: taking turns. In theory much of the processing load, the planning of what to do, takes place in parallel, but anything that can influence other players is synchronized through a very simple round-robin token passing.
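Turn taking can be sketched as round-robin token passing: a shared `turn` variable is the token, and each player waits until it names them. The setup here is illustrative, not any particular game:

```python
import threading

PLAYERS = 3
ROUNDS = 2

turn = 0                        # the token: whose turn it is
table = threading.Condition()   # everyone watches the same table
moves = []                      # the shared, synchronized game state

def player(me):
    global turn
    for _ in range(ROUNDS):
        with table:
            while turn != me:               # wait for the token
                table.wait()
            moves.append(me)                # take a turn (visible to all)
            turn = (turn + 1) % PLAYERS     # pass the token clockwise
            table.notify_all()              # wake the others to check

players = [threading.Thread(target=player, args=(i,)) for i in range(PLAYERS)]
for p in players:
    p.start()
for p in players:
    p.join()

print(moves)  # [0, 1, 2, 0, 1, 2] -- strict turn order, every run
```

However the threads are scheduled, the token admits only one mover at a time, so the move order is fully deterministic even though the planning (here trivial) could happen in parallel.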

But to me the most iconic way of handling parallelism in human endeavor is the traffic signal. We take myriad people each moving individually; we then separate their travel into parts where they cooperate (streets) and places where we arbitrarily serialize all their actions (signaled intersections). Neither part is perfect, but we trust the sort of ad-hoc human synchronization to handle the parallelism between intersections and we (more-or-less) patiently sit through the inefficient signals.

Such is parallelism. Get more stuff done faster, but beware: you might stomp on yourself by mistake.