System Design: Mutual Exclusion on Distributed Systems

Scenario: x = 10. At some moment, machine A executes “x *= 10” while machine B executes “x += 10”. Either x = 110 (A first) or x = 200 (B first) is a correct result. Unfortunately, without coordination between A and B, both machines may read x = 10, and the result becomes x = 100 or x = 20 (one update is lost).
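The lost update can be reproduced with two threads standing in for machines A and B; a minimal sketch in Python, where a lock plays the role of the coordination that rules out the wrong outcomes:

```python
import threading

x = 10
lock = threading.Lock()

def multiply():
    global x
    with lock:      # without this lock, both threads could read x == 10
        x = x * 10  # read-modify-write is not atomic

def add():
    global x
    with lock:
        x = x + 10

t1 = threading.Thread(target=multiply)
t2 = threading.Thread(target=add)
t1.start(); t2.start()
t1.join(); t2.join()
print(x)  # 110 (multiply ran first) or 200 (add ran first), never 100 or 20
```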

Solution1-Centralized Coordinator: a coordinator C grants permission to one machine at a time. Everything is good, except that C is both a performance bottleneck and a single point of failure.
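The coordinator can be sketched as a server thread that grants requests one at a time in FIFO order (the class and method names below are hypothetical, for illustration only):

```python
import queue
import threading

class Coordinator:
    """Centralized coordinator sketch: grants the critical section
    to one requester at a time, in FIFO order."""
    def __init__(self):
        self._requests = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            grant, release = self._requests.get()
            grant.set()       # GRANT: the requester may enter
            release.wait()    # block until the requester RELEASEs

    def acquire(self):
        grant, release = threading.Event(), threading.Event()
        self._requests.put((grant, release))  # REQUEST
        grant.wait()
        return release        # caller sets this to release the permission

coord = Coordinator()
counter = 0

def work():
    global counter
    release = coord.acquire()
    counter += 1              # critical section: one "machine" at a time
    release.set()

threads = [threading.Thread(target=work) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 5
```

Note that all coordination flows through the single `Coordinator` thread, which is exactly why C becomes the bottleneck and the single point of failure.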

Solution2-Token Ring: machines are linked into a cycle/ring. A token (the permission) is passed along the ring. Machine i acquires permission by waiting for the token to arrive from its left neighbor (i-1), and releases permission by passing the token to its right neighbor (i+1). The good news is that there is no single point of failure in this model, but it requires O(n) message transmissions for the token to travel around the ring.
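A token ring can be sketched with one semaphore per node acting as its token “mailbox”; releasing a neighbor’s semaphore is the message that passes the token along (an illustrative toy, not a networked implementation):

```python
import threading

N = 3
# mailbox[i] holds the token when node i may enter its critical section
mailbox = [threading.Semaphore(0) for _ in range(N)]
mailbox[0].release()   # node 0 starts with the token
order = []

def node(i):
    mailbox[i].acquire()            # wait for the token from node (i-1) % N
    order.append(i)                 # critical section
    mailbox[(i + 1) % N].release()  # pass the token to node (i+1) % N

threads = [threading.Thread(target=node, args=(i,)) for i in range(N)]
for t in threads: t.start()
for t in threads: t.join()
print(order)  # [0, 1, 2]: the permission travels around the ring
```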

Solution3-Lamport’s Bakery Algorithm: see the Wikipedia article.

System Design: Endianness

Q: Why on earth do we have endianness?

A: It’s a matter of hardware. Computer memory is just a sequence of cells, each one byte. A 32-bit processor has 32-bit registers that hold 32/8 = 4 bytes. Endianness comes into play when data is loaded/stored between registers and memory. Big-endian stores the high (most significant) byte of a register at the lowest memory address; little-endian stores the low (least significant) byte at the lowest address.
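The two byte orders are easy to see with Python’s `struct` module, which packs an integer into bytes under an explicit endianness:

```python
import struct

value = 0x12345678  # the high (most significant) byte is 0x12

big = struct.pack(">I", value)     # big-endian: high byte at the lowest address
little = struct.pack("<I", value)  # little-endian: low byte at the lowest address

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# unpacking with the matching byte order recovers the value either way
assert struct.unpack("<I", little)[0] == value
```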

Q: Why couldn’t the world agree on one universal standard of endianness?

A: Each endianness has its own strengths in specific application domains. Internet protocols are standardized on big-endian (network byte order): the high-order byte of an IP address is sent first, just as the area code comes first in a telephone number. Little-endian simplifies arithmetic for assembly programming because addition/subtraction/division start at the lowest digit, which is the first byte in memory. Also, with little-endian, a value can be read at the same address regardless of its width: the low byte of a 64-bit number sits at the same address as that of a 32-bit number.

Distributed Systems: Consistency Model

First, let’s study the formal definition from Leslie Lamport.

A high-speed processor might execute program operations in a different order. A sequential processor guarantees that the result of an execution is the same as if the operations were executed in the order specified by the program.

A Sequentially-Consistent multi-process algorithm guarantees that the result of an execution is the same as if operations of all processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by the program.

In other words, reordering program operations is a common code optimization (e.g., load operations are issued early because memory access is slow), and both compilers and processors do it. Sequential consistency defines when such a reordering is still correct.

To sum up the key points of Sequential Consistency for further reference,

  • Each process/processor issues memory requests in the order specified by the program.
  • All memory requests to an individual memory location are served FIFO.

Notes: memory requests (usually load/store) are all we care about because they are the operations visible across different processes/processors.
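The two points above can be checked by brute force on a classic store-buffering litmus test: each thread keeps its program order, and we enumerate every interleaving of the two threads; a sketch:

```python
def interleavings(p1, p2):
    """All merges of two op sequences that preserve each program's order."""
    if not p1:
        yield list(p2); return
    if not p2:
        yield list(p1); return
    for rest in interleavings(p1[1:], p2):
        yield [p1[0]] + rest
    for rest in interleavings(p1, p2[1:]):
        yield [p2[0]] + rest

# P1: x = 1; r1 = y          P2: y = 1; r2 = x
P1 = [("store", "x", 1), ("load", "y", "r1")]
P2 = [("store", "y", 1), ("load", "x", "r2")]

outcomes = set()
for seq in interleavings(P1, P2):
    mem, regs = {"x": 0, "y": 0}, {}
    for op, loc, arg in seq:
        if op == "store":
            mem[loc] = arg
        else:
            regs[arg] = mem[loc]
    outcomes.add((regs["r1"], regs["r2"]))

print(sorted(outcomes))  # (0, 0) never appears under sequential consistency
```

Real hardware with store buffers can produce r1 == r2 == 0, which is exactly a violation of sequential consistency; the enumeration above shows no sequentially consistent interleaving allows it.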

System Design: A Thread Pool

Q: How does a server serve requests?
A: A server assigns each client request a dedicated thread. Those threads can be created on demand or maintained in a thread pool and reused. The idea of the thread-pool pattern is that thread creation and destruction are expensive overheads compared to thread reuse.
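Python’s standard library ships this pattern as `concurrent.futures.ThreadPoolExecutor`, which creates a fixed set of worker threads once and reuses them for every request (`handle` below is a hypothetical stand-in for real request processing):

```python
from concurrent.futures import ThreadPoolExecutor

def handle(request):
    return f"handled {request}"  # stand-in for real request processing

# 4 worker threads are created once and reused across all requests
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle, ["req-1", "req-2", "req-3"]))

print(results)  # ['handled req-1', 'handled req-2', 'handled req-3']
```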

Q: How does a thread pool work?
A: A thread pool is created with a parameter x indicating the default size of the pool. The size x can be tuned dynamically according to the service load. Each worker thread sleeps and is woken up when requests arrive. Requests come in and are enqueued FIFO. Moreover, the most recently active thread may be given higher priority for reuse because its cache is still “hot”.
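The mechanics above can be sketched in a few lines (a toy illustration, not production code): x workers sleep on a FIFO queue and wake when a request is enqueued:

```python
import queue
import threading

class ThreadPool:
    def __init__(self, x):
        self.tasks = queue.Queue()          # requests are enqueued FIFO
        for _ in range(x):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            func, args = self.tasks.get()   # sleep until a request arrives
            func(*args)
            self.tasks.task_done()

    def submit(self, func, *args):
        self.tasks.put((func, args))        # wakes one sleeping worker

    def wait(self):
        self.tasks.join()                   # block until the queue drains

pool = ThreadPool(4)
results = []
for i in range(10):
    pool.submit(lambda i=i: results.append(i * i))
pool.wait()
print(sorted(results))  # squares of 0..9
```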

Q: What would happen if x is not tuned properly?
A: Suppose x = 10 and y = 100 (the number of requests that have arrived): 90 requests sit in the queue, and response latency grows. Suppose x = 100 and y = 100: 100 runnable threads on far fewer CPU cores cause heavy context switches, wasting CPU time, and each thread’s stack wastes memory.

Q: What else is needed for implementing a thread pool?
A: A thread-safe queue and a clear understanding of semaphores.
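The thread-safe queue can itself be built from those two primitives: a lock protects the underlying buffer, and a counting semaphore tracks how many items are available, putting consumers to sleep when the queue is empty; a minimal sketch:

```python
import threading
from collections import deque

class BlockingQueue:
    def __init__(self):
        self.items = deque()
        self.lock = threading.Lock()             # protects the deque
        self.available = threading.Semaphore(0)  # counts enqueued items

    def put(self, item):
        with self.lock:
            self.items.append(item)
        self.available.release()   # wake one sleeping consumer

    def get(self):
        self.available.acquire()   # sleep until an item is available
        with self.lock:
            return self.items.popleft()

q = BlockingQueue()
got = []
consumer = threading.Thread(target=lambda: got.extend(q.get() for _ in range(3)))
consumer.start()
for item in ("a", "b", "c"):
    q.put(item)
consumer.join()
print(got)  # ['a', 'b', 'c']
```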