Lecture 4
Correctness Concerns
- Asynchronous interaction is difficult to reason about.
- Single-process software is difficult to get correct anyway.
- Because of the variability of interaction, distributed systems seem non-deterministic and failures are harder to reproduce.
- Some advocate a more formal, proof-theoretic approach to correctness.
- But formal proofs are not easy, and they can have bugs too.
Correctness Concerns 2
- State system invariants (IMPORTANT: the inability to do so is often a sign of system incomprehensibility).
- Assume the invariants hold going into an operation.
- Show the operation preserves the invariants (seems like an induction proof, doesn't it?).
- Guarantee mutually exclusive use of state variables during critical regions where the invariants may be temporarily violated (a minimal sketch follows this list).
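To make the recipe concrete, here is a minimal Python sketch (the Account class and its names are illustrative, not from the lecture) of an operation that temporarily violates an invariant inside a lock-protected critical region and restores it before exit:

    import threading

    class Account:
        # Invariant: balance == sum(history)
        def __init__(self):
            self.balance = 0
            self.history = []
            self._lock = threading.Lock()   # guards the critical region

        def deposit(self, amount):
            # Assume the invariant holds on entry.
            with self._lock:
                self.balance += amount      # invariant briefly violated here
                self.history.append(amount) # ...and restored here
            # On exit the operation has preserved the invariant.

        def check_invariant(self):
            with self._lock:
                assert self.balance == sum(self.history)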
Example 1: Logical Vector Clocks
Consider the assertion about logical vector clocks: for all i, j: Ci[i] >= Cj[i].
It states that a process is always more up to date on its own time than any other process is.
WHY? Because time is monotonically increasing and only a process can increment its own clock. The clocks of other processes are never changed by a process (only remembered and passed on).
Expanding on previous notes about the limitations of logical clocks: recall that if a -> b then C(a) < C(b). However, if C(a) < C(b), we cannot infer anything about the causal relationship between events a and b: either a -> b or a || b. We only know that not (b -> a).
Vector clocks (see also ISIS), however, can provide a partial order on event times. Given vector timestamps Ta and Tb, the following relationships can be defined:
- Ta = Tb iff for all i: Ta[i] = Tb[i]
- Ta <= Tb iff for all i: Ta[i] <= Tb[i]
- Ta < Tb iff Ta <= Tb and Ta != Tb
- Ta || Tb iff not (Ta < Tb) and not (Tb < Ta)
Now we can state: a -> b iff Ta < Tb.
Question: for two different events a and b, can Ta = Tb?
Homework (part a; see also part b): Prove that if Ta < Tb then a -> b.
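A minimal Python sketch of these vector-timestamp comparisons (the function names are my own; the lecture does not prescribe an implementation):

    def vc_leq(ta, tb):
        # Ta <= Tb: componentwise comparison
        return all(a <= b for a, b in zip(ta, tb))

    def vc_less(ta, tb):
        # Ta < Tb: componentwise <= and not equal
        return vc_leq(ta, tb) and ta != tb

    def concurrent(ta, tb):
        # a || b: neither timestamp dominates the other
        return not vc_less(ta, tb) and not vc_less(tb, ta)

    # Process i ticks its own component on a local or send event;
    # on receive it merges componentwise maxima, then ticks.
    def tick(clock, i):
        clock[i] += 1

    def merge_on_receive(clock, msg_ts, i):
        for k in range(len(clock)):
            clock[k] = max(clock[k], msg_ts[k])
        clock[i] += 1

    print(vc_less([2, 0], [3, 1]))    # True: a -> b
    print(concurrent([2, 0], [0, 1])) # True: a || b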
Huang's Termination Detection Algorithm
Problem: how to know when all processes have finished a computation (we need a consistent global view of the computation, be it an election, deadlock detection or resolution, token generation, etc.).
A process is either IDLE or ACTIVE in the computation. A computation message is sent to initiate a computation.
DEFINITION: a computation is terminated iff all processes are idle and there are no messages in transit.
There is a controlling agent which initially has weight = 1. Weight is used to account for work sent out and results received. Let B(DW) be a computation request message sent with weight DW, and C(DW) an acknowledgement message with weight DW.
Huang's Termination Detection Algorithm 2
- Rule 1: a process having weight W may send a computation message to a process P as follows:
  - derive W1 and W2 such that W1 + W2 = W, with W1, W2 > 0
  - set W := W1
  - send B(W2) to P
- Rule 2: on receipt of B(DW), a process P having weight W does:
  - W := W + DW
  - if P is idle, P becomes active
- Rule 3: an active process having weight W may become idle by:
  - sending C(W) to the controlling agent
  - setting W := 0
- Rule 4: on receiving C(DW), the controlling agent having weight W does:
  - W := W + DW
  - if W = 1, conclude the computation has terminated
(A minimal simulation of these rules follows.)
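The rules translate almost line for line into code. Below is a minimal single-machine Python simulation (the class and function names are mine; exact fractions stand in for real weights, and halving the weight is just one valid way to satisfy Rule 1):

    from fractions import Fraction

    class Node:
        # The controlling agent starts with weight 1; processes with 0.
        def __init__(self, weight=0):
            self.weight = Fraction(weight)
            self.active = False

    def send_work(sender, receiver):
        # Rule 1: derive W1 + W2 = W with W1, W2 > 0; keep W1, send B(W2).
        w2 = sender.weight / 2
        sender.weight -= w2
        # Rule 2: on receipt of B(DW): W := W + DW; an idle process activates.
        receiver.weight += w2
        receiver.active = True

    def become_idle(proc, agent):
        # Rule 3: send C(W) to the controlling agent, then W := 0.
        dw = proc.weight
        proc.weight = Fraction(0)
        proc.active = False
        # Rule 4: the agent adds DW; W = 1 means the computation terminated.
        agent.weight += dw
        return agent.weight == 1

    agent, p1, p2 = Node(weight=1), Node(), Node()
    send_work(agent, p1)           # agent: 1/2, p1: 1/2 (active)
    send_work(p1, p2)              # p1: 1/4,  p2: 1/4 (active)
    print(become_idle(p2, agent))  # False: p1 is still active
    print(become_idle(p1, agent))  # True: all weight returned, Wc = 1

Note how invariant I1 from the next slide holds at every step: the agent's weight plus all outstanding weights always sums to 1.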
Correctness of Huang's Termination Detection Algorithm
Let
A : the set of weights of all active processes
B : the set of weights of all computation messages in transit
C : the set of weights of all control messages in transit
Wc : the weight of the controlling agent
Then the following invariants hold:
I1: Wc + SUM{over A UNION B UNION C} = 1 (conservation of weight)
I2: for all W in A UNION B UNION C: W > 0 (weights are always positive)
------
By I1, Wc = 1 implies SUM{over A UNION B UNION C} = 0.
By I2, SUM{over A UNION B UNION C} = 0 implies A UNION B UNION C is empty.
A UNION B empty implies termination: all processes are idle and no computation messages are in transit.
If we assume that message sending is finite and reliable, then eventually C becomes empty and Wc = 1, at which point the controlling agent detects the termination.
Proof by contradiction (for timestamp-ordered mutual exclusion): Assume two sites Si and Sj are executing the critical section (CS) concurrently, and that Si's request has a smaller timestamp than Sj's (timestamps are totally ordered). Si must have received Sj's request after making its own request. But Sj can be in the CS only if Si replied to Sj's request before Si finished the CS. This is impossible, since Sj's request has lower priority (a larger timestamp) than Si's.
Homework (part b; see also part a): State invariants for the mutual exclusion algorithm used in the proof above.
Processes
On uniprocessors, processes mainly create the illusion of a virtual processor; they are meant to keep computations logically apart.
In distributed systems they are additionally used to create cooperating computations, fault-tolerant computations, and real-time and parallel systems.
Threads
- Single address space
- Multiple threads of control, each with:
  - program counter
  - set of registers
  - execution stack
  - child threads
  - other state info
- AKA mini-processes or lightweight processes
Threads Share Memory
- Can easily share memory objects:
  - open files
  - global variables
  - buffers
  - signals
  - timers
  - child processes
  - semaphores
  - accounting
- Can also destroy each other's state information (see the sketch below).
Threads can execute in parallel on appropriate shared-memory multiprocessors (such as high-end workstations).
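A minimal Python sketch of both points (the names are illustrative): threads share a global variable effortlessly, and without mutual exclusion they can corrupt each other's updates:

    import threading

    counter = 0                    # shared by every thread in the process
    lock = threading.Lock()

    def increment(n, use_lock):
        global counter
        for _ in range(n):
            if use_lock:
                with lock:         # mutual exclusion: safe
                    counter += 1
            else:
                counter += 1       # read-modify-write race: updates can be lost

    threads = [threading.Thread(target=increment, args=(100_000, True))
               for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)                 # 400000 with the lock; without it, often less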
Server Applications
In the client/server model:
- the server receives requests from many processes
- requests are usually independent and atomic
- the computation to satisfy a request may take considerable time
- the requested resource is probably shared (file, printer, database, web site)
- some applications may require synchronization between requests (e.g. distributed simulation)
Server Implementations
- Single process, single thread:
  - no parallelism
  - blocking system calls
  - serializes requests
  - inefficient
- Single process, multiple threads:
  - possible parallelism
  - threads block on system calls
  - interleaved requests
- Finite-state machine (simulate multiple computations using state tables):
  - non-blocking
  - not truly parallel
  - complex
Consider the analogy to a dentist's office:
- one dentist, one patient
- many patients, many dentists (one per patient)
- many patients, one dentist
Using Threads: Organizational Models
- Dispatcher with interchangeable workers (a minimal sketch follows this list)
- Peer/team
- Pipeline (assembly line): specialized workers; the process is broken down into worker tasks (producer/consumer)
- Mixtures of the above
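A minimal Python sketch of the dispatcher/worker model (queue-based; all names are mine): the dispatcher enqueues requests, and interchangeable workers pull them off:

    import queue
    import threading

    requests = queue.Queue()

    def worker():
        # Interchangeable worker: serve whatever request comes next.
        while True:
            req = requests.get()
            if req is None:            # sentinel: shut this worker down
                break
            print("served", req)
            requests.task_done()

    pool = [threading.Thread(target=worker) for _ in range(3)]
    for t in pool:
        t.start()
    for r in ["req-1", "req-2", "req-3", "req-4"]:
        requests.put(r)                # dispatcher hands work to the pool
    requests.join()                    # wait until every request is served
    for _ in pool:
        requests.put(None)
    for t in pool:
        t.join()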
QUESTION: Do threads make software easier to write?
Design Issues/Threads
- Static vs. dynamic creation
- Mutual exclusion (illustrated in the sketch below):
  - binary semaphore
  - trylock (non-blocking)
  - condition variable
- Global variables
- Scheduling
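A minimal Python sketch of the mutual-exclusion primitives named above (a Lock stands in for the binary semaphore; the predicate and names are illustrative):

    import threading

    lock = threading.Lock()            # binary-semaphore-style mutex

    # Trylock: non-blocking acquire; do something else if it is held.
    if lock.acquire(blocking=False):
        try:
            pass                       # critical section
        finally:
            lock.release()
    else:
        print("lock busy; doing other work instead of blocking")

    # Condition variable: sleep until some predicate becomes true.
    cond = threading.Condition()
    ready = False

    def waiter():
        with cond:
            while not ready:           # re-check: guards against spurious wakeups
                cond.wait()
            print("predicate true; proceeding")

    def signaller():
        global ready
        with cond:
            ready = True
            cond.notify()

    t = threading.Thread(target=waiter)
    t.start()
    signaller()
    t.join()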
Threads in User Space
Advantages:
- No change to the underlying OS
- Flexible scheduling
- Eliminates the overhead of a system call on every thread operation
Disadvantages:
- Blocking system calls block the entire process
- A page fault swaps out the entire process, stalling all threads
- Clock interrupts? (without them the user-level scheduler cannot preempt a running thread)
- Other interrupts (signals)
Threads in Kernel Space
- Cost of a system call on every thread operation
- Reentrant library procedures (static variables are a hazard)
In a non-threaded system, procedure calls are mutually exclusive: a program cannot be in two places at the same time.
Scheduler Activations
Hybrid solution:
- Keep thread management at user level
- System calls/page faults block the thread, not the process
- The kernel signals the user-level thread manager (an UPCALL) on blocking or unblocking events
- Problem: what if the interrupted thread is in a critical section when an unblocking event occurs?
- Problem: upcalls violate the layered approach
Not surprising, since thread management is split between peers (user level and kernel).
RPC and Threads
Many RPCs are to processes on the same machine:
- Can share memory (e.g., map the pages of the calling stack into the callee's address space); not just for threads
- For server RPC: no need to save/restore state while waiting
- Implicit receive: create a new thread to handle each incoming message
- Pop-up thread: a thread created on demand to handle an RPC (a minimal sketch follows)
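A minimal Python sketch of implicit receive with pop-up threads (the port and the echo body are placeholders; a real RPC server would unmarshal the request and dispatch it):

    import socket
    import threading

    def handle(conn):
        # Pop-up thread: exists only to serve this one request.
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)         # echo stands in for the RPC reply

    def serve(port=5000):
        with socket.socket() as srv:
            srv.bind(("localhost", port))
            srv.listen()
            while True:
                conn, _addr = srv.accept()
                # Implicit receive: each incoming message gets a new thread.
                threading.Thread(target=handle, args=(conn,),
                                 daemon=True).start()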