Lecture 4
Correctness Concerns
- Asynchronous interaction is difficult to reason about.
- Single-process software is difficult to get correct anyway.
- Because of the variability of interaction, distributed systems seem non-deterministic and failures are harder to reproduce.
- Some advocate a more formal, proof-theoretic approach to correctness.
- But formal proofs are not easy, and they can have bugs too.
Correctness Concerns 2
- State the system invariants (IMPORTANT: inability to do so is often a sign of system incomprehensibility).
- Assume the invariants hold going into an operation.
- Show that the operation preserves the invariants (seems like an induction proof, doesn't it?).
- Guarantee mutually exclusive use of state variables during critical regions, where the invariants may be temporarily inconsistent.
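The steps above can be sketched in code. This is a minimal illustration (the bank-balance invariant and all names are mine, not from the notes): the invariant holds on entry, is temporarily violated inside the critical region, and is restored on exit, with a lock guaranteeing mutual exclusion while it is inconsistent.

```python
import threading

TOTAL = 100
balances = {"a": 60, "b": 40}   # invariant: balances always sum to TOTAL
lock = threading.Lock()

def invariant_holds():
    return balances["a"] + balances["b"] == TOTAL

def transfer(amount):
    with lock:                      # critical region begins
        assert invariant_holds()    # invariant assumed on entry
        balances["a"] -= amount     # invariant temporarily false here
        balances["b"] += amount
        assert invariant_holds()    # operation must restore it on exit

threads = [threading.Thread(target=transfer, args=(1,)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
assert invariant_holds()
```

Without the lock, a second thread could observe (or modify) the state between the two assignments, exactly when the invariant is false.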
Example 1: Logical Vector Clocks
Consider the assertion about logical vector clocks: for all i, j: Ci[i] >= Cj[i].
It states that a process is always more up to date on its own time than any other process is.
WHY? Because time is monotonically increasing and only a process can increment its own clock. The clocks of other processes are never changed by a process (only remembered and passed on).
Expanding on previous notes about the limitations of logical clocks. Recall that if a -> b then C(a) < C(b). However, if C(a) < C(b) we cannot infer anything about the causal relationship between events a and b: either a -> b or a || b. We only know that not (b -> a).
Vector clocks (see also ISIS), however, can provide a partial order on event times. Given two vector timestamps Ta and Tb, the relation Ta < Tb is defined componentwise, and we can now state:
a -> b iff Ta < Tb
Question: for two different events a and b, can Ta = Tb?
Homework (see also part b): prove that if Ta < Tb then a -> b.
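A minimal vector-clock sketch makes both points concrete: the update rules, the componentwise ordering, and the assertion that each process leads on its own component. Function names here are illustrative, not from the notes.

```python
def tick(clock, i):
    """Process i increments only its own component."""
    clock[i] += 1

def receive(clock, i, msg_clock):
    """Merge a received timestamp componentwise, then tick locally."""
    for k in range(len(clock)):
        clock[k] = max(clock[k], msg_clock[k])
    tick(clock, i)

def less_than(ta, tb):
    """Ta < Tb iff Ta <= Tb componentwise and Ta != Tb; then a -> b."""
    return all(x <= y for x, y in zip(ta, tb)) and ta != tb

c0 = [0, 0]            # vector clock of process P0
c1 = [0, 0]            # vector clock of process P1
tick(c0, 0)            # event a at P0: c0 = [1, 0]
receive(c1, 1, c0[:])  # event b at P1 receives a's timestamp: c1 = [1, 1]

assert less_than([1, 0], c1)              # Ta < Tb, so a -> b
assert not less_than([1, 0], [0, 1])      # incomparable: concurrent events
assert c0[0] >= c1[0] and c1[1] >= c0[1]  # Ci[i] >= Cj[i]: own entry leads
```

The final assertion is exactly the invariant from Example 1: a process is never behind anyone else on its own clock entry.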
Huang's Termination Detection Algorithm
Problem: how to know when all processes have finished a computation (we need a consistent global view of the computation, be it an election, deadlock detection or resolution, token generation, etc.).
A process is either IDLE or ACTIVE in the computation. A computation message is sent to initiate a computation.
DEFINITION: a computation is terminated iff all processes are idle and there are no messages in transit.
There is a controlling agent which initially has weight = 1. Weight is used to coordinate work sent and results received. Let B(DW) be a computation request message sent with weight DW, and C(DW) be an acknowledgement message with weight DW.
Huang's Termination Detection Algorithm 2
- Rule 1: a process having weight W may send a computation message to P as follows:
  - Derive W1 and W2 such that W1 + W2 = W, with W1, W2 > 0
  - Set W := W1
  - Send B(W2) to P
- Rule 2: on receipt of B(DW), a process P having weight W does:
  - W := W + DW
  - If P is idle, P becomes active
- Rule 3: an active process having weight W may become idle by:
  - Sending C(W) to the controlling agent
  - Setting W := 0
- Rule 4: on receiving C(DW), the controlling agent having weight W does:
  - W := W + DW
  - If W = 1, conclude the computation has terminated
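The four rules can be sketched as a single-threaded simulation. This is an illustrative sketch, not a full distributed implementation: message delivery is modeled by a list, weights use exact fractions so that conservation can be checked, and the halving split is just one valid choice of W1 + W2 = W.

```python
from fractions import Fraction

controller = Fraction(1)          # controlling agent starts with weight 1
weights = {}                      # current weight of each active process
in_transit = []                   # B(DW) computation messages in flight

def start(p):                     # controller initiates the computation
    global controller
    dw = controller / 2           # split: W1 + W2 = W, both > 0
    controller -= dw
    in_transit.append((p, dw))

def send(src, dst):               # Rule 1: active process sends work
    dw = weights[src] / 2
    weights[src] -= dw
    in_transit.append((dst, dw))

def deliver():                    # Rule 2: receive B(DW), become active
    p, dw = in_transit.pop(0)
    weights[p] = weights.get(p, Fraction(0)) + dw

def finish(p):                    # Rules 3 and 4: return C(W) to the agent
    global controller
    controller += weights.pop(p)

start("P1"); deliver()            # P1 becomes active with weight 1/2
send("P1", "P2"); deliver()       # P1 splits off 1/4 for P2
finish("P1"); finish("P2")        # both go idle, weights return
assert controller == 1            # W = 1: computation has terminated
```

At every step the controller's weight plus all process and in-transit weights sums to exactly 1, which is the invariant the correctness argument below relies on.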
Correctness of Huang's Termination Detection Algorithm
Let
A : set of weights of all active processes
B : set of weights of all computation messages in transit
C : set of weights of all control messages in transit
Wc : weight of the controlling agent
Then the following invariants hold:
I1: Wc + SUM{over union of A, B and C} = 1 (conservation of weight)
I2: for all W in union of A, B and C: W > 0 (weights are never negative)
------
By I1, Wc = 1 implies SUM{over union of A, B and C} = 0.
By I2, SUM{over union of A, B and C} = 0 implies the union of A, B and C is empty.
A UNION B empty implies termination (no active processes, no computation messages in transit).
If we assume message sending is finite and reliable, then eventually C becomes empty and Wc = 1, thus detecting the termination.
Proof by contradiction: assume that two sites Si and Sj are executing the critical section (CS) concurrently and that Si's request has a smaller timestamp than Sj's (timestamps are totally ordered). Si must have received Sj's request after it made its own request. But Sj can only be in the CS if Si returned a reply to it before Si finished the CS. This is not possible, since Sj's request has lower priority than Si's request.
Homework (part b - see also part a): state the invariants for the mutual exclusion algorithm above.
Processes
On uniprocessors, processes mainly create the illusion of a virtual processor; they are meant to keep computations logically apart.
In distributed systems they are additionally used to create cooperating computations, fault-tolerant computations, and real-time and parallel systems.
Threads
Single address space.
Multiple threads of control, each with:
- program counter
- set of registers
- execution stack
- child threads
- other state info
AKA mini-processes or lightweight processes.
Threads Share Memory
- Can easily share memory objects:
  - open files
  - global variables
  - buffers
  - signals
  - timers
  - child processes
  - semaphores
  - accounting
- Can destroy each other's state information.
Threads can execute in parallel on appropriate shared-memory multiprocessors (such as high-end workstations).
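A small sketch of the sharing (the shared buffer and names are mine): two threads append to the same global object directly, with no copying between them; the lock is what prevents them from destroying each other's state.

```python
import threading

buffer = []                       # shared by every thread in the process
lock = threading.Lock()

def producer(items):
    for x in items:
        with lock:                # unsynchronized writes could corrupt state
            buffer.append(x)

t1 = threading.Thread(target=producer, args=(range(100),))
t2 = threading.Thread(target=producer, args=(range(100, 200),))
t1.start(); t2.start()
t1.join(); t2.join()
assert len(buffer) == 200         # all writes from both threads are visible
```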
Server Applications
In the client-server model:
- the server receives requests from many processes
- requests are usually independent and atomic
- the computation to satisfy a request may take considerable time
- the requested resource is probably shared (file, printer, database, web site)
- some applications may require synchronization between requests (e.g. distributed simulation)
Server Implementations
- Single process, single thread:
  - no parallelism
  - blocking system calls
  - serializes requests
  - inefficient
- Single process, multiple threads:
  - possible parallelism
  - threads block on system calls
  - interleaved requests
- Finite-state machine: simulate multiple computations using state tables
  - non-blocking
  - not truly parallel
  - complex
Consider the analogy to a dentist's office:
- one dentist, one patient
- many patients, many dentists (one per patient)
- many patients, one dentist
Using Threads: Organizational Models
- Dispatcher with interchangeable workers
- Peer/team
- Pipeline (assembly line): specialized workers, the process broken down into worker tasks, producer/consumer
- Mixtures of the above
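The dispatcher/worker organization can be sketched as follows (a minimal illustration with names of my choosing): a dispatcher places requests on a shared queue and interchangeable worker threads pull them off, so any idle worker can serve the next request.

```python
import queue
import threading

requests = queue.Queue()          # dispatcher -> workers
results = queue.Queue()           # workers -> collected replies

def worker():
    while True:
        req = requests.get()
        if req is None:           # sentinel: no more work
            break
        results.put(req * 2)      # stand-in for real request processing

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers: w.start()

for r in range(10):               # the dispatcher role
    requests.put(r)
for _ in workers:                 # one sentinel per worker
    requests.put(None)
for w in workers: w.join()

assert sorted(results.queue) == [2 * r for r in range(10)]
```

The pipeline model differs only in topology: each worker's output queue is the next worker's input queue.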
QUESTION: Do threads make software easier to write?
Design Issues/Threads
- Static vs. dynamic creation
- Mutual exclusion:
  - binary semaphore
  - trylock (non-blocking)
  - condition variable
- Global variables
- Scheduling
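The three mutual-exclusion primitives listed above can be demonstrated with Python's threading module standing in for a generic thread package (a sketch, not any particular system's API):

```python
import threading

lock = threading.Lock()

# Binary semaphore / mutex: acquire blocks until the lock is free.
with lock:
    pass                          # critical section

# Trylock: a non-blocking acquire returns immediately with success/failure.
got = lock.acquire(blocking=False)
if got:
    lock.release()

# Condition variable: sleep until a predicate on shared state becomes true.
cond = threading.Condition()
ready = False

def setter():
    global ready
    with cond:
        ready = True
        cond.notify()             # wake a waiter

threading.Thread(target=setter).start()
with cond:
    cond.wait_for(lambda: ready)  # releases the lock while waiting
assert ready
```

Note that `wait_for` rechecks the predicate after each wakeup, which guards against spurious wakeups.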
Threads in User Space
Advantages:
- No change to the underlying OS
- Flexible scheduling
- Eliminates the overhead of a system call
Disadvantages:
- Blocking system calls
- Swaps due to page faults
- Clock interrupts?
- Other interrupts (signals)
Threads in Kernel Space
- Cost of a system call
- Reentrant library procedures (static variables)
In a non-threaded system, procedure calls are mutually exclusive: a program cannot be in two places at the same time.
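The reentrancy hazard can be sketched briefly (all names here are mine): a library routine that keeps its working state in one static/global buffer breaks once two threads can be inside it at the same time; per-thread storage is one standard fix.

```python
import threading

# threading.local() gives each thread its own copy of .buf, playing the
# role that a per-thread buffer plays in a reentrant C library.
local = threading.local()

def format_id(n):
    local.buf = f"id-{n}"         # per-thread, not one shared static buffer
    return local.buf

results = {}

def run(n):
    results[n] = format_id(n)

ts = [threading.Thread(target=run, args=(i,)) for i in range(4)]
for t in ts: t.start()
for t in ts: t.join()
assert results == {i: f"id-{i}" for i in range(4)}
```

With a single shared buffer instead of `threading.local()`, one thread's call could overwrite another's result between the write and the return.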
Scheduler Activations
Hybrid solution:
- Keep thread management at user level
- System calls/page faults block the thread, not the process
- The kernel signals the user-level thread manager (an UPCALL) on blocking or unblocking events
- Problem: what if the upcall interrupts a thread that is in a critical section when an unblocking event arrives?
- Problem: upcalls violate the layered approach
Not surprising, since thread management is split between peers.
RPC and Threads
Many RPCs are to processes on the same machine: they can share memory (map page registers to the calling stack). Not just for threads.
For a server RPC: no need to save/restore state while waiting.
Implicit receive: create a new thread to handle each incoming message.
Pop-up thread: a thread created to handle an RPC.
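A pop-up thread can be sketched as follows (a toy illustration; the message arrival, `handle`, and reply queue are stand-ins for a real RPC transport): each arriving message spawns a fresh handler thread rather than being queued for a long-lived server loop.

```python
import queue
import threading

replies = queue.Queue()           # stand-in for replies sent back to clients

def handle(msg):                  # body of the pop-up thread
    replies.put(f"ack:{msg}")

def implicit_receive(msg):        # on arrival, pop up a thread for the message
    t = threading.Thread(target=handle, args=(msg,))
    t.start()
    return t

threads = [implicit_receive(m) for m in ("a", "b", "c")]
for t in threads: t.join()
assert sorted(replies.queue) == ["ack:a", "ack:b", "ack:c"]
```

The appeal is that the pop-up thread starts with no saved state to restore; the cost is one thread creation per message.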
Copyright chris wild 1996.
Last updated: October 03, 1996.