Scheduling algorithms have been well studied for single-processor systems. Even so, the problem is non-trivial.
Scheduling for N processors complicates the problem further.
So why not just use a faster processor?
Is shared memory critical?
What software features would ease the task of distributed scheduling?
The difference between an "N-times faster" processor and a pool of N processors is interesting.
While the arrival rate at the pool is N times the individual rate, the effective service rate is not, unless the pool is constantly maximally busy (which implies an unstable queuing system). Whenever the number of outstanding requests falls below N, the pool services them more slowly than the fast processor would (but faster than N isolated systems).
But of course an N-times faster processor may be prohibitively expensive or nonexistent (consider N = 1000 * Pentium 600).
Let's analyze N isolated systems (adapted from Singhal) to see how big the problem is: the level of underutilization in the absence of load balancing.
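That underutilization can be quantified with a small sketch. As an assumption on my part, model each of the N isolated systems as an independent M/M/1 queue with utilization rho, in the spirit of Singhal's analysis; by inclusion-exclusion, the probability that at least one system sits idle while at least one task waits elsewhere is:

```python
# Probability that, among N independent M/M/1 queues with utilization rho,
# at least one server sits idle while at least one task waits elsewhere.
# Per queue: P(idle, n = 0) = 1 - rho, P(some task waiting, n >= 2) = rho**2,
# P(busy with exactly one task, n = 1) = rho * (1 - rho).
def p_wasted_capacity(n: int, rho: float) -> float:
    p_none_idle = rho ** n                  # every server has >= 1 task
    p_none_waiting = (1 - rho ** 2) ** n    # every queue has <= 1 task
    p_neither = (rho * (1 - rho)) ** n      # every queue has exactly 1 task
    # inclusion-exclusion on "no idle server" and "no waiting task"
    return 1 - p_none_idle - p_none_waiting + p_neither

if __name__ == "__main__":
    for rho in (0.2, 0.5, 0.8):
        print(rho, round(p_wasted_capacity(20, rho), 3))
```

Even at moderate utilization the probability is high for realistic N, which is the case for load distribution: without it, work waits on some nodes while others idle.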
Queuing-theoretic:
Unstable: queue lengths grow without bound.
Effective: performance is improved over no load distribution.
Algorithmic:
Unstable: the algorithm performs fruitless actions indefinitely (e.g., keeps moving the same task from node to node).
Sender-initiated algorithms: three simple yet effective variants.
Transfer policy: threshold on queue length; a node is a sender if its queue length exceeds T, a receiver if it does not.
Selection policy: only newly arrived tasks (non-preemptive).
Location policy:
Random: transfer to a node selected at random. The lack of knowledge of the receiver site's load may lead to additional transfers in heavily loaded situations (one solution: limit the number of times a task may be transferred).
Uses no knowledge of global state, yet can be shown to improve on no balancing at all.
Threshold: poll a node selected at random to see if it can be a receiver. If not, poll again, up to PollLimit polls.
Uses limited knowledge of global state.
Shortest: select PollLimit sites at random, poll each for its queue length, and pick the shortest queue among the replies.
NOTE: being polled is no guarantee of transfer, so a polled node may end up getting too much work or not enough.
Only a marginal improvement over Threshold.
Less limited knowledge of global state.
Information policy: demand driven
Stability: unstable under heavy load, as machines waste time searching unsuccessfully for a lightly loaded machine.
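The Threshold variant above can be sketched as follows (a toy model with synchronous polling; the Node shape and the values of T and POLL_LIMIT are illustrative, not from the paper):

```python
import random

T = 3            # transfer-policy threshold on queue length (illustrative)
POLL_LIMIT = 5   # give up after this many polls

class Node:
    def __init__(self, name, queue):
        self.name = name
        self.queue = queue      # current CPU queue length

    def can_receive(self):
        # accepting one more task must not push this node past threshold
        return self.queue + 1 <= T

def threshold_locate(sender, others, rng=random):
    """Sender-initiated Threshold location policy: poll nodes chosen at
    random until one that can receive is found, or PollLimit is exhausted."""
    for _ in range(min(POLL_LIMIT, len(others))):
        target = rng.choice(others)
        if target.can_receive():        # the poll
            target.queue += 1           # transfer the newly arrived task
            sender.queue -= 1
            return target
    return None   # no receiver found: process the task locally
```

Returning None when polling fails is what keeps the algorithm from being algorithmically unstable: the task is simply run locally instead of being passed around.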
Transfer policy: threshold on queue length; a node becomes a receiver when its queue length falls below T (checked after a task completes).
Selection policy: any
Location policy: poll nodes at random looking for an overloaded machine, up to PollLimit polls.
If none is found, wait until another task completes or a timeout expires, then try again.
Information policy: demand driven
Stability: stable under high load; under low load the extra polling does not matter, so the algorithm remains stable.
ISSUE: when a receiver polls, it is unlikely to catch a potential sender with a task that has not yet started, so most transfers must be preemptive.
Question:
How can a receiver-initiated algorithm avoid preemption at the sender?
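The receiver-initiated location policy can be sketched in the same toy style (the Node shape and the values of T and POLL_LIMIT are illustrative):

```python
import random
from dataclasses import dataclass

T = 2            # become a receiver when queue length falls below T
POLL_LIMIT = 5

@dataclass
class Node:
    name: str
    queue: int

def receiver_locate(receiver, others, rng=random):
    """After a task completes and the queue drops below T, poll nodes at
    random looking for an overloaded one, up to PollLimit polls."""
    for _ in range(min(POLL_LIMIT, len(others))):
        target = rng.choice(others)
        if target.queue > T:            # polled node is overloaded
            target.queue -= 1           # transfer a task (may be preemptive,
            receiver.queue += 1         # since it may already have started)
            return target
    return None   # none found: wait for the next completion or a timeout
```

Note the structural symmetry with the sender-initiated sketch; the difference is who polls, and that the transferred task may already be executing.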
Symmetrically initiated algorithms try for the advantages of both: under low load the sender-initiated component is more successful; under high load the receiver-initiated component is.
But a naive combination can inherit the disadvantages of both as well. Worse, if a sender transfers a task, drops below threshold as a result, and becomes a receiver wanting the job back, the node that took it may go above threshold and become a sender, and so on.
Goal: maintain each node's load close to the system average.
Problem: as described, this could cause thrashing.
Above Average Algorithm: avoids thrashing between the sender and receiver roles.
Transfer policy: two thresholds, equidistant from the estimated average load. A node with load below the low threshold is a receiver; above the high threshold, a sender; otherwise neither.
Selection policy: any
Location policy:
Sender initiated:
The sender broadcasts a TooHigh message, sets a TooHigh alarm (timeout), and waits.
A receiver, upon hearing TooHigh, cancels its TooLow alarm and sends an Accept message; it also increases its recorded load and sets an AwaitingTask timeout.
On receiving an Accept message, the sender, if it is still a sender, chooses the best task to transfer and transfers it.
If the TooHigh alarm times out, the sender increases its estimate of the average load and broadcasts a ChangeAverage message.
Receiver initiated:
The receiver broadcasts a TooLow message and sets a TooLow alarm.
If a TooHigh message is received, it follows the procedure above.
If the TooLow alarm expires, the receiver decreases its estimate of the average load and broadcasts a ChangeAverage message.
Information policy: demand driven; the algorithm adapts to the estimated load.
Stability: maintained by adjusting the thresholds in response to ChangeAverage messages.
Above Average does not keep track of which processors are busy or idle, only of the average load; its broadcasting (or polling) therefore may not be efficient.
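The two-threshold transfer policy and the ChangeAverage adjustment can be sketched as follows (the margin DELTA is an illustrative tuning parameter, not a value from the source):

```python
class AboveAverageNode:
    """Transfer-policy state for the Above Average algorithm: two
    thresholds kept equidistant from the estimated average load."""
    DELTA = 1.0   # half-width of the "acceptable" band (illustrative)

    def __init__(self, load, est_average):
        self.load = load
        self.est_average = est_average

    def role(self):
        if self.load < self.est_average - self.DELTA:
            return "receiver"
        if self.load > self.est_average + self.DELTA:
            return "sender"
        return "neither"

    def on_change_average(self, new_average):
        # a ChangeAverage broadcast moves both thresholds at once,
        # which is what damps sender/receiver thrashing
        self.est_average = new_average
```

Because both thresholds track the same estimate, a load level that made every node a sender simply raises the estimate until most nodes fall into the "neither" band, rather than triggering endless transfers.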
Stable Symmetric (ADSYM): keep the information gained from polling to classify nodes into three groups: underloaded (receivers), overloaded (senders), and OK. The node learns global state as a by-product of polling.
Initially, each processor assumes all other processors are receivers.
Transfer policy: Two thresholds determine which group
Selection policy: as a sender, only newly arrived tasks (non-preemptive); as a receiver, any task.
Location policy:
Sender:
Polls the receiver at the head of its receiver list (initially all other processors).
The polled node puts the sender at the head of its sender list and replies with its current status.
The sender updates the polled node's status in its lists and transfers the task if the polled node is not busy; otherwise it polls the next node.
Polling continues up to PollLimit or until no receivers remain on the list.
receiver
head to tail in senders lists (most recent first), tail to head in OK list (oldest first), receivers (tail to head)
If polled node is sender, transfers, sending its status after transfer
If polled is not sender, moves polling node to receiver list and replies with its state.
Original receiver updates state of polled node upon reply
Polls until finds or PollLimit
Information policy: demand driven
Stability: under high load the receiver list becomes empty, so sender-initiated polling stops. Under low load the wasted polling capacity is acceptable, and it has the benefit of keeping potential senders up to date on available receivers.
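The sender-initiated poll loop and the three-list bookkeeping might look like this (Peer, its threshold values, and PollLimit are illustrative; poll replies are modeled as direct method calls):

```python
from collections import deque

class Peer:
    """A polled node, classified by two thresholds (values illustrative)."""
    def __init__(self, name, queue, low=1, high=3):
        self.name, self.queue = name, queue
        self.low, self.high = low, high

    def state(self):
        if self.queue < self.low:
            return "receiver"
        if self.queue > self.high:
            return "sender"
        return "ok"

class AdsymNode:
    POLL_LIMIT = 5

    def __init__(self, peers):
        self.receivers = deque(peers)   # initially every other node
        self.senders = deque()
        self.ok = deque()

    def classify(self, peer):
        # a poll reply updates our lists: move peer to the matching one
        for lst in (self.receivers, self.senders, self.ok):
            if peer in lst:
                lst.remove(peer)
        {"receiver": self.receivers, "sender": self.senders,
         "ok": self.ok}[peer.state()].appendleft(peer)

    def sender_poll(self):
        """On a task arrival while overloaded: poll the head of the
        receiver list; global state is learned as a by-product."""
        polls = 0
        while self.receivers and polls < self.POLL_LIMIT:
            peer = self.receivers[0]
            self.classify(peer)
            polls += 1
            if peer.state() == "receiver":
                return peer             # transfer the new task here
        return None                     # process locally
```

Under high load every poll reclassifies the polled node off the receiver list, so the list drains and sender-initiated polling stops by itself, which is the source of the algorithm's stability.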
Stable sender initiated: uses the sender-initiated portion of the ADSYM algorithm as is, with only non-preemptive selection; the receiver-initiated portion is replaced.
Each node keeps an additional data structure, STATEVECTOR (one entry per node), recording which list this node is on at every other node, i.e., the state the other nodes currently believe.
When a sender polls, it also updates its STATEVECTOR entry for the polled node to reflect that the polled node now knows its sender status; the polled node likewise records the sender's list assignment.
When a node becomes a receiver, instead of polling it consults its STATEVECTOR and sends an update only to those nodes that do not already have it on their receiver lists.
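A sketch of the STATEVECTOR bookkeeping (the method names and message plumbing are my own; the vector simply records which list this node believes it is on at each other node):

```python
class StateVectorNode:
    def __init__(self, name, all_names):
        self.name = name
        # which list we believe we are on at each other node;
        # initially every node assumes every other node is a receiver
        self.statevector = {j: "receiver" for j in all_names if j != name}

    def on_poll_sent(self, target):
        # polling target as a sender reveals our sender status to it
        self.statevector[target] = "sender"

    def become_receiver(self):
        # no polling, no preemption: just notify the nodes that do not
        # already have us on their receiver lists
        targets = [j for j, lst in self.statevector.items()
                   if lst != "receiver"]
        for j in targets:
            self.statevector[j] = "receiver"   # they now know
        return targets   # send a receiver-update message to these nodes
```

The point of the vector is to make the receiver side purely informational: a new receiver announces itself to exactly the nodes whose view is stale, rather than pulling (and possibly preempting) tasks.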
* Shivaratri, Krueger, and Singhal, "Load Distributing for Locally Distributed Systems", IEEE Computer, vol. 25, no. 12, Dec. 1992, pp. 33-44.
Assumptions and Parameters: simulation results from the paper (offered system load of 0.85):
Sender initiated: better than receiver initiated at low loads, but still unstable at high loads.
Receiver initiated: better than sender initiated at high loads.
Adaptive algorithms are slightly better (SYM is adaptive).
ADSYM approaches the ideal; ADSYM is similar to receiver initiated under heavy loads but better under light loads.
A further experiment offered a system load of 0.85 generated by only a subset of the processors (heterogeneous load).