Scheduling algorithms have been well studied for single-processor systems. Even so, the problem is non-trivial.
Scheduling for N processors complicates the problem further.
So why not just use a faster processor?
Is shared memory critical?
What software features would ease the task of distributed scheduling?
The difference between an "N-times faster" processor and a pool of N processors is interesting.
While the arrival rate at the pool is N times the individual rate, the effective service rate is not, unless the pool is constantly maximally busy (which implies an unstable queuing system). Whenever the number of outstanding requests falls below N, the pool services them more slowly than the fast processor would (but faster than N isolated systems).
But of course an N-times faster processor may be prohibitively expensive or nonexistent (consider N = 1000 * Pentium 600).
Let's analyze N isolated systems (adapted from Singhal) to see how big the problem is: the level of underutilization in the absence of load balancing.
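That underutilization can be quantified with a small sketch. As an assumption on my part, model each of the N isolated systems as an independent M/M/1 queue with utilization rho, in the spirit of Singhal's analysis; by inclusion-exclusion, the probability that at least one system sits idle while at least one task waits elsewhere is:

```python
# Probability that, among N independent M/M/1 queues with utilization rho,
# at least one server sits idle while at least one task waits elsewhere.
# Per queue: P(idle, n = 0) = 1 - rho, P(some task waiting, n >= 2) = rho**2,
# P(busy with exactly one task, n = 1) = rho * (1 - rho).
def p_wasted_capacity(n: int, rho: float) -> float:
    p_none_idle = rho ** n                  # every server has >= 1 task
    p_none_waiting = (1 - rho ** 2) ** n    # every queue has <= 1 task
    p_neither = (rho * (1 - rho)) ** n      # every queue has exactly 1 task
    # inclusion-exclusion on "no idle server" and "no waiting task"
    return 1 - p_none_idle - p_none_waiting + p_neither

if __name__ == "__main__":
    for rho in (0.2, 0.5, 0.8):
        print(rho, round(p_wasted_capacity(20, rho), 3))
```

Even at moderate utilization the probability is high for realistic N, which is the case for load distribution: without it, work waits on some nodes while others idle.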
Queuing-theoretic:
Unstable: queue lengths grow without bound.
Effective: performance is improved over no load distribution.
Algorithmic:
Unstable: the algorithm performs fruitless actions indefinitely (e.g., keeps moving the same task from node to node).
Sender-initiated algorithms: three simple yet effective variants.
Transfer policy: threshold on queue length; a node is a sender if its queue length exceeds T, a receiver if it does not.
Selection policy: only newly arrived tasks (non-preemptive).
Location policy:
Random: transfer to a node selected at random. The lack of knowledge of the receiver site's load may lead to additional transfers in heavily loaded situations (one solution: limit the number of times a task may be transferred).
Uses no knowledge of global state, yet can be shown to improve on no balancing at all.
Threshold: poll a node selected at random to see if it can be a receiver. If not, poll again, up to PollLimit polls.
Uses limited knowledge of global state.
Shortest: select PollLimit sites at random, poll each for its queue length, and pick the shortest queue among the replies.
NOTE: being polled is no guarantee of transfer, so a polled node may end up getting too much work or not enough.
Only a marginal improvement over Threshold.
Less limited knowledge of global state.
Information policy: demand driven
Stability: unstable under heavy load, as machines waste time searching unsuccessfully for a lightly loaded machine.
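The Threshold variant above can be sketched as follows (a toy model with synchronous polling; the Node shape and the values of T and POLL_LIMIT are illustrative, not from the paper):

```python
import random

T = 3            # transfer-policy threshold on queue length (illustrative)
POLL_LIMIT = 5   # give up after this many polls

class Node:
    def __init__(self, name, queue):
        self.name = name
        self.queue = queue      # current CPU queue length

    def can_receive(self):
        # accepting one more task must not push this node past threshold
        return self.queue + 1 <= T

def threshold_locate(sender, others, rng=random):
    """Sender-initiated Threshold location policy: poll nodes chosen at
    random until one that can receive is found, or PollLimit is exhausted."""
    for _ in range(min(POLL_LIMIT, len(others))):
        target = rng.choice(others)
        if target.can_receive():        # the poll
            target.queue += 1           # transfer the newly arrived task
            sender.queue -= 1
            return target
    return None   # no receiver found: process the task locally
```

Returning None when polling fails is what keeps the algorithm from being algorithmically unstable: the task is simply run locally instead of being passed around.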
Transfer policy: threshold on queue length; a node becomes a receiver when its queue length falls below T (checked after a task completes).
Selection policy: any
Location policy: poll nodes at random looking for an overloaded machine, up to PollLimit polls.
If none is found, wait until another task completes or a timeout expires, then try again.
Information policy: demand driven
Stability: stable under high load; under low load the extra polling does not matter, so the algorithm remains stable.
ISSUE: when a receiver polls, it is unlikely to catch a potential sender with a task that has not yet started, so most transfers must be preemptive.
Question:
How can a receiver-initiated algorithm avoid preemption at the sender?
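The receiver-initiated location policy can be sketched in the same toy style (the Node shape and the values of T and POLL_LIMIT are illustrative):

```python
import random
from dataclasses import dataclass

T = 2            # become a receiver when queue length falls below T
POLL_LIMIT = 5

@dataclass
class Node:
    name: str
    queue: int

def receiver_locate(receiver, others, rng=random):
    """After a task completes and the queue drops below T, poll nodes at
    random looking for an overloaded one, up to PollLimit polls."""
    for _ in range(min(POLL_LIMIT, len(others))):
        target = rng.choice(others)
        if target.queue > T:            # polled node is overloaded
            target.queue -= 1           # transfer a task (may be preemptive,
            receiver.queue += 1         # since it may already have started)
            return target
    return None   # none found: wait for the next completion or a timeout
```

Note the structural symmetry with the sender-initiated sketch; the difference is who polls, and that the transferred task may already be executing.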
Symmetrically initiated algorithms try for the advantages of both: under low load the sender-initiated component is more successful; under high load the receiver-initiated component is.
But a naive combination can inherit the disadvantages of both as well. Worse, if a sender transfers a task, drops below threshold as a result, and becomes a receiver wanting the job back, the node that took it may go above threshold and become a sender, and so on.
Goal: maintain each node's load close to the system average.
Problem: as described, this could cause thrashing.
Above Average Algorithm: avoids thrashing between the sender and receiver roles.
Transfer policy: two thresholds, equidistant from the estimated average load. A node with load below the low threshold is a receiver; above the high threshold, a sender; otherwise neither.
Selection policy: any
Location policy:
Sender initiated:
The sender broadcasts a TooHigh message, sets a TooHigh alarm (timeout), and waits.
A receiver, upon hearing TooHigh, cancels its TooLow alarm and sends an Accept message; it also increases its recorded load and sets an AwaitingTask timeout.
On receiving an Accept message, the sender, if it is still a sender, chooses the best task to transfer and transfers it.
If the TooHigh alarm times out, the sender increases its estimate of the average load and broadcasts a ChangeAverage message.
Receiver initiated:
The receiver broadcasts a TooLow message and sets a TooLow alarm.
If a TooHigh message is received, it follows the procedure above.
If the TooLow alarm expires, the receiver decreases its estimate of the average load and broadcasts a ChangeAverage message.
Information policy: demand driven; the algorithm adapts to the estimated load.
Stability: maintained by adjusting the thresholds in response to ChangeAverage messages.
Above Average does not keep track of which processors are busy or idle, only of the average load; its broadcasting (or polling) therefore may not be efficient.
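The two-threshold transfer policy and the ChangeAverage adjustment can be sketched as follows (the margin DELTA is an illustrative tuning parameter, not a value from the source):

```python
class AboveAverageNode:
    """Transfer-policy state for the Above Average algorithm: two
    thresholds kept equidistant from the estimated average load."""
    DELTA = 1.0   # half-width of the "acceptable" band (illustrative)

    def __init__(self, load, est_average):
        self.load = load
        self.est_average = est_average

    def role(self):
        if self.load < self.est_average - self.DELTA:
            return "receiver"
        if self.load > self.est_average + self.DELTA:
            return "sender"
        return "neither"

    def on_change_average(self, new_average):
        # a ChangeAverage broadcast moves both thresholds at once,
        # which is what damps sender/receiver thrashing
        self.est_average = new_average
```

Because both thresholds track the same estimate, a load level that made every node a sender simply raises the estimate until most nodes fall into the "neither" band, rather than triggering endless transfers.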
Stable Symmetric (ADSYM): keep the information gained from polling to classify nodes into three groups: underloaded (receivers), overloaded (senders), and OK. The node learns global state as a by-product of polling.
Initially, each processor assumes all other processors are receivers.
Transfer policy: Two thresholds determine which group
Selection policy: as a sender, only newly arrived tasks (non-preemptive); as a receiver, any task.
Location policy:
Sender:
Polls the receiver at the head of its receiver list (initially all other processors).
The polled node puts the sender at the head of its sender list and replies with its current status.
The sender updates the polled node's status in its lists and transfers the task if the polled node is not busy; otherwise it polls the next node.
Polling continues up to PollLimit or until no receivers remain on the list.
receiver
head to tail in senders lists (most recent first), tail to head in OK list (oldest first), receivers (tail to head)
If polled node is sender, transfers, sending its status after transfer
If polled is not sender, moves polling node to receiver list and replies with its state.
Original receiver updates state of polled node upon reply
Polls until finds or PollLimit
Information policy: demand driven
Stability: under high load the receiver list becomes empty, so sender-initiated polling stops. Under low load the wasted polling capacity is acceptable, and it has the benefit of keeping potential senders up to date on available receivers.
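The sender-initiated poll loop and the three-list bookkeeping might look like this (Peer, its threshold values, and PollLimit are illustrative; poll replies are modeled as direct method calls):

```python
from collections import deque

class Peer:
    """A polled node, classified by two thresholds (values illustrative)."""
    def __init__(self, name, queue, low=1, high=3):
        self.name, self.queue = name, queue
        self.low, self.high = low, high

    def state(self):
        if self.queue < self.low:
            return "receiver"
        if self.queue > self.high:
            return "sender"
        return "ok"

class AdsymNode:
    POLL_LIMIT = 5

    def __init__(self, peers):
        self.receivers = deque(peers)   # initially every other node
        self.senders = deque()
        self.ok = deque()

    def classify(self, peer):
        # a poll reply updates our lists: move peer to the matching one
        for lst in (self.receivers, self.senders, self.ok):
            if peer in lst:
                lst.remove(peer)
        {"receiver": self.receivers, "sender": self.senders,
         "ok": self.ok}[peer.state()].appendleft(peer)

    def sender_poll(self):
        """On a task arrival while overloaded: poll the head of the
        receiver list; global state is learned as a by-product."""
        polls = 0
        while self.receivers and polls < self.POLL_LIMIT:
            peer = self.receivers[0]
            self.classify(peer)
            polls += 1
            if peer.state() == "receiver":
                return peer             # transfer the new task here
        return None                     # process locally
```

Under high load every poll reclassifies the polled node off the receiver list, so the list drains and sender-initiated polling stops by itself, which is the source of the algorithm's stability.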
Stable sender initiated: uses the sender-initiated portion of the ADSYM algorithm as is, with only non-preemptive selection; the receiver-initiated portion is replaced.
Each node keeps an additional data structure, STATEVECTOR (one entry per node), recording which list this node is on at every other node, i.e., the state the other nodes currently believe.
When a sender polls, it also updates its STATEVECTOR entry for the polled node to reflect that the polled node now knows its sender status; the polled node likewise records the sender's list assignment.
When a node becomes a receiver, instead of polling it consults its STATEVECTOR and sends an update only to those nodes that do not already have it on their receiver lists.
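A sketch of the STATEVECTOR bookkeeping (the method names and message plumbing are my own; the vector simply records which list this node believes it is on at each other node):

```python
class StateVectorNode:
    def __init__(self, name, all_names):
        self.name = name
        # which list we believe we are on at each other node;
        # initially every node assumes every other node is a receiver
        self.statevector = {j: "receiver" for j in all_names if j != name}

    def on_poll_sent(self, target):
        # polling target as a sender reveals our sender status to it
        self.statevector[target] = "sender"

    def become_receiver(self):
        # no polling, no preemption: just notify the nodes that do not
        # already have us on their receiver lists
        targets = [j for j, lst in self.statevector.items()
                   if lst != "receiver"]
        for j in targets:
            self.statevector[j] = "receiver"   # they now know
        return targets   # send a receiver-update message to these nodes
```

The point of the vector is to make the receiver side purely informational: a new receiver announces itself to exactly the nodes whose view is stale, rather than pulling (and possibly preempting) tasks.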
* Shivaratri, Krueger, and Singhal, "Load Distributing for Locally Distributed Systems", IEEE Computer, vol. 25, no. 12, Dec. 1992, pp. 33-44.
Assumptions and Parameters: simulation results from the paper (offered system load of 0.85):
Sender initiated: better than receiver initiated at low loads, but still unstable at high loads.
Receiver initiated: better than sender initiated at high loads.
Adaptive algorithms are slightly better (SYM is adaptive).
ADSYM approaches the ideal; ADSYM is similar to receiver initiated under heavy loads but better under light loads.
A further experiment offered a system load of 0.85 generated by only a subset of the processors (heterogeneous load).