Fall 2000: CS 771/871 Operating Systems



 

Lecture 4 - Distributed OS

 What are distributed systems?

Distributed System
A distributed system is a collection of independent computers that appears to the users of the system as a single computer (Tanenbaum).

A distributed system should:

  1. Control network resource allocation so that resources are used in the most effective way
  2. Provide a convenient virtual machine
  3. Hide the distribution of resources
  4. Provide protection
  5. Provide secure communication (Goscinski)

This is in contrast to centralized computing (timesharing) or independent PCs.

Distributed systems do not share memory or a clock.

Why study them? Because of perceived advantages:
better price/performance (PCs are cheap, supercomputers are not)
improved speed (Speed of light is limit, many operations in parallel)
naturally supports distributed application (banking)
increased reliability (failures can be isolated)
incremental growth (keep old machines, just add more)
data sharing (computer supported cooperative work - or games)
device sharing (printers, scanners, plotters are expensive)
communications (support groups of people working together)
flexibility (match load to idle machines more easily).

But distributed systems have their disadvantages, chief among them the complexity and relative unavailability of software. Most of the problems of non-distributed systems still exist, and achieving some of the perceived advantages adds many new problems to solve, among them network congestion, rerouting, and security.

Also, most distributed systems complicate the job of users by forcing them to be aware of various aspects of the distributed system (don't you love URLs?).

Software Concepts

The job of the operating system is to mold recalcitrant hardware into a beautiful virtual machine.

One distinction is based on the degree of autonomy between processors (tightly vs loosely coupled).

Combinations of hardware and software

  1. Network OS: high degree of autonomy, possibly different operating systems, few system-wide resources (printers, network file system). Client-server protocols. User is aware of the distribution of resources.
    RLOGIN, RSH, SETENV DISPLAY, FTP
    are examples of explicit user actions that reflect the lack of transparency of the location of resources.
    On the other hand, NFS makes the location of your files transparent within the network complex.
  2. True Distributed OS: single system image presented to user: virtual uniprocessor. Note how this contrasts with traditional timesharing systems which make a single CPU look like many virtual CPUs. Requires:
  3. Multiprocessor Timesharing OS: tightly coupled. common ready queue. Shared memory, file system is like single CPU version, possible specialization of processors.
    QUESTION: why must scheduler run in a critical section?
    QUESTION: What else must be run in a critical section?

See table in figure 1.12

Another way to look at differences is to consider the traditional hierarchical structure of a centralized OS:

  1. File Management
  2. I/O Device Management
  3. Memory Management
  4. Process Management

Now consider a network of resources consisting of

Now consider different placements of the InterProcess Communications (IPC) module within the traditional hierarchy.
If between 1 and 2, then File Service can be provided remotely and transparently.
If between 2 and 3, then shared remote devices are supported transparently.
If between 3 and 4, then shared memory
If integrated into 4, true distributed OS.

NOTE: one can make access to remote resources appear transparent by adding software above the OS. This is particularly easy if the OS has a lightweight kernel and exports much of the management (like file management and I/O management) to user-level processes. These modules can then use the network to access remote resources.

One of the earliest attempts at a network OS was the National Software Works, undertaken in the mid-1970s.
It consisted of heterogeneous computers connected by ARPANET. Implementation was entirely at the application level (reminds one of the internet and web browsers, search engines, etc.). But it developed an IPC that provided common functionality on diverse systems. It dealt with addressing (naming) problems, which are still a big issue in distributed systems.

This early attempt had performance problems.

Other early attempts were based on remote procedure calls (RPC) built on top of a centralized OS with network access (figure 2-6 Goscinski).
Consisted of the following steps:

  1. User process uses the provided IPC to send a request to the local Remote Access System (RAS).
  2. Local RAS sends the request to an appropriate remote RAS.
  3. Remote RAS acknowledges to the local RAS.
  4. Local RAS transmits the acknowledgment to the user process with the information needed to set up a direct communication path to the remote RAS.
  5. User process sends pertinent data to the remote RAS,
  6. which accesses the appropriate resource on the remote system
  7. and sends an acknowledgment to the user process.
  8. Remote RAS awaits completion of the request
  9. and sends an acknowledgment back to the user process.

Consider some of the design tradeoffs.
QUESTION: what other ways to solve?
QUESTION: what are the design issues?

The Newcastle Connection was an early attempt at developing a network OS based on the UNIX OS. It used an extension of the UNIX hierarchical file naming structure to tie different systems together (loosely coupled).
Draw out the naming structure.
It replaced the library routines between the user and kernel levels. This intermediate level communicated with the others using RPC.
Because all processes are subjected to intermediate processing for kernel requests, it slows down everybody.

There is no widely used, commercially available distributed OS today, although there are many network OSs.

So why study? Because the design issues are important in general and the trend is towards more distributed systems (LAPLINK, mobile computing). It is the future and parts of it are already here.

Design Issues:

One of the fundamental problems in distributed OS is lack of global state and up-to-date information.


Issues in distributed systems


Communications Primitives


Communications in Distributed Systems

 


ISO Open Systems Interconnection Reference Model

Seven Layer Peer-to-Peer protocol suite:

  1. Physical

  2. DataLink

  3. Network

  4. Transport

  5. Session

  6. Presentation

  7. Application


 OSI Physical Layer

Allows different physical networks:
Twisted Pair
Coax
Fiber Optic

 


 OSI DataLink Layer

Probably adds trailer to message for checksum.


 OSI Network Layer

Primarily routing in a wide area network.

Some systems are set manually

Others use adaptive algorithms to reduce congestion

QUESTION: What are some of the issues in adaptive routing?

X.25: telephony, connection oriented protocol

IP (Internet Protocol): connectionless


 OSI Transport Layer

Provides reliable point to point connection (therefore connection oriented)

ISO provides 5 variants depending on nature of underlying network and degree of multiplexing.

(DoD has one called TCP (Transmission Control Protocol) plus a connectionless one called UDP (User Datagram Protocol))


 OSI Session Layer

Provides synchronization and checkpoints with recovery.

Rarely used


 OSI Presentation Layer

Defines Format of information:


 OSI Application Layer

What's left:


ATM (Asynchronous Transfer Mode)

OSI was developed in the 1970s and reflects older technology.
ATM takes advantage of fast switches and networks.

The latter's speeds imply high-speed multiplexing.


ATM2

Telephone companies: integrate voice and data

Dilemma: Voice is continuous and low bandwidth (circuit switching)
Data is bursty and high bandwidth (packet switching)

ATM: international standard
Virtual Circuit: Route saved in switches:
QUESTION: Why keep route in switches?

Small fixed size blocks called CELLS.
QUESTION: Why small? Why fixed size?

Cell Switching: multicasting, multiplexing.


ATM Reference Model


ATM Physical Layer

Synchronous Continuous Stream: empty cells fill void

Can use SONET (Synchronous Optical NETwork)
or SDH (Synchronous Digital Hierarchy) used by telephone companies.

SONET: 9 x 90 byte frame = 810 bytes, of which 36 are overhead,
transmitted every 125 microseconds = 51.84 Mbps (OC-1)

OC-n and OC-nc are used to provide more bandwidth.

ATM uses OC3c (155.520 Mbps) and OC12c (622.080Mbps).
2.5 Gbps coming.
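
The SONET figures above can be reproduced from the frame geometry alone. The following is a minimal sketch, assuming that OC-n simply carries n OC-1 frames per 125 microsecond period; the function names are made up for this illustration.

```python
# Frame geometry as quoted in the notes.
ROWS, COLUMNS = 9, 90        # 9 x 90 = 810 bytes per frame
OVERHEAD_BYTES = 36          # overhead bytes per frame
FRAME_PERIOD_S = 125e-6      # one frame every 125 microseconds


def oc_line_rate_mbps(n: int) -> float:
    """Line rate of OC-n in Mbps, assuming n OC-1 frames per period."""
    bits_per_frame = ROWS * COLUMNS * 8
    return n * bits_per_frame / FRAME_PERIOD_S / 1e6


def oc1_payload_bytes() -> int:
    """Bytes left for data in one OC-1 frame after overhead."""
    return ROWS * COLUMNS - OVERHEAD_BYTES


for n in (1, 3, 12, 48):
    print(f"OC-{n}: {oc_line_rate_mbps(n):.2f} Mbps")
print("OC-1 payload bytes per frame:", oc1_payload_bytes())
```

OC-48 is included because it is the rate usually rounded to 2.5 Gbps.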

Telephones might use ISDN (64kbps)


ATM Layer

Again a compromise:
Europe - small to avoid echo suppressors
Americans - big for efficiency
Result: 48 byte payload + 5 byte header = 53 byte cell

Does not fit nicely into the 774 byte SONET data payload

Header contains (figure 2-5):

VPI/VCI reflect the route assigned at call setup and change at each switch to reflect the next hop. VPI allows a group of connections destined for the same place to be rerouted together.


ATM Adaptation Layer

Networks are becoming too fast for typical computer OS interaction at the cell level.
Adaptation maps packets into cells.
Four classes of traffic:

  1. constant bit rate (audio/video)

  2. variable but bounded delay

  3. Connection Oriented

  4. Connectionless data

The computer industry did not like these and drafted AAL5 (SEAL - Simple and Efficient Adaptation Layer). It distinguishes the last cell of a packet, which contains the packet length and packet checksum.


ATM Switching

Computers connect to switches, which can connect to other switches. The virtual circuit sets the route in each switch during setup.

Requires fast switching speeds (about 3 microseconds per cell for OC3), with parallel input and output ports. There is a problem if two inputs need the same output.
May drop cells but must deliver the others in the order received.
Can queue, but only temporary congestion relief is possible.
Different solutions depending on the nature of the traffic streams; may use statistical analysis.

QUESTION: How fast does an OC12c switch need to be? 2.5Gbps?
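
The per-cell time budget of a switch port follows from the cell size and the line rate. Here is a small sketch, assuming a 53 byte cell (48 byte payload plus 5 byte header) and ignoring SONET framing overhead; the helper name is hypothetical.

```python
CELL_BYTES = 53  # 48 byte payload + 5 byte header


def cell_time_us(line_rate_mbps: float) -> float:
    """Time to receive one cell, in microseconds.

    A rate in Mbps is the same number of bits per microsecond,
    so dividing bits by Mbps gives microseconds directly.
    """
    return CELL_BYTES * 8 / line_rate_mbps


for name, rate in (("OC3c", 155.52), ("OC12c", 622.08)):
    print(f"{name}: {cell_time_us(rate):.3f} microseconds per cell")
```

The same function can be called with any other line rate to see how the budget shrinks as links get faster.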


ATM Implications

High Bandwidth but Physical Delays require rethinking of protocols for flow control and error handling and bandwidth utilization.

Asymptotically utilized bandwidth approaches 0 while waiting for speed of light transmission.

Question: How is this akin to the length limit on Ethernet?

Flow control may become rate control (a-priori agreements).

Sliding window protocols lead to low utilization (see calculation).

Maybe should centralize applications with keystroke per cell from user to application. Has architectural implications.

Conclusion: Potential increase in network bandwidth may not easily be realized: Active area of research and development.


Some Calculations of effects of high speed networks

Consider a 1Gbps network connecting Norfolk and San Francisco.
ATM cells (53 bytes = 424 bits) arrive about every 424 nanoseconds.
Signals travel at approx. 2/3 the speed of light, so transmission latency is about 15 milliseconds one way.
Implies there are 15 megabits in the pipeline before the first bit is received.
What if the receiver cannot buffer this much and rejects?
What if I require an ACK message after every 1,000 bits?
Effective transfer rate: 1 microsecond to stuff the bits into the pipe, 15 milliseconds of transfer latency, less than a microsecond to stuff the ACK back into the pipe (assuming no processing delays), and another 15 milliseconds of transfer latency = 1000 bits every 30.002 milliseconds, or about 33 Kbps!

If we increase the packet to 1,000,000 bits, the transfer takes about 31 milliseconds, so the rate is about 32 Mbps (still a long way from 1 Gbps).

Clearly requiring an ACK frequently greatly reduces usable bandwidth.
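
The arithmetic in this section can be checked with a short stop-and-wait model. This is a sketch under the stated assumptions only (1 Gbps link, 15 ms one-way latency, no processing delay, negligible ACK size by default); the function and parameter names are invented for the illustration. With those figures it gives roughly 33 Kbps for 1,000 bit packets and roughly 32 Mbps for 1,000,000 bit packets.

```python
def bits_in_flight(link_bps: float, one_way_latency_s: float) -> float:
    """Bits already on the wire before the first one reaches the receiver."""
    return link_bps * one_way_latency_s


def stop_and_wait_bps(packet_bits: float,
                      link_bps: float = 1e9,
                      one_way_latency_s: float = 0.015,
                      ack_bits: float = 0.0) -> float:
    """Effective rate when each packet must be ACKed before the next is sent."""
    send_time = packet_bits / link_bps   # time to push the packet into the pipe
    ack_time = ack_bits / link_bps       # time to push the ACK into the pipe
    round_trip = send_time + one_way_latency_s + ack_time + one_way_latency_s
    return packet_bits / round_trip


print(f"bits in flight: {bits_in_flight(1e9, 0.015):,.0f}")
for size in (1_000, 1_000_000):
    print(f"{size:>9} bits per ACK -> {stop_and_wait_bps(size):,.0f} bps")
```

Because the round trip is dominated by the 30 ms of latency, the effective rate grows almost linearly with the amount of data sent per ACK until the send time itself becomes comparable to the latency.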


Client Server Models

Let's start with an appealing simple model.
Server processes which provide service to client processes.

Communications is request/reply
connectionless and asynchronous.

Requires only three layers: physical, datalink, request/reply.

Can be implemented with two procedures (send and receive).

Procedure calls hide distributed nature of service (except perhaps in addressing). Looks like local procedure call. (see example fig2-9)
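
As a rough illustration of how two procedures can carry a request/reply exchange, the sketch below uses in-process queues as a stand-in for the network. Everything here (the `mailboxes` table, the message layout, `remote_upper`) is hypothetical and not taken from the textbook figure.

```python
import queue
import threading

# Stand-in "network": one mailbox per address.
mailboxes = {"server": queue.Queue(), "client": queue.Queue()}


def send(dest, message):
    """Deliver a message to the mailbox of the destination address."""
    mailboxes[dest].put(message)


def receive(addr):
    """Block until a message arrives at this address, then return it."""
    return mailboxes[addr].get()


def server_loop():
    """Serve requests until a 'stop' request arrives."""
    while True:
        sender, op, arg = receive("server")
        if op == "stop":
            break
        send(sender, arg.upper() if op == "upper" else None)


def remote_upper(text):
    """Looks like a local call; hides the request/reply exchange."""
    send("server", ("client", "upper", text))
    return receive("client")


server = threading.Thread(target=server_loop)
server.start()
result = remote_upper("hello")
send("server", ("client", "stop", None))
server.join()
print(result)
```

Note that the only place the distribution shows through is the address string passed to `send` and `receive`.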

There can be different kinds of services provided (another set of design issues - but outside realm of OS).


Client Server Issues

Not just client server issues


Addressing

What is the unit of addressing? machine, process, port, service?
Question: What is static? what is Dynamic?

Are processes given fixed names (numbers)?
Question: Can I run multiple server processes? Why would I want to?

Are processes given global names or are they machine specific?
Question: How to coordinate global names?
What is wrong with machine specific?
How does internet work its addressing for WWW?
What is the permanence of addresses?
If global, how to route to proper machine?

Could use name server?
Question: what if name server needs to move?

Assigning random addresses?
Question: who assigns? centralized/distributed
How to route?
How does the client know the address?

Distinguish: Name of service (dry cleaners), location (address) of service, and instantiation of server (clerk behind counter, process running on machine).
Question: what about competing servers?


Blocking vs Non-blocking transmission

Also called synchronous/asynchronous.

For both send and receive

Synchronous is easiest to program but async allows process to do other things while waiting.

Async requires polling or interrupts (call-backs) programmed into system.

Another complication: timeouts to handle transmission failures of certain types.
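
The difference between the two styles of receive, and the role of a timeout, can be sketched with Python's standard `queue` module standing in for the message system. The `inbox` and the function names are assumptions made for this example.

```python
import queue

inbox = queue.Queue()  # stand-in for the process's incoming messages


def receive_blocking(timeout_s=None):
    """Synchronous receive: wait for a message; return None on timeout."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        return None


def receive_nonblocking():
    """Polling receive: return at once, None if nothing has arrived."""
    try:
        return inbox.get_nowait()
    except queue.Empty:
        return None


first_poll = receive_nonblocking()            # nothing has been sent yet
inbox.put("reply-1")
second_poll = receive_nonblocking()           # message is now available
timed_out = receive_blocking(timeout_s=0.05)  # nothing more arrives
print(first_poll, second_poll, timed_out)
```

A caller that gets `None` from the timed receive still cannot tell whether the request, the reply, or the server was lost, which is why timeouts only handle failures of certain types.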


Buffered vs Non-buffered transmission

Who supplies the message buffer and when?
How big is it?
What if a message is sent before the server issues a "receive" call?
What if the server handles many clients? How does it receive all potential messages?

The kernel could buffer messages in anticipation of a call to "receive".
This could be the process's mailbox.

Could block sender if no buffer available.
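
A kernel-managed mailbox with a fixed number of slots might look like the sketch below. The `Mailbox` class is hypothetical; it only shows the choice between blocking the sender and refusing the message when no buffer is available.

```python
import queue


class Mailbox:
    """Fixed-size buffer that holds messages until the owner calls receive."""

    def __init__(self, slots):
        self._q = queue.Queue(maxsize=slots)

    def deliver(self, msg, block=True, timeout=None):
        """Buffer a message. With block=False, return False if the box is full."""
        try:
            self._q.put(msg, block=block, timeout=timeout)
            return True
        except queue.Full:
            return False

    def receive(self):
        """Remove and return the oldest buffered message (blocks if empty)."""
        return self._q.get()


box = Mailbox(slots=2)
accepted = [box.deliver(m, block=False) for m in ("a", "b", "c")]
first = box.receive()                  # frees one slot
retry = box.deliver("c", block=False)  # now there is room
print(accepted, first, retry)
```

Messages sent before the server issues "receive" are simply held in the mailbox, up to its capacity.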


Reliable vs Unreliable

Who guarantees delivery: application or system (OS or network)?
Question: How does OS know which messages are requests and which are replies?

Should the reply be acknowledged?


Client Server Packet Types

 

Type             From    To      Description
Request          Client  Server  Service Request
Reply            Server  Client  Reply
ACK              Either  Other   ACK previous packet
Are You Alive?   Client  Server  see if crashed
I Am Alive       Server  Client  has not crashed
Try Again        Server  Client  no capacity
Address Unknown  Server  Client  no process

  Last two are needed to distinguish between hard and soft failures
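
One way to picture the packet types is as an enumeration plus the client's reaction to each packet it can receive. This is only a sketch: the action strings, and the reading of "Try Again" as the soft failure and "Address Unknown" as the hard failure, are assumptions made for the illustration.

```python
from enum import Enum


class PacketType(Enum):
    REQUEST = "Request"
    REPLY = "Reply"
    ACK = "ACK"
    ARE_YOU_ALIVE = "Are You Alive?"
    I_AM_ALIVE = "I Am Alive"
    TRY_AGAIN = "Try Again"
    ADDRESS_UNKNOWN = "Address Unknown"


def client_action(received: PacketType) -> str:
    """What a client might do on receiving each packet type."""
    if received is PacketType.REPLY:
        return "deliver result"
    if received is PacketType.ACK:
        return "stop retransmitting"
    if received is PacketType.I_AM_ALIVE:
        return "keep waiting"
    if received is PacketType.TRY_AGAIN:
        return "retry later"       # soft failure: server exists but is busy
    if received is PacketType.ADDRESS_UNKNOWN:
        return "report error"      # hard failure: no such process
    return "ignore"                # packets a client does not expect


for p in PacketType:
    print(f"{p.value:16} -> {client_action(p)}")
```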

Homework: compare this to WWW client/server protocol. Due one week.

 


Client/server has a strong message passing flavor.
Like doing I/O (reading and writing information from the network).

Question: why do we need the concept of disk storage at all? why I/O?


Remote Procedure Calls (RPC)

procedure call which transparently executes on remote machine


 RPC: Analogy with system calls, which masquerade as procedure calls

client/server request/reply hidden in library stubs


Parameter Marshaling

Could use a canonical form (network standard).
Problem: possibly inefficient between like machines

Could indicate which format used and let server translate if necessary.
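
The canonical-form option can be sketched with Python's standard `struct` module, where the `!` prefix selects network (big-endian) byte order with no padding. The two-field message layout here is hypothetical.

```python
import struct

# Assumed wire format: 4 byte signed integer followed by an 8 byte IEEE double,
# both in network (big-endian) byte order.
WIRE_FORMAT = "!id"


def marshal(count: int, value: float) -> bytes:
    """Pack the parameters into the canonical wire representation."""
    return struct.pack(WIRE_FORMAT, count, value)


def unmarshal(data: bytes):
    """Recover the parameters from the wire representation."""
    return struct.unpack(WIRE_FORMAT, data)


wire = marshal(258, 1.5)
little_endian_int = struct.pack("<i", 258)  # what a little-endian machine stores
print(wire.hex(), little_endian_int.hex(), unmarshal(wire))
```

Two little-endian machines talking to each other would both pay for a byte swap they do not need, which is the inefficiency noted above.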


But what about pointers?

what about global variables?


Addressing Solution: Dynamic Binding

Question: Why is figure 2-22 stateless? What does that mean? How to make it more like UNIX file services?

When server starts up, it exports its interface to a binder which registers the services provided.

The client stub must import the interface the first time it is called, to get a handle for sending messages.

Overhead may be a problem
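
The export/import exchange can be sketched as a small registry. The `Binder` and `ClientStub` classes below are hypothetical, and a plain Python callable stands in for the handle that a real system would use to address the server.

```python
class Binder:
    """Registry of exported interfaces, keyed by (name, version)."""

    def __init__(self):
        self._registry = {}

    def export(self, interface, version, handle):
        """Called by a server at startup to register its service."""
        self._registry[(interface, version)] = handle

    def lookup(self, interface, version):
        """Called by a client stub; returns the handle or None."""
        return self._registry.get((interface, version))


class ClientStub:
    """Imports the interface on first use and caches the handle."""

    def __init__(self, binder, interface, version):
        self._binder = binder
        self._key = (interface, version)
        self._handle = None
        self.lookups = 0  # counts trips to the binder

    def call(self, *args):
        if self._handle is None:
            self.lookups += 1
            self._handle = self._binder.lookup(*self._key)
            if self._handle is None:
                raise LookupError("cannot locate server")
        return self._handle(*args)


binder = Binder()
binder.export("adder", 1, lambda a, b: a + b)
stub = ClientStub(binder, "adder", 1)
results = [stub.call(2, 3), stub.call(10, 20)]
print(results, "binder lookups:", stub.lookups)
```

Caching the handle keeps the binder off the path of every call after the first, which is one way to limit the overhead.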


Effect of Failures

  1. Cannot locate server: return error
  2. Request message lost : set timer and resend (watch duplicates)
  3. Reply lost : who notices, is operation idempotent?
  4. server crashes after request : difficult to determine if request was acted upon
  5. client crashes after request

    What about orphans which have initiated other tasks, perhaps at a later time?
    what if orphan has locks on resources?
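
For the lost-message cases, one common approach is to number requests so the server can recognize a retransmission and replay the earlier reply instead of repeating a non-idempotent operation. The sketch below simulates reply loss deterministically; the `Server` class and `call_with_retry` are assumptions for the illustration, not a described implementation.

```python
class Server:
    """Applies deposits and remembers replies to filter duplicate requests."""

    def __init__(self):
        self.balance = 0
        self._replies = {}  # (client_id, seq) -> cached reply

    def handle(self, client_id, seq, amount):
        key = (client_id, seq)
        if key in self._replies:      # duplicate: do not redo the work
            return self._replies[key]
        self.balance += amount        # non-idempotent operation
        self._replies[key] = self.balance
        return self.balance


def call_with_retry(server, client_id, seq, amount,
                    lose_first_n_replies=0, max_tries=5):
    """Resend the same numbered request until a reply gets through."""
    for attempt in range(max_tries):
        reply = server.handle(client_id, seq, amount)
        if attempt < lose_first_n_replies:
            continue  # pretend the reply was lost and the timer expired
        return reply
    raise TimeoutError("no reply from server")


server = Server()
reply = call_with_retry(server, "c1", seq=1, amount=100, lose_first_n_replies=2)
print(reply, server.balance)
```

Without the sequence number the three transmissions would have deposited 300 rather than 100.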

 

Also, how should failures be reported to the client (return codes, exceptions)? These failures may not exist on a single CPU system (and hence are not allowed for in the procedure spec).


RPC Implementation Issues

See critical path analysis fig 2-27.


RPC problem areas


RPC semantics

Panzieri and Shrivastava Correctness condition

Let Ci be an RPC call and Wi its execution on some machine.

Since the "Wi" can share data, the correctness condition is

C1 -> C2 implies W1 -> W2

where -> means "happens before"

Why not use a message instead of RPC?

What about passing procedures as arguments?


Group Communication

RPC not suitable abstraction


ISIS

ISIS is a synchronous system (events happen sequentially in same order on all machines).

Since events are not instantaneous, interweaving is possible
Two events can be causally related, otherwise concurrent

Virtual synchrony means that if two messages are causally related, all processes must receive them in the same (correct) order.
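
Whether two events are causally related or concurrent can be decided with vector timestamps, one well-known technique for tracking causality. The sketch below is a generic illustration under that assumption, not a description of the ISIS implementation; timestamps are tuples with one counter per process.

```python
def happened_before(a, b):
    """True if vector timestamp a causally precedes vector timestamp b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither timestamp causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


send_m1 = (1, 0, 0)   # process 0 sends m1
send_m2 = (2, 1, 0)   # process 1 sends m2 after having received m1
send_m3 = (0, 0, 1)   # process 2 sends m3 knowing nothing of the others

print(happened_before(send_m1, send_m2))  # m1 and m2 are causally related
print(concurrent(send_m1, send_m3))       # m1 and m3 are concurrent
```

Under the ordering rule above, every process must see m1 before m2, while m1 and m3 may be delivered in either order.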


ISIS communications Primitives

 

 

 


Copyright chris wild 1996.
For problems or questions regarding this web contact [Dr. Wild].
Last updated: September 04, 1996.


 

 

 

