CS 771/871 Operating Systems
Lecture 2 :
Communications in Distributed Systems
-
Distributed ==> separate computers
communicating
-
System Architecture: Layered Abstract
Machines
Principle: separation of concerns
-
Some Issues:
-
Virtual circuit or
connectionless or both
-
Presentation VS Content
-
Open Systems or Architecture
Specific
-
Multimedia?
Seven Layer Peer-to-Peer protocol suite:
-
Physical
-
Data Link
-
Network
-
Transport
-
Session
-
Presentation
-
Application
Allows different physical networks:
Twisted Pair
Coax
Fiber Optic
-
Defines Physical Properties of bits
(voltage levels, phase shifts, etc)
-
Speed of communication and clocking
-
Physical configuration of connectors.
-
Assembles bits into groups (frames,
packets)
-
Defines start/end of group
-
Protects against errors (e.g. checksum)
-
Assigns sequence numbers (detects lost
frames)
-
Defines control messages for error
correction
Probably adds trailer to message for checksum.
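The framing ideas above (sequence number, start/end of group, checksum trailer) can be sketched as follows. This is only an illustration: the 2-byte sequence and length fields and the use of CRC-32 from Python's zlib are assumptions, not the format of any particular data link protocol.

```python
import struct
import zlib

def make_frame(seq: int, payload: bytes) -> bytes:
    """Build header (sequence number, length) + payload + CRC-32 trailer."""
    header = struct.pack("!HH", seq, len(payload))
    trailer = struct.pack("!I", zlib.crc32(header + payload))
    return header + payload + trailer

def parse_frame(frame: bytes):
    """Return (seq, payload); raise ValueError if the frame is damaged."""
    if len(frame) < 8:
        raise ValueError("frame too short")
    body, trailer = frame[:-4], frame[-4:]
    seq, length = struct.unpack("!HH", body[:4])
    if len(body) != 4 + length:
        raise ValueError("length mismatch")
    if struct.unpack("!I", trailer)[0] != zlib.crc32(body):
        raise ValueError("checksum mismatch")
    return seq, body[4:]
```

A receiver that sees sequence numbers 6 and then 8 knows frame 7 was lost; a checksum mismatch tells it a frame arrived damaged.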
Primarily routing in a wide area network.
Some systems are set manually
Others use adaptive algorithms to reduce
congestion
QUESTION: What are
some of the issues in adaptive routing?
X.25: telephony, connection oriented protocol
IP (Internet Protocol): connectionless
Provides reliable point to point connection
(therefore connection oriented)
ISO provides 5 variants depending on nature of
underlying network and degree of multiplexing.
(DoD has one called TCP (Transmission
Control Protocol) plus a connectionless one called UDP (User
Datagram Protocol))
Provides synchronization and checkpoints with recovery.
Rarely used
Defines Format of information:
-
Character set
-
number representation
-
message formats
What's left:
ATM (Asynchronous
Transfer Mode)
OSI was developed in 1970's and reflects older
technology.
ATM takes advantage of fast switches and networks.
-
Telephones 4KHz analog channels
-
ARPANET built on 56kbps lines
-
T1 1.5 Mbps
-
T3 45 Mbps
-
Evolving 155 Mbps to 1 Gbps (faster than
internal disk drives)
The later speeds imply high speed multiplexing.
Telephone companies: integrate voice and data
Dilemma: Voice is continuous and low bandwidth (circuit
switching)
Data: bursty high bandwidth (packet switching)
ATM: international standard
Virtual Circuit: Route saved in switches:
QUESTION: Why keep route in switches?
Small fixed size blocks called CELLS.
QUESTION: Why small? Why
fixed size?
Cell Switching: multicasting, multiplexing.
Synchronous Continuous Stream: empty cells fill void
Can use SONET (Synchronous Optical
NETwork)
or SDH (Synchronous Digital Hierarchy) used by
telephone companies.
SONET: 9 x 90 byte frame (810 bytes, 36 of them overhead)
transmitted every 125 microseconds = 51.84 Mbps (OC-1)
OC-n and OC-nc are used to provide more bandwidth.
ATM uses OC3c (155.520 Mbps) and OC12c (622.080Mbps).
2.5 Gbps coming.
Telephones might use ISDN (64kbps)
Again a compromise:
Europe - small to avoid echo suppressors
Americans - big for efficiency
Result: 48-byte payload + 5-byte header
Does not fit nicely into the 774-byte SONET data payload
Header contains (figure 2-5):
-
GFC (4 bits) Generic Flow Control (unused)
-
VPI (8) Virtual Path Id (used for grouping
end-to-end)
-
VCI(16) Virtual channel Id
-
Payload Type(3): data, control
-
CLP(1): priority
-
CRC(8) checksum on header only
VPI/VCI reflect the route assigned at call setup and change at
each switch to reflect the next hop. VPI allows a group of connections
destined for the same place to be rerouted together.
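The header layout listed above (4 + 8 + 16 + 3 + 1 + 8 = 40 bits) can be packed into 5 bytes as sketched below. The field widths come from the list; the checksum parameters (CRC-8 with polynomial x^8 + x^2 + x + 1, then XOR with 0x55) are an assumption about how the header check is usually described and should be verified against the standard. The function names are hypothetical.

```python
import struct

def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 over data (polynomial is an assumption, see lead-in)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def pack_header(gfc: int, vpi: int, vci: int, pt: int, clp: int) -> bytes:
    """Pack GFC(4) VPI(8) VCI(16) PT(3) CLP(1) plus an 8-bit header check."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pt << 1) | clp
    first4 = struct.pack("!I", word)
    return first4 + bytes([crc8(first4) ^ 0x55])

def unpack_header(header: bytes) -> dict:
    """Verify the header check and split the fields back out."""
    if len(header) != 5 or (crc8(header[:4]) ^ 0x55) != header[4]:
        raise ValueError("bad header")
    (word,) = struct.unpack("!I", header[:4])
    return {"gfc": word >> 28, "vpi": (word >> 20) & 0xFF,
            "vci": (word >> 4) & 0xFFFF, "pt": (word >> 1) & 0x7,
            "clp": word & 0x1}

def relabel(header: bytes, new_vpi: int, new_vci: int) -> bytes:
    """What a switch does per hop: rewrite VPI/VCI, recompute the check."""
    f = unpack_header(header)
    return pack_header(f["gfc"], new_vpi, new_vci, f["pt"], f["clp"])
```

Because the checksum covers the header only, a switch that relabels a cell recomputes one byte and never touches the 48-byte payload.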
Networks are becoming too fast for a typical computer OS to
interact at the cell level.
Adaptation maps packets into cells.
Four classes of traffic:
-
constant bit rate (audio/video)
-
variable but bounded delay
-
Connection Oriented
-
Connectionless data
The computer industry did not like these and drafted AAL5 (SEAL - Simple
and Efficient Adaptation Layer). It marks the last cell of a packet, which
contains the packet length and packet checksum.
Computers connect to switches, which can connect to other
switches. A virtual circuit sets the route in each switch during setup.
Requires fast switching speeds (about 3 microseconds per cell for OC3), with
parallel input and output ports. Problem if two inputs need the same
output.
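The per-cell time budget follows from the cell size and the line rate. A small sketch, assuming the full line rate carries cells (SONET overhead is ignored, so the real budget is slightly larger):

```python
CELL_BITS = 53 * 8  # 48-byte payload + 5-byte header = 424 bits

def cell_time_us(line_rate_mbps: float) -> float:
    """Time for one cell to arrive, in microseconds.

    Mbps is the same as bits per microsecond, so the division
    gives microseconds directly.
    """
    return CELL_BITS / line_rate_mbps

print(round(cell_time_us(155.520), 2))  # OC-3c: prints 2.73
```

The same function can be applied to the other line rates mentioned in these notes.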
May drop cells but must deliver others in order received.
Can queue, but only temporary congestion relief is possible.
Different solutions depending on nature of traffic streams, may
use statistical analysis.
QUESTION: How fast does an OC12c switch
need to be? 2.5Gbps?
High Bandwidth but Physical Delays require rethinking of
protocols for flow control and error handling and bandwidth
utilization.
Asymptotically utilized bandwidth approaches 0 while waiting
for speed of light transmission.
Question: How is this akin to the
length limit on Ethernet?
Flow control may become rate control
(a-priori agreements).
Sliding window protocols lead to low
utilization (see calculation).
Maybe should centralize applications
with keystroke per cell from user to application. Has
architectural implications.
Conclusion:
Potential increase in network bandwidth may not easily be
realized: Active area of research and development.
Some Calculations of effects of
high speed networks
Consider a 1Gbps network connecting
Norfolk and San Francisco.
ATM cells (53 bytes = 424 bits) arrive about every 424 nanoseconds.
Propagation latency at approx. 2/3 the speed of light = 15
milliseconds one way.
Implies there are 15 megabits in the pipeline before the first
bit is received.
What if the receiver cannot buffer this much and rejects?
What if I require an ACK message after every 1,000 bits?
Effective transfer rate: 1 microsecond to stuff bits into pipe,
15 milliseconds transfer latency, less than a microsecond to
stuff ACK back into pipe (assuming no processing delays), another
15 milliseconds transfer latency = 1000 bits every 30.002
milliseconds, or about 33 Kbps!
If the packet is increased to 1,000,000 bits, the
transfer rate is about 32 Mbps (still a long way from 1 Gbps).
Clearly requiring an ACK frequently
greatly reduces usable bandwidth.
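The stop-and-wait arithmetic can be checked with a few lines. This is a sketch under the same assumptions as the notes (1 Gbps line, 15 ms one way, no processing delay); the ACK size of 1,000 bits is an assumed upper bound, since the notes only say the ACK takes under a microsecond to send.

```python
def effective_bps(block_bits: float, rate_bps: float = 1e9,
                  one_way_s: float = 0.015, ack_bits: float = 1000) -> float:
    """Throughput when every block must be ACKed before the next is sent."""
    send_time = block_bits / rate_bps  # stuff the block into the pipe
    ack_time = ack_bits / rate_bps     # stuff the ACK into the pipe
    cycle = send_time + one_way_s + ack_time + one_way_s
    return block_bits / cycle

small = effective_bps(1_000)      # ACK after every 1,000 bits
large = effective_bps(1_000_000)  # ACK after every 1,000,000 bits
```

With these inputs `small` comes to roughly 33,000 bits per second and `large` to roughly 32 million bits per second. The round trip of about 30 ms dominates both, which is why sending more before waiting helps and why a 1 Gbps line needs megabits in flight to stay busy.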
Let's start with an appealing simple model.
Server processes which provide service to client processes.
Communication is request/reply,
connectionless and asynchronous.
Requires only three layers: physical, datalink,
request/reply.
Can be implemented with two procedures (send and receive).
Procedure calls hide distributed nature of service (except
perhaps in addressing). Looks like local procedure call. (see
example fig2-9)
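A minimal sketch of the two-procedure model, using UDP datagrams on the local machine as the connectionless transport. The function names `serve_once` and `do_request` and the "uppercase the request" service are made up for illustration; this is not the example in fig 2-9.

```python
import socket
import threading

def serve_once(sock: socket.socket) -> None:
    """Server side: one receive, do the work, one send back to the asker."""
    request, client_addr = sock.recvfrom(1024)
    sock.sendto(request.upper(), client_addr)

def do_request(server_addr, request: bytes, timeout: float = 2.0) -> bytes:
    """Client side: looks like a local call, is really send then receive."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # do not wait forever on a lost message
        sock.sendto(request, server_addr)
        reply, _ = sock.recvfrom(1024)
        return reply

# Bind the server first so the request is queued even if the
# server thread has not reached its receive call yet.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
worker.start()
reply = do_request(server.getsockname(), b"hello")
worker.join()
server.close()
```

Note that the only place the distributed nature shows through is the server address passed to `do_request`.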
There can be different kinds of services provided (another
set of design issues - but outside realm of OS).
Not just client server issues
See Figure 2-14
What is the unit of addressing? machine, process, port,
service?
Question: What is static? what is Dynamic?
Are processes given fixed names (numbers)?
Question: Can I run multiple server
processes? Why would I want to?
Are processes given global names or are they machine
specific?
Question: How to coordinate global names?
What is wrong with machine specific?
How does the Internet handle addressing for the WWW?
What is the permanence of addresses?
If global, how to route to proper machine?
Could use name server?
Question: what if name server needs
to move?
Assigning random addresses?
Question: who assigns?
centralized/distributed
How to route?
How does the client know the address?
Distinguish: Name of service (dry
cleaners), location (address) of service, and instantiation of
server (clerk behind counter, process running on machine).
Question: what about competing
servers?
Blocking
vs Non-blocking transmission
Also called synchronous/asynchronous.
For both send and receive
Synchronous is easiest to program but
async allows process to do other things while waiting.
Async requires polling or interrupts
(call-backs) programmed into system.
Another complication: timeouts to
handle transmission failures of certain types.
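The difference between the receive styles can be seen on a local socket pair. A sketch only: the variable names are arbitrary and the 0.1 second timeout is chosen just to keep the example quick.

```python
import socket

a, b = socket.socketpair()  # two connected endpoints in one process

# Non-blocking receive: control returns at once when nothing has arrived.
b.setblocking(False)
try:
    b.recv(64)
    polled = "data"
except BlockingIOError:
    polled = "nothing yet"  # caller can do other work and poll again later

# Blocking receive with a timeout: wait, but give up after 0.1 seconds.
b.settimeout(0.1)
try:
    b.recv(64)
    timed = "data"
except socket.timeout:
    timed = "timed out"  # treat as a possible transmission failure

# Blocking receive that succeeds once the peer has sent something.
a.sendall(b"ping")
b.settimeout(1.0)
received = b.recv(64)

a.close()
b.close()
```

The non-blocking version needs the caller to come back and poll; the timeout version is the synchronous style plus the failure handling mentioned above.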
Buffered
vs Non-buffered transmission
Who supplies message buffer and when?
How big is it?
What if message sent before server issues "receive"
call?
What if server handles many clients? How to receive all potential
messages?
Kernel could buffer messages in anticipation of a call to
"receive".
This could be the process's mailbox.
Could block sender if no buffer available.
Who guarantees delivery: application or system (OS or
network)?
Question: How does OS know which messages
are requests and which are replies?
Should the reply be acknowledged?
Type            | From   | To     | Description
Request         | Client | Server | Service Request
Reply           | Server | Client | Reply
ACK             | Either | Other  | ACK previous packet
Are You Alive?  | Client | Server | see if crashed
I Am Alive      | Server | Client | has not crashed
Try Again       | Server | Client | no capacity
Address Unknown | Server | Client | no process
Last two are needed to distinguish between hard and
soft failures
Homework: compare this to WWW client/server protocol. Due
one week.
Client/server has strong message passing flavor
Like doing I/O(read and write information from network).
Question: why do we need the concept of disk storage at all?
why I/O?
Remote Procedure Calls (RPC)
procedure call which transparently
executes on remote machine
- How is normal procedure call
implemented? (Figure 2-17)
- call by value
- by reference
- by copy/restore -
difference from call by reference
- New issues
- different address spaces
(scoping)
- possibly different
architectures
- crashes
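Copy/restore and call by reference differ when the same variable is passed twice. The sketch below simulates both in Python, using a one-element list as a stand-in for a memory cell; the helper names are made up for illustration.

```python
def bump_both(a, b):
    """The called procedure: increments each of its two parameters."""
    a[0] += 1
    b[0] += 1

def call_by_reference(proc, cell):
    """Both parameters alias the caller's cell."""
    proc(cell, cell)
    return cell[0]

def call_by_copy_restore(proc, cell):
    """Copy the value in for each parameter, run, then copy results back."""
    copy_a, copy_b = [cell[0]], [cell[0]]  # copy in
    proc(copy_a, copy_b)
    cell[0] = copy_a[0]                    # restore first parameter
    cell[0] = copy_b[0]                    # restore second parameter
    return cell[0]

by_ref = call_by_reference(bump_both, [0])        # 2: both bumps hit one cell
by_copy = call_by_copy_restore(bump_both, [0])    # 1: each copy bumped once
```

Copy/restore is what a remote call can offer in place of pointers, since the two machines do not share an address space.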
RPC: Analogy with system calls, which masquerade as procedure
calls
- Client stub is called as normal
procedure
- Assembles parameters into message
to remote server
- Traps to kernel for message
passing
- Server stub receives message
- unpacks parameters from message
- makes normal procedure call on
behalf of client process
- After call, server stub packs
result in message
- Traps to kernel to send reply back
to client
- Client stub receives message
- Unpacks results into output
parameters
- returns as normal procedure call
client/server request/reply hidden in
library stubs
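The stub sequence above can be sketched in one process. JSON stands in for the message format and a plain function call (`transport`, a hypothetical name) stands in for the trap to the kernel and the network; a real RPC system would generate the stubs and use real message passing.

```python
import json

# --- server side -------------------------------------------------------
def add(a, b):
    """The real procedure, written as if it were only ever called locally."""
    return a + b

PROCEDURES = {"add": add}

def server_stub(request_bytes: bytes) -> bytes:
    """Unpack parameters, make the normal call, pack the result."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

# --- stand-in for kernel message passing -------------------------------
def transport(request_bytes: bytes) -> bytes:
    return server_stub(request_bytes)

# --- client side -------------------------------------------------------
def remote_add(a, b):
    """Client stub: called like a normal procedure."""
    request = json.dumps({"proc": "add", "args": [a, b]}).encode()
    reply = transport(request)          # send request, wait for reply
    return json.loads(reply)["result"]  # unpack result and return

total = remote_add(2, 3)
```

The caller of `remote_add` cannot tell from the call itself that the addition ran elsewhere.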
- Different formats
- Different byte orders
- Different data types
Could use a canonical form (network standard).
Problem: possibly inefficient between like machines
Could indicate which format used and let server translate if
necessary.
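Both options can be shown with Python's struct module. The first part shows the same integer in little-endian, big-endian and network (canonical) order; the second part sketches the "say which format was used" alternative, where the one-byte tag and the helper names are assumptions made for illustration.

```python
import struct

value = 0x01020304
little = struct.pack("<I", value)   # b"\x04\x03\x02\x01"
big = struct.pack(">I", value)      # b"\x01\x02\x03\x04"
network = struct.pack("!I", value)  # network order is big-endian

# A receiver that assumes the wrong order silently gets a different number.
misread = struct.unpack(">I", little)[0]  # 0x04030201

def encode_tagged(number: int, order: str) -> bytes:
    """Sender uses its native order and says which one ('<' or '>')."""
    return order.encode("ascii") + struct.pack(order + "I", number)

def decode_tagged(message: bytes) -> int:
    """Receiver translates only if the sender's order differs from its own."""
    order = message[:1].decode("ascii")
    return struct.unpack(order + "I", message[1:5])[0]
```

With the canonical form, two little-endian machines each convert and convert back for nothing; with the tagged form, like machines do no conversion work at the cost of one extra byte.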
- Forbid (not transparent)
- Copy object referenced
Question: what about user-defined data structures?
What if the size of the structure is unknown?
- transfer values as needed
what about global variables?
Question: Why is figure 2-22 stateless? What does that mean?
How to make it more like UNIX file services?
When server starts up, it exports its interface to a binder
which registers the services provided.
- name
- version number (why)
- unique ID (allows several servers)
- handle (ethernet or IP address etc)
- authentication
Client stub needs to import first time called to get handle to
send message.
Overhead may be a problem
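A binder is essentially a lookup table. A minimal in-memory sketch, with made-up class and method names and with authentication left out:

```python
class Binder:
    """Registry mapping (name, version) to the servers exporting it."""

    def __init__(self):
        self._table = {}  # (name, version) -> {unique_id: handle}

    def export(self, name, version, unique_id, handle):
        """Called by a server at startup to register its interface."""
        self._table.setdefault((name, version), {})[unique_id] = handle

    def withdraw(self, name, version, unique_id):
        """Called by a server that is shutting down."""
        self._table.get((name, version), {}).pop(unique_id, None)

    def lookup(self, name, version):
        """Return (unique_id, handle) of some server, or raise LookupError."""
        servers = self._table.get((name, version))
        if not servers:
            raise LookupError(f"no server exports {name} v{version}")
        unique_id = next(iter(servers))
        return unique_id, servers[unique_id]


class ClientStub:
    """Imports the handle on first use only, then reuses it."""

    def __init__(self, binder, name, version):
        self._binder, self._name, self._version = binder, name, version
        self._handle = None
        self.lookups = 0  # counts trips to the binder

    def handle(self):
        if self._handle is None:
            self.lookups += 1
            _, self._handle = self._binder.lookup(self._name, self._version)
        return self._handle
```

Matching on version keeps a client built against an old interface from binding to a newer, incompatible server. Caching the handle in the stub limits the overhead to the first call, but every short-lived client still pays it once.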
- Cannot locate server: return error
- Request message lost : set timer and resend (watch
duplicates)
- Reply lost : who notices, is operation idempotent?
- server crashes after request : difficult to determine if
request was acted upon
- At least once semantics: try and try again
- at most once: don't try again
- exactly once: not possible in general (printing,
bank transfers)
- client crashes after request
- Extermination: client logs
requests on stable storage, orphan is killed after reboot
what about nested RPCs? a network partition
prevents killing
- Reincarnation: each time the client boots it sets a new
epoch number
- Gentle reincarnation: try to locate the client first
- Expiration: requests expire; a restart must wait longer
than the expiration period
What about orphans which have initiated other tasks,
perhaps at a later time?
what if orphan has locks on resources?
Also how to report failures to client (return codes,
exceptions) which may not be there for single CPU system (and
hence not allowed for in the procedure spec).
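The lost-reply case can be made safe for a non-idempotent operation if the client numbers its requests and the server remembers replies. A sketch with hypothetical names; it does not survive a server crash, since the reply cache lives in memory.

```python
class AccountServer:
    """Withdrawing money is not idempotent, so duplicates must be filtered."""

    def __init__(self, balance: int):
        self.balance = balance
        self._replies = {}  # (client_id, request_id) -> reply already sent

    def withdraw(self, client_id: str, request_id: int, amount: int) -> int:
        key = (client_id, request_id)
        if key in self._replies:       # duplicate: resend the old reply,
            return self._replies[key]  # do not repeat the operation
        self.balance -= amount
        self._replies[key] = self.balance
        return self.balance


def call_with_retry(send, max_tries: int = 3):
    """Resend the same request until a reply arrives (None means timeout)."""
    for _ in range(max_tries):
        reply = send()
        if reply is not None:
            return reply
    raise TimeoutError("no reply after retries")


server = AccountServer(balance=100)
drop_first_reply = [True]

def send():
    reply = server.withdraw("client-1", 1, 30)  # same request id each time
    if drop_first_reply[0]:
        drop_first_reply[0] = False
        return None  # simulate the reply being lost in the network
    return reply

result = call_with_retry(send)
```

The request runs once even though it was sent twice. Without the request id the retry would withdraw 60 instead of 30, which is the "watch duplicates" problem.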
- connection or connectionless
- general purpose protocol or not
- stop-n-wait (acks) or blast or selective repeat
- flow control (overrun errors)
- What about ACKing replies?
- copying between user and kernel spaces can be a big
factor
remember that each layer adds its own headers
might be able to use virtual memory hardware to
avoid copy
- Timer Management: need not be precise (could be polled by
kernel)
See critical path analysis fig 2-27.
- Global Variables
- weakly typed languages (C allows unbounded parameters to
be passed).
- Complex data structures with pointers
- Printf in C
- Given file servers for read and write, what are pipes?
could have read (only) servers or write only but ends of
pipe are problems
- terminal servers sometimes want to interrupt client
- What about group servers? (for fault tolerance)
- Addressing supported by network
- Multicasting - problem if cycles possible
- broadcasting
- unicasting
- predicate addressing (e.g. look for idle
machines)
- Closed vs Open groups
- Peer vs Hierarchical
- Membership services
- lost members
- late join
- reforming groups
- atomicity
- message ordering (need global time ordering)
RPC not suitable abstraction
ISIS is a synchronous system (events happen sequentially in
same order on all machines).
Since events are not instantaneous, interweaving is possible
Two events can be causally related, otherwise concurrent
Virtual synchrony means if two
messages are causally related, all processes must receive them in
the same (correct) order.
- ABCAST: loose synchrony
- Sender A assigns a timestamp (monotonically
increasing number)
- each receiver picks its own timestamp, greater than
any previous one, and sends it to A
- A selects the max of those received and sends it in a
Commit message
- CBCAST: virtual synchrony
- each process keeps the last message number
received from all other processes.
- The sender increments its own location in this vector
and sends the vector with the message
- a process can compare its own vector against the sent
vector to determine if any messages received by
other processes are pending (figure 2-38).
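The vector comparison used by CBCAST can be sketched as a delivery test. This follows the usual statement of the rule (the message must be the next one from its sender, and the sender must not have seen anything the receiver has not); the function names are made up and this is not the ISIS code.

```python
def can_deliver(local, sent, sender):
    """True if a message stamped `sent` from `sender` may be delivered now."""
    for k, (have, stamp) in enumerate(zip(local, sent)):
        if k == sender:
            if stamp != have + 1:  # must be the next message from sender
                return False
        elif stamp > have:         # sender saw a message we have not seen
            return False
    return True

def deliver(local, sent, sender):
    """Record delivery by advancing the sender's slot in the local vector."""
    local[sender] = sent[sender]

# Three processes. Process 0 sends m1; process 1 receives m1, then sends m2.
m1 = ([1, 0, 0], 0)
m2 = ([1, 1, 0], 1)

local = [0, 0, 0]                    # process 2, which gets m2 before m1
m2_early = can_deliver(local, *m2)   # m2 must wait: m1 is still pending
deliver(local, *m1)
m2_after = can_deliver(local, *m2)   # now m2 can be delivered
```

A message that fails the test is held back and retried after each later delivery, which gives causal order without any global clock.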
Copyright chris wild 1996.
For problems or questions regarding this web contact [Dr. Wild].
Last updated: September 04, 1996.