Fall 2000: CS 771/871 Operating Systems
Lecture 4 - Distributed OS
What are distributed systems?
- Distributed System: A distributed system is a collection of
independent computers that appear to the users of
the system as a single computer (Tanenbaum).
A distributed system should:
- control network resource allocation so resources are used in
the most effective way
- provide a convenient virtual machine
- hide the distribution of resources
- provide protection
- provide secure communication (Goscinski)
This is in contrast to centralized computing (timesharing) or
independent PCs.
Distributed systems do not share memory or a clock.
- Why study them? Because of perceived advantages:
- better price/performance (PCs are cheap, supercomputers are not)
- improved speed (the speed of light is the limit, so do many operations in parallel)
- naturally supports distributed applications (banking)
- increased reliability (failures can be isolated)
- incremental growth (keep old machines, just add more)
- data sharing (computer-supported cooperative work - or games)
- device sharing (printers, scanners, plotters are expensive)
- communications (support groups of people working together)
- flexibility (match load to idle machines more easily)
But distributed systems have their disadvantages, chief among
them the complexity and relative unavailability of software. Most of the
problems of non-distributed systems still exist, but we have added many new
problems to solve in order to achieve some of the perceived advantages;
among them are network congestion, rerouting, and security.
Also, most distributed systems complicate the job of the user by forcing
them to be aware of various aspects of the distributed system (don't you
love URLs?).
Software Concepts
The job of the operating system is to mold recalcitrant
hardware into a beautiful virtual machine.
One distinction is based on the degree of autonomy between
processors (tightly vs loosely coupled).
Combinations of hardware and software
- Network OS: high degree of autonomy,
possible different operating systems, few system wide
resources (printers, network file system). Client Server
protocols. User aware of distribution of resources.
RLOGIN, RSH, SETENV DISPLAY, FTP
are examples of explicit user actions which reflect lack
of transparency of location of resources.
On the other hand, NFS makes the location of your files
transparent within the network complex.
- True Distributed OS: single system image
presented to user: virtual uniprocessor. Note how this
contrasts with traditional timesharing systems which make
a single CPU look like many virtual CPUs. Requires:
- Global interprocess communications
- global protection
- global process management
- transparent distributed file access
- Multiprocessor Timesharing OS: tightly
coupled; common ready queue. With shared memory, the file system
is like the single-CPU version; possible specialization of
processors.
QUESTION: why must scheduler run in a critical section?
QUESTION: What else must be run in a critical section?
See table in figure 1.12
Another way to look at differences is to consider the
traditional hierarchical structure of a centralized OS:
1. File Management
2. I/O Device Management
3. Memory Management
4. Process Management
Now consider a network of resources consisting of
- file servers
- printers
- plotters
- scanners
- name servers
- personal computers or workstations
- processors
Now consider different placements of the InterProcess
Communications (IPC) module within the traditional hierarchy.
If between 1 and 2, then File Service can be provided remotely
and transparently.
If between 2 and 3, then shared remote devices are supported
transparently.
If between 3 and 4, then shared memory is supported transparently.
If integrated into 4, true distributed OS.
NOTE: one can make access to remote resources appear
transparent by adding software above the OS. This is particularly
easy if the OS has a lightweight kernel and exports much of the
management to user-level processes (like file management, I/O
management). Then these modules can utilize the network to access
remote resources.
One of the earliest attempts at a network OS was the National
Software Works, undertaken in the middle 70's.
It consisted of heterogeneous computers connected by the ARPANET.
Implementation was entirely at the application level (reminiscent
of the Internet and web browsers, search engines, etc.). But it developed
an IPC facility which provided common functionality on diverse systems,
and dealt with addressing (naming) problems, which are still a big
issue in distributed systems.
This early attempt had performance problems.
Other early attempts were based on remote procedure calls
(RPC) built on top of a centralized OS with network access
(figure 2-6 Goscinski).
Consisted of the following steps:
- The user process communicates a request, using the provided IPC, to the local
Remote Access System (RAS).
- The local RAS sends the request to an appropriate remote RAS.
- The remote RAS acknowledges to the local RAS.
- The local RAS transmits the acknowledgment to the user process with
information to set up a direct communication path to the remote
RAS.
- The user process sends pertinent data to the remote RAS,
- which accesses the appropriate resource on the remote system,
- sends an acknowledgment to the user process,
- awaits completion of the request,
- and sends a final acknowledgment back to the user process.
Consider some of the design tradeoffs.
QUESTION: what other ways to solve?
QUESTION: what are the design issues?
The Newcastle Connection was an early attempt
at developing a network OS based on the UNIX OS. It used an
extension of the UNIX hierarchical file naming structure to tie
different systems together (loosely coupled).
Draw out the naming structure.
It replaced the library routines between the user and kernel levels; this
intermediate level communicated with its counterparts on other machines using RPC.
Because all processes are subjected to this intermediate processing
for kernel requests, it slows down everybody.
There are no widely used
commercially available distributed OS today, although there are
many networked OS.
So why study? Because the design issues are
important in general and the trend is towards more distributed
systems (LAPLINK, mobile computing). It is the future and parts
of it are already here.
Design Issues:
- Transparency: Hide underlying
distributed implementation. (easier to do at the user
level than the programming level). Different levels of
transparency:
- Location: location of resources unknown
- Migration: resources can move without changing
their names
- Replication: number of copies unknown (cache)
- Concurrency: share resources automatically and
unobtrusively
- Parallelism: programmer unaware of parallel
activity on his behalf
- Flexibility: Unresolved issue:
traditional vs micro
kernel. A microkernel utilizes servers which provide
higher-level OS functions. One can customize a set of OS
functions to an application. For example, different file
systems (DOS, UNIX, Mac) could be provided as services.
- Reliability: If probability of failure
of a single CPU is p, then the probability of n
CPUs failing at the same time is p**n. For
example, if probability of failure is .1 for a single
CPU, then the probability that 3 CPUs fail simultaneously
is .001. Replicated hardware is the key to the
reliability of mission critical systems.
In practice failures may not be independent, and if there are
interdependencies between components, the
reliability may even decrease (in the worst case it is
the probability that at least one component fails).
Consider a pipeline architecture which requires all CPUs
to be working to get anything done. Then the probability
that all CPUs are working is (1-p)**n, which for
our example is .729.
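The arithmetic above can be checked with a few lines of Python (a sketch; p and n are just the example values from the notes, and independence of failures is assumed):

```python
# Per-CPU failure probability and number of CPUs from the example above.
p = 0.1   # probability a single CPU fails
n = 3     # number of CPUs

# Probability that all n CPUs fail simultaneously (independent failures):
p_all_fail = p ** n          # 0.1 ** 3 = 0.001

# Pipeline case: every CPU must be up, so the probability the system works:
p_all_work = (1 - p) ** n    # 0.9 ** 3 = 0.729

print(p_all_fail, p_all_work)
```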
Availability is the fraction of time
the system is usable.
Reliability also includes data integrity and security
concerns (copies of key files increase availability but
may compromise the integrity of the data).
Fault tolerance is the ability to
provide service even in the face of system failures.
- Performance: various metrics, some end
user oriented (response time), others resource oriented
(throughput, utilization). Raw numbers (speed of CPU or
network) are often misleading. Benchmarks are frequently
used to compare systems. But makeup of benchmark is
application specific.
Communications is typically the bottleneck.
For parallelism to work, one must consider the appropriate
grain size.
- Scalability: from LANs to the Internet, from
home PCs to smart telephones. Scalability usually implies
eliminating all centralized resource handling.
Distributed algorithms are distinguished by:
- No machine has complete information of system
state
- machines make decisions based on local
information
- failure of one machine will not cause system
failure
- no global clock assumed
One of the fundamental problems in
distributed OS is lack of global state and up-to-date
information.
Issues in distributed systems
- Global knowledge: at any particular moment, one process's knowledge
of the global state will be out of date,
possibly leading to inconsistency and erroneous actions.
- Must operate in face of dated knowledge
- Must operate without global clocks.
- Arriving at consensus is difficult if not impossible (in general)
- Naming: several problems
- How do you create a universally unique name?
- How do you make that name location transparent?
- How do you find where the resource with that name is currently located?
- What is the meaning of a copy?
- Compatibility:
- binary level (same instruction sets)
- execution level (object code same)
- protocol level (ftp, rpc, http, rmi, soap)
- Process Synchronization:
- Resource Management: location transparency
- data migration
- computational migration
- distributed scheduling
- Security:
- Authentication
- Authorization
- Architecture:
- monolithic kernel
- collective kernel
- object oriented OS
- client server
Communications Primitives
Communications in Distributed Systems
- Distributed ==> separate computers communicating
- System Architecture: Layered Abstract Machines
Principle: separation of concerns
- Some Issues:
- Virtual circuit or connectionless or both
- Presentation vs Content
- Open Systems or Architecture Specific
- Multimedia?
Seven Layer Peer-to-Peer protocol suite:
- Physical
- Data Link
- Network
- Transport
- Session
- Presentation
- Application
Physical Layer:
- Allows different physical networks: twisted pair, coax, fiber optic
- Defines physical properties of bits (voltage levels, phase shifts, etc.)
- Speed of communication and clocking
- Physical configuration of connectors
Data Link Layer:
- Assembles bits into groups (frames, packets)
- Defines start/end of group
- Protects against errors (e.g. checksum)
- Assigns sequence numbers (detects lost frames)
- Defines control messages for error correction
Probably adds a trailer to the message for the checksum.
Network Layer:
Primarily routing in a wide area network.
Some systems are set manually;
others use adaptive algorithms to reduce congestion.
QUESTION: What are some of the issues
in adaptive routing?
X.25: telephony, connection oriented protocol
IP (Internet Protocol): connectionless
Transport Layer:
Provides a reliable point-to-point connection (therefore
connection oriented).
ISO provides 5 variants depending on the nature of the underlying
network and the degree of multiplexing.
(DoD has one called TCP (Transmission Control Protocol)
plus a connectionless one called UDP (User Datagram Protocol).)
Session Layer:
Provides synchronization and checkpoints with recovery.
Rarely used.
Presentation Layer:
Defines the format of information:
- character set
- number representation
- message formats
Application Layer: what's left.
ATM (Asynchronous Transfer Mode)
OSI was developed in the 1970s and reflects older technology.
ATM takes advantage of fast switches and networks.
- Telephones: 4 KHz analog channels
- ARPANET built on 56 Kbps lines
- T1: 1.5 Mbps
- T3: 45 Mbps
- Evolving: 155 Mbps to 1 Gbps (faster than internal disk
drives)
The latter speeds imply high-speed multiplexing.
Telephone companies: integrate voice and data
Dilemma: voice is continuous, low bandwidth (circuit switching);
data is bursty, high bandwidth (packet switching).
ATM: international standard
Virtual Circuit: Route saved in switches:
QUESTION: Why keep route in switches?
Small fixed size blocks called CELLS.
QUESTION: Why small? Why fixed size?
Cell Switching: multicasting, multiplexing.
Synchronous Continuous Stream: empty cells fill void
Can use SONET (Synchronous Optical NETwork)
or SDH (Synchronous Digital Hierarchy) used by telephone
companies.
SONET: a 9 x 90 byte frame of 810 bytes (36 bytes overhead),
transmitted every 125 microseconds = 51.84 Mbps (OC-1).
OC-n and OC-nc are used for more bandwidth.
ATM uses OC3c (155.520 Mbps) and OC12c (622.080Mbps).
2.5 Gbps coming.
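The SONET rates above follow directly from the frame size and frame time; a quick check (a sketch using only the numbers in the notes):

```python
# SONET framing arithmetic: an OC-1 frame is 9 rows x 90 bytes,
# transmitted every 125 microseconds (i.e. 8000 frames per second).
frame_bytes = 9 * 90                          # 810 bytes per frame
overhead_bytes = 36                           # frame overhead
payload_bytes = frame_bytes - overhead_bytes  # 774 bytes of payload

frame_time = 125e-6                           # seconds per frame
oc1_bps = frame_bytes * 8 / frame_time        # 51.84 Mbps gross rate (OC-1)

oc3c = 3 * oc1_bps                            # 155.52 Mbps (OC-3c)
oc12c = 12 * oc1_bps                          # 622.08 Mbps (OC-12c)
print(oc1_bps / 1e6, oc3c / 1e6, oc12c / 1e6)
```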
Telephones might use ISDN (64 Kbps).
Again a compromise:
Europe - small cells to avoid echo suppressors
Americans - big cells for efficiency
Result: 48 byte payload + 5 byte header.
Does not fit nicely into the 774 byte SONET data payload.
Header contains (figure 2-5):
- GFC (4 bits): Generic Flow Control (unused)
- VPI (8): Virtual Path Id (used for grouping end-to-end)
- VCI (16): Virtual Channel Id
- Payload Type (3): data, control
- CLP (1): priority
- CRC (8): checksum on header only
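As a sketch of how these fields pack into the 5-byte (40-bit) header (the function name and the example field values are my own, not from the notes):

```python
def pack_atm_header(gfc, vpi, vci, pt, clp, hec):
    """Pack the ATM cell header fields into 5 bytes (40 bits total):
    GFC(4) | VPI(8) | VCI(16) | Payload Type(3) | CLP(1) | header checksum(8)."""
    assert gfc < 16 and vpi < 256 and vci < 65536 and pt < 8 and clp < 2 and hec < 256
    word = (gfc << 36) | (vpi << 28) | (vci << 12) | (pt << 9) | (clp << 8) | hec
    return word.to_bytes(5, "big")

# Example: path 1, channel 42, all other fields zero.
hdr = pack_atm_header(gfc=0, vpi=1, vci=42, pt=0, clp=0, hec=0)
print(hdr.hex())
```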
VPI/VCI reflect assigned route on call setup and change at switch to reflect
next hop, VPI allows a group of connections destined for the same place to be
rerouted together.
Networks are becoming too fast for typical computer-OS interaction at the
cell level.
Adaptation maps packets into cells.
Four classes of traffic:
- constant bit rate (audio/video)
- variable but bounded delay
- connection oriented
- connectionless data
The computer industry didn't like these and drafted AAL5 (SEAL - Simple and Efficient
Adaptation Layer), which distinguishes the last cell of a packet, containing the
packet length and packet checksum.
Computers connect to switches, which can connect to other switches. The virtual
circuit sets the route in each switch during setup.
Requires fast switching speeds (about 3 microseconds per cell for OC3), with parallel
input and output ports. Problem if two inputs need the same output.
May drop cells but must deliver the others in the order received.
Can queue, but only temporary congestion relief is possible.
Different solutions depending on the nature of the traffic streams; may use statistical
analysis.
QUESTION: How fast does an OC12c switch need to be?
2.5Gbps?
High bandwidth but physical delays require rethinking of protocols for flow
control, error handling, and bandwidth utilization.
Asymptotically, utilized bandwidth approaches 0 while waiting for speed-of-light
transmission.
Question: How is this akin to the length limit on
Ethernet?
Flow control may become rate control (a priori
agreements).
Sliding window protocols lead to low utilization (see
calculation).
Maybe applications should be centralized, with a keystroke
per cell from user to application. This has architectural implications.
Conclusion: Potential increase in network
bandwidth may not easily be realized: Active area of research and development.
Some Calculations of effects of high speed
networks
Consider a 1Gbps network connecting Norfolk and San
Francisco.
ATM cells (53 bytes = 424 bits) arrive every 424 nanoseconds.
Transmission latency is approx. 2/3 speed of light = 15 milliseconds one way.
Implies there are 15 megabits in the pipeline before the first bit is received.
What if the receiver cannot buffer this much and rejects?
What if I require an ACK message after every 1,000 bits?
Effective transfer rate: 1 microsecond to stuff the bits into the pipe, 15
milliseconds transfer latency, less than a microsecond to stuff the ACK back into
the pipe (assuming no processing delays), another 15 milliseconds transfer latency =
1,000 bits every 30.002 milliseconds, or about 33 Kbps!
If we increase the packet to 1,000,000 bits, the transfer rate is
about 32 Mbps (still a long way from 1 Gbps).
Clearly requiring an ACK frequently greatly reduces
usable bandwidth.
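The effect of per-packet ACKs can be captured in a small throughput model (a sketch; the function name and defaults are my own, chosen to match the 1 Gbps link and 15 ms one-way latency of the example above, with ACK transmission time taken as negligible):

```python
def stop_and_wait_bps(packet_bits, link_bps=1e9, one_way_s=15e-3):
    """Effective throughput when each packet must be ACKed before the
    next is sent: one cycle = time to stuff the packet into the pipe
    plus a full round trip."""
    cycle = packet_bits / link_bps + 2 * one_way_s
    return packet_bits / cycle

print(stop_and_wait_bps(1_000))       # roughly 33 Kbps
print(stop_and_wait_bps(1_000_000))   # roughly 32 Mbps
```

Even megabit packets leave the 1 Gbps pipe almost entirely idle, which is the point of the calculation.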
Let's start with an appealingly simple model.
Server processes which provide service to client processes.
Communication is request/reply:
connectionless and asynchronous.
Requires only three layers: physical, datalink, request/reply.
Can be implemented with two procedures (send and receive).
Procedure calls hide the distributed nature of the service (except perhaps in
addressing); it looks like a local procedure call (see example, figure 2-9).
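A minimal sketch of the two-procedure (send and receive) request/reply model, here over UDP loopback using Python sockets (all names and the message format are illustrative, not from the notes):

```python
import socket
import threading

def server(sock):
    # receive(request), then send(reply) back to whoever asked.
    data, addr = sock.recvfrom(1024)
    sock.sendto(b"reply:" + data, addr)

# Connectionless "server" bound to an ephemeral loopback port.
srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv.bind(("127.0.0.1", 0))
port = srv.getsockname()[1]
threading.Thread(target=server, args=(srv,), daemon=True).start()

# Client: send(request), then block in receive(reply).
cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
cli.sendto(b"request", ("127.0.0.1", port))
reply, _ = cli.recvfrom(1024)
print(reply)
```

Note that only the physical, data link, and request/reply layers are involved; there is no connection setup at all.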
There can be different kinds of services provided (another set of design
issues - but outside realm of OS).
Not just client server issues
What is the unit of addressing? machine, process, port, service?
Question: What is static? what is Dynamic?
Are processes given fixed names (numbers)?
Question: Can I run multiple server processes? Why would I
want to?
Are processes given global names or are they machine specific?
Question: How to coordinate global names?
What is wrong with machine specific?
How does the Internet handle addressing for the WWW?
What is the permanence of addresses?
If global, how to route to proper machine?
Could use name server?
Question: what if name server needs to move?
Assigning random addresses?
Question: who assigns? centralized/distributed
How to route?
How does the client know the address?
Distinguish: Name of service (dry cleaners), location
(address) of service, and instantiation of server (clerk behind counter, process
running on machine).
Question: what about competing servers?
Blocking vs Non-blocking transmission
Also called synchronous/asynchronous.
For both send and receive
Synchronous is easiest to program but async allows
process to do other things while waiting.
Async requires polling or interrupts (call-backs)
programmed into system.
Another complication: timeouts to handle transmission
failures of certain types.
Buffered vs Non-buffered transmission
Who supplies message buffer and when?
How big is it?
What if a message is sent before the server issues a "receive" call?
What if the server handles many clients? How to receive all potential messages?
The kernel could buffer in anticipation of a call to "receive".
This could be the process's mailbox.
Could block the sender if no buffer is available.
Who guarantees delivery: application or system (OS or network)?
Question: How does OS know which messages are requests and
which are replies?
Should the reply be acknowledged?
Type            | From   | To     | Description
Request         | Client | Server | Service request
Reply           | Server | Client | Reply to request
ACK             | Either | Other  | ACK previous packet
Are You Alive?  | Client | Server | See if server crashed
I Am Alive      | Server | Client | Server has not crashed
Try Again       | Server | Client | Server has no capacity
Address Unknown | Server | Client | No such process
Last two are needed to distinguish between hard and soft failures
Homework:
compare this to the WWW client/server protocol. Due in one week.
Client/server has a strong message-passing flavor,
like doing I/O (read and write information from the network).
Question: why do we need the concept of disk storage at all? Why I/O?
Remote Procedure Calls (RPC)
procedure call which transparently executes on remote
machine
- How is normal procedure call implemented? (Figure
2-17)
- call by value
- by reference
- by copy/restore (how does it differ from call by
reference?)
- New issues
- different address spaces (scoping)
- possibly different architectures
- crashes
RPC: analogy with system calls, which masquerade as procedure calls.
- Client stub is called as normal procedure
- Assembles parameters into message to remote server
- Traps to kernel for message passing
- Server stub receives message
- unpacks parameters from message
- makes normal procedure call on behalf of client
process
- After call, server stub packs result in message
- Traps to kernel to send reply back to client
- Client stub receives message
- Unpacks results into output parameters
- returns as normal procedure call
client/server request/reply hidden in library stubs
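The stub steps above can be sketched in toy form, where the "network" is just a function call and pickle stands in for marshalling (all names here are invented for illustration; a real system would trap to the kernel and send a message):

```python
import pickle

def add(a, b):                  # the "remote" procedure on the server
    return a + b

PROCEDURES = {"add": add}       # server's dispatch table

def client_stub(name, *args):
    message = pickle.dumps((name, args))   # assemble parameters into a message
    reply = server_stub(message)           # stands in for kernel send/receive
    return pickle.loads(reply)             # unpack the result

def server_stub(message):
    name, args = pickle.loads(message)     # unpack parameters from the message
    result = PROCEDURES[name](*args)       # normal call on behalf of the client
    return pickle.dumps(result)            # pack the result into the reply

print(client_stub("add", 2, 3))  # looks exactly like a local call
```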
- Different formats
- Different byte orders
- Different data types
Could use a canonical form (network standard).
Problem: possibly inefficient between like machines
Could indicate which format used and let server translate if necessary.
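A sketch of the canonical-form idea: marshal values in network (big-endian) byte order so unlike machines agree on the wire format (Python's struct module is used here purely for illustration):

```python
import struct

# Marshal a 32-bit integer in canonical "network" byte order ("!"),
# independent of the sender's native endianness.
value = 0x01020304
wire = struct.pack("!I", value)        # bytes as they travel on the wire
print(wire.hex())                      # same on every host

back = struct.unpack("!I", wire)[0]    # receiver unmarshals the same way
assert back == value
```

Between two like little-endian machines this forces two unnecessary byte swaps, which is exactly the inefficiency the notes mention.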
How to handle pointer (reference) parameters?
- forbid them (not transparent)
- copy the object referenced
- transfer values as needed
Question: what about user-defined data structures?
What if the size of the structure is unknown?
What about global variables?
Question: Why is figure 2-22 stateless? What does that mean? How to make it
more like UNIX file services?
When server starts up, it exports its interface to a binder which registers
the services provided.
- name
- version number (why?)
- unique ID (allows several servers)
- handle (ethernet or IP address etc)
- authentication
Client stub needs to import first time called to get handle to send message.
Overhead may be a problem
- Cannot locate server: return error
- Request message lost : set timer and resend (watch duplicates)
- Reply lost : who notices, is operation idempotent?
- server crashes after request : difficult to determine if request was acted
upon
- At least once semantics: try and try again
- at most once: don't try again
- exactly once: not possible in general (printing, bank transfers)
- client crashes after request
- Extermination: client logs requests on stable
storage, orphan killed
What about nested RPCs? A network partition prevents killing.
- Reincarnation: each time the client boots it sets an epoch number
- Gentle reincarnation: try to locate the client first
- Expiration: requests expire; the restart delay must be longer than the expiration
period
What about orphans which have initiated other tasks, perhaps at a later
time?
What if an orphan holds locks on resources?
Also, how do we report failures to the client (return codes, exceptions) which may
not occur on a single-CPU system (and hence are not allowed for in the procedure
spec)?
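A sketch of at-least-once semantics with an idempotent operation (the names and the simulated lossy channel are my own; the random seed just makes the run deterministic). The key point: the server may execute the operation even when the reply is lost, so re-sending is safe only if re-execution is harmless:

```python
import random

random.seed(0)  # deterministic run for the example

def flaky_request(op, state):
    """Executes op on the 'server', but 'loses' the reply half the time."""
    op(state)                      # server acts even when the reply is lost
    return None if random.random() < 0.5 else "ok"

def at_least_once(op, state, max_tries=50):
    """Try and try again until a reply arrives."""
    for _ in range(max_tries):
        if flaky_request(op, state) == "ok":
            return state
    raise TimeoutError("server unreachable")

# Idempotent: writing the same block twice leaves the same state,
# so duplicate executions caused by retries do no harm.
state = {}
at_least_once(lambda s: s.__setitem__("block7", "data"), state)
print(state)
```

A non-idempotent operation (e.g. "append" or "transfer $100") would be corrupted by exactly this retry loop, which is why exactly-once is the semantics one actually wants and cannot in general get.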
- connection or connectionless
- general purpose protocol or not
- stop-n-wait (acks) or blast or selective repeat
- flow control (overrun errors)
- What about ACKing replies?
- copying between user and kernel spaces can be a big factor
remember that each layer adds its own headers
might be able to use virtual memory hardware to avoid the copy
See critical path analysis fig 2-27.
Problem areas for transparency:
- Global variables
- weakly typed languages (C allows unbounded parameters to be passed).
- Complex data structures with pointers
- Printf in C
- Given file servers for read and write, what are pipes?
could have read (only) servers or write only but ends of pipe are problems
- terminal servers sometimes want to interrupt client
- What about group servers? (for fault tolerance)
RPC semantics
- At least once: if it succeeds, at least one machine executed the call
- Exactly once: if it succeeds, exactly one machine executed it
- At most once: no side effects allowed on abnormal termination
Panzieri and Shrivastava correctness condition:
Let Ci be an RPC call and Wi its execution on some
machine.
Since the Wi can share data, the correctness condition is
C1 -> C2 implies W1 -> W2
where -> means "happens before".
Why not use a message instead of RPC?
What about passing procedures as arguments?
Group communication:
- Addressing supported by the network
- Multicasting - problem if cycles possible
- broadcasting
- unicasting
- predicate addressing (e.g. look for idle machines)
- Closed vs Open groups
- Peer vs Hierarchical
- Membership services
- lost members
- late join
- reforming groups
- atomicity
- message ordering (need global time ordering)
RPC is not a suitable abstraction for group communication.
ISIS models a synchronous system (events happen sequentially, in the same order on all
machines).
Since events are not instantaneous, interleaving is possible.
Two events can be causally related; otherwise they are concurrent.
Virtual synchrony means that if two messages are
causally related, all processes must receive them in the same (correct) order.
- ABCAST: loose synchrony
- sender A assigns a timestamp (a monotonically increasing number)
- each receiver picks its own timestamp, greater than any previous one, and sends it
to A
- A selects the max of those received and sends it in a Commit message
- CBCAST: virtual synchrony
- each process keeps the number of the last message received from every other
process
- this vector is incremented in the process's own slot and sent with the
message
- a process compares its own vector against the sent vector to determine
if any messages received by other processes are pending (figure 2-38)
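The CBCAST delivery test can be sketched as a vector comparison (a sketch under the usual vector-timestamp rules; the function and variable names are mine): a message from sender s carrying vector V may be delivered at a process with local vector L only if V[s] == L[s] + 1 (it is the next message expected from s) and V[k] <= L[k] for every other k (the receiver has already seen everything the sender had seen).

```python
def can_deliver(sender, msg_vector, local_vector):
    """True if the message is deliverable now; False if it must wait
    for causally earlier messages that are still pending."""
    if msg_vector[sender] != local_vector[sender] + 1:
        return False          # not the next message from this sender
    return all(msg_vector[k] <= local_vector[k]
               for k in range(len(msg_vector)) if k != sender)

# A process that has seen 2 messages from process 0 and 1 from process 1:
local = [2, 1, 0]
print(can_deliver(0, [3, 1, 0], local))   # next from 0, nothing missing
print(can_deliver(1, [3, 2, 0], local))   # depends on an unseen message from 0
```

The second message is held back until the missing message from process 0 arrives, which is exactly how causal order is preserved.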
Copyright chris wild 1996.
For problems or questions regarding this web contact [Dr.
Wild].
Last updated: September 04, 1996.