U. Glaeser


Message-passing IPC shares many characteristics with the single-machine case. Bytes are gathered into an explicit message. Instead of merely being transferred from one buffer to another on the same machine, the message goes over the network. Failure of the network or the receiver introduces new problems. The issue of how long to block the sender is also different, since the time required to confirm delivery of the message in the remote case can be much longer than in the local case. Issues of message addressing must also be considered. In a single-machine system, all addressable processes are locally known. In a distributed system, some facility must be provided to allow the local process to discover the addressable names of remote processes and to send messages to those names.
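
The points above (explicit messages, a bounded wait for delivery confirmation, and name lookup) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular distributed operating system does it: the `REGISTRY` dictionary stands in for a real naming facility, and the 4-byte length prefix and the `ACK` reply are invented framing conventions.

```python
import socket
import struct

# Hypothetical name service: maps a process name to a (host, port) address.
REGISTRY = {}

def recv_exact(conn, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def serve_once(listener, inbox):
    """Receiver: accept one message, store it, and acknowledge it."""
    conn, _ = listener.accept()
    with conn:
        (length,) = struct.unpack("!I", recv_exact(conn, 4))
        inbox.append(recv_exact(conn, length))
        conn.sendall(b"ACK")

def send_message(name, payload, timeout=2.0):
    """Sender: resolve the name, send, and wait for the acknowledgment.

    Each socket operation blocks for at most `timeout` seconds, so a dead
    network or receiver yields False instead of blocking the sender forever.
    """
    address = REGISTRY[name]  # KeyError if the name is unknown
    try:
        with socket.create_connection(address, timeout=timeout) as conn:
            conn.sendall(struct.pack("!I", len(payload)) + payload)
            return recv_exact(conn, 3) == b"ACK"
    except OSError:  # covers timeouts, refused and dropped connections
        return False
```

A receiver would register its listening address under a name in `REGISTRY` and run `serve_once` in its own thread or process; the sender then needs only the name. The timeout value is a design choice: too short and slow but healthy receivers look like failures, too long and the sender stalls.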

Remote procedure call (RPC) faces similar challenges. In a single machine, remote procedures aren’t that remote. Although they may have a different address space, the local operating system has access to all necessary facilities of both the calling and called processes. In distributed systems, the caller is on one machine and the called process on another. Further, the actual transfer of data must take place via messages crossing the network. One implication is that call-by-reference must be translated to call-by-value-result: the argument is copied to the callee, and its final value is copied back to the caller when the call returns. Other complexities exist, including some similar to those for message passing. Another issue is handling partial failures. Either the caller or the called process can fail independently of the other, requiring the operating system on the surviving machine to recover.
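
As a rough illustration of why a reference cannot simply be sent across the network, the sketch below emulates a by-reference parameter by copying the argument into a request message and copying the modified value back out of the reply (a scheme often called copy-in/copy-out). The function names and the JSON message layout are assumptions made for this example; the "remote" procedure is an ordinary local function standing in for a network transport.

```python
import json

def remote_append_total(request_bytes):
    """Runs on the 'server': it only ever sees its own copy of the argument."""
    values = json.loads(request_bytes)["values"]
    values.append(sum(values))  # mutates the server-side copy only
    return json.dumps({"values": values}).encode()

def call_append_total(values, transport=remote_append_total):
    """Client stub: copy in, call, then copy the result back out."""
    request = json.dumps({"values": values}).encode()  # copy-in
    reply = json.loads(transport(request))
    values[:] = reply["values"]  # copy-out: update the caller's list in place

data = [1, 2, 3]
call_append_total(data)
```

After the call, the caller's `data` list reflects the change made remotely, even though the two sides never shared memory. Note that the emulation is not exact: if the caller passed the same list under two parameter names, the copies would diverge, which true call-by-reference would not allow.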

Shared memory is the hardest common IPC mechanism to provide in distributed operating systems, because it relies most heavily on hardware characteristics not present in the distributed environment. Early distributed operating systems made no attempt to provide shared memory across the network; however, as LANs became more capable, researchers tackled the difficult problems of providing the semantics of shared memory across the network. This problem spawned vast amounts of research, which will not be covered in detail here. A slightly closer look at the concept of distributed shared memory will reveal why this research was necessary.

As before, the distributed system can only communicate via messages. Yet the distributed operating system must provide the illusion that two processes on different machines are sharing a single piece of physical memory. The basic approach is to give each process access to a local copy of the memory, then have the operating systems work behind the scenes to provide a consistent view between the processes. Another approach is to migrate the memory segments between machines, as needed. This approach can run into difficulties if processes frequently access the same memory locations. Also, because the overhead of handling shared memory at the word level is too high, distributed shared memory systems must aggregate words into shared blocks. If the blocks are too large, false sharing occurs, where one process accesses the first part of a block while another process accesses the second part. Because the two parts are aggregated into a single block, the block must migrate back and forth, even though the processes have no memory locations in common.
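
The effect of block size on false sharing can be made concrete with a toy model. The sketch below assumes a simple single-owner migration policy (a block lives on exactly one machine and moves whenever another machine touches it); the function name and the access trace are invented for illustration.

```python
def count_migrations(accesses, block_size):
    """Count block transfers under a single-owner (migration) policy.

    accesses: sequence of (machine, address) pairs in program order.
    The first touch of a block is treated as initial placement, not a migration.
    """
    owner = {}  # block number -> machine currently holding it
    migrations = 0
    for machine, address in accesses:
        block = address // block_size
        if block in owner and owner[block] != machine:
            migrations += 1  # the whole block must cross the network
        owner[block] = machine
    return migrations

# Machine A touches only word 0; machine B touches only word 5.
trace = [("A", 0), ("B", 5)] * 4

large_blocks = count_migrations(trace, block_size=8)  # words 0 and 5 share a block
small_blocks = count_migrations(trace, block_size=4)  # words 0 and 5 are in separate blocks
```

With 8-word blocks, every access after the first forces a migration (7 in total) even though the two machines never touch the same word; with 4-word blocks, none occur. Smaller blocks are not free, however, since they bring back the per-word overhead that aggregation was meant to avoid.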

Alternatively, memory segments can be replicated. Doing so leads to problems when writes occur. Either the other copies of the segment must be updated (before they are accessed again), or they must be invalidated. Either approach requires much bookkeeping and incurs overheads when writes occur. False sharing effects can also play a role here, since writing to the first word of a block tends to invalidate or cause updates to the entire block.
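
A minimal write-invalidate sketch shows the bookkeeping involved. It assumes a hypothetical "home" copy that always holds the latest contents and counts one message per invalidation or re-fetch; real protocols are considerably more involved, and the class and method names here are invented.

```python
class ReplicatedBlock:
    """Toy write-invalidate protocol for one replicated block."""

    def __init__(self, words, machines):
        self.home = list(words)  # assumed authoritative copy
        self.copies = {m: list(words) for m in machines}  # valid replicas
        self.messages = 0  # invalidations plus re-fetches

    def _ensure_copy(self, machine):
        if machine not in self.copies:  # replica was invalidated earlier
            self.copies[machine] = list(self.home)
            self.messages += 1  # re-fetch the whole block

    def read(self, machine, index):
        self._ensure_copy(machine)
        return self.copies[machine][index]

    def write(self, machine, index, value):
        self._ensure_copy(machine)
        self.home[index] = value
        self.copies[machine][index] = value
        for other in list(self.copies):
            if other != machine:
                del self.copies[other]  # invalidate every other replica
                self.messages += 1

block = ReplicatedBlock([0, 0], ["A", "B"])
block.write("A", 0, 7)        # invalidates B's replica of the whole block
value = block.read("B", 1)    # B re-fetches, though A never wrote word 1
```

Here machine B pays for a re-fetch merely to read a word that machine A never modified, which is the false sharing effect described above. A write-update variant would instead push the new value to every replica on each write, trading invalidation misses for more write traffic.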

Much inventive research has been performed on distributed shared memory, using various techniques to overcome its challenges. Although distributed shared memory has been demonstrated to be feasible, its performance, complexity, and limitations have prevented it from becoming popular. Few systems today provide this facility. Research continues on distributed shared memory, but not as widely as in the past.

Naming Services

Names play an important role in operating systems. Many operating systems support several distinct name spaces for different purposes. For example, one name space might describe the file system, while another describes the processes running on a machine, and a third describes the users permitted to work on the machine. One legacy of Unix systems is that the file name space is used aggressively to provide name spaces for things that are not classically files, such as devices and interprocess communication services. (One distributed operating system, Plan 9, relies on this abstraction for all its naming needs [4].)

© 2002 by CRC Press LLC