Topics related to distributed systems
Hardware Concepts
Software Concepts
Design Issues
Kernels
memory management
Distributed shared memory
Transaction management
Communication
IPC
Real-time distributed systems
Processes
process management
Naming
Synchronization
distributed synchronization
Concurrent Computing
Concurrent Processes: Basic Issues
Threads
Consistency
Replication
Fault tolerance
Fault-tolerant distributed systems
ATM networks
File systems
Distributed File Systems
Distributed file systems (with NFS and X.500)
Distributed Object-Based Systems
Object-based operating systems
Distributed Document-Based Systems
Distributed Coordination-Based Systems
Security
distributed security
Middleware models.
How distributed systems are designed and implemented in real systems.
Case studies of Distributed Operating Systems
Amoeba
Amoeba distributed operating system
2K: A Component-Based Network-Centric Operating System for the Next Millennium
E1 Distributed Operating System
Clouds
Mach
Chorus
JavaOS™
OSF/DCE.
DISTRIBUTED AND NETWORK OPERATING SYSTEMS
This section lists distributed and network operating systems, those designed
to provide common control for a set of computers communicating through a
network. Network operating systems are considered here to be those which
provide support for networking and remote resource access, often by a separate
layer of software on top of a conventional OS. Distributed operating systems
strive for a high degree of transparency and often support data and process
migration.
Distributed systems intended primarily for real-time applications are
listed in the real-time section. Distributed systems for shared-memory
multiprocessors are listed in the multiprocessor section.
Network Operating Systems
- ACCENT
- Network OS kernel developed at Carnegie-Mellon U. for the PERQ workstation.
Early 1980s [Rashid & Robertson 1981].
- BOS/NET
- Multitasking, multiprocessing version of BOS/5.
- COCANET UNIX
- A local network operating system based on UNIX, developed for the COCANET
local area network at U.C. Berkeley. Early 1980s [Rowe & Birman 1982].
- CP/NET
- Networking version of CP/M. Digital Research, Early 1980s [Kildall
1981, Rolander 1981].
- CP/NOS
- A memory-resident, diskless version of CP/NET. Digital Research, Early
1980s [Kildall 1981].
- HetNOS
- LahNOS
- MP/NET
- Version of MP/M with networking facilities. Digital Research, Early
1980s [Kildall 1981].
- MP/NOS
- Memory-resident, diskless version of MP/NET. Digital Research, Early
1980s [Kildall 1981].
- NetWare
- Network OS for Local Area Network and server control by Novell.
- Newcastle Connection
- A network OS layer for UNIX systems providing transparent distributed
access. Early 1980s [Brownbridge et al 1982].
- NSW
- National Software Works. Late 1970s [Millstein 1977].
- PC/NOS
- Network OS for MS-DOS or CP/M. Applied Intelligence [Row & Daugherty
1984].
- RIO/CP
- Network operating system for the ZNET. Late 1970s [Zarella 1981].
- RSEXEC
- Network OS for the ARPANET, based principally on TENEX. Early 1970s
[Thomas 1973].
- TRIX
- A network oriented OS. Late 1970s [Ward 1980].
- uNETix
- Network OS for the 8086, 68000, & 16032 families. Multitasking
with transparent remote file access, load balancing, and multiple windows.
UNIX and PC-DOS compatible. Lantech Systems, Mid-1980s [Foster 1984].
- Open Distributed Systems.
Distributed Operating Systems
Distributed operating systems differ from network operating systems in
supporting a transparent view of the entire network, in which users normally
do not distinguish local resources from remote resources.
- AEGIS
- OS for the Apollo DOMAIN Distributed system. Early 1980s.
- AMOEBA
- A distributed OS based partly on UNIX. Based on passive data objects
protected by encrypted capabilities. 1980s [Tanenbaum & Mullender 1981,
Mullender & Tanenbaum 1986].
- Arachne
- A distributed operating system developed at the U. of Wisconsin. Late
1970s [Finkel 1980].
- Charlotte
- Distributed OS for the Crystal Multicomputer project at the U. of Wisconsin.
Explores coarse-grained parallelism without shared memory for computationally
intensive tasks. 1980s [Finkel et al 1989].
- CHOICES
- Distributed, object-oriented OS featuring a high degree of customization.
U. of Illinois, 1990s [Campbell et al 1993].
- Clouds
- A distributed object-based operating system developed at Georgia Institute
of Technology. Early 1990s. [DasGupta 1991]
- CMDS
- The Cambridge Model Distributed System. U. of Cambridge (England).
Late 1970s [Wilkes & Needham 1980].
- CONDOR
- A distributed OS described as a "hunter of idle workstations,"
which distributes large computationally intensive jobs among available
processors in a workstation pool. U. Wisconsin at Madison, 1980s [Litzkow
1988].
- Cronus
- Object-oriented distributed computing system for heterogeneous environments.
BBN Systems, 1980s [Schantz et al 1986].
- DEMOS/MP
- A distributed version of the DEMOS operating system. Message-based,
featuring process migration. U.C. Berkeley, early 1980s [Miller et al
1984].
- DISTOS
- A Distributed OS for a network of 68000s.
- DISTRIX
- Message-based distributed version of Unix. Early 1980s.
- DUNIX
- A distributed version of UNIX developed at Bell Labs. Late 1980s [xxx
1988].
- Eden
- A distributed object-oriented OS at the U. of Washington, based on
an integrated distributed network of bit-mapped workstations. Capability-based.
Early 1980s [Almes et al 1985].
- Galaxy
- A distributed UNIX-compatible system featuring multi-level IPC and
variable-weight processes. Univ. of Tokyo, late 1980s [Sinha et al 1991].
- LOCUS
- Distributed OS based on UNIX. Mid 1980s. [Popek & Walker, 1985].
- MDX
- MICROS
- Distributed OS for MICRONET, a reconfigurable network computer. Late
1970s [Wittie & van Tilborg 1980].
- MOS
- An early version of MOSIX. Controls four linked PDP-11s. Mid 1980s
[Barak & Litman 1985].
- MOSIX
- A distributed version of UNIX supporting full transparency and dynamic
process migration for load balancing. Developed at the Hebrew U. of Jerusalem.
Mid 1980s to 1990s [Barak et al 1993].
- Newark
- Early version of Eden developed for the VAX environment. The name
was chosen because it was "far from Eden."
- NSMOS
- A version of MOSIX for National Semiconductor VR32 systems. Late 1980s
[Barel 1987].
- Plan9
- Distributed UNIX-like system developed at Bell Labs by the originators
of UNIX. Features per-process name-spaces, allowing each process a customized
view of the resources in the system. 1990s [Pike et al 1995].
- REPOS
- Operating System for small PDP-11's attached to a host computer. Late
1970s [Maegaard & Andreasan 1979].
- RIG
- Rochester Intelligent Gateway. Network OS developed at the University
of Rochester. Influenced Accent and Mach. Early 1970s [Ball et al 1976].
- Roscoe
- Distributed OS for multiple identical processors (LSI-11s). University
of Wisconsin, Late 1970s [Solomon & Finkel 1979].
- Saguaro
- Distributed OS at the U. of Arizona, supporting varying degrees of
transparency. Mid 1980s [Andrews et al 1987].
- SODA
- A Simplified OS for Distributed Applications. Mid 1980s [Kepecs &
Solomon 1985].
- SODS/OS
- OS for a Distributed System developed on the IBM Series/1 at the U.
of Delaware. Late 1970s [Sincoskie & Farber 1980].
- Spring
- Distributed multiplatform OS developed by Sun. Not related to the Spring
Kernel, a real-time system. 1990s [Mitchell et al 1994].
- Uniflex
- Multitasking, multiprocessing OS for the 68000 family. Technical Systems
Consultants. Early 1980s [Mini-Micro 1986].
- V
- Experimental Distributed OS linking powerful bit-mapped workstations
at Stanford U. Early 1980s [Cheriton 1984, Berglund 1986].
Distributed Programming Systems
Distributed programming systems combine a distributed OS with language
support for a particular programming model. Very often these systems are
object-oriented.
- Argus
- A distributed programming system featuring resilient objects. Developed
at M.I.T., mid 1980s [Liskov 1984].
- Arjuna
- A C++ based distributed programming system based on objects and atomic
actions, developed at the University of Newcastle upon Tyne [Shrivastava
et al 1991].
- Emerald
- A distributed object-based operating system based on fine-grained (object
level) mobility. U. of Washington, 1980s [Jul et al 1988].
CORBA™
OMG's Common Object Request Broker Architecture (CORBA) standard
MICOSec, an open source implementation of the CORBA specification
DCOM™
Microsoft's DCOM
NFS
NFS v4
LDAP
X.500
Kerberos
RSA
DES
SSH
NTP
Real-world examples and case studies
CORBA
DCOM
Jini
World Wide Web.
Implementation
Sockets
RPC
Threads
Implementation of distributed algorithms using these tools.
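As a rough illustration of how sockets and threads combine into a request/reply exchange (the pattern that RPC builds on), here is a minimal Python sketch. The names serve_once, call, and demo are hypothetical, and socket.socketpair() stands in for a real network connection so the example runs in a single process. It is a sketch under those assumptions, not a full RPC implementation.

```python
import socket
import threading


def serve_once(conn: socket.socket) -> None:
    """Hypothetical server: read one request line, reply with it uppercased."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        request = reader.readline().rstrip("\n")
        conn.sendall((request.upper() + "\n").encode("utf-8"))


def call(conn: socket.socket, message: str) -> str:
    """Hypothetical client stub: send one request line and wait for the reply."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        conn.sendall((message + "\n").encode("utf-8"))
        return reader.readline().rstrip("\n")


def demo(message: str = "ping") -> str:
    # socketpair() gives two connected endpoints; a real system would use
    # connect()/accept() across machines instead (assumption for the sketch).
    client_end, server_end = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server_end,))
    worker.start()
    reply = call(client_end, message)
    worker.join()
    return reply


if __name__ == "__main__":
    print(demo("ping"))  # prints "PING"
```

The server runs in its own thread so that the client can block on the reply, which is the same structure a threaded RPC server uses with one worker per connection.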
Fundamental Concepts (Transparency, Service, and Coordination)
- Chapter 1 introduces a classification of centralized operating system,
network operating system, distributed operating system, and cooperative
autonomous systems, using the key characteristics of virtuality,
interoperability, transparency and autonomicity, respectively, for each
system. It illustrates the evolution that led to the development of modern
distributed operating systems and explains the emerging need for distributed
software and the importance of distributed coordination algorithms.
- Chapter 2 begins the discussion of distributed operating systems. It
presents the concepts of transparency and services. Distributed systems and
their underlying communication architectures are introduced. The chapter
concludes with a list of major system design issues that establishes an
order for the presentation of the subsequent chapters.
Distributed Processes (Synchronization, Communication, and Scheduling)
- Chapter 3 describes concurrent processes and programming. It defines
processes and threads and shows how their interaction can be modeled by using
some fundamental concepts such as a graph, a logical clock, and the client
and server model. Both shared memory and message passing for synchronization
and communication are addressed. They are presented along with the
development of concurrent language constructs. A taxonomy of these language
mechanisms and their implementation is given. This chapter presents an
integrated view of synchronization and communication.
- Chapter 4 extends the discussion of process interaction from
synchronization to communication and to distributed process coordination
using message passing communication. Three communication models, message
passing (socket), request/reply (RPC), and transaction communication, are
presented. A special emphasis is placed on group communication and
coordination. Two classical distributed coordination problems, mutual
exclusion and leader election using message passing interprocess
communication, are introduced. These problems are further studied in
Chapters 10 and 11 in Part II of the textbook. The chapter also includes a
presentation of name service, an essential facility for communication in
distributed systems.
- Chapter 5 turns to the third process management issue, that of process
scheduling. The effect of communication on both static and dynamic process
scheduling is emphasized. The chapter describes distributed computation
through dynamic redistribution of processes by using remote execution and
process migration techniques. It also addresses several unique issues in
real-time scheduling.
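The logical clock mentioned for Chapter 3 can be sketched in a few lines of Python. This follows Lamport's well-known rules (increment on every local event, take the maximum plus one on receipt); the class and method names are hypothetical and are not taken from the textbook.

```python
class LamportClock:
    """Minimal sketch of a Lamport logical clock for one process."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Record a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value is the message timestamp."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Advance past both the local clock and the message timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time


if __name__ == "__main__":
    p, q = LamportClock(), LamportClock()
    p.tick()             # p's clock: 1
    stamp = p.send()     # p's clock: 2, message carries 2
    q.tick()             # q's clock: 1
    print(q.receive(stamp))  # prints 3
```

The receive rule guarantees that a message's receipt is always timestamped later than its send, which is the property that message-passing synchronization relies on.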
Distributed Resources (Files and Memory)
- Chapter 6 discusses the distributed implementation of file systems, the
first of the two important distributed resources: files and memory. It
demonstrates the use of the concept of transparency and service in the
design of distributed file systems. Two major implementation issues, data
caching and file replication, are discussed in this chapter. The chapter
also covers distributed transactions as part of the file service. Since
management of replicated data touches upon both data and communication, two
central issues in distributed systems, it is further detailed in Chapter 12.
- Chapter 7 covers distributed shared memory systems that simulate a logical
shared memory on a physically distributed memory system. The issues studied
are coherence and consistency of data due to memory sharing. The chapter
describes implementation strategies for different memory consistency
requirements. It also shows the significance of the object-based data
sharing models.
- Chapter 8 addresses unique security issues in network and distributed
environments. These issues are divided into two areas: authorization and
authentication. Authorization includes the study of distributed access and
flow control models. Authentication covers cryptography and its applications
for mutual authentication and key distribution protocols. Implementations of
some security features in modern systems are illustrated.
Part II of the textbook discusses distributed algorithms. The discussion is
pragmatic and is intended to give the reader a solid understanding of common
problems and solution techniques. The topics are organized in five chapters.
Distributed Algorithms
- Chapter 9 introduces the concepts of time and global states in a
distributed system. The fundamental problem of distributed algorithms is the
lack of a global clock and a global state. Recent research on vector time
and distributed predicates has developed unified models for thinking about
distributed time and the distributed state. This chapter presents the
concepts of causality, vector timestamps, and global states. The algorithms
for implementing these concepts are presented. The connections between the
different models are explored. Finally, a model for proving the correctness
of distributed algorithms is presented.
- Chapter 10 covers distributed synchronization and distributed election.
While the distributed synchronization algorithms are not considered
pragmatic, they illustrate important algorithm design techniques. For
example, voting algorithms for replicated data management are foreshadowed
in Maekawa's algorithm, and the Chang-Singhal-Liu algorithm illustrates the
ideas behind distributed shared memory (and distributed object) algorithms.
The chapter concludes with algorithms for electing a computation leader.
Election is a critical component of many systems. The invitation algorithm
in particular is a prototype for handling failures in an asynchronous system
and foreshadows the group view maintenance algorithms of Chapter 12.
- Chapter 11 discusses the abstract distributed agreement problem. First,
Byzantine agreement is discussed. Next, the Fischer-Lynch-Paterson (FLP)
result that no deterministic algorithm solves distributed agreement in an
asynchronous system in which even one process may fail is covered in detail.
This is the appropriate point to introduce the FLP result, because the next
chapter covers replicated data management and must solve distributed
agreement in asynchronous systems. The FLP result leaves open three ways to
achieve distributed agreement in an asynchronous system: hope that it
happens, use relative agreement, or use a randomized algorithm. The chapter
discusses these implications of the FLP result and concludes with some
randomized agreement protocols.
- Chapter 12 covers replicated data management. Since providing replicated
servers reduces to replicating the state of the servers, this section also
discusses the problems and concepts of replication. We cover three main
approaches: the transaction approach, the reliable multicast approach, and
the log propagation approach. The transaction approach includes discussion
of two-phase commit, three-phase commit, one-copy serializability, voting,
and dynamic voting protocols. The reliable multicast approach includes
discussion of virtual synchrony, algorithms for implementing reliable and
causal multicast, algorithms for totally ordered multicast, and consistent
multicast group maintenance algorithms. The log propagation approach covers
naive log propagation, epidemics, and causal log propagation. This chapter
is the culmination of Part II of the text and draws together the results
presented in previous chapters.
- Chapter 13 covers distributed rollback and recovery. These techniques are
critical for implementing fault-tolerant systems and are complementary to
the replicated data management techniques of the previous chapter. By using
the theory developed in the previous chapters (especially Chapter 9),
different rollback and recovery algorithms are presented in a unified manner
and are related to algorithms discussed previously.
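The vector timestamps and causality described for Chapter 9 can be illustrated with a short Python sketch. It uses the standard vector clock rules (increment your own entry on each event, take the element-wise maximum on receipt); the names VectorClock, happened_before, and concurrent are hypothetical, and the number of processes is assumed to be fixed and known in advance.

```python
from typing import List


class VectorClock:
    """Minimal sketch of a vector clock for process `pid` out of `n` processes."""

    def __init__(self, n: int, pid: int) -> None:
        self.pid = pid
        self.v = [0] * n

    def tick(self) -> List[int]:
        """Record a local event and return a copy of the timestamp."""
        self.v[self.pid] += 1
        return list(self.v)

    def send(self) -> List[int]:
        """Sending is an event; the returned vector travels with the message."""
        return self.tick()

    def receive(self, other: List[int]) -> List[int]:
        """Merge the sender's knowledge, then count the receive event."""
        self.v = [max(a, b) for a, b in zip(self.v, other)]
        self.v[self.pid] += 1
        return list(self.v)


def happened_before(a: List[int], b: List[int]) -> bool:
    """True when a causally precedes b: a <= b element-wise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a: List[int], b: List[int]) -> bool:
    """True when neither event causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


if __name__ == "__main__":
    p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
    e1 = p0.send()        # [1, 0]
    e2 = p1.tick()        # [0, 1]
    e3 = p1.receive(e1)   # [1, 2]
    print(happened_before(e1, e3), concurrent(e1, e2))  # prints "True True"
```

Unlike a single scalar logical clock, the vector form can tell concurrent events apart from causally ordered ones, which is what makes it useful for reasoning about global states.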
Is Deviant Behavior the Norm on P2P File-Sharing Networks?
Mobile Operating Systems
OS-Resources page
OS-References page
Operating Systems-Events
People
Operating Systems
Simulation-Related Pages
Operating Systems — Organizations
Book
Programming with system calls and libraries
Integrating Parallel and Distributed Computing in Computer Science Curricula