Guided Tour of the Book

      Part I, Distribution, addresses the fundamental issues concerning distribution, and it is the largest part of the book. It contains a comprehensive set of notions that will develop in the reader a thorough understanding of distributed system architecture, from concepts and paradigms to models and example systems. Chapter 1, Distributed Systems Foundations, discusses the foundations of distributed systems, and is intended as a review of the basic subjects regarding distribution, such as computer networks, distributed operating systems and services, complemented with a few formal notions that are useful for a more elaborate treatment of some subjects. Distributed system architectures are presented from an evolutionary perspective, from remote login to mobile computing, so that the reader, besides understanding what the several architectural models are, grasps why they appeared or mutated, and what needs each one serves.

       Just as History is paramount to Architecture, so is knowledge of the evolution of computing systems to the system architect. Chapter 2, Distributed System Paradigms, presents the most important paradigms in distributed systems in a problem-oriented manner, purposely aimed at architects-to-be. That is, rather than being exposed to the subjects in a paradigm-centric manner, enveloped in some formal description, the reader is faced with a problem or a need, then with a solution in the form of a paradigm, and, when appropriate, with details about relevant mechanisms or algorithms. Finally, where applicable, the limitations of that paradigm are pointed out, so that another paradigm that solves the problem is motivated, and so forth. Specifically, the chapter addresses: message passing, remote operations, group communication, naming and addressing, time and clocks, ordering, synchrony, coordination, concurrency, and consistency. Chapter 3, Models of Distributed Computing, discusses the main models used nowadays in distributed systems, that is: what the main classes of distributed activities are; why different models serve different needs; and how we design the software architecture and structure the runtime environment of distributed applications. The chapter clearly explains the main reasons for the well-known debate between the synchronous and asynchronous frameworks for distributed computing. Then, it addresses known models such as: client-server with RPC, group-oriented, World-Wide Web, distributed shared memory, and message buses.

       Chapter 4, Distributed Systems and Platforms, consolidates the notions learnt in the previous chapters, in the form of examples of enabling technologies, toolboxes, platforms and systems. The last two chapters do not address system-call details or system internals, since the scope of the book is designing and building systems, rather than programming them. Chapter 5 starts a case study: The VP'63 (VintagePort'63) Large-Scale Information System. An imaginary Portuguese wine company, with facilities spread throughout the country, has a traditional information system, the VintagePort'63 (VP'63), that must adapt to modern times. Centralized, mainframe-based, proprietary, and offering little interactivity, it must adapt to the distributed nature of the company and its distribution network, and to the evolution of the business. The case study is methodically addressed at the end of each part, so that we progressively solve the above-mentioned problems, making VP'63: modular, distributed and interactive; dependable; timely; and secure.

      Part II, Fault-Tolerance, addresses dependability of distributed systems, that is, how to ensure that they keep running correctly. It contains the fundamental notions concerning dependability, such as the trilogy fault-error-failure, and provides a comprehensive treatment of distributed fault-tolerance. Chapter 6, Fundamental Concepts of Fault-Tolerance, starts with the generic notion of dependability and its associated concepts, and ends with the introduction of distributed fault-tolerance. In fact, distribution and fault-tolerance go hand in hand, since the former requires the latter to keep reliability at an acceptable level, and the latter is made easier by some qualities of the former, such as independence of failure of individual machines. Chapter 7, Paradigms for Distributed Fault-Tolerance, discusses the main paradigms of this discipline. After introductory concepts and notions about fault-tolerant communication, it addresses issues such as: replication management, resiliency and voting, and recovery. Chapters 8 and 9, Models of Distributed Fault-Tolerant Computing and Dependable Systems and Platforms, show how to incorporate fault-tolerance in distributed systems. After the main strategies for the diverse fault models are explained, their materialization is discussed for the remote operation, diffusion and transactional computing models. Finally, examples of relevant systems are given. Chapter 10 continues the case study: Making the VP'63 System Dependable.

      Part III, Real-Time, takes the same explanatory approach as Part II, and discusses how to ensure that systems are timely. It contains the fundamental notions concerning real-time, and provides a comprehensive treatment of the problem of real-time in distributed systems. Chapters 11 and 12, Fundamental Concepts of Real-Time and Paradigms for Real-Time, address the fundamental notions and misconceptions about real-time in a distributed context. The main paradigms are presented, in a comparative manner when applicable, such as synchronism versus asynchronism, or event-triggered versus time-triggered operation. Chapter 12 further addresses issues such as: real-time networks, real-time processing, real-time communication, clock synchronization, and input-output. Chapters 13 and 14, Models of Distributed Real-Time Computing and Real-Time Systems and Platforms, show how to achieve timeliness of distributed systems, in its several forms, from the hard, soft or best-effort real-time classes, to the time-triggered and event-triggered models. Chapter 14 gives examples of distributed real-time systems in several settings. Chapter 15 continues the case study: Making the VP'63 System Timely.

      Part IV, Security, addresses security of distributed systems, that is, how to ensure that they resist intruders. Security is paramount to the recognition of open distributed systems as the key technology in today's global communication and processing scenario. This part contains the fundamental notions concerning security, and provides a comprehensive treatment of the problem of security in distributed systems. Chapter 16, Fundamental Concepts of Security, discusses the fundamental principles, such as the notions of risk, threat and vulnerability, and the properties of confidentiality, authenticity, integrity and availability. Chapter 17, Fundamental Security Paradigms, treats the most important paradigms, such as: cryptography, digital signature and payment, secure networks and communication, protection and access control, firewalls, and auditing. Chapters 18 and 19, Models of Distributed Secure Computing and Secure Systems and Platforms, consolidate the notions of the previous chapters, in the form of models and systems for building and achieving: information security, authentication, electronic transactions, secure channels, remote operations and messaging, intranets and firewall systems, extranets and virtual private networks. Chapter 20 continues the case study, this time: Making the VP'63 Secure.

      Last but not least comes Part V, on Management, because distributed systems are too complex to be managed ad hoc. In essence, there is a contradiction in the nature of the problem. Distributed systems are geographically spread. They have a large number of visible, and thus manageable, components, from computers, routers, modems, and network media, to programs, operating systems, protocols, etc. However, whilst some of the components can be self-managed or locally managed, and thus managed in a distributed fashion, system management is human-centric, and by nature centralized. The book does not intend to give a magic recipe for this problem, which is still an active area of research, but it will give the reader the ability to understand it and become aware of the existing solutions for it. Chapters 21 and 22, Fundamental Concepts of Management and Paradigms for Distributed Systems Management, give insight into the fundamental concepts, architectures and paradigms concerning network and distributed systems configuration and management. Chapter 22 presents the main management functions: fault, configuration, accounting, performance, security, quality-of-service, name and directories, and monitoring. Chapters 23 and 24, Models of Network and Distributed Systems Management and Management Systems and Platforms, discuss the main models, such as centralized, decentralized, integrated, and domain-oriented management, and point to examples of tools, systems and architectures. Chapter 25 concludes the case study: Managing the VP'63 System.


Web site hosted by: The Navigators of LaSIGE at Faculdade de Ciências of Universidade de Lisboa
© 2001 by Kluwer Academic Publishers and Paulo Veríssimo and Luís Rodrigues