“Computational System for Real-Time Distributed Control”
Ph.D. dissertation, Technical University of Lisbon, Instituto Superior Técnico, Lisbon, Portugal, Jul. 2002
Abstract: Standard fieldbuses are nowadays a cost-effective solution for distributed control systems. However, the efficient implementation of fault-tolerance and real-time mechanisms in fieldbus environments is far from being a plain engineering task. Rather, it poses a comprehensive set of non-trivial problems whose solution requires a systemic approach, taken here in the context of CAN, the Controller Area Network. One key point is that fault-tolerant distributed systems may take advantage of the availability of reliable communications. In this regard, we dismiss the misconception that CAN native mechanisms guarantee reliable message broadcast. Then, reasoning about the reliability of CAN communications and their weaknesses, we discuss a suite of low-level protocols providing: reliable and atomic broadcast; node failure detection and site membership; clock synchronization. Refuting a common belief that bus media redundancy is too complex to be implemented in the CAN infrastructure, we present an innovative and extremely simple mechanism that makes such an approach feasible, using off-the-shelf components. This secures resilience against permanent partitioning of the CAN infrastructure. In addition, we discuss a problem often disregarded in many analyses of CAN timing properties: temporary partitions (inaccessibility). We explain how to secure CAN real-time operation in the presence of temporary network errors.
Research line(s): Timeliness and Adaptation in Dependable Systems (TADS)