Patrizio Pelliccione

Dipartimento di Informatica, Università degli Studi dell'Aquila
Via Vetoio, I-67010 L'Aquila (Italy)

Tutorials


Tutorial home page:
http://www.di.univaq.it/TutorialAFTS/index.html

SAFECOMP 2008


Title: Architecting Fault Tolerant Systems
Location: The 27th International Conference on Computer Safety, Reliability and Security (SAFECOMP 2008), Thursday 25th September, 13:30-17:00, Newcastle upon Tyne, UK.
Authors: H. Muccini, P. Pelliccione, A. Romanovsky
Tutorial overview: Fault tolerance, one of the four means of achieving dependability, is intended to ensure the delivery of the correct service in the presence of active faults. While typical fault tolerance solutions target the design and implementation phases of the software life cycle (e.g., Java and Windows NT exception handling), some researchers have more recently advocated the explicit integration of fault tolerance solutions into the entire life cycle. For example, several solutions have been proposed for fault tolerance using exception handling at the software architecture and component levels. This tutorial describes how the concepts of fault tolerance and software architecture have been integrated so far. The tutorial is structured in five parts (Overview of the Software Architecture Domain, Overview of Fault Tolerance and Exception Handling, Integrating Fault Tolerance into Software Architecture, Coordinated Atomic Actions, Examples and Case Studies) and is based on our recent survey study, in which more than fifteen approaches to architecting fault tolerant systems have been analysed and classified. The tutorial concludes by identifying the issues that are still open and require further investigation.

ISSRE 2007


Title: Architecting Fault Tolerant Systems
Location: The 18th IEEE International Symposium on Software Reliability Engineering (ISSRE 2007) tutorial, 5-9 November, Trollhättan, Sweden
Authors: H. Muccini, P. Pelliccione, A. Romanovsky
Tutorial overview: Fault tolerance, one of the four means of guaranteeing dependability, is intended to ensure the delivery of the correct services in the presence of active faults. It is implemented by error detection and subsequent system recovery. While typical solutions focus on fault tolerance (and specifically, exception handling) during the design and implementation phases of the software life cycle (e.g., Java and Windows NT exception handling), some researchers have more recently advocated explicit exception handling solutions throughout the entire life cycle. Several solutions have been proposed for fault tolerance via exception handling at the software architecture and component levels. This tutorial describes how the two concepts of fault tolerance and software architecture have been integrated so far. It is structured in five parts (Overview on Software Architecture, Overview on Fault Tolerance and Exception Handling, Integrating Fault Tolerance into Software Architecture, Coordinated Atomic Actions, Examples and Case Studies) and is based on a survey study on architecting fault tolerant systems in which more than fifteen approaches have been analyzed and classified. The tutorial concludes by identifying the issues that remain open and require deeper investigation.

WICSA 2007


Title: Architecting Fault Tolerant Systems
Location: Sixth Working IEEE/IFIP Conference on Software Architecture (WICSA 2007), Mumbai, India, January 7, 2007.
Authors: H. Muccini, P. Pelliccione, A. Romanovsky
Tutorial overview: Fault tolerance, one of the four means of guaranteeing dependability, is intended to ensure the delivery of the correct services in the presence of active faults. It is implemented by error detection and subsequent system recovery. Error detection finds an erroneous system state. The subsequent system recovery transforms a system state that contains one or more errors and (possibly) faults into a state without detected errors and faults (fault handling). Exceptions and exception handling provide a general framework for structuring the fault tolerance activities in a system: by focusing on the concept of exceptional/abnormal behaviour (as opposed to normal behaviour), exception handling makes it possible to specify the actions to be undertaken in the presence of abnormal events. While typical solutions focus on fault tolerance (and specifically, exception handling) during the design and implementation phases of the software life cycle (e.g., Java and Windows NT exception handling), some researchers have more recently advocated explicit exception handling solutions throughout the entire life cycle. Several solutions have been proposed for fault tolerance via exception handling at the software architecture and component levels. This tutorial describes how the two concepts of fault tolerance and software architecture have been integrated so far. It is structured in two parts (Overview on Fault Tolerance and Exception Handling, and Integrating Fault Tolerance into Software Architecture) and is based on a survey study on architecting fault tolerant systems in which more than fifteen approaches have been analyzed and classified. The tutorial concludes by identifying the issues that remain open and require deeper investigation.
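
The overview above describes fault tolerance as error detection followed by system recovery, structured through exception handling. As a rough illustration only, and not material taken from the tutorial itself, the following Java sketch shows that pattern at the implementation level. The class names (FaultToleranceSketch, Account, ErroneousStateException), the method withdrawWithRecovery, and the choice of a saved checkpoint as the recovery strategy are all assumptions made for this example.

```java
public class FaultToleranceSketch {

    /** Hypothetical exception signalling that error detection found an erroneous state. */
    static class ErroneousStateException extends Exception {
        ErroneousStateException(String message) {
            super(message);
        }
    }

    /** Hypothetical component whose state can become erroneous. */
    static class Account {
        int balance;

        Account(int balance) {
            this.balance = balance;
        }

        // Normal behaviour: apply the withdrawal, then run error detection.
        void withdraw(int amount) throws ErroneousStateException {
            balance -= amount;
            // Error detection: in this sketch a negative balance is an erroneous state.
            if (balance < 0) {
                throw new ErroneousStateException("negative balance: " + balance);
            }
        }
    }

    /** Returns true if the withdrawal completed normally, false if recovery was needed. */
    static boolean withdrawWithRecovery(Account account, int amount) {
        int checkpoint = account.balance; // state saved before the operation (assumed strategy)
        try {
            account.withdraw(amount);
            return true;
        } catch (ErroneousStateException e) {
            // Exceptional behaviour: recovery restores a state without detected errors.
            account.balance = checkpoint;
            return false;
        }
    }

    public static void main(String[] args) {
        Account account = new Account(100);
        System.out.println(withdrawWithRecovery(account, 30) + " " + account.balance);  // true 70
        System.out.println(withdrawWithRecovery(account, 150) + " " + account.balance); // false 70
    }
}
```

The handler separates the abnormal path from the normal one, which is the structuring role the overview attributes to exception handling. Restoring a checkpoint is only one possible recovery strategy, chosen here because it keeps the example short.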