Semantics of Recovery Lines for Backward Recovery in Distributed Systems
Abstract
This paper addresses the definition of {\it recovery lines} in the context of backward recovery, which aims to cope with failures in distributed systems. A general framework that allows for several semantics of recovery lines is introduced. Key notions such as {\it missing messages} and {\it orphan messages} are precisely defined, and their impact on the definition of consistency of recovery lines is carefully analyzed. Basic mechanisms such as local checkpointing, message identification and (optimistic or pessimistic) message logging are then discussed as an illustration of (coordinated or uncoordinated) checkpointing protocols.
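
To make the two key notions concrete, the following minimal sketch classifies messages with respect to a candidate recovery line. It assumes the commonly used definitions, which may differ from the semantics developed in the paper: a message is an orphan if its receipt is recorded in the line but its sending is not, and it is missing if its sending is recorded but its receipt is not. The names `Message`, `classify` and the representation of a recovery line as a mapping from process name to the local event index of its checkpoint are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: str
    sender: str
    send_time: int   # local event index of the send at the sender
    receiver: str
    recv_time: int   # local event index of the receipt at the receiver


def classify(messages, line):
    """Return (orphans, missing) for a candidate recovery line.

    `line` maps each process to the local event index of its checkpoint;
    an event is recorded in the line if its index is <= that value.
    (Assumed representation, not taken from the paper.)
    """
    orphans, missing = [], []
    for m in messages:
        send_recorded = m.send_time <= line[m.sender]
        recv_recorded = m.recv_time <= line[m.receiver]
        if recv_recorded and not send_recorded:
            orphans.append(m.msg_id)    # received, but never sent after rollback
        elif send_recorded and not recv_recorded:
            missing.append(m.msg_id)    # sent, but its receipt is rolled back
    return orphans, missing


# Two processes P and Q exchanging two messages.
history = [
    Message("m1", "P", 2, "Q", 3),
    Message("m2", "Q", 5, "P", 4),
]
```

Under these assumptions, the line `{"P": 1, "Q": 4}` makes `m1` an orphan, the line `{"P": 3, "Q": 2}` makes `m1` missing, and the line `{"P": 3, "Q": 4}` yields neither. Whether a line containing missing messages counts as consistent depends on the chosen semantics, for instance on whether message logging allows those messages to be replayed.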