Recovery for sporadic operations on cloud applications

Access & Terms of Use
open access
Copyright: Fu, Min
Abstract
Cloud-based systems change more frequently than traditional systems. These frequent changes involve sporadic operations such as installation and upgrade. Sporadic operations on the cloud manipulate cloud resources and are prone to unpredictable, unavoidable failures, largely because of cloud uncertainty. Recovering from such failures requires cloud operational recovery strategies. Existing operational recovery methods for the cloud have several drawbacks, such as the poor generalizability of exception handling mechanisms and the coarse-grained recovery of rollback mechanisms. This thesis therefore proposes a novel recovery approach, called POD-Recovery, for sporadic operations on the cloud. One novelty of POD-Recovery is that it is based on eight cloud operational recovery requirements that we formulated (e.g. recovery time objective satisfaction and recovery generalizability). Another is that it is non-intrusive: it does not modify the code that implements the sporadic operation. POD-Recovery works as follows. It first treats a sporadic operation as a process, which provides the workflow of the operation and the contextual information for each operational step. It then identifies the recovery points (where failure detection and recovery should be performed) inside the sporadic operation, determines the unified resource space (the resource types required and manipulated by the sporadic operation), and generates the expected resource state templates (the abstraction level of resource states) for all operational steps. For a given recovery point inside the sporadic operation, POD-Recovery first filters the applicable recovery patterns from the eight recovery patterns it supports and then automatically generates the recovery actions for the applicable patterns.
Next, it evaluates the generated recovery actions against three metrics: Recovery Time, Recovery Cost and Recovery Impact. This quantitative evaluation leads to the selection of an acceptable recovery action to execute for a given recovery point. We implement POD-Recovery and evaluate it by recovering from faults injected into five representative types of sporadic operations on the cloud. The experimental results show that POD-Recovery performs operational recovery while satisfying all the recovery requirements, and that it improves on existing recovery methods for cloud operations.
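The filter, generate, evaluate and select sequence described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Pattern` and `RecoveryAction` types, the placeholder pattern names `P1` to `P3`, the weighted-sum score and the hard recovery time objective cut-off are all assumptions made here for illustration, and the metric values are invented numbers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RecoveryAction:
    pattern: str            # name of the pattern that produced this action
    recovery_time: float    # estimated seconds needed to recover
    recovery_cost: float    # estimated cost, arbitrary units (assumption)
    recovery_impact: float  # estimated impact, arbitrary units (assumption)


@dataclass(frozen=True)
class Pattern:
    name: str
    is_applicable: Callable[[Dict[str, str]], bool]       # hypothetical filter
    generate: Callable[[Dict[str, str]], RecoveryAction]  # hypothetical generator


def select_recovery_action(
    patterns: List[Pattern],
    context: Dict[str, str],
    rto_seconds: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> Optional[RecoveryAction]:
    """Return the lowest-scoring action that meets the recovery time objective."""
    # Step 1: keep only the patterns applicable at this recovery point.
    applicable = [p for p in patterns if p.is_applicable(context)]
    # Step 2: generate one candidate action per applicable pattern.
    actions = [p.generate(context) for p in applicable]
    # Step 3: discard actions that would exceed the recovery time objective.
    acceptable = [a for a in actions if a.recovery_time <= rto_seconds]
    if not acceptable:
        return None
    # Step 4: rank by a weighted sum of the three metrics (an assumed scoring rule).
    w_t, w_c, w_i = weights
    return min(
        acceptable,
        key=lambda a: w_t * a.recovery_time
        + w_c * a.recovery_cost
        + w_i * a.recovery_impact,
    )


# Placeholder patterns with invented metric values.
PATTERNS = [
    Pattern("P1", lambda ctx: True,
            lambda ctx: RecoveryAction("P1", 30.0, 2.0, 1.0)),
    Pattern("P2", lambda ctx: ctx["step"] != "first",
            lambda ctx: RecoveryAction("P2", 10.0, 5.0, 3.0)),
    Pattern("P3", lambda ctx: True,
            lambda ctx: RecoveryAction("P3", 120.0, 0.5, 0.5)),
]

if __name__ == "__main__":
    chosen = select_recovery_action(PATTERNS, {"step": "third"}, rto_seconds=60.0)
    print(chosen.pattern if chosen else "no acceptable action")
```

In this sketch an action that misses the recovery time objective is never selected, however cheap it is, which reflects the abstract's requirement of recovery time objective satisfaction. How the thesis actually combines the three metrics is not stated in the abstract, so the weighted sum should be read as one possible choice.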
Author(s)
Fu, Min
Supervisor(s)
Zhu, Liming
Liu, Anna
Bass, Len
Publication Year
2017
Resource Type
Thesis
Degree Type
PhD Doctorate