Hybrid Adaptive Checkpointing for Virtual Machine Fault Tolerance

Authors:

Abel Souza, Alessandro Papadopoulos, Luis Tomás Bolivar, David Gilbert, Johan Tordsson

Publication Type:

Conference/Workshop Paper

Venue:

IEEE International Conference on Cloud Engineering 2018

DOI:

https://doi.org/10.1109/IC2E.2018.00023


Abstract

Active Virtual Machine (VM) replication is an application-independent and cost-efficient mechanism for high availability and fault tolerance, with several recently proposed implementations based on checkpointing. However, these methods may suffer from large impacts on application latency, excessive resource usage overheads, and/or unpredictable behavior for varying workloads. To address these problems, we propose a hybrid approach that uses a Proportional-Integral (PI) controller to dynamically switch between periodic and on-demand checkpointing. Our mechanism automatically selects the method that minimizes application downtime by adapting itself to changes in workload characteristics. The implementation is based on modifications to QEMU, LibVirt, and OpenStack, to seamlessly provide fault-tolerant VM provisioning and to enable the controller to dynamically select the best checkpointing mode. Our evaluation is based on experiments with a video streaming application, an e-commerce benchmark, and a software development tool. The experiments demonstrate that our adaptive hybrid approach improves both application availability and resource usage compared to static selection of a checkpointing method, with application performance gains and negligible overheads.
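
As a rough illustration of the idea summarized in the abstract, the following sketch shows how a discrete-time PI controller could drive a choice between two checkpointing modes. It is not the paper's implementation: the measured signal (request latency in milliseconds), the gains, the setpoint, the zero threshold, and the rule mapping the control signal to a mode are all assumptions chosen only to make the example concrete, and the names PIController and select_mode are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Discrete-time proportional-integral controller (illustrative only)."""
    kp: float              # proportional gain (assumed value, not from the paper)
    ki: float              # integral gain (assumed value, not from the paper)
    setpoint: float        # target for the measured signal (assumed: latency in ms)
    integral: float = 0.0  # accumulated error over time

    def update(self, measurement: float, dt: float) -> float:
        """Return the control signal for one sampling step of length dt."""
        error = measurement - self.setpoint
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def select_mode(control_signal: float, threshold: float = 0.0) -> str:
    """Map the control signal to a checkpointing mode.

    Assumed rule for illustration: a signal above the threshold selects
    on-demand checkpointing, otherwise periodic checkpointing.
    """
    return "on-demand" if control_signal > threshold else "periodic"


# Hypothetical latency samples (ms), one per sampling interval.
controller = PIController(kp=0.5, ki=0.1, setpoint=100.0)
modes = []
for latency_ms in [80.0, 90.0, 150.0, 200.0, 120.0]:
    signal = controller.update(latency_ms, dt=1.0)
    modes.append(select_mode(signal))

print(modes)
```

In this toy run the integral term keeps the controller in the on-demand mode for the last sample even though latency has dropped close to the setpoint, which is the kind of smoothing that distinguishes a PI controller from a plain threshold on the instantaneous measurement.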

Bibtex

@inproceedings{Souza5130,
author = {Abel Souza and Alessandro Papadopoulos and Luis Tom{\'a}s Bolivar and David Gilbert and Johan Tordsson},
title = {Hybrid Adaptive Checkpointing for Virtual Machine Fault Tolerance},
pages = {12--22},
month = {April},
year = {2018},
booktitle = {IEEE International Conference on Cloud Engineering 2018},
url = {http://www.es.mdu.se/publications/5130-}
}