Paper Notes 2: "Scalable Virtual Machine Migration using Reinforcement Learning"

Scalable Virtual Machine Migration using Reinforcement Learning

  • Abdul Rahman Hummaida · Norman W. Paton · Rizos Sakellariou
  • Journal of Grid Computing (JGC)
  • 2022 CCF-C

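1、Introduction

The paper's main contributions are:

  1. It implements a reinforcement learning policy within a hybrid architecture, reducing SLA (service level agreement) violations while remaining highly scalable.
  2. It uses the hybrid architecture to let parallel, multi-level reinforcement learning agents cooperate.
  3. It shows experimentally that combining reinforcement learning with the hybrid architecture has advantages over other approaches.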

2、Problem Statement

The paper addresses the following problems:

  • Detect when a VM is stressed.
  • Identify which VMs to migrate.
  • Apply a decision making approach to optimize the migration of a VM, and choose a target node to host the VM in such a way that brings response time within SLA levels.
  • Develop an architecture for the control system that monitors and optimises the migration of VMs.

3、Related work

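  • VM placement
    • Covers both initial VM placement and VM migration, with the aim of meeting SLAs, utility goals, and similar objectives.
    • Meeting the SLA can be treated as detecting overloaded nodes and then invoking the hypervisor to migrate VMs from the stressed node to a target node.
    • The VM to migrate is chosen by Maximum Correlation, i.e. the VM most strongly correlated with CPU utilization is picked.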
  • reinforcement learning (RL)
    • Our approach also caters for reducing SLA violations; however, we choose a different state action reduction approach to manage the challenges with Q-learning.
    • Our proposal uses a decentralized architecture, applies knowledge sharing among the RL agents, uses aggregation to reduce the state action space, and uses linear regression to monitor QoS metrics like response time.
    • We use a state action aggregation approach to address the dimensionality challenge
    • We propose a highly scalable RL approach and examine its ability to manage a large infrastructure, with many thousands of nodes
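The Maximum Correlation selection mentioned under VM placement can be illustrated with a small sketch. This is a simplified reading, not the policy as defined in the related work: it assumes each VM has a short CPU history and picks the VM whose history has the highest Pearson correlation with the node's total CPU usage. The names `pearson`, `select_vm_max_correlation`, and the sample data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_vm_max_correlation(vm_cpu_history):
    """Return the VM id whose CPU trace correlates most with the node's total CPU trace."""
    # Assumption: node load is approximated by summing the per-VM samples.
    node_total = [sum(samples) for samples in zip(*vm_cpu_history.values())]
    return max(vm_cpu_history, key=lambda vm: pearson(vm_cpu_history[vm], node_total))


# Hypothetical CPU usage samples (percent) for three VMs on one node.
history = {
    "vm-a": [10, 20, 30, 40],
    "vm-b": [30, 30, 30, 30],
    "vm-c": [40, 30, 20, 15],
}
chosen = select_vm_max_correlation(history)
```

With this sample data the rising trace of `vm-a` tracks the node total most closely, so it is the one selected for migration.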

4、RL Background

We develop an RL-based controller to solve the VM migration problem and combine Q-learning with an aggregated state action space to address the curse of dimensionality in Q-learning. To speed up RL convergence, we utilise parallel learning agents that learn from a shared collective experience of all agents. We develop a reward function that focuses on learning a policy to reduce SLA violations while balancing this against energy consumption.
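To make the two ideas in this paragraph concrete, here is a minimal sketch of state aggregation and of a reward that trades SLA violations against energy. The bucket count, the energy weight, and the function names are assumptions chosen for illustration; the notes do not record the paper's actual state encoding or reward formula.

```python
def aggregate_state(cpu_utilisation, levels=4):
    """Map a CPU utilisation in [0, 1] to one of `levels` coarse buckets.

    Coarse buckets keep the Q-table small: 4 levels instead of a
    continuous value. The number of levels is an assumption.
    """
    clamped = min(max(cpu_utilisation, 0.0), 1.0)
    return min(int(clamped * levels), levels - 1)


def reward(response_time_ms, sla_limit_ms, energy_watts, energy_weight=0.001):
    """Negative cost: an SLA violation dominates, energy use is a smaller penalty.

    `energy_weight` is a hypothetical tuning knob, not a value from the paper.
    """
    sla_penalty = 1.0 if response_time_ms > sla_limit_ms else 0.0
    return -(sla_penalty + energy_weight * energy_watts)


# A node at 30% CPU falls into bucket 1; a fully loaded node into bucket 3.
low_bucket = aggregate_state(0.30)
high_bucket = aggregate_state(1.0)

# Meeting the SLA yields a higher reward than violating it at equal power draw.
ok_reward = reward(80, 100, 200)
violation_reward = reward(120, 100, 200)
```

The agent only ever sees the bucket index, so nodes with similar load share Q-values, which is the point of aggregating the state action space.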

5、Hybrid Architecture

Our scalable hybrid architecture, SHDF, attempts to service resource requests at the lowest local level possible, in order to reduce the overhead of servicing the request and to reduce the performance impact of migrating VMs across cluster boundaries.
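The "lowest local level possible" idea can be sketched as a search that only escalates when the current level cannot host the VM. The level names, the capacity model (free CPU fraction per host), and `place_request` are hypothetical and only illustrate the escalation order, not SHDF's actual controllers.

```python
def place_request(vm_demand, levels):
    """Return (level_name, host) for the first level, lowest first, that fits the demand.

    `levels` is an ordered list of (level_name, {host: free_capacity}) pairs.
    Returns None when no level can host the VM.
    """
    for level_name, hosts in levels:
        for host, free_capacity in hosts.items():
            if free_capacity >= vm_demand:
                return level_name, host
    return None


# Hypothetical hierarchy: try the local node, then its cluster, then the data centre.
levels = [
    ("node", {"node-1": 0.1}),
    ("cluster", {"node-2": 0.2, "node-3": 0.6}),
    ("data-centre", {"node-9": 0.9}),
]
result = place_request(0.5, levels)
```

Here the local node is too full, so the request is served within the same cluster and never crosses the cluster boundary.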

  • Controller Functionality - VM Migration SHDF

  • Controller Functionality - Consolidation

6、Proposed Reinforcement Learning Management Algorithm

  • Algorithm 1 RL@NC.

  • A Monitoring Module tracks VM response times and is used as input by other modules to manage the node. The module additionally tracks the outcome of reinforcement learning actions.

  • A Classification Module assesses the state of a node and the VMs running on it, using input from the Monitoring Module. The module decides if a VM is stressed.

  • A Learning Module uses the other modules as input to carry out decision making. When a VM is classed as stressed, the module determines the available actions and runs the Q-learning algorithm to decide the action to take. The module additionally executes the action and invokes the monitoring module to determine the outcome of the action, calculates the reward for each taken action and updates the Q-table.

  • Algorithm 2 Update Q-table.
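These notes do not reproduce Algorithm 2, so the following is only a sketch of the standard tabular Q-learning update that a Learning Module of this kind could run, together with epsilon-greedy action selection. The action names, state labels, and the `alpha`, `gamma`, and `epsilon` values are assumptions.

```python
import random
from collections import defaultdict

# Hypothetical action set for a stressed VM.
ACTIONS = ("stay", "migrate_local", "migrate_remote")


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise take the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# One Q-table shared by all agents stands in for the shared collective experience.
q_table = defaultdict(float)
rng = random.Random(0)

# The monitoring step reported that a local migration brought the VM back within SLA.
new_value = update_q(q_table, "stressed", "migrate_local", reward=1.0, next_state="ok")
greedy = choose_action(q_table, "stressed", epsilon=0.0, rng=rng)
```

After this single update the greedy choice for a stressed VM becomes `migrate_local`, since it is the only action with a positive Q-value.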