Maxime Lorrillere

My Ph.D.

A kernel cooperative cache for I/O intensive applications in virtual machines

With the advent of cloud architectures, virtualization has become a key mechanism. In clouds, virtual machines (VMs) offer both isolation and flexibility. This is the foundation of cloud elasticity, but it fragments physical resources, including memory. While each VM's memory needs evolve over time, existing mechanisms for dynamically adjusting VM memory are either inefficient or application specific, and it is currently impossible to take advantage of the unused memory of VMs running on another host.

To this end, we designed APIs inside the Linux kernel that can be used to build cooperative caches that are agnostic to applications, file systems and hypervisors. These APIs enabled us to create Puma, a mechanism that improves the performance of I/O-intensive applications by allowing a VM to entrust clean page-cache pages to other VMs that have unused memory. Puma relies on existing page-cache data structures, which makes it very efficient to reclaim the memory lent to another VM. Because it is distributed, Puma increases memory consolidation at the scale of a data center. In our evaluations with TPC-C, TPC-H, BLAST and Postmark, we show that Puma can significantly boost performance without impacting potential activity peaks on the lender. This work has been presented at SYSTOR'15 (doi).
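The idea of entrusting clean pages to another VM can be illustrated with a small user-space simulation. This is only a toy sketch and not Puma's kernel API: the class names `RemoteCache` and `LocalPageCache`, the dictionary standing in for the disk, and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class RemoteCache:
    """Hypothetical stand-in for memory lent by another VM."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def put(self, page_id, data):
        # Pages are clean, so the lender may drop the oldest ones at any
        # time: the borrower can always re-read them from disk.
        self.pages[page_id] = data
        self.pages.move_to_end(page_id)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)

    def get(self, page_id):
        # Exclusive caching: a page handed back leaves the lender.
        return self.pages.pop(page_id, None)


class LocalPageCache:
    """Hypothetical LRU page cache that evicts clean pages to a lender."""

    def __init__(self, capacity, remote, disk):
        self.capacity = capacity
        self.remote = remote
        self.disk = disk  # dict: page_id -> data, stands in for storage
        self.pages = OrderedDict()
        self.stats = {"local": 0, "remote": 0, "disk": 0}

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
            self.stats["local"] += 1
            return self.pages[page_id]
        data = self.remote.get(page_id)
        if data is not None:
            self.stats["remote"] += 1
        else:
            data = self.disk[page_id]
            self.stats["disk"] += 1
        self._insert(page_id, data)
        return data

    def _insert(self, page_id, data):
        self.pages[page_id] = data
        while len(self.pages) > self.capacity:
            # Instead of discarding the victim, entrust it to the lender.
            victim, victim_data = self.pages.popitem(last=False)
            self.remote.put(victim, victim_data)
```

In this sketch a page evicted locally can later be served from the lender instead of the disk, and the lender can reclaim its memory simply by dropping pages, since it only ever holds clean data.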

Virtual machines widely use dynamic memory management techniques such as memory ballooning or opportunistic caches to improve server memory consolidation. These techniques rely on the ability to detect idle memory that can be reallocated without cost. However, while we are able to detect memory that applications explicitly allocate (e.g. with malloc), free memory is generally used as an I/O cache, so it is hard to know whether that memory is really useful: when an I/O workload stops, the cache is full of unused memory, but we cannot tell that the workload has finished. This is particularly true for the intermittent workloads that we can observe in clouds. In this work, we propose heuristics that can be used to detect when a VM has caching activity, in order to improve server memory consolidation. They rely on existing memory management mechanisms of the Linux kernel, such as the shadow page cache and the active/inactive LRU lists. This work has been presented at the French conference on parallelism, architecture and operating systems (Compas'2015, hal).
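As a rough illustration of what such a heuristic could look like, the sketch below compares two samples of monotonically increasing counters. It is not the heuristic proposed in this work: the function name, the counter names `refaults` and `activations`, and the threshold are hypothetical. The assumption is that refaults (recently evicted pages that are read again, which the shadow page cache makes observable) and promotions from the inactive to the active list both indicate that the cache is still being used.

```python
def has_caching_activity(prev, curr, min_events=1):
    """Return True if the page cache looks actively used between two samples.

    `prev` and `curr` are dicts of monotonically increasing counters
    (both key names are hypothetical):
      "refaults":    evicted pages that were read again after eviction
      "activations": pages promoted from the inactive to the active list
    """
    refaults = curr["refaults"] - prev["refaults"]
    activations = curr["activations"] - prev["activations"]
    # No refault and no promotion during the interval suggests the cached
    # pages are idle and could be reclaimed at low cost.
    return refaults + activations >= min_events
```

A monitor could call this function periodically and treat several consecutive idle intervals as a hint that the cache memory can be reallocated; how long to wait before doing so is a tuning choice that this sketch leaves open.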