How to disable or tune filesystem cache sharing for OpenVZ?
- by gertvdijk
For OpenVZ, an example of container-based virtualization, it seems that the host and all guests share the filesystem cache. This sounds paradoxical when talking about virtualization, but it is actually a feature of OpenVZ.
It makes sense too. Because only one kernel is running, it's possible to benefit from sharing the same pages of filesystem cache in memory. And while that sounds beneficial, I think my setup actually suffers in performance from it. Here's why: my machines aren't sharing any files on disk, so I can't benefit from this feature of OpenVZ.
Several OpenVZ machines are running MySQL with MyISAM tables. MyISAM relies on the system's filesystem cache for caching of data files, unlike InnoDB's buffer pool. Also some virtual machines are known to do heavy and large I/O operations on the same filesystem in the host.
For example, when running cat *.MYD > /dev/null on some large database in one machine, I saw the filesystem cache shrink in another, as monitored by htop. This essentially evicts all the useful filesystem cache in the other guests (the eviction looks FIFO-like to me), and with it the MyISAM data that MySQL in those guests depends on.
Now users are complaining that MySQL is very slow. And it is. Some simple SELECT queries take several seconds at times when disk I/O is heavily used by other machines.
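To put numbers on the effect rather than eyeballing htop, the cache size can be sampled before and after the heavy read. The following is only a minimal sketch: it assumes a Linux-style /proc/meminfo with a "Cached:" line, the function names are made up for illustration, and I have not verified how OpenVZ virtualizes this file inside a container.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_cached_kb(path="/proc/meminfo"):
    """Return the current 'Cached' value, or None if the field is missing."""
    with open(path) as f:
        return parse_meminfo(f.read()).get("Cached")
```

Calling read_cached_kb() in one guest before and after running the cat command in another guest should show the drop described above, if the assumption about /proc/meminfo holds.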
So, simply put:
Is there a way to avoid filesystem cache being wiped out by other virtual machines in container-based virtualization?
Some thoughts:
Choosing the algorithm the kernel uses for evicting filesystem cache. (Is this possible? How?)
Reserving a certain amount of pages for a single VM. (From reading man vzctl, there seems to be no such option for filesystem cache pages.)
Will running MySQL on another filesystem get me anywhere?
If not, I think my alternatives are:
Use KVM for MySQL-MyISAM running VMs. KVM actually assigns memory to the VM and does not allow swapping out caches unless using a balloon driver.
Move to InnoDB and tune the buffer pools, dirty pages, etc. This is currently considered a 'nice to have' for the long term, as not everyone responsible for administering the system understands InnoDB.
More suggestions are welcome.
System software: Proxmox (now 1.9, could be upgraded to 2.x). One big LV assigned for the VMs.