Proxmox High CPU Usage: Identifying and Fixing the Culprit
Diagnose and resolve high CPU usage on Proxmox VE hosts. Covers kvm processes, ksmd memory dedup, rrdcached, pvestatd, CPU steal, and overbooking.
Investigating High CPU on a Proxmox Host
When your Proxmox VE host shows sustained high CPU usage, it can affect all VMs and containers running on that node. The challenge is figuring out whether the problem is a runaway VM, a Proxmox service, a kernel process, or simply too many workloads on insufficient hardware. This guide walks you through a systematic diagnosis approach.
Step 1: Identify the Top CPU Consumers
Start with the basics. Use top or htop to see what is consuming CPU time on the host.
# Use top sorted by CPU usage
top -bn1 | head -30
# Or use htop for a more interactive view
htop
# Key things to look for:
# - kvm processes (each is a VM's vCPU thread)
# - lxc processes (container workloads)
# - ksmd (kernel same-page merging / memory dedup)
# - rrdcached (RRD data collection daemon)
# - pvestatd (Proxmox status daemon)
# - pveproxy (web interface proxy)
# Filter for just KVM/QEMU processes
ps aux --sort=-%cpu | grep qemu | head -10
# See which VM each kvm process belongs to
ps aux | grep "[k]vm.*-id" | awk '{for(i=1;i<=NF;i++) if($i=="-id") print $(i+1), $0}'
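If you capture `ps aux` output regularly, the PID-to-VMID mapping can be scripted instead of eyeballed. A minimal sketch that parses captured `ps aux` text, assuming each Proxmox kvm process carries a `-id <vmid>` argument (the sample lines below are made up):

```python
import re

def kvm_cpu_by_vmid(ps_output: str) -> dict:
    """Map each kvm process to its VMID and %CPU from `ps aux` text.

    Relies on qemu-server starting VMs as `/usr/bin/kvm -id <vmid> ...`;
    the third `ps aux` column is %CPU.
    """
    result = {}
    for line in ps_output.splitlines():
        m = re.search(r"-id (\d+)", line)
        if m and "/usr/bin/kvm" in line:
            result[int(m.group(1))] = float(line.split()[2])
    return result

# Illustrative sample (fields: USER PID %CPU %MEM ... COMMAND)
sample = (
    "root  1234  85.3  4.1 ... /usr/bin/kvm -id 100 -name web01\n"
    "root  5678  12.0  2.2 ... /usr/bin/kvm -id 101 -name db01\n"
)
print(kvm_cpu_by_vmid(sample))  # {100: 85.3, 101: 12.0}
```

Feed it `subprocess.check_output(["ps", "aux"], text=True)` on a real host to get live numbers.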
High CPU from KVM Processes (VMs)
The most common cause of high host CPU is one or more VMs consuming excessive CPU. Each kvm process corresponds to a VM, and its threads represent vCPUs.
# Check CPU usage per VM from Proxmox
pvesh get /nodes/$(hostname)/qemu --output-format json | \
python3 -c "import sys,json; [print(f'VM {v[\"vmid\"]}: {v.get(\"cpu\",0)*100:.1f}% CPU') for v in json.load(sys.stdin)]"
# Or check individual VM stats
qm monitor 100
info cpus
# If a specific VM is the culprit:
# 1. Check inside the VM what process is consuming CPU
# 2. Consider CPU-limiting the VM
qm set 100 --cpulimit 2
# This caps the VM at two cores' worth of host CPU time
# Or set CPU units (relative weight, lower = less priority)
qm set 100 --cpuunits 512 # default is 1024 (100 on cgroup v2 hosts)
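cpuunits is a relative weight: under contention, each VM gets its own units divided by the sum of all competing VMs' units. A quick sketch of the arithmetic (the VMIDs and weights are illustrative):

```python
def cpu_share(units: dict) -> dict:
    """Relative CPU share per VM under full contention,
    given each VM's cpuunits weight."""
    total = sum(units.values())
    return {vm: w / total for vm, w in units.items()}

# VM 100 demoted to 512 while two others keep a 1024 weight:
shares = cpu_share({100: 512, 101: 1024, 102: 1024})
print({vm: f"{s:.0%}" for vm, s in shares.items()})
# {100: '20%', 101: '40%', 102: '40%'}
```

Note this only matters when CPUs are saturated; an idle host lets every VM use whatever it asks for regardless of weight.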
ksmd: Kernel Same-Page Merging
KSM (Kernel Same-Page Merging) is a memory deduplication feature that saves RAM by identifying identical memory pages across VMs. However, it consumes CPU to scan and compare pages, and can become a significant overhead when many VMs are running.
# Check if ksmd is consuming significant CPU
top -bn1 | grep ksmd
# Check KSM statistics
cat /sys/kernel/mm/ksm/pages_shared
cat /sys/kernel/mm/ksm/pages_sharing
cat /sys/kernel/mm/ksm/pages_unshared
# See how aggressively KSM is scanning
cat /sys/kernel/mm/ksm/pages_to_scan # pages per scan cycle
cat /sys/kernel/mm/ksm/sleep_millisecs # pause between cycles
# Reduce KSM CPU usage by scanning fewer pages or sleeping longer
echo 50 > /sys/kernel/mm/ksm/pages_to_scan # default: 100
echo 200 > /sys/kernel/mm/ksm/sleep_millisecs # default: 20
# Disable KSM entirely if RAM is not a constraint
echo 0 > /sys/kernel/mm/ksm/run
# (use echo 2 instead to also unmerge already-shared pages)
# Note: /sys/kernel/mm/ksm/* is not a sysctl and resets on reboot.
# On Proxmox, KSM is driven by the ksmtuned service, so disable that
# to make the change persistent:
systemctl disable --now ksmtuned
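Before disabling KSM, it is worth checking how much RAM it is actually saving: each page counted in pages_sharing is a virtual page backed by a single shared physical copy, so the saving is roughly pages_sharing times the page size. A rough calculator (the sample count is made up):

```python
def ksm_savings_mib(pages_sharing: int, page_size: int = 4096) -> float:
    """Approximate RAM saved by KSM, in MiB.

    Each page in pages_sharing would otherwise need its own
    physical copy, so savings ~= pages_sharing * page size.
    """
    return pages_sharing * page_size / (1024 * 1024)

# e.g. cat /sys/kernel/mm/ksm/pages_sharing -> 524288 (hypothetical)
print(f"{ksm_savings_mib(524288):.0f} MiB saved")  # 2048 MiB saved
```

If the saving is a few hundred MiB on a host with plenty of free RAM, the ksmd CPU cost is probably not worth it.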
rrdcached and pvestatd
These Proxmox services collect and store performance metrics (RRD data). On systems with many VMs and containers, they can consume noticeable CPU.
# Check rrdcached CPU usage
systemctl status rrdcached
ps aux | grep rrdcached
# rrdcached spikes can occur when many VMs start simultaneously
# or when the RRD database files are on slow storage
# If rrdcached is consuming excessive CPU:
# 1. Move RRD data to faster storage
# Default location: /var/lib/rrdcached/
ls -la /var/lib/rrdcached/db/pve2-*
# 2. Restart rrdcached to clear any backlog
systemctl restart rrdcached
# Check pvestatd
systemctl status pvestatd
# pvestatd collects status from all VMs every few seconds
# On a host with 50+ VMs, this is normal overhead
# If pvestatd is stuck or consuming excessive CPU
systemctl restart pvestatd
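To quantify how much CPU a daemon like pvestatd accumulates, you can read its utime/stime counters from /proc/<pid>/stat instead of watching top. A minimal sampler for any PID (Linux-only; compare two readings a minute apart to get a rate):

```python
import os

def cpu_seconds(pid: int) -> float:
    """Total CPU seconds (user + system) consumed by a process,
    read from /proc/<pid>/stat fields 14 (utime) and 15 (stime)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The comm field can contain spaces and parens, so split
    # after the last ')'; the remaining fields start at field 3.
    fields = stat.rsplit(")", 1)[1].split()
    utime, stime = int(fields[11]), int(fields[12])
    return (utime + stime) / os.sysconf("SC_CLK_TCK")

# e.g. cpu_seconds(int(check_output(["pidof", "pvestatd"])))
print(cpu_seconds(os.getpid()))
```

A service that gains several CPU seconds per minute on an otherwise quiet host deserves a restart or closer inspection.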
CPU Steal Time
If you are running Proxmox inside a VM (nested virtualization) or on a shared hosting platform, CPU steal time indicates that the hypervisor above yours is taking CPU cycles away from your Proxmox host.
# Check for CPU steal time
top -bn1 | head -5
# Look for %st (steal) in the CPU line
# %Cpu(s):  5.2 us,  2.1 sy,  0.0 ni, 85.3 id,  0.0 wa,  0.0 hi,  0.1 si,  7.3 st
# (the trailing "st" field is steal time)
# 7.3% steal means your CPU is being throttled by the host hypervisor
# Check steal over time
vmstat 5
# Look at the "st" column
# Solutions:
# - Move to dedicated hardware (no steal on bare metal)
# - Contact your hosting provider about CPU allocation
# - Reduce workload on this Proxmox instance
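Steal can also be read straight from /proc/stat: the eighth counter on the "cpu" line is cumulative steal jiffies, and dividing by the sum of all counters gives the steal fraction since boot (diff two samples for a recent rate). A sketch with an illustrative sample line:

```python
def steal_fraction(proc_stat_cpu_line: str) -> float:
    """Fraction of CPU time spent as steal, from a /proc/stat 'cpu'
    line (counters: user nice system idle iowait irq softirq steal ...)."""
    fields = [int(x) for x in proc_stat_cpu_line.split()[1:]]
    return fields[7] / sum(fields)  # 8th counter is steal

# Illustrative numbers; on a real host read: head -1 /proc/stat
line = "cpu 100 0 50 800 0 0 0 50 0 0"
print(f"{steal_fraction(line):.1%}")  # 5.0%
```

Sustained steal above a few percent on a rented VM usually means a neighbor is hogging the physical host, or your plan's CPU allocation is being enforced.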
CPU Overbooking
Overbooking happens when the total number of vCPUs allocated to VMs exceeds the physical CPU cores available. While moderate overbooking is normal and often works fine, excessive overbooking causes performance degradation for all VMs.
# Calculate your overbooking ratio
# Count physical cores
nproc
# or
lscpu | grep "^CPU(s):"
# Count total allocated vCPUs from the VM configs
# (qm list has no vCPU column; note "cores" defaults to 1 when unset,
# and a "sockets" line multiplies the per-VM count)
grep -h '^cores' /etc/pve/qemu-server/*.conf | awk '{total += $2} END {print "Total vCPUs:", total}'
# Overbooking ratio guidelines:
# 1:1 to 2:1 - Conservative, good for CPU-intensive workloads
# 2:1 to 4:1 - Moderate, fine for mixed workloads
# 4:1+ - Aggressive, only for idle/low-usage VMs
# Check actual CPU load average
uptime
cat /proc/loadavg
# If load average consistently exceeds physical core count,
# you are overbooking too aggressively
# Solutions:
# - Reduce vCPU count on idle VMs (many VMs do not need 4+ cores)
qm set 101 --cores 2
# - Set CPU affinity to pin VMs to specific cores
qm set 100 --affinity 0-3
# - Cap persistently noisy VMs with "qm set <vmid> --cpulimit"
# - Migrate some VMs to another node
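The ratio check above reduces to a one-liner once you have the physical core count and the per-VM vCPU allocations (the numbers below are illustrative):

```python
def overbooking_ratio(physical_cores: int, vcpus_per_vm: list) -> float:
    """vCPU-to-pCPU overbooking ratio; above roughly 4:1 is aggressive."""
    return sum(vcpus_per_vm) / physical_cores

# 16 physical cores, seven VMs with these core counts:
ratio = overbooking_ratio(16, [4, 4, 2, 2, 8, 4, 8])
print(f"{ratio:.1f}:1")  # 2.0:1
```

Combine the ratio with the load-average check: a 2:1 ratio with load well below the core count is healthy, while the same ratio with load pinned above it means the VMs are genuinely contending.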
Monitoring CPU Usage Over Time
A single snapshot of CPU usage does not tell the whole story. Use Proxmox's built-in graphs or external monitoring to track CPU patterns. ProxmoxR can help you visualize CPU utilization trends across all your nodes and identify which VMs are consistently the heaviest consumers.
# Quick check of recent CPU history via RRD data
rrdtool fetch /var/lib/rrdcached/db/pve2-node/$(hostname) \
AVERAGE -s -1h | head -20
# Set up a simple CPU monitoring cron job
# In /etc/cron.d/cpu-monitor:
# */5 * * * * root uptime >> /var/log/cpu-load.log
High CPU on a Proxmox host is usually caused by VM workloads, memory dedup (ksmd), or CPU overbooking. Identify the specific culprit before making changes, and address the root cause rather than just treating the symptom.
Take Proxmox management mobile
All the features discussed in this guide — accessible from your phone with ProxmoxR. Real-time monitoring, power control, firewall management, and more.