---
title: Proxmox Cluster Basics
description: Overview of how Proxmox VE clusters work, including quorum, networking, and operational constraints
tags:
- proxmox
- virtualization
- clustering
category: infrastructure
created: 2026-03-14
updated: 2026-03-14
---
# Proxmox Cluster Basics
## Introduction
A Proxmox VE cluster groups multiple Proxmox nodes into a shared management domain. This allows centralized administration of virtual machines, containers, storage definitions, and optional high-availability workflows.
## Purpose
Use a Proxmox cluster when you want:
- Centralized management for multiple hypervisor nodes
- Shared visibility of guests, storage, and permissions
- Live migration or controlled workload movement between nodes
- A foundation for HA services backed by shared or replicated storage
## Architecture Overview
A Proxmox cluster relies on several core components:
- `pvecm`: the cluster management tool used to create and join clusters
- Corosync: provides the cluster communication layer
- `pmxcfs`: the Proxmox cluster file system used to distribute cluster configuration
- Quorum: majority voting used to protect cluster consistency
Important operational behavior:
- Each node normally has one vote
- A majority of votes must be online for state-changing operations
- Loss of quorum causes the cluster to become read-only for protected operations
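The majority rule above can be sketched with plain shell arithmetic (illustrative only, not a Proxmox command): a cluster of N voting members keeps quorum while floor(N/2) + 1 votes are online.

```bash
# Quorum math for small clusters: needed votes and tolerable failures.
for nodes in 2 3 4 5; do
  needed=$(( nodes / 2 + 1 ))
  tolerated=$(( nodes - needed ))
  echo "${nodes} nodes: quorum at ${needed} votes, survives ${tolerated} failure(s)"
done
```

Note that a two-node cluster tolerates zero failures, which motivates the odd-node-count guidance below.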
## Cluster Design Notes
### Network requirements
Proxmox expects a reliable low-latency network for cluster traffic. Corosync is sensitive to packet loss, jitter, and unstable links. In homelabs, this generally means wired LAN links, stable switching, and avoiding Wi-Fi for cluster communication.
### Odd node counts
Three nodes is the common minimum for a healthy quorum-based design. Two-node designs can work, but they need extra planning such as a QDevice or acceptance of reduced fault tolerance.
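For the two-node case, a QDevice adds a tie-breaking vote from a small external host. A rough sketch, assuming a placeholder voter at 192.0.2.50 (verify package names and syntax against the Proxmox docs for your release):

```bash
# On each cluster node:
apt install corosync-qdevice

# On the external voter (a small VM or SBC outside the cluster):
apt install corosync-qnetd

# From one cluster node, register the external voter:
pvecm qdevice setup 192.0.2.50
```

The QDevice host should sit outside the failure domain of both nodes, or it cannot break ties when one node is down.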
### Storage considerations
Clustering does not automatically provide shared storage. Features such as live migration and HA depend on storage design:
- Shared storage: NFS, iSCSI, Ceph, or other shared backends
- Replicated local storage: ZFS-based storage replication (`pvesr`) can support some workflows, but requires careful planning
- Backup storage: separate from guest runtime storage
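As one hedged example of a shared backend, an NFS store can be registered with `pvesm`; because the definition lives in `pmxcfs`, every node sees it. The server address, export path, and storage ID here are placeholders:

```bash
# Register an NFS backend cluster-wide (hypothetical server and export).
pvesm add nfs shared-nfs \
    --server 192.0.2.20 \
    --export /srv/pve-shared \
    --content images,rootdir
```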
## Configuration Example
Create a new cluster on the first node:
```bash
pvecm create lab-cluster
```
Check cluster status:
```bash
pvecm status
```
Join another node by running this on the joining node, pointing at the address of an existing cluster member:
```bash
pvecm add 192.0.2.10
```
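After a join completes, membership can be confirmed from any node:

```bash
pvecm nodes     # lists cluster members with their node IDs
pvecm status    # shows quorum state and vote counts
```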
Use placeholder management addresses in documentation and never expose real administrative IPs publicly.
## Troubleshooting Tips
### Cluster is read-only
- Check quorum status with `pvecm status`
- Look for network instability between nodes
- Verify time synchronization and general host health
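As one way to check clock state (chrony shown as a common choice; your hosts may use a different NTP client):

```bash
timedatectl status   # look for "System clock synchronized: yes"
chronyc tracking     # reports current offset from the NTP source
```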
### Node join fails
- Confirm name resolution and basic IP reachability
- Make sure cluster traffic is not filtered by a firewall
- Verify the node is not already part of another cluster
### Random cluster instability
- Review packet loss, duplex mismatches, and switch reliability
- Keep corosync on stable wired links with low latency
- Separate heavy storage replication traffic from cluster messaging when possible
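Link health can be inspected directly on a node; the peer address below is a placeholder:

```bash
corosync-cfgtool -s               # per-link status for the local node
ping -c 100 -i 0.2 192.0.2.11     # rough packet-loss check toward a peer
```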
## Best Practices
- Use at least three voting members for a stable quorum model
- Keep cluster traffic on reliable wired networking
- Document node roles, storage backends, and migration dependencies
- Treat the Proxmox management network as a high-trust segment
- Test backup and restore separately from cluster failover assumptions
## References
- [Proxmox VE Administration Guide: Cluster Manager](https://pve.proxmox.com/pve-docs/chapter-pvecm.html)
- [Proxmox VE `pvecm` manual](https://pve.proxmox.com/pve-docs/pvecm.1.html)