first version of the knowledge base :)

2026-03-14 11:41:54 +01:00
commit 27965301ad
47 changed files with 4356 additions and 0 deletions


@@ -0,0 +1,64 @@
---
title: Backup Architecture
description: Reference backup architecture for self-hosted services, data, and infrastructure components
tags:
- backup
- architecture
- self-hosting
category: systems
created: 2026-03-14
updated: 2026-03-14
---
# Backup Architecture
## Summary
A backup architecture defines what is protected, where copies live, and how recovery is validated. In self-hosted environments, the architecture must account for application data, infrastructure configuration, and the operational steps needed to restore service safely.
## Why it matters
Many backup failures are architectural rather than tool-specific. Storing copies on the wrong system, skipping configuration, or never testing restores can make an otherwise successful backup job useless during an incident.
## Core concepts
- Multiple copies across different failure domains
- Separation of live storage, backup storage, and off-site retention
- Consistent backups for databases and stateful services
- Restore validation as part of the architecture
## Practical usage
A practical backup architecture usually includes:
- Host or VM backups for infrastructure nodes
- File or repository backups for application data
- Separate backup of configuration, Compose files, and DNS or proxy settings
- Off-site encrypted copy of critical repositories
Example model:
```text
Primary workloads -> Local backup repository -> Off-site encrypted copy
Infrastructure config -> Git + encrypted secret store -> Off-site mirror
```
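
Restore validation can start with something as small as a freshness check. The sketch below is illustrative, assuming a wrapper script has already produced a JSON list of snapshots with `repo` and `time` fields (for example, derived from `restic snapshots --json`; the field names here are placeholders, not restic's actual schema):

```python
import json
from datetime import datetime, timedelta, timezone

def stale_repos(snapshot_json: str, max_age_hours: int = 26) -> list[str]:
    """Return repositories whose newest snapshot is older than max_age_hours.

    Expects a JSON list of {"repo": ..., "time": ISO-8601} entries. The slight
    margin over 24 hours tolerates jitter in daily backup schedules.
    """
    newest: dict[str, datetime] = {}
    for snap in json.loads(snapshot_json):
        t = datetime.fromisoformat(snap["time"])
        repo = snap["repo"]
        if repo not in newest or t > newest[repo]:
            newest[repo] = t
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    return sorted(repo for repo, t in newest.items() if t < cutoff)
```

A check like this belongs in monitoring, not in the backup job itself, so a silently failing job still raises an alert.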
## Best practices
- Back up both data and the metadata needed to use it
- Keep at least one copy outside the main site or storage domain
- Use backup tooling that supports verification and restore inspection
- Make restore order and dependency assumptions explicit
## Pitfalls
- Treating snapshots as the only backup mechanism
- Backing up encrypted data without preserving key recovery paths
- Assuming application consistency without database-aware handling
- Skipping restore drills for high-value services
## References
- [restic documentation](https://restic.readthedocs.io/en/latest/)
- [BorgBackup documentation](https://borgbackup.readthedocs.io/en/stable/)
- [Proxmox VE Backup and Restore](https://pve.proxmox.com/pve-docs/chapter-vzdump.html)


@@ -0,0 +1,156 @@
---
title: Homelab Architecture
description: Reference architecture for building a maintainable homelab with clear trust zones and operational boundaries
tags:
- homelab
- architecture
- infrastructure
category: systems
created: 2026-03-14
updated: 2026-03-14
---
# Homelab Architecture
## Introduction
A homelab architecture should make experimentation possible without turning the environment into an undocumented collection of one-off systems. The most effective designs separate compute, networking, storage, identity, and operations concerns so each layer can evolve without breaking everything above it.
## Purpose
This document describes a reusable architecture for:
- Self-hosted services
- Virtualization and container workloads
- Secure remote access
- Monitoring, backup, and update workflows
## Architecture Overview
A practical homelab can be viewed as layered infrastructure:
```text
Edge and Access
-> ISP/router, firewall, VPN, reverse proxy
Network Segmentation
-> management, servers, clients, IoT, guest
Compute
-> Proxmox nodes, VMs, container hosts
Platform Services
-> DNS, reverse proxy, identity, secrets, service discovery
Application Services
-> dashboards, git forge, media, automation, monitoring
Data Protection
-> backups, snapshots, off-site copy, restore testing
```
## Recommended Building Blocks
### Access and identity
- VPN or zero-trust access layer for administrative entry
- SSH with keys only for infrastructure access
- DNS as the primary naming system for internal services
### Compute
- Proxmox for VM and LXC orchestration
- Dedicated container hosts for Docker or another runtime
- Utility VMs for DNS, reverse proxy, monitoring, and automation
### Storage
- Fast local storage for active workloads
- Separate backup target with different failure characteristics
- Clear distinction between snapshots and real backups
### Observability and operations
- Metrics collection with Prometheus-compatible exporters
- Dashboards and alerting through Grafana and Alertmanager
- Centralized backup jobs and restore validation
- Controlled update workflow for host OS, containers, and dependencies
## Example Layout
```text
VLAN 10  Management  hypervisors, switches, storage admin
VLAN 20  Servers     reverse proxy, app VMs, databases
VLAN 30  Clients     desktops, laptops, admin workstations
VLAN 40  IoT         cameras, smart home, media devices
VLAN 50  Guest       internet-only devices
```
Service placement example:
- Reverse proxy and DNS on small utility VMs
- Stateful applications in dedicated VMs or clearly documented persistent containers
- Monitoring and backup services isolated from guest and IoT traffic
## Configuration Example
Example inventory model:
```yaml
edge:
router: gateway-01
vpn: tailscale
compute:
proxmox:
- pve-01
- pve-02
- pve-03
docker_hosts:
- docker-01
platform:
dns:
- dns-01
reverse_proxy:
- proxy-01
monitoring:
- mon-01
backup:
- backup-01
```
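
An inventory in this shape can be linted mechanically. The sketch below mirrors the YAML example as a plain dict (in practice you would load it with a YAML parser) and checks two invariants: required platform roles exist and hostnames are not reused. The role names and checks are assumptions for illustration:

```python
# Inventory mirrored as a plain dict, same shape as the YAML example above.
INVENTORY = {
    "edge": {"router": "gateway-01", "vpn": "tailscale"},
    "compute": {"proxmox": ["pve-01", "pve-02", "pve-03"],
                "docker_hosts": ["docker-01"]},
    "platform": {"dns": ["dns-01"], "reverse_proxy": ["proxy-01"],
                 "monitoring": ["mon-01"], "backup": ["backup-01"]},
}

REQUIRED_PLATFORM_ROLES = {"dns", "reverse_proxy", "monitoring", "backup"}

def lint_inventory(inventory: dict) -> list[str]:
    """Return problems found: missing platform roles or duplicate hostnames.

    Only compute and platform layers are scanned for hostnames; edge entries
    may hold non-host values such as a VPN product name.
    """
    problems = []
    missing = REQUIRED_PLATFORM_ROLES - set(inventory.get("platform", {}))
    problems.extend(f"missing platform role: {role}" for role in sorted(missing))
    seen: set[str] = set()
    for layer in ("compute", "platform"):
        for hosts in inventory.get(layer, {}).values():
            for host in (hosts if isinstance(hosts, list) else [hosts]):
                if host in seen:
                    problems.append(f"duplicate hostname: {host}")
                seen.add(host)
    return problems
```

Running a check like this in CI, or on each inventory change, keeps the asset inventory trustworthy as hosts come and go.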
## Troubleshooting Tips
### Services are easy to deploy but hard to operate
- Add inventory, ownership, and restore notes
- Separate platform services from experimental application stacks
- Avoid hiding critical dependencies inside one large Compose file
### Changes in one area break unrelated systems
- Recheck network boundaries and shared credentials
- Remove unnecessary coupling between storage, reverse proxy, and app hosts
- Keep DNS, secrets, and backup dependencies explicit
### Remote access becomes risky over time
- Review which services are internet-exposed
- Prefer tailnet-only or VPN-only admin paths
- Keep management interfaces off user-facing networks
## Best Practices
- Design around failure domains, not only convenience
- Keep a small number of core platform services well documented
- Prefer simple, replaceable building blocks over fragile all-in-one stacks
- Maintain an asset inventory with hostnames, roles, and backup coverage
- Test recovery paths for DNS, identity, and backup infrastructure first
## References
- [Proxmox VE Administration Guide: Cluster Manager](https://pve.proxmox.com/pve-docs/chapter-pvecm.html)
- [Docker: Docker overview](https://docs.docker.com/get-started/docker-overview/)
- [Tailscale: What is Tailscale?](https://tailscale.com/kb/1151/what-is-tailscale)
- [Prometheus](https://prometheus.io/)


@@ -0,0 +1,67 @@
---
title: Homelab Network Architecture
description: Reference network architecture for a segmented homelab with private access and clear service boundaries
tags:
- homelab
- networking
- architecture
category: systems
created: 2026-03-14
updated: 2026-03-14
---
# Homelab Network Architecture
## Summary
A homelab network architecture should separate trust zones, keep administrative paths private, and make service traffic easy to reason about. The goal is not enterprise complexity, but a structure that reduces blast radius and operational confusion.
## Why it matters
Flat networks are easy to start with and difficult to secure later. A basic segmented design helps isolate management, servers, clients, guest devices, and less trusted endpoints such as IoT hardware.
## Core concepts
- Segmentation by trust and function
- Routed inter-VLAN policy instead of unrestricted layer-2 reachability
- Separate administrative access paths from public ingress
- DNS and reverse proxy as shared network-facing platform services
## Practical usage
Example logical layout:
```text
Management -> hypervisors, switches, storage admin
Servers -> applications, databases, utility VMs
Clients -> workstations and laptops
IoT -> low-trust devices
Guest -> internet-only access
VPN overlay -> remote access for administrators and approved services
```
This model works well with:
- A firewall or router handling inter-segment policy
- Private access through Tailscale or another VPN
- Reverse proxy entry points for published applications
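
The inter-segment policy above can be expressed as a small default-deny table before it is translated into firewall rules. This is a sketch, and the allowed flows listed are example choices for this layout, not a recommendation for every environment:

```python
# Default-deny inter-segment policy: only listed (source, destination) pairs
# may initiate traffic. Segment names mirror the logical layout above.
ALLOWED_FLOWS = {
    ("clients", "servers"),     # users reach published applications
    ("management", "servers"),  # admins manage application hosts
    ("servers", "servers"),     # east-west app traffic, ideally narrowed further
    # guest and iot deliberately have no entries toward internal segments
}

def is_allowed(src: str, dst: str) -> bool:
    """Evaluate policy for a new connection from src segment to dst segment.

    Intra-segment traffic is allowed here because it never crosses the router;
    cross-segment traffic must match an explicit entry.
    """
    return src == dst or (src, dst) in ALLOWED_FLOWS
```

Writing the table down first makes firewall reviews a diff against intent instead of an archaeology exercise.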
## Best practices
- Keep management services on a dedicated segment
- Use DNS names and documented routes instead of ad hoc host entries
- Limit which segments can reach storage, backup, and admin systems
- Treat guest and IoT networks as untrusted
## Pitfalls
- Publishing management interfaces through the same path as public apps
- Allowing lateral access between all segments for convenience
- Forgetting to document routing and firewall dependencies
- Relying on multicast-based discovery across routed segments without a plan
## References
- [RFC 1918: Address Allocation for Private Internets](https://www.rfc-editor.org/rfc/rfc1918)
- [RFC 4193: Unique Local IPv6 Unicast Addresses](https://www.rfc-editor.org/rfc/rfc4193)
- [Tailscale: Subnet routers](https://tailscale.com/kb/1019/subnets)


@@ -0,0 +1,64 @@
---
title: Identity Management Patterns
description: System-level identity management patterns for self-hosted and homelab environments
tags:
- identity
- authentication
- architecture
category: systems
created: 2026-03-14
updated: 2026-03-14
---
# Identity Management Patterns
## Summary
Identity management patterns describe how users, devices, and services are authenticated and governed across a self-hosted environment. Strong patterns reduce credential sprawl and make account lifecycle management more consistent.
## Why it matters
As services multiply, local account management becomes a source of weak passwords, missed offboarding, and inconsistent MFA coverage. A system-level identity pattern helps centralize trust while preserving operational fallback paths.
## Core concepts
- Central identity provider for users
- Federated login to applications through OIDC or SAML
- Strong admin authentication for infrastructure access
- Separate handling for service accounts and machine credentials
## Practical usage
A practical identity pattern often looks like:
```text
Users -> Identity provider -> Web applications
Admins -> VPN + SSH key or hardware-backed credential -> Infrastructure
Services -> Scoped machine credentials -> Databases and APIs
```
Supporting services may include:
- MFA-capable identity provider
- Reverse proxy integration for auth-aware routing
- Secrets management for service credentials
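
The "one credential, one purpose" rule for service accounts is auditable. The sketch below assumes a mapping of credential identifiers to the applications observed using them, such as might be derived from a secrets manager's access log; the names and data are illustrative:

```python
# Map of credential identifier -> applications observed using it.
# Example data only; in practice this would come from audit logs.
CREDENTIAL_USAGE = {
    "db-password-gitea": ["gitea"],
    "db-password-shared": ["wiki", "dashboard"],  # violation: reused credential
    "api-token-monitoring": ["prometheus"],
}

def shared_credentials(usage: dict[str, list[str]]) -> list[str]:
    """Return credentials referenced by more than one distinct application."""
    return sorted(cred for cred, apps in usage.items() if len(set(apps)) > 1)
```

Flagged credentials are candidates for splitting into per-application secrets, which also makes revocation after an incident a one-service operation.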
## Best practices
- Centralize user login where applications support it
- Require MFA for administrative and internet-exposed access
- Keep service credentials scoped to one system or purpose
- Maintain documented break-glass and recovery procedures
## Pitfalls
- Treating shared admin accounts as acceptable long-term practice
- Leaving old local users in place after federation is introduced
- Using one service credential across many applications
- Forgetting to protect the identity provider as critical infrastructure
## References
- [OpenID Connect Core 1.0](https://openid.net/specs/openid-connect-core-1_0.html)
- [NIST Digital Identity Guidelines](https://pages.nist.gov/800-63-3/)
- [Yubico developer documentation](https://developers.yubico.com/)


@@ -0,0 +1,67 @@
---
title: Monitoring Stack Architecture
description: Reference architecture for a monitoring stack in a self-hosted or homelab environment
tags:
- monitoring
- observability
- architecture
category: systems
created: 2026-03-14
updated: 2026-03-14
---
# Monitoring Stack Architecture
## Summary
A monitoring stack architecture defines how metrics, probes, dashboards, and alerts fit together. In self-hosted environments, the stack should stay small enough to operate but broad enough to cover infrastructure, ingress, and critical services.
## Why it matters
Monitoring that is bolted on late often misses the services operators actually depend on. A planned stack architecture makes it easier to understand where signals come from and how alerts reach the right people.
## Core concepts
- Collection: exporters and scrape targets
- Storage and evaluation: Prometheus
- Visualization: Grafana
- Alert routing: Alertmanager
- External validation: blackbox or equivalent endpoint checks
## Practical usage
Typical architecture:
```text
Hosts and services -> Exporters / probes -> Prometheus
Prometheus -> Grafana dashboards
Prometheus -> Alertmanager -> notification channel
```
Recommended coverage:
- Host metrics for compute and storage systems
- Endpoint checks for user-facing services
- Backup freshness and certificate expiry
- Platform services such as DNS, reverse proxy, and identity provider
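
The flow above maps to a small Prometheus configuration. The fragment below is a minimal sketch: hostnames, ports, and job names are placeholders for this environment, and the relabeling follows the standard blackbox exporter pattern:

```yaml
# Minimal prometheus.yml fragment; targets and job names are placeholders.
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ["pve-01:9100", "docker-01:9100", "backup-01:9100"]
  - job_name: blackbox-http
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ["https://git.example.internal", "https://grafana.example.internal"]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["mon-01:9093"]
```

The relabeling moves each probed URL into the `target` query parameter and points the scrape itself at the blackbox exporter, so Prometheus records results per probed endpoint rather than per exporter instance.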
## Best practices
- Monitor the path users depend on, not only the host underneath it
- Keep the monitoring stack itself backed up and access controlled
- Alert on actionable failures rather than every threshold crossing
- Document ownership for critical alerts and dashboards
## Pitfalls
- Monitoring only CPU and memory while ignoring ingress and backups
- Running a complex stack with no retention or alert review policy
- Depending on dashboards alone for outage detection
- Forgetting to monitor the monitoring components themselves
## References
- [Prometheus overview](https://prometheus.io/docs/introduction/overview/)
- [Prometheus Alertmanager overview](https://prometheus.io/docs/alerting/latest/overview/)
- [Prometheus `node_exporter`](https://github.com/prometheus/node_exporter)
- [Grafana documentation](https://grafana.com/docs/grafana/latest/)