first version of the knowledge base :)

30 - Systems/homelab/backup-architecture.md

---
title: Backup Architecture
description: Reference backup architecture for self-hosted services, data, and infrastructure components
tags:
- backup
- architecture
- self-hosting
category: systems
created: 2026-03-14
updated: 2026-03-14
---

# Backup Architecture

## Summary

A backup architecture defines what is protected, where copies live, and how recovery is validated. In self-hosted environments, the architecture must account for application data, infrastructure configuration, and the operational steps needed to restore service safely.

## Why it matters

Many backup failures are architectural rather than tool-specific. Storing copies on the wrong system, skipping configuration, or never testing restores can make an otherwise successful backup job useless during an incident.

## Core concepts

- Multiple copies across different failure domains
- Separation of live storage, backup storage, and off-site retention
- Consistent backups for databases and stateful services
- Restore validation as part of the architecture

## Practical usage

A practical backup architecture usually includes:

- Host or VM backups for infrastructure nodes
- File or repository backups for application data
- Separate backup of configuration, Compose files, and DNS or proxy settings
- Off-site encrypted copy of critical repositories

Example model:

```text
Primary workloads     -> Local backup repository      -> Off-site encrypted copy
Infrastructure config -> Git + encrypted secret store -> Off-site mirror
```

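The copy-count and placement rules above are often summarized as the 3-2-1 guideline: at least three copies, on two different media, with one off-site. A minimal sketch of checking a copy inventory against it — the `Copy` record and its field names are illustrative, not any real tool's schema:

```python
from dataclasses import dataclass

@dataclass
class Copy:
    """One stored copy of a dataset (hypothetical inventory record)."""
    location: str   # e.g. "nas-01", "backup-01", "cloud-bucket"
    medium: str     # e.g. "zfs", "restic-repo", "object-storage"
    offsite: bool

def check_3_2_1(copies: list[Copy]) -> list[str]:
    """Return violations of the 3-2-1 guideline for one dataset."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, want 3")
    if len({c.medium for c in copies}) < 2:
        problems.append("all copies share one storage medium")
    if not any(c.offsite for c in copies):
        problems.append("no off-site copy")
    return problems
```

A check like this fits naturally into a periodic review job: the architecture's copy rules become testable rather than tribal knowledge.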
## Best practices

- Back up both data and the metadata needed to use it
- Keep at least one copy outside the main site or storage domain
- Use backup tooling that supports verification and restore inspection
- Make restore order and dependency assumptions explicit

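The last practice lends itself to code: restore order can be expressed as a dependency graph and sorted topologically, so the assumptions are written down and checkable. A sketch using Python's standard `graphlib`, with hypothetical service names:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical restore dependencies: each service lists what must be
# restored and healthy before it can come back.
restore_deps = {
    "dns":           [],
    "secrets":       ["dns"],
    "reverse-proxy": ["dns"],
    "database":      ["secrets"],
    "app":           ["database", "reverse-proxy"],
}

# static_order() yields services with dependencies first; it raises
# CycleError if someone documents a circular dependency.
restore_order = list(TopologicalSorter(restore_deps).static_order())
```

Keeping this graph next to the backup documentation makes restore drills repeatable: the drill follows the computed order instead of memory.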
## Pitfalls

- Treating snapshots as the only backup mechanism
- Backing up encrypted data without preserving key recovery paths
- Assuming application consistency without database-aware handling
- Skipping restore drills for high-value services

## References

- [restic documentation](https://restic.readthedocs.io/en/latest/)
- [BorgBackup documentation](https://borgbackup.readthedocs.io/en/stable/)
- [Proxmox VE Backup and Restore](https://pve.proxmox.com/pve-docs/chapter-vzdump.html)


30 - Systems/homelab/homelab-architecture.md

---
title: Homelab Architecture
description: Reference architecture for building a maintainable homelab with clear trust zones and operational boundaries
tags:
- homelab
- architecture
- infrastructure
category: systems
created: 2026-03-14
updated: 2026-03-14
---

# Homelab Architecture

## Introduction

A homelab architecture should make experimentation possible without turning the environment into an undocumented collection of one-off systems. The most effective designs separate compute, networking, storage, identity, and operations concerns so each layer can evolve without breaking everything above it.

## Purpose

This document describes a reusable architecture for:

- Self-hosted services
- Virtualization and container workloads
- Secure remote access
- Monitoring, backup, and update workflows

## Architecture Overview

A practical homelab can be viewed as layered infrastructure:

```text
Edge and Access
  -> ISP/router, firewall, VPN, reverse proxy

Network Segmentation
  -> management, servers, clients, IoT, guest

Compute
  -> Proxmox nodes, VMs, container hosts

Platform Services
  -> DNS, reverse proxy, identity, secrets, service discovery

Application Services
  -> dashboards, git forge, media, automation, monitoring

Data Protection
  -> backups, snapshots, off-site copy, restore testing
```

## Recommended Building Blocks

### Access and identity

- VPN or zero-trust access layer for administrative entry
- SSH with keys only for infrastructure access
- DNS as the primary naming system for internal services

### Compute

- Proxmox for VM and LXC orchestration
- Dedicated container hosts for Docker or another runtime
- Utility VMs for DNS, reverse proxy, monitoring, and automation

### Storage

- Fast local storage for active workloads
- Separate backup target with different failure characteristics
- Clear distinction between snapshots and real backups

### Observability and operations

- Metrics collection with Prometheus-compatible exporters
- Dashboards and alerting through Grafana and Alertmanager
- Centralized backup jobs and restore validation
- Controlled update workflow for host OS, containers, and dependencies

## Example Layout

```text
VLAN 10  Management  hypervisors, switches, storage admin
VLAN 20  Servers     reverse proxy, app VMs, databases
VLAN 30  Clients     desktops, laptops, admin workstations
VLAN 40  IoT         cameras, smart home, media devices
VLAN 50  Guest       internet-only devices
```

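One way to keep such a layout consistent is a simple addressing convention, for example mapping VLAN N to `10.0.N.0/24` — an assumption for illustration, not a requirement of the design. Python's standard `ipaddress` module can then classify any host by zone:

```python
import ipaddress

# Illustrative convention: VLAN N maps to 10.0.N.0/24.
VLANS = {10: "Management", 20: "Servers", 30: "Clients", 40: "IoT", 50: "Guest"}

def subnet_for(vlan_id: int) -> ipaddress.IPv4Network:
    """Derive the subnet for a VLAN under the assumed convention."""
    return ipaddress.ip_network(f"10.0.{vlan_id}.0/24")

def zone_of(ip: str) -> str:
    """Classify an address by the VLAN subnet that contains it."""
    addr = ipaddress.ip_address(ip)
    for vlan_id, name in VLANS.items():
        if addr in subnet_for(vlan_id):
            return name
    return "unknown"
```

A convention like this keeps firewall rules, DNS records, and documentation derivable from the VLAN number instead of maintained by hand.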
Service placement example:

- Reverse proxy and DNS on small utility VMs
- Stateful applications in dedicated VMs or clearly documented persistent containers
- Monitoring and backup services isolated from guest and IoT traffic

## Configuration Example

Example inventory model:

```yaml
edge:
  router: gateway-01
  vpn: tailscale

compute:
  proxmox:
    - pve-01
    - pve-02
    - pve-03
  docker_hosts:
    - docker-01

platform:
  dns:
    - dns-01
  reverse_proxy:
    - proxy-01
  monitoring:
    - mon-01
  backup:
    - backup-01
```

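An inventory like this is also a natural input for automated checks. A hedged sketch, operating on the parsed structure as a plain dict (e.g. the output of `yaml.safe_load`) and enforcing two illustrative rules: every platform role has at least one host, and no hostname is reused:

```python
def validate_inventory(inv: dict) -> list[str]:
    """Check a parsed inventory for empty platform roles and
    duplicate hostnames. Rules are illustrative, not exhaustive."""
    errors = []
    hosts = list(inv.get("compute", {}).get("proxmox", []))
    hosts += inv.get("compute", {}).get("docker_hosts", [])
    for role, members in inv.get("platform", {}).items():
        if not members:
            errors.append(f"platform role '{role}' has no host")
        hosts += members
    seen = set()
    for h in hosts:
        if h in seen:
            errors.append(f"duplicate hostname: {h}")
        seen.add(h)
    return errors

# Mirrors the YAML example above as a Python literal.
inventory = {
    "compute": {"proxmox": ["pve-01", "pve-02", "pve-03"],
                "docker_hosts": ["docker-01"]},
    "platform": {"dns": ["dns-01"], "reverse_proxy": ["proxy-01"],
                 "monitoring": ["mon-01"], "backup": ["backup-01"]},
}
```

Running such a check in CI keeps the inventory honest as the lab grows.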
## Troubleshooting Tips

### Services are easy to deploy but hard to operate

- Add inventory, ownership, and restore notes
- Separate platform services from experimental application stacks
- Avoid hiding critical dependencies inside one large Compose file

### Changes in one area break unrelated systems

- Recheck network boundaries and shared credentials
- Remove unnecessary coupling between storage, reverse proxy, and app hosts
- Keep DNS, secrets, and backup dependencies explicit

### Remote access becomes risky over time

- Review which services are internet-exposed
- Prefer tailnet-only or VPN-only admin paths
- Keep management interfaces off user-facing networks

## Best Practices

- Design around failure domains, not only convenience
- Keep a small number of core platform services well documented
- Prefer simple, replaceable building blocks over fragile all-in-one stacks
- Maintain an asset inventory with hostnames, roles, and backup coverage
- Test recovery paths for DNS, identity, and backup infrastructure first

## References

- [Proxmox VE Administration Guide: Cluster Manager](https://pve.proxmox.com/pve-docs/chapter-pvecm.html)
- [Docker: Docker overview](https://docs.docker.com/get-started/docker-overview/)
- [Tailscale: What is Tailscale?](https://tailscale.com/kb/1151/what-is-tailscale)
- [Prometheus](https://prometheus.io/)


30 - Systems/homelab/homelab-network-architecture.md

---
title: Homelab Network Architecture
description: Reference network architecture for a segmented homelab with private access and clear service boundaries
tags:
- homelab
- networking
- architecture
category: systems
created: 2026-03-14
updated: 2026-03-14
---

# Homelab Network Architecture

## Summary

A homelab network architecture should separate trust zones, keep administrative paths private, and make service traffic easy to reason about. The goal is not enterprise complexity, but a structure that reduces blast radius and operational confusion.

## Why it matters

Flat networks are easy to start with and difficult to secure later. A basic segmented design helps isolate management, servers, clients, guest devices, and less trusted endpoints such as IoT hardware.

## Core concepts

- Segmentation by trust and function
- Routed inter-VLAN policy instead of unrestricted layer-2 reachability
- Separate administrative access paths from public ingress
- DNS and reverse proxy as shared network-facing platform services

## Practical usage

Example logical layout:

```text
Management  -> hypervisors, switches, storage admin
Servers     -> applications, databases, utility VMs
Clients     -> workstations and laptops
IoT         -> low-trust devices
Guest       -> internet-only access
VPN overlay -> remote access for administrators and approved services
```

This model works well with:

- A firewall or router handling inter-segment policy
- Private access through Tailscale or another VPN
- Reverse proxy entry points for published applications

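The inter-segment policy itself can be kept explicit as a default-deny allowlist. A minimal sketch with illustrative zone names and rules — the real matrix lives in the firewall; this just documents the intended policy in a testable form:

```python
# Default-deny inter-segment policy as an explicit allowlist.
# Zone pairs are (source, destination); rules are illustrative.
ALLOWED = {
    ("Clients",    "Servers"),   # users reach published apps
    ("Management", "Servers"),   # admin access to the server segment
    ("Management", "IoT"),       # admin access to the device segment
    ("Servers",    "Servers"),   # east-west within the server zone
}

def is_allowed(src: str, dst: str) -> bool:
    """Return True only for explicitly allowed zone pairs."""
    if src == dst:
        return True  # intra-segment traffic is not routed anyway
    return (src, dst) in ALLOWED
```

Keeping the matrix in version control makes "who can reach what" reviewable, and drift between intent and firewall configuration easier to spot.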
## Best practices

- Keep management services on a dedicated segment
- Use DNS names and documented routes instead of ad hoc host entries
- Limit which segments can reach storage, backup, and admin systems
- Treat guest and IoT networks as untrusted

## Pitfalls

- Publishing management interfaces through the same path as public apps
- Allowing lateral access between all segments for convenience
- Forgetting to document routing and firewall dependencies
- Relying on multicast-based discovery across routed segments without a plan

## References

- [RFC 1918: Address Allocation for Private Internets](https://www.rfc-editor.org/rfc/rfc1918)
- [RFC 4193: Unique Local IPv6 Unicast Addresses](https://www.rfc-editor.org/rfc/rfc4193)
- [Tailscale: Subnet routers](https://tailscale.com/kb/1019/subnets)


30 - Systems/homelab/identity-management-patterns.md

---
title: Identity Management Patterns
description: System-level identity management patterns for self-hosted and homelab environments
tags:
- identity
- authentication
- architecture
category: systems
created: 2026-03-14
updated: 2026-03-14
---

# Identity Management Patterns

## Summary

Identity management patterns describe how users, devices, and services are authenticated and governed across a self-hosted environment. Strong patterns reduce credential sprawl and make account lifecycle management more consistent.

## Why it matters

As services multiply, local account management becomes a source of weak passwords, missed offboarding, and inconsistent MFA coverage. A system-level identity pattern helps centralize trust while preserving operational fallback paths.

## Core concepts

- Central identity provider for users
- Federated login to applications through OIDC or SAML
- Strong admin authentication for infrastructure access
- Separate handling for service accounts and machine credentials

## Practical usage

A practical identity pattern often looks like:

```text
Users    -> Identity provider -> Web applications
Admins   -> VPN + SSH key or hardware-backed credential -> Infrastructure
Services -> Scoped machine credentials -> Databases and APIs
```

Supporting services may include:

- MFA-capable identity provider
- Reverse proxy integration for auth-aware routing
- Secrets management for service credentials

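The "scoped machine credentials" leg can be made concrete: each credential is bound to one consuming service, one audience, and a narrow set of actions, and any use outside those bounds is rejected. A sketch with hypothetical names — a real deployment would issue tokens from the secrets manager or identity provider rather than in-memory records:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceCredential:
    """Illustrative machine credential bound to one consumer and purpose."""
    service: str         # which service holds it, e.g. "grafana"
    audience: str        # which system accepts it, e.g. "postgres-01"
    scopes: frozenset    # narrow set of permitted actions

def authorize(cred: ServiceCredential, target: str, action: str) -> bool:
    """Reject any use outside the credential's audience and scopes."""
    return cred.audience == target and action in cred.scopes

# A read-only credential for one dashboard-to-database path.
grafana_db = ServiceCredential("grafana", "postgres-01", frozenset({"read"}))
```

The design point is that a leaked credential compromises one path, not every system that happens to share a password.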
## Best practices

- Centralize user login where applications support it
- Require MFA for administrative and internet-exposed access
- Keep service credentials scoped to one system or purpose
- Maintain documented break-glass and recovery procedures

## Pitfalls

- Treating shared admin accounts as acceptable long-term practice
- Leaving old local users in place after federation is introduced
- Using one service credential across many applications
- Forgetting to protect the identity provider as critical infrastructure

## References

- [OpenID Connect Core 1.0](https://openid.net/specs/openid-connect-core-1_0.html)
- [NIST Digital Identity Guidelines](https://pages.nist.gov/800-63-3/)
- [Yubico developer documentation](https://developers.yubico.com/)


30 - Systems/observability/monitoring-stack-architecture.md

---
title: Monitoring Stack Architecture
description: Reference architecture for a monitoring stack in a self-hosted or homelab environment
tags:
- monitoring
- observability
- architecture
category: systems
created: 2026-03-14
updated: 2026-03-14
---

# Monitoring Stack Architecture
|
||||
|
||||
## Summary
|
||||
|
||||
A monitoring stack architecture defines how metrics, probes, dashboards, and alerts fit together. In self-hosted environments, the stack should stay small enough to operate but broad enough to cover infrastructure, ingress, and critical services.
|
||||
|
||||
## Why it matters
|
||||
|
||||
Monitoring that is bolted on late often misses the services operators actually depend on. A planned stack architecture makes it easier to understand where signals come from and how alerts reach the right people.
|
||||
|
||||
## Core concepts
|
||||
|
||||
- Collection: exporters and scrape targets
|
||||
- Storage and evaluation: Prometheus
|
||||
- Visualization: Grafana
|
||||
- Alert routing: Alertmanager
|
||||
- External validation: blackbox or equivalent endpoint checks
|
||||
|
||||
## Practical usage

Typical architecture:

```text
Hosts and services -> Exporters / probes -> Prometheus
Prometheus -> Grafana dashboards
Prometheus -> Alertmanager -> notification channel
```

Recommended coverage:

- Host metrics for compute and storage systems
- Endpoint checks for user-facing services
- Backup freshness and certificate expiry
- Platform services such as DNS, reverse proxy, and identity provider

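The backup-freshness and certificate-expiry items reduce to time comparisons that a small exporter or scheduled job can evaluate and publish to Prometheus. A minimal sketch — function names and thresholds are illustrative:

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_snapshot: datetime, max_age: timedelta,
             now: datetime) -> bool:
    """Flag a backup repo whose newest snapshot is older than max_age."""
    return now - last_snapshot > max_age

def cert_expiry_alert(not_after: datetime, now: datetime,
                      warn_days: int = 14) -> bool:
    """Warn when a certificate expires within warn_days."""
    return not_after - now < timedelta(days=warn_days)
```

Passing `now` explicitly keeps both checks trivially testable; in production the job would read snapshot timestamps and certificate `notAfter` dates from the systems being monitored.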
## Best practices

- Monitor the path users depend on, not only the host underneath it
- Keep the monitoring stack itself backed up and access controlled
- Alert on actionable failures rather than every threshold crossing
- Document ownership for critical alerts and dashboards

## Pitfalls

- Monitoring only CPU and memory while ignoring ingress and backups
- Running a complex stack with no retention or alert review policy
- Depending on dashboards alone for outage detection
- Forgetting to monitor the monitoring components themselves

## References

- [Prometheus overview](https://prometheus.io/docs/introduction/overview/)
- [Prometheus Alertmanager overview](https://prometheus.io/docs/alerting/latest/overview/)
- [Prometheus `node_exporter`](https://github.com/prometheus/node_exporter)
- [Grafana documentation](https://grafana.com/docs/grafana/latest/)