Running bare metal means you inherit the full hardware surface: BMCs, RAID controllers, NIC firmware, and BIOS settings that were “adjusted once” five years ago. Treating that surface casually is how you get mystery reboots and slow disks in the middle of a release window.
IPMI and BMC hygiene
Out-of-band access is a lifeline when the OS is unhealthy, but it is also an attack surface. Segmented networks, strong credentials, audited access, and firmware updates belong in the same checklist as SSH hardening. Document how to reach each BMC interface (VLAN, subnet, and jump host) so on-call is not guessing during an outage.
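One way to keep that documentation honest is to store it as data and check it automatically. The sketch below is a minimal illustration, not a prescribed tool: the `BmcRecord` fields, the `10.20.0.0/24` management subnet, and the host and jump-host names are all hypothetical placeholders. It flags any BMC whose address sits outside the dedicated out-of-band network or that has no recorded access path.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical out-of-band subnet; substitute your own management network.
MGMT_NETWORK = ipaddress.ip_network("10.20.0.0/24")


@dataclass
class BmcRecord:
    host: str
    bmc_ip: str
    vlan: int
    jump_host: str  # documented path on-call uses to reach the BMC


def audit_bmc(records, mgmt_network=MGMT_NETWORK):
    """Return (host, problem) pairs for records that break the documented layout."""
    problems = []
    for rec in records:
        # A BMC outside the management subnet suggests missing segmentation.
        if ipaddress.ip_address(rec.bmc_ip) not in mgmt_network:
            problems.append((rec.host, f"BMC {rec.bmc_ip} is outside {mgmt_network}"))
        # An empty jump host means on-call has no documented way in.
        if not rec.jump_host:
            problems.append((rec.host, "no documented access path"))
    return problems


# Illustrative inventory; in practice this would come from your CMDB or a YAML file.
inventory = [
    BmcRecord("db01", "10.20.0.11", 120, "oob-jump01"),
    BmcRecord("db02", "192.168.1.50", 1, ""),
]

for host, problem in audit_bmc(inventory):
    print(f"{host}: {problem}")
```

Running a check like this in CI against the inventory file means a BMC that lands on the wrong network shows up in review rather than during an outage.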
RAID is a policy decision
Choosing between hardware RAID and software-defined storage is a trade-off among predictability, portability, and recovery tooling. Whatever you choose, automate health reporting and rehearse a disk replacement before a disk fails for real. Firmware baselines should ride alongside OS patch cycles, not “when someone remembers.”
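As a rough sketch of automated health reporting, the example below parses the text format of `/proc/mdstat` and reports arrays that are missing members. It assumes Linux md software RAID with conventionally named arrays (`md0`, `md1`, and so on); hardware RAID controllers need their vendor's tooling instead. The function name and the sample text are illustrative.

```python
import re

# Matches an md status fragment such as "[2/1] [U_]":
# devices the array should have / devices currently active.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected, active)} for arrays missing members."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = (expected, active)
            current = None  # status line consumed for this array
    return degraded


# Sample input in /proc/mdstat format; md1 has lost one of two mirrors.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      1048576 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0]
      976630464 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))  # prints {'md1': (2, 1)}
```

On a real host you would pass in the contents of `/proc/mdstat` and forward any non-empty result to your alerting system. Deliberately failing a member in a test array is a cheap way to confirm the alert fires before you depend on it.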