Day 1: Boot Process & Systemd
Understanding how a system goes from “Off” to a login prompt is critical for troubleshooting “Kernel Panics” and boot loops.
The Linux Boot Sequence
1. BIOS/UEFI & POST
The firmware is the first code to execute.
- POST (Power-On Self-Test): A diagnostic testing sequence that verifies hardware integrity (RAM, CPU, storage controllers).
- Handoff: Once the hardware is validated, the firmware searches for a bootable device and executes the Boot Loader, either from the Master Boot Record (MBR) on BIOS systems or from the EFI System Partition (ESP) on UEFI systems.
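On a running Linux system you can usually tell which of the two handoff paths was taken, because the kernel exposes /sys/firmware/efi only when it was booted through UEFI. The sketch below is a minimal illustration; `firmware_mode` and its optional root-prefix argument are hypothetical names introduced here, not part of any standard tool.

```shell
#!/bin/sh
# Report whether the running system was booted via UEFI or legacy BIOS.
# firmware_mode is a hypothetical helper. Its optional argument is a root
# prefix (empty by default, meaning the live /sys) so the check can be
# pointed at a test directory.
firmware_mode() {
    sysroot="${1:-}"
    if [ -d "$sysroot/sys/firmware/efi" ]; then
        echo "UEFI"
    else
        # No EFI directory: legacy BIOS boot, or /sys is not available.
        echo "BIOS"
    fi
}

echo "Firmware interface: $(firmware_mode)"
```

Knowing the mode tells you where to look when the boot loader is missing: the MBR on a BIOS system, the ESP on a UEFI one.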
2. Boot Loader (GRUB2 / LILO)
The bridge between firmware and the Operating System.
- Responsibilities:
- Locates the OS kernel on the disk.
- Loads the Kernel and the initramfs (Initial RAM Filesystem) into memory.
- Passes control to the Kernel.
- Note: GRUB2 is the de facto standard on modern distributions. LILO (LInux LOader) is no longer actively developed, but Slackware still ships it.
3. Kernel & initramfs
The heart of the operating system takes control.
- initramfs: A temporary, memory-resident root filesystem. It contains the essential drivers and scripts needed to mount the real root filesystem (especially if the real root is on LVM, RAID, or encrypted partitions).
- Kernel Actions: Initializes hardware, loads device drivers/modules, and switches from the temporary initramfs to the actual root filesystem.
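The device that holds the real root filesystem is typically named by the `root=` parameter on the kernel command line, which a running system exposes at /proc/cmdline. The following sketch extracts that value. `root_param` is a hypothetical helper, and the sample command line passed to it is made up for illustration.

```shell
#!/bin/sh
# Print the value of root= from a kernel command line.
# root_param is a hypothetical helper; it reads the command line from its
# first argument and falls back to /proc/cmdline when none is given.
root_param() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null)}"
    # Deliberately unquoted: split the command line into words.
    for word in $cmdline; do
        case "$word" in
            root=*) echo "${word#root=}" ;;
        esac
    done
}

# Sample command line (illustrative values only).
root_param "BOOT_IMAGE=/vmlinuz root=/dev/mapper/vg0-root ro quiet"
```

If the value points at an LVM volume, a RAID device, or an encrypted container, the initramfs must contain the matching drivers, which is a common cause of panics at this stage.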
4. The Init System (PID 1)
The kernel starts exactly one user-space process: the init system.
- Role: As PID 1, it is the ancestor of all other processes. It manages the lifecycle of services, daemons, and hardware states.
- The systemd Standard: Almost all major distributions (Debian, RHEL, Ubuntu) use systemd. It uses “Unit files” to manage services in parallel, significantly speeding up boot times compared to the old SysVinit (which started services sequentially).
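To confirm which init system a machine is actually running, you can read the command name of PID 1 from /proc. This is a minimal sketch; `init_name` and its optional path argument are hypothetical names used only for this example.

```shell
#!/bin/sh
# Show which program is running as PID 1.
# /proc/1/comm holds the command name of PID 1. init_name is a hypothetical
# helper whose optional argument overrides the file that is read.
init_name() {
    comm_file="${1:-/proc/1/comm}"
    if [ -r "$comm_file" ]; then
        cat "$comm_file"
    else
        echo "unknown"
    fi
}

echo "PID 1 is: $(init_name)"
```

On a systemd-based distribution this typically prints `systemd`; inside a container PID 1 is often the application or a shell instead.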
The “No-systemd” Perspective
While efficient, systemd is often criticized for violating the Unix Philosophy (“Do one thing and do it well”).
- Critique: It is often viewed as “bloated” because it handles logging (journald), network management (networkd), and more, creating a single point of failure.
- Alternatives: SREs and enthusiasts who prefer minimalist or modular init systems often look to projects like nosystemd.org or distributions like Alpine (OpenRC), Void (runit), or Devuan (SysVinit/OpenRC).
- https://nosystemd.org/
SRE Cheat Sheet: Boot Troubleshooting
| If it fails at… | Look at… |
|---|---|
| POST | Hardware/Motherboard (Listen for “Beep codes”). |
| Boot Loader | GRUB configuration or missing EFI partition. |
| Kernel/initramfs | Missing drivers or corrupted initrd image (Kernel Panic). |
| Init System | Failed service dependencies or corrupted .service files. |
Use dmesg to view kernel-level messages, or journalctl -b to view logs from the current boot cycle if the system reached the Init stage.
Core Resources
- Deep Dives: Linux Booting Process (Wikipedia) & Initial Ramdisk (initramfs)
- The Bible: How Linux Works: What Every Super-User Should Know by Brian Ward.
Video
- How Does Linux Boot Process Work?
- Linux Boot Process
- How Linux Boots
Systemd Mastery
systemd is the first process (PID 1) and the parent of all other processes.
In systemd, everything is managed as a Unit. Instead of running a script, systemd activates a unit based on its configuration file.
1. The Anatomy of Unit Types
While .service files are the most common, systemd manages nearly every aspect of the Linux ecosystem through specialized unit types.
| Unit Type | Purpose |
|---|---|
| .service | Starts and manages a background daemon or process. |
| .socket | For “Socket Activation”: starts a service only when traffic arrives on a port. |
| .target | A group of units (similar to “Runlevels”); used for boot synchronization. |
| .timer | Modern replacement for Cron; triggers a service based on time/events. |
| .mount / .automount | Manages file system mount points. |
| .path | Triggers a service when a specific file or directory is modified. |
| .swap | Manages swap memory partitions/files. |
| .slice | Cgroups (Control Groups); used for resource allocation (CPU/RAM limits). |
| .device | Represents hardware recognized by the kernel (via udev). |
| .scope / .snapshot | Manages externally created processes or saved system states. |
2. Essential systemctl Operations
The systemctl command is your primary interface for interacting with the init system.
Exploring Units
# List loaded units (active, failed, or with pending jobs) without truncating names
systemctl --full
# List only running services
systemctl list-units --type=service --state=running
Controlling Services
# Start/Stop/Restart
systemctl start [unit]
systemctl stop [unit]
systemctl restart [unit] # Full stop and start
# Configuration reload
systemctl reload [unit] # Asks the process to re-read its config WITHOUT stopping
3. Monitoring & Customization
Viewing Logs
systemd uses a binary logging system called the Journal.
# View logs specifically for one service
journalctl -u [unit]
# View logs for the current boot only
journalctl -u [unit] -b
Creating Your Own Units
As an administrator, you should never modify files in /lib/systemd/system/ (these are reserved for the package manager).
- Custom path: Place your units in /etc/systemd/system/.
- Workflow:
- Create /etc/systemd/system/my-app.service.
- Run sudo systemctl daemon-reload (tells systemd to look for new files).
- Run sudo systemctl enable --now my-app (starts it immediately and at boot).
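As a rough illustration of the workflow above, the script below drafts a minimal service unit in a temporary staging directory so it can be inspected before anything is installed. Only the file name my-app.service comes from the text. The helper `write_my_app_unit`, the Description, and the ExecStart path /usr/local/bin/my-app are placeholders assumed for this sketch.

```shell
#!/bin/sh
# Draft a minimal service unit in a staging directory.
# write_my_app_unit is a hypothetical helper; the ExecStart path and the
# Description are placeholders, not taken from a real application.
write_my_app_unit() {
    dir="$1"
    cat > "$dir/my-app.service" <<'EOF'
[Unit]
Description=My example application (placeholder)
After=network.target

[Service]
ExecStart=/usr/local/bin/my-app
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

staging="$(mktemp -d)"
write_my_app_unit "$staging"
echo "Drafted $staging/my-app.service"

# To install it for real (requires root):
#   sudo cp "$staging/my-app.service" /etc/systemd/system/
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now my-app
```

The [Install] section is what makes `enable` meaningful: WantedBy=multi-user.target asks systemd to pull the service in during a normal multi-user boot.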
SRE Knowledge: Cgroups & Slices
One of the most powerful features mentioned is cgroups. By using .slice units, systemd allows you to prevent a single service from taking down the entire machine.
Example Scenario:
If you have a web crawler service that tends to leak memory, you can place it in a .slice that limits it to 2 GB of RAM. If the slice exceeds that limit, the kernel's OOM killer terminates processes inside that cgroup only, keeping the rest of your system stable.
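One way the scenario above might be expressed is a slice unit carrying the memory cap plus a service assigned to it. This is a sketch under assumptions: the names crawler.slice, crawler.service, /usr/local/bin/crawler, and the helper `write_crawler_units` are hypothetical, and MemoryMax= assumes a host using the unified cgroup v2 hierarchy.

```shell
#!/bin/sh
# Draft a slice that caps memory at 2G and a service that runs inside it.
# All unit names and the crawler binary path are hypothetical examples.
write_crawler_units() {
    dir="$1"

    # The slice carries the resource limit (MemoryMax= needs cgroup v2).
    cat > "$dir/crawler.slice" <<'EOF'
[Unit]
Description=Memory-capped slice for the crawler (example)

[Slice]
MemoryMax=2G
EOF

    # The service opts into the slice with Slice=.
    cat > "$dir/crawler.service" <<'EOF'
[Unit]
Description=Web crawler (example)

[Service]
ExecStart=/usr/local/bin/crawler
Slice=crawler.slice
EOF
}

staging="$(mktemp -d)"
write_crawler_units "$staging"
echo "Drafted units in $staging"
```

After copying both files to /etc/systemd/system/ and running daemon-reload, `systemd-cgls` should show the service's processes nested under the slice.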
When you restart a service, the PID (Process ID) will change. When you reload, the PID typically stays the same. This is crucial to know when debugging PID-locked applications.
Learning Modules
- Unit Files: Intro and Unit Files
- Management: Systemctl Commands Guide
- Targets: Understanding Runlevels/Targets
- Dependencies: Unit Ordering & Dependencies
- Automation: Systemd Timers vs Cron
Quick Reference
Practical Lab (1 Hour)
Task 1: Boot Performance Analysis
Identify which services are slowing down your startup.
systemd-analyze blame
systemd-analyze critical-chain
Task 2: Target Manipulation
Use systemctl set-default multi-user.target to boot into a terminal-only mode, then switch back to graphical.target.
Task 3: Custom Service & Timer
- Create a dummy script at /usr/local/bin/dummy.sh.
- Write a unit file at /etc/systemd/system/dummy.service.
- Create a .timer file to replace a traditional cron job.
# Verify your new timer
systemctl list-timers --all
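If you want a starting point for Task 3, the sketch below drafts all three files in a staging directory. It is one possible shape rather than the only solution: the helper `write_dummy_units`, the script body, and the five-minute schedule are assumptions chosen for illustration (the schedule mirrors a `*/5 * * * *` cron entry).

```shell
#!/bin/sh
# Draft the script, service, and timer for Task 3 in a staging directory.
# write_dummy_units is a hypothetical helper; the schedule is an example.
write_dummy_units() {
    dir="$1"

    # The job itself: prints a timestamped line (captured by the journal).
    cat > "$dir/dummy.sh" <<'EOF'
#!/bin/sh
echo "dummy job ran at $(date)"
EOF
    chmod +x "$dir/dummy.sh"

    # A oneshot service: runs the script once and exits.
    cat > "$dir/dummy.service" <<'EOF'
[Unit]
Description=Dummy job (lab example)

[Service]
Type=oneshot
ExecStart=/usr/local/bin/dummy.sh
EOF

    # The timer activates the unit with the same name (dummy.service).
    cat > "$dir/dummy.timer" <<'EOF'
[Unit]
Description=Run dummy.service every 5 minutes (lab example)

[Timer]
OnCalendar=*:0/5
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

staging="$(mktemp -d)"
write_dummy_units "$staging"
echo "Drafted files in $staging"
```

To finish the task, copy dummy.sh to /usr/local/bin/ and the two unit files to /etc/systemd/system/, run `sudo systemctl daemon-reload`, and enable the timer (not the service) with `sudo systemctl enable --now dummy.timer`.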