Most developers work at a layer where hardware is invisible. You call read(), data appears. You write to a socket, bytes leave the machine. The abstraction is so good you never have to think about what’s underneath.
But at some point the abstraction leaks. You hit a performance problem you can’t explain. You debug a syscall that’s taking too long. You need to understand why Python cold starts are slow on Lambda. At that point, not knowing what’s underneath costs you.
This is my attempt to document the full picture — from the electrical signal a device sends to the CPU, to the function call your program makes. Let’s start at the bottom.
1. Device Controllers — The Middlemen
Every piece of hardware in your computer — your keyboard, your SSD, your GPU — speaks a different electrical language. The CPU doesn’t know how to talk to any of them directly. That’s where device controllers come in.
A device controller is a dedicated chip that sits between the CPU and the physical device. Think of it as a translator. The CPU says “read sector 42 from disk”, and the controller handles all the low-level electrical signaling to make that happen.
What’s inside a controller:
| Component | What it does |
|---|---|
| Control registers | Where the OS writes commands (e.g., “read sector 42”) |
| Status register | OS reads this to check if the operation finished |
| Local buffer | Small memory on the chip for temporary data storage |
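To make the three components concrete, here is a toy model in Python. It is only a sketch: the class name, the command code, and the status values are all made up for illustration, and a real controller exposes these as hardware registers, not Python attributes.

```python
from dataclasses import dataclass

# Hypothetical command and status codes, invented for this sketch.
READ_SECTOR = 0x01
STATUS_IDLE, STATUS_BUSY, STATUS_DONE = 0, 1, 2


@dataclass
class ToyDiskController:
    disk: dict                    # sector number -> bytes (stands in for the physical device)
    control: tuple = (0, 0)       # control register: (command, sector)
    status: int = STATUS_IDLE     # status register
    buffer: bytes = b""           # local buffer on the "chip"

    def write_control(self, command: int, sector: int) -> None:
        """What the OS/driver does: write a command into the control register."""
        self.control = (command, sector)
        self.status = STATUS_BUSY

    def tick(self) -> None:
        """One step of device work, done without any help from the CPU."""
        command, sector = self.control
        if self.status == STATUS_BUSY and command == READ_SECTOR:
            self.buffer = self.disk.get(sector, b"")
            self.status = STATUS_DONE


ctrl = ToyDiskController(disk={42: b"hello"})
ctrl.write_control(READ_SECTOR, 42)   # "read sector 42"
ctrl.tick()
print(ctrl.status == STATUS_DONE, ctrl.buffer)
```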
Drivers: the software layer
The driver is the software that translates generic OS requests into specific commands for a particular controller. This is why you install drivers when you plug in new hardware — the OS needs to know how to talk to that specific controller.
The full chain:
Application → OS (generic interface) → Driver → Controller → Physical Device
This layered approach is what lets the OS be generic. The OS doesn’t care whether you have a Samsung SSD or a Western Digital HDD: it issues the same generic read request, and the driver translates it for that specific device.
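The layering can be sketched in a few lines of Python. Everything here (BlockDriver, FakeSsdDriver, FakeHddDriver, os_read) is a hypothetical name I picked for the example; real drivers live in the kernel and look nothing like this, but the shape of the abstraction is the same.

```python
from abc import ABC, abstractmethod


class BlockDriver(ABC):
    """The generic interface the 'OS' codes against (hypothetical)."""

    @abstractmethod
    def read_sector(self, sector: int) -> bytes:
        ...


class FakeSsdDriver(BlockDriver):
    def read_sector(self, sector: int) -> bytes:
        # A real driver would write device-specific commands to controller registers here.
        return f"ssd:{sector}".encode()


class FakeHddDriver(BlockDriver):
    def read_sector(self, sector: int) -> bytes:
        return f"hdd:{sector}".encode()


def os_read(driver: BlockDriver, sector: int) -> bytes:
    """The 'OS' side: the same call regardless of which device sits underneath."""
    return driver.read_sector(sector)


for drv in (FakeSsdDriver(), FakeHddDriver()):
    print(os_read(drv, 42))
```

Swapping the device means swapping the driver object; os_read never changes.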
2. Hardware Interrupts — How Devices Talk Back
Here’s a fundamental problem: the CPU tells a disk to read some data, but the disk is slow. What does the CPU do while it waits?
Polling: the bad old days
One approach is polling — the CPU keeps asking “are you done yet?” in a loop. This works, but it’s like standing next to your microwave staring at it. You’re technically “busy” but doing nothing useful.
Interrupts: the modern approach
With interrupts, the device sends an electrical signal to the CPU when it’s done. The CPU is free to do other work in the meantime. It’s like putting food in the microwave and going to do laundry — the beep tells you when it’s ready.
This is crucial: interrupts are what made multiprogramming possible. Without them, the CPU would be stuck babysitting one device at a time.
| Aspect | Polling | Interrupts |
|---|---|---|
| Who initiates | CPU asks periodically | Device signals when ready |
| CPU usage | 100% occupied checking | Free for other work |
| Multiprogramming | Makes it impractical | Makes it possible |
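Here is a deliberately simplified simulation of the difference, assuming time advances in discrete ticks and the device needs a fixed number of ticks to finish. The function names and the tick model are my own invention; the point is only to show where the CPU's time goes in each approach.

```python
def run_polling(device_ticks: int) -> dict:
    """CPU spends every tick asking 'are you done yet?'."""
    status_checks = 0
    remaining = device_ticks
    while remaining > 0:
        status_checks += 1   # read the status register again
        remaining -= 1       # the device makes progress in the meantime
    return {"status_checks": status_checks, "useful_work": 0}


def run_with_interrupt(device_ticks: int) -> dict:
    """CPU does other work; the device notifies it when finished."""
    useful_work = 0
    interrupted = False

    def on_device_done() -> None:
        nonlocal interrupted
        interrupted = True

    for tick in range(1, device_ticks + 1):
        useful_work += 1             # CPU runs some other process
        if tick == device_ticks:     # device finishes and raises its interrupt
            on_device_done()
    return {"status_checks": 0, "useful_work": useful_work, "interrupted": interrupted}


print(run_polling(5))
print(run_with_interrupt(5))
```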
The interrupt vector
Each device is assigned an interrupt number (IRQ), though some buses let several devices share one. The kernel keeps a table in memory, the interrupt vector (on x86, the Interrupt Descriptor Table), that maps each interrupt to its handler function. When an interrupt fires, the CPU looks up the handler and jumps to it.
IRQ 1 → keyboard handler | IRQ 12 → touchpad handler | IRQ 14 → disk handler
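Conceptually the interrupt vector is just a lookup table from number to function. A minimal sketch using the three IRQs above, with handler bodies that are placeholders of my own:

```python
def keyboard_handler() -> str:
    return "keyboard: read scancode"


def touchpad_handler() -> str:
    return "touchpad: read movement"


def disk_handler() -> str:
    return "disk: transfer complete"


# IRQ number -> handler function, mirroring the mapping in the text.
INTERRUPT_VECTOR = {
    1: keyboard_handler,
    12: touchpad_handler,
    14: disk_handler,
}


def dispatch(irq: int) -> str:
    """Look up the handler for an IRQ and jump to it (here: call it)."""
    handler = INTERRUPT_VECTOR.get(irq)
    if handler is None:
        return f"unhandled interrupt {irq}"
    return handler()


print(dispatch(14))
```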
Full lifecycle of an interrupt
- The driver tells the controller to start an I/O operation.
- The controller does the work independently from the CPU.
- When done, the controller fires an interrupt signal (IRQ).
- The CPU detects it between instructions and saves its current state.
- The CPU looks up the handler in the interrupt vector and jumps to it.
- The handler processes the data and signals completion.
- The CPU restores its state and goes back to what it was doing.
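The save, handle, restore part of that lifecycle can be sketched as a toy CPU loop. This is an illustration under heavy assumptions: the "program" is a list of strings, the only saved state is a program counter, and run_cpu is a name I made up.

```python
def run_cpu(program: list, interrupt_at: int, handler) -> list:
    """Execute instructions, checking for a pending interrupt between them."""
    log = []
    pc = 0
    while pc < len(program):
        log.append(f"exec {program[pc]}")
        pc += 1
        if pc == interrupt_at:        # interrupt noticed between two instructions
            saved_pc = pc             # save current state
            log.append("save state")
            log.append(handler())     # jump to the handler
            pc = saved_pc             # restore state and carry on
            log.append("restore state")
    return log


trace = run_cpu(["load", "add", "store"], interrupt_at=2,
                handler=lambda: "handler: disk data ready")
print("\n".join(trace))
```

The interrupted program never notices: "store" runs exactly as it would have without the interrupt.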
DMA — Direct Memory Access
DMA takes this further. Instead of the CPU copying data byte-by-byte from the controller to RAM, the DMA controller does it directly. The CPU only gets interrupted once — after the entire transfer is done.
Disk → Controller buffer → DMA copies to RAM → Interrupt → CPU reads from RAM
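To see what DMA saves, compare how much copying work lands on the CPU in each case. This is a counting sketch, not real I/O: both function names are hypothetical, and the "DMA engine" is just a slice assignment standing in for hardware.

```python
def copy_by_cpu(data: bytes, ram: bytearray) -> dict:
    """Without DMA: the CPU moves every byte itself."""
    cpu_copy_steps = 0
    for i, byte in enumerate(data):
        ram[i] = byte
        cpu_copy_steps += 1
    return {"cpu_copy_steps": cpu_copy_steps}


def copy_by_dma(data: bytes, ram: bytearray) -> dict:
    """With DMA: the transfer happens without the CPU, then one interrupt fires."""
    ram[:len(data)] = data            # stands in for the DMA engine's work
    return {"cpu_copy_steps": 0, "interrupts": 1}


sector = bytes(512)                   # one 512-byte sector of zeros
print(copy_by_cpu(sector, bytearray(512)))
print(copy_by_dma(sector, bytearray(512)))
```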
3. System Calls — Asking the Kernel for Help
Your program runs in user mode. It can’t touch hardware, can’t access other processes’ memory, can’t do anything dangerous. When it needs to do something privileged — read a file, allocate memory, create a process — it has to ask the kernel. That request is a syscall.
Syscalls vs. interrupts — don’t confuse them
Both result in the CPU switching to kernel mode, but they’re fundamentally different:
| Aspect | Syscall (TRAP) | Hardware Interrupt |
|---|---|---|
| Origin | Software — your program asks | Hardware — device notifies |
| Timing | Synchronous — specific instruction | Asynchronous — any moment |
| Direction | Inside out (program → kernel) | Outside in (device → kernel) |
| Examples | read(), write(), fork() | Disk finished, key pressed |
The TRAP operation
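Every syscall uses a TRAP instruction to switch from user mode to kernel mode (on x86-64 that instruction is syscall; older 32-bit Linux used int 0x80). The boundary is enforced by hardware: on x86 processors the current privilege level is held in the low two bits of the CS segment register, and a program in user mode cannot raise it on its own. If user code tries to execute a privileged instruction, the CPU raises an exception (a general protection fault on x86). The security boundary is in silicon, not software.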
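You rarely write the trap instruction yourself; the C library wraps it for you. A small sketch from Python, assuming a Linux or macOS system where ctypes.CDLL(None) gives a handle to the C library already loaded in the process:

```python
import ctypes
import os

# Assumption: on Linux/macOS, CDLL(None) exposes symbols of the already-loaded libc.
libc = ctypes.CDLL(None)

# libc's getpid() wrapper is the code that asks the kernel on our behalf.
pid_via_libc = libc.getpid()

# Python's os.getpid() ends up asking the same kernel the same question.
print(pid_via_libc == os.getpid())
```

Neither call flips the CPU into kernel mode "by hand". Both go through the one door the hardware provides.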
What a read() actually looks like under the hood
When your C program calls read(), here’s the full sequence:
- read() triggers a TRAP → user mode switches to kernel mode
- Kernel calls the disk driver → driver writes to controller registers
- Process goes to SLEEP state → CPU is free for other processes
- Controller reads the data → DMA copies it to RAM
- Controller fires an interrupt → CPU runs the handler
- Kernel wakes up the process → data is now available in RAM
- read() returns to the program → back to user mode
Notice how both a syscall AND an interrupt happen for a single read(). The syscall is the request, the interrupt is the completion notification. They’re complementary.
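You can watch the "process sleeps until the data arrives" part from user space. The sketch below uses a pipe instead of a disk, with a background thread playing the slow device; that substitution is mine, chosen so the example runs anywhere, but the blocking behavior of read() is the same idea.

```python
import os
import threading
import time


def blocking_read_demo(delay: float = 0.2):
    """Return (data, seconds_waited) for a read() that has to wait for its data."""
    read_fd, write_fd = os.pipe()

    def slow_device() -> None:
        time.sleep(delay)                 # the "device" takes a while
        os.write(write_fd, b"sector 42")  # data finally arrives
        os.close(write_fd)

    worker = threading.Thread(target=slow_device)
    start = time.monotonic()
    worker.start()
    data = os.read(read_fd, 1024)         # blocks: the kernel parks us until data exists
    waited = time.monotonic() - start
    worker.join()
    os.close(read_fd)
    return data, waited


data, waited = blocking_read_demo()
print(data, f"waited {waited:.2f}s")
```

While the caller is parked inside os.read(), it uses no CPU; the kernel wakes it when there is something to return.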
4. strace — Seeing Syscalls in Real Time
strace is a Linux tool that intercepts every syscall a process makes. It uses ptrace internally — you just prepend strace to your command and it records everything.
strace ./program # intercept all syscalls
strace -c ./program # show summary statistics
strace -e trace=read,write cat file.txt # filter specific syscalls
strace -p 1234 # attach to running process
Real comparison: echo vs cat vs ps
| Program | Syscalls | Most used | Notes |
|---|---|---|---|
| echo "Hello" | ~110 | openat (30x) | Locale setup dominates |
| cat /etc/hostname | ~120 | openat (31x) | Simple file read |
| ps | ~1562 | openat (516x) | Reads /proc for every process |
Every program follows the same boot pattern:
execve → mmap (libc) → arch_prctl → locale files → [actual work] → exit_group
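If you save a trace to a file, counting which syscalls dominate takes a few lines of Python. The sample lines below are illustrative, written by hand in the shape strace prints (name, arguments, return value); they are not captured output, and count_syscalls is a helper name I made up.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a trace line, e.g. "openat(".
SYSCALL_NAME = re.compile(r"^(\w+)\(")

# Hand-written sample in strace's line format (not real captured output).
SAMPLE_TRACE = [
    r'execve("/usr/bin/cat", ["cat", "/etc/hostname"], 0x7ffd0000) = 0',
    r'openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    r'openat(AT_FDCWD, "/etc/hostname", O_RDONLY) = 3',
    r'read(3, "myhost\n", 131072) = 7',
    r'write(1, "myhost\n", 7) = 7',
    r'exit_group(0) = ?',
]


def count_syscalls(lines) -> Counter:
    """Count how many times each syscall name appears in a list of trace lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_NAME.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


counts = count_syscalls(SAMPLE_TRACE)
print(sum(counts.values()), counts.most_common(1))
```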
C vs. Python — the cost of abstraction
Python 3.12 generates 16.5x more syscalls than C for the same “Hello World” output. That’s 562 vs 34 syscalls. The actual work — the write() that prints to screen — is exactly 1 syscall in both cases. Everything else is infrastructure: loading the interpreter, initializing the runtime, importing stdlib modules.
In both cases, the output is produced by exactly the same syscall:
write(1, "Hello World\n", 12) = 12
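You can issue that same call yourself, skipping print() and its buffering. A minimal sketch using Python's os.write, which is a thin wrapper over the write syscall:

```python
import os

message = b"Hello World\n"

# fd 1 is standard output; the return value is the number of bytes written.
written = os.write(1, message)

print(written == len(message))
```

The return value here matches the "= 12" at the end of the strace line: twelve bytes in, twelve bytes written.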
This matters in practice. If you’re running serverless functions where cold start time is critical, those extra 528 syscalls add up.
5. POSIX — The Contract Between Programs and the OS
POSIX (Portable Operating System Interface) is an IEEE standard that defines how programs interact with Unix-like operating systems. It’s basically the ISA of operating systems.
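Just like the x86 ISA guarantees that a binary compiled for x86 runs on any x86 CPU (Intel, AMD, doesn’t matter), POSIX aims for the same thing one level up, at the source level: code written against POSIX APIs should compile and run on any POSIX-conforming OS. macOS is formally certified; Linux and FreeBSD are not, but they follow the standard closely enough that the distinction rarely matters in practice.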
| Aspect | ISA (e.g., x86) | POSIX |
|---|---|---|
| Defines | CPU instructions | OS APIs (C functions, shell, utilities) |
| Guarantees | Binary runs on any x86 | Source compiles and runs on any POSIX OS |
| Who implements | Chip manufacturer | OS developer |
| Level | Hardware | Software (kernel) |
The key point: POSIX specifies what must exist and how it should behave, but not how to implement it internally. Linux and macOS implement open() completely differently under the hood, but code calling it sees the same interface and the same documented behavior on both.
The scope covers: system interfaces exposed as C functions (open, read, fork), much of the C standard library (printf, malloc), threads (pthreads), the shell and utilities (ls, cat, echo), environment variables (PATH, HOME), and basic file and directory conventions.
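Python's os module is a convenient way to poke at this contract, since many of its functions map directly onto POSIX calls of the same name. A small sketch, assuming a POSIX-like system (posix_roundtrip is a name I made up):

```python
import os
import tempfile


def posix_roundtrip(payload: bytes) -> bytes:
    """Write bytes to a temp file and read them back using only fd-level calls."""
    fd, path = tempfile.mkstemp()          # returns an already-open file descriptor
    try:
        os.write(fd, payload)              # write(2)
        os.lseek(fd, 0, os.SEEK_SET)       # lseek(2): rewind to the start
        return os.read(fd, len(payload))   # read(2)
    finally:
        os.close(fd)                       # close(2)
        os.unlink(path)                    # unlink(2): remove the file


print(posix_roundtrip(b"same code, different kernels"))
```

The same script should behave identically on Linux, macOS, and FreeBSD, even though the kernels underneath share almost no code.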
In the next post, I get into Unix processes specifically — fork(), exec(), wait(), zombies, orphans, and how all of this fits together into the lifecycle of every process running on your machine.