
The Contract Before the Code: What Is an Operating System, Really?

An operating system is a collection of invariants — promises the kernel makes to every program. This post names them.


Every program you’ve ever written ran on a lie. Your code believed it owned the CPU. It believed memory was private and infinite. It believed the disk would remember what it wrote. None of that was true. The operating system made it appear true — and spent extraordinary effort ensuring the illusion never cracked.

That’s what an OS is. Not a desktop. Not a file manager. Not a boot screen. An operating system is a collection of invariants — properties that must hold true no matter what else is happening on the machine. Every process gets its own CPU. Every address space is isolated. Every fsync makes data durable. These aren’t features. They’re promises. And the kernel’s entire job is keeping them.

This series is about those promises: what the kernel guarantees, how it enforces each guarantee, and what breaks when it fails.

```mermaid
graph TB
    subgraph Processes["User Processes"]
        direction TB
        P1["Process A"]
        P2["Process B"]
        P3["Process C"]
    end

    subgraph OS["Operating System"]
        direction LR
        V["Virtualization"]
        CO["Concurrency"]
        PE["Persistence"]
    end

    subgraph HW["Hardware"]
        CPU["CPU"]
        RAM["RAM"]
        DISK["Disk"]
    end

    P1 & P2 & P3 -->|system calls| OS
    V --> CPU
    CO --> RAM
    PE --> DISK

    style OS fill:#1a1a2e,stroke:#e94560,color:#fff
    style HW fill:#0f3460,stroke:#16213e,color:#fff
```

Three Pillars, Three Invariants

Operating systems are built on three pillars. Each one is an invariant in disguise.

Virtualization: The Illusion of Ownership

You have one CPU with four cores. Sixty processes want to run. The OS virtualizes the CPU — it gives each process a time slice, switches between them fast enough that every process believes it has a dedicated processor.

Same trick with memory. Your machine has 16 GB of physical RAM. But every process sees its own address space starting at 0x0, stretching to some large virtual limit. Two processes can both store data at “address 0x4000” and they’ll never collide — because neither one is talking about the same physical location.

```rust
// Conceptual sketch, not a runnable program: these are two *separate
// processes*, and dereferencing an arbitrary address like this would
// fault unless the page is actually mapped.

// Process A writes to "its" memory at address 0x4000
let ptr: *mut u8 = 0x4000 as *mut u8;
unsafe { *ptr = 42; }

// Process B does the same thing — same virtual address
let ptr: *mut u8 = 0x4000 as *mut u8;
unsafe { *ptr = 99; }

// A's value is still 42. B's is 99.
// Same address. Different physical memory.
// The OS made this happen.
```
```mermaid
graph LR
    subgraph Physical["Physical Reality"]
        CPU1["4 CPU Cores"]
        MEM1["16 GB RAM"]
    end

    subgraph Virtual["What Each Process Sees"]
        direction TB
        PA["Process A: full CPU + private memory"]
        PB["Process B: full CPU + private memory"]
        PC["Process C: full CPU + private memory"]
    end

    CPU1 -->|"time-slicing"| PA & PB & PC
    MEM1 -->|"page tables"| PA & PB & PC

    style Physical fill:#0f3460,stroke:#16213e,color:#fff
    style Virtual fill:#1a1a2e,stroke:#e94560,color:#fff
```

The virtualization invariant: every process believes it owns the entire machine. One CPU appears as many. One block of RAM appears as many isolated address spaces. The process cannot tell it’s sharing — and that’s the point.

When this invariant breaks, you get the kind of bug that ruins months of work. A process reads memory it doesn’t own. A stale CPU register leaks between context switches. The illusion cracks, and programs that worked for years start producing wrong answers with no explanation.

Concurrency: Coordination Without Corruption

The moment two threads share state, everything gets dangerous.

```rust
use std::thread;

fn main() {
    let mut counter = 0u64;

    // Two threads increment the same counter — without synchronization
    // This is a data race. Rust won't even let you compile this.
    // But in C? This compiles. Ships. Breaks at 3am.

    // Thread A: read counter (0), add 1, write (1)
    // Thread B: read counter (0), add 1, write (1)
    // Expected: 2. Actual: 1. A lost update.
}
```

Rust’s ownership system catches this at compile time — you can’t share a mutable reference across threads without synchronization. But the underlying problem is universal. Any system with concurrent access to shared state needs invariants about who can read what, when, and in what order.

The OS itself is massively concurrent. Two CPUs can enter the kernel simultaneously, both trying to modify the process table. The kernel must protect every shared data structure — with locks, with lock-free algorithms, with careful ordering — or the scheduler corrupts itself.
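In user space, Rust exposes the same kind of lock-free building block the kernel leans on. A minimal sketch (illustrative user-space code, not kernel code; the `increment_all` helper is mine): several threads hammer one counter with `fetch_add`, an indivisible read-modify-write, so no update is ever lost.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

static COUNTER: AtomicU64 = AtomicU64::new(0);

// Spawn `threads` workers, each adding `per_thread` increments,
// and return the final counter value.
fn increment_all(threads: usize, per_thread: u64) -> u64 {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // One atomic read-modify-write: no other thread can
                    // slip in between the read and the write.
                    COUNTER.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    COUNTER.load(Ordering::Relaxed)
}

fn main() {
    let total = increment_all(10, 1_000);
    // Every increment survived: 10 threads x 1_000 each.
    assert_eq!(total, 10_000);
    println!("counter = {total}");
}
```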

```mermaid
sequenceDiagram
    participant A as Thread A
    participant S as Shared Counter (0)
    participant B as Thread B

    Note over A,B: Without synchronization - lost update
    A->>S: read (gets 0)
    B->>S: read (gets 0)
    A->>S: write 0 + 1 = 1
    B->>S: write 0 + 1 = 1
    Note over S: Expected 2, Actual 1

    Note over A,B: With mutex - invariant upheld
    A->>S: lock, read (gets 0), write 1, unlock
    B->>S: lock, read (gets 1), write 2, unlock
    Note over S: Result: 2
```

The concurrency invariant: shared state is never corrupted by concurrent access. Not “usually safe.” Not “safe under low load.” Never corrupted. Every lock, every atomic operation, every memory barrier exists to uphold this one promise.
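The mutex scenario can be written as runnable Rust. A minimal sketch: two threads each increment a shared counter, and because `lock()` admits one holder at a time, the result is always 2, never 1.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared counter behind a mutex; Arc lets each thread own a handle.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // lock() blocks until this thread is the sole holder,
                // so read, add, and write happen as one indivisible step.
                let mut guard = counter.lock().unwrap();
                *guard += 1;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    // Invariant upheld: the lost update cannot happen.
    assert_eq!(*counter.lock().unwrap(), 2);
}
```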

Persistence: Surviving the Crash

Memory is volatile. Kill the power and RAM goes blank. But your files survive — because the OS maintains a persistence layer between your program and the disk.

This sounds simple. It isn’t. Disks reorder writes. Controllers cache data and lie about flushing. The OS must use journaling, log-structuring, or careful write ordering to ensure that a crash at any point leaves the filesystem in a recoverable state.

```rust
use std::fs::File;
use std::io::Write;

fn save_critical_data(data: &[u8]) -> std::io::Result<()> {
    let mut f = File::create("important.dat")?;
    f.write_all(data)?;
    f.sync_all()?;  // THIS is the invariant boundary
    // Before sync_all: data might be in a kernel buffer. Crash = lost.
    // After sync_all: data is on stable storage. Crash = safe.
    // (Pedantic note: making a *newly created* file crash-safe also
    // requires syncing its parent directory on some filesystems.)
    Ok(())
}
```
```mermaid
graph LR
    A["Program writes data"] --> B["User Buffer - volatile"]
    B -->|"write()"| C["Kernel Page Cache - volatile"]
    C -->|"fsync()"| D["Disk - durable"]

    style B fill:#e94560,stroke:#e94560,color:#fff
    style C fill:#e94560,stroke:#e94560,color:#fff
    style D fill:#16c79a,stroke:#16c79a,color:#fff

    linkStyle 2 stroke:#16c79a,stroke-width:3px
```

Before fsync: crash = data lost. After fsync: crash = data safe. The boundary is the syscall.

The persistence invariant: after a successful fsync, the data is on stable storage. A crash cannot lose it.

ext2 broke this invariant regularly — a power failure could leave the filesystem in an inconsistent state, requiring a full fsck scan on reboot. ext3 added journaling. ext4 refined it. The entire history of filesystem design is the history of trying to make this invariant cheaper to enforce.
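A user-space cousin of the kernel's careful write ordering is the write-then-rename pattern: write a fresh copy, make it durable, then atomically rename it over the old file. POSIX `rename` replaces the target in one step, so a crash at any point leaves either the old version or the new one, never a torn mix. A minimal sketch (the `atomic_save` helper is illustrative; a fully robust version would also fsync the parent directory):

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// Replace `path` with `data` so that a crash at any step leaves either
// the old contents or the new contents, never a half-written file.
fn atomic_save(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");

    // 1. Write the new contents to a temporary file.
    let mut f = File::create(&tmp)?;
    f.write_all(data)?;

    // 2. Make the temporary file durable *before* it becomes visible.
    f.sync_all()?;

    // 3. Atomically swap it into place.
    fs::rename(&tmp, path)?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("important.dat");
    atomic_save(&path, b"version 1")?;
    atomic_save(&path, b"version 2")?;
    // Readers only ever see a complete version, old or new.
    assert_eq!(fs::read(&path)?, b"version 2");
    fs::remove_file(&path)?;
    Ok(())
}
```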

The OS as Resource Manager

There’s a second way to think about what an OS does. It’s not just a magician maintaining illusions — it’s a resource manager deciding who gets what.

  • CPU time: Which process runs next? For how long? What happens when a higher-priority task arrives?
  • Memory: Which pages stay in RAM? Which get evicted to disk? What happens when a process asks for more than is available?
  • Disk bandwidth: Which I/O requests go first? How are writes ordered to maintain consistency?
```mermaid
graph TB
    subgraph RM["OS: Resource Manager"]
        direction LR
        S["Scheduler"]
        MM["Memory Manager"]
        FS["Filesystem"]
    end

    CPU["CPU Time"] ---|shared across time| S
    RAM["Physical RAM"] ---|shared across space| MM
    DISK["Disk I/O"] ---|shared across bandwidth| FS

    style RM fill:#1a1a2e,stroke:#e94560,color:#fff
```

Every resource allocation decision is a tradeoff. OS designers balance five goals that pull against each other:

Performance — minimize the overhead of OS operations. Every abstraction costs cycles. A system call is 100x slower than a function call. The OS must be fast enough that programs don’t notice the tax.

Protection — isolate processes from each other. A buggy browser tab shouldn’t crash your editor. A malicious program shouldn’t read your SSH keys. But isolation costs memory (separate page tables) and CPU time (TLB flushes on context switches).

Reliability — the OS must not crash. A user program can segfault and die; the kernel cannot. Every kernel code path must handle allocation failures, invalid inputs, and hardware errors without panicking.

Energy efficiency — do more with less power. A laptop that virtualizes perfectly but drains the battery in two hours is a failure.

Security — defend against malicious programs. This is protection’s sharper sibling. Not just “don’t interfere with each other” but “actively resist attempts to subvert the system.”

These goals conflict constantly. More isolation means more overhead. More features mean more bugs. The art of OS design is choosing which invariants to guarantee and accepting the cost of enforcing them.

Why This Framing?

Most OS courses teach mechanisms: here’s how a page table works, here’s how a context switch works, here’s how ext4 lays out inodes. Those mechanics matter. But without knowing what property they enforce, you’re memorizing steps without understanding purpose.

Every mechanism in the kernel exists to uphold an invariant. Page tables enforce the memory isolation invariant. The scheduler enforces the fairness invariant. Journaling enforces the crash consistency invariant. When you know the invariant, the mechanism makes sense. When you don’t, it’s just complexity.
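The fairness invariant can be made concrete with a toy round-robin policy (an illustrative model, not how any real kernel is structured): runnable tasks wait in a queue, each gets one time slice, and unfinished tasks rejoin the back of the line, so every runnable task eventually runs.

```rust
use std::collections::VecDeque;

// A toy task: a name and how many time slices of work remain.
struct Task {
    name: &'static str,
    slices_left: u32,
}

// Run every task to completion, one slice at a time, round-robin.
// Returns the order in which time slices were granted.
fn round_robin(mut queue: VecDeque<Task>) -> Vec<&'static str> {
    let mut schedule = Vec::new();
    while let Some(mut task) = queue.pop_front() {
        schedule.push(task.name); // grant one time slice
        task.slices_left -= 1;
        if task.slices_left > 0 {
            queue.push_back(task); // not done: back of the line
        }
    }
    schedule
}

fn main() {
    let tasks = VecDeque::from([
        Task { name: "A", slices_left: 2 },
        Task { name: "B", slices_left: 1 },
        Task { name: "C", slices_left: 2 },
    ]);
    let order = round_robin(tasks);
    // Every task gets a first slice before anyone gets a second:
    // no task can starve while others monopolize the CPU.
    assert_eq!(order, ["A", "B", "C", "A", "C"]);
}
```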

This series will cover:

| Invariant | What It Guarantees | When It Breaks |
|---|---|---|
| Process isolation | Every process has private memory | Spectre/Meltdown |
| Scheduling fairness | Every runnable thread eventually runs | Priority inversion (Mars Pathfinder) |
| Mutual exclusion | At most one holder of a lock at a time | Deadlock, data corruption |
| Crash consistency | Filesystem is recoverable after a crash | ext2 on power failure |
| Privilege separation | User code cannot execute kernel instructions | Kernel exploits |

The code changes. The hardware changes. The mechanisms evolve. The invariants don’t. A process still needs isolation whether it’s running on a PDP-11 or an ARM server. A filesystem still needs crash consistency whether the storage is a spinning disk or an NVMe SSD. Learn the invariants, and you understand every OS — past, present, and future.

In the next post, we cross the most important boundary in computing: the line between user mode and kernel mode. Every program you write makes system calls. Every system call crosses a trust boundary. And the kernel, on every single crossing, enforces an invariant: it never trusts you.


This is Post 1 of the series Invariants the Kernel Keeps — operating systems through the guarantees the kernel makes, and what happens when they break.

This post is licensed under CC BY 4.0 by the author.