Process and Process Management

What is a Process?

A process is an instance of an executing program (also called a task or job). An application on disk is a static entity; once launched and loaded into memory, it becomes a process — an active entity.

Multiple launches of the same program create separate processes that share the same code but have independent state.

A process does not have to be currently running on the CPU — it may be waiting for input or waiting to be scheduled.

Process Address Space

A process encapsulates all state needed for execution: code, data, heap, and stack. The OS wraps this in an address space spanning virtual addresses V0 to Vmax.

Regions of the address space:

  • Text — compiled code (static)

  • Data — initialized variables (static)

  • Heap — dynamically allocated memory (grows via malloc); may be non-contiguous with holes

  • Stack — LIFO structure for function call frames; grows/shrinks dynamically. Saves caller state before a procedure call and restores it on return

Virtual Addresses and Page Tables

Virtual addresses do not correspond directly to physical memory locations. The OS and MMU hardware maintain a page table that maps virtual → physical addresses.

This decoupling means:

  • Multiple processes can use the same virtual address range (e.g., both P1 and P2 can have addresses 0–64000) — the OS maps them to distinct physical locations

  • Physical memory layout is independent of application data layout

  • The OS can swap pages to disk when physical memory is scarce, and bring them back on demand

The OS uses page table entries to both translate addresses and validate access permissions.
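
To make the translation and permission check concrete, here is a minimal sketch of a single-level page table. The 4096-byte page size, the dictionary layout, and the names translate and PageFault are assumptions chosen for illustration; real page tables are hardware-defined, multi-level structures.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when a virtual page has no mapping or the access is not permitted."""


def translate(page_table, vaddr, access="r"):
    """Translate a virtual address using a {vpn: (pfn, perms)} page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)  # virtual page number and offset
    entry = page_table.get(vpn)
    if entry is None:
        raise PageFault(f"no mapping for virtual page {vpn}")
    pfn, perms = entry
    if access not in perms:
        raise PageFault(f"{access!r} access denied on virtual page {vpn}")
    return pfn * PAGE_SIZE + offset  # physical frame base plus the same offset


# Two processes use the same virtual page 0, mapped to different physical frames.
p1_table = {0: (7, "rw")}
p2_table = {0: (3, "r")}

p1_phys = translate(p1_table, 100)  # lands in frame 7
p2_phys = translate(p2_table, 100)  # lands in frame 3
```

The same virtual address 100 resolves to two different physical addresses, and a write through p2_table would raise PageFault because that entry is read-only.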

Process Execution State

The OS must track enough state to stop a process and later resume it at the exact same point.

Key state:

  • Program Counter (PC) — address of the current instruction; maintained in a CPU register during execution

  • CPU registers — data addresses, status flags, intermediate values

  • Stack pointer — top of the process stack

Process Control Block (PCB)

The Process Control Block is a per-process OS data structure containing:

  • Process state: PC, stack pointer, all CPU register values

  • Virtual-to-physical memory mappings (page table info)

  • List of open files

  • Scheduling info: CPU time consumed, priority, time allocation

The PCB is created and initialized when the process is created, with the PC set to the first instruction. Some fields change rarely (memory mappings), while others change on every instruction (PC). The CPU tracks the PC in a hardware register, and the OS saves it to the PCB only when the process is switched out.
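
As a rough illustration, the fields listed above could be grouped as follows. This is a sketch only: the field names, the default values, and the create_process helper are hypothetical, and a real kernel structure holds far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class PCB:
    """Toy process control block; field names are illustrative assumptions."""
    pid: int
    state: str = "new"
    pc: int = 0                                       # program counter
    sp: int = 0                                       # stack pointer
    registers: dict = field(default_factory=dict)     # other CPU registers
    page_table: dict = field(default_factory=dict)    # virtual-to-physical mappings
    open_files: list = field(default_factory=list)
    priority: int = 0                                 # scheduling info
    cpu_time_used: int = 0


def create_process(pid, entry_point, stack_top):
    """Initialize a PCB with the PC set to the program's first instruction."""
    return PCB(pid=pid, pc=entry_point, sp=stack_top)


pcb = create_process(pid=1, entry_point=0x1000, stack_top=0x7FFF0000)
```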

Context Switching

A context switch is the mechanism for switching CPU execution from one process to another:

  1. Save the current process’s CPU state into its PCB (in memory)

  2. Load the next process’s state from its PCB into CPU registers

  3. Resume execution of the next process
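
The three steps can be sketched with a toy CPU model. The CPU class, the dictionary-based PCBs, and the register names r0 and r1 are assumptions for illustration; a real context switch is done in kernel code, partly in assembly.

```python
class CPU:
    """Toy CPU: a program counter, a stack pointer, and two registers."""

    def __init__(self):
        self.pc = 0
        self.sp = 0
        self.regs = {"r0": 0, "r1": 0}


def context_switch(cpu, current_pcb, next_pcb):
    # 1. Save the current process's CPU state into its PCB.
    current_pcb["pc"] = cpu.pc
    current_pcb["sp"] = cpu.sp
    current_pcb["regs"] = dict(cpu.regs)
    # 2. Load the next process's state from its PCB into the CPU.
    cpu.pc = next_pcb["pc"]
    cpu.sp = next_pcb["sp"]
    cpu.regs = dict(next_pcb["regs"])
    # 3. Execution resumes from the loaded PC.
    return cpu


cpu = CPU()
p1 = {"pc": 0, "sp": 0, "regs": {"r0": 0, "r1": 0}}
p2 = {"pc": 200, "sp": 9000, "regs": {"r0": 5, "r1": 6}}

# Pretend P1 has been running for a while.
cpu.pc, cpu.sp, cpu.regs["r0"] = 104, 8000, 42
context_switch(cpu, p1, p2)  # P1 is switched out, P2 is switched in
```

After the call, p1 holds the values P1 had on the CPU, so a later context_switch(cpu, p2, p1) resumes P1 at the same point.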

Costs:

  • Direct cost — cycles to load/store PCB values to/from memory

  • Indirect cost (cache) — the running process has a hot cache (data in L1/L2/LLC). After a context switch, the incoming process finds a cold cache and suffers cache misses (memory access: ~100s of cycles vs. cache access: ~few cycles). This makes frequent context switching expensive.

Process Life Cycle

States:

  • New — process created; OS performs admission control, allocates PCB and initial resources

  • Ready — admitted and waiting to be scheduled on the CPU

  • Running — executing on the CPU

  • Waiting — blocked on I/O or an event (e.g., disk read, timer, keyboard input)

  • Terminated — execution complete or error; exit code returned

Transitions:

  • New → Ready (admitted)

  • Ready → Running (scheduled by CPU scheduler)

  • Running → Ready (preempted / time slice expired)

  • Running → Waiting (I/O request or event wait)

  • Waiting → Ready (I/O complete or event occurred)

  • Running → Terminated (exit or error)
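
The transitions above form a small state machine. A minimal sketch, assuming the hypothetical names State, ALLOWED, and transition:

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


# Each state maps to the set of states it may move to.
ALLOWED = {
    State.NEW: {State.READY},
    State.READY: {State.RUNNING},
    State.RUNNING: {State.READY, State.WAITING, State.TERMINATED},
    State.WAITING: {State.READY},
    State.TERMINATED: set(),
}


def transition(current, target):
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

For example, transition(State.WAITING, State.RUNNING) raises ValueError, because a waiting process must pass through the ready state before it can run again.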

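Only a process in the running state is actually executing on the CPU. A process in the ready state is also eligible to execute: it only needs to be picked by the scheduler. Processes in the other states cannot be run.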

Process Creation

Processes form a tree: a parent creates children. On UNIX, init is the root of all processes. On Android, Zygote is the parent of all app processes (each app is forked from Zygote).

Two fundamental mechanisms:

  • fork — creates a new child process by copying the parent’s PCB. Both parent and child resume execution at the instruction immediately after the fork call (same PC value).

  • exec — replaces the child’s memory image with a new program. The child’s PC is set to the first instruction of the new program.

Typical pattern: fork to create a process, then exec to load a new program into it.
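
A sketch of the fork-then-exec pattern using Python's os module, which wraps the POSIX calls (Unix only). The helper name spawn_and_wait and the exit code 127 for a failed exec are assumptions for illustration. Note that fork returns 0 in the child and the child's PID in the parent, which is how the two copies tell themselves apart.

```python
import os
import sys


def spawn_and_wait(argv):
    """Fork a child, exec argv in it, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process's memory image with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed; leave without running parent cleanup.
            os._exit(127)
    # Parent: wait for the child to terminate and collect its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exit code:", code)
```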

CPU Scheduling

The CPU scheduler decides which ready process runs next and for how long. Its operations:

  1. Preempt the running process (interrupt and save context)

  2. Schedule — run a scheduling algorithm to pick the next process

  3. Dispatch — context-switch to the chosen process

The scheduler must be efficient — time spent scheduling is overhead, not useful work.
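
To illustrate the preempt, schedule, and dispatch cycle, here is a small simulation. Round-robin is used only as an example policy (the notes do not prescribe one), and the process names, CPU demands, and the round_robin function are hypothetical.

```python
from collections import deque


def round_robin(bursts, timeslice):
    """Return the sequence of (name, time_run) slices that get dispatched.

    bursts maps a process name to the total CPU time it needs.
    """
    ready = deque(bursts.items())  # the ready queue
    timeline = []
    while ready:
        name, remaining = ready.popleft()   # schedule: pick the head of the queue
        ran = min(timeslice, remaining)     # dispatch: run for at most one timeslice
        timeline.append((name, ran))
        if remaining > ran:                 # preempt: unfinished work rejoins the queue
            ready.append((name, remaining - ran))
    return timeline


timeline = round_robin({"P1": 5, "P2": 2, "P3": 4}, timeslice=2)
```

With these inputs, P2 finishes within its first slice, while P1 and P3 are preempted and return to the back of the ready queue.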

Timeslice and Efficiency

If each process runs for time Tp and scheduling takes Tsched:

  • CPU utilization = (total Tp) / (total Tp + total Tsched)

  • If Tp = Tsched → only 50% useful work

  • If Tp = 10 × Tsched → ~91% useful work

Longer timeslices reduce scheduling overhead but may hurt responsiveness.
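
The utilization figures above can be checked with a one-line function. It assumes every timeslice of length tp is followed by one scheduling pass of length tsched; the function name is chosen for illustration.

```python
def cpu_utilization(tp, tsched):
    """Fraction of total time spent on useful work."""
    return tp / (tp + tsched)


equal = cpu_utilization(1, 1)     # Tp = Tsched
longer = cpu_utilization(10, 1)   # Tp = 10 * Tsched
```

Here equal is 0.5 and longer is 10/11, which rounds to 0.91.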

I/O and Scheduling

When a process issues an I/O request, it moves to the waiting state (placed on the I/O device’s queue). When the I/O completes, it returns to the ready queue.

Paths into the ready queue:

  • I/O completion

  • Time slice expiration (preemption)

  • Process creation (fork)

  • Interrupt occurrence

The scheduler is not responsible for maintaining I/O queues or for generating the external events that processes wait on. The one exception is the timer interrupt that ends a time slice, which the scheduler controls.

Inter-Process Communication (IPC)

Processes are isolated by the OS (separate address spaces, controlled resource access), but they often need to communicate — especially in multi-process applications (e.g., web server + database backend).

IPC mechanisms transfer data between address spaces while preserving isolation.

Message Passing

The OS provides a shared communication channel (e.g., a buffer). Processes send and receive messages through it.

  • Pro: OS-managed; uniform API (system calls for send/receive)

  • Con: every exchange requires copying data: user space → kernel → user space (overhead)
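
One concrete form of message passing is a pipe, sketched here with Python's os module (Unix only). The function name send_via_pipe is hypothetical. Each write and read is a system call that copies the data into or out of a kernel buffer.

```python
import os


def send_via_pipe(message):
    """Child sends message (bytes) through a pipe; the parent receives it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child is the sender: write() copies the bytes from user space to the kernel.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
        finally:
            os._exit(0)
    # Parent is the receiver: read() copies the bytes from the kernel to user space.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # an empty read means the writer closed its end
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


received = send_via_pipe(b"hello from the child")
```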

Shared Memory

The OS maps a memory region into both processes’ address spaces. Processes read/write directly — the OS is out of the data path after setup.

  • Pro: no per-message kernel overhead; fast individual exchanges

  • Con: no OS-enforced API for access — developers must coordinate access (error-prone); the initial mapping setup is expensive
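
A minimal shared-memory sketch using an anonymous mmap region inherited across fork (Unix only). The function name share_via_mmap is hypothetical, and the payload is assumed to fit in one page. Waiting for the child before reading stands in for the coordination that developers must provide themselves.

```python
import mmap
import os


def share_via_mmap(payload):
    """Child writes payload (bytes) into a shared region; the parent reads it."""
    # Anonymous mapping; on Unix, Python creates it as MAP_SHARED by default,
    # so parent and child see the same memory after fork.
    region = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: an ordinary memory write, with no system call per exchange.
        try:
            region[:len(payload)] = payload
        finally:
            os._exit(0)
    # Coordination is up to us: wait until the child has finished writing.
    os.waitpid(pid, 0)
    data = bytes(region[:len(payload)])
    region.close()
    return data


result = share_via_mmap(b"written by the child")
```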

Performance: shared memory is faster per exchange, but the setup cost must be amortized over many messages. For a small number of exchanges, message passing may be the better choice, so the right mechanism depends on how much data is exchanged and how often.
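
The amortization argument can be written as a simple cost model. This is a sketch under stated assumptions: message passing is treated as having no setup cost, all costs are in arbitrary made-up units, and the function name is hypothetical.

```python
import math


def break_even_exchanges(setup_cost, shm_cost, msg_cost):
    """Smallest number of exchanges n at which shared memory is cheaper overall.

    Shared memory total:   setup_cost + n * shm_cost
    Message passing total: n * msg_cost
    Returns None if shared memory never wins (msg_cost <= shm_cost).
    """
    if msg_cost <= shm_cost:
        return None
    # Shared memory wins when n > setup_cost / (msg_cost - shm_cost).
    return math.floor(setup_cost / (msg_cost - shm_cost)) + 1


n = break_even_exchanges(setup_cost=1000, shm_cost=1, msg_cost=11)
```

With these illustrative numbers the two totals are equal at 100 exchanges, so shared memory becomes cheaper from the 101st exchange onward.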