I/O Management¶
I/O Devices¶
I/O devices include keyboards, microphones, displays, speakers, mice, network interfaces, and disks.
Input only: keyboard, microphone
Output only: speaker, display
Both: hard disk, NIC, flash card
Device Features¶
Any device can be abstracted to have:
Control registers (accessed by CPU):
- Command registers: CPU tells device what to do
- Data registers: used to transfer data to and from the device
- Status registers: CPU reads device state
Internal logic: microcontroller (device’s CPU), on-device memory, specialized hardware (e.g., analog-digital converters)
CPU-Device Interconnect¶
Devices connect to CPU via controllers and interconnects (e.g., PCI, PCI-X, PCIe)
PCIe: higher bandwidth, lower latency, more devices than PCI/PCI-X
Other buses: SCSI (disks), peripheral bus (keyboards)
Bridge controllers handle differences between interconnect types
Device Drivers¶
Device-specific software components in the OS
Responsible for all device access, management, and control
Manufacturers provide drivers for each OS
OS provides a device driver framework with standardized interfaces
Benefits:
Device independence: OS not specialized per device
Device diversity: easily support new devices by adding drivers
Types of Devices¶
Block devices (e.g., disks): operate on fixed-size blocks; support direct/random access
Character devices (e.g., keyboards): serial stream; get/put character interface
Network devices: stream of variable-size data chunks (behavior falls between block and character devices)
OS represents devices internally as files (on Unix: /dev/ directory, managed by tmpfs/devfs).
Pseudo devices: /dev/null (discards output), /dev/random (pseudo-random bytes).
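As a small illustration of devices being represented as files, the sketch below inspects a path's file mode to tell block and character devices apart, and writes to /dev/null. It assumes a Unix-like system where /dev/null exists; the helper name device_kind is made up for this example.

```python
import os
import stat


def device_kind(path):
    """Classify a path as 'block', 'character', or 'other' from its file mode."""
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "character"
    return "other"


# /dev/null accepts writes like any file and discards the bytes.
with open("/dev/null", "wb") as sink:
    written = sink.write(b"discarded")

print(device_kind("/dev/null"), written)
```

Disk nodes such as those under /dev would be expected to report "block", but their names vary from system to system, so none is hard-coded here.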
CPU-Device Interactions¶
Memory-Mapped I/O¶
Device registers mapped to physical memory addresses
CPU writes to those addresses → PCI controller routes to device
Portion of physical address space dedicated to device interaction
Configured via Base Address Registers (BAR) during boot (PCI configuration protocol)
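The idea of reaching device registers through ordinary loads and stores can be sketched with a simulation. The anonymous memory mapping below only stands in for a device's register window; the register offsets and values are hypothetical, and no real hardware is touched.

```python
import mmap
import struct

# Hypothetical register layout (byte offsets); a real device defines its own.
CMD_REG, STATUS_REG, DATA_REG = 0x00, 0x04, 0x08

# Anonymous mapping used as a stand-in for the memory-mapped register window.
regs = mmap.mmap(-1, mmap.PAGESIZE)


def write_reg(offset, value):
    """Store a 32-bit little-endian value at the given register offset."""
    struct.pack_into("<I", regs, offset, value)


def read_reg(offset):
    """Load a 32-bit little-endian value from the given register offset."""
    return struct.unpack_from("<I", regs, offset)[0]


write_reg(DATA_REG, 0xCAFE)   # place data in the data register
write_reg(CMD_REG, 0x1)       # issue a (made-up) "go" command
print(hex(read_reg(DATA_REG)), read_reg(STATUS_REG))
```

With real memory-mapped I/O the same stores would be routed by the interconnect to the device instead of landing in RAM.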
I/O Port Model¶
Special CPU instructions (e.g., x86 in/out) target specific I/O ports
Each instruction specifies target device (port) and value (in register)
Interrupt vs. Polling¶
Interrupts: device signals CPU; overhead from handler execution, interrupt mask management, cache pollution; but immediate notification
Polling: CPU periodically reads device status register; can choose convenient times (less cache disruption); risk of delay or wasted cycles
Choice depends on device type, throughput/latency goals, interrupt handler complexity, and device data rate
Programmed I/O (PIO)¶
CPU directly writes commands to device command registers and moves data through device data registers
No additional hardware required
Example: sending 1500-byte packet on NIC with 8-byte bus = 1 command write + ~188 data writes = 189 CPU accesses
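The access count in the example can be reproduced with a short calculation. This is only a sketch of the arithmetic; the function name pio_accesses is an assumption made for illustration.

```python
def pio_accesses(packet_bytes, bus_bytes):
    """CPU accesses for PIO: one command write plus one data write per bus-width chunk."""
    data_writes = -(-packet_bytes // bus_bytes)  # ceiling division
    return 1 + data_writes


print(pio_accesses(1500, 8))
```

The ceiling matters: 1500 / 8 is 187.5, so the final partial chunk still costs a full data write.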
Direct Memory Access (DMA)¶
Relies on DMA controller hardware
CPU writes command to device + configures DMA controller (source address, size)
DMA controller moves data directly between device and memory without CPU involvement per byte
Example: same 1500-byte packet = 1 command write + 1 DMA configuration = 2 operations
DMA configuration is complex (more cycles than a single store) → PIO better for small transfers
Pinning: memory regions involved in DMA must be pinned (non-swappable) in physical memory
Trade-off: For a system where store costs 1 cycle and DMA config costs 5 cycles (8-byte bus):
Keyboard (small data): PIO is better
NIC: depends on packet size (< 5 stores → PIO; larger → DMA)
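The trade-off can be sketched as a simple cost model using the numbers stated above (1 cycle per store, 5 cycles per DMA configuration, 8-byte bus). It counts only CPU cycles spent issuing the operation and ignores the time the transfer itself takes; the function names are assumptions for illustration.

```python
STORE_CYCLES = 1       # cost of one CPU store (from the example above)
DMA_CONFIG_CYCLES = 5  # cost of configuring the DMA controller
BUS_BYTES = 8          # bus width in bytes


def pio_cycles(nbytes):
    """One command store plus one data store per 8-byte chunk."""
    data_stores = -(-nbytes // BUS_BYTES)  # ceiling division
    return STORE_CYCLES * (1 + data_stores)


def dma_cycles(nbytes):
    """One command store plus one DMA configuration, independent of size."""
    return STORE_CYCLES + DMA_CONFIG_CYCLES


def cheaper(nbytes):
    """Return 'PIO' when PIO is strictly cheaper, otherwise 'DMA'."""
    return "PIO" if pio_cycles(nbytes) < dma_cycles(nbytes) else "DMA"


for size in (1, 32, 40, 1500):
    print(size, pio_cycles(size), dma_cycles(size), cheaper(size))
```

Under this model PIO wins for up to 4 data stores (32 bytes). At exactly 5 data stores the two costs tie, and the sketch arbitrarily reports DMA.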
Typical Device Access Flow¶
Process issues system call (e.g., send, read)
OS runs in-kernel stack (e.g., TCP/IP, file system)
Device driver configures the device (via PIO or DMA)
Device performs the operation
Results/events traverse the chain in reverse (interrupt → driver → kernel → process)
OS Bypass¶
Device registers/memory mapped directly to user process address space
User-level driver (library) handles device-specific operations
OS involved only in setup and coarse-grain control (enable/disable, permissions)
Requirements: device must have sufficient separate registers for user operations vs. OS control
Device must support demultiplexing to route data to correct process (e.g., inspecting packet port numbers)
Synchronous vs. Asynchronous I/O¶
Synchronous: calling thread blocks until I/O completes (placed on device wait queue)
Asynchronous: thread continues after issuing I/O call; later polls for results or receives notification
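The difference can be illustrated at user level with a thread pool. This sketch shows the calling pattern only, not how a kernel implements it; slow_read is a hypothetical stand-in for a device operation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read():
    """Hypothetical stand-in for a device read that takes a while."""
    time.sleep(0.05)
    return b"data"


# Synchronous: the caller waits here until the operation completes.
result_sync = slow_read()

# Asynchronous: issue the operation, keep working, collect the result later.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_read)
    other_work = sum(range(1000))   # useful work done while the I/O is in flight
    result_async = future.result()  # wait only when the result is needed

print(result_sync, result_async, other_work)
```

Calling future.done() instead of future.result() would correspond to polling for completion without blocking.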
Block Device Stack¶
From top to bottom:
User application: operates on files (logical storage units)
POSIX API: open(), read(), write(), close()
Virtual File System (VFS): abstraction layer hiding details of underlying file systems
Specific file system (e.g., ext2, ext3, ext4): maps files to disk blocks
Generic block layer: standard interface to all block device types; masks device-specific differences
Device driver: speaks device-specific protocol
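From the application's side, only the top of this stack is visible. The sketch below uses Python's os module, whose open, write, read, and close functions are thin wrappers over the POSIX calls; the temporary file is just a convenient target for the example.

```python
import os
import tempfile

# Create a scratch file to operate on.
fd_tmp, path = tempfile.mkstemp()
os.close(fd_tmp)

# open() returns a file descriptor; write() hands bytes to the layers below.
fd = os.open(path, os.O_WRONLY)
os.write(fd, b"hello block device stack\n")
os.close(fd)

# Reading back goes through the same layers in the other direction.
fd = os.open(path, os.O_RDONLY)
data = os.read(fd, 4096)
os.close(fd)

os.remove(path)
print(data)
```

Nothing in this code depends on which file system or block device backs the temporary directory, which is the point of the layers underneath.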
Virtual File System (VFS)¶
Hides from applications whether files span multiple devices, use different file system implementations, or reside on remote servers.
VFS Key Abstractions¶
File:
Represented by file descriptors (created on open)
Operations: read, write, lock, sendfile, close
Inode (index node):
Persistent data structure; one per file
Contains: list of data blocks, permissions, size, lock status
Files identified by inode number
Files need not be stored contiguously on disk
Dentry (directory entry):
Soft-state (in-memory only, not persisted to disk)
One dentry per path component (e.g., /users/ada → dentries for /, users, ada)
Dentry cache avoids re-traversing paths
Superblock:
Map of how file system is organized on storage device
Contains: number of inodes, number of blocks, start of free blocks
File-system-specific metadata
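Two of these abstractions can be observed from user space. The sketch below splits a path into the components that would each get a dentry, and shows that two names created with a hard link refer to the same inode number. It assumes a file system that supports hard links; the file names are arbitrary.

```python
import os
import tempfile
from pathlib import PurePosixPath

# One dentry per path component.
components = PurePosixPath("/users/ada").parts
print(components)

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "a.txt")
    alias = os.path.join(d, "b.txt")
    with open(original, "w") as f:
        f.write("x")

    # A hard link adds a second directory entry for the same inode.
    os.link(original, alias)
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    link_count = os.stat(original).st_nlink

print(same_inode, link_count)
```

The file is identified by its inode number, so both names reach the same data blocks and metadata.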
ext2 File System¶
Disk partition layout:
Block 0: boot block (not used by Linux)
Remaining partition divided into Block Groups, each containing:
- Superblock: count of inodes, disk blocks, start of free blocks
- Group descriptor: bitmap locations, free inode count, directory count
- Bitmaps: quickly find free blocks/inodes
- Inode table: each inode is 128 bytes, describes one file (owner, stats, data block locations)
- Data blocks: actual file content
ext2 tries to balance allocation of directories and files across block groups.
Inodes and Indirect Pointers¶
Direct pointers only (simple approach):
128-byte inode with 4-byte block pointers → max 32 pointers → max file size = 32KB (with 1KB blocks)
Too restrictive
Indirect pointer scheme (used in practice):
Direct pointers: each points to one data block (1KB each)
Single indirect: points to a block of pointers (256 pointers × 1KB = 256KB)
Double indirect: pointer → block of pointers → blocks of pointers → data (256² × 1KB = 64MB)
Triple indirect: 256³ × 1KB = 16GB
With 12 direct + 1 single + 1 double + 1 triple indirect (1KB blocks, 4-byte pointers):
Max file size ≈ 16 GB
With 8KB blocks (2K pointers per block):
Max file size ≈ 64 TB
Trade-off: deeper indirection = more disk accesses per file read (up to 4 for double indirect: the inode, two levels of pointer blocks, then the data block).
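The size limits above follow from counting addressable blocks. The sketch below reproduces the arithmetic for 12 direct pointers plus one single, one double, and one triple indirect pointer; the function name max_file_bytes is an assumption for illustration.

```python
def max_file_bytes(block_size, pointer_size=4, direct=12):
    """Largest file addressable with direct + single + double + triple indirect pointers."""
    ptrs = block_size // pointer_size            # pointers that fit in one block
    blocks = direct + ptrs + ptrs**2 + ptrs**3   # data blocks reachable from one inode
    return blocks * block_size


GIB = 2**30
TIB = 2**40

print(max_file_bytes(1024) / GIB)   # 1KB blocks: a little over 16 GB
print(max_file_bytes(8192) / TIB)   # 8KB blocks: a little over 64 TB

# Direct pointers only: a 128-byte inode holds 32 four-byte pointers.
print((128 // 4) * 1024)            # 32768 bytes = 32KB
```

The triple indirect term dominates, which is why the totals land so close to 256³ × 1KB and 2048³ × 8KB.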
Disk Access Optimizations¶
Buffer cache:
Cache file data in main memory; periodic flush to disk (fsync)
Amortizes disk write cost over multiple in-memory writes
I/O scheduling:
Reorders disk operations to maximize sequential access and minimize disk head movement
Example: reorder write(25), write(17) to write(17), write(25) if head is at position 15
Prefetching:
Read ahead multiple blocks when one block is accessed (exploits locality)
Uses more disk bandwidth but improves cache hit rate
Journaling:
Write updates to a sequential log before applying to proper disk locations
Protects against data loss on crash; reduces random writes
Used by ext3, ext4, and many modern file systems
Journal must be periodically flushed to proper disk locations
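The I/O scheduling example above (head at position 15, writes to blocks 25 and 17) can be sketched with a simple elevator-style ordering. This is one possible policy chosen for illustration, not necessarily what any particular kernel scheduler does; the function names are assumptions.

```python
def elevator_order(head, requests):
    """Serve requests at or beyond the head in ascending order, then the rest descending."""
    ahead = sorted(r for r in requests if r >= head)
    behind = sorted((r for r in requests if r < head), reverse=True)
    return ahead + behind


def head_travel(head, order):
    """Total distance the head moves when serving requests in the given order."""
    total = 0
    for pos in order:
        total += abs(pos - head)
        head = pos
    return total


arrival = [25, 17]
reordered = elevator_order(15, arrival)
print(reordered, head_travel(15, arrival), head_travel(15, reordered))
```

In this example reordering cuts head movement from 18 positions to 10, which is the kind of saving the scheduler is after.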