Available languages: English | 日本語
What makes this project special:
In typical application development, you write programs against APIs provided by the OS (Linux, Windows, macOS). In this project you build the OS itself, and in doing so gain a fundamental understanding of how computers work.
Why learn OS development:
- Understanding system operation: Memory management, process switching, I/O processing mechanisms
- Performance optimization: Deep understanding of bottlenecks and countermeasures
- Improved debugging skills: Low-level knowledge enhances complex problem-solving abilities
- Architecture understanding: Coordination mechanisms of CPU, memory, and I/O devices
x86: A processor family starting from Intel 8086. Foundation of current Intel/AMD processors.
- 16-bit era: 8086 (1978), 8088 (1979) - MS-DOS era
- 32-bit era: 80386 (1985) - Windows 95/Linux emergence
- 64-bit era: x86_64 (2003) - Modern PCs/servers
This project targets 32-bit x86.
Real Mode:
- 16-bit environment, maximum 1MB memory
- Mode in which BIOS operates
- Segment:offset addressing format
- Even modern systems start in this mode after power-on
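The segment:offset format listed above resolves to a physical address as segment × 16 + offset. A minimal sketch in C follows; `real_mode_phys` is a hypothetical helper name used only for illustration, not a function from this repository.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
 * real_mode_phys is a hypothetical name chosen for this sketch. */
static inline uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    /* Shifting left by 4 is the same as multiplying by 16. */
    return ((uint32_t)segment << 4) + offset;
}
```

For example, both 0x07C0:0x0000 and 0x0000:0x7C00 resolve to physical address 0x7C00, which is why different segment:offset pairs can name the same byte.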
Protected Mode:
- 32-bit environment, maximum 4GB memory
- Safe execution environment with memory protection
- Foundation of modern OS
- Segment management via GDT (Global Descriptor Table)
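To make the GDT less abstract, here is a hedged sketch of how one 8-byte segment descriptor is packed. `gdt_entry` is a hypothetical helper, and the access and flag values in the usage note are the commonly used flat-model settings, which this curriculum's own code may or may not use.

```c
#include <stdint.h>

/* Pack one x86 segment descriptor into its 64-bit in-memory form.
 * gdt_entry is a hypothetical helper name for this sketch.
 *   base   : 32-bit segment base address
 *   limit  : 20-bit segment limit
 *   access : access byte (present bit, privilege level, type)
 *   flags  : upper 4 flag bits (granularity, 16/32-bit size) */
static inline uint64_t gdt_entry(uint32_t base, uint32_t limit,
                                 uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base bits 0-23   */
    d |= (uint64_t)access << 40;                  /* access byte      */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit bits 16-19 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* flags nibble     */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56;  /* base bits 24-31  */
    return d;
}
```

Assuming a flat memory model, `gdt_entry(0, 0xFFFFF, 0x9A, 0xC)` describes a 4GB ring 0 code segment and `gdt_entry(0, 0xFFFFF, 0x92, 0xC)` the matching data segment. The first GDT slot is always the all-zero null descriptor.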
PIC (Programmable Interrupt Controller) - 8259A:
- Transmits interrupt signals from external devices (keyboard, timer, etc.) to CPU
- 16 interrupt lines: IRQ0-7 (master), IRQ8-15 (slave)
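- OS typically remaps IRQ0-15 from the BIOS default vectors (0x08-0x0F and 0x70-0x77) to vectors 32-47, so they no longer collide with CPU exception vectors 0-31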
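The vector arithmetic behind the remap can be sketched as follows, assuming the 32-47 range listed above. `irq_to_vector` and `irq_needs_slave_eoi` are hypothetical names for this illustration, and the port writes that actually reprogram the 8259A are deliberately left out.

```c
#include <stdint.h>

/* Assumption for this sketch: IRQs are remapped to start at vector 32. */
enum { PIC_VECTOR_BASE = 32, PIC_IRQS_PER_CHIP = 8 };

/* Map an IRQ line (0-15) to its interrupt vector after remapping. */
static inline uint8_t irq_to_vector(uint8_t irq)
{
    return (uint8_t)(PIC_VECTOR_BASE + irq);
}

/* IRQ8-15 arrive through the slave PIC, so the handler must send an
 * End of Interrupt to the slave as well as to the master. */
static inline int irq_needs_slave_eoi(uint8_t irq)
{
    return irq >= PIC_IRQS_PER_CHIP;
}
```

With this mapping the timer (IRQ0) arrives on vector 32 and the keyboard (IRQ1) on vector 33.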
PIT (Programmable Interval Timer) - 8254:
- Generates periodic timer interrupts (necessary for OS scheduling)
- Divides reference clock 1.193182MHz to generate arbitrary periods
- Channel 0 used as system timer
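Dividing the reference clock comes down to one integer division. The sketch below computes the 16-bit divisor for a requested frequency; `pit_divisor` is a hypothetical name, and clamping to the range 1-65535 is a simplification made for this example.

```c
#include <stdint.h>

#define PIT_BASE_HZ 1193182u  /* PIT reference clock, 1.193182 MHz */

/* Compute the channel 0 divisor for a target interrupt frequency.
 * pit_divisor is a hypothetical helper name for this sketch.
 * The result is clamped to 1..65535 to fit the 16-bit counter. */
static inline uint16_t pit_divisor(uint32_t hz)
{
    uint32_t d;

    if (hz == 0)
        return 65535u;        /* avoid dividing by zero: slowest rate */

    d = PIT_BASE_HZ / hz;     /* integer division truncates */
    if (d < 1u)
        d = 1u;               /* requested rate is above the base clock */
    if (d > 65535u)
        d = 65535u;           /* requested rate is too slow for 16 bits */
    return (uint16_t)d;
}
```

For a 100Hz system timer, `pit_divisor(100)` gives 11931, so the actual rate is 1193182 / 11931, slightly above 100Hz.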
VGA (Video Graphics Array):
- Text mode: 80x25 characters, each character with color attributes
- Frame buffer: Direct memory access from address 0xB8000
- Screen control via character+attribute pairs (2 bytes)
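Each 2-byte cell holds the character code in the low byte and the color attribute in the high byte. A minimal sketch of that encoding follows; the helper names (`vga_entry`, `vga_index`, `vga_put`) are assumptions for illustration, and the buffer pointer is passed in so the same code works on an ordinary array as well as on 0xB8000.

```c
#include <stddef.h>
#include <stdint.h>

enum { VGA_COLS = 80, VGA_ROWS = 25 };

/* Build one text-mode cell: low byte = character code,
 * high byte = attribute (background in bits 4-7, foreground in bits 0-3). */
static inline uint16_t vga_entry(char ch, uint8_t fg, uint8_t bg)
{
    uint8_t attr = (uint8_t)(((bg & 0x0F) << 4) | (fg & 0x0F));
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* Convert a (row, column) position into a cell index in the buffer. */
static inline size_t vga_index(size_t row, size_t col)
{
    return row * VGA_COLS + col;
}

/* Write one character. In a kernel, buf would be
 * (volatile uint16_t *)0xB8000. */
static inline void vga_put(volatile uint16_t *buf, size_t row, size_t col,
                           char ch, uint8_t fg, uint8_t bg)
{
    buf[vga_index(row, col)] = vga_entry(ch, fg, bg);
}
```

For example, a white 'A' on blue is `vga_entry('A', 0x0F, 0x01)`, which encodes to 0x1F41.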
PS/2 Keyboard:
- Sends scan codes via serial communication
- Asynchronous input processing via IRQ1 interrupts
- Access via ports 0x60(data), 0x64(status)
This OS is created as a 1.44MB floppy disk image.
Floppy disk cross-section:
┌────────────────────────────┐
│          Track 0           │  Track 0 (outer)
│ ┌────────────────────────┐ │
│ │        Track 1         │ │  Track 1
│ │ ┌────────────────────┐ │ │
│ │ │      Track 2       │ │ │  Track 2
│ │ │      Track 3       │ │ │  Track 3
│ │ │        ...         │ │ │  ...
│ │ │      Track 79      │ │ │  Track 79 (inner)
│ │ │                    │ │ │
│ │ └────────────────────┘ │ │
│ └────────────────────────┘ │
└────────────────────────────┘
One track (actually circular) stretched horizontally:
┌──────────┬───┬───┬─────┬────┐
│ Sector 1 │ 2 │ 3 │ ... │ 18 │
└──────────┴───┴───┴─────┴────┘
Track = Concentric data track (0-79)
Sector = Fan-shaped section on track (1-18)
1.44MB Floppy Disk Layout:
Total capacity: 1,474,560 bytes (1440KB)
├─ Tracks: 80 (0-79)
├─ Heads: 2 (front/back surfaces)
├─ Sectors per Track: 18 (1-18)
└─ Bytes per Sector: 512
Calculation: 80 tracks × 2 heads × 18 sectors × 512 bytes = 1,474,560 bytes
Sector number calculation:
Linear Sector Number (LBA) = (Track × 2 + Head) × 18 + (Sector - 1)
Example: Track 0, Head 0, Sector 1 = 0 (Boot Sector)
Example: Track 0, Head 0, Sector 2 = 1 (Kernel start position)
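The formula above translates directly into C. This is a sketch for the 1.44MB geometry only; `chs_to_lba` and `lba_to_chs` are hypothetical helper names, not functions taken from the curriculum's code.

```c
#include <stdint.h>

/* 1.44MB floppy geometry from the layout above. */
enum { FLOPPY_HEADS = 2, FLOPPY_SECTORS_PER_TRACK = 18 };

/* Convert track/head/sector (sector is 1-based) to a 0-based sector number. */
static inline uint32_t chs_to_lba(uint32_t track, uint32_t head, uint32_t sector)
{
    return (track * FLOPPY_HEADS + head) * FLOPPY_SECTORS_PER_TRACK
           + (sector - 1);
}

/* Inverse conversion, useful when a BIOS disk read needs CHS values. */
static inline void lba_to_chs(uint32_t lba, uint32_t *track,
                              uint32_t *head, uint32_t *sector)
{
    *track  = lba / (FLOPPY_HEADS * FLOPPY_SECTORS_PER_TRACK);
    *head   = (lba / FLOPPY_SECTORS_PER_TRACK) % FLOPPY_HEADS;
    *sector = (lba % FLOPPY_SECTORS_PER_TRACK) + 1;
}
```

The last sector on the disk, Track 79, Head 1, Sector 18, maps to 2879, which matches the 2880-sector total.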
Sector 0 (First 512 bytes):
┌─────────────────┬───────────────────────────────────────────┐
│ Offset          │ Contents                                  │
├─────────────────┼───────────────────────────────────────────┤
│ 0x000 - 0x1FD   │ Boot Code (510 bytes)                     │
│                 │ ├─ A20 Line Enable                        │
│                 │ ├─ GDT (Global Descriptor Table) Setup    │
│                 │ ├─ Load Kernel from Sector 1+             │
│                 │ └─ Switch to Protected Mode               │
├─────────────────┼───────────────────────────────────────────┤
│ 0x1FE - 0x1FF   │ Boot Signature (bytes 0x55, 0xAA)         │
└─────────────────┴───────────────────────────────────────────┘
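The signature is stored as byte 0x55 at offset 0x1FE followed by byte 0xAA at offset 0x1FF. Read as a little-endian 16-bit word, that is 0xAA55. The following sketch checks and writes those two bytes in a 512-byte buffer; both function names are hypothetical.

```c
#include <stdint.h>

enum {
    BOOT_SECTOR_SIZE = 512,
    BOOT_SIG_OFFSET  = 0x1FE   /* the last two bytes of the sector */
};

/* Return nonzero if the 512-byte sector ends with 0x55, 0xAA. */
static inline int has_boot_signature(const uint8_t *sector)
{
    return sector[BOOT_SIG_OFFSET] == 0x55 &&
           sector[BOOT_SIG_OFFSET + 1] == 0xAA;
}

/* Stamp the signature into a sector image, as an image-building tool might. */
static inline void set_boot_signature(uint8_t *sector)
{
    sector[BOOT_SIG_OFFSET]     = 0x55;
    sector[BOOT_SIG_OFFSET + 1] = 0xAA;
}
```

A check like this on the first 512 bytes of the disk image is a quick first step when QEMU refuses to boot it.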
Sectors 1+ (Kernel):
┌─────────────────┬───────────────────────────────────────────┐
│ Sectors 1-N     │ Kernel Binary (Variable Size)             │
│                 │ ├─ kernel_entry.s (Assembly entry point)  │
│                 │ ├─ interrupt.s (Interrupt handlers)       │
│                 │ ├─ context_switch.s (Thread switching)    │
│                 │ └─ kernel.c (Main kernel code)            │
├─────────────────┼───────────────────────────────────────────┤
│ Sectors N+1-2879│ Unused Space (Padded with zeros)          │
│ (End of disk)   │ Total: 1440KB = 2880 sectors              │
└─────────────────┴───────────────────────────────────────────┘
Physical Memory Layout (32-bit Protected Mode):
0x00000000 ┌─────────────────────────────────────┐
           │ Interrupt Vector Table (IVT)        │
0x00000400 ├─────────────────────────────────────┤
           │ BIOS Data Area                      │
0x00000500 ├─────────────────────────────────────┤
           │ Free Conventional Memory            │
0x00007C00 ├─────────────────────────────────────┤
           │ Boot Sector (Loaded by BIOS)        │ ← 512 bytes
0x00007E00 ├─────────────────────────────────────┤
           │ Free Memory                         │
0x000A0000 ├─────────────────────────────────────┤
           │ VGA Memory                          │
0x000B8000 │ ├─ Text Mode Buffer                 │ ← VGA text display
0x000C0000 ├─────────────────────────────────────┤
           │ BIOS ROM                            │
0x00100000 ├─────────────────────────────────────┤ ← 1MB boundary
           │ Kernel Code (Loaded here)           │ ← Our OS kernel
0x00200000 ├─────────────────────────────────────┤ ← 2MB
           │ Kernel Stack                        │ ← Stack grows down
0x00300000 ├─────────────────────────────────────┤
           │ Thread Stacks & Data                │
           │ Available RAM...                    │
0xFFFFFFFF └─────────────────────────────────────┘
This repository is an educational curriculum for learning step-by-step from basic PC boot concepts to a full-featured multithreaded operating system in 12 days. Each day builds upon previous concepts, accumulating knowledge through practical development.
C language knowledge is assumed, while x86 assembly is explained in a beginner-friendly manner. At each stage, you'll learn core computer science concepts experientially while building a working OS.
The final completed version is in day99_completed.
C Language (intermediate level):
- Understanding of pointers, structures, arrays
- Concepts of function pointers and callbacks
- Understanding of memory layout (stack, heap)
- Bit operations and hexadecimal notation
Computer Science Fundamentals (recommended):
- Data structures (lists, queues, stacks)
- Basic algorithm concepts
- Reading and writing hexadecimal numbers
- Understanding of binary format
x86 Assembly (learn as you go):
- Explained step-by-step in this curriculum
- No prior understanding needed, start with copy & paste
For macOS:
# Install tools with Homebrew
brew install i686-elf-gcc nasm qemu clang-format
For Linux (Ubuntu/Debian):
# Cross-compiler and assembler
sudo apt-get update
sudo apt-get install build-essential nasm qemu-system-i386 clang-format
# Download i686-elf-tools-linux.zip
# from https://github.com/lordmilko/i686-elf-tools/releases/
wget -P ~/Downloads https://github.com/lordmilko/i686-elf-tools/releases/download/13.2.0/i686-elf-tools-linux.zip
# Extract it into /usr/local
cd /usr/local
sudo unzip ~/Downloads/i686-elf-tools-linux.zip
For Windows:
- WSL2 (recommended): Run Ubuntu on Windows and follow Linux instructions above
- MSYS2: Native Windows development environment
- VMware/VirtualBox: Use Linux virtual machine
Required tools:
- i686-elf-gcc: 32-bit x86 cross-compiler
- nasm: x86 assembler
- qemu-system-i386: x86 emulator
- make: Build automation tool
- C Language: Intermediate (can handle pointers, structures, arrays)
- Assembly: Beginner (knows concepts of registers, memory, basic instructions)
- Linux/Unix: Basic operations (can use make, compile, terminal operations)
- Day 01-02: Boot process is complex → Solution: Start with Day03, return to Day01 when comfortable
- Day 04-05: Inline assembly is cryptic → Solution: Run sample code first, then deepen understanding
- Day 08-09: Context switching is hard to grasp → Solution: Observe actual register changes with debugger
- Day 10-11: Scheduler is complex → Solution: Start with simple 2 threads, expand gradually
- Build failures: Re-check development environment setup steps
- QEMU won't start: Test with day99_completed directory, then compare with your code
- Difficult to understand: Use "Understanding Check" in each day's README.md
Goal: Bootloader and C kernel basics
| Day | Theme | Main Learning Content | Output |
|---|---|---|---|
| Day 01 | Bootloader basics | BIOS, MBR, 16-bit x86 assembly | "Hello World" bootloader |
| Day 02 | Protected mode transition | A20 line, GDT, 32-bit switching | VGA text display |
| Day 03 | C language integration | freestanding C, VGA API | C kernel foundation |
| Day 04 | Serial debugging | UART, I/O ports, inline assembly | Debug environment setup |
Goal: Interrupt system and multithreading foundations
| Day | Theme | Main Learning Content | Output |
|---|---|---|---|
| Day 05 | Interrupt infrastructure | IDT, exception handling, ISR stubs | Exception handler system |
| Day 06 | Timer interrupts | PIC, PIT, IRQ0 handling | 100Hz timer operation |
| Day 07 | Thread data structures | TCB, READY list, state management | Multithreading foundation |
| Day 08 | Context switching | Register save/restore, ESP switching | Thread switching functionality |
Goal: Complete practical OS
| Day | Theme | Main Learning Content | Output |
|---|---|---|---|
| Day 09 | Preemptive scheduling | Round-robin, time slicing | Automatic thread switching |
| Day 10 | Sleep/timing | Blocking, wake-up | sleep() function implementation |
| Day 11 | Keyboard input system | PS/2, ring buffer, IRQ1 | User input processing |
| Day 12 | Integration and final project | Quality improvement, testing, documentation | Complete OS |
# Clone repository
git clone <repository-url>
cd mini-os
# Check development tools
make check-env
cd day01
make clean
make all
make run # Launch OS in QEMU
- Read README.md: Understand learning goals and theory for that day
- Implement: Create code step by step
- Test: Verify operation with make run
- Understand: Consider why that code is necessary
- Move on: Compare with completed version and proceed to next Day
mini-os/
├── README.md               # This file
├── day01/                  # Day 01: Bootloader basics
│   ├── README.md
│   ├── boot.s
│   └── Makefile
├── day01_completed/        # Day 01 completed version (reference)
├── day02/                  # Day 02: Protected mode
├── day02_completed/
...
├── day12/                  # Day 12: Final integration
├── day12_completed/
└── day99_completed/        # Final completed version (with extensions)
    ├── src/
    │   ├── kernel/
    │   ├── drivers/
    │   ├── boot/
    │   └── include/
    ├── tests/
    └── docs/
1. Gradual understanding:
- Don't try to understand everything at once
- Verify it works before learning details
- Note questions and research later
2. Hands-on practice:
- Copy & paste is OK initially
- Verify operation before understanding meaning
- Make small changes and experiment
3. Debugging skills:
- Utilize serial output
- Learn QEMU monitor commands
- Get comfortable with system hangs (normal phenomenon)
4. Review and application:
- Review code from previous Days
- Compare with other architectures
- Consider differences from modern OS
Assembly language:
- Memorization is OK initially, understand gradually
- Distinguish register names and sizes (AX, EAX, etc.)
- Addressing modes (direct, indirect, etc.)
Memory layout:
- Physical vs logical addresses
- Segmentation concepts
- Stack growth direction (downward)
Interrupt handling:
- Difference between synchronous (exceptions) and asynchronous (IRQ)
- Need for interrupt disabling
- EOI (End of Interrupt) transmission timing
1. Build errors:
# Tools not found
brew install i686-elf-gcc nasm qemu # macOS
sudo apt-get install build-essential nasm qemu-system-i386 # Linux
# Path not set
export PATH="/usr/local/bin:$PATH"
2. QEMU won't start:
# Check QEMU binary
which qemu-system-i386
# Check permissions
ls -la os.img
3. Black screen:
- Check boot signature (0x55AA)
- Jump instruction destination address
- GDT settings and segment selectors
4. System hang:
- Check intentional infinite loop placement
- Interrupt setup order
- Stack pointer initialization
Using serial output:
# Save serial output to file
make run > debug.log 2>&1
# Check log in real-time
make run | tee debug.log
QEMU monitor:
# Launch in debug mode
make debug
# Monitor commands (QEMU console)
(qemu) info registers # Register status
(qemu) info mem # Active virtual memory mappings (meaningful once paging is enabled)
(qemu) x/10i $eip # Disassemble 10 instructions starting at EIP
This project is optimized for educational purposes.
Welcome contributions:
- Corrections for typos and technical errors
- Proposals for clearer explanations
- Additional debug information and troubleshooting
Pull request guidelines:
- Clearly explain changes
- Maintain existing learning sequence
- Prioritize clarity for beginners
- Specify that changes have been tested
- Day 01-02: "Hello World" output to screen → Feel of controlling hardware
- Day 03-04: Build kernel with C → Complete foundation for serious OS development
- Day 05-06: Heartbeat display with interrupts → Sense of system being "alive"
- Day 07-08: Successful thread switching → Learn basic principles of multitasking
- Day 09-10: Preemptive scheduling → Understand same mechanisms as modern OS
- Day 11-12: Keyboard input support → First step toward practical OS
- Run day99_completed: See the completed form and regain motivation
- Step back: Return to previous Day to reconfirm basics if current Day is difficult
- Community: Share learning progress with peers in similar situations
- Career perspective: Imagine how this knowledge can be applied to work
- Beginners: 2-4 hours per Day, 40-60 hours total
- Intermediate: 1-2 hours per Day, 15-30 hours total
- Advanced: 0.5-1 hour per Day, 8-15 hours total
- Regular time allocation: Concentrate on weekends or 30 minutes daily on weekdays
- Visualize results: Record videos or take screenshots of each Day's operation
- Learning log: Note what was understood and stumbling points
- SNS sharing: Share learning progress on Twitter etc. to boost motivation
Topics you can take on after completing this project:
Intermediate level:
- Shell creation: Create command-line shell
- 64-bit migration: Extend to x86_64 architecture
- Memory management: Paging, virtual memory, MMU
- File system: FAT12/16, simple file operations
- Networking: Basic TCP/IP implementation
Advanced level:
- Multi-core support: SMP, inter-CPU synchronization
- Device drivers: Support for more hardware
- Userland: System calls, process isolation
- GUI: Basic window system
Other architectures:
- ARM: OS development for Raspberry Pi
- RISC-V: New open architecture
- Embedded: Real-time OS for microcontrollers
Fields where these skills are useful:
- System software: OS, device drivers, firmware
- Embedded systems: IoT, automotive, industrial equipment
- High-performance computing: HPC, distributed systems, databases
- Security: Malware analysis, system auditing, vulnerability research
- Game development: Engine optimization, low-level performance tuning
Points to highlight in interviews:
- Optimization abilities based on hardware understanding
- Practical experience with low-level debugging
- Design capability to oversee entire systems
- Persistent approach to difficult problems
- Boot Sector / MBR: 512-byte data area that BIOS loads first. This size is determined by hardware constraints, and BIOS loads this into memory at address 0x7C00 and begins execution. Important entry point for OS development.
- org 0x7c00: Assembler directive that tells the assembler the code will be loaded at memory address 0x7C00, so that labels and absolute references are computed from that base. The address is historical: the original IBM PC BIOS loaded the boot sector there, and later BIOSes kept it for compatibility. Based on real mode memory mapping.
- Protected Mode: The counterpart to real mode; a 32-bit protected execution environment. Provides memory protection, virtual memory, and privilege levels. While real mode is limited to 1MB of memory, protected mode can address up to 4GB and underlies modern OS design.
- VGA Text Buffer / 0xB8000: Mechanism for screen display by writing directly to video card memory area (memory-mapped I/O). Writing character codes and attributes in 2-byte units to address 0xB8000 automatically displays on screen. Important concept bridging hardware and software.
- Linker Script (.ld): Instructions for arranging compiled object files into final executable. Controls which memory addresses to place object file sections (.text, .data, etc.). Important for OS development in freestanding environment as memory layout must be precisely defined.
- Interrupt: "Event notification" mechanism for CPU. Responds to asynchronous requests from hardware or software, temporarily suspending current processing to execute handler. Core concept for OS responsiveness.
- IRQ (Interrupt ReQuest): Physical lines through which interrupt request signals pass (interrupt lines). PIC manages multiple IRQs and transmits to CPU. IRQ 0-15 are standardly assigned, corresponding to devices like keyboard, timer, etc.
- PIC (Programmable Interrupt Controller): 8259A chip that bundles IRQs from multiple devices and forwards them to the CPU. The BIOS default vector assignments overlap the CPU exception vectors used in protected mode, so the OS remaps IRQ0-15 to vectors 32-47 during initialization before enabling interrupts.
- PIT (Programmable Interval Timer): 8254 chip that generates periodic interrupts as timer device. Divides reference clock 1.193182MHz to generate arbitrary periods. Channel 0 is used as system timer, forming foundation for scheduling.
- I/O Port (inb, outb): Mechanism for communicating with hardware using address space (I/O space) separate from memory space. inb instruction reads data from port, outb writes. Basic method for hardware control.
- Context Switch: Core concept of multitasking. Process of "saving execution state (registers, stack pointer, etc.) of one thread and restoring state of another thread". Efficiently managed using TCB.
- TCB (Thread Control Block): Data structure for storing thread state. Contains information like register values, stack pointer, execution state, priority. One assigned per thread, manipulated by scheduler.
- Privilege Level (Ring 0): Part of CPU protection functionality; highest privilege mode where OS kernel operates. Ring 0 allows all hardware access, while Ring 3 (user mode) has restrictions. Foundation of OS security.
- Ring Buffer (SPSC): Abbreviation for Single Producer Single Consumer; "simple circular buffer with one writer and one reader". Can safely exchange data without collisions. Suitable for asynchronous communication like keyboard input, implemented in technical detail in day99_completed.
- Toolchain (i686-elf-gcc, nasm): Cross-compilation environment. While normal gcc generates code for host OS (macOS, etc.), i686-elf-gcc is a "cross-compiler" that generates code for target OS (custom OS). nasm is assembler; both essential for OS development.
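The SPSC ring buffer entry in the list above can be illustrated with a short sketch. This is not the implementation from day99_completed: the names, the 16-byte size, and the use of `volatile` in place of explicit memory barriers are assumptions that suit a single-core kernel where the producer is the keyboard interrupt handler and the consumer is a thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u   /* assumed size; must be a power of two */

typedef struct {
    volatile uint32_t head;   /* next write slot, advanced only by the producer */
    volatile uint32_t tail;   /* next read slot, advanced only by the consumer  */
    uint8_t data[RB_SIZE];
} ring_buffer_t;

/* Producer side (for example the IRQ1 handler). Returns false when full. */
static inline bool rb_push(ring_buffer_t *rb, uint8_t value)
{
    uint32_t next = (rb->head + 1u) & (RB_SIZE - 1u);
    if (next == rb->tail)
        return false;             /* full: one slot is always left empty */
    rb->data[rb->head] = value;
    rb->head = next;              /* publish only after the data is stored */
    return true;
}

/* Consumer side (for example a keyboard thread). Returns false when empty. */
static inline bool rb_pop(ring_buffer_t *rb, uint8_t *out)
{
    if (rb->tail == rb->head)
        return false;             /* empty */
    *out = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) & (RB_SIZE - 1u);
    return true;
}
```

Because each index is written by exactly one side, neither side needs to disable interrupts for a push or a pop. The cost is one unused slot, so this buffer holds at most RB_SIZE - 1 bytes.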
With this curriculum, you too will join the ranks of OS developers!
We welcome questions, feedback, and improvement suggestions. Let's build better learning resources together!