QEMU Address Space Abstraction
Timeline
2025-11-21
- init
Environment
Source Code
    wget https://download.qemu.org/qemu-10.1.2.tar.xz
Create .clangd:
    CompileFlags:
gdb
    gdb --args ./build/qemu-system-riscv64 -M virt -device edu,id=edu1 -nographic
Basic Introduction
From the CPU’s perspective, all memory accesses operate on addresses (load/store). The CPU doesn’t care what device lies behind an address — as long as it can read/write the correct result.
CPU Memory Access Flow
- After the CPU computes the target address (via the ALU), it sends it onto the address bus along with read/write control signals;
- The device corresponding to the address (which could be regular memory or an I/O device, i.e., a peripheral) responds to the address bus signals;
- For reads, the data at that address is returned via the bus at the CPU-specified width, typically stored in the register given by the memory access instruction;
- For writes, bus data is written to the specified address at the CPU-specified width. For I/O devices, this typically updates the register at that address and may produce side effects.
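The flow above can be sketched as a tiny bus model. This is a minimal illustration, not QEMU code: the `toy_` names, the RAM window at 0x80000000, and the single device register at 0x10000000 are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for illustration only (not taken from QEMU). */
#define TOY_RAM_BASE 0x80000000u
#define TOY_RAM_SIZE 0x00000100u
#define TOY_DEV_BASE 0x10000000u

static uint8_t  toy_ram[TOY_RAM_SIZE];
static uint32_t toy_dev_reg;     /* the device's single 32-bit register */
static unsigned toy_dev_writes;  /* side effect: counts register writes */

/* True if [addr, addr + width) lies inside [base, base + size).
 * Unsigned subtraction wraps, so addr < base yields a huge offset. */
static int toy_in_range(uint32_t addr, unsigned width,
                        uint32_t base, uint32_t size)
{
    uint32_t off = addr - base;
    return off < size && width <= size - off;
}

/* Read `width` bytes (1, 2 or 4) at `addr`; RAM is little-endian. */
static uint32_t toy_bus_read(uint32_t addr, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4);
    if (toy_in_range(addr, width, TOY_RAM_BASE, TOY_RAM_SIZE)) {
        uint32_t val = 0;
        for (unsigned i = 0; i < width; i++) {
            val |= (uint32_t)toy_ram[addr - TOY_RAM_BASE + i] << (8 * i);
        }
        return val;
    }
    if (addr == TOY_DEV_BASE && width == 4) {
        return toy_dev_reg;
    }
    return 0; /* unmapped: this toy bus reads as zero */
}

/* Write the low `width` bytes of `val` at `addr`. */
static void toy_bus_write(uint32_t addr, uint32_t val, unsigned width)
{
    assert(width == 1 || width == 2 || width == 4);
    if (toy_in_range(addr, width, TOY_RAM_BASE, TOY_RAM_SIZE)) {
        for (unsigned i = 0; i < width; i++) {
            toy_ram[addr - TOY_RAM_BASE + i] = (uint8_t)(val >> (8 * i));
        }
    } else if (addr == TOY_DEV_BASE && width == 4) {
        toy_dev_reg = val;
        toy_dev_writes++; /* the write has a side effect beyond storage */
    }
    /* Writes to unmapped addresses are silently dropped. */
}
```

The caller only supplies an address, a width, and a direction. Whether the access lands in RAM or in a device register is decided entirely by the address decode.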
How QEMU Simulates Memory/Peripherals
To simulate memory/peripheral behavior, QEMU must implement at least:
- Basic address space management — distinguish what device corresponds to a CPU-delivered address;
- Discrete address mapping — some peripheral addresses may not be contiguous;
- Address remapping — e.g., MCS-51’s RAM and XRAM both start from address 0.
QEMU provides two concepts: address-space and memory-region (mr). The former describes the mapping relationships of an entire address space (different components may see different address spaces), while the latter describes the mapping rules within a specific address range in an address space.
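To make the distinction concrete, here is a small sketch of two address spaces in which address 0 resolves to different storage, as in the MCS-51 RAM/XRAM case. The `Toy` types are hypothetical stand-ins and are far simpler than QEMU's real `AddressSpace` and `MemoryRegion`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types for illustration only. */
typedef struct ToyRegion {
    uint32_t base;    /* first address covered, within the owning space */
    uint32_t size;    /* number of bytes covered */
    uint8_t *storage; /* backing bytes */
} ToyRegion;

typedef struct ToyAddressSpace {
    const char *name;
    ToyRegion  *regions;
    size_t      nregions;
} ToyAddressSpace;

/* Return a pointer to the byte behind `addr` in `as`, or NULL if unmapped. */
static uint8_t *toy_as_lookup(const ToyAddressSpace *as, uint32_t addr)
{
    assert(as != NULL);
    for (size_t i = 0; i < as->nregions; i++) {
        const ToyRegion *r = &as->regions[i];
        uint32_t off = addr - r->base; /* wraps to a large value if addr < base */
        if (off < r->size) {
            return &r->storage[off];
        }
    }
    return NULL;
}

static uint8_t toy_iram_bytes[128];  /* internal RAM */
static uint8_t toy_xram_bytes[1024]; /* external RAM */

/* Both regions start at address 0, but in different address spaces. */
static ToyRegion toy_iram_regions[] = {
    { 0, sizeof(toy_iram_bytes), toy_iram_bytes }
};
static ToyRegion toy_xram_regions[] = {
    { 0, sizeof(toy_xram_bytes), toy_xram_bytes }
};

static const ToyAddressSpace toy_iram = { "iram", toy_iram_regions, 1 };
static const ToyAddressSpace toy_xram = { "xram", toy_xram_regions, 1 };
```

Calling `toy_as_lookup(&toy_iram, 0)` and `toy_as_lookup(&toy_xram, 0)` returns pointers into two different arrays: the same numeric address means different things depending on which address space the access goes through.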
Address Space Layout
Use info mtree in the QEMU monitor to see the layout:
    (qemu) info mtree
| Address Range | Type | Device |
|---|---|---|
| 0x00001000~0x0000FFFF | ROM | Board firmware |
| 0x00100000~0x00101023 | I/O | Test device + RTC |
| 0x02000000~0x0200BFFF | I/O | CLINT (SW/timer) |
| 0x0C000000~0x0C5FFFFF | I/O | PLIC |
| 0x10000000~0x100081FF | I/O | UART + Virtio-MMIO |
| 0x80000000~0x87FFFFFF | RAM | Guest memory |
- A Guest (the simulated target, here a virt machine) can have multiple address-spaces, each with potentially different mapping relationships — typical ones are I/O and memory.
- Each address-space corresponds to an mr tree. For example, address-space: memory has the root node system, and info mtree lists its child nodes in address order.
Since an mr describes the mapping rules for one specific address range, devices at non-contiguous addresses are easy to map: each simply gets its own mr.
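Discrete mapping can be sketched as a lookup over a few of the ranges from the table above. The `ToyRange` type and `toy_virt_lookup` function are hypothetical, and a flat array scan is only a stand-in for QEMU's actual lookup structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyRange {
    uint64_t    first; /* first address, inclusive */
    uint64_t    last;  /* last address, inclusive */
    const char *what;
} ToyRange;

/* A subset of the table above; the gaps between entries are unmapped. */
static const ToyRange toy_virt_map[] = {
    { 0x00001000, 0x0000FFFF, "ROM: board firmware" },
    { 0x00100000, 0x00101023, "I/O: test device + RTC" },
    { 0x10000000, 0x100081FF, "I/O: UART + Virtio-MMIO" },
    { 0x80000000, 0x87FFFFFF, "RAM: guest memory" },
};

/* Return the entry covering `addr`, or NULL if the address is a hole. */
static const ToyRange *toy_virt_lookup(uint64_t addr)
{
    size_t n = sizeof(toy_virt_map) / sizeof(toy_virt_map[0]);
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        if (addr >= toy_virt_map[i].first && addr <= toy_virt_map[i].last) {
            return &toy_virt_map[i];
        }
    }
    return NULL;
}
```

An address such as 0x00200000 falls between two entries, so the lookup returns NULL. That is the "hole" case a real machine model has to handle as an unmapped access.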
Example:
    address-space: cpu-memory-0

- cpu-memory-0 is the address-space
- system is the top-level memory-region (container for the entire virtual system)
- riscv_virt_board.mrom, sifive.test, etc. are child memory-regions
- Each memory-region hangs under an address-space and provides access handlers
Memory Region Address Overlap
An mr's subregions may overlap at the same level. Overlaps are resolved by priority: where two subregions cover the same address, the higher-priority one becomes the access target.
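Priority resolution can be sketched with a container that keeps its children sorted by descending priority and returns the first match. The `Toy` names are hypothetical, and a fixed-size array stands in for whatever list structure a real implementation uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SUBREGIONS 8

typedef struct ToySub {
    uint64_t    base;
    uint64_t    size;
    int         priority; /* higher wins where ranges overlap */
    const char *name;
} ToySub;

typedef struct ToyContainer {
    ToySub subs[TOY_MAX_SUBREGIONS]; /* kept sorted, highest priority first */
    size_t count;
} ToyContainer;

/* Insert `s` so that the array stays sorted by descending priority. */
static void toy_add_subregion(ToyContainer *c, ToySub s)
{
    assert(c->count < TOY_MAX_SUBREGIONS);
    size_t i = c->count;
    while (i > 0 && c->subs[i - 1].priority < s.priority) {
        c->subs[i] = c->subs[i - 1]; /* shift lower-priority entries back */
        i--;
    }
    c->subs[i] = s;
    c->count++;
}

/* The first match in priority order is the access target. */
static const ToySub *toy_resolve(const ToyContainer *c, uint64_t addr)
{
    for (size_t i = 0; i < c->count; i++) {
        const ToySub *s = &c->subs[i];
        if (addr >= s->base && addr - s->base < s->size) {
            return s;
        }
    }
    return NULL;
}
```

With a priority-0 region covering 0x0000 to 0x7FFF and a priority-1 overlay covering 0x2000 to 0x2FFF, an access to 0x2800 resolves to the overlay, while 0x1000 and 0x3000 still resolve to the lower-priority region underneath.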
For mr A, its address range can be viewed as:
    A:[DDDDDDDDDDDDDDDDD|CCCCCCCCCCCCCCCCC|BBBBBBBBBBBBBBBBB|AAAAAAAAAAAAAAAAA]
To implement this, QEMU uses aliases to describe overlapping parts of an mr. An alias maps part of one mr onto another mr, simplifying memory simulation (analogous to mmap).
Alias Example
    0000000030000000-000000003fffffff (prio 0, i/o): alias pcie-ecam @pcie-mmcfg-mmio 0000000000000000-000000000fffffff
- An alias memory-region maps a region to another address range in the address-space.
- Convenient for different buses accessing the same physical device.
- An alias is also a memory-region, but internally references another region.
AddressSpace
Represents the complete address space as seen by a CPU or bus. Includes all mapped memory-regions, each region’s priority (prio), and type (RAM/ROM/I/O/alias). Can be understood as the virtual machine’s “physical address space view”.
A CPU can have multiple address-spaces (e.g., a RISC-V CPU has cpu-memory-0, plus other I/O spaces or PCI bus spaces).
    struct AddressSpace {
MemoryRegion
A concrete block of memory or I/O. Includes: start address (an offset relative to its parent region), size, type (RAM/ROM/I/O/alias/container), child memory-regions (supports nesting), and the corresponding read/write handlers or object pointers. Can be understood as a “single block” within an address-space: physical memory, device registers, a PCI BAR, etc.
    struct MemoryRegion {
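The idea of a region carrying its own read/write handlers can be sketched with a callback table and an opaque device pointer. The `ToyOps` layout below is loosely modeled on that idea and is an assumption for illustration, not the definition of QEMU's `MemoryRegionOps`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback table for an I/O region. */
typedef struct ToyOps {
    uint64_t (*read)(void *opaque, uint64_t offset, unsigned size);
    void     (*write)(void *opaque, uint64_t offset, uint64_t value,
                      unsigned size);
} ToyOps;

typedef struct ToyIoRegion {
    uint64_t      size;   /* bytes covered by the region */
    const ToyOps *ops;    /* handlers invoked on access */
    void         *opaque; /* device state handed back to the handlers */
} ToyIoRegion;

/* A toy device with one accumulating counter register at offset 0. */
typedef struct ToyCounter {
    uint64_t value;
} ToyCounter;

static uint64_t toy_counter_read(void *opaque, uint64_t offset, unsigned size)
{
    ToyCounter *dev = opaque;
    (void)size;
    return offset == 0 ? dev->value : 0;
}

static void toy_counter_write(void *opaque, uint64_t offset, uint64_t value,
                              unsigned size)
{
    ToyCounter *dev = opaque;
    (void)size;
    if (offset == 0) {
        dev->value += value; /* side effect: writes accumulate */
    }
}

static const ToyOps toy_counter_ops = { toy_counter_read, toy_counter_write };

/* Dispatch helpers: the caller knows the region, not the device type. */
static uint64_t toy_region_read(const ToyIoRegion *r, uint64_t offset,
                                unsigned size)
{
    assert(offset < r->size);
    return r->ops->read(r->opaque, offset, size);
}

static void toy_region_write(const ToyIoRegion *r, uint64_t offset,
                             uint64_t value, unsigned size)
{
    assert(offset < r->size);
    r->ops->write(r->opaque, offset, value, size);
}
```

Writing 5 and then 2 to offset 0 leaves the register reading 7, which shows why an I/O region cannot be treated as plain storage: the handler decides what a write means.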
Initialization Flow
    main() // system/main.c
system_memory is a global pointer to the root mr. The article traces how system_memory->ops and system_memory->subregions are initialized using gdb watchpoints.
Each mr keeps its children in its subregions list (a QTAILQ, ordered by priority), so mrs nest into a tree. The root field in an address-space points to the root mr, which gives one memory-region tree per address-space.
A RAM- or ROM-type mr is backed by a RAMBlock, a block of memory allocated from the Host that serves as storage for Guest memory. An I/O mr has no such storage and is backed by its read/write handlers instead.
mr types include: RAM, ROM, MMIO (I/O), IOMMU, alias, container.
A container memory-region is a special type that holds other mrs and records each child’s offset. It stores no data itself and has no direct read/write handler. Its purpose:
- Manage child memory-regions
- Provide address offset mapping
- Build hierarchical structures
Using mr containers, different address hierarchy relationships can be created, clearly describing relationships between subsystems at the address space level — beneficial for modularization.
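The offset bookkeeping that containers provide can be sketched by walking from a region up to the root and summing offsets. The `ToyNode` type and the three-level hierarchy below are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyNode {
    const char           *name;
    uint64_t              offset;    /* offset inside the parent container */
    const struct ToyNode *container; /* NULL for the root */
} ToyNode;

/* Sum the offsets from a region up to the root to get its absolute base. */
static uint64_t toy_absolute_base(const ToyNode *n)
{
    assert(n != NULL);
    uint64_t base = 0;
    for (; n != NULL; n = n->container) {
        base += n->offset;
    }
    return base;
}

/* Hypothetical hierarchy: root -> soc container at 0x10000000 -> uart at 0x1000. */
static const ToyNode toy_root = { "system", 0x0,        NULL };
static const ToyNode toy_soc  = { "soc",    0x10000000, &toy_root };
static const ToyNode toy_uart = { "uart",   0x1000,     &toy_soc };
```

Because each child only records an offset relative to its container, moving the `soc` container to a different base relocates every device under it without touching the devices themselves.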