Timeline

2025-11-21

  1. init

Environment

Source Code

wget https://download.qemu.org/qemu-10.1.2.tar.xz
tar xvJf qemu-10.1.2.tar.xz
cd qemu-10.1.2
mkdir -p output
./configure --prefix=$PWD/output --target-list=aarch64-softmmu,riscv64-softmmu --enable-debug
bear -- make -j$(nproc)

Create .clangd:

CompileFlags:
  Add: -Wno-unknown-warning-option
  Remove: [-m*, -f*]

gdb

gdb -args ./build/qemu-system-riscv64 -M virt -device edu,id=edu1 -nographic

Basic Introduction

From the CPU’s perspective, all memory accesses operate on addresses (load/store). The CPU doesn’t care what device lies behind an address — as long as it can read/write the correct result.

CPU Memory Access Flow

  1. After the CPU computes the target address (via the ALU), it sends it onto the address bus along with read/write control signals;
  2. The device corresponding to the address (which could be regular memory or an I/O device, i.e., a peripheral) responds to the address bus signals;
  3. For reads, the data at that address is returned via the bus at the CPU-specified width, typically stored in the register given by the memory access instruction;
  4. For writes, bus data is written to the specified address at the CPU-specified width. For I/O devices, this typically updates the register at that address and may produce side effects.
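
The four steps above can be sketched as a toy bus model in C. This is only an illustration under stated assumptions, not QEMU code: Device, bus_decode, bus_read, and bus_write are hypothetical names, each device holds a single 64-bit cell, and access width is ignored for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy bus model. All names are hypothetical; none of this is QEMU API. */
typedef struct Device Device;
struct Device {
    uint64_t base;          /* first bus address the device responds to */
    uint64_t size;          /* length of its address window in bytes */
    uint64_t reg;           /* one backing cell, enough for the sketch */
    unsigned side_effects;  /* counts writes that triggered device work */
    uint64_t (*read)(Device *dev, uint64_t offset);
    void (*write)(Device *dev, uint64_t offset, uint64_t value);
};

/* Plain memory: a read only loads, a write only stores. */
static uint64_t ram_read(Device *dev, uint64_t offset)
{
    (void)offset;
    return dev->reg;
}

static void ram_write(Device *dev, uint64_t offset, uint64_t value)
{
    (void)offset;
    dev->reg = value;
}

/* I/O register: a write updates the register and has a side effect. */
static void io_write(Device *dev, uint64_t offset, uint64_t value)
{
    (void)offset;
    dev->reg = value;
    dev->side_effects++;
}

/* Step 2: find the device whose window contains addr, or NULL. */
static Device *bus_decode(Device *devs, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= devs[i].base && addr - devs[i].base < devs[i].size) {
            return &devs[i];
        }
    }
    return NULL;
}

/* Step 3: a load. Returns 0 on success, -1 if no device decodes addr. */
static int bus_read(Device *devs, size_t n, uint64_t addr, uint64_t *out)
{
    Device *dev = bus_decode(devs, n, addr);
    if (!dev) {
        return -1;
    }
    *out = dev->read(dev, addr - dev->base);
    return 0;
}

/* Step 4: a store. Returns 0 on success, -1 if no device decodes addr. */
static int bus_write(Device *devs, size_t n, uint64_t addr, uint64_t value)
{
    Device *dev = bus_decode(devs, n, addr);
    if (!dev) {
        return -1;
    }
    dev->write(dev, addr - dev->base, value);
    return 0;
}
```

The CPU side only calls bus_read and bus_write with an address. Whether the target is memory (ram_write) or a peripheral register (io_write) is decided entirely by the decode step.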

How QEMU Simulates Memory/Peripherals

To simulate memory/peripheral behavior, QEMU must implement at least:

  1. Basic address space management — distinguish what device corresponds to a CPU-delivered address;
  2. Discrete address mapping — some peripheral addresses may not be contiguous;
  3. Address remapping — e.g., MCS-51’s RAM and XRAM both start from address 0.

QEMU provides two concepts: address-space and memory-region (mr). The former describes the mapping relationships of an entire address space (different components may see different address spaces), while the latter describes the mapping rules within a specific address range in an address space.
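
To make the split between the two concepts concrete, here is a minimal C sketch of the three requirements listed above. It is a simplified model rather than QEMU's implementation: Space stands in for an address-space, Window for a memory-region, and both names (as well as space_device_at) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the concepts; this is not QEMU code. */
typedef struct Window {
    const char *device;  /* which device answers in this range */
    uint64_t start;      /* first address of the range */
    uint64_t size;       /* length of the range in bytes */
} Window;

typedef struct Space {
    const char *name;       /* name of the address space */
    const Window *windows;  /* ranges mapped into this space */
    size_t count;           /* number of entries in windows */
} Space;

/* Return the device mapped at addr in this space, or NULL if unmapped. */
static const char *space_device_at(const Space *space, uint64_t addr)
{
    for (size_t i = 0; i < space->count; i++) {
        const Window *w = &space->windows[i];
        if (addr >= w->start && addr - w->start < w->size) {
            return w->device;
        }
    }
    return NULL;
}
```

With one Space for internal RAM and one for XRAM, address 0 can resolve to a different device in each space (remapping), and a single device can own several non-adjacent Window entries in the same space (discrete mapping).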

Address Space Layout

Use info mtree in the QEMU monitor to see the layout:

(qemu) info mtree
address-space: cpu-memory-0
address-space: memory
  0000000000000000-ffffffffffffffff (prio 0, i/o): system
    0000000000001000-000000000000ffff (prio 0, rom): riscv_virt_board.mrom
    0000000000100000-0000000000100fff (prio 0, i/o): riscv.sifive.test
    ...
    0000000080000000-0000000087ffffff (prio 0, ram): riscv_virt_board.ram

Address Range            Type   Device
0x00001000~0x0000FFFF    ROM    Board firmware
0x00100000~0x00101023    I/O    Test device + RTC
0x02000000~0x020BFFFF    I/O    CLINT (SW/timer)
0x0C000000~0x0C5FFFFF    I/O    PLIC
0x10000000~0x100081FF    I/O    UART + Virtio-MMIO
0x80000000~0x87FFFFFF    RAM    Guest memory

  • A Guest (the simulated target, here a virt machine) can have multiple address-spaces, each with potentially different mapping relationships — typical ones are I/O and memory.
  • Each address-space corresponds to a mr tree. For example, address-space: memory has the root node system, with child nodes ordered by address.

Since a mr describes mapping rules for a specific address range, discrete device mapping is easily achieved.

Example:

address-space: cpu-memory-0
  0000000000000000-ffffffffffffffff (prio 0, i/o): system
    0000000000001000-000000000000ffff (prio 0, rom): riscv_virt_board.mrom
    0000000000100000-0000000000100fff (prio 0, i/o): riscv.sifive.test
    ...
  • cpu-memory-0 is the address-space
  • system is the top-level memory-region (container for the entire virtual system)
  • riscv_virt_board.mrom, riscv.sifive.test, etc. are child memory-regions
  • Each memory-region is reachable from an address-space through that space's root; leaf regions such as I/O regions provide the access handlers

Memory Region Address Overlap

mr supports overlapping address ranges at the same level. Overlapping regions are resolved by priority; higher-priority overlapping sections become the access target.

For an mr A that is partially covered by higher-priority regions B, C, and D, the resolved view of A's address range looks like:

A:[DDDDDDDDDDDDDDDDD|CCCCCCCCCCCCCCCCC|BBBBBBBBBBBBBBBBB|AAAAAAAAAAAAAAAAA]
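
The priority rule can be sketched in a few lines of C. This is a simplified model under assumptions, not QEMU's flattening code: Sub and resolve_overlap are hypothetical names, A is modelled as a low-priority background region, and ties between equal priorities simply go to the first entry found.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sibling region; this is not QEMU's MemoryRegion. */
typedef struct Sub {
    const char *name;
    uint64_t addr;   /* offset inside the parent */
    uint64_t size;   /* length in bytes */
    int priority;    /* higher value wins where ranges overlap */
} Sub;

/*
 * Return the highest-priority region containing addr, or NULL if no
 * region covers it. On equal priority the earlier array entry is kept.
 */
static const Sub *resolve_overlap(const Sub *subs, size_t n, uint64_t addr)
{
    const Sub *best = NULL;

    for (size_t i = 0; i < n; i++) {
        const Sub *s = &subs[i];
        if (addr < s->addr || addr - s->addr >= s->size) {
            continue;   /* addr is outside this region */
        }
        if (!best || s->priority > best->priority) {
            best = s;
        }
    }
    return best;
}
```

If A spans the whole range at priority 0 and D, C, B cover its first three quarters at higher priorities, lookups across the range resolve to D, C, B, and finally A, which matches the diagram above.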

Overlap is resolved by priority when QEMU flattens the mr tree. A related mechanism is the alias: an alias maps part of one mr into another mr, so the same region can appear at more than one address, which simplifies memory simulation (analogous to mmap).

Alias Example

0000000030000000-000000003fffffff (prio 0, i/o): alias pcie-ecam @pcie-mmcfg-mmio 0000000000000000-000000000fffffff
  • An alias memory-region maps a region to another address range in the address-space.
  • Convenient for different buses accessing the same physical device.
  • An alias is also a memory-region, but internally references another region.
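
The address arithmetic behind an alias can be illustrated with a short C sketch. It is a simplified model, not QEMU's code: the Alias struct and alias_translate are hypothetical, and only the field name alias_offset is borrowed from struct MemoryRegion.

```c
#include <stdint.h>

/* Hypothetical alias descriptor; this is not QEMU's MemoryRegion. */
typedef struct Alias {
    uint64_t base;          /* where the alias appears in the address space */
    uint64_t size;          /* length of the aliased window in bytes */
    uint64_t alias_offset;  /* offset into the target region where it starts */
} Alias;

/*
 * Translate an address inside the alias window into an offset within
 * the target region. Returns 0 on success, -1 if addr is outside.
 */
static int alias_translate(const Alias *a, uint64_t addr, uint64_t *target_off)
{
    if (addr < a->base || addr - a->base >= a->size) {
        return -1;
    }
    *target_off = a->alias_offset + (addr - a->base);
    return 0;
}
```

For the pcie-ecam line above (window starting at 0x30000000, target range starting at offset 0), an access to 0x30001234 would land at offset 0x1234 of pcie-mmcfg-mmio.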

AddressSpace

An AddressSpace represents the complete address space as seen by a CPU or bus. It covers all mapped memory-regions together with each region’s priority (prio) and type (RAM/ROM/I/O/alias), and can be understood as the virtual machine’s “physical address space view” for that observer.

A CPU can have multiple address-spaces (e.g., a RISC-V CPU has cpu-memory-0, plus other I/O spaces or PCI bus spaces).

struct AddressSpace {
    struct rcu_head rcu;
    char *name;                     /* name shown by info mtree */
    MemoryRegion *root;             /* root of this space's mr tree */
    struct FlatView *current_map;   /* flattened view of the tree, accessed via RCU */
    int ioeventfd_nb;
    int ioeventfd_notifiers;
    struct MemoryRegionIoeventfd *ioeventfds;
    QTAILQ_HEAD(, MemoryListener) listeners;
    QTAILQ_ENTRY(AddressSpace) address_spaces_link;
    // ...
};

MemoryRegion

A MemoryRegion is a concrete block of memory or I/O. It records a start address (addr, an offset relative to its container), a size, a type (RAM/ROM/I/O/alias/container), child memory-regions (nesting is supported), and the corresponding read/write handlers or object pointers. It can be understood as a “single block” within an address-space: physical memory, device registers, a PCI BAR, etc.

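struct MemoryRegion {
    Object parent_obj;
    bool romd_mode;
    bool ram;                       /* backed by host RAM */
    bool subpage;
    bool readonly;                  /* for RAM regions, i.e. ROM */
    bool nonvolatile;
    bool rom_device;
    bool is_iommu;
    RAMBlock *ram_block;            /* host backing storage, RAM-backed regions only */
    Object *owner;
    DeviceState *dev;
    const MemoryRegionOps *ops;     /* read/write callbacks for I/O regions */
    void *opaque;                   /* passed to the ops callbacks */
    MemoryRegion *container;        /* parent region, if any */
    int mapped_via_alias;
    Int128 size;
    hwaddr addr;                    /* offset within the container */
    MemoryRegion *alias;            /* target region when this mr is an alias */
    hwaddr alias_offset;            /* start offset inside the alias target */
    int32_t priority;               /* resolves overlap between siblings */
    QTAILQ_HEAD(, MemoryRegion) subregions;      /* list of child regions */
    QTAILQ_ENTRY(MemoryRegion) subregions_link;  /* link in the parent's list */
    const char *name;
    // ...
};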

Initialization Flow

main() // system/main.c
|--qemu_init(argc, argv) // system/vl.c
| |--cpu_exec_init_all() // system/physmem.c
| | |--io_mem_init()
| | |--memory_map_init()
| | | |--memory_region_init(system_memory, NULL, "system", UINT64_MAX)
| | | |--address_space_init(&address_space_memory, system_memory, "memory")
| | | |--memory_region_init_io(system_io, NULL, &unassigned_io_ops, NULL, "io", 65536)
| | | |--address_space_init(&address_space_io, system_io, "I/O")

system_memory is a global pointer to the root mr of the "memory" address-space. The article traces how system_memory->ops and system_memory->subregions are initialized using gdb watchpoints.

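The subregions of an mr are kept in a QTAILQ (a doubly linked tail queue) ordered by priority; since a subregion can itself contain subregions, the result is a tree. The root field in an address-space points to the mr root node, implementing one address-space → one memory-region tree.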

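A RAM-backed mr (RAM, ROM, ROM device) corresponds to a concrete memory block, a RAMBlock, allocated from the Host and serving as storage for the Guest. Pure I/O regions and containers have no RAMBlock; accesses to an I/O region go through its ops callbacks instead.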

mr types include: RAM, ROM, ROM device, MMIO (I/O), IOMMU, container, and alias.

A container memory-region is a special type that holds other mrs and records each child’s offset. It stores no data itself and has no direct read/write handler. Its purpose:

  1. Manage child memory-regions
  2. Provide address offset mapping
  3. Build hierarchical structures

Using mr containers, different address hierarchy relationships can be created, clearly describing relationships between subsystems at the address space level — beneficial for modularization.
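
The offset mapping that containers provide can be sketched as follows. This is a minimal C model under assumptions, not QEMU code: Node and absolute_addr are hypothetical names, with addr and container mirroring the fields of the same name in struct MemoryRegion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tree node; only addr and container are modelled. */
typedef struct Node {
    const char *name;
    uint64_t addr;                 /* offset relative to the container */
    const struct Node *container;  /* parent node, NULL for the root */
} Node;

/*
 * Convert an offset inside a region into an address in the root's
 * address space by summing the offsets along the container chain.
 */
static uint64_t absolute_addr(const Node *node, uint64_t offset)
{
    uint64_t abs = offset;

    for (; node != NULL; node = node->container) {
        abs += node->addr;
    }
    return abs;
}
```

For example, a region placed at offset 0x1000 inside a container that is itself mapped at 0x10000000 in the root ends up at 0x10001000. Moving the container relocates every child without touching the children's own offsets, which is what makes the hierarchy useful for modularization.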

Reference: