GNU LD
Timeline
2025-09-27
init
- A linker is a program that combines one or more object files generated by a compiler or assembler, along with libraries, into an executable file.
- GNU ld linker scripts are written in the linker command language, whose syntax is a superset of AT&T's Link Editor Command Language.
ld Command
aarch64-linux-gnu-ld
- Common parameters:
  - -T: specify the linker script.
  - -Map: write a link map file (where each section and symbol was placed).
  - -o: name the output file.
A Simple Example
SECTIONS
Basic Concepts
- Input sections and output sections.
- Each section has a name and size.
- Section attributes:
  - loadable: the section contents are loaded into memory when the output file is run.
  - allocatable: the section has no contents to load, but memory is reserved for it at runtime (for example .bss, which usually must be zeroed).
- Section addresses:
  - VMA (Virtual Memory Address): the address the section has when the program runs.
  - LMA (Load Memory Address): the address at which the section is loaded.
  - In a ROM-based system, the ROM address is typically the LMA and the RAM address is the VMA.
Linker Script Commands
- ENTRY(symbol): sets the program entry point. The linker tries the following in order and uses the first that succeeds:
  - The -e command-line option.
  - The ENTRY(symbol) command in the linker script.
  - The value of a target-specific symbol, if defined (usually start).
  - The address of the first byte of .text.
  - Address 0.
- INCLUDE filename: includes the linker script filename at that point.
- OUTPUT(filename): names the output file, equivalent to using -o filename on the command line.
- OUTPUT_FORMAT(bfdname): selects the BFD format of the output file.
- OUTPUT_ARCH(bfdarch): selects the output machine architecture.
Symbol Assignment
- Symbols can be assigned values just like in C.
- The special symbol . is the location counter; it holds the current output address.
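To make the location counter more concrete, the following is a toy model in C of how . advances while sections are placed one after another. It is only a sketch, not how ld is implemented: the struct out_section type and the layout function are hypothetical names introduced for illustration, and alignment between sections is deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one output section for this sketch. */
struct out_section {
    const char *name;
    uint64_t size;   /* bytes contributed by the input sections */
    uint64_t start;  /* filled in by layout(): value of "." where the section begins */
};

/*
 * Model of ". = base;" followed by placing each section in order:
 * every section starts at the current location counter, and the
 * counter then advances by the section's size.
 * Returns the final value of the location counter.
 */
uint64_t layout(struct out_section *secs, size_t count, uint64_t base)
{
    uint64_t dot = base;              /* ". = base;" */
    for (size_t i = 0; i < count; i++) {
        secs[i].start = dot;          /* like "_section_start = .;" */
        dot += secs[i].size;          /* "." moves past the placed contents */
    }
    return dot;
}
```

Reading a symbol assignment such as _etext = .; in a script then amounts to sampling dot at that point in the loop.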
Symbol References
- High-level languages often need to reference symbols defined in the linker script.
- In C, defining and initializing a variable, e.g. int foo = 100;:
  - The compiler defines a symbol foo in the symbol table.
  - The compiler stores 100 in memory for that symbol.
- Defining a variable in a linker script:
  - The linker only defines the symbol in the symbol table; it does not allocate memory to store the variable’s value.
  - Accessing a linker script-defined variable: you access the variable’s address, not its value.
- We can set symbols at each section boundary so that C code can find the start and end addresses of each section.
SECTIONS Command
- The SECTIONS command tells the linker how to map input sections to output sections, and how to lay out those output sections in memory.
- Output section descriptors:
LMA Load Address
- Every section has a VMA (virtual address, runtime address) and an LMA (load address).
- In an output section description, use AT to specify the LMA.
- If the LMA is not specified via AT, the LMA typically equals the VMA.
- Building a ROM-based image often requires setting different virtual and load addresses for output sections.
- When the data section’s load address differs from its link address (virtual address), program initialization must copy the data section from the ROM load address to the SDRAM virtual address.
- The data load address starts at _etext, the data section runtime address starts at _data, and the data section size is _edata - _data. Startup code therefore copies _edata - _data bytes from _etext to _data.
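The copy described above can be sketched in C as follows. This is a minimal sketch, not the original listing: copy_data is a hypothetical helper name, and it takes the three boundaries as parameters so that it can be compiled and tried without a linker script. In a real startup file the arguments would be the linker script symbols, declared as extern char _etext[], _data[], _edata[];.

```c
#include <stddef.h>

/*
 * Copy an initialized-data image from its load address (LMA, e.g. ROM)
 * to its runtime address (VMA, e.g. RAM).
 *
 * Assumed real-world call: copy_data(_etext, _data, _edata);
 */
void copy_data(const char *load_start, char *run_start, const char *run_end)
{
    const char *src = load_start;   /* where the bytes sit after loading */
    char *dst = run_start;          /* where the program expects them */

    while (dst < run_end)
        *dst++ = *src++;
}
```

The loop is bounded by the runtime end address, so the number of bytes copied is exactly _edata - _data.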
Common Built-in Functions
ADDR(section)
Returns the VMA address of a previously defined section.
ALIGN(n)
Returns the location counter rounded up to the next n-byte boundary.
Note: n is a byte count here, not an exponent. This differs from the assembler’s .align on targets such as ARM, where .align n means 2^n bytes.
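The rounding performed by ALIGN(n) can be written out in C. The sketch below assumes n is a power of two, as ALIGN expects; align_up and align_up_pow2 are hypothetical helper names, the second one showing the exponent-style interpretation for comparison.

```c
#include <stdint.h>

/* Round addr up to the next multiple of n; n must be a power of two. */
uint64_t align_up(uint64_t addr, uint64_t n)
{
    return (addr + n - 1) & ~(n - 1);
}

/* Exponent-style form: align to 2^p bytes. */
uint64_t align_up_pow2(uint64_t addr, unsigned p)
{
    return align_up(addr, (uint64_t)1 << p);
}
```

With these helpers, align_up(addr, 4096) and align_up_pow2(addr, 12) describe the same boundary, which is the distinction the note above draws.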
SIZEOF(section)
Returns the size of a section.
MAX(exp1, exp2) / MIN(exp1, exp2)
Returns the maximum or minimum of two expressions.
Experiment 1: Printing Memory Layout of Each Section
- Linker-exported symbols are addresses, not variable values.
An assignment like this in the linker script:
_text = .;
defines an address label (symbol address), not a variable. C has no syntax for “address labels,” so the only way to refer to this address is through a declaration of some object.
Declaring it as char[] essentially says:
“This is a memory region starting at _text; I care about its address, not its specific contents.”
- char is the smallest addressable unit in C (1 byte), so declaring the symbols as char[] keeps pointer arithmetic byte-accurate:
extern char _text[], _etext[];
With these declarations, _etext - _text is a size in bytes. With int[] the same subtraction would count int elements instead, and with void * pointer arithmetic is not valid standard C (GCC accepts it only as an extension).
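The effect of the element type on pointer subtraction can be checked with a small, self-contained sketch. The helper names span_bytes and span_ints are hypothetical, and an ordinary array stands in for a linker-labelled region.

```c
#include <stddef.h>

/* Distance between two boundaries viewed as char pointers: a byte count. */
ptrdiff_t span_bytes(const char *start, const char *end)
{
    return end - start;
}

/* Same boundaries viewed as int pointers: the result counts ints, not bytes. */
ptrdiff_t span_ints(const int *start, const int *end)
{
    return end - start;
}
```

For a region holding four int objects, span_bytes reports 4 * sizeof(int) while span_ints reports 4, which is why section sizes are computed from char-typed symbols.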
- Difference between char[] and char *: linker symbols are “array addresses,” not pointer variables.
While you could write:
extern char *_text;
this declares _text as a “pointer variable,” not an address label.
char *_text; tells the compiler to fetch the value stored at _text and use it as a pointer, so the first bytes of the section would be misread as an address.
char _text[]; declares “the linker will provide this address”; no load from memory and no extra variable is involved.
So the recommended approach is:
extern char _text[];
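The array-versus-pointer distinction can be demonstrated without a linker script. In this sketch, local static objects stand in for a linker-labelled region, and both function names are hypothetical.

```c
#include <stdbool.h>

/*
 * An array's name and the address of the array refer to the same
 * location: no separate storage holds "the address".
 */
bool array_symbol_is_its_address(void)
{
    static char label[16];          /* stand-in for a linker-labelled region */
    return (void *)label == (void *)&label;
}

/*
 * A pointer variable is an object of its own: its address differs from
 * the address it holds, and using it means loading that stored value.
 */
bool pointer_symbol_is_its_address(void)
{
    static char target[16];
    static char *ptr = target;      /* storage that something must initialize */
    return (void *)ptr == (void *)&ptr;
}
```

The first function returns true and the second false. A linker script symbol behaves like label here: it names a location and has no pointer-sized storage behind it.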
Experiment 2: Load Address ≠ Runtime Address
When the two differ, startup code must copy the code from its load address to its runtime address before executing it there.
Experiment 3: Analyzing the Linux 5.0 Kernel Linker Script
Runtime Address, Load Address, Link Address
Link Address
Definition:
The address assigned by the compiler and linker to each section (such as .text, .data, .bss) when generating an executable file (such as an ELF file).
Characteristics:
- An address set by the linker at link time.
- Can be explicitly set via a linker script, e.g., . = 0x80000;.
- These addresses are recorded in the executable’s section headers or program headers.
Example:
.text 0x80000 : { *(.text) }
means the .text section’s link address is 0x80000.
Load Address
Definition:
The location in memory where the executable file’s contents are loaded — the location where the OS/bootloader places the file into memory.
Characteristics:
- Usually equals the link address, but can differ in certain cases (such as dynamic linking or load address relocation).
- Determined by the OS or bootloader; the load addresses recorded in the file can also be changed with tools like objcopy.
Example:
- Your ELF file’s .text section link address is 0x80000, but the bootloader loads it at 0x100000. Then:
  - Link address ≠ load address.
  - If no relocation is performed and the code is not position-independent, the absolute addresses in the code are wrong and the program will typically crash.
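The arithmetic behind this example can be sketched in C. The function names load_offset and relocate are hypothetical; they only illustrate how a link-time address maps to the place where the bytes actually end up, not how a particular bootloader performs relocation.

```c
#include <stdint.h>

/* Distance between where the image was linked and where it was loaded. */
uint64_t load_offset(uint64_t link_base, uint64_t load_base)
{
    return load_base - link_base;
}

/* Translate a link-time address into the address actually used after loading. */
uint64_t relocate(uint64_t link_addr, uint64_t link_base, uint64_t load_base)
{
    return link_addr + load_offset(link_base, load_base);
}
```

For the numbers above, an instruction linked at 0x80010 actually sits at 0x100010, so code that jumps to the absolute address 0x80010 lands in the wrong place.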
Runtime/Execution Address
Definition:
The actual memory address accessed by the CPU during program execution.
Characteristics:
- Typically equals the load address (wherever the program is loaded, it executes from there).
- If MMU (Memory Management Unit) is enabled, the runtime address is a virtual address mapped by the MMU to the physical load address.
- In bare-metal programs, generally link address = load address = runtime address.






















