GNU Build IDs for Firmware

In this post, we demonstrate how to use the GNU Build ID to uniquely identify a build. We explain what the GNU build ID is, how it is enabled, and how it is used in a firmware context.

Much has been written on how to craft a firmware version, from Embedded Artistry’s excellent blog post to Wolfram Rösler’s how-to.

Versions are a great way to identify a release: a set of interfaces and capabilities bundled together.

Versions do not, however, identify a specific binary all that well. For example, you could have multiple binaries for a given version in order to accommodate different variants of your hardware in the field.

For this, we need something else. This is where the GNU build ID comes in.

Why would we want to identify a specific binary? A few cases:

  • To match a set of debug symbols to a given binary when trying to debug a device
  • To verify that two binaries are in fact the same build

What is the GNU Build ID

Firmware engineers are not alone in wanting to uniquely identify a build for debugging purposes. In fact, Linux developers have long wanted to match a coredump to a specific build.

Roland McGrath of glibc fame came up with the GNU build ID 15 years ago, and contributed the implementation to various build tools.

The build ID is a 160-bit SHA-1 hash computed over the ELF header and section contents of the file. It is stored in the ELF file as an entry in a note section.

Each note section entry has the following layout:

+----------------+
|     namesz     |   32-bit, size of "name" field
+----------------+
|     descsz     |   32-bit, size of "desc" field
+----------------+
|      type      |   32-bit, vendor specific "type"
+----------------+
|      name      |   "namesz" bytes, null-terminated string
+----------------+
|      desc      |   "descsz" bytes, binary data
+----------------+

In the case of the GNU build ID:

  • name is "GNU\0", which gives us namesz = 4
  • desc is our 160-bit SHA1 value, which gives us descsz = 20
  • type is 3
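
To make the layout concrete, here is a minimal sketch of how one might check that a raw note entry is a GNU build ID and locate its desc bytes. It assumes the entry is available as a little-endian byte buffer; the names read_le32 and gnu_build_id_desc are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NT_GNU_BUILD_ID 3u   /* note type used for the GNU build ID */
#define NOTE_HEADER_SIZE 12u /* namesz + descsz + type, 4 bytes each */

/* Decode a 32-bit little-endian value (assumes a little-endian ELF file). */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: if `note` holds a GNU build ID note entry, return a
 * pointer to its desc bytes and store their count in *descsz_out.
 * Returns NULL if the entry is truncated or is not a GNU build ID.
 */
static const uint8_t *gnu_build_id_desc(const uint8_t *note, size_t len,
                                        uint32_t *descsz_out) {
    assert(note != NULL && descsz_out != NULL);
    if (len < NOTE_HEADER_SIZE) {
        return NULL;
    }
    uint32_t namesz = read_le32(note);
    uint32_t descsz = read_le32(note + 4);
    uint32_t type = read_le32(note + 8);
    size_t remaining = len - NOTE_HEADER_SIZE;

    /* name must be exactly "GNU\0" and type must be NT_GNU_BUILD_ID */
    if (namesz != 4 || type != NT_GNU_BUILD_ID || remaining < 4) {
        return NULL;
    }
    if (memcmp(note + NOTE_HEADER_SIZE, "GNU", 4) != 0) {
        return NULL;
    }
    /* name is padded to a multiple of 4 bytes; "GNU\0" needs no padding */
    if (descsz > remaining - 4) {
        return NULL;
    }
    *descsz_out = descsz;
    return note + NOTE_HEADER_SIZE + 4;
}
```

With a SHA-1 build ID, the whole entry is 12 + 4 + 20 = 36 bytes, and the returned pointer sits 16 bytes past the start of the entry.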

Adding the GNU build ID to your builds

Note: all of our examples are based on the minimal program from our Zero to main() series.

In GCC, you can enable build IDs with the -Wl,--build-id flag, which passes the --build-id option to the linker. You can then read the build ID back by dumping the notes section of the resulting ELF file with readelf -n.

By default, this is not enabled:

francois-mba:minimal francois$ make clean all
...
arm-none-eabi-size build/minimal.elf
   text    data     bss     dec     hex filename
   1252       0    8192    9444    24e4 build/minimal.elf
...
francois-mba:minimal francois$ arm-none-eabi-readelf -n build/minimal.elf
[No output]

But a small change to the CFLAGS is all it takes:

francois-mba:minimal francois$ CFLAGS="-Wl,--build-id" make clean all
build/minimal.elf
...
arm-none-eabi-size build/minimal.elf
   text    data     bss     dec     hex filename
   1288       0    8192    9480    2508 build/minimal.elf
francois-mba:minimal francois$ arm-none-eabi-readelf -n build/minimal.elf

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size       Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: bab6b09f86b3c3017499d8e386447a610c559bd5

As you can see, our binary now contains the build id bab6b09f86b3c3017499d8e386447a610c559bd5.

Bundling the GNU build ID in firmware bin files

Getting the build ID in your executables on Linux is easy: they are ELF files! Firmware on the other hand typically deals with binaries which are assembled by copying relevant sections of the elf at the right offset in a file.

This is typically accomplished with objcopy:

$ arm-none-eabi-objcopy firmware.elf firmware.bin -O binary

This takes every ELF section marked to be loaded and places it at the correct offset in the bin file. In the process, sections that are not loaded, such as debug information, are left out.
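
As a rough mental model of this step, the sketch below copies each loadable section into a flat buffer at its load address (LMA) minus the lowest LMA, zero-filling any gaps. It is a simplified illustration and not objcopy's actual implementation; the section_t type and the flatten_sections function are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one loadable section. */
typedef struct {
    uint32_t lma;        /* load address of the section */
    uint32_t size;       /* number of bytes in the section */
    const uint8_t *data; /* section contents */
} section_t;

/*
 * Copy `count` sections into `out`, placing each one at (lma - lowest lma).
 * Gaps between sections are zero-filled and empty sections are ignored.
 * Returns the image size in bytes, or 0 if there is nothing to copy or
 * `out` is too small.
 */
static size_t flatten_sections(const section_t *sections, size_t count,
                               uint8_t *out, size_t out_len) {
    assert(sections != NULL && out != NULL);

    uint64_t base = UINT64_MAX; /* lowest load address seen so far */
    uint64_t end = 0;           /* one past the highest loaded byte */
    for (size_t i = 0; i < count; ++i) {
        if (sections[i].size == 0) {
            continue; /* empty sections contribute nothing */
        }
        uint64_t start = sections[i].lma;
        if (start < base) {
            base = start;
        }
        if (start + sections[i].size > end) {
            end = start + sections[i].size;
        }
    }
    if (end == 0 || end - base > out_len) {
        return 0;
    }

    size_t image_size = (size_t)(end - base);
    memset(out, 0, image_size);
    for (size_t i = 0; i < count; ++i) {
        if (sections[i].size == 0) {
            continue;
        }
        memcpy(out + (sections[i].lma - base), sections[i].data,
               sections[i].size);
    }
    return image_size;
}
```

For instance, a 0x4e4-byte section at 0x0 followed by a 0x24-byte section at 0x4e4 would yield a 0x508-byte image under this model.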

Dumping the elf sections of the resulting minimal.elf gives us:

$ arm-none-eabi-objdump -h build/minimal.elf

build/minimal.elf:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .note.gnu.build-id 00000024  00000000  00000000  00010000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         000004e4  00000024  00000024  00010024  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .bss          00000000  20000000  20000000  00000000  2**0
                  ALLOC
  3 .data         00000000  20000000  20000000  00010508  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  4 .stack        00002000  20000000  20000000  00020000  2**0
                  ALLOC
  5 .debug_info   00004076  00000000  00000000  00010508  2**0
                  CONTENTS, READONLY, DEBUGGING
  6 .debug_abbrev 00000ac3  00000000  00000000  0001457e  2**0
                  CONTENTS, READONLY, DEBUGGING
  7 .debug_aranges 000000f8  00000000  00000000  00015041  2**0
                  CONTENTS, READONLY, DEBUGGING
  8 .debug_ranges 000000b8  00000000  00000000  00015139  2**0
                  CONTENTS, READONLY, DEBUGGING
  9 .debug_line   00000cfa  00000000  00000000  000151f1  2**0
                  CONTENTS, READONLY, DEBUGGING
 10 .debug_str    0000111a  00000000  00000000  00015eeb  2**0
                  CONTENTS, READONLY, DEBUGGING
 11 .comment      00000075  00000000  00000000  00017005  2**0
                  CONTENTS, READONLY
 12 .ARM.attributes 0000002c  00000000  00000000  0001707a  2**0
                  CONTENTS, READONLY
 13 .debug_frame  00000298  00000000  00000000  000170a8  2**2
                  CONTENTS, READONLY, DEBUGGING

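As you can see, the .note.gnu.build-id, .text, and .data sections carry the LOAD attribute, while .bss and .stack are only allocated and the remaining sections are discarded. The build ID does make it into the bin file, but not where we want it: since our linker script does not mention .note.gnu.build-id, the linker placed it at address 0x0, ahead of .text.

To control where the section ends up in our binary, we must place it explicitly in our linker script. Assuming your linker script declares the following memory layout: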

MEMORY
{
  rom      (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

You can add the build ID to your flash memory with:

.gnu_build_id :
{
  PROVIDE(g_note_build_id = .);
  *(.note.gnu.build-id)
} > rom

This instructs the linker to append the contents of .note.gnu.build-id to the rom region of memory and create a symbol (g_note_build_id) to point to it.

Let’s check our elf sections now:

$ arm-none-eabi-objdump -h build/minimal.elf

build/minimal.elf:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000004e4  00000000  00000000  00010000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .gnu_build_id 00000024  000004e4  000004e4  000104e4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .bss          00000000  20000000  20000000  00000000  2**0
                  ALLOC
  3 .data         00000000  20000000  20000000  00010508  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  4 .stack        00002000  20000000  20000000  00020000  2**0
                  ALLOC
  5 .debug_info   00004076  00000000  00000000  00010508  2**0
                  CONTENTS, READONLY, DEBUGGING
  6 .debug_abbrev 00000ac3  00000000  00000000  0001457e  2**0
                  CONTENTS, READONLY, DEBUGGING
  7 .debug_aranges 000000f8  00000000  00000000  00015041  2**0
                  CONTENTS, READONLY, DEBUGGING
  8 .debug_ranges 000000b8  00000000  00000000  00015139  2**0
                  CONTENTS, READONLY, DEBUGGING
  9 .debug_line   00000cfa  00000000  00000000  000151f1  2**0
                  CONTENTS, READONLY, DEBUGGING
 10 .debug_str    0000111a  00000000  00000000  00015eeb  2**0
                  CONTENTS, READONLY, DEBUGGING
 11 .comment      00000075  00000000  00000000  00017005  2**0
                  CONTENTS, READONLY
 12 .ARM.attributes 0000002c  00000000  00000000  0001707a  2**0
                  CONTENTS, READONLY
 13 .debug_frame  00000298  00000000  00000000  000170a8  2**2
                  CONTENTS, READONLY, DEBUGGING

As you can see, our build ID now sits at address 0x4e4, immediately after .text, and is marked with the LOAD attribute.

Note: Make sure to declare the .gnu_build_id section after the .text section; otherwise, the build ID will be placed at address 0x0 and the firmware will not boot.
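
Once the note is part of the bin file, a host-side tool can also recover the build ID from the raw image, which helps when matching a binary to its debug symbols. The sketch below scans a buffer at 4-byte-aligned offsets for the note header described earlier. It is a heuristic (the same byte pattern could in principle occur elsewhere in the image), it assumes a little-endian image whose start is 4-byte aligned, and find_gnu_build_id is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical host-side helper: look for a GNU build ID note in a raw
 * little-endian firmware image. On success, returns a pointer to the build
 * ID bytes and stores their count in *id_len. Returns NULL if none is found.
 */
static const uint8_t *find_gnu_build_id(const uint8_t *image, size_t len,
                                        size_t *id_len) {
    /* namesz = 4 and name = "GNU\0" are fixed; descsz sits between them. */
    static const uint8_t namesz_le[4] = {4, 0, 0, 0};
    static const uint8_t type_and_name[8] = {3, 0, 0, 0, 'G', 'N', 'U', 0};

    assert(image != NULL && id_len != NULL);

    /* The note section is 4-byte aligned (2**2 in objdump), so step by 4. */
    for (size_t off = 0; off + 16 <= len; off += 4) {
        const uint8_t *p = image + off;
        if (memcmp(p, namesz_le, 4) != 0 ||
            memcmp(p + 8, type_and_name, 8) != 0) {
            continue;
        }
        uint32_t descsz = (uint32_t)p[4] | ((uint32_t)p[5] << 8) |
                          ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24);
        if (descsz == 0 || descsz > len - off - 16) {
            continue; /* desc would be empty or run past the end of the image */
        }
        *id_len = descsz;
        return p + 16;
    }
    return NULL;
}
```

A tool built around this could read the bin file into memory, call the helper, and compare the bytes it returns against the Build ID line reported by readelf -n for a candidate ELF file.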

Reading the build ID in firmware

Once the build ID is in our binary, we need to modify our firmware to read it and, at the very least, print it over serial at boot.

From the linker script, we know that we will find the build ID section at &g_note_build_id. From the spec, we know the structure of the section and can write down a typedef:

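#include <stdint.h>

// Layout of an ELF note entry, as described above
typedef struct {
    uint32_t namesz;  // size of the name, in bytes
    uint32_t descsz;  // size of the desc, in bytes
    uint32_t type;    // 3 for a GNU build ID
    uint8_t data[];   // name, immediately followed by desc
} ElfNoteSection_t;

// Symbol defined in the linker script at the start of the note
extern const ElfNoteSection_t g_note_build_id;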

We can now simply index into the data field and print the build ID data.

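#include <stdio.h>

void print_build_id(void) {
    // desc starts right after name; "GNU\0" is 4 bytes, so no padding applies
    const uint8_t *build_id_data = &g_note_build_id.data[g_note_build_id.namesz];

    printf("Build ID: ");
    // descsz is unsigned, so use an unsigned index to avoid a sign mismatch
    for (uint32_t i = 0; i < g_note_build_id.descsz; ++i) {
        printf("%02x", build_id_data[i]);
    }
    printf("\n");
}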

Calling this code on boot, we get:

...
Build ID: 8d7aec8b900dce6c14afe557dc8889230518be3e
...

Update: A prior version of the above code was incorrect: g_note_build_id was declared as a pointer which would lead to random data being read in the best case, and a crash in the worst case. Thanks to Simon Doppler for reporting the problem.
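
Printing byte by byte as in print_build_id works, but it is sometimes handy to have the build ID as a string, for example to include it in a log line or a version response. The sketch below formats the raw bytes as lowercase hex without relying on printf; build_id_to_hex is a hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write `len` bytes of `id` as lowercase hex into `out`,
 * followed by a NUL terminator. `out_len` must be at least 2 * len + 1.
 * Returns the number of characters written (excluding the terminator),
 * or 0 if the buffer is too small.
 */
static size_t build_id_to_hex(const uint8_t *id, size_t len,
                              char *out, size_t out_len) {
    static const char digits[] = "0123456789abcdef";

    assert(id != NULL && out != NULL);
    if (out_len < 2 * len + 1) {
        return 0;
    }
    for (size_t i = 0; i < len; ++i) {
        out[2 * i] = digits[id[i] >> 4];       /* high nibble */
        out[2 * i + 1] = digits[id[i] & 0x0f]; /* low nibble */
    }
    out[2 * len] = '\0';
    return 2 * len;
}
```

On the target, one would pass &g_note_build_id.data[g_note_build_id.namesz] and g_note_build_id.descsz, with a 41-byte buffer for a SHA-1 build ID.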

François Baldassari has worked on the embedded software teams at Sun, Pebble, and Oculus. He is currently the CEO of Memfault.