Faulty Stack Smashing Protection on ARM Systems

I discovered during the analysis of the CVE-2021-31616 vulnerability that the stack canary logic in the KeepKey firmware was broken and could be bypassed to perform practical stack smashing attacks.
Further investigation revealed that the incorrect stack protection assembler code is produced through a bug in certain GCC 9 and GCC 10 compiler versions for ARM, where it has been present for about a year. This problem has the potential to affect a wide range of ARM based embedded systems.

As with many of my previous blog articles, this is going to be a technical deep-dive into a complex security bug. Correspondingly, the article is written for technical readers with a background in the area of IT security. Go to the general summary of the issue if you are interested in the less technical version.

Technical Background
The Initial Vulnerability Discovery
Researching the Issue Origin
General Summary of the Issue
Coordinated Disclosure

Technical Background

Stack canaries are one of the few low-level countermeasures that are available to detect and mitigate some subclasses of the attacks that happen through memory corruption of the stack.

If a malicious out-of-bounds write manipulates a continuous stretch of memory from a legitimate region into higher memory addresses, there is the potential to overwrite special pointers that control the program flow. A classical opportunity for this is a copy operation into a target buffer on the stack that is too small. If successful, the attacker gains control over the code execution by overwriting where the CPU jumps to once it returns from the current function. Stack canaries are an attempt at mitigating this attack and turn it into a simple crash.

The protection works in two steps:

1 - At the beginning of a protected function, the stack canary is placed at a lower memory address near the special pointers.
2 - Before returning from the current function and using the special pointer contents, the stack canary value is checked against the global canary reference value.

Like a canary in a coal mine, the problematic state of the stack canary triggers emergency procedures, which explains the name.

If the original contents of the stack canary are not known to the attacker, this mechanism makes it very difficult to overwrite the stack canary with the right value so that the problematic write stays undetected in step no. 2 since every other value leads to a hard exit.
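
To make the two steps more concrete, here is a minimal sketch in C. It is only a conceptual model: real canaries are inserted by the compiler into the function prologue and epilogue, and the frame_model struct, the guard_reference variable and both function names are illustrative assumptions, not compiler output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative model of a protected stack frame (not real compiler output).
 * Lower addresses come first: a write that runs past buf reaches the canary
 * before it can reach the saved return address. */
struct frame_model {
    char     buf[8];          /* local buffer */
    uint32_t canary;          /* copy of the reference value */
    uint32_t return_address;  /* the "special pointer" being protected */
};

/* Hypothetical stand-in for the global canary reference value. */
uint32_t guard_reference = 0x5ca1ab1eu;

/* Step 1: place the canary when the function is entered. */
void frame_enter(struct frame_model *f, uint32_t return_address)
{
    assert(f != NULL);
    memset(f->buf, 0, sizeof f->buf);
    f->canary = guard_reference;
    f->return_address = return_address;
}

/* Step 2: check the canary before the return address is used.
 * Returns 1 if the frame looks intact, 0 if smashing was detected. */
int frame_check(const struct frame_model *f)
{
    assert(f != NULL);
    return f->canary == guard_reference;
}
```

In this model, any write that changes the canary field without knowing guard_reference makes frame_check() return 0, which corresponds to the hard exit described above.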

Notably, stack canaries are not enabled by default in C compilers such as GCC, in part due to their performance impact and very conservative compiler default settings. Build systems have to explicitly enable canaries with one of the -fstack-protector* flag levels, which makes the compiler add the described security checks to some or all functions.

The Initial Vulnerability Discovery

The KeepKey uses arm-none-eabi-gcc with the CPU architecture flags -mcpu=cortex-m3 -mthumb since it is based on the Cortex-M3 STM32F205 chip. The build configuration sets the strongest stack canary flag -fstack-protector-all and correctly seeds the 32-bit uint32_t __stack_chk_guard variable from the internal random number generator at every device boot. It has a custom error handler that triggers an endless loop once stack smashing is detected, giving the attacker only one chance to guess the value before manual intervention by the user is required.
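
The intended behavior can be modeled on a host machine roughly as follows. This is a sketch under assumptions, not the KeepKey source: model_guard stands in for __stack_chk_guard, guard_fail() stands in for the custom error handler, and the RNG callback, stub_rng() and the model_halted flag are hypothetical names that exist only so the behavior can be observed without real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed RNG interface; real firmware would read the hardware RNG. */
typedef uint32_t (*rng32_fn)(void);

uint32_t model_guard;   /* models uint32_t __stack_chk_guard */
int model_halted;       /* models "stuck in the endless loop" */

/* Models a device boot: seed the guard before any protected code runs. */
void model_boot(rng32_fn rng)
{
    assert(rng != NULL);
    model_guard = rng();
    model_halted = 0;
}

/* Models the custom failure handler. Firmware would spin forever here
 * (for (;;) {}); the model only records that the device has halted. */
void guard_fail(void)
{
    model_halted = 1;
}

/* Models the epilogue check. Returns 1 if execution may continue. */
int guard_check(uint32_t stored_canary)
{
    if (model_halted)
        return 0;              /* no second guess without a reboot */
    if (stored_canary != model_guard) {
        guard_fail();
        return 0;
    }
    return 1;
}

/* Deterministic stub for host-side experiments only; never use in firmware. */
uint32_t stub_rng(void)
{
    return 0x1234abcdu;
}
```

A single wrong guess halts the model until model_boot() is called again, at which point a real device would also have drawn a fresh random guard value.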

Unless there is a way for the attacker to read the current __stack_chk_guard canary value from the device through an information leak before the attack and dynamically adjust the exploit data that is written into memory, this should be a strong defense against code execution via the described out of bounds writes.

During the disclosure process for CVE-2021-31616, I found myself in a position where I had exhausted other attack variants, but wasn’t yet willing to give up on showing a more severe impact than a device crash. So I dug deeper into the firmware code, looking for incorrect initialization, problematic linker settings or other implementation problems. I didn’t find anything usable at first, but then stumbled over some error behavior that surprised me since it did not fit my expectations of what should happen. This led me to discovering the issue that is at the heart of this article.

Consider the following disassembler view of the ARM firmware code, generated with Ghidra on the .elf debugger symbol version of the KeepKey firmware.

What you’re looking at here is a color-coded interpretation of the assembler instructions in ARM Thumb format that make up the firmware program, more specifically the first and last part of the address_check_prefix() function. Note that this function and its intended behavior are not important since I’ve chosen it primarily for demonstration purposes. We care only about the stack canary placement and checking steps that were described in the previous section.

Disassembler view on an example function in KeepKey firmware 7.0.3

For technical reasons, ARM code generally needs multiple instructions to load the global __stack_chk_guard canary value and store it at its intended local_c place on the stack, 0x0c = 12 bytes lower than the important pointers. This is done in three steps.

1 - Determine where the __stack_chk_guard canary data is located in memory:

        08086b6e 15 4b           ldr        r3,[PTR_PTR___stack_chk_guard_08086bc4]

2 - Fetch the __stack_chk_guard canary value from the target:

        08086b72 1b 68           ldr        r3,[r3,#0x0]=>->__stack_chk_guard

3 - Put the canary value on the stack:

        08086b74 01 93           str        r3=>__stack_chk_guard,[sp,#local_c]

Readers who are experienced with ARM assembly, usual ARM address regions and Ghidra may have noticed already that something is wrong in this code excerpt. The PTR_PTR___stack_chk_guard_08086bc4 reference does not point to __stack_chk_guard directly. As the name implies, it points to another pointer which points to the location of __stack_chk_guard, hence the PTR_PTR__ prefix by Ghidra.

This indirect access is not intended and breaks the protection logic. The described ARM instructions have not been adjusted to do the required third ldr and will therefore blindly fetch the location of __stack_chk_guard from the second pointer, not the contents of __stack_chk_guard that were requested!

In the case of the KeepKey, the memory location of the stack canary is fixed at 0x2001f7f8, which is the value that Ghidra has figured out automatically and that is highlighted in the disassembler view above. In practical terms, this means that the firmware has a faulty stack smashing protection which always uses the fixed and well-known 0x2001f7f8 value as a stack canary. The randomization of __stack_chk_guard is completely ignored as that variable is never actually read.
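
The difference between the emitted and the intended load sequence can be replayed with a tiny memory model in C. The addresses 0x08086bc4 and 0x2001f7f8 are taken from the disassembly discussed here; the location of the intermediate pointer (SECOND_PTR_ADDR) is not given in this article and is a made-up value, as are all function and variable names.

```c
#include <assert.h>
#include <stdint.h>

#define LITERAL_POOL_ADDR 0x08086bc4u  /* PTR_PTR___stack_chk_guard */
#define SECOND_PTR_ADDR   0x08090000u  /* hypothetical: intermediate pointer */
#define GUARD_ADDR        0x2001f7f8u  /* address of __stack_chk_guard */

/* Models the random value stored in __stack_chk_guard at boot. */
uint32_t model_guard_value = 0xdeadbeefu;

/* Minimal model of a 32-bit load from the three relevant addresses. */
uint32_t mem_read32(uint32_t addr)
{
    switch (addr) {
    case LITERAL_POOL_ADDR: return SECOND_PTR_ADDR;
    case SECOND_PTR_ADDR:   return GUARD_ADDR;
    case GUARD_ADDR:        return model_guard_value;
    default:
        assert(!"unmapped address in this model");
        return 0;
    }
}

/* What the faulty code does: two loads, then store to the stack. */
uint32_t canary_as_emitted(void)
{
    uint32_t r3 = mem_read32(LITERAL_POOL_ADDR); /* ldr r3,[pc,#off] */
    r3 = mem_read32(r3);                         /* ldr r3,[r3,#0x0] */
    return r3;                                   /* value for str r3,[sp,...] */
}

/* What this pool layout would need: a third load to reach the contents. */
uint32_t canary_as_intended(void)
{
    uint32_t r3 = mem_read32(LITERAL_POOL_ADDR);
    r3 = mem_read32(r3);
    r3 = mem_read32(r3);
    return r3;
}
```

In this model, canary_as_emitted() yields 0x2001f7f8 no matter what model_guard_value holds, while canary_as_intended() follows the random value.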

As a result, attackers with a suitable out-of-bounds write bug and knowledge of this vulnerability can defeat the stack protection with 100% reliability.

There are two aspects that explain a little why this behavior is not so easy to detect during development:

The bug is internally consistent, meaning that the first and last part of a function that write and read the stack canary both follow the same flawed logic. Otherwise, the firmware would have crashed with stack canary mismatch symptoms immediately.
The protection is not completely broken. The fixed canary that is written is also checked correctly at the end of the function. “Naive” out of bounds writes that write into the canary are detected, giving a false sense of security since everything appears to be fully functional for out of bounds writes that are triggered intentionally.
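
Both observations, the detected "naive" overwrite and the undetected informed one, can be reproduced in a small host-side model. This is a sketch under assumptions: the frame layout, buffer size and all names are invented for illustration, and the overflow is simulated by copying over the struct's bytes so that the model itself stays well-defined C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The bug makes every frame use the address of __stack_chk_guard as canary. */
#define FIXED_CANARY 0x2001f7f8u

/* Illustrative frame layout: buffer, then canary, then return address. */
struct frame_model {
    uint8_t  buf[8];
    uint32_t canary;
    uint32_t return_address;
};

void frame_init(struct frame_model *f, uint32_t return_address)
{
    memset(f->buf, 0, sizeof f->buf);
    f->canary = FIXED_CANARY;
    f->return_address = return_address;
}

/* Models a copy into buf that lacks a length check against sizeof buf.
 * The copy is limited to the model frame, so no real memory is corrupted. */
void unchecked_copy(struct frame_model *f, const uint8_t *src, size_t len)
{
    assert(len <= sizeof *f);
    memcpy((uint8_t *)f, src, len);
}

/* Models the epilogue check as the faulty code performs it. */
int epilogue_check(const struct frame_model *f)
{
    return f->canary == FIXED_CANARY;
}

/* Builds attacker data: filler bytes, a canary guess, a new return address.
 * out must provide sizeof(struct frame_model) bytes. */
void build_payload(uint8_t *out, uint32_t canary_guess, uint32_t new_return)
{
    memset(out, 'A', sizeof(struct frame_model));
    memcpy(out + offsetof(struct frame_model, canary),
           &canary_guess, sizeof canary_guess);
    memcpy(out + offsetof(struct frame_model, return_address),
           &new_return, sizeof new_return);
}
```

A payload built with an arbitrary canary guess such as 0x41414141 fails epilogue_check(), while a payload built with FIXED_CANARY passes it and still replaces return_address.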

Researching the Issue Origin

Naturally, I set out to find out how this firmware issue got introduced and if other products are also affected. At this time in the research, ShapeShift had already publicly fixed the CVE-2021-31616 issue and indicated that the original disclosure could go public, so I had the opportunity to share some context with fellow researcher Dr. Jochen Hoenicke.

He found a very interesting report in the comments of a developer blog:

Daniel Worley

I have observed something strange with arm-none-eabi-gcc 9.3.1 20200408 release building code for cortex-m4. When I tested SSP, looked at objdump output it appeared the generated code was storing the address of __stack_chk_guard on the function’s stack and comparing the address in the function epilog, not the value. I also used a debugger stepped through the assembly code to verify and it is saving and comparing the address not the value of __stack_chk_guard. I compiled the code with arm-none-eabi 8.2.1 and GCC emits code that stores and compares the value of __stack_chk_guard. Seems to be a bug.

Further research brought up this upstream bug report GCC 9 stack smash protector not generating correct code for the gcc-arm-embedded package by Daniel Worley from 2020-03-20, which unfortunately was ignored and quickly auto-closed.

It was clear that the published bug very likely referenced the same stack canary issue that I had detected, as every aspect matched. Even if the two had turned out to be subtly different issues, the information on weak stack canaries for ARM Cortex had effectively been public since March 2020, meaning that any coordinated disclosure had to move at a significantly accelerated pace to be worth anything.

From there, I concentrated on two questions: which compiler versions were affected, and which embedded products were relevant?

Affected Compiler Versions

There were clear indications that some versions of GCC 9 were problematic. But which versions? And were other major series affected as well?

After manually analyzing published firmware versions of hardware wallets, I had the idea to use the excellent Godbolt compiler service to check some common versions without actually compiling new firmware. As you can see in this Godbolt compiler comparison, the correctly working compiler versions use one local segment to store the direct reference to __stack_chk_guard, while the buggy compiler versions use two, with the indirect access usually going to .LC0 and therefore running into the known issue pattern.
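
A test input for such a comparison can be very small. The function below is an illustrative example, not the exact snippet used in the linked comparison; the name bounded_length() and the suggested flags in the comment are assumptions based on the KeepKey build configuration described earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Small function with a local stack buffer, so that the stack protector
 * instruments it. Suggested flags for an ARM cross compiler (assumption,
 * modeled on the KeepKey target):
 *   -O1 -mcpu=cortex-m3 -mthumb -fstack-protector-all
 * In the generated assembly, look at how __stack_chk_guard is reached in
 * the prologue and epilogue: through one literal pool entry that refers
 * to it directly, or through an additional indirect entry such as .LC0.
 */
size_t bounded_length(const char *src)
{
    char buf[16];

    assert(src != NULL);
    strncpy(buf, src, sizeof buf - 1);  /* copy at most 15 characters */
    buf[sizeof buf - 1] = '\0';         /* always terminate */
    return strlen(buf);
}
```

The function itself is deliberately harmless; only the compiler-generated canary handling around it is of interest.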

A table with the currently known problematic versions is shown here:

GCC Version	Affected	arm-none-eabi-gcc Version	Compiler / Release Date	Note
8.2	no
8.3.1	no	8-2019-q3-update	2019-07-10
9.2.1	yes 🔥	9-2019-q4-major	2019-11-06	fault observed in RC2.1
9.3	yes 🔥
9.3.1	yes 🔥	9-2020-q2-update	2020-04-08 / 2020-05-29	Reported externally
10.2	yes 🔥
10.2.1	no	10-2020-q4-major	2020-11-03 / 2020-12-11

This list may be extended later.
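
Projects that want to guard against accidentally building with one of these versions could encode the table in a build-time check. The following is a sketch, not an official or complete test: it covers only the rows marked as affected, assumes that the "9.3" and "10.2" rows mean patch level 0, and the function name is made up. The __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__, __arm__ and __SSP*__ macros are predefined by GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true only for the versions listed as affected in the table.
 * A false result means "not known to be affected", not "verified safe". */
bool gcc_known_affected(int major, int minor, int patch)
{
    assert(major >= 0 && minor >= 0 && patch >= 0);
    if (major == 9 && minor == 2 && patch == 1)
        return true;   /* 9-2019-q4-major */
    if (major == 9 && minor == 3 && (patch == 0 || patch == 1))
        return true;   /* 9.3 and 9-2020-q2-update */
    if (major == 10 && minor == 2 && patch == 0)
        return true;   /* 10.2 */
    return false;
}

/* Compile-time variant: stop 32-bit ARM builds that enable the stack
 * protector with one of the listed compiler versions. */
#if defined(__arm__) && defined(__GNUC__) && !defined(__clang__) && \
    (defined(__SSP__) || defined(__SSP_STRONG__) || \
     defined(__SSP_ALL__) || defined(__SSP_EXPLICIT__))
#if (__GNUC__ == 9 && __GNUC_MINOR__ == 2 && __GNUC_PATCHLEVEL__ == 1) || \
    (__GNUC__ == 9 && __GNUC_MINOR__ == 3 && __GNUC_PATCHLEVEL__ <= 1) || \
    (__GNUC__ == 10 && __GNUC_MINOR__ == 2 && __GNUC_PATCHLEVEL__ == 0)
#error "Stack protector enabled with a GCC version listed as affected"
#endif
#endif
```

A version check like this is only a safety net. Inspecting the generated prologue and epilogue code remains the more reliable way to confirm that the canary value, and not its address, is used.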

Other Affected Products

Additional research showed that firmware releases for the SatoshiLabs Trezor One and Trezor T hardware wallets from 2020 were also affected.
(Side note, I’m involved with those projects, see the other articles here on the blog)
As it turned out, a recent switch to the newer 10.2.1 compiler version had already fixed the issue in the new Trezor firmware versions from 2021, so SatoshiLabs did not need to release a new firmware version. All bootloader binaries had also been built with safe compiler versions and did not need replacement.

Looking at over a dozen other embedded products, it became clear that most of them had dodged this bullet simply by being slower to upgrade their compilers and staying with GCC 5.x, GCC 6.x or GCC 8.x from 2018-2019 or earlier. In multiple cases, I was also unable to find any use of the -fstack-protector-* stack protection mechanism in the build configuration. If a project is not using this security feature, it doesn’t matter if the compiler support for it is faulty or not.

A list with the known affected and unaffected firmware versions can be found in the disclosure section below.

Affected ARM Architectures

The known problematic compiler behavior is thoroughly confirmed for firmware compiled for ARM Cortex M3 and M4 chips. But which other architectures are hit by this bug?

Looking at the Godbolt explorer for ARM gcc 9.2.1 (none) and cycling through -mcpu flag values, one can recognize the same problematic patterns for a number of ARM target architectures. Here is an example comparison for Cortex M3 and M23.

Most Cortex M architectures are affected, but many non-Cortex ARM variants are as well:

`-mcpu` Architecture	Affected	Determined via
cortex-m0	yes	compiler
cortex-m0plus	yes	compiler
cortex-m1	yes	compiler
cortex-m3	yes	firmware + compiler
cortex-m4	yes	firmware + compiler
cortex-m7	yes	compiler
cortex-m23	no	compiler
cortex-m33	yes	compiler
arm8	yes	compiler
arm9e	yes	compiler
arm10e	yes	compiler
strongarm	yes	compiler
xscale	yes	compiler

This list is incomplete and may be extended in the future. Unaffected variants are mostly not listed for readability.

Unrelated Stack Canary Fix

Starting with GCC 10.2.1, there is another canary-related fix that is acknowledged in the changelog:

Fixed issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96191 where the -fstack-protector option was leaving the canary value in a temporary register on return from the function.

In short, the canary check didn’t properly clean up some of the side effects, which could help an attacker to obtain the canary value through other code problems. However, this is not directly related to the more serious issue that is described in this article. It is possible that patching the information leak also had a positive influence on the problematic indirect canary access, but this is not clear at this point and may be a coincidence.

Future Issues & Other Projects

I think it is plausible that other projects which currently use GCC 8 or earlier may want to switch to a newer compiler at some time in the future, landing on one of the affected versions and getting hit by this bug as well.

One of the motivations for this detailed blog article is to serve as a public reference so that this scenario can be avoided in some cases. It would also obviously be helpful if the compiler bug were acknowledged officially and if backports were introduced for the popular GCC 9.x packages.

I’m also open to requesting a CVE for this issue, but have not done so yet.

General Summary of the Issue

This publication is about a faulty software protection mechanism in multiple embedded devices.

Unlike other bugs that break the first line of defense and create an opportunity for an attacker to do something malicious, this bug partially undermines the second line of defense. The faulty protection fails to stop attacks, but is not an open door on its own.

To make this more comparable for non-technical people: imagine a broken seatbelt that doesn’t lock correctly. The faulty seatbelt doesn’t cause car crashes, but in the case of a car crash, the consequences of this lack of protection can be serious, right? And you wouldn’t want to drive around with it if you can help it.

Adding the correct protection is one of the responsibilities of the compiler, which translates human-readable code into machine-readable code, among other things. Some versions of a very common compiler are faulty, which resulted in these “broken seatbelts” in various projects that depend on this compiler.

There are two factors that make this less of a widespread issue:

Unlike the modern world of cars, not all manufacturers of embedded devices actually use this digital “seatbelt” protection mechanism. There are many products that haven’t lost their protection since they didn’t include any in the first place. (Some of them should start doing that, though).
Many projects have been using older compiler versions that are not buggy. At the moment, those projects are safe, but there might be problems in the future if they upgrade to one of the buggy compiler versions.

Of course, it is troublesome that important functionality like this can silently break at all. It really shouldn’t. Particularly for security-conscious projects such as hardware wallets, bugs introduced by the compiler are really concerning. Unfortunately, software development is hard, and compilers are complicated pieces of digital machinery. It is difficult to notice if they break in some weird edge cases or only do 95% of what they’re told to do.

For an end user of one of the mentioned wallet products, my recommendation is to upgrade to one of the fixed firmware versions. This is not super urgent, but should not be delayed for too long either to be on the safe side.

KeepKey users: please also see the recent CVE-2021-31616 vulnerability that is urgent.

Coordinated Disclosure

Due to the overlap with CVE-2021-31616 and the fact that only KeepKey had to release a new firmware, some of the disclosure summary from the other article applies here as well.

I want to positively mention that ShapeShift released a patched firmware for this after just 5 days and was very responsive.

Overall, this disclosure was interesting, but also a lot of detective work and analysis tasks.

Affected Products

Product	Source	Affected Versions	Unaffected or Fixed Versions	Publications	IDs
ShapeShift KeepKey	GitHub	v6.4.0 to v7.1.1	≤ v6.3.0, ≥ v7.1.2	ShapeShift Advisory	VULN-21013
SatoshiLabs Trezor One	GitHub	1.9.0 to 1.9.3	≤ 1.8.3, ≥ 1.9.4
SatoshiLabs Trezor T	GitHub	2.2.0 to 2.3.4	≤ 2.1.8, ≥ 2.3.5

At the time of writing, I’m not aware of other affected hardware wallets or other security devices.

It is difficult to determine this for closed-source projects, of course. If you find new information on other products, let me know.

Detailed timeline

Date	Information
2020-03-20	Public bug report by Daniel Worley
2020-08-20	Public blog comment by Daniel Worley
2021-04-21	I (re-)discover the vulnerability
2021-04-22	Coordinated disclosure to ShapeShift and SatoshiLabs
2021-04-23	Call with ShapeShift, new patched KeepKey test firmware confirmed as unaffected
2021-04-27	ShapeShift releases fixed v7.1.2 firmware on GitHub (note: public patching was allowed during the embargo)
2021-05-02	Proposed embargo end for 2021-05-07
2021-05-03	Coordinated disclosure to Ledger (to confirm their status)
2021-05-04	Ledger indicates that they are not affected
2021-05-07 12:00 UTC	End of the coordinated disclosure embargo
2021-05-07	Publication of this blog article

Credits

Special thanks go to Dr. Jochen Hoenicke, who has helped with this research.

A Note About Research Affiliation

I want to emphasize that this research was done on my own time and initiative. In particular, it was not sponsored by SatoshiLabs, for whom I do some paid freelance security research on the related Trezor project.

Bug Bounty

ShapeShift has paid a bug bounty for this issue.
