bech32_decode vulnerability

Bech32 reference C code written by Bitcoin developer Pieter Wuille for the BIP 173 standard contained an unsigned integer overflow that leads to a buffer overflow for certain malformed Bech32 addresses. While this code is not used in the Bitcoin core implementation, it is included in over a dozen cryptocurrency projects, notably in multiple hardware wallets and a Lightning node implementation.

I found this issue during security research on the Trezor One in September 2018.

The vulnerability
- Attack scenario and security implications
- Lessons learned - development perspective
Disclosure coordination

Consulting

I’m a freelance Security Consultant and currently available for new projects. If you are looking for assistance to secure your projects or organization, contact me.

The vulnerability

The code in question is in the C reference implementation of the Bech32 decoder, the other seven language variants are not affected.

The problematic lines are:

hrp_len = input_len — (1 + *data_len);
if (hrp_len < 1 || *data_len < 6) {
    return 0;
}

segwit_addr.c

This code section should return early on invalid inputs, but does not function correctly.

If the address input string does not contain the separator character 1 then *data_len is equal to input_len and hrp_len would be assigned the value -1, which would satisfy hrp_len < 1. However, hrp_len and input_len are declared as unsigned integers via size_t, so hrp_len is set to a large positive number instead (= MAX_UINT). In C, the overflow or underflow of unsigned integers is well defined, so compilers would see no problem with this step.
The integer underflow behavior causes the described if-condition to miss the check and the address parsing is allowed to continue.

In the case of malicious inputs, the following loop then runs over the larger-than-expected hrp_len and copies more input bytes to the hrp buffer than it should:

for (i = 0; i < hrp_len; ++i) {
    int ch = input[i];
    if (ch < 33 || ch > 126) {
        return 0;
    }
    if (ch >= 'a' && ch <= 'z') {
        have_lower = 1;
    } else if (ch >= 'A' && ch <= 'Z') {
        have_upper = 1;
        ch = (ch - 'A') + 'a';
    }
    hrp[i] = ch;
    chk = bech32_polymod_step(chk) ^ (ch >> 5);
}

segwit_addr.c

In practice, the bech32_decode function returns once the NULL terminator of the input string is found (ch == 0 goes into line 4 of the code snippet). Problematic inputs are between 85 and 90 characters in length (longer inputs are rejected at an earlier stage) which will perform out of bounds writes behind the array since the calling function segwit_addr_decode() only allocates 84 bytes for the hrp buffer.

Example input address to trigger this behavior: abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefg
(85 bytes, no separator, valid charset)

Attack scenario and security implications

Bech32 addresses are - by design - meant to be unsanitized, printable input from external sources. Consequently, arbitrary address strings provided by an attacker can reach affected implementations in a number of ways, either through automated protocols or potentially through manual copy operations by the end user.

Once the issue is triggered, up to six bytes of out of bounds writes directly behind the hrp buffer on the stack are performed. Attackers are limited to writing the following characters into memory:
!"#$%&'()\*+0-./023456789:;<=>?@[\]^\_`abcdefghijklmnopqrstuvwxyz{|}
These limitations are due to the input checks shown in the previous section.

We were able to build a POC for a browser-based remote attack on the Trezor One which showed that this vulnerability could be of value to an attacker.

In theory, this flaw could potentially be used to change the program flow through manipulation of addresses or content on the stack. To my knowledge, the main impact for most or all projects with hardened compiler settings was limited to a denial of service, as described in the next section. The inherent range and size limitations would likely make remote code execution on non-hardened projects very difficult, but it is hard to rule this out completely.

Existing mitigations

For projects compiled with stack protection flags such as -fstack-protector-all, the linear out of bounds write behind the intended buffer is reliably detected and leads to a deliberate program stop. As a result, the exploitation is limited to a denial of service attack on implementations with this countermeasure.

Potential issues for the Lightning network

During the impact assessment for c-lightning, Rusty Russell brought up the fact that the BOLT #11 standard mandates the use of Bech32 addresses for certain parts of the Lightning P2P protocol. Due to the importance of stable operations for Lightning network nodes, a potential remote denial of service vulnerability via the Lightning network itself could obviously be serious and have even monetary implications for some users (closed channels, …). For this reason, we spent extra effort in making sure that this was fixed as soon as possible via an early mitigation patch (as noted below) and included another Lightning node project in the disclosure.

Lessons learned - development perspective

Reference implementations can have flaws in them
Fuzz foreign C libraries, particularly if they handle external input
Address Sanitizer + libFuzzer found the out of bounds write
The non-standard -fsanitize=unsigned-integer-overflow setting is needed for the Undefined Behavior Sanitizer to catch the original unsigned integer underflow
Use -fstack-protector-all wherever you can

Disclosure coordination

Due to the potential high-profile nature of the vulnerability in the context of actively used cryptocurrency systems, I quickly reached out to upstream author Pieter Wuille and began searching for potentially affected open source projects. Relevant projects were contacted with mandatory PGP encryption to keep the issue confidential.

Due to the number of affected projects, this turned out to be a time-consuming and complex task.
Fortunately, the available fix was a one-liner with no known negative side effects, so at communicating the patch was straightforward.

After receiving the impact assessment from the major affected projects as well as positive feedback from Pieter Wuille, I decided to set the end of the information embargo to late October, with less than 45 days total between discovery and disclosure.

Exceptions to the coordinated disclosure date

c-lightning patch

Due to the above mentioned concerns for P2P network-based attacks on c-lightning nodes, I agreed to a preliminary patch by Rusty Russell that did not fix the integer underflow, but corrected the target buffer sizes and therefore mitigated the practical impact.

Early Trezor One firmware binary

SatoshiLabs had to ship a new firmware shortly before the ZCash hardfork on 29.10.2018 (-> ~one day before the regular disclosure end date) and asked to include the Bech32 fix in the firmware binary to ensure a quick and wide rollout to their users, which I agreed to. Public patches and disclosure followed on the regular embargo end date.

During the search for potentially affected projects, I noted that ledger-app-particl and ledger-app-btc both contained the problematic Bech32 code, but did not use it and were therefore not practically affected. I decided to disclose the issue to Ledger anyway, in part because there was a chance that they used the affected Bech32 code in other places, e.g. their non-public code base, which would be allowed by the relevant MIT license.

In Ledger’s reply, they informed me that they had already found the Bech32 vulnerability independently a few weeks ago, but had not disclosed it yet to other projects.

On the same day, they disclosed to SatoshiLabs that the cash_decode() function used in the Trezor One and KeepKey via the trezor-crypto library was similarly affected. This step was helpful and allowed the second issue to be fixed in time alongside the Bech32 patches. I thank them for their quick reaction in this matter.

Relevant projects

product	source	fix	vendor references	notes
bech32 reference implementation	GitHub	git
SatoshiLabs Trezor One	GitHub	1.7.1	Trezor article, issue 9 + 10	practical impact only on 1.6.2 and 1.6.3
ShapeShift KeepKey	GitHub	git	VULN-1845
ElementsProject c-lightning	GitHub	0.6.2-rc1, 0.6.3	mailing list
ElementsProject libwally-core	GitHub	0.6.5
Riddle & Code ECLet_STM32	GitHub	git
Nayutaco Ptarmigan	GitHub	git
libbtc	GitHub	git
Komodo SuperNet	GitHub	no
Ledger ledger-app-btc	GitHub	no		unused code, no practical impact
Ledger ledger-app-particl	GitHub	no		unused code, no practical impact
nym-zone bech32 utility	GitHub	no		no reply
CWV crypto lib	GitHub	no		no reply
ACINQ Eclair	GitHub	-		written in Scala, not affected
Square		-		not affected

Detailed timeline

Date	info
2018-09-26	The bech32_decode issue and resulting buffer overflow are discovered
2018-09-26	Disclosure to SatoshiLabs, initial fix is suggested
2018-09-27	Crash is reproduced on Trezor One hardware
2018-09-28	Internal proof of concept for a remote attack on the Trezor One
2018-10-04	First attempt to inform Pieter Wuille
2018-10-06	Second attempt to inform Pieter Wuille
2018-10-11	First round of disclosure to affected vendors including ShapeShift, ElementsCore and libbtc
2018-10-11	Initial contact with Pieter Wuille
2018-10-13	libwally-core is informed
2018-10-14	Pieter Wuille confirms the bug
2018-10-16	ACINQ is informed
2018-10-17	Komodo is informed
2018-10-18	Riddle & Code and Nayutaco are informed
2018-10-23	Proposed public disclosure release date: 2018-10-30
2018-10-24	Ledger is informed. The `cash_decode()` vulnerability is disclosed by Ledger to SatoshiLabs.
2018-10-25	Disclosure of `cash_decode()` vulnerability to other affected projects
2018-10-26	Square is informed
2018-10-30 12:00 UTC+1	Coordinated public disclosure

Bug bounty

SatoshiLabs and ShapeShift provided a bug bounty.

Credits

I would like to thank Dr. Jochen Hoenicke⁽¹⁾ for his help during multiple steps of the disclosure process. Please credit him as co-author in public references of this issue.

Contents