The article describes a new vulnerability in the KeepKey hardware wallet. Vulnerable code in the Ethereum transaction handling can leak memory from attacker-controlled address locations onto the display when processing a crafted EthereumSignTx message. An attacker with physical access to an unlocked KeepKey device can extract the BIP39 seed or other confidential device secrets via this flaw without tampering with the device hardware or leaving permanent traces.


The Vulnerability

Annotated photo of KeepKey with successful leak of BIP39 seed portion (details described in PoC section)


Attacker-Controlled Out-of-bounds Read (CVE-2023-27892)

This section outlines how Ethereum-related processing code introduced with firmware v7.5.2 can be used as an arbitrary read gadget to display confidential device memory on the OLED screen, which violates the confidentiality guarantees expected of a hardware wallet.

Once an attacker sends a special Ethereum signing request message, the following code path in ethereum_signing_init() can be triggered:

if (!ethereum_cFuncConfirmed(data_total, msg)) {

ethereum.c L715

This calls the recently added cf_confirmExecTx() function via the short intermediary function ethereum_cFuncConfirmed():

bool ethereum_cFuncConfirmed(uint32_t data_total, const EthereumSignTx *msg) {
   if (cf_isExecTx(msg)) {
     return cf_confirmExecTx(data_total, msg);

ethereum_contracts.c L38-L40

The code vulnerability is located in cf_confirmExecTx().

Before we dig deeper, first some context on this section of the firmware.

On an abstract level, the code for Ethereum transaction confirmation functionality is supposed to

  • Parse the incoming attacker-controlled EthereumSignTx *msg request received via USB from the host computer.
  • Show several important transaction details on the device screen for secure confirmation by the human operator independent of the untrusted host computer.

Subsequent code stages then perform the actual Ethereum transaction signing, but they are not relevant to understanding this issue.

The confirmation flow in question has multiple stages to display and approve the individual components of the transaction:

  1. Call confirm(ButtonRequestType_ButtonRequest_ConfirmOutput, ...) on the decoded receiver address.
  2. Repeat this action on the decoded transfer amount.
  3. Display a dynamic amount of raw data in the parsed request message in hexadecimal encoding on the OLED display, with pagination if needed.

It is at the third step where things go bad. 🌩

Here is the problematic code section:

// get data bytes
bn_from_bytes(msg->data_initial_chunk.bytes + 4 + 2*32, 32, &bnNum);        // data offset
offset = bn_write_uint32(&bnNum);
bn_from_bytes(msg->data_initial_chunk.bytes + 4 + offset, 32, &bnNum);      // data len
dlen = bn_write_uint32(&bnNum);
data = (uint8_t *)(msg->data_initial_chunk.bytes + 4 + 32 + offset);

contractfuncs.c L68-L73

The goal of the listed code instructions is to prepare the uint8_t* data pointer and uint32_t dlen length variables of the data that should be printed. The display logic then uses them to show hexadecimal encoded text versions of the referenced data payload in the Ethereum transaction message to the user. Due to the limited screen size, the conversion and screen dialog operates on paginated chunks.

Display logic:

n = 1;
chunkSize = 39;
while (true) {
    chunk=chunkSize*(n-1);
    for (ctr=chunk; ctr<chunkSize+chunk && ctr<dlen; ctr++) {
        snprintf(&confStr[(ctr-chunk)*2], 3, "%02x", data[ctr]);
    }
    if (!confirm(ButtonRequestType_ButtonRequest_ConfirmOutput,
              title, "Data payload %d: %s", n, confStr)) {
        return false;
    }
    if (ctr >= dlen) {
        break;
    }
    n++;
}

contractfuncs.c L75-L90

The crucial mistake in the message parsing logic is the lack of range checks for the variables. Both uint32_t offset and uint32_t dlen are assigned and used without ensuring that the referenced memory region is firmly within the msg->data_initial_chunk.bytes payload section. This leads to serious problems!

Let’s walk through one of the problematic assignments in more detail:

bn_from_bytes(msg->data_initial_chunk.bytes + 4 + 2*32, 32, &bnNum);        // data offset
offset = bn_write_uint32(&bnNum);

contractfuncs.c L69-L70

In simplified terms, the combination of void bn_from_bytes(const uint8_t *value, size_t value_len, bignum256 *val) and uint32_t bn_write_uint32(const bignum256 *in_number) reads a uint32_t value from a particular memory location without imposing any additional range limitations on the resulting number. In the code snippet shown above, the number conversion first reads a 256-bit bignum from a fixed byte offset within msg->data_initial_chunk.bytes and then assigns the least significant four bytes to offset, discarding the rest of the input.
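
To make the truncation concrete, here is a minimal sketch of what the combined conversion effectively does. It is not the firmware code: low32_of_be_word() is a hypothetical stand-in for the bn_from_bytes() and bn_write_uint32() pair, and it assumes the 32-byte word is big-endian, as Ethereum ABI words are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for bn_from_bytes() + bn_write_uint32():
 * interpret a 32-byte big-endian word and keep only the low 32 bits.
 * The upper 28 bytes are ignored, so no range limit is enforced. */
uint32_t low32_of_be_word(const uint8_t word[32]) {
    assert(word != NULL);
    return ((uint32_t)word[28] << 24) | ((uint32_t)word[29] << 16) |
           ((uint32_t)word[30] << 8)  |  (uint32_t)word[31];
}
```

Any 32-bit value is accepted as the result, regardless of what the upper bytes of the word contain.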

A similar operation happens for the dlen read, but from a flexible offset location (more on this later).

It’s important to remember that the Ethereum transfer request message comes from an untrusted source - the computer acting as the USB host could be compromised by malware, which is the reason behind showing the user confirmation steps on the hardware wallet display in the first place. In this particular code branch of Ethereum transaction signing, the format validation functions that run before cf_confirmExecTx() impose no meaningful limitations on the msg->data_initial_chunk.bytes content.

To summarize, msg->data_initial_chunk.bytes passes over a trust boundary, isn’t validated against any strict specification, and is then used without sufficient length checks.
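
For contrast, the following is a sketch of what range-checked parsing of the same fields could look like. It is my own illustration and not the vendor patch: get_payload_checked(), read_u32_word() and the chunk_size parameter (the number of valid bytes in data_initial_chunk) are hypothetical names, and the words are assumed to be big-endian.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SELECTOR_LEN 4u  /* leading bytes skipped by the "+ 4" in the original */
#define WORD_LEN 32u     /* size of one encoded word */

/* Read a 32-byte big-endian word; fail if it does not fit in 32 bits. */
static bool read_u32_word(const uint8_t *word, uint32_t *out) {
    for (size_t i = 0; i < WORD_LEN - 4; i++) {
        if (word[i] != 0) return false;
    }
    *out = ((uint32_t)word[28] << 24) | ((uint32_t)word[29] << 16) |
           ((uint32_t)word[30] << 8) | (uint32_t)word[31];
    return true;
}

/* Hypothetical hardened variant of the "get data bytes" step. */
bool get_payload_checked(const uint8_t *chunk, uint32_t chunk_size,
                         const uint8_t **data, uint32_t *dlen) {
    uint32_t offset;
    /* The offset word at 4 + 2*32 must be fully inside the chunk. */
    if (chunk_size < SELECTOR_LEN + 3 * WORD_LEN) return false;
    if (!read_u32_word(chunk + SELECTOR_LEN + 2 * WORD_LEN, &offset)) return false;
    /* The length word at 4 + offset must be fully inside the chunk. */
    if (offset > chunk_size - SELECTOR_LEN - WORD_LEN) return false;
    if (!read_u32_word(chunk + SELECTOR_LEN + offset, dlen)) return false;
    /* The data bytes at 4 + 32 + offset must be fully inside the chunk. */
    uint32_t data_start = SELECTOR_LEN + WORD_LEN + offset;
    if (*dlen > chunk_size - data_start) return false;
    *data = chunk + data_start;
    return true;
}
```

The comparisons are arranged so that no intermediate sum can wrap around before it is checked.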

An attacker with control over the message content can exploit the unbounded conversion flaws in two general ways:

  1. Set a large uint32_t offset value, use it to move the data pointer, and leak content from an arbitrary memory location.
  2. Set a small uint32_t offset value, control uint32_t dlen, and run the memory printing function arbitrarily far beyond the packet buffer.

In both cases, the previously quoted display logic will trigger confirm() dialogs that leak raw memory from out-of-bounds regions via snprintf() to the KeepKey device OLED screen. That’s a pretty powerful attack gadget on a hardware wallet, which is supposed to avoid data leaks at all costs!

The following attack description will focus on direct data pointer control via large offset values (variant no. 1), which I’ve found to be more powerful and practical for manual attacks without physical automation. It’s simpler to leak an interesting memory region directly on the screen in a few display pages, compared to setting an oversized dlen length and manually cycling through thousands of display pages before arriving there.
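
To put the page-cycling effort of variant no. 2 into numbers, here is a small sketch of the pagination arithmetic implied by the display loop with chunkSize = 39. The helper names pages_for_len() and page_of_byte() are mine and do not exist in the firmware.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SIZE 39u  /* bytes shown per display page, from the display loop */

static_assert(CHUNK_SIZE > 0u, "page size must be non-zero");

/* Number of confirm() pages the display loop shows for dlen bytes.
 * The loop always shows at least one page, even for dlen == 0. */
uint32_t pages_for_len(uint32_t dlen) {
    if (dlen == 0) return 1;
    return (uint32_t)(((uint64_t)dlen + CHUNK_SIZE - 1) / CHUNK_SIZE);
}

/* 1-based page on which byte index idx (relative to data) appears. */
uint32_t page_of_byte(uint32_t idx) {
    return idx / CHUNK_SIZE + 1;
}
```

As an illustrative distance, a byte located 128 KiB (131072 bytes) behind the start of data would first appear on page 3361, which is impractical to reach by pressing the button manually.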

Digging deeper into the code behavior, we can see that the attacker can force arbitrary pointer addresses for data. Due to the unsigned integer overflow wrapping, the msg->data_initial_chunk.bytes + 4 + 32 + offset calculation can end up with any address in front of or behind msg->data_initial_chunk.bytes! To make matters worse for the defenders, msg->data_initial_chunk.bytes is at a static and well-known absolute address. The currently processed Ethereum message will always be located in a special decode buffer after it is converted from the protobuf wire format:

static void dispatch(const MessagesMap_t *entry, uint8_t *msg,
                     uint32_t msg_size) {
  static uint8_t decode_buffer[MAX_DECODE_SIZE] __attribute__((aligned(4)));

messages.c L122-L124

Since decode_buffer[] is a static global variable and the ARM Cortex-M3 platform has no address space layout randomization, the buffer and the msg->data_initial_chunk.bytes struct field will always be located at the same absolute memory location for a given firmware version. This allows attackers precise and reliable exploitation of this issue without the need for guesses or usage of other information leaks.
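
The offset computation for a chosen target address can be sketched as follows. The helper names are hypothetical, and the addresses in the usage note are made-up placeholders rather than real firmware addresses. The dlen window location assumes that the least significant four bytes are the last four bytes of the 32-byte length word.

```c
#include <assert.h>
#include <stdint.h>

/* data = bytes + 4 + 32 + offset, evaluated modulo 2^32 on the target.
 * Solve for the offset that makes data equal to target_addr. */
uint32_t offset_for_target(uint32_t bytes_addr, uint32_t target_addr) {
    uint32_t offset = target_addr - bytes_addr - 36u;  /* wraparound intended */
    /* Sanity check: the wrapped sum lands exactly on the target. */
    assert((uint32_t)(bytes_addr + 36u + offset) == target_addr);
    return offset;
}

/* Address that the data pointer ends up at for a given offset. */
uint32_t data_addr_for_offset(uint32_t bytes_addr, uint32_t offset) {
    return bytes_addr + 36u + offset;
}

/* Address of the 4 bytes that become dlen: the low end of the 32-byte
 * word at bytes + 4 + offset, which sits directly in front of data. */
uint32_t dlen_window_addr(uint32_t bytes_addr, uint32_t offset) {
    return bytes_addr + 4u + offset + 28u;
}
```

With a placeholder buffer address of 0x20001000, a target in front of the buffer at 0x20000100 yields the offset 0xFFFFF0DC, while a target behind the buffer at 0x20002000 yields 0xFDC.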

For attacks that intend to read out a specific, narrow memory region via crafted offset values, the last remaining obstacle is the limited attacker control over dlen when manipulating data. By picking crafted offset values in the attack message which move data towards other microcontroller memory outside of the message buffer, the dlen-defining read operation moves there as well:

bn_from_bytes(msg->data_initial_chunk.bytes + 4 + offset, 32, &bnNum);      // data len
dlen = bn_write_uint32(&bnNum);

contractfuncs.c L71-L72

Unfortunately for the defenders, this drawback can be worked around since the bignum read logic and display code are very forgiving and will treat basically any data as a meaningful length field. The attacker can simply point to a memory region slightly in front of the targeted data that is known to have some non-null data bytes in the 4-byte window of dlen. As long as the converted dlen value is at least as large as the desired data readout section, the resulting memory readout will successfully leak all relevant data after some pagination.

For edge cases where dlen is unexpectedly small, the display code runs into another failure mode and leaks previously used stack memory via the uninitialized char confStr[131]; variable. However, compared to the arbitrary read gadget of specific memory address contents, this is not nearly as interesting or powerful. Similarly, the attacker can set offset such that display reads will access forbidden memory regions and cause a crash. Given the requirements of this attack, this exploitation variant is also not of much interest, but it is technically part of the potential impact.
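
The stale-buffer behavior can be reproduced in isolation with a sketch like the one below. render_page() is a hypothetical extraction of the inner loop of the display logic. Since reading truly uninitialized memory is undefined behavior in C, the caller simulates leftover stack content by pre-filling the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK_SIZE 39u
#define CONF_STR_LEN 131u

/* Mirrors the inner loop of the display logic for page n (1-based).
 * confStr is deliberately NOT cleared first, like in the original code.
 * Returns the final loop counter value. */
uint32_t render_page(char confStr[CONF_STR_LEN], const uint8_t *data,
                     uint32_t dlen, uint32_t n) {
    assert(n >= 1);
    uint32_t chunk = CHUNK_SIZE * (n - 1);
    uint32_t ctr;
    for (ctr = chunk; ctr < CHUNK_SIZE + chunk && ctr < dlen; ctr++) {
        snprintf(&confStr[(ctr - chunk) * 2], 3, "%02x", (unsigned)data[ctr]);
    }
    return ctr;
}
```

With dlen == 0 the loop body never runs, so whatever was in confStr before the call is passed to the dialog unchanged. Clearing the buffer before each page would avoid this.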

Additional Attack Considerations

The problematic functionality can be triggered by local or remote attackers once the device is in an unlocked state (if a PIN is set on the target device) and the user physically confirms at least some steps of an Ethereum signing flow. The most limiting factor in the attack is that the secret information is only rendered on the physical KeepKey display as hexadecimal-encoded data and not leaked back towards the host computer.

The latter behavior is due to the confirm() handler in confirm_sm.c, which does not make use of the data field in the ButtonRequest message and therefore does not send the displayed string to the computer, where malware could read it after tricking the user into confirming a supposedly low-value Ethereum transaction.

message ButtonRequest {
  optional ButtonRequestType code = 1;
  optional string data = 2;
}

messagemap.def L78

As with other KeepKey USB related vulnerabilities, a malicious website with user-granted WebUSB permissions could trigger this issue. However, in this particular vulnerability there is no return channel for the leaked information, so additional physical capabilities by the attacker are needed. Under some edge conditions, social engineering may be used to trick the victim user of the KeepKey to voluntarily copy or photograph the leaked information from the device screen, but I see this as difficult to achieve reliably given the circumstances.

From a threat model perspective, I see this vulnerability as relevant despite the high attack requirements since it undermines both the implicit and explicit security guarantees of the hardware wallet with regards to the confidentiality of long-term cryptographic key material.

One of the affected mechanisms is an advanced wallet initialization mode of the KeepKey wallet which doesn’t reveal the generated BIP39 mnemonic seed to the user at any point, see lib/firmware/reset.c. Wallets initialized with this mode permanently have the no_backup flag set to true, and the communicated goal is to make a recovery of the key impossible. The demonstrated attack for CVE-2023-27892 clearly violates this goal, as the no_backup flag stays unchanged despite the revealed secret.

Similarly, wallet users may have the expectation that the effects of hands-on attacks against their wallet have to be immediate, i.e., the transfer of funds during the attack, or that attacks are only possible if the unlocked wallet already has significant funds available at the time of the attack. This expectation is not entirely accurate from a technical perspective, for example since attackers could delay the submission of their illegitimately obtained signed transactions to public networks. Still, access to the underlying BIP39 seed certainly allows the attacker much more flexible and targeted theft months or years later across various coins, wallet accounts and addresses. In the case of wallets which were temporarily less protected - no PIN configured, accessible to other people, left connected to an unlocked and unsupervised computer for some minutes - this could make a significant difference in practical risk over the multi-year lifetime of a typical BIP39 seed.

Finally, there’s also the consideration with regards to BIP39 passphrases, which are an additional and highly recommended safety layer on top of the BIP39 mnemonic words to prevent the theft of funds. CVE-2023-27892 opens the door for two particular attacks against passphrases. If a given hardware wallet is accessed/stolen by the attacker due to lack of PIN protection (or by using a known PIN), even a moderately complex passphrase could prevent an attacker from discovering and using the custom passphrase-based wallet that holds some additional funds. An online brute-force attack against possible passphrases using the built-in firmware mechanisms is significantly rate-limited due to slow APIs, limited microcontroller processor speed for derivations as well as physical confirmation steps, which results in very limited attack capabilities. Using CVE-2023-27892, an attacker can obtain the BIP39 seed and then scale offline brute-force attacks to an arbitrary number of powerful systems, making it much more feasible to determine the correct derivation with e.g., a dictionary-based attack. In rare scenarios where the attacker temporarily gets access to a hardware wallet that is not just plugged in and unlocked, but also has a sensitive passphrase cached in-memory, the passphrase may also be revealed directly. This also applies to other volatile secrets in memory such as the PIN, but note that auto-locking and other functionality may interfere with this.

To summarize, CVE-2023-27892 does not benefit attackers who steal a PIN-protected KeepKey that is powered off. For temporarily unprotected and unsupervised devices, however, it significantly increases attacker capabilities: it allows delayed theft, circumvents the no_backup mode guarantees, and enables BIP39 passphrase brute-forcing or direct retrieval as well as other attacks.

Also noteworthy: this security issue may be beneficial to legitimate owners who have partially or completely forgotten/lost essential secrets of their configured devices. Under some conditions, it may be possible to recover secrets that are still in the device (see the previous paragraphs). Leveraging firmware up- and downgrade capability between vendor-signed official firmwares without the mandatory erasure of BIP39 seed secrets could help with this (disclaimer: perform at your own risk!). I’m looking forward to feedback from users in case this security research was helpful in particular recovery cases.

PoC

WARNING: use the provided PoC code at your own risk. The instructions will PERMANENTLY overwrite the configuration of the hardware wallet. Only test with an expendable unit.

  1. Prepare the target KeepKey with a well-known BIP39 seed.
    In the following example, this is done via keepkeyctl wipe_device and keepkeyctl load_device -l "poc_test" -m "keep key program problem process input result memory display defense broken inform", which is a custom seed with a valid checksum.
  2. Ensure the target KeepKey has the firmware v7.5.2, which the PoC is prepared for.
  3. Ensure a working Python3 environment with the pyusb module installed.
  4. Re-connect and PIN-unlock the target KeepKey to simulate the targeted scenario.
  5. Run the BIP39 seed extraction PoC with sufficient permissions for USB access.
  6. Confirm the Ethereum transaction dialogs until data payload information is shown.
  7. Transcribe the hex-encoded ASCII data to obtain the revealed seed information, and skip through additional pages to reveal additional parts of the seed data.
    In the example, the Data payload #1 reveals ep key program problem process input and the Data payload #2 page reveals result memory display defense broken in, with additional data following on the third page.
  8. The revealed information is sufficient to fully recover the configured BIP39 secret.
  9. Also see the PoC code documentation for additional context.
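
The transcription in step 7 amounts to decoding hex pairs back into ASCII characters. A minimal helper for this could look as follows. It is not part of the PoC code, and hex_to_ascii() is a name I picked for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode hex text as transcribed from the display into ASCII.
 * Non-printable bytes are replaced with '.'.
 * Returns the number of decoded bytes, or -1 on malformed input
 * or if the output buffer is too small. */
int hex_to_ascii(const char *hex, char *out, size_t out_size) {
    assert(hex != NULL && out != NULL);
    size_t len = strlen(hex);
    if (len % 2 != 0 || out_size < len / 2 + 1) return -1;
    for (size_t i = 0; i < len / 2; i++) {
        int hi = hex_nibble(hex[2 * i]);
        int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        unsigned char b = (unsigned char)(hi * 16 + lo);
        out[i] = isprint(b) ? (char)b : '.';
    }
    out[len / 2] = '\0';
    return (int)(len / 2);
}
```

For example, the hex string 6570206b6579 decodes to "ep key", which matches the beginning of the first data payload page in the example.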

Coordinated disclosure

This disclosure was marked by significant delays and missing feedback when communicating with the vendor (KeepKey). Initially, they created a public patch for the issue on GitHub but did not respond to the confidential disclosure. After three weeks and two reminders, I got a direct response and technical confirmation, but then the contact broke off again and didn’t resume after multiple followups. Although the vendor has released public security patches and issued a firmware release, I’m not aware of any public security notes or advisory by the vendor on this issue at the time of publishing of this blog post. This is a further regression in disclosure handling compared to the previous disclosure process for CVE-2022-30330 with this vendor in 2022, and may be related to ownership and team changes of the KeepKey product.

In summary, the overall coordinated disclosure progress and publication handling was neither motivating on the researcher side nor overall adequate in my opinion.

In future disclosures, I’ll consider releasing my disclosure information sooner in cases where vendors silently fix security issues during the disclosure period, depending on the patch publication and software release circumstances.

Relevant product

Product: KeepKey
Source: GitHub
Known Affected Version: firmware v7.5.2 to v7.6.0
Fixed Version: v7.7.0
Patch: PR337
Vendor Publications: none
IDs: CVE-2023-27892

A Note About Research Affiliation and Work Time

I want to emphasize that this research was done on my own time and initiative. In particular, it was not sponsored by SatoshiLabs, for whom I do some paid freelance security research on the related Trezor project.

Detailed timeline

Date Information
2023-01-17 Confidential disclosure to KeepKey
2023-01-26 KeepKey publishes GitHub Pull Request no. 337 with security patch
2023-01-29 POC and additional analysis communicated to KeepKey
2023-02-05 Followup email to KeepKey requesting feedback
2023-02-06 Issue confirmation by KeepKey
2023-02-22 GitHub Pull Request no. 337 is merged
2023-03-06 MITRE assigns requested CVE
2023-03-07 Release of KeepKey firmware v7.7.0 with security patch
2023-04-17 End of disclosure period
2023-04-17 Publication of this report
2023-04-19 Report: “Additional Attack Considerations” section extended

Bug bounty

At the time of the report publication, KeepKey has not offered a bug bounty.