This article describes a vulnerability in the KeepKey hardware wallet that allows an attacker to erase a cryptographic key and compromise the U2F second-factor protection of the KeepKey. I discovered this issue by fuzzing a custom KeepKey emulator setup with libFuzzer and AddressSanitizer. The vulnerability was fixed with firmware v6.2.2 in September 2019.

See the CVE-2019-18671 article for another KeepKey vulnerability that I reported during the same disclosure process.

Please note: As with other articles, this is going to be a technical deep-dive into the specific details that are relevant for the issue.
Correspondingly, the article is written for technical readers with IT security and coding experience.

Technical background

As explained in a previous Trezor One vulnerability article, the private key at the heart of a cryptocurrency hardware wallet is usually encoded as a human-readable seed phrase of standardized English words in accordance with the BIP39 standard. The secret key is normally generated on the device in a random fashion during the initial wallet initialization and only displayed once to the user. Among other things, this ensures that the owner can create an appropriate backup of the key (e.g. on paper or another analog medium) and restore the wallet later in case of issues or migration between devices.

At the request of the user, the KeepKey can be initialized directly from such a pre-existing BIP39 seed phrase to import the wallet(s) associated with this private key. This workflow is called recovery.

In my opinion, this flexibility and ownership over the private key is an important and advantageous design property of a hardware wallet. However, this functionality exposes an interesting attack surface for adversaries (handling of sensitive data, flash writes) and has to be secured against misuse.
Unfortunately, some parts of the KeepKey recovery functionality were not correctly protected.

The vulnerability

During fuzzing-based research, interesting discoveries often begin with sanitizer warnings, in this case with memory corruption:

==16601==ERROR: AddressSanitizer: global-buffer-overflow on address
0x000001a2049f at pc 0x00000055d7b1 bp 0x7fffa1f850f0 sp 0x7fffa1f850e8
WRITE of size 1 at 0x000001a2049f thread T0
    #0 0x55d7b0 in recovery_cipher_finalize
[...]


Digging into the function backtrace, one can see the following code path:

usbPoll -> handle_usb_rx -> usb_rx_helper -> dispatch -> fsm_msgCharacterAck -> recovery_cipher_finalize

This path makes sense in general: CharacterAck messages are used to enter individual characters of the seed words during the KeepKey recovery procedure, which combines character-by-character entry with a substitution matrix shown on the display.

Closer inspection shows that the problematic fuzzing input consists of a single USB packet. Why is recovery_cipher_finalize called after processing the first MessageType_MessageType_CharacterAck protobuf message?

Code analysis shows that it is indeed possible to jump from the fsm_msgCharacterAck function directly to the recovery_cipher_finalize function if the decoded protobuf message has the msg->has_done and msg->done flags set:

void fsm_msgCharacterAck(CharacterAck *msg)
{
  if (msg->has_delete && msg->del) {
    recovery_delete_character();
  } else if(msg->has_done && msg->done) {
    recovery_cipher_finalize();

fsm_msg_common.h

In the “normal” workflow, recovery_cipher_finalize is called on an uninitialized (brand-new or “wiped”) device after a number of seed words have been entered. However, it appears that all relevant security checks in recovery_cipher_init() can be circumvented by an attacker by simply calling recovery_cipher_finalize without doing the initialization.

There are no checks at all to prevent this call on an initialized device (that is in active use and has a valid secret key). Even worse, it can be done on a device in locked state (without PIN checks) and without physical button interaction.

The beginning of the recovery_cipher_finalize() function looks like this:

void recovery_cipher_finalize(void)
{
    static char CONFIDENTIAL new_mnemonic[MNEMONIC_BUF] = "";
    static char CONFIDENTIAL temp_word[CURRENT_WORD_BUF];
    volatile bool auto_completed = true;

recovery_cipher.c

new_mnemonic is the target buffer for the new “recovered” secret seed phrase that should be stored in the device flash by the recovery operation.

The code expects the related mnemonic variable to be filled with space-separated words by the regular recovery steps. However, the attacker can skip those steps by not sending the corresponding messages, so mnemonic is still in its initial 0x00 state:

(gdb) print mnemonic
$1 = '\000' <repeats 287 times>


Correspondingly, char *tok = strtok(mnemonic, " "); only assigns a NULL pointer because there is no space-separated word left in mnemonic and so the string copy routine that copies words from mnemonic to new_mnemonic is skipped completely.
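The skipped copy loop can be reproduced in isolation. The following is a minimal sketch, not the firmware code: count_words and MNEMONIC_BUF_SKETCH are hypothetical names, and the buffer size of 288 bytes is an assumption based on the gdb output above.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed buffer size for illustration only. */
#define MNEMONIC_BUF_SKETCH 288

/*
 * Hypothetical helper: walks the buffer with strtok() the same way a
 * word-copy loop would, counts the tokens and records whether the loop
 * body ran at all.
 */
static int count_words(char *mnemonic, bool *loop_ran)
{
    int words = 0;
    *loop_ran = false;
    for (char *tok = strtok(mnemonic, " "); tok != NULL;
         tok = strtok(NULL, " ")) {
        *loop_ran = true;   /* never reached for an all-zero buffer */
        words++;
    }
    return words;
}
```

For an all-zero buffer, the first strtok() call already returns NULL, so any state that is only updated inside the loop body keeps its initial value.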

Interestingly, this also keeps auto_completed at its previous value of true and so the following error handling is not executed:

if (!auto_completed && !enforce_wordlist) {
  if (!dry_run) {
    storage_reset();
  }
  fsm_sendFailure(FailureType_Failure_SyntaxError,
                 "Words were not entered correctly. Make sure you are using the substition cipher.");
  awaiting_character = false;
  layoutHome();
  return;
}

recovery_cipher.c

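Since new_mnemonic is still an empty string at this point, strlen(new_mnemonic) returns 0. The unsigned subtraction strlen(new_mnemonic) - 1 then wraps around, so the following assignment effectively targets the byte directly before the buffer, which is an out-of-bounds write: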

new_mnemonic[strlen(new_mnemonic) - 1] = '\0';

recovery_cipher.c

This is the one-byte global-buffer-overflow write that AddressSanitizer reported, as shown at the beginning of the article.
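The index arithmetic behind this write can be illustrated without touching memory out of bounds. This is a sketch with hypothetical helper names (last_char_index, strip_last_char); the guarded variant only shows one possible way to avoid the wrap-around and is not the vendor patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Index that a statement of the form s[strlen(s) - 1] = '\0' writes to. */
static size_t last_char_index(const char *s)
{
    /* size_t is unsigned: for an empty string, 0 - 1 wraps to SIZE_MAX. */
    return strlen(s) - 1;
}

/*
 * Guarded variant (hypothetical): removes the last character only if
 * the string is non-empty. Returns 1 if a character was removed.
 */
static int strip_last_char(char *s)
{
    size_t len = strlen(s);
    if (len == 0) {
        return 0;           /* nothing to strip, no write performed */
    }
    s[len - 1] = '\0';
    return 1;
}
```

On typical platforms, indexing a char buffer with SIZE_MAX resolves to the address one byte before the buffer, which matches the "before the buffer" location in the sanitizer report.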

Depending on the data that is saved in the buffer before new_mnemonic and how it is accessed, this out of bounds write would be an interesting memory corruption issue on its own under different circumstances, but in this case it is largely irrelevant due to the steps that follow it. A relevant aspect here is that the protections of the microcontroller will not detect any issues with this statement and the execution continues normally (since the out of bounds write before the buffer is in the .bss segment).

The next code lines are crucial:

if (!dry_run && (!enforce_wordlist || mnemonic_check(new_mnemonic))) {
  storage_setMnemonic(new_mnemonic);
  memzero(new_mnemonic, sizeof(new_mnemonic));
  if (!enforce_wordlist) {
    // not enforcing => mark storage as imported
    storage_setImported(true);
  }
  storage_commit();
  fsm_sendSuccess("Device recovered");

recovery_cipher.c

Despite the broken state and the empty new_mnemonic buffer, recovery_cipher_finalize actually tries to overwrite device secrets in the flash!

storage_setMnemonic(new_mnemonic); replaces the RAM copy of the existing private key in shadow_config.storage.sec.mnemonic with 0x00 data bytes:

void storage_setMnemonic(const char *m)
{
  memset(shadow_config.storage.sec.mnemonic, 0,
       sizeof(shadow_config.storage.sec.mnemonic));
  strlcpy(shadow_config.storage.sec.mnemonic, m,
       sizeof(shadow_config.storage.sec.mnemonic));

storage.c

The second half of storage_setMnemonic() derives the secret key of the U2F second factor authentication mechanism directly from the private key and stores it in a separate variable. This is a deliberate design decision since the U2F key is needed in different security contexts than the main private key. In particular, the secret U2F key needs to be available directly after device startup before the secure section of the storage can be decrypted with the correct PIN as entered by the user.

During the problematic call, the U2F private key will be derived directly from the null mnemonic that was just set and so a static U2F key that is completely known to the attacker will be written to flash:

  storage_compute_u2froot(&session, shadow_config.storage.sec.mnemonic,
                        &shadow_config.storage.pub.u2froot);
  shadow_config.storage.pub.has_u2froot = true;
}

storage.c

As a final step, storage_commit(); is called.

Unfortunately for the attacker, it appears that the changes to storage.sec are not actually applied or persisted to flash by storage_commit() although no exception is thrown and some other changes are persisted. So far, no changes to the wallet private key could be observed in practice under the device conditions that were tested.

The U2F secret key is stored in the storage.pub area (which is “easier” to write to) and is persisted to flash according to my observations.

It is also possible to call recovery_delete_character() on initialized devices without authentication. As far as I can see, this has no actual security impact since storage_reset() is not reachable through that path.

The fix

The main patch changes the finite state machine logic to reject the problematic messages if the recovery has not been started properly. This is done by introducing a recovery_started state flag, which enforces that the necessary protections and checks are performed through recovery_cipher_init().
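The general idea of such a state flag can be sketched as follows. This is a simplified illustration, not the actual patch: apart from the recovery_started name, all identifiers are hypothetical, and the check against an already initialized device is an assumption about what the init step enforces.

```c
#include <stdbool.h>

typedef enum { SKETCH_OK, SKETCH_REJECTED } sketch_result;

/* Set only by a successful init step, cleared again when finalizing. */
static bool recovery_started = false;

/* Hypothetical init step: performs the checks before recovery may begin. */
static sketch_result sketch_recovery_init(bool device_is_initialized)
{
    if (device_is_initialized) {
        return SKETCH_REJECTED;   /* assumed check: refuse on a device in use */
    }
    recovery_started = true;
    return SKETCH_OK;
}

/* Hypothetical handler for a CharacterAck message with the done flag set. */
static sketch_result sketch_character_ack_done(void)
{
    if (!recovery_started) {
        return SKETCH_REJECTED;   /* finalize without init is refused */
    }
    recovery_started = false;     /* one finalize per started recovery */
    return SKETCH_OK;
}
```

With this structure, a single unsolicited "done" message is rejected because the flag can only be set by passing through the init checks first.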

Attack scenario and security implications

It’s clear there should be no way for USB packets from local programs to overwrite important device secrets in this way, regardless of the exact failure scenario. To make things worse, this issue can be triggered remotely via WebUSB by malicious JavaScript after the user agrees to an unspecific permission dialog. There is either no visual indication or only a brief animation on the OLED display and the device appears to work as before, which makes this attack very hard to detect.

Due to the particular behavior described in the analysis section, this issue is mainly interesting for attacks on U2F.

After the vulnerability has been exploited, the U2F secret key is set to a new, static and well-known value, which is used for future registration and authentication operations:

  • U2F-secured logins configured before the attack no longer work (SW_COND_NOT_SAT)
  • U2F-secured logins configured after the attack “work”, but are based on a very insecure key

Scenario I:

Remotely exploit the vulnerability from a malicious webpage with the goal of invalidating existing U2F login configurations and causing the user inconvenience (persistent denial of service). Fallback methods such as site-specific recovery codes have to be used to regain access to the login in question. The U2F key can be recovered by re-importing the BIP39 seed, but this is not obvious to the user.

Scenario II:

Malware on the host computer could perform this attack to effectively remove the U2F protections of important high-value logins secured with a vulnerable KeepKey (such as a cryptocurrency exchange service). This allows an attacker to access the account in question with just the regular password credentials (obtained for example by a keylogger) and knowledge of the new static U2F secret key. This would only work with U2F-secured logins registered after the attack, but I think it is plausible that at least some users will re-register the U2F key for existing logins or can be tricked into doing so.

Proof of concept

As described in the analysis, only one USB packet is required to trigger the vulnerability:

# 1x CharacterAck message with flags set
?##\x00Q\x00\x00\x00\x02\x18\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

If successful, the answer will look like this:

0040   3f 23 23 00 02 00 00 00 12 0a 10 44 65 76 69 63   ?##........Devic
0050   65 20 72 65 63 6f 76 65 72 65 64 00 00 00 00 00   e recovered.....
0060   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
0070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
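The framing visible in both packets can be decoded with a few lines of C. This is a sketch based only on the bytes shown above: the layout ("?##" marker, two-byte big-endian message type, four-byte big-endian payload length, then the protobuf payload) is inferred from those bytes, and all type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the packet header inferred from the dumps. */
typedef struct {
    uint16_t msg_type;        /* big-endian message type */
    uint32_t msg_len;         /* big-endian payload length in bytes */
    const uint8_t *payload;   /* points into the caller's buffer */
} packet_header;

/* Returns 0 on success, -1 on a bad marker or an implausible length. */
static int parse_packet(const uint8_t *buf, size_t len, packet_header *out)
{
    if (len < 9 || buf[0] != '?' || buf[1] != '#' || buf[2] != '#') {
        return -1;
    }
    out->msg_type = (uint16_t)((buf[3] << 8) | buf[4]);
    out->msg_len = ((uint32_t)buf[5] << 24) | ((uint32_t)buf[6] << 16) |
                   ((uint32_t)buf[7] << 8) | (uint32_t)buf[8];
    if (out->msg_len > len - 9) {
        return -1;            /* payload would not fit into this packet */
    }
    out->payload = buf + 9;
    return 0;
}

/* Single-byte protobuf tag: upper bits are the field number. */
static unsigned pb_field_number(uint8_t tag) { return tag >> 3; }

/* Single-byte protobuf tag: lower three bits are the wire type. */
static unsigned pb_wire_type(uint8_t tag) { return tag & 0x07u; }
```

Applied to the request, this yields message type 0x51 with a two-byte payload 0x18 0x01, which is protobuf field 3 as a varint with value 1, presumably the done flag. The response decodes to message type 2 with an 18-byte payload containing the string "Device recovered".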

Existing countermeasures and mitigating factors

None of the regular security mechanisms prevents this issue:

  • PIN check and physical button confirmation for dangerous operations are circumvented
  • the device flash is normally locked against accidental writes, but the problematic code unlocks it explicitly
  • the incidental out of bounds write is not detected by the stack protection or memory protection unit

Affected versions

The issue was discovered with firmware v6.2.0. The problematic state machine behavior was confirmed for previous firmware versions v6.0.0 and v4.0.0 as well. Looking at the patch history of the recovery logic, I think it is plausible that most or all recent firmware versions are similarly affected in terms of the general logic bug.

Note: not all firmware versions in question have U2F capabilities enabled, and WebUSB is unavailable for older firmware versions.

Coordinated disclosure

The disclosure process is described in the “Coordinated disclosure” section of the CVE-2019-18671 article, which covers all relevant aspects of this disclosure as well.

It’s noteworthy that this issue was fixed by ShapeShift with a firmware update after just 9 days.

Relevant product

product source fixed version patch vendor references
ShapeShift KeepKey GitHub v6.2.2 via patch disclosure post, release notes, VULN-1971 CVE-2019-18672

Detailed timeline

Date info
2019-09-11 Confidential disclosure to ShapeShift
2019-09-12 ShapeShift assigns VULN-1971
2019-09-15 ShapeShift proposes a patch
2019-09-19 ShapeShift releases firmware v6.2.2
2019-09-19 ShapeShift publishes v6.2.2 release announcement
2019-11-02 CVE requested from MITRE
2019-11-02 MITRE assigns CVE-2019-18672
2019-12-04 ShapeShift publishes disclosure post for v6.2.2
2019-12-04 The CVE-2019-18672 details are published

Bug bounty

ShapeShift provided a bug bounty for this issue.