The article describes a buffer overflow vulnerability in the USB receive buffer of the KeepKey hardware wallet that was fixed with firmware v6.2.2 in September 2019. I discovered this issue by fuzzing a custom KeepKey emulator setup with libFuzzer and AddressSanitizer.

See the CVE-2019-18672 disclosure post for another KeepKey vulnerability disclosed at the same time.

Please note: As with other articles, this is going to be a technical deep-dive into the specific details that are relevant for the issue.
Correspondingly, the article is written for technical readers with IT security and coding experience.

Consulting

I’m a freelance Security Consultant and currently available for new projects. If you are looking for assistance to secure your projects or organization, contact me.

Technical background

As described in the article on a previous KeepKey buffer overflow, the KeepKey uses a specific packet encoding scheme to transport custom protobuf messages over USB, similar to the Trezor devices.

There is a special subset of protobuf messages that are designated as “tiny” and are particularly small and simple in both their encoded as well as decoded form. Their main use is to signal user-related decisions like a cancel action that concern ongoing operations. The state machine of the KeepKey is designed to allow special handling of these messages during certain time periods when all other messages are rejected.

Unfortunately, the packet handling logic of the KeepKey insufficiently checks whether the received messages are actually “tiny”.

The vulnerability

This message handling issue can be attacked once a regular operation goes into a state where feedback from the host computer via a “tiny” message is required. As a practical example, this is the case once the firmware requests a physical button confirmation by the user.

To signal this special mode, the firmware will set msg_tiny_flag to true.

When the next (attacker-controlled) USB packet from the host is processed, the uint16_t msgId and uint32_t msgSize parameters are read from the packet header:

void handle_usb_rx(const void *msg, size_t len)
{
  if (msg_tiny_flag) {
    uint8_t buf[64];
    memcpy(buf, msg, sizeof(buf));

    uint16_t msgId = buf[4] | ((uint16_t)buf[3]) << 8;
    uint32_t msgSize = buf[8]        |
          ((uint32_t)buf[7]) <<  8 |
          ((uint32_t)buf[6]) << 16 |
          ((uint32_t)buf[5]) << 24;

messages.c

The encoded size of the protobuf message is checked to be <= 55 byte in length.
(Note that this does not give guarantees about the decoded size of the message, which is relevant for later.)

if (msgSize > 64 - 9) {
  (*msg_failure)(FailureType_Failure_UnexpectedMessage, "Malformed tiny packet");
  return;
}

messages.c

The msg_id is then checked against the known list of all protobuf input message types that the device recognizes. This will reject unknown message types, but does not actually check whether the message belongs to the “tiny” message subset.

// Determine callback handler and message map type.
const MessagesMap_t *entry = message_map_entry(NORMAL_MSG, msgId, IN_MSG);
if (!entry) {
  (*msg_failure)(FailureType_Failure_UnexpectedMessage, "Unknown message");
  return;
}

messages.c

After passing the checks, tiny_dispatch() is called with the attacker-controlled message body (up to 55 bytes in length):

static void tiny_dispatch(const MessagesMap_t *entry, uint8_t *msg, uint32_t msg_size)
{
    if (!pb_parse(entry, msg, msg_size, msg_tiny)) {

messages.c

Within the tiny_dispatch() function, pb_parse() will be called with a target buffer of size 64. The buffer is globally defined and therefore lives in the .bss segment:

static CONFIDENTIAL uint8_t msg_tiny[MSG_TINY_BFR_SZ];

messages.c

The pb_parse() function then calls the Nanopb library’s pb_decode() function, trusting that the small target buffer will be large enough for the expanded message:

static bool pb_parse(const MessagesMap_t *entry, uint8_t *msg, uint32_t msg_size,
                     uint8_t *buf)
{
  pb_istream_t stream = pb_istream_from_buffer(msg, msg_size);
  return pb_decode(&stream, entry->fields, buf);
}

messages.c

Unfortunately, the KeepKey includes several protobuf input message definitions that can have valid, short encoded representations when sent over USB, but need much more space in their decoded representation in the target buffer (large max_size, optional fields).

In those cases, Nanopb will perform multiple non-continuous out of bounds writes with known values behind the msg_tiny. Additionally, some of those out of bounds writes are likely partially controllable by the attacker through data in the encoded message.

The fix

In their patch, ShapeShift has reworked the code to perform actual checks that the msg_id corresponds to one of the five whitelisted “tiny” message types via a switch-case statement before calling pb_decode(). This resolves this issue.

Attack scenario and security implications

As described in previous articles, this type of vulnerability in “always exposed” USB packet handling code can be particularly serious due to the potential of automated attacks via malware or impact on stolen devices.

The relevant functionality can be triggered without unlocking the device (-> without PIN) and allows overwriting certain memory areas in or behind the .bss segment with known values. This might be leveraged to change the regular program flow and compromise the device depending on the memory layout of individual firmware versions.

Existing countermeasures and mitigating factors

As far as I’m aware, existing countermeasures such as stack protection and MPU configuration do not directly detect and prevent this issue.

Proof of concept

Only two ping packets are required to trigger the issue:

# 1x ping message with physical button confirmation request
\x3f\x23\x23\x00\x01\x00\x00\x00\x11\x0a\x09\x56\x55\x4c\x4e\x2d\x31\x39\x36\x39\x10\x01\x18\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

# note: the firmware is now expecting a "tiny" message

# 1x ping message with content, this will be expanded to 256+ bytes in decoded form
\x3f\x23\x23\x00\x01\x00\x00\x00\x11\x0a\x09\x76\x65\x72\x79\x20\x6c\x6f\x6e\x67\x10\x00\x18\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

Affected versions

The relevant message handling code was last changed in May 2019, but as far as I can see, the vulnerability also affects previous code versions. I would assume that all recent versions are affected unless other information becomes available via the vendor.

Coordinated disclosure

I confidentially disclosed the issue to ShapeShift in September 2019 and they quickly acknowledged the report and started looking into the issue. The disclosure was simplified by the fact that only one vendor was affected. The Trezor One uses different code to handle the relevant packets and was not vulnerable.

During the relevant time in September, I discovered CVE-2019-18672 and several smaller issues and disclosed them as well. As with previous reports, they assigned clear internal issue identifiers for each distinct issue. This is helpful to keep track of vulnerabilities and reference then in internal and public reports - even more so when reporting half a dozen vulnerabilities at once!
I also value the fact that I was able to give feedback on their planned patches before the release.

They shipped a fixed release within 19 days of the disclosure, which is a fairly good response time for firmware issues in my opinion.

As discussed with ShapeShift in multiple emails during the disclosure process, I highly recommend publishing a reasonable number of details about the fixed vulnerabilities either with the release or soon after in a separate article. I realize that the release of security fixes is a stressful time and the release of accompanying disclosure articles easily get pushed back a few days behind schedule.

Nevertheless, public security patches essentially give away the vulnerability to malicious actors but not to customers. They might not prioritize the upgrade soon enough without knowing its importance. In my opinion, the benefit of substantially delayed publication of relevant information on publicly fixed vulnerabilities is therefore questionable. For more context on this topic, I recommend the Google Project Zero FAQ.

After the initial analysis phase had been completed, I decided that this issue was relevant enough to request a CVE ID from MITRE and ShapeShift was supportive of this step. The CVE assignment itself went quickly and smoothly. See the references below for the technical information.

Relevant product

product source fixed version vendor references CVE
ShapeShift KeepKey GitHub v6.2.2 via patch disclosure post, release notes , VULN-1969 CVE-2019-18671

Detailed timeline

Date info
2019-09-01 Confidential disclosure to ShapeShift
2019-09-03 ShapeShift acknowledges the report
2019-09-11 ShapeShift assigns VULN-1969, severity assessment: critical
2019-09-15 ShapeShift proposes a patch
2019-09-19 ShapeShift releases firmware v6.2.2
2019-09-19 ShapeShift publishes v6.2.2 release announcement
2019-11-02 CVE requested from MITRE
2019-11-02 MITRE assigns CVE-2019-18671
2019-12-04 ShapeShift publishes disclosure post for v6.2.2
2019-12-04 The CVE-2019-18671 details are published

Note: previous versions of this timeline included the wrong year.

Bug bounty

ShapeShift provided a bug bounty for this issue.