The article describes a buffer overflow vulnerability in the USB receive buffer of the KeepKey hardware wallet that was fixed with firmware v6.2.2 in September 2019. I discovered this issue by fuzzing a custom KeepKey emulator setup with libFuzzer and AddressSanitizer.
See the CVE-2019-18672 disclosure post for another KeepKey vulnerability disclosed at the same time.
Please note: As with other articles, this is going to be a technical deep-dive into the specific details that are relevant for the issue.
Correspondingly, the article is written for technical readers with IT security and coding experience.
I’m a freelance Security Consultant and currently available for new projects. If you are looking for assistance to secure your projects or organization, contact me.
As described in the article on a previous KeepKey buffer overflow, the KeepKey uses a specific packet encoding scheme to transport custom protobuf messages over USB, similar to the Trezor devices.
There is a special subset of protobuf messages that are designated as “tiny” and are particularly small and simple in both their encoded and decoded forms. Their main use is to signal user-related decisions, such as a cancel action, that concern ongoing operations. The state machine of the KeepKey is designed to allow special handling of these messages during certain time periods when all other messages are rejected.
Unfortunately, the packet handling logic of the KeepKey insufficiently checks whether the received messages are actually “tiny”.
This message handling issue can be attacked once a regular operation goes into a state where feedback from the host computer via a “tiny” message is required. As a practical example, this is the case once the firmware requests a physical button confirmation by the user.
To signal this special mode, the firmware will set msg_tiny_flag to true.
When the next (attacker-controlled) USB packet from the host is processed, the uint16_t msgId and uint32_t msgSize parameters are read from the packet header.
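The header parsing step can be illustrated with a minimal Python sketch. The layout used here (a `?##` marker, then a big-endian 16-bit message ID and a big-endian 32-bit size, in a 64-byte packet) is an assumption inferred from the proof-of-concept bytes in this article, not a copy of the firmware code, and `parse_header` is a hypothetical helper name.

```python
import struct

PACKET_SIZE = 64  # assumed USB packet size


def parse_header(packet: bytes):
    """Return (msg_id, msg_size, body) for the first packet of a message."""
    if len(packet) < 9 or packet[:3] != b"?##":
        raise ValueError("not the first packet of a message")
    # Assumed layout: 2-byte message ID and 4-byte size, both big-endian.
    msg_id, msg_size = struct.unpack(">HI", packet[3:9])
    return msg_id, msg_size, packet[9:]


# Leading bytes of the first proof-of-concept packet, zero-padded to 64 bytes.
poc_prefix = bytes.fromhex(
    "3f2323" "0001" "00000011" "0a0956554c4e2d31393639" "1001" "1800" "2000"
)
packet = poc_prefix.ljust(PACKET_SIZE, b"\x00")

msg_id, msg_size, body = parse_header(packet)
print(msg_id, msg_size, len(body))  # 1 17 55
```

Under this assumed layout, the 9-byte header leaves 55 bytes for the message body in a single packet, which matches the size limit discussed in the text.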
The encoded size of the protobuf message is checked to be <= 55 bytes in length.
(Note that this does not give guarantees about the decoded size of the message, which is relevant for later.)
msg_id is then checked against the known list of all protobuf input message types that the device recognizes. This will reject unknown message types, but does not actually check whether the message belongs to the “tiny” message subset.
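The two checks described above can be modeled in a few lines of Python. This is a simplified sketch, not the firmware logic: `passes_pre_patch_checks` and `KNOWN_INPUT_IDS` are hypothetical names, and apart from ID 1 (the ping message from the proof of concept, read under an assumed header layout) the message IDs are illustrative placeholders.

```python
MAX_ENCODED_SIZE = 55  # encoded-size limit stated in the article

# Hypothetical stand-in for the firmware's table of known input messages.
# Only ID 1 (ping) is derived from the proof of concept; the other IDs are
# illustrative assumptions.
KNOWN_INPUT_IDS = {0, 1, 19, 20, 27, 42}


def passes_pre_patch_checks(msg_id: int, msg_size: int) -> bool:
    """Model of the two checks: size limit, then 'known message' lookup."""
    if msg_size > MAX_ENCODED_SIZE:
        return False
    # Membership in the full table says nothing about being "tiny".
    return msg_id in KNOWN_INPUT_IDS


print(passes_pre_patch_checks(1, 17))    # True: a small ping is accepted
print(passes_pre_patch_checks(1, 56))    # False: encoded body too large
print(passes_pre_patch_checks(999, 17))  # False: unknown message type
```

The point of the sketch is the first call: a regular, non-"tiny" message type passes both checks as long as its encoded form is short.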
After passing the checks, tiny_dispatch() is called with the attacker-controlled message body (up to 55 bytes in length).
pb_parse() will be called with a target buffer of size 64. The buffer is globally defined and therefore lives in the .bss segment.
The pb_parse() function then calls the Nanopb library’s pb_decode() function, trusting that the small target buffer will be large enough for the expanded message.
Unfortunately, the KeepKey includes several protobuf input message definitions that can have valid, short encoded representations when sent over USB, but need much more space in their decoded representation in the target buffer (large max_size, optional fields).
In those cases, Nanopb will perform multiple non-contiguous out-of-bounds writes with known values behind the msg_tiny buffer. Additionally, some of those out-of-bounds writes are likely partially controllable by the attacker through data in the encoded message.
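The gap between encoded and decoded size can be made concrete with a small Python sketch that decodes the body of the second proof-of-concept packet using the standard protobuf wire format. The helper names are hypothetical, and the 256-byte figure is only the lower bound quoted in the proof-of-concept comment, not a measured struct size.

```python
def read_varint(buf: bytes, pos: int):
    """Decode one protobuf varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_fields(body: bytes) -> dict:
    """Decode varint and length-delimited fields into {field_number: value}."""
    fields, pos = {}, 0
    while pos < len(body):
        key, pos = read_varint(body, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:      # varint
            value, pos = read_varint(body, pos)
        elif wire_type == 2:    # length-delimited (string, bytes)
            length, pos = read_varint(body, pos)
            value, pos = body[pos:pos + length], pos + length
        else:
            raise ValueError("wire type not needed for this sketch")
        fields[number] = value
    return fields


# Body of the second proof-of-concept packet (17 encoded bytes).
body = bytes.fromhex("0a09" "76657279206c6f6e67" "1000" "1800" "2000")

TARGET_BUFFER_SIZE = 64         # size of the target buffer per the article
DECODED_SIZE_LOWER_BOUND = 256  # "256+ bytes" per the proof-of-concept comment

print(decode_fields(body))  # {1: b'very long', 2: 0, 3: 0, 4: 0}
print(len(body), DECODED_SIZE_LOWER_BOUND - TARGET_BUFFER_SIZE)  # 17 192
```

So a 17-byte encoded message that easily passes the 55-byte limit still implies a decoded structure that exceeds the 64-byte buffer by at least 192 bytes.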
In their patch, ShapeShift reworked the code to check, via a switch-case statement, that msg_id corresponds to one of the five whitelisted “tiny” message types before calling pb_decode(). This resolves the issue.
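The idea behind the fix can be sketched as an explicit allowlist check that runs before any decoding. This is not ShapeShift's patch: the firmware uses a C switch-case statement, whereas this sketch uses a set lookup, and the five message IDs below are assumed placeholders chosen for illustration.

```python
MAX_ENCODED_SIZE = 55  # encoded-size limit stated in the article

# Hypothetical IDs standing in for the five whitelisted "tiny" message types.
TINY_MESSAGE_IDS = frozenset({0, 19, 20, 27, 42})


def is_tiny_message(msg_id: int) -> bool:
    """Allowlist check: only explicitly listed types count as "tiny"."""
    return msg_id in TINY_MESSAGE_IDS


def passes_patched_checks(msg_id: int, msg_size: int) -> bool:
    """Decode is only attempted when both the size and the type are allowed."""
    return msg_size <= MAX_ENCODED_SIZE and is_tiny_message(msg_id)


print(passes_patched_checks(20, 0))  # True: an allowlisted type is accepted
print(passes_patched_checks(1, 17))  # False: ping (ID 1) is not "tiny"
```

With an allowlist, the decoded size is bounded by the largest allowlisted message type rather than by the largest message type the device knows.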
Attack scenario and security implications
As described in previous articles, this type of vulnerability in “always exposed” USB packet handling code can be particularly serious due to the potential of automated attacks via malware or impact on stolen devices.
The relevant functionality can be triggered without unlocking the device (and therefore without the PIN) and allows overwriting certain memory areas in or behind the .bss segment with known values. This might be leveraged to change the regular program flow and compromise the device, depending on the memory layout of individual firmware versions.
Existing countermeasures and mitigating factors
As far as I’m aware, existing countermeasures such as stack protection and MPU configuration do not directly detect and prevent this issue.
Proof of concept
Only two ping packets are required to trigger the issue:
# 1x ping message with physical button confirmation request
\x3f\x23\x23\x00\x01\x00\x00\x00\x11\x0a\x09\x56\x55\x4c\x4e\x2d\x31\x39\x36\x39\x10\x01\x18\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
# note: the firmware is now expecting a "tiny" message
# 1x ping message with content, this will be expanded to 256+ bytes in decoded form
\x3f\x23\x23\x00\x01\x00\x00\x00\x11\x0a\x09\x76\x65\x72\x79\x20\x6c\x6f\x6e\x67\x10\x00\x18\x00\x20\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
The relevant message handling code was last changed in May 2019, but as far as I can see, the vulnerability also affects previous code versions. I would assume that all recent versions are affected unless other information becomes available via the vendor.
I confidentially disclosed the issue to ShapeShift in September 2019 and they quickly acknowledged the report and started looking into the issue. The disclosure was simplified by the fact that only one vendor was affected. The Trezor One uses different code to handle the relevant packets and was not vulnerable.
During the relevant time in September, I discovered CVE-2019-18672 and several smaller issues and disclosed them as well. As with previous reports, ShapeShift assigned clear internal issue identifiers for each distinct issue. This is helpful to keep track of vulnerabilities and reference them in internal and public reports, even more so when reporting half a dozen vulnerabilities at once!
I also value the fact that I was able to give feedback on their planned patches before the release.
They shipped a fixed release within 19 days of the disclosure, which is a fairly good response time for firmware issues in my opinion.
As discussed with ShapeShift in multiple emails during the disclosure process, I highly recommend publishing a reasonable amount of details about the fixed vulnerabilities either with the release or soon after in a separate article. I realize that the release of security fixes is a stressful time and the release of accompanying disclosure articles easily gets pushed back a few days behind schedule.
Nevertheless, public security patches essentially give away the vulnerability to malicious actors but not to customers. Customers might not prioritize the upgrade soon enough without knowing its importance. In my opinion, the benefit of substantially delayed publication of relevant information on publicly fixed vulnerabilities is therefore questionable. For more context on this topic, I recommend the Google Project Zero FAQ.
After the initial analysis phase had been completed, I decided that this issue was relevant enough to request a CVE ID from MITRE and ShapeShift was supportive of this step. The CVE assignment itself went quickly and smoothly. See the references below for the technical information.
| product | source | fixed version | vendor references | CVE |
|---|---|---|---|---|
| ShapeShift KeepKey | GitHub | v6.2.2 via patch | disclosure post, release notes, VULN-1969 | CVE-2019-18671 |
| date | event |
|---|---|
| 2019-09-01 | Confidential disclosure to ShapeShift |
| 2019-09-03 | ShapeShift acknowledges the report |
| 2019-09-11 | ShapeShift assigns VULN-1969, severity assessment: critical |
| 2019-09-15 | ShapeShift proposes a patch |
| 2019-09-19 | ShapeShift releases firmware v6.2.2 |
| 2019-09-19 | ShapeShift publishes v6.2.2 release announcement |
| 2019-11-02 | CVE requested from MITRE |
| 2019-11-02 | MITRE assigns CVE-2019-18671 |
| 2019-12-04 | ShapeShift publishes disclosure post for v6.2.2 |
| 2019-12-04 | The CVE-2019-18671 details are published |
Note: previous versions of this timeline included the wrong year.
ShapeShift provided a bug bounty for this issue.