Trezor One receive buffer vulnerability - part one

This article describes the buffer overflow vulnerability in the USB receive buffer of the Trezor One that was fixed with the 1.6.2 firmware in June 2018.

This vulnerability is the most serious issue that I found during my master's thesis on fuzzing and verification. On the older firmware 1.5.2, Saleem Rashid showed with a software-based proof-of-concept (POC) exploit that it can be used to extract the mnemonic seed and PIN. The attack works without physical access or user interaction, as discussed in the following paragraphs.

To phrase this a bit differently: it’s a really good idea to upgrade to a newer firmware version if you are still running anything before version 1.6.2 in 2020.

See the previous article for the unrelated stack overflow issue I also found during my academic research and this article for the buffer overflow issue in the KeepKey.

Technical background
The vulnerability
Responsible disclosure

Consulting

I’m a freelance Security Consultant and currently available for new projects. If you are looking for assistance to secure your projects or organization, contact me.

Technical background

Data communication between the Trezor One and the host computer is performed via USB packets that are always 64 bytes in size. Since many of the protobuf messages that make up the Trezor-specific communication protocol are bigger than these 64 bytes, a special format is used to send them in fragmented chunks via multiple individual USB packets if necessary.

The first USB packet in a sequence carries a 9-byte header that contains the relevant protobuf message type as well as the intended overall message size. For fragmented messages, the state machine in the Trezor One then switches to a mode where it copies the payload of subsequent packets, each with a 1-byte header, into a large receive buffer until the required number of bytes is reached and the fully reassembled protobuf message is available. Afterwards, the message is decoded and evaluated.
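The fragmentation scheme can be illustrated with a minimal sketch of the host side. The function names `packets_needed` and `build_packet` are hypothetical. The exact header layout used here ('?' marker, "##" magic, big-endian 2-byte type and 4-byte length) is an assumption based on the commonly documented Trezor wire format; the text above only states the header sizes of 9 bytes and 1 byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKET_SIZE 64
#define FIRST_HEADER_SIZE 9 /* assumed: '?' + "##" + 2-byte type + 4-byte length */
#define NEXT_HEADER_SIZE 1  /* '?' marker only */
#define FIRST_PAYLOAD (PACKET_SIZE - FIRST_HEADER_SIZE) /* 55 bytes */
#define NEXT_PAYLOAD (PACKET_SIZE - NEXT_HEADER_SIZE)   /* 63 bytes */

/* Number of 64-byte USB packets needed to carry msg_size payload bytes. */
static size_t packets_needed(size_t msg_size) {
    if (msg_size <= FIRST_PAYLOAD) {
        return 1;
    }
    /* one first packet plus ceil(remaining / 63) continuation packets */
    return 1 + (msg_size - FIRST_PAYLOAD + NEXT_PAYLOAD - 1) / NEXT_PAYLOAD;
}

/* Fill out[] with packet number `index` (0-based) of a fragmented message.
 * Unused trailing bytes are zero-padded. */
static void build_packet(uint8_t out[PACKET_SIZE], uint16_t type,
                         const uint8_t *msg, uint32_t msg_size, size_t index) {
    size_t header, offset;

    assert(index < packets_needed(msg_size));
    memset(out, 0, PACKET_SIZE);
    out[0] = '?';
    if (index == 0) {
        out[1] = '#';
        out[2] = '#';
        out[3] = (uint8_t)(type >> 8);
        out[4] = (uint8_t)(type & 0xff);
        out[5] = (uint8_t)(msg_size >> 24);
        out[6] = (uint8_t)(msg_size >> 16);
        out[7] = (uint8_t)(msg_size >> 8);
        out[8] = (uint8_t)(msg_size & 0xff);
        header = FIRST_HEADER_SIZE;
        offset = 0;
    } else {
        header = NEXT_HEADER_SIZE;
        offset = FIRST_PAYLOAD + (index - 1) * NEXT_PAYLOAD;
    }

    size_t room = PACKET_SIZE - header;
    size_t left = offset < msg_size ? msg_size - offset : 0;
    memcpy(out + header, msg + offset, left < room ? left : room);
}
```

With these sizes, a message that fills the complete 12288-byte receive buffer needs 196 packets, while a message of 12277 bytes needs exactly 195.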

The vulnerability

The static receive buffer is defined as 12288 bytes in size:

#define MSG_IN_SIZE (12*1024)

messages.h

static uint8_t msg_in[MSG_IN_SIZE];

messages.c

At first glance, the buffer size appears reasonable since it is a multiple of the 64-byte packet size. However, this receive buffer stores the contents of the USB packets without the custom headers. Each packet therefore contributes 64 - 9 = 55 bytes (first packet) or 64 - 1 = 63 bytes (each subsequent packet) of the current fragmented message flow.

After 195 well-formed input packets, the buffer holds 1 × 55 + 194 × 63 = 12277 bytes. It is not yet full (11 bytes remain), but the next payload chunk of 63 bytes will not fit completely.

Through this edge case, an attacker can write 63 - 11 = 52 arbitrary bytes past the end of the buffer via the 196th USB packet once msg_pos == 12277, provided that the first packet specifies a total message size of 12277 < msg_size <= 12288:

// [...]
if (read_state == READSTATE_READING) {
	if (buf[0] != '?') {	// invalid contents
		read_state = READSTATE_IDLE;
		return;
	}
	memcpy(msg_in + msg_pos, buf + 1, len - 1);
	msg_pos += len - 1;
}

if (msg_pos >= msg_size) {
	msg_process(type, msg_id, fields, msg_in, msg_size);
	msg_pos = 0;
	read_state = READSTATE_IDLE;
}

messages.c

The out-of-bounds write hits the variable(s) located directly behind msg_in in memory. Notably, the stack canary protection does not detect this error, since msg_in is a static buffer and not a stack variable. The condition msg_pos >= msg_size is true, so the program execution continues normally towards msg_process().
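The packet arithmetic above can be checked with a small model of the unpatched position bookkeeping. This is a sketch under assumptions, not firmware code: `bytes_past_end` is a hypothetical helper that only tracks msg_pos and copies nothing, and it assumes that a declared msg_size larger than MSG_IN_SIZE is rejected when the first packet is parsed, in line with the msg_size <= 12288 condition stated above.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_IN_SIZE (12 * 1024)
#define FIRST_PAYLOAD 55 /* 64-byte packet minus 9-byte header */
#define NEXT_PAYLOAD 63  /* 64-byte packet minus 1-byte header */

/* Models the pre-fix msg_pos updates for full-size packets and returns
 * how many bytes the final memcpy would place behind msg_in. */
static size_t bytes_past_end(size_t msg_size) {
    if (msg_size > MSG_IN_SIZE) {
        return 0; /* assumption: oversized messages are rejected up front */
    }

    size_t msg_pos = FIRST_PAYLOAD; /* payload of the first packet */
    while (msg_pos < msg_size) {
        msg_pos += NEXT_PAYLOAD; /* unclamped, as in the vulnerable code */
    }

    assert(msg_pos >= msg_size); /* reassembly is considered complete */
    return msg_pos > MSG_IN_SIZE ? msg_pos - MSG_IN_SIZE : 0;
}
```

In this model, every declared size from 12278 up to 12288 yields 52 bytes behind the buffer, while 12277 and all smaller sizes yield none.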

The fix

The fix by Dr. Jochen Hoenicke enforces the buffer boundaries by limiting the copied area of the offending packet:

-		memcpy(msg_in + msg_pos, buf + 1, len - 1);
-		msg_pos += len - 1;

+		/* raw data starts at buf + 1 with len - 1 bytes */
+		buf++;
+		len = MIN(len - 1, MSG_IN_SIZE - msg_pos);
+
+		memcpy(msg_in + msg_pos, buf, len);
+		msg_pos += len;

messages.c
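The effect of the clamp can be reproduced outside the firmware with a small harness. The struct `rx_state`, its `guard` member and the function `append_chunk` are hypothetical names for this illustration; only the MIN-based length calculation mirrors the patch shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSG_IN_SIZE (12 * 1024)
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical receive state; guard stands in for whatever data the
 * linker places behind msg_in on the real device. */
typedef struct {
    uint8_t msg_in[MSG_IN_SIZE];
    uint8_t guard[64];
    size_t msg_pos;
} rx_state;

/* Append the payload of one continuation packet (1-byte header).
 * Returns the number of bytes actually stored. */
static size_t append_chunk(rx_state *rx, const uint8_t *buf, size_t len) {
    assert(len >= 1 && rx->msg_pos <= MSG_IN_SIZE);

    /* raw data starts at buf + 1 with len - 1 bytes */
    buf++;
    len = MIN(len - 1, MSG_IN_SIZE - rx->msg_pos);

    memcpy(rx->msg_in + rx->msg_pos, buf, len);
    rx->msg_pos += len;
    return len;
}
```

Starting from msg_pos == 12277, a full 64-byte packet now stores only the 11 bytes that still fit, and any further packet stores nothing, so the data behind the buffer stays untouched.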

Attack scenario and security implications

As outlined in the introduction, a practical software-based exploit for a wallet vulnerability is particularly serious due to the potential scalability of the attack. For example, malware targeted at cryptocurrency users could infect a large number of hosts and then attack the relevant devices.

The general attack complexity is fairly low for this specific vulnerability:

the vulnerable component is always exposed while the device is connected
the attack requires no knowledge of the internal device state
the device can be attacked in a locked state (before PIN entry)
there is no manual button action necessary to reach the affected functionality (“0-click”)
no other device communication has to be read (no USB man-in-the-middle)
no elevated permissions are necessary on the host (if the regular Trezor environment setup is installed)

Additionally, the attack is not visible on the OLED display and the device is not altered in a permanent way, which makes a practical detection by the user fairly difficult.

Unlike other software vulnerabilities such as the stack overflow issue, this flaw is also highly interesting to an attacker in a theft or evil-maid scenario where the attacker has physical access to the device. It allows the mnemonic to be copied off the device without knowledge of the PIN or the necessity to open the device case. Consequently, the actual theft of funds can happen months or years after the attack.

Multiple people involved in the analysis judged it plausible that this attack can be leveraged even further to achieve full code execution on the device. If this is the case, then it might even be relevant in a supply-chain-attack scenario where there are no funds on the device yet, but some parts of the firmware can be permanently altered in favor of the attacker.

Existing countermeasures and mitigating factors

At the time of the discovery of the issue, the three most recent firmware versions of the Trezor One were

version 1.5.2 - released 2017-08-16
version 1.6.0 - released 2017-11-16
version 1.6.1 - released 2018-03-20

During analysis, we learned that an unrelated change in the memory layout, which was intended to group the more sensitive data areas, had the positive but unintended effect of mitigating the attack. To cause a practical impact and allow exploitation, the memory area overwritten by the buffer overflow must contain data that is then used during the following program steps. For firmware 1.6.0 and 1.6.1, this was no longer the case, so only 1.5.2 and earlier versions are practically affected.

With release 1.6.1, the memory protection unit (MPU) of the microcontroller was activated, which makes certain attacks more difficult. It is unclear if this would have been a strong protection against exploitation.

Countermeasures introduced after firmware 1.6.2

More recent firmware versions have switched to a different storage layout where the PIN is no longer directly stored on the device and the mnemonic seed is stored in an encrypted form. This reduces the chance that attackers can misuse intended functionality to leak data in a device state where the PIN has not been entered correctly.

Unfortunately, the relevant microcontrollers do not support a concept of virtual memory where techniques like address space layout randomization (ASLR) are possible, which would have made this attack harder to execute by randomizing the positions of certain important memory areas.

Downgrade attack

Firmware version 1.6.0 can be downgraded to firmware version 1.5.2 without a forced erasure of the secrets on the device and without knowing the PIN (see the Trezor wiki). This is relevant for an attacker with physical access to the device since a “not practically affected” 1.6.0 can be downgraded into a “vulnerable” 1.5.2 and then exploited.

Due to this aspect, firmware version 1.6.0 should be treated as vulnerable as well if any form of physical access by an attacker is plausible.

Details about the exploit

This will be described in part 2 of the article.

Responsible disclosure

Initial disclosure

I responsibly disclosed the issue to SatoshiLabs through my thesis advisor Dr. Jochen Hoenicke, who is a part-time security researcher at SatoshiLabs. SatoshiLabs fixed this issue quickly with a firmware release after ~32 days, as described in the timeline. See also the related article on a KeepKey vulnerability that influenced this disclosure process.

As mentioned in the article, Saleem Rashid had confidential access to the report and quickly built the POC exploit.

Given the sensitive nature of the issue, I reached out to ShapeShift since their KeepKey product was based on similar code. They quickly confirmed that they were not affected by the vulnerability.

I co-authored the public disclosure article on the issue.

Second disclosure

As described in the article on the related disclosure of the dry-run recovery stack overflow, there is another product which is affected by this issue.

During the OLED information leak issue in mid-2019, I found that the firmware of the ARCHOS Safe-T hardware wallet also contained the vulnerable code and was not yet fixed although ~12 months had passed after the public and high-profile Trezor disclosure. I notified ARCHOS privately and urged them to patch the issue as soon as possible. They confirmed the vulnerability in early July 2019 and indicated that they were planning to do a firmware release within two weeks of that date.

In late July 2019, they announced that the beta firmware was ready and that the GitHub repository would be updated soon.

Unfortunately, I have not heard back from them as of 2020-01-05 despite multiple attempts to reach them. At this point, their GitHub repository has not been updated for over a year and the newest firmware version is still not patched.

The relevant messages.c code of the current Safe-T 1.1.3 firmware release is very similar to the Trezor One firmware version 1.6.0, which moves the msg_in buffer to a different position in the binary via the CONFIDENTIAL property. This likely results in the same “accidental mitigation”: the buffer overflow is moved to a memory position where it does no practical damage to the Safe-T code flow.
Please note: I have not performed an in-depth analysis of the memory sections on the Safe-T that are relevant to this consideration. Also, it might be possible to downgrade some of the more recent firmware to older versions that are vulnerable in a different way.

Therefore, I recommend treating all versions of the Safe-T firmware as vulnerable until there are clarifying statements from ARCHOS.

Relevant products

product	source	fixed version	vendor references
SatoshiLabs Trezor One	GitHub	1.6.2	Issue disclosure post, General release notes
ShapeShift KeepKey	GitHub	not affected	-
Archos Safe-T	GitHub	no public patch in the repository, see (1)	no public report

Detailed timeline

Date	info
2018-05-25	Issue is described to Dr. Hoenicke and disclosed to SatoshiLabs
2018-05-29	Internal patch is available
2018-06-04	Planned release date: 2018-06-13
2018-06-04	Issue is disclosed to ShapeShift
2018-06-06	ShapeShift confirms that they are not affected
2018-06-25	Firmware v1.6.2 is released
2018-06-25	Public blog post on the general firmware update
2018-07-12	Public disclosure of the vulnerability via an in-depth Trezor blog post
~~~~
2019-05-04	First attempt to reach ARCHOS development team
2019-06-28	First response from ARCHOS development team
2019-07-01	Detailed notice to ARCHOS development team about unpatched status of Safe-T
2019-07-01	ARCHOS acknowledges the issue
2019-07-12	ARCHOS reports internal evaluation of the patch
2019-07-25	ARCHOS reports that the firmware fix will be released soon
2019-08-10	Request to ARCHOS for a response
2019-10-14	Request to ARCHOS for a response
2019-11-27	Request to ARCHOS for a response
2019-12-07	Request to ARCHOS for a response

Credit

I would like to credit and thank

Dr. Jochen Hoenicke as a co-author of this discovery
Saleem Rashid (1) for the discovery of the exploit
Dr. Daniel Dietsch for his general assistance and supervision during the thesis

Bug bounty

SatoshiLabs provided a bug bounty for this issue.
