This is a well-known attack vector: It’s often used by the Android rooting and modding community, but our guess is that it’s way more popular with law enforcement and government agencies.
All the more interesting, then, that S-Boot contains several memory corruption bugs, one of which could be used to reach full code execution within the bootloader.
We can currently confirm the existence of the vulnerability only on Exynos chipsets. It appears to affect approximately 90% of the Samsung Exynos ROMs running on S5, S6 and S7. The very newest ROMs for S7 (February 2017) appear to include a fix for this bug, but we’ll confirm this in a few days.
There’s a lot of ground to cover, so we’ll break up this write-up into two posts. In this post we’ll focus on some S-Boot internals, then explore the bootloader’s attack surface and get basic debugging capabilities. We’ll end the post with the discovery of an especially interesting attack surface. In the next post we’ll disclose the actual vulnerability and how we exploited it to get code execution in S-Boot.
We won’t go into much detail on the basics of reversing S-Boot, such as how to load it into IDA or find the base address. Fernand Lone Sang (@_kamino_) is about to publish a great article on exactly that and I’ll put a link to it here when it’s out. If you need any help beyond that, just DM me and I’d be glad to give you a hand if I can.
|The boot stages on Samsung phones|
The Android boot process on Samsung begins with code running in the Boot ROM, which verifies the integrity of the next-stage bootloader using the OEM public key, known on Samsung devices as the Samsung Secure Boot Key (SSBK). It then loads two separate processes into memory: One is S-Boot itself, and the other is the TrustZone TEE (Trusted Execution Environment), running in the so-called “Secure world”.
The two processes work in tandem. The TEE OS, which in the Exynos case is Trustonic (formerly MobiCore), is called from S-Boot to verify that images are properly signed before they’re loaded or flashed. Therefore, a compromise in either S-Boot or the TEE will mean a potential compromise of the whole system.
S-Boot itself is divided in two: The first stage bootloader, BL1, is called from the Boot ROM and initializes the low-level system primitives. BL2, which BL1 jumps into after verifying its signature, is already a minimal OS on its own, complete with driver support for USB, display and I/O.
Since we were interested in finding a bug that would let us subvert the boot process, we decided to look for it as close to the actual kernel booting as possible. That’s because we knew we’d already have an initialized system at our disposal, making further operations such as disk I/O (which we’ll need in order to flash our custom image) rather trivial. So we decided to jump into BL2 and ignore BL1 for now (although we’re sure it’ll be fascinating to reverse it at a later stage).
At this stage we didn’t have any debugging capabilities at all, just the sboot.bin blob that comes together with the standard Samsung Exynos image. So we opened the blob in IDA and homed in on BL2.
|A typical function in BL2. Notice the quantity of strings|
This was pretty easy: knowing that BL1 is mainly responsible for low-level initialization, while BL2 is almost a full-featured OS, we concluded that functions belonging to BL2 would necessarily be bigger, with more debug strings and more references to other functions. Once we determined where BL2 was, we used some old reversing tricks to determine the base address of the image in memory.
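The string-density idea can be approximated outside IDA as well. The following is a minimal sketch, not the method we used: it counts NUL-terminated printable strings per fixed-size window of a blob, on the assumption that the region holding BL2’s data will stand out. The names `string_density` and `densest_windows`, the 6-character minimum and the 64 KB window are all illustrative choices. Note that it shows where string data lives, not which functions reference it, so it is only a first hint.

```python
import re

# Runs of 6+ printable ASCII bytes followed by NUL: a rough stand-in for C debug strings.
STRING_RE = re.compile(rb"[\x20-\x7e]{6,}\x00")


def string_density(blob: bytes, window: int = 0x10000):
    """Return (window_offset, string_count) pairs covering the whole blob."""
    counts = {}
    for match in STRING_RE.finditer(blob):
        bucket = (match.start() // window) * window
        counts[bucket] = counts.get(bucket, 0) + 1
    return [(off, counts.get(off, 0)) for off in range(0, len(blob), window)]


def densest_windows(blob: bytes, window: int = 0x10000, top: int = 5):
    """Return the `top` windows with the most strings, densest first."""
    ranked = sorted(string_density(blob, window), key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


# Example use (the path is whatever blob you extracted from the firmware image):
#   with open("sboot.bin", "rb") as f:
#       for offset, count in densest_windows(f.read()):
#           print(hex(offset), count)
```

The offsets printed are file offsets, not load addresses; mapping them to memory still requires the image base.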
From a high level, BL2 has several interesting responsibilities, including but not limited to:
1. Booting the kernel
2. Flashing a new firmware image
3. Displaying a basic user interface during firmware updates
4. Debugging (if we’re lucky)
On bootloaders, the mechanism to load a new firmware image is usually the best attack surface to start with, since it involves direct input from the attacker as well as fairly complicated logic. So that’s where we set our sights first.
|Odin, the Samsung flashing client. A 90s era beauty|
Anyone who’s had any research experience with Samsung’s Android phones knows Odin, the venerable but somewhat clumsy piece of software which flashes firmware ROMs to the device’s storage.
On the device side, flashing new firmware involves first switching the phone to Download Mode, which is implemented in S-Boot, by pushing a three-key combination, then connecting it via USB to the host which is running the Odin client. The Odin client then sends the selected firmware image to an Odin server running on the device. You can’t just flash any image, of course, and on locked Samsungs the bootloader will reject firmware that is not signed by Samsung.
|Download mode. Locked bootloaders reject unsigned images|
If you want to follow along with our analysis, the ROM version we’re using here is G930FXXU1APF2. That’s a Samsung Galaxy S7. Go ahead and download it from Sam Mobile.
The key function in the Odin handler code, which handles almost all of the Odin protocol, is process_packet (at address 0x8F00A0A4). And we’re immediately faced with a bug as soon as we read the function:
|The beginning of process_packet|
As you can see, the Odin protocol looks at the packet ID and chooses the relevant branch of the code. Packet ID 0x65 tells Odin that we’re about to do an operation related to a PIT file (PITs contain partitioning information, read more about them at this XDA thread).
When the code runs into ID 0x65, it can either read out the current PIT file to a buffer or write a new one to the special partition which holds the PIT data. If the second byte of the packet is 1, Odin goes ahead and copies the current PIT to a buffer which will then be transferred to the Odin client. The client needs this to determine whether the new firmware fits within the current partitioning scheme.
But where does the buffer to which the PIT is copied (xfer_data.pit_buf) get initialized? Apparently, it only gets allocated in this case:
|The allocation of pit_buf|
Meaning you have to first send an initialization packet (ID 0x64) before the buffer gets allocated. If you don’t, the buffer just points to 0. And if you try to copy the PIT before the buffer gets allocated, the code just goes ahead and tries to copy to 0: a classic null-pointer dereference.
This bug is similar to many other bugs that we found in Odin, in that it crashes the bootloader but is probably not exploitable. In this case, since we’re on an ARM64 architecture, the address 0 is simply not mapped and any attempt to copy to it results in an instant panic. From an attacker’s point of view, things look better on ARM32 architectures, since the address 0 could contain the Exception Vector Table (EVT), which could be overwritten. The problem is that we still don’t control what we write, since we don’t control the PIT data.
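To make the ordering problem concrete, here is a toy model of the state involved. It is a sketch of the logic as described above, not a translation of the real process_packet: the class and exception names are invented, the sub-command is passed as a plain argument rather than parsed from a packet, and the buffer size is simply the PIT length because the real allocation size isn’t given here.

```python
class NullDereference(Exception):
    """Stands in for the abort that crashes the real bootloader."""


class OdinModel:
    SESSION_INIT = 0x64  # packet ID that allocates pit_buf, per the text
    PIT_FILE = 0x65      # packet ID for PIT operations, per the text
    PIT_DUMP = 1         # sub-command that reads the current PIT out, per the text

    def __init__(self, pit_data: bytes):
        self.pit_data = pit_data
        self.pit_buf = None  # models xfer_data.pit_buf pointing to 0

    def process_packet(self, packet_id: int, subcommand: int = 0):
        if packet_id == self.SESSION_INIT:
            # Only this path ever allocates the buffer.
            self.pit_buf = bytearray(len(self.pit_data))
        elif packet_id == self.PIT_FILE and subcommand == self.PIT_DUMP:
            if self.pit_buf is None:
                # The real code has no such check and copies to address 0.
                raise NullDereference("PIT copied into unallocated pit_buf")
            self.pit_buf[:] = self.pit_data
            return bytes(self.pit_buf)
        return None
```

Sending 0x65 with sub-command 1 to a fresh `OdinModel` raises `NullDereference`; sending 0x64 first makes the same request succeed.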
But this bug does give us something else. What do we get on the screen when we trigger the bug and crash the bootloader?
|Inside Upload Mode|
A quick look at the code reveals that the bootloader exception handler prints the above output to screen, then enters something that’s referred to as “Upload Mode”. That’s an interesting development: Upload Mode is a semi-secret bootloader mode that’s been puzzling the modding community for years. Some users report getting it after especially bad kernel panics; others say that it comes up because of PMIC issues. Now we also know that we enter it during bootloader exceptions.
Looking at the code, we see that Upload Mode is implemented as an inline function in usbd3_rdx_process (at address 0x8F028C1C). We’ve edited and simplified the code a bit for clarity.
mode_switch = p_board_info->mode_switch;
if ( mode_switch & UPLOAD_MODE ) {
    /* Allocate the 512 KB response buffer on first use */
    if ( !transaction_data.response_buffer ) {
        transaction_data.response_buffer = (char *)malloc(0x80000);
        if ( !transaction_data.response_buffer )
            printf("%s: buffer allocation failed.\n", "usbd3_rdx_process");
    }
    if ( !strcmp(packet_buf, "PoWeRdOwN") ) {
        /* (body omitted) */
    } else if ( !strcmp(packet_buf, "PrEaMbLe") ) {
        /* Handshake: 14 characters plus the terminating NUL */
        memcpy(transaction_data.response_buffer, "AcKnOwLeDgMeNt", 15);
    } else if ( !strcmp(packet_buf, "PrObE") ) {
        memcpy(transaction_data.response_buffer, log_location_buf, log_location_buf_size);
    } else {
        /* Anything else is parsed as two hex addresses: start at offset 0, end at offset 9 */
        dump_start_addr = strtol(packet_buf, NULL, 16);
        dump_end_addr = strtol(packet_buf + 9, NULL, 16);
        /* (some length checks) */
        memcpy(transaction_data.response_buffer, (void *)dump_start_addr, dump_end_addr - dump_start_addr);
    }
}
This is a fairly basic protocol to dump memory from the device. After sending a sequence of initialization packets, you simply send a dump start address and a dump end address, and you get back the dump over USB.
This is extremely useful for debugging and reversing purposes, since we can dump a memory image after a crash, look at the registers and the stack and generally get an idea of what’s going on. We can, of course, also dump the full range of memory to aid us with reversing. We’ll see that this ability will become useful in the second part of this write-up.
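For anyone writing their own client, the host side of a dump request can be sketched as follows. This is not our tool, just an illustration built on what the handler’s parsing implies: two hexadecimal addresses, the second starting at offset 9. The NUL separator, the 8-digit width and the function names are assumptions, and splitting ranges at 0x80000 bytes simply mirrors the size of the response buffer, since the handler’s actual length checks aren’t reproduced here. The USB transport is left out entirely.

```python
RESPONSE_BUF_SIZE = 0x80000  # size of the response buffer the handler allocates


def dump_request(start: int, end: int) -> bytes:
    """Build one request: 8 hex digits, a separator byte, 8 hex digits.

    The separator is assumed to be NUL; the handler only requires that the
    end address begins at offset 9.
    """
    if not (0 <= start < end <= 0xFFFFFFFF):
        raise ValueError("expected 0 <= start < end <= 0xFFFFFFFF")
    return b"%08X\x00%08X\x00" % (start, end)


def chunk_ranges(start: int, end: int, chunk: int = RESPONSE_BUF_SIZE):
    """Split [start, end) into pieces no larger than the response buffer."""
    addr = start
    while addr < end:
        nxt = min(addr + chunk, end)
        yield addr, nxt
        addr = nxt


def dump_requests(start: int, end: int):
    """Return the list of request packets needed to cover [start, end)."""
    return [dump_request(a, b) for a, b in chunk_ranges(start, end)]
```

A client would send the preamble and probe strings first, then each request in turn, reading `end - start` bytes back after every one.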
Since we haven’t been able to find a public tool which dumps RAM over Upload Mode, we’ve written up one of our own. Feel free to experiment with it.
At this stage we went back into the Odin protocol, hopefully to find an exploitable bug. One of the things we automatically do when diving into new attack surfaces is to write raw, basic fuzzers as we go along to help find some easy wins.
This proved a bit harder to do with S-Boot, because it uses a proprietary protocol over CDC ACM (a form of serial) that is hard and frustrating to get working correctly. The small details are hard to get right: for instance, you have to send an empty packet after every standard packet, some packets need to be 1024 bytes long even if they contain only 4 bytes of real data, etc. Writing a packet fuzzer from scratch was too slow for our time limits.
That’s where Benjamin Dobell’s awesome Heimdall comes in. Heimdall is an open-source implementation of the Odin client protocol which takes care of all the annoying bits of talking to the Odin bootloader code, so we used this as a basis for a basic fuzzer and just extended it a bit.
We’ve added a command line option called “fuzz”, which just takes a bunch of raw packets that you can pre-generate with some Python code, then sends them to the device in sequence while taking care of the low-level details. You can get it here.
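Pre-generating packets can be as simple as the sketch below. It is illustrative only: the field layout (little-endian 32-bit words for the packet ID and its arguments) is an assumption, so adjust it if the sub-command turns out to be a single byte; the function names are made up; and how the packets are stored for the fuzz option isn’t covered, so the sketch just returns byte strings. The 1024-byte padding and the 0x64 and 0x65 packet IDs come from the discussion above.

```python
import random
import struct

PACKET_SIZE = 1024  # control packets are padded to this size


def control_packet(packet_id: int, *args: int) -> bytes:
    """Pack the ID and arguments as little-endian 32-bit words (assumed), zero-padded."""
    body = struct.pack("<%dI" % (1 + len(args)), packet_id, *args)
    return body.ljust(PACKET_SIZE, b"\x00")


def mutate(packet: bytes, rng: random.Random, flips: int = 4, span: int = 32) -> bytes:
    """Overwrite a few random bytes within the first `span` bytes of the packet."""
    out = bytearray(packet)
    for _ in range(flips):
        out[rng.randrange(span)] = rng.randrange(256)
    return bytes(out)


def generate_corpus(seed: int = 0, count: int = 16):
    """Return `count` mutated packets derived from a couple of known-good ones."""
    rng = random.Random(seed)
    base = [control_packet(0x64), control_packet(0x65, 1)]
    return [mutate(rng.choice(base), rng) for _ in range(count)]
```

Seeding the generator keeps a run reproducible, which matters when you need to replay the exact sequence that produced a crash.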
We got some interesting crashes in Odin using this approach, but none that seemed exploitable at first glance. We were about to go deeper into Odin when we decided to spend some time extending our debugging capabilities. And this is when we made an interesting discovery.
The UART Console
Searching through the binary, we found a set of suggestive string pointers at 0x8F08BD78:
|The possible command list|
|The Samsung Anyway|
After getting a used Anyway on eBay, we tested various combinations of switches to try and get the MUIC to switch to UART terminal mode. This did work on older Samsung phones, but we only succeeded in getting output: we got logs from the bootloader and kernel, but we didn’t actually get a terminal.
At this stage we decided to build our own makeshift UART cable, similar to what Joshua Drake did with the Nexus 4 UART cable. We collected various scraps of data from XDA regarding ID pin resistor values and corresponding manufacturer’s modes. We also got some hints from the kernel DTS files. This is what we came up with:
|Our makeshift jig|
Our jig is quite simple: an RS232-to-USB adapter has its TX/RX lines connected to the D+/D- USB lines of the micro USB connector, and the ID pin is connected to the ground pin via a variable resistor.
It turned out that the correct resistance value is 619K ohm. When set to that resistance, we’d get some output when booting up the device. But that still didn’t seem to do the trick, since the output would go silent after a few lines, and we still couldn’t get a terminal.
|The initial UART output. Logs went silent after ifconn_com_to_open|