Sunday, October 26, 2014

An OS Kernel Bug in Windows 8.1 32-bit OS When Handling Task Switch Events

I'm not sure whether this kernel bug, which I reported last year, has been fixed in the latest Windows 8.1 32-bit system. The bug: any NMI (Non-Maskable Interrupt) can crash the system (BSOD) with BugCheck 7F, {7, *, *, *}.


I first found this issue while testing NMI handling: we registered a custom NMI callback with the kernel API KeRegisterNmiCallback(). The expected behavior is that whenever a non-maskable interrupt (NMI) occurs, our registered callback is called and the system keeps running normally afterwards. On older systems such as XP, Windows 7, and Windows 8, this is exactly what happens. On a Windows 8.1 32-bit OS, however, the system crashed (BSOD) with bugcheck 7F, {7, 3b, 406e9, 0}.

Note that we didn't see this issue on the Windows 8.1 64-bit OS. After investigating, I found that this is because the 64-bit NMI handler (vector 2 in the IDT) is implemented differently from the 32-bit one. When an x86/Intel processor runs in 32-bit mode, a hardware task-switch mechanism is available; in 64-bit mode, the processor does not support hardware task switching. Hence the OS implements the NMI handler differently: the 32-bit OS uses a task gate in vector 2 of the IDT.

Here are some details for this OS kernel bug. 

When an NMI is triggered (for example, through the Local APIC or the software interrupt INT 2h), the Windows 8.1 32-bit OS crashes immediately on the target processor that received the NMI. The kernel memory dump shows this error message:


BugCheck 7F, {7, 3b, 406e9, 0}
Arguments:
Arg1: 00000007, EXCEPTION_NPX_NOT_AVAILABLE
Arg2: 0000003b
Arg3: 000406e9
Arg4: 00000000

Here, Arg1 is the exception number, 7, which is the "Device Not Available" exception (#NM) on x86 processors. The question is: why do we get a #NM fault (vector 7) when an NMI occurs? The behavior behind it turned out to be more complicated than I expected.
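
As a small aid for reading Arg1, here is a minimal sketch that maps an x86 exception vector to its architectural name. The function name trap_name is hypothetical (it is not a Windows or WinDbg API), and the table lists only a few vectors for illustration.

```c
/* Map an x86 exception vector (as reported in Arg1 of bugcheck 7F)
   to its architectural name. Illustrative subset only; trap_name is
   a hypothetical helper, not part of any Windows API. */
const char *trap_name(unsigned vector)
{
    switch (vector) {
    case 0x00: return "#DE Divide Error";
    case 0x02: return "NMI Interrupt";
    case 0x06: return "#UD Invalid Opcode";
    case 0x07: return "#NM Device Not Available";
    case 0x08: return "#DF Double Fault";
    case 0x0D: return "#GP General Protection";
    case 0x0E: return "#PF Page Fault";
    default:   return "unknown";
    }
}
```

Calling trap_name(7) gives the #NM entry, which matches the EXCEPTION_NPX_NOT_AVAILABLE label in the dump.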

Here is what actually happens, according to my investigation:
  1. [current task Tr = 28]:
    An NMI is triggered through the Local APIC or INT 2h. Since the IDT entry for NMI vector #2 is a task gate with TR=58, the processor switches to a new task with Tr=58 and sets the CR0.TS bit. (This is the task-switched bit; the processor automatically sets it whenever a hardware task switch occurs.)
  2. [new task Tr=58]:
    The processor now executes the NMI ISR nt!KiTrap02 (I observed that CLTS, which clears the CR0.TS bit, is executed inside this ISR), then calls my registered NMI callback, and finally executes IRET to return to the original task Tr=28.
  3. [original task Tr=28]:
    The processor resumes execution in the original task with Tr=28, and the switch back sets CR0.TS again automatically, as explained in step 1.
    If an external interrupt or exception then occurs, e.g. #PF (vector 14), nt!KiTrap0E is executed to handle it.
    In that case, when the processor executes:
    stmxcsr dword ptr [ebp+48h]  (see the snapshots below)
    then ...
  4. [original task Tr=28]:
    A #NM exception occurs. This is because CR0.TS = 1 (with CR0.EM = 0 and CR0.MP = 1, as seen by examining the CR0 control register in WinDbg). See the table below for the combinations that trigger a #NM exception when executing stmxcsr.
    This explains why we get the #7 fault EXCEPTION_NPX_NOT_AVAILABLE.
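
The four steps above can be replayed in a toy model. This is only a sketch under stated assumptions: the struct and function names are hypothetical, the model tracks nothing but CR0.TS and the task register, and the selectors 28 and 58 are read as hexadecimal values. It is not how the kernel is implemented; it just shows why CLTS inside the NMI task does not help once IRET switches back.

```c
#include <stdbool.h>

/* Toy model of one logical processor (hypothetical, for illustration). */
struct cpu {
    bool ts;          /* CR0.TS, the task-switched flag */
    unsigned tr;      /* selector of the current task */
    bool nm_raised;   /* set when an SSE instruction runs with TS = 1 */
};

/* Any hardware task switch sets CR0.TS. */
void task_switch(struct cpu *c, unsigned new_tr)
{
    c->tr = new_tr;
    c->ts = true;
}

/* CLTS clears CR0.TS. */
void clts(struct cpu *c)
{
    c->ts = false;
}

/* Models stmxcsr/ldmxcsr/movaps: they fault with #NM while TS = 1. */
void sse_instruction(struct cpu *c)
{
    if (c->ts)
        c->nm_raised = true;
}

/* Replays steps 1-4 and returns true if #NM is raised. */
bool replay_nmi_sequence(struct cpu *c)
{
    c->tr = 0x28;
    c->ts = false;
    c->nm_raised = false;

    task_switch(c, 0x58);  /* step 1: NMI enters through the task gate */
    clts(c);               /* step 2: KiTrap02 executes CLTS */
    /* the registered NMI callback would run here */
    task_switch(c, 0x28);  /* step 2 to 3: IRET switches back, TS set again */
    sse_instruction(c);    /* steps 3-4: stmxcsr in KiTrap0E */
    return c->nm_raised;
}
```

In this model replay_nmi_sequence() returns true: the CLTS in step 2 is undone by the task switch back to the original task.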

Below is the Windows Blue (8.1) 32-bit #PF handler (nt!KiTrap0E) as disassembled by WinDbg. Note the stmxcsr/ldmxcsr/movaps instructions (related to floating-point operations), which are used to save floating-point registers.


The snapshot below is from the x86/Intel SDM. It shows that a #NM fault occurs when !CR0.EM && CR0.MP && CR0.TS is true; unfortunately, a task switch can produce exactly this condition.
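
The condition can be written as a predicate over a raw CR0 value, which is handy when reading CR0 out of a debugger. This is a minimal sketch: the bit positions (MP = bit 1, EM = bit 2, TS = bit 3) are the architectural ones, but the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define CR0_MP (1u << 1)   /* monitor coprocessor */
#define CR0_EM (1u << 2)   /* x87 emulation */
#define CR0_TS (1u << 3)   /* task switched */

/* True when CR0 matches the combination described in the text:
   EM clear, MP set, TS set. Hypothetical helper for illustration. */
bool matches_nm_condition(uint32_t cr0)
{
    return !(cr0 & CR0_EM) && (cr0 & CR0_MP) && (cr0 & CR0_TS);
}
```

For example, a CR0 value whose low byte is 0x3b (an illustrative value, with MP and TS set and EM clear) satisfies the predicate, while the same value with TS cleared (low byte 0x33) does not.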

You may wonder why we didn't get a nested #NM, since the #NM ISR (nt!KiTrap07) might itself execute one of these instructions (stmxcsr/ldmxcsr/movaps).

Here is the reason (see the snapshot below): the #NM ISR forces CR0.TS=0, CR0.EM=0, CR0.MP=1 at the very beginning, before executing any instruction such as stmxcsr/ldmxcsr/movaps. With that setting, according to the table above, the #NM fault is no longer triggered.



So, to fix this kernel bug, in my opinion, each exception/interrupt handler routine should clear the CR0.TS bit on entry, just as the #NM handler nt!KiTrap07 does.

However, could we take advantage of this for exploitation? It seems not!

