Wednesday, October 22, 2014

Secure OS Kernel Design: an idea to prevent malicious software from overwriting critical kernel data structures

Recently, while reading the paper "HyperSafe - A Lightweight Approach to Provide Lifetime Hypervisor Control-Flow Integrity", an idea came to mind: use write protection (CR0.WP) together with the read-only (RO) page attribute to prevent critical kernel data structures from being overwritten by malicious software through a buffer (stack or heap) overflow in an exploitable kernel module.


In a virtual memory management system, the OS kernel can make a virtual page Read-Only (RO) by clearing the R/W bit in the corresponding page table entry (PTE), so that the page cannot be modified by applications or, when CR0.WP is set, by the kernel itself.
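To make the R/W bit concrete, here is a minimal user-space sketch that treats a PTE as a plain 64-bit value. The bit positions (bit 0 = Present, bit 1 = R/W) follow the x86 page table entry layout; the helper names pte_mkreadonly() and pte_writable() are made up for this illustration and are not any particular kernel's API.

```c
#include <stdint.h>

/* x86 page table entry flag bits (low bits only). */
#define PTE_PRESENT (1ULL << 0)  /* page is mapped */
#define PTE_RW      (1ULL << 1)  /* 1 = writable, 0 = read-only */

/* Hypothetical helper: return a copy of the PTE with the R/W bit cleared. */
static uint64_t pte_mkreadonly(uint64_t pte)
{
    return pte & ~PTE_RW;
}

/* Hypothetical helper: nonzero if the PTE maps a writable page. */
static int pte_writable(uint64_t pte)
{
    return (pte & PTE_RW) != 0;
}
```

On real hardware, the kernel would also have to invalidate the stale TLB entry for that page after changing the PTE; the sketch leaves that out.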

Write Protection (CR0.WP) allows virtual pages to be protected from supervisor-mode writes. If CR0.WP = 0, supervisor-mode write accesses are allowed to linear addresses with read-only access rights; if CR0.WP = 1, such write accesses are not allowed. However, note that user-mode write accesses are never allowed to virtual addresses with Read-Only access rights, regardless of the value of CR0.WP.
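The rule above boils down to a small truth table. The following sketch models it as a plain C function; it is a simplification that only looks at the privilege level of the access, the PTE R/W bit, and CR0.WP, and ignores every other check the CPU performs (U/S bit, SMAP, protection keys, and so on). The function name write_allowed() is made up for this post.

```c
#include <stdbool.h>

/* Simplified model of the x86 write-permission check for one page.
 * Only the R/W bit, the privilege level, and CR0.WP are considered. */
static bool write_allowed(bool user_mode, bool pte_rw, bool cr0_wp)
{
    if (pte_rw)
        return true;      /* writable page: the R/W check passes */
    if (user_mode)
        return false;     /* read-only page: user writes always fault */
    return !cr0_wp;       /* supervisor write: allowed only when WP = 0 */
}
```

For a read-only page, the only combination that lets a write through is a supervisor-mode access with CR0.WP = 0, which is exactly the window the rest of this post relies on.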


In a modern operating system, the kernel sets the Write Protection bit by default, which means that even kernel code cannot directly modify Read-Only pages. Read-Only access permission is basically used in the following two ways:

  1. COW (Copy-On-Write). WP and RO facilitate the copy-on-write method of creating a new process (fork()).
    When a new child process is created, the OS kernel doesn't copy the whole address space at once. Instead, the child process shares most pages with its parent, and the OS kernel only clears the PTE.R/W bit for those shared pages. Whenever a process (parent or child) writes to a shared page, a write access violation (#PF) occurs, and the #PF handler makes a separate copy of that particular page for that process.
  2. Write protection for kernel code and static data. Obviously, at OS runtime those pages (except self-modifying code) should never be changed by any legitimate software, so generally speaking the OS kernel should configure them as Read-Only.
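The copy-on-write flow in item 1 can be sketched as a toy user-space model. Everything here is an illustration, not real kernel code: struct frame, struct pte, cow_share(), cow_fault() and page_write() are invented names, the "page fault" is just a function call, and a reference count stands in for the kernel's page accounting.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

struct frame { unsigned char data[TOY_PAGE_SIZE]; int refcount; };
struct pte   { struct frame *frame; bool rw; };

/* fork(): the child shares the parent's frame; both mappings become RO. */
static void cow_share(struct pte *parent, struct pte *child)
{
    parent->rw = false;
    child->frame = parent->frame;
    child->rw = false;
    parent->frame->refcount++;
}

/* Models the #PF handler for a write to a read-only COW page. */
static int cow_fault(struct pte *pte)
{
    if (pte->frame->refcount > 1) {
        /* Still shared: give the faulting process its own copy. */
        struct frame *copy = malloc(sizeof *copy);
        if (!copy)
            return -1;
        memcpy(copy->data, pte->frame->data, TOY_PAGE_SIZE);
        copy->refcount = 1;
        pte->frame->refcount--;
        pte->frame = copy;
    }
    pte->rw = true;   /* last user (or fresh copy): make it writable */
    return 0;
}

/* A write goes through the "fault handler" first if the page is RO. */
static int page_write(struct pte *pte, size_t off, unsigned char value)
{
    if (!pte->rw && cow_fault(pte) != 0)
        return -1;
    pte->frame->data[off] = value;
    return 0;
}
```

After cow_share(), the first page_write() from either side ends up with a private copy, while the other side keeps the original frame and simply regains write permission on its own first write.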

However, this post presents a third usage in OS kernel design: setting Read-Only access permission even for critical kernel data structures that were previously configured as writable. Here are the reasons.

As my last post explained, the thread_info structure is one such critical data structure. Its addr_limit field can be overwritten by exploiting a stack overflow in any kernel module/driver. This is possible because the thread_info structure lives in a writable page (the corresponding PTE.R/W bit = 1).

So the question is: what if we configure that page (e.g. the thread_info page) with Read-Only permission?

In this way, we can prevent such stack overflow attacks: whenever an exploitable bug in a kernel module is used to overwrite this read-only structure, a write access violation (#PF, page fault) is triggered.

But this causes a problem: when the kernel legitimately writes to that read-only structure, an access violation happens too. To solve this, the legitimate kernel code must turn off write protection (clear the CR0.WP bit) before writing to the read-only structure, and turn it back on after the write.
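Here is a minimal user-space simulation of that toggle. CR0 is modeled as an ordinary variable (WP is bit 16 of the real register), and a "page fault" is just a -1 return value. In a real kernel the two CR0 updates would be privileged mov instructions to CR0. All names below (sim_cr0, struct protected_limit, sim_kernel_write(), legit_update()) are invented for this sketch.

```c
#include <stdbool.h>

#define CR0_WP (1UL << 16)               /* WP is bit 16 of CR0 */

static unsigned long sim_cr0 = CR0_WP;   /* simulated CR0, WP set by default */

/* A critical field that lives in a page mapped read-only (page_rw = false). */
struct protected_limit { unsigned long addr_limit; bool page_rw; };

/* Supervisor-mode write: returns 0 on success, -1 on a simulated #PF. */
static int sim_kernel_write(struct protected_limit *p, unsigned long value)
{
    if (!p->page_rw && (sim_cr0 & CR0_WP))
        return -1;                       /* RO page and WP = 1: fault */
    p->addr_limit = value;
    return 0;
}

/* Legitimate kernel path: clear WP, write, then set WP again. */
static int legit_update(struct protected_limit *p, unsigned long value)
{
    int ret;

    sim_cr0 &= ~CR0_WP;                  /* open the write window */
    ret = sim_kernel_write(p, value);
    sim_cr0 |= CR0_WP;                   /* close it again */
    return ret;
}
```

A direct sim_kernel_write() on a read-only page fails, which stands in for the overflow in a buggy module, while legit_update() succeeds and leaves WP set afterwards.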

In summary, the OS kernel designer can group the critical data structures together as much as possible and map those pages with read-only access permission. Whenever the OS kernel needs to modify them, it turns off WP, and turns it on again after the modification. And as a security design rule, a kernel module (e.g. *.ko) should not directly modify any critical kernel data structures; only the kernel itself should modify those write-protected structures (actually, this is already the case in any modern operating system design).

However, this introduces a performance cost (though not a severe one): each legitimate write access consumes extra CPU cycles to write the CR0.WP bit. We can reduce this cost by grouping the legitimate accesses together so that CR0.WP is written less often. Another problem might be interrupts: an interrupt might arrive in the window between clearing CR0.WP and setting it again. We can solve this by disabling interrupts (or preemption) during that window, or by saving/restoring the CR0.WP bit in the interrupt handler (ISR).
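Both mitigations can be sketched together in the same simulated style: several pending updates are applied inside a single WP-off window, and a flag stands in for the CPU's interrupt-enable state. Again, every name here (sim_wp, sim_irqs_enabled, sim_cr0_writes, struct update, apply_batch()) is hypothetical, and the counter only exists to show how many CR0 writes the batch costs.

```c
#include <stdbool.h>
#include <stddef.h>

static bool sim_wp = true;               /* simulated CR0.WP */
static bool sim_irqs_enabled = true;     /* simulated interrupt-enable flag */
static unsigned sim_cr0_writes;          /* counts simulated CR0 writes */

static void sim_set_wp(bool on)
{
    sim_wp = on;
    sim_cr0_writes++;
}

/* One pending write to a protected field. */
struct update { unsigned long *field; unsigned long value; };

/* Apply n updates inside one WP-off window with interrupts masked. */
static void apply_batch(const struct update *u, size_t n)
{
    bool irqs_were_enabled = sim_irqs_enabled;

    sim_irqs_enabled = false;            /* no ISR can run with WP = 0 */
    sim_set_wp(false);
    for (size_t i = 0; i < n; i++)
        *u[i].field = u[i].value;
    sim_set_wp(true);
    sim_irqs_enabled = irqs_were_enabled; /* restore the previous state */
}
```

With this shape, a batch of any size costs two CR0 writes instead of two per field, and the interrupt state is restored to whatever it was before rather than unconditionally re-enabled.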



UPDATE:
thread_info is not a good example, because it shares the same memory with the kernel stack, which must be mapped Read-Write.

Some other kernel structures, like task_struct and mm_struct, might be better examples for this design.
