Sunday, November 02, 2014

Security OS Design (cont.): Write Protection for Linux Kernel Critical Data Structures (GDT, IDT, syscall table, task_struct, mm_struct, ...)

Continuing from the previous post, let me review what must change in the Linux kernel to prevent buffer overrun/overflow attacks from modifying critical kernel data structures such as the GDT, IDT, task_struct, and mm_struct.

Some kernel data structures never change at runtime once the operating system has finished initializing them. Examples are the GDT and IDT, the system call table, and, on Windows, the SSDT (pointed to by nt!KeServiceDescriptorTable, see this link for SSDT hooking in Windows OS). Note that on Linux the kernel does update some GDT entries at runtime.

For those data structures, we can directly configure them with Read-Only memory permission in page table entries. 
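The effect of a Read-Only page table entry can be tried out from user space. The sketch below is only an analogy, not kernel code: it assumes Linux, uses mmap() and mprotect() to stand in for the kernel's page table setup, and probes a write with a temporary SIGSEGV handler. The demo_* names are made up for this illustration.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf demo_fault_jmp;

/* SIGSEGV handler: jump back to the probe instead of crashing. */
static void demo_on_segv(int sig)
{
    (void)sig;
    siglongjmp(demo_fault_jmp, 1);
}

/* Allocate one page, fill it like a small table, then mark it Read-Only.
 * Returns NULL on any failure. */
long *demo_alloc_ro_table(size_t entries)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || entries == 0 || entries > (size_t)page / sizeof(long))
        return NULL;

    long *table = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (table == MAP_FAILED)
        return NULL;

    for (size_t i = 0; i < entries; i++)
        table[i] = (long)(i + 1);            /* "initialization" phase */

    if (mprotect(table, (size_t)page, PROT_READ) != 0) {
        munmap(table, (size_t)page);
        return NULL;
    }
    return table;                             /* now Read-Only */
}

/* Try to store value into *slot.
 * Returns 1 if the store faulted, 0 if it succeeded, -1 on setup error. */
int demo_write_faults(volatile long *slot, long value)
{
    struct sigaction sa, old;
    int faulted;

    sa.sa_handler = demo_on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(demo_fault_jmp, 1) == 0) {
        *slot = value;                        /* faults on a Read-Only page */
        faulted = 0;
    } else {
        faulted = 1;                          /* we arrived here via the handler */
    }

    sigaction(SIGSEGV, &old, NULL);           /* restore the previous handler */
    return faulted;
}
```

Calling demo_write_faults() on an entry of the table returned by demo_alloc_ro_table() reports a fault and leaves the entry unchanged, which is the behavior we want for a stray write into the IDT or the system call table.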

However, many kernel data structures, such as task_struct, mm_struct, and parts of the GDT, have to be mapped Read-Write because they change very frequently while the OS is running.

The good news is that this data is changed only by the kernel itself. System drivers and other LKMs should never change it, so we can treat any modification made by an LKM as illegitimate and undesired behavior.

With this assumption, let's look at what we would have to do in the existing Linux kernel, or in a new operating system written from scratch.

Kernel virtual memory management subsystem:
First of all, add a new type of memory allocation that returns Read-Only memory (through kmalloc() or even vmalloc()), for example by adding a new flag, GFP_ROMEM.

This means that the kernel's internal memory management subsystem (e.g. the Linux slab allocator) must be extended to group RO memory chunks together in one or more RO pages (4KB or 2MB in size) and to keep traditional RW memory chunks in separate RW pages (also 4KB or 2MB). This might greatly increase the complexity of the memory management design.
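To make the grouping idea concrete, here is a minimal user-space sketch, not a slab allocator. It assumes a fixed 4KB page size and two tiny bump-allocated arenas. DEMO_GFP_ROMEM and demo_alloc() are hypothetical names modeled on the proposed GFP_ROMEM flag. The only point is that the flag selects the arena, so RO-tagged and RW objects never share a page.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE  4096u
#define DEMO_POOL_PAGES 4u
#define DEMO_POOL_BYTES (DEMO_POOL_PAGES * DEMO_PAGE_SIZE)
#define DEMO_GFP_ROMEM  0x1u   /* hypothetical flag, stands in for GFP_ROMEM */

struct demo_pool {
    unsigned char *base;       /* page-aligned start of the arena */
    size_t used;               /* bytes handed out so far */
};

/* Two separate page-aligned arenas: one for RO-tagged objects, one for RW. */
static _Alignas(4096) unsigned char demo_ro_pages[DEMO_POOL_BYTES];
static _Alignas(4096) unsigned char demo_rw_pages[DEMO_POOL_BYTES];

static struct demo_pool demo_ro_pool = { demo_ro_pages, 0 };
static struct demo_pool demo_rw_pool = { demo_rw_pages, 0 };

/* Bump-allocate size bytes (rounded up to 16) from the arena chosen by flags.
 * Returns NULL if size is zero or the arena is exhausted. */
void *demo_alloc(size_t size, unsigned flags)
{
    struct demo_pool *pool =
        (flags & DEMO_GFP_ROMEM) ? &demo_ro_pool : &demo_rw_pool;

    if (size == 0 || size > DEMO_POOL_BYTES)
        return NULL;

    size_t aligned = (size + 15u) & ~(size_t)15u;
    if (aligned > DEMO_POOL_BYTES - pool->used)
        return NULL;

    void *chunk = pool->base + pool->used;
    pool->used += aligned;
    return chunk;
}

/* Page number of an address, used to check that arenas never mix. */
uintptr_t demo_page_of(const void *p)
{
    return (uintptr_t)p / DEMO_PAGE_SIZE;
}
```

Because every page in the RO arena holds only RO-tagged objects, the whole arena could be mapped Read-Only without affecting ordinary RW allocations. A real allocator would also need freeing, per-size caches, and per-CPU handling, which is where the extra complexity comes from.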

Memory allocation for the kernel itself and drivers (or any LKMs):
Once we add the new GFP_ROMEM type, we must define rules for using it.

Rule #1: data structures that are modified only by the core kernel itself (e.g. the scheduler) must be allocated with this new type. Drivers and other LKMs are not allowed to use it; a static code analysis tool can enforce this.

Rule #2: because these data structures are now Read-Only, and the CR0.WP bit is set by default, the kernel must clear CR0.WP before writing to them. The code logic is as follows:

     disable_wp();                              // clear CR0.WP bit.
     /* ... write to the RO data fields ... */
     enable_wp();                               // set CR0.WP bit again.

At the same time, legitimate drivers (LKMs) are not allowed to change the CR0.WP bit (code scanning can enforce this).
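The disable/write/enable pattern can also be mimicked in user space. The sketch below is an analogy only, under stated assumptions: it runs on Linux, and mprotect() on a single page stands in for toggling CR0.WP, which in the real kernel is a per-CPU control register bit rather than a per-page permission. All demo_* names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

struct demo_guarded {
    void  *page;   /* one page that is normally mapped Read-Only */
    size_t len;    /* size of that page in bytes */
};

/* Map one zero-filled page and immediately make it Read-Only. */
int demo_guarded_init(struct demo_guarded *g)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    g->len = (size_t)page;
    g->page = mmap(NULL, g->len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (g->page == MAP_FAILED)
        return -1;

    return mprotect(g->page, g->len, PROT_READ);
}

/* Stand-in for disable_wp(): open a write window on the page. */
static int demo_disable_wp(struct demo_guarded *g)
{
    return mprotect(g->page, g->len, PROT_READ | PROT_WRITE);
}

/* Stand-in for enable_wp(): close the write window again. */
static int demo_enable_wp(struct demo_guarded *g)
{
    return mprotect(g->page, g->len, PROT_READ);
}

/* The only sanctioned write path: open the window, copy, close the window.
 * Returns 0 on success, -1 on a bounds or mprotect error. */
int demo_guarded_write(struct demo_guarded *g, size_t off,
                       const void *src, size_t n)
{
    if (off > g->len || n > g->len - off)
        return -1;
    if (demo_disable_wp(g) != 0)
        return -1;

    memcpy((unsigned char *)g->page + off, src, n);

    return demo_enable_wp(g);
}
```

Keeping the two toggles static and exposing only demo_guarded_write() mirrors Rule #2: the write window exists in exactly one audited place, and everything else sees the page as Read-Only.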

With this solution, we can prevent many buffer overflow attacks, for example a driver bug that allows arbitrary kernel memory to be overwritten. ROP (JOP) attacks might still bypass it, but this security design is not intended to address that specific class of attack.

Some issues to consider:

  1. This requires a tremendous number of changes to the existing Linux kernel. It would be a better fit if we plan to write our own operating system from scratch.
  2. Performance impact. Clearing and setting the CR0.WP bit around every write costs extra cycles. This can be optimized, though, and the real impact might not be that big.
  3. Interrupts or NMIs can arrive between disable_wp() and enable_wp(). This is just an implementation consideration and should be easy to solve.

Any other big issues? 

The Memory Protection Keys feature can provide a similar kind of protection for key data structures. With this feature enabled, each process also has a protection key value associated with it. On a memory access, the hardware checks that the current process's protection key matches the key associated with the memory block being accessed; if not, an exception occurs.
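The check described above can be modeled in a few lines of plain C. This is a software model of the rule only, not a use of any real protection-key hardware or API. The struct layouts and the demo_keyed_store() name are assumptions made for illustration.

```c
#include <stddef.h>

#define DEMO_BLOCK_WORDS 4

struct demo_block {
    unsigned key;                      /* key assigned to this memory block */
    int      words[DEMO_BLOCK_WORDS];  /* the protected data */
};

struct demo_process {
    unsigned key;                      /* key held by the running process */
};

/* Model of the hardware check on a store.
 * Returns 0 if the store is allowed and performed,
 * -1 to model the exception raised on a key mismatch (or a bad index). */
int demo_keyed_store(const struct demo_process *proc,
                     struct demo_block *blk, size_t idx, int value)
{
    if (idx >= DEMO_BLOCK_WORDS)
        return -1;
    if (proc->key != blk->key)
        return -1;                     /* keys differ: access is refused */

    blk->words[idx] = value;
    return 0;
}
```

In this model, a caller holding the block's key can update it, while a caller with any other key gets the error return and the data stays untouched.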

See the wikipedia page for details:
