*by Joseph Tartaro*

The year 2020 has been a disaster of biblical proportions. Old Testament, real wrath of God type stuff. Fire and brimstone coming down from the skies! Rivers and seas boiling! Forty years of darkness, earthquakes, volcanoes, the dead rising from the grave! Human sacrifices, dogs and cats living together... mass hysteria and reporting Linux kernel bugs to Microsoft!?

I thought I would write up a quick blog post explaining the following tweet and walk through a memory corruption flaw reported to MSRC that was recently fixed.

Windows kernel graphics drivers. The announcement of dxgkrnl was exciting and piqued our interest regarding the new attack surface it opens up. So we decided to quickly dive into it and race to find bugs.

When examining kernel drivers, the first place I head is the IOCTL (Input/Output Control) handlers. IOCTL handlers allow users to communicate with the driver via the ioctl syscall. This is a prime attack surface because the driver handles userland-provided data within kernel space. Looking into drivers/gpu/dxgkrnl/ioctl.c, the following function sits at the bottom of the file and shows us the full list of IOCTL handlers that we want to analyze.
When working through this list of functions, I eventually stumbled upon dxgk_signal_sync_object_cpu, which raised immediate red flags. We can see that data is copied from userland into kernel space via dxg_copy_from_user() in the form of the structure d3dkmt_signalsynchronizationobjectfromcpu, and that its fields are then passed as various arguments to dxgvmb_send_signal_sync_object().
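To make that data flow concrete, here is a minimal userland model of the pattern, not the driver's actual code. The field names follow the arguments named in this post, but the struct layout, the field types, and the helpers `send_model` and `handler_model` are assumptions made purely for illustration; `memcpy` stands in for dxg_copy_from_user().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layout: field names follow the post, types are guesses. */
struct signal_args {
    uint32_t        device;
    uint32_t        object_count;
    const uint32_t *objects;
    const uint64_t *fence_values;
    uint32_t        flags;
};

/* Records the two counts the callee received. */
struct received {
    uint32_t object_count;
    uint32_t fence_count;
};

/* Stand-in for dxgvmb_send_signal_sync_object(); only the counts matter here. */
static struct received send_model(uint32_t flags,
                                  uint32_t object_count, const uint32_t *objects,
                                  uint32_t fence_count, const uint64_t *fences,
                                  uint32_t device)
{
    (void)flags; (void)objects; (void)fences; (void)device;
    struct received r = { object_count, fence_count };
    return r;
}

/* Stand-in for the IOCTL handler: copy the caller's struct, then forward it. */
static struct received handler_model(const void *user_buf)
{
    struct signal_args args;

    assert(user_buf != NULL);
    memcpy(&args, user_buf, sizeof(args)); /* models dxg_copy_from_user() */

    /* No range checks: the same caller-chosen count is forwarded twice. */
    return send_model(args.flags,
                      args.object_count, args.objects,
                      args.object_count, args.fence_values,
                      args.device);
}
```

Whatever value the caller places in `object_count`, including `0xFFFFFFFF`, reaches the callee unchanged as both counts.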
The IOCTL handler dxgk_signal_sync_object_cpu lacked input validation of user-controlled data. The user passes a d3dkmt_signalsynchronizationobjectfromcpu structure which contains a uint value for object_count. Moving deeper into the code, in dxgvmb_send_signal_sync_object (drivers/gpu/dxgkrnl/dxgvmbus.c), we control the following arguments at this point, and none of them have been validated:
* args.flags (flags)
* args.object_count (object_count, fence_count)
* args.objects (objects)
* args.fence_values (fences)
* args.device (device)

An interesting note is that args.object_count is being used for both the object_count and fence_count. Generally a count is used to calculate length, so it's important to keep an eye out for counts that you control.

You're about to witness some extremely trivial bugs. If you're inexperienced at auditing C code for vulnerabilities, see how many issues you can spot before reading the explanations below.
This count that we control is used in multiple locations throughout the IOCTL for buffer length calculations without validation. This leads to multiple integer overflows, followed by an allocation that is too small, which causes memory corruption.
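The wraparound is easy to reproduce in userland. The sketch below is a model under stated assumptions, not the driver code: it assumes the sizes are held in 32-bit unsigned integers, that a handle is 4 bytes and a fence value is 8 bytes, and it uses a made-up `CMD_HEADER_SIZE` of 64 for the fixed part of the command. The names `handle_t`, `unchecked_cmd_size`, and `required_cmd_size` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real driver's types and header size may differ. */
typedef uint32_t handle_t;       /* assumed 4-byte handle */
#define CMD_HEADER_SIZE 64u      /* made-up size of the fixed command header */

static_assert(sizeof(handle_t) == 4, "model assumes 4-byte handles");

/* Models the unchecked pattern: every step is 32-bit unsigned arithmetic,
 * so each multiplication and addition silently wraps modulo 2^32. */
static uint32_t unchecked_cmd_size(uint32_t object_count, uint32_t fence_count)
{
    uint32_t object_size = object_count * (uint32_t)sizeof(handle_t);
    uint32_t fence_size  = fence_count * (uint32_t)sizeof(uint64_t);

    return CMD_HEADER_SIZE + object_size + fence_size;
}

/* The number of bytes the command would need if nothing wrapped,
 * computed in 64 bits for comparison. */
static uint64_t required_cmd_size(uint32_t object_count, uint32_t fence_count)
{
    return (uint64_t)CMD_HEADER_SIZE
         + (uint64_t)object_count * sizeof(handle_t)
         + (uint64_t)fence_count * sizeof(uint64_t);
}
```

With both counts set to `0x40000000`, `unchecked_cmd_size` returns 64 because both products wrap to zero, while `required_cmd_size` returns 12884901952. In this model, an allocation sized from the first value is many gigabytes smaller than the data the second value describes.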
**Integer overflows:**

* 17) Our controlled value *object_count* is used to calculate *object_size*
* 19) Our controlled value *fence_count* is used to calculate *fence_size*
* 21) The final result of *cmd_size* is calculated using the previous *object_size* and *fence_size* values
* 25) *cmd_size* could simply overflow from adding the size of *d3dkmt_handle* if it were large enough

**Memory corruption:**

* 27) The result of *cmd_size* is ultimately used as the length for *dxgmem_alloc*. As an attacker, we can force this to be very small. Since the newly allocated buffer *command* can be extremely small, the subsequent code that writes to it can cause memory corruption.
* 33-44) These all write data through the pointer to the buffer, and depending on the size we force, there's no guarantee that there is space for the data.
* 46,59) Eventually execution reaches two different calls to *[dxg_copy_from_user](https://www.kernel.org/doc/htmldocs/kernel-api/API---copy-from-user.html)*. In both cases, it copies in user-controlled data using the original, extremely large size values (remember our *object_count* was used to calculate both *object_size* and *fence_size*).

Hopefully this inspired you to take a peek at other open-source drivers and hunt down security bugs. This issue was reported to MSRC on May 20th, 2020 and resolved on August 26th, 2020 after receiving a severity of *Important* with an impact of *Elevation of Privilege*.
You can view the patch commit [here](https://github.com/microsoft/WSL2-Linux-Kernel/commit/7212aa038af18cbe1b383ab4398567d0160eb63d) with the newly added validation.
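The linked commit is the authoritative fix. As a generic illustration of this kind of validation, and not a copy of the patch, the sketch below rejects out-of-range counts before any size is computed. The cap `MAX_SIGNAL_OBJECTS`, the header size `CMD_HEADER_SIZE`, the 4-byte `handle_t`, and the function `checked_cmd_size` are all hypothetical values and names chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these values come from the real driver. */
typedef uint32_t handle_t;        /* assumed 4-byte handle */
#define CMD_HEADER_SIZE    64u    /* made-up fixed header size */
#define MAX_SIGNAL_OBJECTS 32u    /* made-up upper bound on the count */

/* Validate the caller-supplied count first, then compute the size.
 * Returns 0 and stores the size on success, -EINVAL on a bad count. */
static int checked_cmd_size(uint32_t object_count, uint32_t *cmd_size)
{
    assert(cmd_size != NULL);

    if (object_count == 0 || object_count > MAX_SIGNAL_OBJECTS)
        return -EINVAL;

    /* With the count capped at 32, the largest possible result is
     * 64 + 32 * 4 + 32 * 8 = 448, so 32-bit arithmetic cannot wrap. */
    *cmd_size = CMD_HEADER_SIZE
              + object_count * (uint32_t)sizeof(handle_t)
              + object_count * (uint32_t)sizeof(uint64_t);
    return 0;
}
```

The design point is ordering: the bound is enforced before the count takes part in any arithmetic, so both the allocation size and the later copy lengths are derived from a value already known to be small.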